Changes

Srashti Goyal · 7d3312a1
--- a/Code-Review.md
+++ b/Code-Review.md
+{:toc}
 * [Introduction](#introduction)
 * [Overview](#overview)
 * [Code review plan](#code-review-plan)
@@ -10,38 +11,6 @@
  * [ML models: Training, Cross-validation, Optimisation, Testing, Comparison with BLU, Investigations.](#ml-models-training-cross-validation-optimisation-testing-comparison-with-blu-investigations)
  * [ML Predictions: O3 Real events, Data preparation, FAP computation, Comparison with BLU](#ml-predictions-o3-real-events-data-preparation-fap-computation-comparison-with-blu)
    * [O3 analysis in git repo: lensid-ml-o3](#o3-analysis-in-git-repo-lensid-ml-o3)
-* [Meetings](#meetings)
-  * [Presentation Slides](#presentation-slides)
-  * [7 May 2021](#7-may-2021)
-    * [Action items:](#action-items)
-  * [14 May 2021](#14-may-2021)
-    * [Action items:](#action-items-1)
-  * [21 May 2021](#21-may-2021)
-    * [Action items:](#action-items-2)
-  * [28 May 2021](#28-may-2021)
-    * [Action items:](#action-items-3)
-  * [4 June 2021](#4-june-2021)
-    * [Action items:](#action-items-4)
-  * [11 June 2021](#11-june-2021)
-    * [Action items:](#action-items-5)
-  * [18 June 2021](#18-june-2021)
-    * [Action items:](#action-items-6)
-  * [2 July 2021](#2-july-2021)
-    * [Action items:](#action-items-7)
-  * [22 July 2021](#22-july-2021)
-    * [Action items:](#action-items-8)
-  * [23 July 2021](#23-july-2021)
-    * [Action items:](#action-items-9)
-  * [06 August 2021](#06-august-2021)
-    * [Action items:](#action-items-10)
-  * [13 August 2021](#13-august-2021)
-    * [Action items:](#action-items-11)
-  * [20 August 2021](#20-august-2021)
-    * [Action items:](#action-items-12)
-  * [24 & 27 August 2021](#24-27-august-2021)
-    * [Action items:](#action-items-13)
-  * [7 September 2021](#7-september-2021)
-    * [Action items:](#action-items-14)
 # Introduction
@@ -410,103 +379,3 @@ Meeting ID: 860 7262 9011 Password: 001303
 * [x] Look into skymap calculations, theta to declination conversion as JR pointed out.
 * [x] Wrap up the code review.
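
For the theta to declination item above, a minimal sketch of the usual conversion, assuming the skymap angles follow the HEALPix convention (theta is the colatitude in radians measured from the north pole, phi is the longitude in radians). The function name `colat_lon_to_ra_dec` is hypothetical and is not taken from the lensid code.

```python
import math


def colat_lon_to_ra_dec(theta, phi):
    """Convert (colatitude, longitude) in radians to (RA, Dec) in degrees.

    Assumes the HEALPix convention: theta in [0, pi], measured from the
    north pole, so declination = 90 deg - theta.
    """
    if not 0.0 <= theta <= math.pi:
        raise ValueError("theta must lie in [0, pi]")
    dec = 90.0 - math.degrees(theta)   # colatitude -> declination
    ra = math.degrees(phi) % 360.0     # wrap longitude into [0, 360)
    return ra, dec


# The north pole (theta = 0) maps to Dec = +90 deg.
print(colat_lon_to_ra_dec(0.0, 0.0))
```

A common slip is to treat theta as a latitude, which flips the map north-south, so a check at the poles is a cheap sanity test.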
-## 3 December 2021
-* Discuss the plot going to the paper. [notebook](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/O3_ML_gwtc3/plot_ml_blu_predictions.ipynb)
-* Skymap sanity check. [notebook](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/review/hp_cartview.ipynb)
-* XGBQT optimisation, include phenom and include power exploration etc. 
-* Updated [results](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/tree/master/O3_ML_gwtc3/results_3dec_kaggle) with more events considered by Posterior overlap analysis. [notebook](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/O3_ML_gwtc3/get_candidates_compare_to_blu_tagged.ipynb)
-### Action items:
-* [ ] Fix learning rates for densenets as JR suggested.
-* [x] Compare ML FPP and BLU FPP with [GOLUM's CLU](https://docs.google.com/spreadsheets/d/1wrpzwudP1MbraJNlCB6arnTCzbeuCPLBx16eReMdQvM/edit#gid=899668020). [notebook](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/review/comparison_with_golum.ipynb)
-* [x] Share updated results with Justin, after optimisation perhaps.
-* [x] Figure out why the FPP of unlensed pairs in the O3a background and in the test set does not go below 1e-3, whereas the FPP of test lensed pairs goes down to 1e-5.
-* [x] Produce whitened QTs results.
-* [x] Optimise XGB with QTs using more information, missing data, etc. [script](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/development/optimise_densenets.py) [notebook](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/development/optimise_XGB_QTs.ipynb)
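
The FPP comparisons above can be illustrated with a small sketch. It assumes FPP means a false positive probability estimated empirically as the fraction of background (unlensed) pairs whose ML score is at least as high as the candidate's. That reading, the function name `empirical_fpp`, and the toy scores are all assumptions for illustration, not the pipeline's actual implementation.

```python
from bisect import bisect_left


def empirical_fpp(candidate_score, background_scores):
    """Fraction of background scores >= candidate_score.

    Hypothetical estimator: with N background pairs, the smallest
    non-zero value it can return is 1/N.
    """
    if not background_scores:
        raise ValueError("background_scores must not be empty")
    ranked = sorted(background_scores)
    # Index of the first background score that is >= candidate_score.
    first_at_or_above = bisect_left(ranked, candidate_score)
    return (len(ranked) - first_at_or_above) / len(ranked)


# Toy background of four unlensed-pair scores (made-up numbers).
background = [0.1, 0.2, 0.3, 0.9]
print(empirical_fpp(0.25, background))
```

With an estimator of this kind, the size of the background sample sets the resolution of the FPP, which is worth keeping in mind when comparing FPP floors across data sets of different sizes.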
-## 10 December 2021
-* The O3 results status update [page](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/wikis/O3-result-updates) 
-* Discuss the plots and FPPs going to the paper.
-* Corner plots for the image pairs.
-* Discussed a few events that have BLU FPP = 1 but ML FPP < 1e-2.
-### Action items:
-* [x] Write a script or notebook for investigating a particular pair. [notebook](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/review/investigations_visualisations.ipynb)
-## 21 December 2021
-* We discussed the ML and BLU results comparison to GOLUM. [notebook](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/review/comparison_with_golum.ipynb)
-* We investigated the 4 outliers in the ML FPP vs. GOLUM CLU comparison and found them all to be associated with the same single-detector event, using this [notebook](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/review/investigations_visualisations.ipynb).
-* JR suggested testing machine learning with whitening, and investigating the issue with single-detector events like this further.
-* The next meeting will most likely be on January 4, 3 PM CEST.
-### Action items:
-* [x] Investigate the ML with QTs for the 4 outliers.
-* [x] Make a result review page. [O3b lensing result review](https://git.ligo.org/srashti.goyal/lensid/-/wikis/Result-Review-O3b-lensing)
-## 5 January 2022
-* We discussed the performance of XGBoost with QTs for the 2 detector real events.
-* Imputing the dense outputs (i.e. the input features of XGBoost) for the missing data helps, but why it helps is not yet understood. [notebook](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/review/missing_strain_ML_qts.ipynb)
-* We converged on the training configurations for the final ML results for O3.
-### Action items:
-* [x] Retrain XGBoost with NaNs inserted at random, without filling 1 in place of the NaNs.
-## 12 January 2022
-* We discussed training XGBoost with missing values, i.e. for single- or double-detector events. Imputing before training seems to be the only feasible option.
-* Imputation using a mean/median strategy is implemented [here](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/tree/master/review). Another possibility is regression-based imputation, as in [this](https://medium.com/swlh/impute-missing-values-the-right-way-c63735fccccd) post.
-* We also discussed the performance of the current pipeline with simulated super subthreshold lensed and unlensed event pairs. The performance is not bad. [notebook](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/subthreshold/test_super_sub_pairs.ipynb)
-* Lastly, we saw that the [pair S190412m & S200129m](https://docs.google.com/presentation/d/1k-xPvD8iAknxt-JVfI-Fbr4SfS_Adnlh3Q8u1fR4KJc/edit#slide=id.p), which is thought to be a type 2 image by GOLUM, does not seem to be lensed: it is eliminated by both the PO and lensid pipelines and has almost zero mass and sky overlap, as seen [here](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/967e30eb7166546b8a858cdf68ab0a5275730119/review/investigations_visualisations.ipynb).
-* We will meet next Wednesday 3 pm CEST.
-### Action items:
-* [x] Solve single/double detector issue.
-* [x] Produce results with whitening for the O3.
-* [x] Produce the final O3 results asap.
-* [x] Produce preliminary results for the super-sub pairs [found](https://sites.google.com/ligo.org/gstlal-sensedb/search?) by Alvin.
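
The median imputation discussed above can be sketched in a few lines. This is a pure-Python stand-in written for illustration, not the code in the linked review directory; scikit-learn's `SimpleImputer(strategy="median")` performs the same column-wise fill. The function name `impute_median` and the toy feature values are hypothetical.

```python
import math
from statistics import median


def impute_median(rows):
    """Replace NaNs in each column with the median of that column's observed values.

    `rows` is a list of equal-length lists of floats, e.g. one row per
    event pair and one column per detector feature (illustrative layout).
    """
    n_cols = len(rows[0])
    medians = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if not math.isnan(row[j])]
        if not observed:
            raise ValueError(f"column {j} has no observed values")
        medians.append(median(observed))
    # Build a new table; the input is left unmodified.
    return [
        [medians[j] if math.isnan(value) else value for j, value in enumerate(row)]
        for row in rows
    ]


# Toy feature table: NaN marks a detector with no data (made-up numbers).
features = [
    [0.9, float("nan")],
    [0.7, 0.2],
    [float("nan"), 0.4],
    [0.1, 0.6],
]
print(impute_median(features))
```

The medians should be computed on the training set only and then reused for real events, so that the imputed values do not depend on the data being classified.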
-## 25 January 2022
-* We discussed the preliminary results for the O3 super subthreshold pairs. [notebook](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/subthreshold/get_candidates_super_sub.ipynb)
-* We discussed the ML and BLU correlations with GOLUM and the changes after using imputation for single/double detector events.
-* We also discussed the variation in results after retraining and including whitening. After retraining, the ML is stricter, i.e. it classifies fewer pairs as lensed at FPP < 1e-2. We decided to continue with the older machines (with imputation), as whitening does not show much improvement and the pairs have already been followed up by the other pipelines. However, we hope to investigate these further in the future.
-* We [merged](https://git.ligo.org/srashti.goyal/lensid/-/merge_requests/2) the pipeline changes that add separate data directories for super- and subthreshold events, along with the minor changes for single/double detector events in the ML with QTs.
-### Action items:
-* [ ] Gather injection parameters for O3b background from Apratim.
-* [x] Update code review and result review page.
-* [x] Update the plot that is supposed to go into the paper.
-* [x] Keep investigating and improving. Make a note of the possible improvements for O4 and O5.
-## 10th and 16th February 2022
-* We discussed the two different strategies for imputation: sklearn's median and verstack's regression. sklearn seems to be a slightly better choice. [notebook](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/review/investigate_missing_strain_ML_qts.ipynb).
-* We also discussed how skymaps, and hence their chance of overlap, are affected by the number of detectors. This issue could perhaps be addressed by simulating a background of single/double detector events for O4 development.
-* We also discussed that in the ML with QTs the relation between the waveforms from different detectors is not taken into account, and that the skymaps and QTs are put on an equal footing when calculating the FPPs. This is again something to think about for O4.
-* We discussed the pair (S191103a-S191105e) that has had the highest CLU with hanabi so far; it is selected by the PO analysis but missed by the ML lensid pipeline. [slides](https://docs.google.com/presentation/d/1d_f5mUeT5PonFH3dd0c0VxlPxo9Hwd4uYOJRkhocRpQ/edit#slide=id.p). There are two possible reasons: one is that the bayestar skymaps **used** are incorrect or of poor quality, the other is that the ML does not perform well with QTs for low-mass binaries.
-* The ML is not trained with a sufficient number of low-mass binaries, which JR suggested should be part of the development for O4.
-### Action items:
-* [ ] Check the bayestar skymaps for all events, especially for S191103a-S191105e, and redo the analysis based on the correct channels and bayestar skymaps.
-* [ ] Make a note of all possible improvements for O4.