Jean-Rene Cudell · b1322198
--- a/Result-Review-O3b-lensing.md
+++ b/Result-Review-O3b-lensing.md
 [[_TOC_]]

 # Introduction
+
 This repository deals with the result review of the machine learning based pipeline, [**lensid**](https://git.ligo.org/srashti.goyal/lensid), for the LVK O3b lensing analysis, to identify the potential strongly lensed candidate BBH event pairs.

 ## Useful links
+
 - Code review page: [here](https://git.ligo.org/srashti.goyal/lensid/-/wikis/Code-Review/)
 - Code review statement: [here](https://git.ligo.org/srashti.goyal/lensid/-/wikis/Code-review-statement-for-lensid)
 - O3 analysis repository: [here](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/tree/master/)
@@ -18,7 +20,6 @@ This repository deals with the result review of the machine learning based pipel
 ## Reviewer: Jean-Rene Cudell

 ## Scripts/Configs
-
 | Script | Short description | Status | git hash | Comment | final sign-off |
 |--------|-------------------|--------|----------|---------|----------------|
 | [data_download.py](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/data_download_preparation/data_download.py) | Script for downloading event info and skymaps (.fits) from GraceDB, and strain data from LIGO servers using GWpy. Note: needs a valid LIGO key path (line 26) for accessing non-public data in GraceDB, e.g. `/tmp/x509up*` | DC: OK | 2e3215024a42c08081d612c3713ffd54fbba5f7e | ------- | -------------- |
@@ -27,7 +28,6 @@ This repository deals with the result review of the machine learning based pipel
 | [get_candidates_compare_to_blu_tagged.ipynb](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/O3_ML_gwtc3/get_candidates_compare_to_blu_tagged.ipynb) | Notebook for comparing ML and BLU results for the full O3 catalogue of BBHs. | JR: OK | ---------- | --------- | ---------------- |
 | [condor_lensid_make_predictions.py](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/condor_lensid_make_predictions.py) | Condor script for computing ML predictions and FPPs using [ml_predict_workflow.py](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/package/lensid/ml_predict_workflow.py). E.g.: `python condor_lensid_make_predictions`. Note: change `exec_file_loc` in the script according to your installation, and `odir` in the config file. | OK-DC | ca47c5ce71fa7405b84c944235a8646abcd216d4 | --------- | ---------------- |

-
 ## Investigations
 | Notebook | Short description | Status | git hash | Comment | final sign-off |
 |----------|-------------------|--------|----------|---------|----------------|
@@ -38,23 +38,22 @@ This repository deals with the result review of the machine learning based pipel
 | [background_injections_ML_blu.ipynb](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/notebooks/O3a_events/background_injections_ML_blu.ipynb) | Notebook showing ML and BLU outputs for the background unlensed injections as simulated by Haris during O3a analysis. | JRC: OK | -------- | ------- | -------------- |
 | [investigations_visualisations.ipynb](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/review/investigations_visualisations.ipynb) | Notebook to visualise qtransforms and skymaps of the interesting candidate real event pairs. | JRC: OK |  | --------- | ---------------- |
 | [optimise_densenets.py](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/development/optimise_densenets.py) | Optimise densenet learning rates, with and without whitening of strain | JRC: OK | ------- | --------- | ---------------- |
-|[missing_strain_ML_qts.ipynb](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/review/missing_strain_ML_qts.ipynb) | Implement XGBoost with QTs using imputed values instead of 1s(default) for the single/double detector real events. Additionally compare the results of PO and ML to Golum for the selected candidates. |  Needs further investigation  | ------- | --------- | ---------------- |
+| [missing_strain_ML_qts.ipynb](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/review/missing_strain_ML_qts.ipynb) | Implement XGBoost with QTs using imputed values instead of 1s (default) for the single/double detector real events. Additionally compare the results of PO and ML to Golum for the selected candidates. | JRC: OK | ------- | --------- | ---------------- |

 ## Results

 In progress: [page](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/wikis/O3-result-updates)

 ## O3b paper
-
 | Item | Short description | Status | git hash | Comment | final sign-off |
-|----------|-------------------|--------|----------|---------|----------------|
+|------|-------------------|--------|----------|---------|----------------|
 | Final config | ------------------- | -------- | ---------- | --------- | ---------------- |
 | Final results summary | ------------------- | -------- | ---------- | --------- | ---------------- |
 | PO and ML plot | ------------------- | -------- | ---------- | --------- | ---------------- |

 ## Additional: Targeted Sub-threshold Search
 | Script/Notebook | Short description | Status | git hash | Comment | final sign-off |
-|--------|-------------------|--------|----------|---------|----------------|
+|-----------------|-------------------|--------|----------|---------|----------------|
 | [sub_data_download_prep.ipynb](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/subthreshold/sub_data_download_prep.ipynb) | Subthreshold candidates data downloading and preparing Qtransforms and input sky features for ML analysis | JRC: OK | ---------- | --------- | ---------------- |
 | [config_O3_super_sub_events.yaml](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/subthreshold/configs/config_O3_super_sub_events.yaml) | Config used for producing the preliminary results with the `lensid_make_predictions` command line. | JRC: OK | ---------- | --------- | ---------------- |
 | [get_candidates_super_sub.ipynb](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/subthreshold/get_candidates_super_sub.ipynb) | Notebook for loading the results and sorting candidates | JRC: OK | ---------- | --------- | ---------------- |
@@ -71,49 +70,51 @@ In progress: [page](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/wikis/O3-r
 ### Action items:

 * [x] Fix learning rates for densenets as JR suggested.
-* [x] Compare ML FPP and BLU FPP with [GOLUM's CLU]. (https://docs.google.com/spreadsheets/d/1wrpzwudP1MbraJNlCB6arnTCzbeuCPLBx16eReMdQvM/edit#gid=899668020). [notebook](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/review/comparison_with_golum.ipynb)
+* [x] Compare ML FPP and BLU FPP with [GOLUM's CLU](https://docs.google.com/spreadsheets/d/1wrpzwudP1MbraJNlCB6arnTCzbeuCPLBx16eReMdQvM/edit#gid=899668020). [notebook](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/review/comparison_with_golum.ipynb)
 * [x] Share updated results with Justin, after optimisation perhaps.
 * [x] Figure out why the unlensed-pair FPP in the O3a background and in the test set does not go below 1e-3, whereas the test lensed pairs reach 1e-5.
 * [x] Produce whitened QTs results.
 * [x] Optimise XGB with QTs, with more info, missing data, etc. [script](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/development/optimise_densenets.py) [notebook](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/development/optimise_XGB_QTs.ipynb)

 ## 10 December 2021
+
 * The O3 results status update [page](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/wikis/O3-result-updates)
 * Discussed the plots and FPPs going into the paper.
 * Corner plots for the image pairs.
 * Discussed a few events which have BLU FPP 1 but ML FPP < 1e-2.

 ### Action items:
+
 * [x] Write a script or notebook for investigating a particular pair. [notebook](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/review/investigations_visualisations.ipynb)

 ## 21 December 2021
+
 * We discussed the ML and BLU results comparison to GOLUM. [notebook](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/review/comparison_with_golum.ipynb)
 * We investigated the 4 outliers in ML FPP vs. GOLUM CLU and found them all to be associated with the same single-detector event, using this [notebook](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/review/investigations_visualisations.ipynb).
 * JR suggested testing machine learning with whitening, and investigating the issue with single-detector events like this further.
 * The next meeting call shall most likely be on January 4, 3 PM CEST.

 ### Action items:
+
 * [x] Investigate the ML with QTs for the 4 outliers.
 * [x] Make a result review page. [O3b lensing result review](https://git.ligo.org/srashti.goyal/lensid/-/wikis/Result-Review-O3b-lensing)

-
 ## 5 January 2022
+
 * We discussed the performance of XGBoost with QTs for the 2 detector real events.
-* Imputing the dense outputs(i.e. input features of XGBoost) for the missing data help but is not logically understood. (notebook)[https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/review/missing_strain_ML_qts.ipynb] 
+* Imputing the dense outputs (i.e. input features of XGBoost) for the missing data helps but is not logically understood. [notebook](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/review/missing_strain_ML_qts.ipynb)
 * We converged on the training configurations for the final ML results for O3.

 ### Action items:
+
 * [x] Retrain XGBoost by inserting NaNs randomly, without filling 1 in place of NaNs.

 ## 12 January 2022
-* We discussed about training XGBoost with missing values. i.e. for single or double detector events. Imputing before training is the only option seems feasible.

+* We discussed training XGBoost with missing values, i.e. for single- or double-detector events. Imputing before training seems to be the only feasible option.
 * Imputing using a mean/median strategy is implemented [here](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/tree/master/review). Another possibility is regression-based imputation, as in [this](https://medium.com/swlh/impute-missing-values-the-right-way-c63735fccccd) post.
-
 * We also discussed the performance of the current pipeline with simulated super subthreshold lensed and unlensed event pairs. The performance is not bad. [notebook](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/subthreshold/test_super_sub_pairs.ipynb)
-
 * Lastly, we saw that [pair S190412m & S200129m](https://docs.google.com/presentation/d/1k-xPvD8iAknxt-JVfI-Fbr4SfS_Adnlh3Q8u1fR4KJc/edit#slide=id.p), which is thought to be a type 2 image by Golum, doesn't seem to be lensed: it is eliminated by both the PO and lensid pipelines and also has almost zero mass and sky overlaps, as seen [here](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/967e30eb7166546b8a858cdf68ab0a5275730119/review/investigations_visualisations.ipynb).
-
 * We will meet next Wednesday 3 pm CEST.
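
The median imputation discussed above can be sketched as follows. This is only an illustration in plain Python, not the code from the linked review directory: the function name `impute_median`, the feature values, and the H1/L1/V1 column ordering are all assumptions. Each missing entry (NaN) is replaced by the median of the observed values in the same column, which is the column-wise behaviour of scikit-learn's `SimpleImputer(strategy="median")`.

```python
import math
from statistics import median


def impute_median(rows):
    """Return a copy of rows with NaNs replaced by the per-column median.

    Hypothetical helper for illustration; medians are computed from the
    observed (non-NaN) values of each column only.
    """
    n_cols = len(rows[0])
    medians = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if not math.isnan(row[j])]
        # A column with no observed value cannot be imputed; keep NaN there.
        medians.append(median(observed) if observed else math.nan)
    return [
        [medians[j] if math.isnan(value) else value for j, value in enumerate(row)]
        for row in rows
    ]


# Assumed toy data: one row per event pair, one column per detector
# (H1, L1, V1). NaN marks a detector without strain data.
features = [
    [0.75, 0.5, math.nan],
    [0.25, math.nan, 0.125],
    [0.5, 1.0, 0.375],
]
imputed = impute_median(features)
```

Here the missing V1 value of the first pair is filled with the median of the two observed V1 values, and the missing L1 value of the second pair with the median of the two observed L1 values. The imputed table can then be passed to a classifier that does not accept missing inputs.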

 ### Action items:
@@ -126,58 +127,40 @@ In progress: [page](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/wikis/O3-r
 ## 25 January 2022

 * We discussed the preliminary results for the O3 super subthreshold pairs. [notebook](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/subthreshold/get_candidates_super_sub.ipynb)
-
 * We discussed the ML and BLU correlations with GOLUM and the changes after using imputation for single/double detector events.
-
 * We also discussed the variation in results after retraining and including whitening. After retraining, the ML is stricter, i.e. it flags fewer pairs as lensed at FPP < 1e-2. We decided to continue with the older machines (with imputing), as whitening does not show much improvement and the pairs have already been followed up by the other pipelines. However, we also hope to investigate these further in future.
-
 * We [merged](https://git.ligo.org/srashti.goyal/lensid/-/merge_requests/2) the pipeline changes that support different data directories for super- and sub-threshold events, along with the minor changes for single/double detector events in the ML with QTs.

 ### Action items:

 * [ ] Gather injection parameters for O3b background from Apratim.
-
 * [x] Update code review and result review page.
-
 * [x] Update the plot that is supposed to go into the paper.
-
 * [x] Keep investigating and improving. Make a note of the possible improvements for O4 and O5.

 ## 10th and 16th February 2022

 * We discussed the two different strategies for imputation: sklearn's median and verstack's regression. It seems that sklearn is a slightly better choice. [notebook](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/review/investigate_missing_strain_ML_qts.ipynb).
-
 * We also discussed skymaps being affected by the number of detectors, and hence their chance of overlap. This issue could perhaps be addressed by simulating a single/double detector event background, for O4 development.
-
 * We also discussed that in ML with QTs the relation between the waveforms from different detectors is not taken into account; also, the skymaps and QTs are put on an equal footing when calculating the FPPs. This is again something to think about for O4.
-
 * We discussed the pair (S191103a-S191105e) that had the highest CLU with hanabi so far, and is selected by the PO analysis but missed by the ML lensid pipeline. [slides](https://docs.google.com/presentation/d/1d_f5mUeT5PonFH3dd0c0VxlPxo9Hwd4uYOJRkhocRpQ/edit#slide=id.p). There are two possible reasons: one is that the bayestar skymaps **used** are incorrect or not good, and the other is that the ML is not doing well with QTs for low mass binaries.
-
 * The ML is not trained with a good number of low mass binaries, which JR suggested should be in development for O4.

 ### Action items:

 * [x] Check the bayestar skymaps for all events, especially for S191103a-S191105e, and re-do the analysis based on correct channels and bayestar skymaps.

-
 ## 1st and 9th March 2022

 * The input strains, Qtransforms, bayestar skymaps along with the PE skymaps are re-downloaded, using the channels and deglitched frames (using [config](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/data_download_preparation/events_config_golum_180222.json)) consistent with PE. Scripts modified: [data_download.py](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/data_download_preparation/data_download.py), [data_prepare.py](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/data_download_preparation/data_prepare.py).
-
 * Many of the events in O3 are double-detector events, hence we changed the strategy for getting predictions for ML with Qtransforms. Instead of using the XGBoost machine, we now multiply the DenseNet outputs for the individual detectors. [notebook](https://git.ligo.org/srashti.goyal/lensid-ml-o3/-/blob/master/review/missing_strain_ML_qts_product.ipynb)
-
 * We also discussed the pair S191103a-S191105e. After using the PE skymaps the ML FPP lowered, but still below the threshold. JR pointed out that S191105e has a weird-looking Qtransform in the L1 detector: it's not like a chirp but like a broken chirp, with intensity in the middle rather than at the end.
-
 * We discussed the latest results with ML, with the new strategy and data, and using PE skymaps. The candidates are followed up by Golum. We also discussed how the ML results compare with the PO and Golum results. The lensing group was updated about it in a call. Slides 33-36 [here](https://docs.google.com/presentation/d/10bIhtFae5RIJ3WBJg1Lcy7PueSKwxh1m2APRDN0w0PA/edit?usp=sharing).
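
As a rough illustration of the new strategy of multiplying the per-detector DenseNet outputs, the sketch below combines the scores of whichever detectors have data. It is not taken from the linked notebook: the function name `combined_qt_score`, the score values, and in particular the choice to skip detectors without data are assumptions made here for illustration.

```python
import math


def combined_qt_score(per_detector_scores):
    """Multiply the DenseNet scores of the detectors that have data.

    Hypothetical helper: a detector with no strain data is marked None and
    is skipped (an assumption, not necessarily what the pipeline does).
    """
    available = [s for s in per_detector_scores.values() if s is not None]
    if not available:
        raise ValueError("no detector has data for this event pair")
    return math.prod(available)


# Assumed toy scores for one event pair observed in H1 and L1 only.
pair_scores = {"H1": 0.5, "L1": 0.25, "V1": None}
score = combined_qt_score(pair_scores)
```

Skipping a detector is equivalent to multiplying by 1, so under this assumption a pair seen in fewer detectors is not penalised for the missing one. Since each score is at most 1, adding a detector can only lower the product or leave it unchanged.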

 ### Action items:

 * [ ] Update code review page with relevant git hashes and all the scripts/configs.
-
 * [ ] Visually inspect the pairs which are missed or preferred by ML, and try to identify the reason(s).
-
 * [ ] Check if the background is needed to be updated.
-
 * [ ] Make note of all possible improvements for O4.
-
 * [ ] Make the plot that will go into the paper, with the latest ML and PO results.
\ No newline at end of file