Script for downloading event information from CBCFlow (path loaded on CIT), skymaps (`.fits`) from GraceDB, and strain data from LIGO servers using GWpy. Note: a valid LIGO key path is needed to access non-public data in GraceDB, e.g. `/tmp/x509up*`. Raw data on CIT is in `/home/srashti.goyal/lensid_O4/data_download_preparation/O4a_events_data`.
</td>
<td>OK</td>
<td>
</td>
<td></td>
<td>Include a link to the instructions for generating the LIGO key path in the comments.</td>
<td>
</td>
<td></td>
</tr>
<tr>
<td>
...
...
@@ -86,79 +84,52 @@ Prepare cartesian skymaps and qtransforms for O4a the real events. Also filters
| [pop_datasets.ipynb](https://git.ligo.org/srashti.goyal/lensid-ml-o4/-/blob/master/retraining_for_O4/pop_datasets.ipynb) | Notebook having plots of injection parameters for training and testing. | | \-------------- | ||
|[PSD_plots.ipynb](https://git.ligo.org/srashti.goyal/lensid-ml-o4/-/blob/master/retraining_for_O4/PSD_plots.ipynb) | Comparison of the PSDs used for training and testing in O3 and O4 etc. |Done | OK, OK-Sourabh | 5f525a652d0374f11a207bc8158f7c75d37de884 |:heavy_check_mark: :heavy_check_mark: |
| ML QTs [L1](https://ldas-jobs.ligo.caltech.edu/~srashti.goyal/O4a_training/L1/uniform_config_lr_0.01_ep_15_bs_500/), [H1](https://ldas-jobs.ligo.caltech.edu/~srashti.goyal/O4a_training/uniform_config_lr_0.01_ep_15_bs_500/) | Production ML QTs models train directories. Uniform in masses for H1 and L1 | | | \-------------- | |
| [2vs3det.ipynb](https://git.ligo.org/srashti.goyal/lensid-ml-o4/-/blob/master/retraining_for_O4/2vs3det.ipynb) | Skymaps ML performance for HL v/s HLV | OK | 5f525a652d0374f11a207bc8158f7c75d37de884 | Why some values are NaN in DataFrame?, Why no orange curve in plots?, Choice of hyper parameters for training? - Sourabh | :heavy_check_mark: |
| [make_predictions.ipynb](https://git.ligo.org/srashti.goyal/lensid-ml-o4/-/blob/master/make_predictions.ipynb?ref_type=heads) | ML models benchmark performance and background estimations along with MDC results for the final model. | In development | | \-------------- | |
ML Models
O4a Predictions
| File(s) | Short description | Status | Comment | Reviewed |
| [pop_datasets.ipynb](https://git.ligo.org/srashti.goyal/lensid-ml-o4/-/blob/master/retraining_for_O4/pop_datasets.ipynb) | Notebook having plots of injection parameters for training and testing. | | | \-------------- |
| ML QTs [L1](https://ldas-jobs.ligo.caltech.edu/~srashti.goyal/O4a_training/L1/uniform_config_lr_0.01_ep_15_bs_500/), [H1](https://ldas-jobs.ligo.caltech.edu/~srashti.goyal/O4a_training/uniform_config_lr_0.01_ep_15_bs_500/) | | | \-------------- |
| [Config](https://git.ligo.org/srashti.goyal/lensid-ml-o4/-/blob/master/config_o4a.yaml?ref_type=heads) | ML models and config for production runs | Needs to be finalised | | \-------------- |
|[make_predictions.ipynb](https://git.ligo.org/srashti.goyal/lensid-ml-o4/-/blob/master/make_predictions.ipynb?ref_type=heads) | ML models benchmark performance and background estimations along with MDC results for the final model. | In development | | \-------------- |
| [investigations_visualisations.ipynb](https://git.ligo.org/srashti.goyal/lensid-ml-o4/-/blob/master/investigations_visualisations.ipynb?ref_type=heads) | Investigating/Eyeballing pairs and compare the performances with the other pipelines | | | \-------------- |
| File(s) | Short description | Status | Comment | Git hash | Sign-off |
| [Config](https://git.ligo.org/srashti.goyal/lensid-ml-o4/-/blob/master/config_o4a.yaml?ref_type=heads) | ML models and config for production runs | Needs to be finalised | | \-------------- || |
| [result dir](https://ldas-jobs.ligo.caltech.edu/~srashti.goyal/lensid_runs/O4a_alensid/result/) | ML models and config for production runs | Needs to be finalised | | \-------------- || |
| [investigations_visualisations.ipynb](https://git.ligo.org/srashti.goyal/lensid-ml-o4/-/blob/master/investigations_visualisations.ipynb?ref_type=heads) | Investigating/Eyeballing pairs and comparing the performances with the other pipelines | | | \-------------- ||
Publication results:
| File(s) | Short description | Status | Comment | Git hash | Sign-off |
| [Interesting Pairs]() | pairs passed for follow-up analysis | Needs to be finalised | | \-------------- || |
| [results_plot]() | plot of rejected and selected pairs | Needs to be finalised | | \-------------- || |
# Review Calls
...
...
@@ -234,7 +205,7 @@ The review call happen on Wednesdays 1 PM CEST/ 4:30 PM IST virtual IFPA room: h
## 8 November 2023
* We discussed the integration of lensid with the lensing flow and visited the new package : https://git.ligo.org/srashti.goyal/alensidforlensingflow
* JR suggested using the O4a real-noise PSDs for the training and testing of the final ML model for the production runs. [detchar](https://ldas-jobs.ligo-wa.caltech.edu/~detchar/summary)
* We are still unsure about the inclusion of the population and the time delay lensing priors for the follow-up strategy.
## Action items
...
...
@@ -242,16 +213,15 @@ The review call happen on Wednesdays 1 PM CEST/ 4:30 PM IST virtual IFPA room: h
* [x] Update the result review page with the new scripts.
* [x] Note the changes in the reviewed code.
## 26 January 2024
* We discussed JR's questions about the PSDs and the use of threads.
* We think documenting the population choices and the 2 v/s 3 detector comparison would be useful for the future.
## 12 February 2024
* We discussed the first preliminary results for O4a: 94/3486 pairs have FPP \< 0.01.
* The visual investigations and the way events are passed on need to be fixed, especially for events with m2 \< 5 Msun, which LensID doesn't consider.
* We discussed the configuration file and things that need to be finally reviewed.
* We also discussed the subthreshold events workflow and how to go about it.
...
...
@@ -263,14 +233,12 @@ The review call happen on Wednesdays 1 PM CEST/ 4:30 PM IST virtual IFPA room: h
* [x] Check event with the missing skymap.
* [x] Compare results with other pipelines.
We also want to prepare a document to quantify all the things. One main issue is the bias-variance trade-off. https://www.bmc.com/blogs/bias-variance-machine-learning/
## 19 February 2024
* We did the investigations of preliminary results.
* S230630bq and S231226av have some problems with their skymaps.
* We compared the results with BLU and the Bhattacharyya distance.
### Action items
...
...
@@ -280,7 +248,7 @@ We also want to prepare a document to quantify all the things. One main issue is
## 26 February 2024
* We discussed the preliminary results once again.
* We think that a Bhattacharyya distance \< 3 should be used as an additional quick cut for passing on the events in the flow.
* The `S230606d S231226av` event needs extrapolation and its skymap is inconsistent between PE and Gracedb
* We talked about O4b and integration choice with the new ML pipeline SLICK.
...
...
@@ -288,39 +256,33 @@ We also want to prepare a document to quantify all the things. One main issue is
* [x] Implement extrapolation while calculating FPP, for events in the edge.
## 13 May
* Discussed the outstanding action items.
* We also discussed whether an extra reviewer or analyst is needed. As things are close to completion, we don't think it's required.
* We went through the script for calculating FAPs; it needs some modifications for low FAPs.
* We discussed why SLICK might be doing better for QTs. One possibility is the training set size; another is the training procedure, i.e. they freeze the initial 10 layers. We need to talk about the integration for O4a/O4b.
* We also discussed the population model used for training. It is still a tough choice, but we may want to use the astro distribution, as it seems to do well on Haris et al. as well.
* [x] Fix extrapolation or low FAP values that are going to 0.
* [x] Implement the BD \<3 cut in the final is_lensing_favoured output.
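
One common way to keep an empirically estimated FAP from going to exactly 0 is to add one to both the count and the background size, so that the smallest reportable value is 1/(N+1). The sketch below only illustrates that convention. The function name `empirical_fap` and the toy background values are hypothetical, and this is not necessarily how the reviewed script handles extrapolation.

```python
from bisect import bisect_left


def empirical_fap(statistic, background):
    """Fraction of background values >= statistic, floored at 1/(N+1).

    `background` is a list of ranking statistics from unlensed pairs
    (hypothetical input; a larger statistic means more lensing-like).
    """
    ordered = sorted(background)
    n = len(ordered)
    # Number of background entries greater than or equal to the statistic.
    n_louder = n - bisect_left(ordered, statistic)
    # The +1 terms keep the estimate strictly positive.
    return (n_louder + 1) / (n + 1)


# Toy background, for illustration only.
background = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
print(empirical_fap(0.95, background))  # louder than all background: 1/10
print(empirical_fap(0.05, background))  # quieter than all background: 10/10
```

With this convention the resolution of the estimate is set by the background size, so reaching lower FAPs still requires a larger background or an explicit extrapolation of the tail.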
## 27 May 2024
* We discussed the preliminary results.
* We noticed that the BD \< 3 cut does not work well, given that the PE and matched-filter chirp masses can be very different.
* Saurabh is now on board with the results.
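
For reference, the Bhattacharyya distance between two binned distributions p and q is BD = -ln(sum_i sqrt(p_i q_i)), so identical distributions give 0 and non-overlapping ones give infinity. The following is a minimal one-dimensional sketch, assuming simple equal-width histograms over the joint range of the two sample sets. The function name, the binning, and the toy samples are hypothetical, and the pipeline may compute the BD differently (for example on multi-dimensional posteriors).

```python
import math


def bhattacharyya_distance(samples_a, samples_b, n_bins=20):
    """Histogram estimate of BD = -ln(sum_i sqrt(p_i * q_i))."""
    lo = min(min(samples_a), min(samples_b))
    hi = max(max(samples_a), max(samples_b))
    width = (hi - lo) / n_bins or 1.0  # guard against all-identical samples

    def normalised_histogram(samples):
        counts = [0] * n_bins
        for x in samples:
            # Clamp so that x == hi falls in the last bin.
            index = min(int((x - lo) / width), n_bins - 1)
            counts[index] += 1
        return [c / len(samples) for c in counts]

    p = normalised_histogram(samples_a)
    q = normalised_histogram(samples_b)
    coefficient = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    if coefficient == 0:
        # No overlapping bins: the distance is infinite.
        return math.inf
    return max(0.0, -math.log(coefficient))


# Toy samples, for illustration only.
identical = [1.0, 2.0, 3.0, 4.0]
print(bhattacharyya_distance(identical, identical))              # 0.0
print(bhattacharyya_distance([1.0, 1.1, 1.2], [9.0, 9.1, 9.2]))  # inf
```

A cut such as BD \< 3 then reduces to comparing the returned value with 3. Note that the estimate depends on the number of bins and on the number of samples.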
### Action items
* [x] Investigate the events with zero mass posterior overlaps.
## 14 June 2024
* We eyeballed events that have zero mass overlap but were selected by LensID.
* Saurabh reviewed some of the scripts and discussed conceptual points regarding the Bhattacharyya distance.
* We also discussed the finding of Adrien, and also of Saurabh, that GWpy QTs are better than PyCBC ones, which seems to be one of the reasons for the improvement seen with SLICK.
### Action items
* [ ] Prepare the scripts for result review.
* [ ] Train a final ML model and background while optimising.