[[_TOC_]]

# Intro

This page covers the changes made to the pipeline after the O3 run. For the code review of the pipeline before that, please refer to this [page](https://git.ligo.org/srashti.goyal/lensid/-/wikis/Code-Review).

Reviewer: Jean-Rene Cudell

[**Installation instructions**](https://git.ligo.org/srashti.goyal/strong-lensing-ml/-/wikis/Installation-instructions)
# Final Statement of Code Review

The final statement can be found [here](https://git.ligo.org/srashti.goyal/lensid/-/wikis/Code-review-statement-for-lensid-2023).

### Sign-off

* [ ] Jean-Rene:
# Relevant slides

[Till O3 method, review, results](https://docs.google.com/presentation/d/10bIhtFae5RIJ3WBJg1Lcy7PueSKwxh1m2APRDN0w0PA/edit?usp=sharing)

[O4](https://docs.google.com/presentation/d/1Lwmb-D-rCLF3Dr4gHbU5T9Rv3g6Mf0v2u9FN4UUI1lk/edit?usp=sharing)
# Developments

- [x] Training with O4 Gaussian noise PSD
- [x] Whitening
- [x] Change input method for Q-transforms: Superposition -> Superposition + individual
- [x] Data Generator for large
- [x] Single-detector training with a uniform-in-masses dataset, including subthreshold triggers; single-detector SNR: 4 to 40 (power law)
- [x] O4 MDC, available on [git](https://git.ligo.org/srashti.goyal/lensing_mdc_o4)
- [x] O4 simulated dataset generation
- [x] Machines trained and tested on O4 simulated noise
- [x] O4 uniform and astrophysical population data preparation
- [ ] Ensembling (to reduce overfitting and variance)
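
The ensembling item above is still open, so the following is only a minimal sketch of one common form of it: averaging the per-pair scores of several independently trained models. The function name `ensemble_mean` and the scores are hypothetical and are not taken from the lensid package.

```python
from statistics import mean


def ensemble_mean(predictions_per_model):
    """Average per-sample scores across models.

    predictions_per_model: list of equal-length lists, one per model,
    each holding a lensing score in [0, 1] for every event pair.
    """
    if not predictions_per_model:
        raise ValueError("need at least one model")
    n_samples = len(predictions_per_model[0])
    if any(len(p) != n_samples for p in predictions_per_model):
        raise ValueError("all models must score the same samples")
    # zip(*...) regroups the scores so that each tuple holds one pair's
    # scores from every model.
    return [mean(scores) for scores in zip(*predictions_per_model)]


# Hypothetical scores from three trained models on four event pairs.
model_scores = [
    [0.90, 0.20, 0.60, 0.10],
    [0.80, 0.30, 0.40, 0.05],
    [0.70, 0.10, 0.50, 0.15],
]
ensemble_scores = ensemble_mean(model_scores)
print(ensemble_scores)
```

Averaging is only one option; the ensemble actually adopted for the pipeline may combine the models differently.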
## Package Scripts

### Data preparation

| Script | Short description | Status | Old git hash | New git hash | Comment | Final sign-off |
|--------|-------------------|--------|--------------|--------------|---------|----------------|
| [qt_utils.py](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/package/lensid/utils/qt_utils.py) | Helper script for injecting Gaussian noise given a PSD and waveform. Also plots and saves Q-transforms. Added functionality: `.npz` support, `flow` (lower frequency), wide `qrange` (3, 30). | OK | 32d0854b1a68cf21827e65ca1c36feb7ca53d0f5 | 5e0e774713072dfeedafeb5a0cc2965dcd4d068c | [diff](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/diff_package_03052023_reviewed_2022.diff#L825-916) | :heavy_check_mark: |
| [lensid_create_qts_lensed_injs.py](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/package/lensid/injections/lensid_create_qts_lensed_injs.py) | Generates waveforms and Q-transforms for simulated lensed events given a set of injection parameters, using analytical/O3a PSDs. Eg: `lensid_create_qts_lensed_injs -odir check -start 10 -n 3 -infile ~/lensid/data/injection_pars/haris-et-al/lensed_inj_data.npz -psd_mode 1 -qrange 2 -mode 2`. Added a single-detector option (eg: `--single_det H1`); changed the injection parameter names, waveform approximant and default qrange. | OK | 32d0854b1a68cf21827e65ca1c36feb7ca53d0f5 | 5e0e774713072dfeedafeb5a0cc2965dcd4d068c | [diff](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/diff_package_03052023_reviewed_2022.diff#L45-199) Line 641: shouldn't the default be 'whitened'? Also, line 927: why is tensorflow commented out? | :heavy_check_mark: |
| [lensid_create_qts_unlensed_injs.py](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/package/lensid/injections/lensid_create_qts_unlensed_injs.py) | Generates waveforms and Q-transforms for simulated unlensed events given a set of injection parameters, using analytical/O3a PSDs. Eg: `lensid_create_qts_unlensed_injs -odir check -start 10 -n 3 -infile ~/lensid/data/injection_pars/haris-et-al/unlensed_inj_data.npz -psd_mode 1 -qrange 2 -mode 2` | OK | 32d0854b1a68cf21827e65ca1c36feb7ca53d0f5 | 5e0e774713072dfeedafeb5a0cc2965dcd4d068c | [diff](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/diff_package_03052023_reviewed_2022.diff#L201-343) Same comment as on the previous file. | :heavy_check_mark: |
| [lensid_png_to_npz.py](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/package/lensid/utils/lensid_png_to_npz.py) | Script for converting PNG transform images to `.npz` files for faster IO. Eg: `lensid_png_to_npz --indir check --outdir check_npz -n 3` | OK | NA | 1e813099b6c3d2824016d059f4230e398e099d0e | For the dataloader. | :heavy_check_mark: |
| [lensid_create_lensed_df.py](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/package/lensid/injections/lensid_create_lensed_df.py) | Generates a dataframe containing tags for lensed simulated event pairs, with columns img_0, img_1 and Lensing (=1). Eg: `lensid_create_lensed_df -odir check -outfile lensed.csv -start 10 -n 3 -infile ~/lensid/data/injection_pars/haris-et-al/lensed_inj_data.npz` | OK | 32d0854b1a68cf21827e65ca1c36feb7ca53d0f5 | 5e0e774713072dfeedafeb5a0cc2965dcd4d068c | [diff](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/diff_package_03052023_reviewed_2022.diff#L1-12) | :heavy_check_mark: |
| [lensid_create_unlensed_df.py](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/package/lensid/injections/lensid_create_unlensed_df.py) | Generates a dataframe containing tags for pairs of unlensed simulated events, with columns img_0, img_1 and Lensing (=0). Eg: `lensid_create_unlensed_df -odir check -outfile unlensed.csv -start 10 -n 3 -infile ~/lensid/data/injection_pars/haris-et-al/unlensed_inj_data.npz` | OK | 32d0854b1a68cf21827e65ca1c36feb7ca53d0f5 | 5e0e774713072dfeedafeb5a0cc2965dcd4d068c | [diff](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/diff_package_03052023_reviewed_2022.diff#L345-356) | :heavy_check_mark: |
| [lensid_create_lensed_inj_xmls.py](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/package/lensid/injections/lensid_create_lensed_inj_xmls.py) | Helper script that outputs the LAL inj.xml file for lensed simulated events, given the injection parameters, for bayestar. | OK | 32d0854b1a68cf21827e65ca1c36feb7ca53d0f5 | a46b1d4a9755bae8438baaf053d2fb552a0808b9 | [diff](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/diff_package_03052023_reviewed_2022.diff#L14-44) | :heavy_check_mark: |
| [lensid_create_unlensed_inj_xmls.py](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/package/lensid/injections/lensid_create_unlensed_inj_xmls.py) | Helper script that outputs the LAL inj.xml file for unlensed simulated events, given the injection parameters, for bayestar. Minor changes in the parameter names. | OK | 32d0854b1a68cf21827e65ca1c36feb7ca53d0f5 | a46b1d4a9755bae8438baaf053d2fb552a0808b9 | [diff](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/diff_package_03052023_reviewed_2022.diff#L358-388) | :heavy_check_mark: |
| [lensid_create_bayestar_sky_lensed_injs.sh](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/package/scripts/lensid_create_bayestar_sky_lensed_injs.sh) | Generates bayestar skymaps (.fits) for lensed simulated events, using analytical/O3a PSDs. Also converts them to cartesian format and saves them as .npz files. Eg: `lensid_create_bayestar_sky_lensed_injs.sh -o check -s 10 -n 3 -i ~/lensid/data/injection_pars/haris-et-al/lensed_inj_data.npz -p ~/lensid/data/PSDs/analytical_psd.xml` Note: if this does not work, try running `export PATH=$HOME/.local/bin:$PATH` first. | OK | 493ea099f42fc50d2cc081754d5395f57fafae76 | ------- | -------------- | :heavy_check_mark: |
| [lensid_create_bayestar_sky_unlensed_injs.sh](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/package/scripts/lensid_create_bayestar_sky_unlensed_injs.sh) | Generates bayestar skymaps (.fits) for unlensed simulated events, using analytical/O3a PSDs. Also converts them to cartesian format and saves them as .npz files. Eg: `lensid_create_bayestar_sky_unlensed_injs.sh -o check -s 10 -n 3 -i ~/lensid/data/injection_pars/haris-et-al/unlensed_inj_data.npz -p ~/lensid/data/PSDs/analytical_psd.xml` | OK | 493ea099f42fc50d2cc081754d5395f57fafae76 | ------- | -------------- | :heavy_check_mark: 83c208d82024a6abc49af67928a2d63653743ce3 |
| [lensid_fits_to_cart.py](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/package/lensid/utils/lensid_fits_to_cart.py) | Helper script for converting the HealPix skymap format (.fits) to cartesian. | OK | ac95f97e0c7e8d584b68ed364f353a5ed4bbb12d | unchanged | Need sanity check for hp.cartview during results review. | :heavy_check_mark: 83c208d82024a6abc49af67928a2d63653743ce3 |
| [lensid_sky_injs_cart.py](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/package/lensid/injections/lensid_sky_injs_cart.py) | Helper script for managing the IO of the fits_to_cart.py script for the injection study. | OK | 493ea099f42fc50d2cc081754d5395f57fafae76 | unchanged | OK | :heavy_check_mark: |
### Feature extraction, Train/test/predict utilities

| Script | Short description | Status | Old git hash | New git hash | Comment | Final sign-off |
|--------|-------------------|--------|--------------|--------------|---------|----------------|
| [lensid_get_features_qts_ml.py](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/package/lensid/feature_extraction/lensid_get_features_qts_ml.py) | Script for calculating DenseNet predictions for single-detector Q-transforms given the trained DenseNet. Eg: `lensid_get_features_qts_ml -infile /home/srashti.goyal/lensing_MDC_O4/data_prep/data/dataframes/pairs.csv -outfile check_lensid_qts.csv -data_dir /home/srashti.goyal/lensing_MDC_O4/data_prep/data/qts/ -det H1 -whitened 1 -dense_model /home/srashti.goyal/lensid/development/retraining_for_O4/out/uniform_lr_005/dense_H1.h5` Modified to work on a single detector, compared with three detectors earlier. | OK | f9b7075d0e6ca8db211a0c3e43299af1eb428410 | 5e0e774713072dfeedafeb5a0cc2965dcd4d068c | [diff](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/review/diff_lensid_get_features_qts_ml_py.diff) | :heavy_check_mark: |
| [lensid_get_features_sky_ml.py](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/package/lensid/feature_extraction/lensid_get_features_sky_ml.py) | Script for calculating features from the bayestar skymaps, which go as input to the "XGBoost with Skymaps" model. Eg: `lensid_get_features_sky_ml -infile check/lensed.csv -outfile check/lensed_sky.csv -data_dir check` | OK-DC, OK-jrc | f9b7075d0e6ca8db211a0c3e43299af1eb428410 | NA | NA | :heavy_check_mark: |
| [ml_utils.py](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/package/lensid/utils/ml_utils.py) | Utility script containing all machine learning model functions for training, FAP computation, predictions etc. Added a data loader and file-type input options for the Q-transforms. | ? | 493ea099f42fc50d2cc081754d5395f57fafae76 | 5e0e774713072dfeedafeb5a0cc2965dcd4d068c | [diff](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/review/diff_ml_utils_py.diff) Question on line 426: is Keras trained on .png or .npz? | |
### ML models: Training, Optimisation, Testing, Comparison with BLU, Predictions

| Scripts | Short description | Status | git hash | Comment | Final sign-off |
|---------|-------------------|--------|----------|---------|----------------|
| [train_densenets_qts.py](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/package/lensid/train_test/train_densenets_qts.py) | Trains a DenseNet with Q-transforms for a given detector. Eg: `from lensid.train_test.train_densenets_qts import _main; _main('out/','/home/srashti.goyal/lensid_runs/uniform_dataset/npz/','/home/srashti.goyal/lensid_runs/uniform_dataset/dataframe/lensed.csv','/home/srashti.goyal/lensid_runs/uniform_dataset/dataframe/unlensed_half.csv',size_lensed=8000,size_unlensed=8000,batch_size=500,det='V1',epochs=20,lr=0.005,whitened=1,file_type='npz',colored=0,model_id=0)`. Note: requires `tensorflow-gpu` to load CUDA libraries. | | 7631c2b7530be09372721c2d3d3f0e27e792a53c | Changed significantly from the previous release to use the dataloader and npz files. | ---------------- |
| [train_crossvalidate_test_XGB_sky.py](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/package/lensid/train_test/train_crossvalidate_test_XGB_sky.py) | Trains and cross-validates the "XGBoost with Skymaps" model and compares it to BLU. Requires a dataframe that already has the input features calculated from the Bayestar/PE skymaps. `python train_crossvalidate_test_XGB_sky.py -help` | OK-DC, OK-jrc | a60740bb5a0cccb2be8e8184f16c0c7c93f8150b | No change. | :heavy_check_mark: |
| [ml_predict_workflow.py](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/package/lensid/ml_predict_workflow.py) | Makes predictions given the Q-transforms, skymaps and trained ML models. Needs a CONFIG.yaml file for running. Example: [config_O3_events_18022022](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/package/examples/config_O3_events_18022022.yaml) `lensid_make_predictions --config /home/srashti.goyal/lensid/package/examples/config_O3_events_18022022.yaml` | | 83c208d82024a6abc49af67928a2d63653743ce3 | Minor changes w.r.t. the O3 result review. | :heavy_check_mark: |
## Other scripts

| Scripts | Short description | Status | git hash | Comment | Final sign-off |
|---------|-------------------|--------|----------|---------|----------------|
| [condor_data_gen_train_test_config.py](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/train_test_workflow/condor_data_gen_train_test_config.py) | Generates Q-transforms, dataframes and Bayestar skymaps for training and testing, given the injection parameters, by submitting condor DAG jobs. Note: change `exec_file_loc` in the script according to your installation and `base_out_dir` as desired. Eg: [config_o4_datagen.yaml](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/package/examples/config_o4_datagen.yaml) `python condor_data_gen_train_test_config.py -config ../package/examples/config_o4_datagen.yaml` | | 1a3fc5d0ce285cec52576ccd7de645f79e772879 | Very minor change: just added the `request_disk` option. [diff](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/review/diff_condor_data_gen.diff) | ---------------- |
# Review Calls

The review calls happen on Wednesdays at 1 PM CEST / 4:30 PM IST in the virtual IFPA room: https://shorturl.at/uxLS5
## 4th April 2023

- Discussed MDC results
- Discussed the developments
- Discussed training with real noise with one week of O4 data.

### Action items

- [x] Ask chairs about real noise training and review.
## 3rd May 2023

- Discussed the new functionalities for the ML QTs package scripts.
- Discussed the new uniform-in-masses training set.
- A [diff file](https://git.ligo.org/srashti.goyal/lensid/-/blob/master/diff_package_03052023_reviewed_2022.diff) was created to keep track of package changes since the last reviewed version.
- The hard deadline for the code review is the start of O4, so the focus is currently on the code rather than on performance.

### Action items

- [x] Prepare feature extraction and utils code for review.
- [x] Sign off data preparation codes.
## 10th May 2023

- We discussed JR's comments on the data preparation scripts.
- We discussed the changes made to the codes for dense predictions, ml_utils and the data generator for ML QTs.
- The sign-off column was added to the tables.
- We discussed how to proceed given that Virgo will join O4 3-6 months later.
- During the result review we should check the training size, batch_size, etc.
- During the result review we will give results as a function of SNR.

### Action items

- [x] Prepare training and testing scripts.
- [x] Sign off feature extraction and ML utils codes.
## 17th May 2023

- We discussed JR's comments on changing the defaults to `whitened = 1` and `file_type = 'npz'`, along with the other clarifications.
- We discussed the important code review scripts and their timeline. We hope to sign off by 23rd May.
- We discussed items for the result review.
- We will meet next Tuesday at 2:30 pm CEST.

For the result review we will have:

- Comparison of machines trained on uniform masses vs. astrophysical population model masses.
- Optimisation, benchmarking results on simulated data, final trained machines.
- Background computation.
- CBCFlow integration for data preparation of O4 events.
- O4/O3 real noise training?

### Action items

- [ ] Prepare ensembling and combining ML qts + ML sky scripts.
- [ ] Sign off remaining code review scripts.
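
The default change discussed in the 17th May call (`whitened = 1` and `file_type = 'npz'`) can be illustrated with a small `argparse` sketch. This is a hypothetical parser written for this page, not the option handling of any lensid script: `-whitened` appears in the command-line examples above, while the `-file_type` flag and the `png` alternative are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser showing the reviewed defaults."""
    parser = argparse.ArgumentParser(
        description="Sketch of Q-transform input options (not lensid code)"
    )
    # Whitened Q-transforms are the default after the review comment.
    parser.add_argument("-whitened", type=int, choices=[0, 1], default=1,
                        help="1: use whitened Q-transforms, 0: unwhitened")
    # .npz files are the default input type; the flag name is assumed.
    parser.add_argument("-file_type", choices=["npz", "png"], default="npz",
                        help="file type of the stored Q-transforms")
    return parser


# With no arguments, the reviewed defaults apply.
defaults = build_parser().parse_args([])
print(defaults.whitened, defaults.file_type)

# The previous behaviour stays reachable by passing the options explicitly.
override = build_parser().parse_args(["-whitened", "0", "-file_type", "png"])
print(override.whitened, override.file_type)
```

Setting the defaults to the reviewed configuration means that a call without extra options runs the setup that was actually signed off.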