This merge request integrates the EM-Bright infrastructure into LALSuite. For O3 we are completely overhauling the EM-Bright pipeline, moving away from the ellipsoid-based computation used in O2 (https://dcc.ligo.org/LIGO-T1600570) to a machine-learning-based infrastructure (https://dcc.ligo.org/LIGO-G1801592). The pipeline now has two aspects:
- Training: a training pipeline that relies on injection campaigns from the detection pipelines. We use a Random Forest classifier to train the EM-Bright pipeline to discriminate between EM-Bright and EM-Dark sources.
- Inference: events obtained in low latency will trigger the inference code (`lalinference.embright`), which will give the probability of the event being EM-Bright.
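The two stages above can be sketched with scikit-learn as follows. This is a minimal illustration only: the features (`m1`, `m2`, `chi1`, `snr`), the toy labeling rule, and the in-memory pickle stand in for the real injection-derived training set and the pickle files produced by the training dag.

```python
# Sketch of the training/inference split described above. Feature names,
# labels and data here are hypothetical placeholders; the real pipeline
# derives its training set from detection-pipeline injection campaigns.
import io
import pickle

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Toy injection set: component masses, primary spin, SNR, and an
# EM-Bright label (1 = EM-Bright, 0 = EM-Dark).
n = 1000
m1 = rng.uniform(1.0, 30.0, n)
m2 = rng.uniform(1.0, 3.0, n)
chi1 = rng.uniform(-1.0, 1.0, n)
snr = rng.uniform(4.0, 30.0, n)
X = np.column_stack([m1, m2, chi1, snr])
y = (m2 < 2.0).astype(int)  # toy rule standing in for the true label

# Training stage: fit the Random Forest and serialize it, mirroring the
# pickle files that the training dag writes to disk.
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
buf = io.BytesIO()
pickle.dump(clf, buf)

# Inference stage: a new low-latency trigger is scored with the stored
# model, yielding P(EM-Bright).
buf.seek(0)
model = pickle.load(buf)
trigger = np.array([[10.0, 1.4, 0.1, 12.0]])  # hypothetical event
p_em_bright = model.predict_proba(trigger)[0, 1]
print(p_em_bright)
```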
The code is packaged as follows. The executables needed to run the dag that conducts the training of the EM-Bright pipeline are placed in `lalinference/python`. For the training we have to estimate the amount of disk mass created during the coalescence, using a variation of the fitting formula of Foucart, Hinderer and Nissanke (arXiv:1807.00011). The code performing this calculation, `computeDiskMass.py`, is placed in a newly created `embright` directory under `lalinference/python/lalinference`. This code currently uses an extremely stiff neutron-star equation of state (2H) to compute the amount of tidally disrupted matter; a data file containing the required information for this equation of state (namely the baryonic mass and the compactness) also resides in this directory. These are packaged so that they are accessible as `from lalinference.embright import computeDiskMass`. A script that writes a dag specifically for the training jobs lives in the `lalinference/python` directory; it uses the various other EM-Bright scripts (listed below) to create a dag that, when run, generates pickle files containing the machine-learned information for triggers.
When a new trigger arrives, these pickle files can be used to compute the probability that the event is EM-Bright or EM-Dark.
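For reference, the remnant disk-mass fit of Foucart, Hinderer & Nissanke (arXiv:1807.00011) can be sketched as below. The coefficients are the published fit values; the actual `computeDiskMass.py` may differ in detail, e.g. it obtains the compactness and baryonic mass from the 2H equation-of-state data file rather than taking them as arguments, and the example numbers in the final call are hypothetical.

```python
# Sketch of the Foucart, Hinderer & Nissanke (2018) disk-mass fit.
# The real computeDiskMass.py implementation may differ in detail.
import math

# Fit coefficients from Foucart et al. (arXiv:1807.00011).
ALPHA, BETA, GAMMA, DELTA = 0.406, 0.139, 0.255, 1.761


def risco_over_m(chi):
    """Kerr ISCO radius in units of M (positive chi = prograde),
    from the Bardeen, Press & Teukolsky (1972) formula."""
    z1 = 1 + (1 - chi**2) ** (1 / 3) * (
        (1 + chi) ** (1 / 3) + (1 - chi) ** (1 / 3)
    )
    z2 = math.sqrt(3 * chi**2 + z1**2)
    sign = math.copysign(1.0, chi)
    return 3 + z2 - sign * math.sqrt((3 - z1) * (3 + z1 + 2 * z2))


def remnant_mass(m_bh, m_ns, chi_bh, compactness, m_ns_baryonic):
    """Baryonic mass remaining outside the BH after tidal disruption;
    zero means the NS plunges in whole (an EM-Dark outcome)."""
    q = m_bh / m_ns
    eta = q / (1 + q) ** 2  # symmetric mass ratio
    term = (
        ALPHA * (1 - 2 * compactness) / eta ** (1 / 3)
        - BETA * risco_over_m(chi_bh) * compactness / eta
        + GAMMA
    )
    return m_ns_baryonic * max(term, 0.0) ** DELTA


# Hypothetical NSBH system: 5 + 1.4 Msun, moderately spinning BH,
# compactness ~0.16 and baryonic mass ~1.55 Msun for a stiff EOS.
print(remnant_mass(5.0, 1.4, 0.5, 0.16, 1.55))
```

A non-spinning ISCO comes out at the familiar 6M, and for a sufficiently heavy black hole the bracketed term goes negative and the remnant mass is clamped to zero, i.e. no tidal disruption.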
Details on how the training process can be emulated:
I used conda to create an environment that has scikit-learn (since it is not available on the LDG clusters). Once this version of LALSuite is installed, running the following command will generate a dag:
`embright_create_train_dag -c conf.ini`
where conf.ini is a configuration file, an example of which is attached to this description (conf.ini). The user will have to change the path in the relevant elements of the file (`injdir`). Submitting the dag should generate the required machine-learned pickle files, as long as you have injection sqlite files in `injdir` of the appropriate form, where NUM is any number. An example of such a sqlite file can be found on NEMO at
`/work/gstlalcbc/observing/2/offline/C00/chunk_21_1186624818-1187312718_run_3/H1L1-ALL_LLOID_split_injections_0000-1186624818-687900.sqlite`.
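For orientation, a hypothetical minimal conf.ini might look like the following. Only the `injdir` option is named in this description, so the section name and any other keys are placeholders; the conf.ini attached to this description is the authoritative example.

```ini
; Hypothetical sketch only -- see the attached conf.ini for the real layout.
[training]
; Directory containing the injection sqlite files from the detection pipeline.
injdir = /path/to/injection/sqlite/files
```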
This is a work in progress, and the inference part of this project is currently being updated.