Commit abdbb179, authored and committed by Shio Sakon

doc/source/cbc_analysis.rst: updated tutorial

parent 8ca3b580
1 merge request: !303 Updated cbc offline configuration tutorial
......@@ -15,8 +15,8 @@ Open Science Grid (OSG).
Running Workflows
^^^^^^^^^^^^^^^^^^
1. Build Singularity image (optional)
""""""""""""""""""""""""""""""""""""""
1.A Build Singularity image (using the gstlal master branch)
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
NOTE: If you are using a reference Singularity container (suitable in most
cases), you can skip this step. The ``<image>`` throughout this doc refers to
......@@ -31,6 +31,47 @@ To pull a container with gstlal installed, run:
$ singularity build --sandbox --fix-perms <image-name> docker://containers.ligo.org/lscsoft/gstlal:master
1.B Build Singularity image (using a gstlal non-master branch)
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
If using a non-master branch, create a singularity build directory by running:
.. code:: bash
$ mkdir <singularity-dir>
$ cd <singularity-dir>
$ singularity build --sandbox --fix-perms <image-name> docker://containers.ligo.org/lscsoft/gstlal:<name-of-branch>
If running on the ICDS (PSU cluster), add a directory called ``ligo`` inside
``<image-name>``, and pass ``-B /ligo`` to each of the singularity commands
that follow.
In the directory where ``<image-name>`` exists, run:
.. code:: bash
$ singularity run --writable <image-name>
$ cd gstlal
If one is modifying code, apply changes at this step.
Then install gstlal by running the following once for each ``<gstlal-sub>``:
``gstlal``, ``gstlal-burst``, ``gstlal-inspiral``, and ``gstlal-ugly``.
.. code:: bash
$ cd <gstlal-sub> && echo | ./00init.sh
$ ./configure --prefix /usr
$ make
$ make install
$ cd ..
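The per-package steps above can be wrapped in a loop. The following is a minimal sketch, not part of the gstlal tooling: ``build_gstlal_subpackages`` is a hypothetical helper name, and the build order simply follows the list above, which is an assumption. If one package needs another to be installed first, reorder the list.

```shell
# Sketch: build every gstlal subpackage in turn from the top of the checkout.
# The function name is hypothetical and the package order is an assumption
# (it copies the list in this tutorial); reorder it if the build requires that.
build_gstlal_subpackages() {
    for sub in gstlal gstlal-burst gstlal-inspiral gstlal-ugly; do
        (
            # Run each build in a subshell so the caller's directory is unchanged.
            cd "$sub" || exit 1
            echo | ./00init.sh &&
            ./configure --prefix /usr &&
            make &&
            make install
        ) || { echo "build failed in $sub" >&2; return 1; }
    done
}
# Usage, inside the container after `cd gstlal`:
#   build_gstlal_subpackages
```

The loop stops at the first package that fails, so the error message names the directory to inspect.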
To exit the singularity container, run:
.. code:: bash
$ exit
2. Set up workflow
""""""""""""""""""""
......@@ -40,22 +81,35 @@ First, we create a new analysis directory and switch to it:
$ mkdir <analysis-dir>
$ cd <analysis-dir>
$ mkdir mass_model
$ mkdir bank
Default configuration files and data files (template bank/mass model) for a
Default configuration files and environment (``env.sh``) for a
variety of different banks are contained in the
`offline-configuration <https://git.ligo.org/gstlal/offline-configuration>`_
`offline-configuration <https://git.ligo.org/gstlal/offline-configuration/configs>`_
repository.
One can run the commands below to grab the configuration files, or clone the
repository and copy the files as needed into the analysis directory.
To download data files (mass model, template banks) that may be needed for
offline runs, see
`offline-configuration README <https://git.ligo.org/gstlal/offline-configuration/-/blob/main/README.md>`_.
Move the template bank(s) into ``bank`` and the mass model into ``mass_model``.
For example, to grab the configuration and data files for the BNS test bank:
For example, to grab the configuration file and environment for a small BNS dag:
.. code:: bash
$ curl -O https://git.ligo.org/gstlal/offline-configuration/-/raw/main/bns-small/config.yml
$ curl -O https://git.ligo.org/gstlal/offline-configuration/-/raw/main/bns-small/mass_model/mass_model_small.h5
$ curl -O https://git.ligo.org/gstlal/offline-configuration/-/raw/main/bns-small/bank/gstlal_bank_small.xml.gz
$ curl -O https://git.ligo.org/gstlal/offline-configuration/-/raw/main/configs/bns-small/config.yml
$ curl -O https://git.ligo.org/gstlal/offline-configuration/-/raw/main/env.sh
Then run the following to get the template banks and mass models.
.. code:: bash
Alternatively, one can clone the repository and copy files as needed into the
analysis directory.
$ conda activate igwn
$ dcc archive --archive-dir=. --files --interactive T2200318-v2
$ conda deactivate
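Once the download finishes, the data files still need to be sorted into ``bank`` and ``mass_model``. The following is a minimal sketch of that step. ``stage_data_files`` is a hypothetical helper, and the filename patterns (``*bank*.xml.gz`` and ``*mass_model*.h5``) are assumptions based on the small BNS example; adjust them to match the files you actually downloaded.

```shell
# Sketch: move downloaded data files into the directories created earlier.
# Assumes template banks match *bank*.xml.gz and mass models match
# *mass_model*.h5 (as in the small BNS example); other files are left alone.
stage_data_files() {
    src_dir=${1:-.}
    mkdir -p bank mass_model
    for f in "$src_dir"/*bank*.xml.gz; do
        # The -e test skips the literal pattern when nothing matches.
        [ -e "$f" ] && mv "$f" bank/
    done
    for f in "$src_dir"/*mass_model*.h5; do
        [ -e "$f" ] && mv "$f" mass_model/
    done
    return 0
}
# Usage, from <analysis-dir>, with the download directory as the argument:
#   stage_data_files <download-dir>
```

The patterns are kept narrow on purpose, so that other ``.xml.gz`` files in the same directory (for example a segments or veto file) are not moved by mistake.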
Now, we'll need to modify the configuration as needed to run the analysis. At
the very least, setting the start/end times and the instruments to run over:
......@@ -73,12 +127,12 @@ the right place in the configuration:
.. code-block:: yaml
data:
template-bank: gstlal_bank_small.xml.gz
template-bank: bank/gstlal_bank_small.xml.gz
.. code-block:: yaml
prior:
mass-model: mass_model_small.h5
mass-model: mass_model/mass_model_small.h5
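Because these paths are relative to the analysis directory, a quick check before building the DAG can catch a misplaced file. This is a sketch only: ``check_config_paths`` is a hypothetical helper, and it assumes each key appears on its own line as ``key: path``, as in the snippets above, rather than parsing YAML properly.

```shell
# Sketch: confirm that the data files named in config.yml exist.
# Assumes "template-bank" and "mass-model" each appear once, on their own
# line, in the form "key: path"; this is a text match, not a YAML parser.
check_config_paths() {
    config=${1:-config.yml}
    status=0
    for key in template-bank mass-model; do
        path=$(awk -v k="$key:" '$1 == k { print $2; exit }' "$config")
        if [ -z "$path" ]; then
            echo "missing key: $key" >&2
            status=1
        elif [ ! -e "$path" ]; then
            echo "$key points at a missing file: $path" >&2
            status=1
        fi
    done
    return $status
}
# Usage, from <analysis-dir>:
#   check_config_paths config.yml
```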
If you're creating a summary page for results, you'll need to point at a
location where they are web-viewable:
......@@ -105,7 +159,7 @@ In addition, update the ``singularity-image`` the ``condor`` section of your con
singularity-image: /cvmfs/singularity.opensciencegrid.org/lscsoft/gstlal:master
If not using the reference Singularity image, you can replace this line with the
full path to a local container.
full path to a local singularity container ``<image>``.
For more detailed configuration options, take a look at the :ref:`configuration
section <analysis-configuration>` below.
......@@ -178,20 +232,28 @@ Finally, set up the rest of the workflow including the DAG for submission:
$ singularity exec -B $TMPDIR <image> make dag
This should create condor DAGs for the workflow. Mounting a temporary directory
is important as some of the steps will leverage a temporary space to generate files.
To see detailed error messages, add ``PYTHONUNBUFFERED=1`` to the
``environment`` line of the submit (``*.sub``) files by running:
.. code:: bash
$ sed -i 's@environment = "LAL_DATA_PATH=/cvmfs/oasis.opensciencegrid.org/ligo/sw/pycbc/lalsuite-extra/current/share/lalsimulation"@environment = "LAL_DATA_PATH=/cvmfs/oasis.opensciencegrid.org/ligo/sw/pycbc/lalsuite-extra/current/share/lalsimulation PYTHONUNBUFFERED=1"@g' *.sub
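The command above only matches submit files whose ``environment`` line is exactly the string shown. A more tolerant variant is sketched below. ``add_pythonunbuffered`` is a hypothetical helper, and it assumes GNU ``sed`` and an ``environment`` line of the form ``environment = "VAR=value ..."``.

```shell
# Sketch: append PYTHONUNBUFFERED=1 to the environment line of each submit
# file, whatever that line already contains. Assumes GNU sed (-i and -E) and
# a line of the form: environment = "VAR=value ..."
add_pythonunbuffered() {
    # Lines that already contain PYTHONUNBUFFERED=1 are skipped, so running
    # the function twice does not add the variable twice.
    sed -i -E '/PYTHONUNBUFFERED=1/!s@^(environment = ".*)"$@\1 PYTHONUNBUFFERED=1"@' "$@"
}
# Usage, from <analysis-dir>:
#   add_pythonunbuffered *.sub
```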
3. Launch workflows
"""""""""""""""""""""""""
.. code:: bash
$ source env.sh
$ make launch
This is simply a thin wrapper around ``condor_submit_dag`` launching the DAG in question.
You can monitor the dag with Condor CLI tools such as ``condor_q``.
You can monitor the DAG with Condor CLI tools such as ``condor_q``, or follow the DAGMan log with ``tail -f full_inspiral_dag.dag.dagman.out``.
4. Generate Summary Page
"""""""""""""""""""""""""
......@@ -200,8 +262,13 @@ After the DAG has completed, you can generate the summary page for the analysis:
.. code:: bash
$ singularity exec -B $TMPDIR <image> make summary
$ singularity exec <image> make summary
To make an open page, run:
.. code:: bash
$ make unlock
.. _analysis-configuration:
......@@ -256,6 +323,7 @@ Section: Source
.. code-block:: yaml
source:
data-source: frames
data-find-server: datafind.gw-openscience.org
frame-type:
H1: H1_GWOSC_O2_16KHZ_R1
......@@ -263,8 +331,10 @@ Section: Source
channel-name:
H1: GWOSC-16KHZ_R1_STRAIN
L1: GWOSC-16KHZ_R1_STRAIN
sample-rate: 4096
frame-segments-file: segments.xml.gz
frame-segments-name: datasegments
x509-proxy: x509_proxy
The ``data-find-server`` option points to a server that is queried to find the
location of frame files. The address shown above is a publicly available server
......@@ -277,7 +347,8 @@ available. These files are generalized enough that they could describe different
types of data, so ``frame-segments-name`` is used to specify which segment to
consider. In practice, the segments file we produce will only contain the
segments we want. Users will typically not change any of these options once they
are set for a given instrument and observing run.
are set for a given instrument and observing run. ``x509-proxy`` is the path to
your X.509 proxy certificate file.
Section: Segments
""""""""""""""""""
......@@ -333,6 +404,7 @@ Section: PSD
psd:
fft-length: 8
sample-rate: 4096
The PSD estimation method used by GstLAL is a modified median-Welch method that
is described in detail in Section IIB of Ref [1]. The FFT length sets the length
......@@ -355,10 +427,10 @@ Section: SVD
num-chi-bins: 1
sort-by: mchirp
approximant:
- 0:1000:TaylorF2
- 0:1.73:TaylorF2
- 1.73:1000:SEOBNRv4_ROM
tolerance: 0.9999
max-f-final: 512.0
sample-rate: 1024
max-f-final: 1024.0
num-split-templates: 200
overlap: 30
num-banks: 5
......@@ -366,7 +438,7 @@ Section: SVD
samples-max-64: 2048
samples-max-256: 2048
samples-max: 4096
autocorrelation-length: 351
autocorrelation-length: 701
manifest: svd_manifest.json
``f-low`` sets the lower frequency cutoff for the analysis in Hz.
......@@ -376,7 +448,10 @@ procedure; specifically, sets the number of effective spin parameter bins to use
in the chirp-mass / effective spin binning procedure described in Sec. IID and
Fig. 6 of [1].
``sort-by`` selects the template sort column. This controls how to bin the bank in sub-banks suitable for the svd decomposition.
``sort-by`` selects the template sort column. This controls how to bin the
bank in sub-banks suitable for the svd decomposition. It can be ``mchirp``
(sorts by chirp mass), ``mu`` (sorts by mu1 and mu2 coordinates), or
``template_duration`` (sorts by template duration).
``approximant`` specifies the waveform approximant that should be used along
with chirp mass bounds to use that approximant in. 0:1000:TaylorF2 means use the
......@@ -430,14 +505,18 @@ Section: Filter
filter:
fir-stride: 1
min-instruments: 1
coincidence-threshold: 0.01
ht-gate-threshold: 0.8:15.0-45.0:100.0
veto-segments-file: vetoes.xml.gz
time-slide-file: tisi.xml
injection-time-slide-file: inj_tisi.xml
time-slides:
H1: 0:0:0
L1: 0.62831:0.62831:0.62831
injections:
bns:
file: injections/bns_injections.xml
file: bns_injections.xml
range: 0.01:1000.0
``fir-stride`` is a tunable parameter related to the matched-filter procedure,
......@@ -481,9 +560,9 @@ Section: Injections
.. code-block:: yaml
injections:
expected-snr:
f-low: 15.0
sets:
expected-snr:
f-low: 15.0
bns:
f-low: 14.0
seed: 72338
......@@ -508,6 +587,7 @@ Section: Injections
distance:
min: 10000
max: 80000
spin-aligned: True
file: bns_injections.xml
The ``sets`` subsection is used to create injection sets to be used within the
......@@ -519,6 +599,9 @@ of the ``filter`` section.
Besides creating injection sets, the ``expected-snr`` subsection is used for the
expected SNR jobs. These settings are used to override defaults as needed.
``spin-aligned`` specifies whether the injections should have aligned (or
anti-aligned) spins (``spin-aligned: True``) or precessing spins
(``spin-aligned: False``).
In the case of multiple injection sets that need to be combined, one can add
a few options to create a combined file and reference that within the filter
jobs. This can be useful for large banks with a large set of templates. To
......@@ -547,7 +630,7 @@ Section: Prior
.. code-block:: yaml
prior:
mass-model: model/mass_model_small.h5
mass-model: mass_model/mass_model_small.h5
``mass-model`` is a relative path to the file that contains the mass model. This
model is used to weight templates appropriately when assigning ranking
......@@ -589,7 +672,8 @@ Section: Condor
condor:
profile: osg-public
accounting-group: ligo.dev.o3.cbc.uber.gstlaloffline
singularity-image: /cvmfs/singularity.opensciencegrid.org/lscsoft/gstlal:master
accounting-group-user: <albert.einstein>
singularity-image: <image>
``profile`` sets a base level of configuration options for condor.
......@@ -598,7 +682,8 @@ the machinery to produce an analysis dag requires this option, but the option is
not actually used by analyses running on non-LDG resources.
``singularity-image`` sets the path of the container on cvmfs that the analysis
should use. Users will not typically change this option.
should use. Users will not typically change this option
(use ``/cvmfs/singularity.opensciencegrid.org/lscsoft/gstlal:master``).
.. _install-custom-profiles:
......