diff --git a/doc/source/cbc_analysis.rst b/doc/source/cbc_analysis.rst index 98d86d7512febec7ef2e0e66f91c2d3e65c34eec..e4d1ba74154feb1d98ba7e0c903a6159d1988c99 100644 --- a/doc/source/cbc_analysis.rst +++ b/doc/source/cbc_analysis.rst @@ -15,8 +15,8 @@ Open Science Grid (OSG). Running Workflows ^^^^^^^^^^^^^^^^^^ -1. Build Singularity image (optional) -"""""""""""""""""""""""""""""""""""""" +1.A Build Singularity image (using the gstlal master branch) +""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" NOTE: If you are using a reference Singularity container (suitable in most cases), you can skip this step. The ``<image>`` throughout this doc refers to @@ -31,6 +31,47 @@ To pull a container with gstlal installed, run: $ singularity build --sandbox --fix-perms <image-name> docker://containers.ligo.org/lscsoft/gstlal:master + +1.B Build Singularity image (using a gstlal non-master branch) +"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" + +If using a non-master branch, create a singularity build directory by running: + +.. code:: bash + + $ mkdir <singularity-dir> + $ cd <singularity-dir> + $ singularity build --sandbox --fix-perms <image-name> docker://containers.ligo.org/lscsoft/gstlal:<name-of-branch> + +If running on the ICDS (PSU cluster), add a directory called ``ligo`` inside +``<image-name>``, and the following singularity commands should contain +``-B /ligo``. + +In the directory where ``<image-name>`` exists, run: + +.. code:: bash + + $ singularity run --writable <image-name> + $ cd gstlal + +If one is modifying code, apply changes at this step. +Then, install gstlal by running the following where ``<gstlal-sub>`` is +``gstlal``, ``gstlal-burst``, ``gstlal-inspiral``, and ``gstlal-ugly``. + +.. code:: bash + + $ cd <gstlal-sub> && echo | ./00init.sh + $ ./configure --prefix /usr + $ make + $ make install + $ cd .. + +To get out of the singularity container, run + +.. code:: bash + + $ exit + 2. 
Set up workflow """""""""""""""""""" @@ -40,22 +81,35 @@ First, we create a new analysis directory and switch to it: $ mkdir <analysis-dir> $ cd <analysis-dir> + $ mkdir mass_model + $ mkdir bank -Default configuration files and data files (template bank/mass model) for a +Default configuration files and environment (``env.sh``) for a variety of different banks are contained in the -`offline-configuration <https://git.ligo.org/gstlal/offline-configuration>`_ +`offline-configuration <https://git.ligo.org/gstlal/offline-configuration/configs>`_ repository. +One can run the commands below to grab the configuration files, or clone the +repository and copy the files as needed into the analysis directory. +To download data files (mass model, template banks) that may be needed for +offline runs, see the +`offline-configuration README <https://git.ligo.org/gstlal/offline-configuration/-/blob/main/README.md>`_. +Move the template bank(s) into ``bank`` and the mass model into ``mass_model``. + -For example, to grab the configuration and data files for the BNS test bank: +For example, to grab the configuration file and environment for a small BNS dag: .. code:: bash - $ curl -O https://git.ligo.org/gstlal/offline-configuration/-/raw/main/bns-small/config.yml - $ curl -O https://git.ligo.org/gstlal/offline-configuration/-/raw/main/bns-small/mass_model/mass_model_small.h5 - $ curl -O https://git.ligo.org/gstlal/offline-configuration/-/raw/main/bns-small/bank/gstlal_bank_small.xml.gz + $ curl -O https://git.ligo.org/gstlal/offline-configuration/-/raw/main/configs/bns-small/config.yml + $ curl -O https://git.ligo.org/gstlal/offline-configuration/-/raw/main/env.sh + +Then run the following to get the template banks and mass models: + +.. code:: bash -Alternatively, one can clone the repository and copy files as needed into the -analysis directory. + $ conda activate igwn + $ dcc archive --archive-dir=.
--files --interactive T2200318-v2 + $ conda deactivate Now, we'll need to modify the configuration as needed to run the analysis. At the very least, setting the start/end times and the instruments to run over: @@ -73,12 +127,12 @@ the right place in the configuration: .. code-block:: yaml data: - template-bank: gstlal_bank_small.xml.gz + template-bank: bank/gstlal_bank_small.xml.gz .. code-block:: yaml prior: - mass-model: mass_model_small.h5 + mass-model: mass_model/mass_model_small.h5 If you're creating a summary page for results, you'll need to point at a location where they are web-viewable: @@ -105,7 +159,7 @@ In addition, update the ``singularity-image`` the ``condor`` section of your con singularity-image: /cvmfs/singularity.opensciencegrid.org/lscsoft/gstlal:master If not using the reference Singularity image, you can replace this line with the -full path to a local container. +full path to a local Singularity container ``<image>``. For more detailed configuration options, take a look at the :ref:`configuration section <analysis-configuration>` below. @@ -178,20 +232,28 @@ Finally, set up the rest of the workflow including the DAG for submission: $ singularity exec -B $TMPDIR <image> make dag - This should create condor DAGs for the workflow. Mounting a temporary directory is important as some of the steps will leverage a temporary space to generate files. +To see detailed error messages, add ``PYTHONUNBUFFERED=1`` to +``environment`` in the submit (``*.sub``) files by running: + +.. code:: bash + + $ sed -i 's@environment = "LAL_DATA_PATH=/cvmfs/oasis.opensciencegrid.org/ligo/sw/pycbc/lalsuite-extra/current/share/lalsimulation"@environment = "LAL_DATA_PATH=/cvmfs/oasis.opensciencegrid.org/ligo/sw/pycbc/lalsuite-extra/current/share/lalsimulation PYTHONUNBUFFERED=1"@g' *.sub + + 3. Launch workflows """"""""""""""""""""""""" ..
code:: bash + $ source env.sh $ make launch This is simply a thin wrapper around `condor_submit_dag` launching the DAG in question. -You can monitor the dag with Condor CLI tools such as ``condor_q``. +You can monitor the dag with Condor CLI tools such as ``condor_q``, or follow the DAGMan log with ``tail -f full_inspiral_dag.dag.dagman.out``. 4. Generate Summary Page """"""""""""""""""""""""" @@ -200,8 +262,13 @@ After the DAG has completed, you can generate the summary page for the analysis: .. code:: bash - $ singularity exec -B $TMPDIR <image> make summary + $ singularity exec <image> make summary +To make an open page, run: + +.. code:: bash + + $ make unlock .. _analysis-configuration: @@ -256,6 +323,7 @@ Section: Source .. code-block:: yaml source: + data-source: frames data-find-server: datafind.gw-openscience.org frame-type: H1: H1_GWOSC_O2_16KHZ_R1 @@ -263,8 +331,10 @@ Section: Source channel-name: H1: GWOSC-16KHZ_R1_STRAIN L1: GWOSC-16KHZ_R1_STRAIN + sample-rate: 4096 frame-segments-file: segments.xml.gz frame-segments-name: datasegments + x509-proxy: x509_proxy The ``data-find-server`` option points to a server that is queried to find the location of frame files. The address shown above is a publicly available server @@ -277,7 +347,8 @@ available. These files are generalized enough that they could describe different types of data, so ``frame-segments-name`` is used to specify which segment to consider. In practice, the segments file we produce will only contain the segments we want. Users will typically not change any of these options once they -are set for a given instrument and observing run. +are set for a given instrument and observing run. ``x509-proxy`` is the path to +your X.509 proxy certificate file. Section: Segments """""""""""""""""" @@ -333,6 +404,7 @@ Section: PSD psd: fft-length: 8 + sample-rate: 4096 The PSD estimation method used by GstLAL is a modified median-Welch method that is described in detail in Section IIB of Ref [1].
The FFT length sets the length @@ -355,10 +427,10 @@ Section: SVD num-chi-bins: 1 sort-by: mchirp approximant: - - 0:1000:TaylorF2 + - 0:1.73:TaylorF2 + - 1.73:1000:SEOBNRv4_ROM tolerance: 0.9999 - max-f-final: 512.0 - sample-rate: 1024 + max-f-final: 1024.0 num-split-templates: 200 overlap: 30 num-banks: 5 @@ -366,7 +438,7 @@ Section: SVD samples-max-64: 2048 samples-max-256: 2048 samples-max: 4096 - autocorrelation-length: 351 + autocorrelation-length: 701 manifest: svd_manifest.json ``f-low`` sets the lower frequency cutoff for the analysis in Hz. @@ -376,7 +448,10 @@ procedure; specifically, sets the number of effective spin parameter bins to use in the chirp-mass / effective spin binning procedure described in Sec. IID and Fig. 6 of [1]. -``sort-by`` selects the template sort column. This controls how to bin the bank in sub-banks suitable for the svd decomposition. +``sort-by`` selects the template sort column. This controls how to bin the +bank into sub-banks suitable for the SVD. It can be ``mchirp`` +(sorts by chirp mass), ``mu`` (sorts by mu1 and mu2 coordinates), or +``template_duration`` (sorts by template duration). ``approximant`` specifies the waveform approximant that should be used along with chirp mass bounds to use that approximant in. 0:1000:TaylorF2 means use the @@ -430,14 +505,18 @@ Section: Filter filter: fir-stride: 1 + min-instruments: 1 coincidence-threshold: 0.01 ht-gate-threshold: 0.8:15.0-45.0:100.0 veto-segments-file: vetoes.xml.gz time-slide-file: tisi.xml injection-time-slide-file: inj_tisi.xml + time-slides: + H1: 0:0:0 + L1: 0.62831:0.62831:0.62831 injections: bns: - file: injections/bns_injections.xml + file: bns_injections.xml range: 0.01:1000.0 ``fir-stride`` is a tunable parameter related to the matched-filter procedure, @@ -481,9 +560,9 @@ Section: Injections ..
code-block:: yaml injections: - expected-snr: - f-low: 15.0 sets: + expected-snr: + f-low: 15.0 bns: f-low: 14.0 seed: 72338 @@ -508,6 +587,7 @@ Section: Injections distance: min: 10000 max: 80000 + spin-aligned: True file: bns_injections.xml The ``sets`` subsection is used to create injection sets to be used within the @@ -519,6 +599,9 @@ of the ``filter`` section. Besides creating injection sets, the ``expected-snr`` subsection is used for the expected SNR jobs. These settings are used to override defaults as needed. +``spin-aligned`` specifies whether the injections should have aligned +spins (if ``spin-aligned: True``) or precessing spins (if ``spin-aligned: False``). + In the case of multiple injection sets that need to be combined, one can add a few options to create a combined file and reference that within the filter jobs. This can be useful for large banks with a large set of templates. To @@ -547,7 +630,7 @@ Section: Prior .. code-block:: yaml prior: - mass-model: model/mass_model_small.h5 + mass-model: mass_model/mass_model_small.h5 ``mass-model`` is a relative path to the file that contains the mass model. This model is used to weight templates appropriately when assigning ranking @@ -589,7 +672,8 @@ Section: Condor condor: profile: osg-public accounting-group: ligo.dev.o3.cbc.uber.gstlaloffline - singularity-image: /cvmfs/singularity.opensciencegrid.org/lscsoft/gstlal:master + accounting-group-user: <albert.einstein> + singularity-image: <image> ``profile`` sets a base level of configuration options for condor. @@ -598,7 +682,8 @@ the machinery to produce an analysis dag requires this option, but the option is not actually used by analyses running on non-LDG resources. ``singularity-image`` sets the path of the container on cvmfs that the analysis -should use. Users will not typically change this option. +should use. Users will not typically change this option +(use ``/cvmfs/singularity.opensciencegrid.org/lscsoft/gstlal:master``). ..
_install-custom-profiles: