Compare revisions

Changes are shown as if the source revision was being merged into the target revision.

......@@ -7,6 +7,5 @@ GstLAL API
gstlal/python-modules/*modules
gstlal-inspiral/python-modules/*modules
gstlal-calibration/python-modules/*modules
gstlal-burst/python-modules/*modules
gstlal-ugly/python-modules/*modules
......@@ -22,7 +22,6 @@ sys.path.insert(0, os.path.abspath('.'))
sys.path.insert(0, os.path.abspath('../../gstlal/python'))
sys.path.insert(0, os.path.abspath('../../gstlal-inspiral/python'))
sys.path.insert(0, os.path.abspath('../../gstlal-burst/python'))
sys.path.insert(0, os.path.abspath('../../gstlal-calibration/python'))
sys.path.insert(0, os.path.abspath('../../gstlal-ugly/python'))
# on_rtd is whether we are on readthedocs.org, this line of code grabbed
......@@ -44,11 +43,19 @@ extensions = [
'sphinx.ext.intersphinx',
'sphinx.ext.todo',
'sphinx.ext.coverage',
'sphinx.ext.pngmath',
# 'sphinx.ext.imgmath',
'sphinx.ext.ifconfig',
'sphinx.ext.viewcode',
'sphinx.ext.githubpages',
'sphinx.ext.graphviz']
'sphinx.ext.graphviz',
'sphinx.ext.mathjax',
'myst_parser',
]
myst_enable_extensions = [
"amsmath",
"dollarmath",
]
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
......@@ -56,8 +63,8 @@ templates_path = ['_templates']
# The suffix(es) of source filenames.
# You can specify multiple suffix as a list of string:
#
# source_suffix = ['.rst', '.md']
source_suffix = '.rst'
source_suffix = ['.rst', '.md']
# source_suffix = '.rst'
# The master toctree document.
master_doc = 'index'
......@@ -65,7 +72,7 @@ master_doc = 'index'
# General information about the project.
# FIXME get from autotools
project = u'GstLAL'
copyright = u'2018, GstLAL developers'
copyright = u'2021, GstLAL developers'
author = u'GstLAL developers'
# The version info for the project you're documenting, acts as replacement for
......@@ -73,10 +80,9 @@ author = u'GstLAL developers'
# built documents.
#
# The short X.Y version.
# FIXME get from autotools
version = u'1.x'
#version = u'1.x'
# The full version, including alpha/beta/rc tags.
release = u'1.x'
release = ''
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
......@@ -126,6 +132,9 @@ if not on_rtd: # only import and set the theme if we're building docs locally
#html_sidebars = { '**': ['navigation.html', 'relations.html', 'searchbox.html'] }
#html_last_updated_fmt = None
# Add a favicon to doc pages
html_favicon = '_static/favicon.ico'
# -- Options for HTMLHelp output ------------------------------------------
# Output file base name for HTML help builder.
# Container Development Environment
The container development workflow consists of a few key points:
- Build tools provided by and used within a writable gstlal container.
- Editor/git used in or outside of the container as desired.
- Applications are run in the development container.
The benefits of developing in a writable container:
- Your builds do not depend on the software installed on the system, so you don't have to worry about behavior changes due to system package updates.
- Your build environment is the same as that of everyone else using the same base container. This makes for easier collaboration.
- Others can run your containers and get the same results. You don't have to worry about environment mismatches.
## Create a writable container
The base of a development environment is a gstlal container. It is typical to start with the
current master build. However, you can use the build tools to overwrite the install in the container, so the
choice of branch in your gstlal repository matters more than the container that you start with. The job of
the container is to provide a well-defined set of dependencies.
```bash
singularity build --sandbox --fix-perms CONTAINER_NAME docker://containers.ligo.org/lscsoft/gstlal:master
```
This will create a directory named CONTAINER_NAME. That directory is a *singularity container*.
## Check out gstlal
In a directory of your choice, under your home directory, run:
```
git clone https://git.ligo.org/lscsoft/gstlal DIRNAME
```
This will create a git directory named DIRNAME which is referred to in the following as your "gstlal dir". The gstlal dir
contains several directories that contain components that can be built independently (e.g., `gstlal`, `gstlal-inspiral`, `gstlal-ugly`, ...).
A common practice is to run the clone command in the CONTAINER_NAME directory and use `src` as `DIRNAME`. In this case, when you run your
container, your source will be available in the directory `/src`.
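For example (a sketch, assuming you built the container as `CONTAINER_NAME` in the step above):
```bash
# clone the gstlal source into the container directory so it appears at /src inside the container
cd CONTAINER_NAME
git clone https://git.ligo.org/lscsoft/gstlal src
```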
## Develop
Edit and make changes under your gstlal dir using editors and git outside of the container (or inside if you prefer).
## Build a component
To build a component:
1. cd to your gstlal directory
2. Run your container:
```
singularity run --writable -B $TMPDIR CONTAINER_NAME /bin/bash
```
3. cd to the component directory under your gstlal dir.
4. Initialize the build system for your component. You only need to do this once per container per component directory:
```
./00init.sh
./configure --prefix=/usr --libdir=/usr/lib64
```
The arguments to configure are required so that your build overwrites the gstlal installation already present in the container.
Some components have dependencies on others. You should build GstLAL components in the following order:
1. `gstlal`
2. `gstlal-ugly`
3. `gstlal-inspiral`, `gstlal-burst`, `gstlal-calibration` (in any order)
For example, if you want to build `gstlal-ugly`, you should build `gstlal` first.
5. Run make and make install
```
make
make install
```
Note that the container is writable, so your installs will persist after you exit the container and run it again.
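Putting the steps together, a minimal sketch of building `gstlal` followed by `gstlal-ugly` (assuming your gstlal dir is available at `/src` as described above) looks like:
```bash
# inside the writable container: singularity run --writable -B $TMPDIR CONTAINER_NAME /bin/bash
for component in gstlal gstlal-ugly; do
    cd /src/$component
    ./00init.sh                                    # only needed once per component per container
    ./configure --prefix=/usr --libdir=/usr/lib64
    make && make install
done
```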
## Run your code
You can run your code in the following ways:
1. Run your container using singularity and issue commands interactively "inside the container":
```
singularity run --writable -B $TMPDIR PATH_TO_CONTAINER /bin/bash
/bin/gstlal_reference_psd --channel-name=H1=foo --data-source=white --write-psd=out.psd.xml --gps-start-time=1185493488 --gps-end-time=1185493788
```
2. Use `singularity exec` and give your command on the singularity command line:
```
singularity exec --writable -B $TMPDIR PATH_TO_CONTAINER /bin/gstlal_reference_psd --channel-name=H1=foo --data-source=white --write-psd=out.psd.xml --gps-start-time=1185493488 --gps-end-time=1185493788
```
3. Use your container in a new or existing [container-based gstlal workflow](/gstlal/cbc_analysis.html) on a cluster with a shared filesystem where your container resides. For example, you can run on the CIT or PSU clusters (as long as your container is available on that cluster's shared filesystem), but not via the OSG. In order to run your code on the OSG, you would have to arrange to have your container published to CVMFS.
# Contributing Workflow
## Git Branching
The `gstlal` team uses the standard git-branch-and-merge workflow, which has a brief description
at [GitLab](https://docs.gitlab.com/ee/gitlab-basics/feature_branch_workflow.html) and a full description
at [BitBucket](https://www.atlassian.com/git/tutorials/comparing-workflows/feature-branch-workflow). As depicted below,
the workflow involves the creation of new branches for changes, the review of those branches through the Merge Request
process, and then the merging of the new changes into the main branch.
![git-flow](_static/img/git-flow.png)
### Git Workflow
In general the steps for working with feature branches are:
1. Create a new branch from master: `git checkout -b feature-short-desc`
1. Edit code (and tests)
1. Commit changes: `git commit . -m "comment"`
1. Push branch: `git push origin feature-short-desc`
1. Create merge request on GitLab
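For example, a complete cycle for a hypothetical branch might look like the following (the branch name and commit message are illustrative):
```bash
git checkout -b feature-short-desc
# ... edit code and tests ...
git commit . -m "Add short description of the change"
git push origin feature-short-desc
# then open the merge request in the GitLab UI
```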
## Merge Requests
### Creating a Merge Request
Once you push a feature branch, GitLab will show a prompt on the gstlal repo [home page](). Click “Create Merge Request”, or
go to the branches page (Repository > Branches) and select “Merge Request” next to your branch.
![mr-create](_static/img/mr-create.png)
When creating a merge request:
1. Add short, descriptive title
1. Add description
- (Uses markdown .md-file style)
- Summary of additions / changes
- Describe any tests run (other than CI)
1. Click “Create Merge Request”
![mr-create](_static/img/mr-create-steps.png)
### Collaborating on merge requests
The Overview page gives a general summary of the merge request, including:
1. Link to other page to view changes in detail (read below)
1. Code Review Request
1. Test Suite Status
1. Discussion History
1. Commenting
![mr-overview](_static/img/mr-overview.png)
#### Leaving a Review
The View Changes page gives a detailed look at the changes made on the feature branch, including:
1. List of files changed
1. Changes
- Red = removed
- Green = added
1. Click to leave comment on line
1. Choose “Start a review”
![mr-changes](_static/img/mr-changes.png)
After the review is started:
1. Comments are marked as pending
1. Submit the review
![mr-changes](_static/img/mr-change-submit.png)
#### Responding to Reviews
Reply to code review comments as needed. Use “Start a review” to submit all replies at once.
![mr-changes](_static/img/mr-respond.png)
Resolve threads when discussion on a particular piece of code is complete
![mr-changes](_static/img/mr-resolve.png)
### Merging the Merge Request
Merging:
1. Check all tests passed
1. Check all review comments resolved
1. Check at least one review approval
1. Before clicking “Merge”
- Check “Delete source branch”
- Check “Squash commits” if branch history not tidy
1. Click “Merge”
1. Celebrate
![mr-merge](_static/img/mr-merge.png)
# Contributing Documentation
This guide assumes the reader has read the [Contribution workflow](contributing.md) for details about making changes to
code within the gstlal repo, since the documentation files are updated by a similar workflow.
## Writing Documentation
In general, the gstlal documentation uses [RestructuredText (rst)](https://docutils.sourceforge.io/rst.html) files
ending in `.rst` or [Markdown](https://www.markdownguide.org/basic-syntax/) files ending in `.md`.
The documentation files for gstlal are located under `gstlal/doc/source`. If you add a new page (doc file), make sure to
reference it from the main index page.
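For example, adding a new page might look like the following sketch (the page name is illustrative; the toctree entry goes in the appropriate section of `index.rst`):
```bash
cd gstlal/doc/source
$EDITOR my_new_page.md   # write the new page in Markdown (or .rst)
$EDITOR index.rst        # add "my_new_page" to the relevant toctree so it is reachable from the index
```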
Useful Links:
- [MyST Directive Syntax](https://myst-parser.readthedocs.io/en/latest/syntax/syntax.html#syntax-directives)
......@@ -6,6 +6,5 @@ Executables
gstlal/bin/bin
gstlal-inspiral/bin/bin
gstlal-calibration/bin/bin
gstlal-burst/bin/bin
gstlal-ugly/bin/bin
.. _extrinsic-parameters-generation:
Generating Extrinsic Parameter Distributions
============================================
This tutorial will show you how to regenerate the extrinsic parameter
distributions used to determine the likelihood ratio term that accounts for the
relative times-of-arrival, phases, and amplitudes of a CBC signal at each of
the LVK detectors.
There are two parts described below that represent different terms. Full
documentation can be found here:
https://lscsoft.docs.ligo.org/gstlal/gstlal-inspiral/python-modules/stats.inspiral_extrinsics.html
Setting up the dt, dphi, dsnr dag
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1. Setup a work area and obtain the necessary input files
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
You will need to create a directory on a cluster running HTCondor, e.g.,
.. code:: bash
$ mkdir dt_dphi_dsnr
$ cd dt_dphi_dsnr
This workflow requires estimates of the power spectral densities for LIGO,
Virgo and KAGRA. For this tutorial we use projected O4 sensitivities from the
LIGO DCC. Feel free to substitute these to suit your needs.
We will use the following files found at: https://dcc.ligo.org/LIGO-T2000012 ::
aligo_O4high.txt
avirgo_O4high_NEW.txt
kagra_3Mpc.txt
Download the above files and place them in the dt_dphi_dsnr directory that you are currently in.
2. Execute commands to generate the HTCondor DAG
"""""""""""""""""""""""""""""""""""""""""""""""""
For this tutorial, we assume that you have a singularity container with the
gstlal software. More details can be found here:
https://lscsoft.docs.ligo.org/gstlal/installation.html
The following Makefile illustrates the sequence of commands required to generate an HTCondor workflow. You can copy this into a file called ``Makefile`` and modify it as you wish.
.. code:: make
SINGULARITY_IMAGE=/ligo/home/ligo.org/chad.hanna/development/gstlal-dev/
sexec=singularity exec $(SINGULARITY_IMAGE)
all: dt_dphi.dag
# 417.6 Mpc Horizon
H1_aligo_O4high_psd.xml.gz: aligo_O4high.txt
$(sexec) gstlal_psd_xml_from_asd_txt --instrument=H1 --output $@ $<
# 417.6 Mpc Horizon
L1_aligo_O4high_psd.xml.gz: aligo_O4high.txt
$(sexec) gstlal_psd_xml_from_asd_txt --instrument=L1 --output $@ $<
# 265.8 Mpc Horizon
V1_avirgo_O4high_NEW_psd.xml.gz: avirgo_O4high_NEW.txt
$(sexec) gstlal_psd_xml_from_asd_txt --instrument=V1 --output $@ $<
# 6.16 Mpc Horizon
K1_kagra_3Mpc_psd.xml.gz: kagra_3Mpc.txt
$(sexec) gstlal_psd_xml_from_asd_txt --instrument=K1 --output $@ $<
O4_projected_psds.xml.gz: H1_aligo_O4high_psd.xml.gz L1_aligo_O4high_psd.xml.gz V1_avirgo_O4high_NEW_psd.xml.gz K1_kagra_3Mpc_psd.xml.gz
$(sexec) ligolw_add --output $@ $^
# SNR ratios according to horizon ratios
dt_dphi.dag: O4_projected_psds.xml.gz
$(sexec) gstlal_inspiral_create_dt_dphi_snr_ratio_pdfs_dag \
--psd-xml $< \
--H-snr 8.00 \
--L-snr 8.00 \
--V-snr 5.09 \
--K-snr 0.12 \
--m1 1.4 \
--m2 1.4 \
--s1 0.0 \
--s2 0.0 \
--flow 15.0 \
--fhigh 1024.0 \
--NSIDE 16 \
--n-inc-angle 33 \
--n-pol-angle 33 \
--singularity-image $(SINGULARITY_IMAGE)
clean:
rm -rf H1_aligo_O4high_psd.xml.gz L1_aligo_O4high_psd.xml.gz V1_avirgo_O4high_NEW_psd.xml.gz logs dt_dphi.dag gstlal_inspiral_compute_dtdphideff_cov_matrix.sub gstlal_inspiral_create_dt_dphi_snr_ratio_pdfs.sub gstlal_inspiral_add_dt_dphi_snr_ratio_pdfs.sub dt_dphi.sh
3. Submit the HTCondor DAG and monitor the output
""""""""""""""""""""""""""""""""""""""""""""""""""
Next run make to generate the HTCondor DAG
.. code:: bash
$ make
Then submit the DAG
.. code:: bash
$ condor_submit_dag dt_dphi.dag
You can check the DAG progress by doing
.. code:: bash
$ tail -f dt_dphi.dag.dagman.out
4. Test the output
"""""""""""""""""""
When the DAG completes successfully, you should have a file called ``inspiral_dtdphi_pdf.h5``. You can verify that this file works with a python terminal, e.g.,
.. code:: bash
$ singularity exec /ligo/home/ligo.org/chad.hanna/development/gstlal-dev/ python3
Python 3.6.8 (default, Nov 10 2020, 07:30:01)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-44)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from gstlal.stats.inspiral_extrinsics import InspiralExtrinsics
>>> IE = InspiralExtrinsics(filename='inspiral_dtdphi_pdf.h5')
>>>
Setting up probability of instrument combinations dag
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1. Setup a work area and obtain the necessary input files
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
You will need to create a directory on a cluster running HTCondor, e.g.,
.. code:: bash
$ mkdir p_of_instruments
$ cd p_of_instruments
2. Execute commands to generate the HTCondor DAG
"""""""""""""""""""""""""""""""""""""""""""""""""
Below is a sample Makefile that will work if you are using Singularity. Run ``make`` to generate the DAG.
.. code:: make
SINGULARITY_IMAGE=/ligo/home/ligo.org/chad.hanna/development/gstlal-dev/
sexec=singularity exec $(SINGULARITY_IMAGE)
all:
$(sexec) gstlal_inspiral_create_p_of_ifos_given_horizon_dag --instrument=H1 --instrument=L1 --instrument=V1 --instrument=K1 --singularity-image $(SINGULARITY_IMAGE)
clean:
rm -rf gstlal_inspiral_add_p_of_ifos_given_horizon.sub gstlal_inspiral_create_p_of_ifos_given_horizon.sub logs p_of_I_H1K1L1V1.dag p_of_I_H1K1L1V1.sh
3. Submit the HTCondor DAG
"""""""""""""""""""""""""""
.. code:: bash
$ condor_submit_dag p_of_I_H1K1L1V1.dag
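As with the dt/dphi workflow, you can monitor the DAG's progress from the standard DAGMan log, e.g.
.. code:: bash
$ tail -f p_of_I_H1K1L1V1.dag.dagman.out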
See Also
^^^^^^^^
* https://arxiv.org/abs/1901.02227
* https://lscsoft.docs.ligo.org/gstlal/gstlal-inspiral/python-modules/stats.inspiral_extrinsics.html
* https://lscsoft.docs.ligo.org/gstlal/gstlal-inspiral/bin/gstlal_inspiral_create_dt_dphi_snr_ratio_pdfs.html
* https://lscsoft.docs.ligo.org/gstlal/gstlal-inspiral/bin/gstlal_inspiral_create_dt_dphi_snr_ratio_pdfs_dag.html
* https://lscsoft.docs.ligo.org/gstlal/gstlal-inspiral/bin/gstlal_inspiral_compute_dtdphideff_cov_matrix.html
.. _fake-data:
Fake Data Generation
=========================
WRITEME
......@@ -3,26 +3,36 @@
Feature Extraction
====================================================================================================
The `fxtools` module and related feature-based executables contain relevant libraries to identify
glitches in low-latency using auxiliary channel data.
SNAX (Signal-based Noise Acquisition and eXtraction), the `snax` module and related SNAX executables
contain relevant libraries to identify glitches in low-latency using auxiliary channel data.
`gstlal_feature_extractor` functions as a modeled search for data quality by applying matched filtering
SNAX functions as a modeled search for data quality by applying matched filtering
on auxiliary channel timeseries using waveforms that model a large number of glitch classes. Its primary
purpose is to whiten incoming auxiliary channels and extract relevant features in low-latency.
There are two different modes of output `gstlal_feature_extractor` can function in:
.. _feature_extraction-intro:
1. **Timeseries:** Production of regularly-spaced feature rows, containing the SNR, waveform parameters,
and the time of the loudest event in a sampling time interval.
2. **ETG:** This produces output that resembles that of a traditional event trigger generator (ETG), in
which only feature rows above an SNR threshold will be produced.
Introduction
------------
There are two different modes of feature generation:
1. **Timeseries:**
Production of regularly-spaced feature rows, containing the SNR, waveform parameters,
and the time of the loudest event in a sampling time interval.
2. **ETG:**
This produces output that resembles that of a traditional event trigger generator (ETG), in
which only feature rows above an SNR threshold will be produced.
One useful feature in using a matched filter approach to detect glitches is the ability to switch between
different glitch templates or generate a heterogeneous bank of templates.. Currently, there are Sine-Gaussian
and half-Sine-Gaussian waveforms implemented for use in detecting glitches, but the feature extractor was
designed to be fairly modular and so it isn't difficult to design and add new waveforms for use.
different glitch templates or generate a heterogeneous bank of templates. Currently, there are Sine-Gaussian,
half-Sine-Gaussian, and tapered Sine-Gaussian waveforms implemented for use in detecting glitches, but the feature
extractor is designed to be fairly modular and so it isn't difficult to design and add new waveforms for use.
Since the GstLAL feature extractor uses time-domain convolution to matched filter auxiliary channel timeseries
Since SNAX uses time-domain convolution to matched filter auxiliary channel timeseries
with glitch waveforms, this allows latencies to be much lower than in traditional ETGs. The latency upon writing
features to disk is O(5 s) in the current layout when using waveforms where the peak occurs at the edge of the
template (zero-latency templates). Otherwise, there is extra latency incurred due to the non-causal nature of
......@@ -32,7 +42,7 @@ the waveform itself.
digraph llpipe {
labeljust = "r";
label="gstlal_feature_extractor"
label="gstlal_snax_extract"
rankdir=LR;
graph [fontname="Roman", fontsize=24];
edge [ fontname="Roman", fontsize=10 ];
......@@ -128,17 +138,19 @@ the waveform itself.
}
.. _feature_extraction-highlights:
**Highlights:**
Highlights
----------
* Launch feature extractor jobs in online or offline mode:
* Launch SNAX jobs in online or offline mode:
* Online: Using /shm or framexmit protocol
* Offline: Read frames off disk
* Online/Offline DAGs available for launching jobs.
* Offline DAG parallelizes by time, channels are processed sequentially by subsets to reduce I/O concurrency issues. There are options to allow flexibility in choosing this, however.
* Offline DAG parallelizes by time, channels are processed sequentially by subsets to reduce I/O concurrency issues.
* On-the-fly PSD generation (or take in a prespecified PSD)
......@@ -161,3 +173,73 @@ the waveform itself.
* Waveform type (currently Sine-Gaussian and half-Sine-Gaussian only)
* Specify parameter ranges (frequency, Q for Sine-Gaussian based)
* Min mismatch between templates
.. _feature_extraction-online:
Online Operation
----------------
An online DAG is provided in /gstlal-burst/share/snax/Makefile.gstlal_feature_extractor_online
in order to provide a convenient way to launch online feature extraction jobs as well as auxiliary jobs as
needed (synchronizer/hdf5 file sinks). A condensed list of instructions for use is also provided within the Makefile itself.
There are four separate modes that can be used to launch online jobs:
1. Auxiliary channel ingestion:
a. Reading from framexmit protocol (DATA_SOURCE=framexmit).
This mode is recommended when reading in live data from LHO/LLO.
b. Reading from shared memory (DATA_SOURCE=lvshm).
This mode is recommended for reading in data for O2 replay (e.g. UWM).
2. Data transfer of features:
a. Saving features directly to disk, e.g. no data transfer.
This will save features to disk directly from the feature extractor,
and saves features periodically via hdf5.
b. Transfer of features via Kafka topics.
This requires a Kafka/Zookeeper service to be running (can be existing LDG
or your own). Features get transferred via Kafka from the feature extractor,
parallel instances of the extractor get synchronized, and the features are then sent downstream
where they can be read by other processes (e.g. iDQ). In addition, a streaming
hdf5 file sink is launched which periodically dumps features to disk.
In order to start up online runs, you'll need an installation of gstlal. An installation Makefile that
includes Kafka dependencies is located at: gstlal/gstlal-burst/share/feature_extractor/Makefile.gstlal_idq_icc
To run, make sure that the correct environment is sourced, then execute:
$ make -f Makefile.gstlal_feature_extractor_online
Then launch the DAG with:
$ condor_submit_dag feature_extractor_pipe.dag
.. _feature_extraction-offline:
Offline Operation
-----------------
An offline DAG is provided in /gstlal-burst/share/snax/Makefile.gstlal_feature_extractor_offline
in order to provide a convenient way to launch offline feature extraction jobs. A condensed list of
instructions for use is also provided within the Makefile itself.
For general use cases, the only configuration options that need to be changed are:
* User/Accounting tags: GROUP_USER, ACCOUNTING_TAG
* Analysis times: START, STOP
* Data ingestion: IFO, CHANNEL_LIST
* Waveform parameters: WAVEFORM, MISMATCH, QHIGH
In order to start up offline runs, you'll need an installation of gstlal. An installation Makefile that
includes Kafka dependencies is located at: gstlal/gstlal-burst/share/feature_extractor/Makefile.gstlal_idq_icc
To generate a DAG, make sure that the correct environment is sourced, then run:
$ make -f Makefile.gstlal_feature_extractor_offline
Then launch the DAG with:
$ condor_submit_dag feature_extractor_pipe.dag
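The configuration options listed above are normally edited in the Makefile itself, but like any Makefile variables they can also be overridden on the ``make`` command line. For example (the values shown are placeholders):
.. code:: bash
$ make -f Makefile.gstlal_feature_extractor_offline START=1187000000 STOP=1187100000 IFO=H1 CHANNEL_LIST=H1_channel_list.txt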
......@@ -10,9 +10,7 @@ Overview
The GstLAL software package is used for the following activities:
- ``gstlal`` provides core Gstreamer plugins for signal processing workflows with LIGO data and core python bindings for constructing such workflows.
- ``gstlal-calibration`` provides real-time calibration of LIGO control system data into strain data.
- ``gstlal-inspiral`` provides additional signal processing plugins that are specific for LIGO / Virgo searches for compact binaries as well as a substantial amount of python code for post-processing raw signal processing results into gravitational wave candidate lists. Several publications about the methodology and workflow exist, see :ref:`publications`
......@@ -34,11 +32,20 @@ The GstLAL software package is used for the following activities:
:maxdepth: 2
cbc_analysis
feature_extraction
fake_data
psd_estimation
simulated_data
workflow_config
publications
.. toctree::
:caption: Developer Guide
:maxdepth: 2
local_environment
container_environment
contributing
contributing_docs
.. toctree::
:caption: API Reference
:maxdepth: 2
......@@ -63,3 +70,4 @@ Indices and tables
* :ref:`modindex`
* :ref:`search`
.. _installation:
Installation
=============
There are various ways to get started with GstLAL:
* :ref:`Install the latest release <install-release>`. Pre-built packages are available through various mechanisms.
* :ref:`Use a version provided in an IGWN reference distribution <install-igwn>`. This option is available to members of the International Gravitational-Wave Observatory Network (IGWN).
* :ref:`Building the package from source <install-source>`. This is needed for users who wish to contribute to the project.
.. _install-release:
Installing the latest release
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Docker container (recommended)
""""""""""""""""""""""""""""""
The following should pull a GstLAL container and enter an environment with
GstLAL and all its dependencies pre-installed:
.. code:: bash
$ docker run -it --rm containers.ligo.org/lscsoft/gstlal:latest
Note that you will need `Docker <https://docs.docker.com/get-docker/>`_
installed. If that is not an option (Docker requires sudo privileges), you can
instead use `Singularity
<https://sylabs.io/guides/3.7/user-guide/quick_start.html>`_ in place of Docker,
which is available on many shared computing resources such as XSEDE and the OSG:
.. code:: bash
$ singularity run docker://containers.ligo.org/lscsoft/gstlal:latest
Conda installation
"""""""""""""""""""
Install conda using the `miniconda <https://docs.conda.io/projects/conda/en/latest/user-guide/install/>`_ installer, then run:
.. code:: bash
$ conda install -c conda-forge gstlal-inspiral
In order to check your installation, you can use:
.. code:: bash
$ conda list gstlal-inspiral # to check which version is installed
$ gstlal_play --help
.. warning::
These packages don't make use of any :ref:`math optimizations <install-math_optimizations>`
and are not currently recommended for production or larger-scale analyses.
.. _install-igwn:
IGWN distributions of GstLAL
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If you are an IGWN member and have access to shared computing resources,
up-to-date GstLAL libraries are available in the set of reference software
environments maintained by the IGWN Computing and Software Working Group.
LIGO Data Grid (LDG)
"""""""""""""""""""""
GstLAL packages are installed and available by default on the LDG. You can start
using the GstLAL library immediately:
.. code:: bash
$ gstlal_play --help
IGWN Conda Distribution
""""""""""""""""""""""""
GstLAL is also available on the IGWN Conda Distribution in a variety of
pre-packaged environments. For more information, see
`computing.docs.ligo.org/conda/ <https://computing.docs.ligo.org/conda/>`_.
.. _install-source:
You can get a development copy of the gstlal software suite from git. Doing this at minimum will require a development copy of lalsuite.
* https://git.ligo.org/lscsoft/gstlal
* https://git.ligo.org/lscsoft/lalsuite
Building from source
^^^^^^^^^^^^^^^^^^^^^
Source tarballs for GstLAL packages and all the LIGO/Virgo software dependencies are available here: http://software.ligo.org/lscsoft/source/
Building from source is required for development (bug fixes, new features,
documentation improvements). You can check out the latest source of GstLAL from
git:
Limited binary packages are available here: https://wiki.ligo.org/Computing/DASWG/SoftwareDownloads
.. code:: bash
Building and installing from source follows the normal GNU build procedures
involving:
$ git clone https://git.ligo.org/lscsoft/gstlal.git
$ cd gstlal
Building and installing from source follows the normal GNU build procedures involving:
1. ./00init.sh
2. ./configure
3. make
4. make install.
You should build the packages in order of gstlal, gstlal-ugly,
gstlal-calibration, gstlal-inspiral. If you are building to a non FHS place
(e.g., your home directory) you will need to ensure some environment variables
are set so that your installation will function. The following five variables
must be set. As **just an example**::
GI_TYPELIB_PATH="/path/to/your/installation/lib/girepository-1.0:${GI_TYPELIB_PATH}"
GST_PLUGIN_PATH="/path/to/your/installation/lib/gstreamer-0.10:${GST_PLUGIN_PATH}"
PATH="/path/to/your/installation/bin:${PATH}"
# Debian systems need lib, RH systems need lib64, including both doesn't hurt
PKG_CONFIG_PATH="/path/to/your/installation/lib/pkgconfig:/path/to/your/installation/lib64/pkgconfig:${PKG_CONFIG_PATH}"
# Debian systems need lib, RH systems need lib and lib64
PYTHONPATH="/path/to/your/installation/lib64/python2.7/site-packages:/path/to/your/installation/lib/python2.7/site-packages:$PYTHONPATH"
4. make install
Since GstLAL is a collection of packages, there is a required build order to
install the packages:
1. gstlal
2. gstlal-ugly
3. gstlal-burst / gstlal-inspiral (any order)
If you are building from source, you will also need to install all dependencies
before building GstLAL, including:
* fftw
* gsl
* gstreamer
* lalsuite
* ldas-tools-framecpp
* numpy
* pygobject
* python-ligo-lw
* scipy
These dependencies can be installed in various ways, including conda, your
favorite package manager (apt/yum), or from source. We also provide containers
that are suitable for development.
Singularity container
""""""""""""""""""""""
A development container is provided with all necessary dependencies to install
GstLAL from source. Singularity also has extra features that make it possible to
create writable containers, making it easy to get started with development:
.. code:: bash
$ singularity build --sandbox --fix-perms gstlal-dev docker://containers.ligo.org/lscsoft/gstlal:master
This will pull a container built from the main branch from the container registry and
build it in 'sandbox' mode into ``gstlal-dev``. Sandbox mode allows the container to be
invoked in writable mode once it's built, which is needed to install software into it.
This may take a few minutes to set up compared to a normal pull.
Once that's finished, you can enter the container in writable mode to install GstLAL from source:
.. code:: bash
$ singularity run --writable gstlal-dev
$ git clone https://git.ligo.org/lscsoft/gstlal.git
$ cd gstlal
.. note::
It's possible to run into issues when adding bind mounts to a writable
Singularity container depending on how Singularity is configured. This
may cause the following error to occur:
.. code:: bash
$ singularity run --writable gstlal-dev
WARNING: By using --writable, Singularity can't create /cvmfs destination automatically without overlay or underlay
FATAL: container creation failed: mount /cvmfs->/cvmfs error: while mounting /cvmfs: destination /cvmfs doesn't exist in container
In this case, one way to resolve this error is to add directories in the
container for each of the bind mounts explicitly, e.g.,:
.. code:: bash
$ mkdir gstlal-dev/cvmfs
$ singularity run --writable -B /cvmfs gstlal-dev
Singularity>
Now you can follow the normal GNU build procedures to build and install GstLAL.
It is also recommended to install GstLAL into the container's ``/usr``
directory, which is done at the configure step, e.g.
.. code:: bash
$ ./configure --prefix /usr
.. _install-math_optimizations:
Math-optimized Libraries
^^^^^^^^^^^^^^^^^^^^^^^^^^
GstLAL relies heavily on Linear Algebra and FFT routines to drive various
signal-processing components within the library. Accelerating these
math-heavy routines, which rely on BLAS, with MKL considerably improves
performance. The containers
that are available via Docker and Singularity are linked against MKL to take
advantage of these optimizations. There is no additional work needed from the user.
Work is ongoing to provide Conda packages to allow for similar optimizations.
# Local Development Environment
The local development workflow consists of a few key points:
- Managed conda environments using [conda-flow](https://git.ligo.org/james.kennington/conda-flow)
- Using integrated development environments (IDEs) like [PyCharm]()/[CLion]()
- Running applications that can consume `gstlal`, like [Jupyter Notebook]()
## Creating the environment
Thanks to conda-flow, creating the environment is simple. Before we can use conda-flow, it must be installed. In
whatever environment you prefer, install conda-flow from pip:
```bash
pip install conda-flow
```
Once conda-flow is installed, there are locked environment files contained within the gstlal repo under
`gstlal/gstlal/share/conda/envs`. Using conda-flow, we can make sure the local development conda environment is built:
```bash
conda-flow activate -n gstlal-dev -c /path/to/gstlal/gstlal/share/conda/conda-flow.yml
```
## Activating the environment
The `activate` command within conda-flow is done through subprocesses, and consequently will *not* affect the parent
process, such as the shell from which conda-flow is called. This is done to prevent unintended side effects; however, it
also means that unlike `conda activate`, `conda-flow activate` will not activate the environment inside the shell. If
you wish to activate the environment inside the shell, run `conda activate gstlal-dev`.
## Using Developer Tools
To use an IDE to develop `gstlal`, you will first need to start your IDE from within the appropriate conda environment.
For example, to launch the PyCharm IDE, run:
```bash
conda-flow activate -n gstlal-dev --run-after "open -na ""PyCharm.app""" -c /path/to/gstlal/gstlal/share/conda/conda-flow.yml
```
### Python Development
Note that the python source modules are not in a typical pythonic package structure (due to use of GNU build for c code
within gstlal). This can present problems with package indexing / imports in the IDE. The solution is to provide the IDE
with a map of the proper import paths, which can be done in one of two ways:
1. Build `gstlal` (or at least the python components) and add the build directory as a source directory in your IDE
1. Create a new source directory full of symlinks to the source files with a structure that mimics the import paths.
There is a utility for constructing such symlinks at `gstlal/gstlal/share/conda/dev_links.py`
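For instance, a hypothetical invocation of that utility (check the script itself for its actual arguments and where it writes the symlinks):
```bash
# hypothetical usage; consult dev_links.py for its real command-line interface
cd /path/to/gstlal
python gstlal/share/conda/dev_links.py
```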
## Launching Applications
Conda-flow is capable of running arbitrary commands after activating the environment in the subprocess, which is useful
for launching applications in a controlled way. For example, to run a jupyter notebook:
```bash
conda-flow activate -n gstlal-dev --run-after "jupyter notebook" -c /path/to/gstlal/gstlal/share/conda/conda-flow.yml
```
......@@ -3,4 +3,59 @@
PSD Generation
================
WRITEME
Using this workflow configuration (``config.yml``), we can generate a
median-averaged PSD across 100000 seconds of Hanford and Livingston data in O2:
.. code:: yaml
# general options
start: 1187000000
stop: 1187100000
ifos: H1L1
# data discovery options
source:
frame-type:
H1: H1_GWOSC_O2_16KHZ_R1
L1: L1_GWOSC_O2_16KHZ_R1
channel-name:
H1: GWOSC-16KHZ_R1_STRAIN
L1: GWOSC-16KHZ_R1_STRAIN
sample-rate: 4096
frame-segments-file: segments.xml.gz
frame-segments-name: datasegments
# psd options
psd:
fft-length: 8
sample-rate: 4096
# condor options
condor:
accounting-group: your.accounting.group
profile: ldas
This sets general options such as the start and end times and the detectors to
analyze (H1 and L1). Data discovery options specify how to retrieve the strain
data as well as specifying the file used for science segments (``segments.xml.gz``),
and the sampling rate. PSD-specific options include the FFT length used. Finally,
workflow-specific options (via HTCondor) specify settings such as the accounting group
and the grid profile to use. If you have not done so yet, you can install grid profiles
locally and check which grid profiles are currently available via:
.. code:: bash
$ gstlal_grid_profile install
$ gstlal_grid_profile list # ldas should show up as one of the possible options
To launch the workflow via Condor with the configuration above, run:
.. code:: bash
$ gstlal_query_gwosc_segments 1187000000 1187100000 H1L1
$ gstlal_psd_workflow -c config.yml
$ condor_submit_dag psd_dag.dag
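As with other gstlal workflows, the resulting DAG can be monitored through the standard DAGMan log (assuming the DAG file is named ``psd_dag.dag`` as above):
.. code:: bash
$ tail -f psd_dag.dag.dagman.out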
.. _simulated-data:
Simulated Data
===============
See :ref:`simulated-data-tutorial`.
Documentation for creating fake data
====================================
.. _simulated-data-tutorial:
Tutorial: Generation of simulated data
=======================================
Introduction
------------
......@@ -17,9 +19,9 @@ Consult :any:`gstlal_fake_frames` for more details
The basic steps to generate and validate LIGO colored noise are:
1. use gstlal_fake_frames to make the data (examples in the documenation include this)
2. verify that the PSD is as you would expect with gstlal_reference_psd (examples in the documentation include this)
3. plot the resulting PSD with gstlal_plot_psd (examples in the documentation include this)
1. Use ``gstlal_fake_frames`` to make the data
2. Verify that the PSD is as you would expect with ``gstlal_reference_psd``
3. Plot the resulting PSD with ``gstlal_plot_psd``
An example PSD plot:
......@@ -31,14 +33,9 @@ Custom colored noise, i.e. simulate your own detector
Consult :any:`gstlal_fake_frames` for more details
1. Start by obtaining a reference PSD that you wish to have as the target for
recoloring. If you actually have a text file ASD such as this one: e.g. <a
href=http://www.lsc-group.phys.uwm.edu/cgit/gstlal/plain/gstlal/share/v1_early_asd.txt>here</a>,
then you will need to first use gstlal_psd_xml_from_asd_txt to convert it
(examples in the documentation include this)
1. Next use gstlal_fake_frames to make the data with the desired PSD (examples
in the documentation include this)
1. Repeat the same validation steps as above to obtain, e.g.:
1. Start by obtaining a reference PSD that you wish to have as the target for recoloring. If you actually have a text file ASD such as this one: e.g. `here <https://git.ligo.org/lscsoft/gstlal/raw/master/gstlal/share/v1_early_asd.txt>`_, then you will need to first use ``gstlal_psd_xml_from_asd_txt`` to convert it
2. Next use ``gstlal_fake_frames`` to make the data with the desired PSD
3. Repeat the same validation steps as above to obtain, e.g.:
.. image:: ../gstlal/images/V1fakedataexamplepsd.png
:width: 400px
......@@ -53,36 +50,31 @@ This procedure assumes you are on an LDG cluster which has the data you wish to
recolor. Note that some of the required tools are not gstlal based. Please
consult the documentation for the external tools should you have questions.
1. First obtain segments for the data using ligolw_segment_query
2. Next obtain the frame file cache from ligo_data_find
3. Then create the PSD you wish to recolor to (perhaps using gstlal_psd_xml_from_asd_txt)
4. compute a reference spectrum from the frame data that you wish to recolor using gstlal_reference_psd
5. You might choose to optionally "smooth" the reference spectrum in order to leave lines in the underlying data. You can try using gstlal_psd_polyfit
6. Now with segments, a frame cache, a PSD (possibly smoothed) measured from the frame cache, and a PSD that is the target for the recolored spectrum, you are free to use gstlal_fake_frames according to the documentation.
1. First obtain segments for the data using ``ligolw_query_gwosc_segments``
2. Then create the PSD you wish to recolor to (perhaps using ``gstlal_psd_xml_from_asd_txt``)
3. Compute a reference spectrum from the strain data that you wish to recolor using ``gstlal_reference_psd``
4. You might optionally choose to "smooth" the reference spectrum in order to leave lines in the underlying data. You can try using ``gstlal_psd_polyfit``
5. Now with segments, a PSD (possibly smoothed) measured from the strain data, and a PSD that is the target for the recolored spectrum, you are free to use ``gstlal_fake_frames`` according to the documentation.
Recoloring existing data with a HTCondor dag
--------------------------------------------
Some of the steps required for the batch processing of recoloring a
large data set have been automated in a script that generates a condor DAG. The
input to the condor dag script has itself been automated in makefiles such as:
<a
href=http://www.lsc-group.phys.uwm.edu/cgit/gstlal/plain/gstlal/share/Makefile.2015recolored>this</a>.
input to the condor dag script has itself been automated in makefiles such as
`this <https://git.ligo.org/lscsoft/gstlal/raw/master/gstlal/share/Makefile.2015recolored>`_.
As an example try this::
$ wget http://www.lsc-group.phys.uwm.edu/cgit/gstlal/plain/gstlal/share/Makefile.2015recolored
$ wget https://git.ligo.org/lscsoft/gstlal/raw/master/gstlal/share/Makefile.2015recolored
$ wget https://git.ligo.org/lscsoft/gstlal/raw/master/gstlal/share/recolored_config.yml
$ make -f Makefile.2015recolored
$ condor_submit_dag gstlal_fake_frames_pipe.dag
$ condor_submit_dag fake_frames_dag.dag
You can monitor the dag progress with::
$ tail -f gstlal_fake_frames_pipe.dag.dagman.out
You should have directories called LIGO and Virgo that contain the recolored frame data. Try changing values in the Makefile to match what you need
TODO
----
$ tail -f fake_frames_dag.dag.dagman.out
1. Add support for making colored noise in the gstlal_fake_frames_pipe
You should have a ``/frames`` directory that contains the recolored frame data. Experiment
with changing parameters in the Makefile to generate different PSDs, create frames over different stretches
of data, etc.
Running offline feature extraction jobs
####################################################################################################
An offline DAG is provided in /gstlal-burst/share/feature_extractor/Makefile.gstlal_feature_extractor_offline
in order to provide a convenient way to launch offline feature extraction jobs. A condensed list of
instructions for use is also provided within the Makefile itself.
For general use cases, the only configuration options that need to be changed are:
* User/Accounting tags: GROUP_USER, ACCOUNTING_TAG
* Analysis times: START, STOP
* Data ingestion: IFO, CHANNEL_LIST
* Waveform parameters: WAVEFORM, MISMATCH, QHIGH
Launching DAGs
====================================================================================================
In order to start up offline runs, you'll need an installation of gstlal. An installation Makefile that
includes Kafka dependencies is located at: gstlal/gstlal-burst/share/feature_extractor/Makefile.gstlal_idq_icc
To generate a DAG, make sure that the correct environment is sourced, then run:
$ make -f Makefile.gstlal_feature_extractor_offline
Then launch the DAG with:
$ condor_submit_dag feature_extractor_pipe.dag
Configuration options
====================================================================================================
Analysis times:
* START: set the analysis gps start time
* STOP: set the analysis gps stop time
Data ingestion:
* IFO: select the IFO for auxiliary channels to be ingested (H1/L1).
* CHANNEL_LIST: a list of channels for the feature extractor to process. Provided
lists for O1/O2 and H1/L1 lists are in gstlal/gstlal-burst/share/feature_extractor.
* MAX_SERIAL_STREAMS: Maximum # of streams that a single gstlal_feature_extractor job will
process at once. This is determined by sum_i(channel_i * # rates_i). Number of rates for a
given channel is determined by log2(max_rate/min_rate) + 1.
* MAX_PARALLEL_STREAMS: Maximum # of streams that a single job will run in the lifespan of a job.
This is distinct from serial streams since when a job is first launched, it will cache
auxiliary channel frames containing all channels that meet the criterion here, and then process
each channel subset sequentially determined by the serial streams. This is to save on input I/O.
* CONCURRENCY: determines the maximum # of concurrent reads from the same frame file. For most
purposes, it will be set to 1. Use this at your own risk.
Waveform parameters:
* WAVEFORM: type of waveform used to perform matched filtering (sine_gaussian/half_sine_gaussian).
* MISMATCH: maximum mismatch between templates (corresponding to Omicron's mismatch definition).
* QHIGH: maximum value of Q
Data transfer/saving:
* OUTPATH: directory in which to save features.
* SAVE_CADENCE: span of a typical dataset within an hdf5 file.
* PERSIST_CADENCE: span of a typical hdf5 file.
Setting the number of streams (ADVANCED USAGE)
====================================================================================================
NOTE: This won't have to be changed for almost all use cases, and the current configuration has been
optimized to aim for short run times.
Definition: Target number of streams (N_channels x N_rates_per_channel) that each cpu will process.
* if max_serial_streams > max_parallel_streams, all jobs will be parallelized by channel
* if max_parallel_streams > num_channels in channel list, all jobs will be processed serially,
with processing driven by max_serial_streams.
* any other combination will produce a mix of parallelization by channels and processing channels serially per job.
Playing around with combinations of MAX_SERIAL_STREAMS, MAX_PARALLEL_STREAMS, CONCURRENCY, will entirely
determine the structure of the offline DAG. Doing so will also change the memory usage for each job, and so you'll
need to tread lightly. Changing CONCURRENCY in particular may cause I/O locks due to jobs fighting to read from the same
frame file.
Running online feature extraction jobs
####################################################################################################
An online DAG is provided in /gstlal-burst/share/feature_extractor/Makefile.gstlal_feature_extractor_online
in order to provide a convenient way to launch online feature extraction jobs as well as auxiliary jobs as
needed (synchronizer/hdf5 file sinks). A condensed list of instructions for use is also provided within the Makefile itself.
There are four separate modes that can be used to launch online jobs:
1. Auxiliary channel ingestion:
a. Reading from framexmit protocol (DATA_SOURCE=framexmit).
This mode is recommended when reading in live data from LHO/LLO.
b. Reading from shared memory (DATA_SOURCE=lvshm).
This mode is recommended for reading in data for O2 replay (e.g. UWM).
2. Data transfer of features:
a. Saving features directly to disk, e.g. no data transfer.
This will save features to disk directly from the feature extractor,
and saves features periodically via hdf5.
b. Transfer of features via Kafka topics.
This requires a Kafka/Zookeeper service to be running (can be existing LDG
or your own). Features get transferred via Kafka from the feature extractor,
parallel instances of the extractor get synchronized, and then sent downstream
where it can be read by other processes (e.g. iDQ). In addition, a streaming
hdf5 file sink is launched where it'll dump features periodically to disk.
Launching DAGs
====================================================================================================
In order to start up online runs, you'll need an installation of gstlal. An installation Makefile that
includes Kafka dependencies is located at: gstlal/gstlal-burst/share/feature_extractor/Makefile.gstlal_idq_icc
To run, make sure that the correct environment is sourced, then execute:
$ make -f Makefile.gstlal_feature_extractor_online
Then launch the DAG with:
$ condor_submit_dag feature_extractor_pipe.dag
Configuration options
====================================================================================================
General:
* TAG: sets the name used for logging purposes, Kafka topic naming, etc.
Data ingestion:
* IFO: select the IFO for auxiliary channels to be ingested.
* CHANNEL_LIST: a list of channels for the feature extractor to process. Provided
lists for O1/O2 and H1/L1 lists are in gstlal/gstlal-burst/share/feature_extractor.
* DATA_SOURCE: Protocol for reading in auxiliary channels (framexmit/lvshm).
* MAX_STREAMS: Maximum # of streams that a single gstlal_feature_extractor process will
process. This is determined by sum_i(channel_i * # rates_i). Number of rates for a
given channel is determined by log2(max_rate/min_rate) + 1.
Waveform parameters:
* WAVEFORM: type of waveform used to perform matched filtering (sine_gaussian/half_sine_gaussian).
* MISMATCH: maximum mismatch between templates (corresponding to Omicron's mismatch definition).
* QHIGH: maximum value of Q
Data transfer/saving:
* OUTPATH: directory in which to save features.
* SAVE_FORMAT: determines whether to transfer features downstream or save directly (kafka/hdf5).
* SAVE_CADENCE: span of a typical dataset within an hdf5 file.
* PERSIST_CADENCE: span of a typical hdf5 file.
Kafka options:
* KAFKA_TOPIC: basename of topic for features generated from feature_extractor
* KAFKA_SERVER: Kafka server address where Kafka is hosted. If features are run in same location,
as in condor's local universe, setting localhost:port is fine. Otherwise you'll need to determine
the IP address where your Kafka server is running (using 'ip addr show' or equivalent).
* KAFKA_GROUP: group for which Kafka producers for feature_extractor jobs report to.
Synchronizer/File sink options:
* PROCESSING_CADENCE: cadence at which incoming features are processed, so as to limit polling
of topics repeatedly, etc. Default value of 0.1s is fine.
* REQUEST_TIMEOUT: timeout for waiting for a single poll from a Kafka consumer.
* LATENCY_TIMEOUT: timeout for the feature synchronizer before older features are dropped. This
is to prevent a single feature extractor job from holding up the online pipeline. This will
also depend on the latency induced by the feature extractor, especially when using templates
that have latencies associated with them such as Sine-Gaussians.
......@@ -7,5 +7,3 @@ Tutorials
gstlal_fake_data_overview
online_analysis
offline_analysis
online_fx_jobs
offline_fx_jobs
.. _workflow-config:
Workflow Configuration
=======================
WRITEME