diff --git a/doc/source/gstlal-inspiral/tutorials/offline_analysis.rst b/doc/source/gstlal-inspiral/tutorials/offline_analysis.rst index 1c089ca5d0137646f711fbbd275ac63c8af87845..2b0f034cdee56c9c6c0b8efb3ad057e3fb592065 100644 --- a/doc/source/gstlal-inspiral/tutorials/offline_analysis.rst +++ b/doc/source/gstlal-inspiral/tutorials/offline_analysis.rst @@ -13,7 +13,7 @@ Introduction This tutorial will help you to setup and run a offline gravitational wave search for binary neutron stars. The information contained within this document can easily be modified to perform a wide range of searches. -The offline analysis has a somewhat involved setup procedure. This documentation covers all of it. The analysis itself is performed by a pipeline contained within a dag (Directed Acyclic Graph) that is managed by condor. The dag and job sub files are produced by running gstlal_inspiral_pipe. This program requires several input files that are produced in several steps, all of which are detailed below. These input files are: +The offline analysis has a somewhat involved setup procedure, which is usually performed using a Makefile. This documentation covers everything needed to set up an offline search. The analysis itself is performed by a pipeline contained within a dag (Directed Acyclic Graph) that is managed by condor. The dag and job sub files are produced by running gstlal_inspiral_pipe. This program requires several input files that are produced in several steps, all of which are detailed below. These input files are: * segments.xml.gz * vetoes.xml.gz @@ -25,12 +25,12 @@ The offline analysis has a somewhat involved setup procedure. This documentation The steps to produce the full analysis dag file are: - 1. Analysis variables defined at the top of offline Makefile - 2. Generate frame cache, segments, vetoes, and tisi files - 3. Generate/copy template bank and then split this into sub-banks - 4. Run gstlal_inspiral_pipe to produce offline analysis dag + 1. 
Set the analysis variables defined at the top of the offline Makefile. + 2. Generate frame cache, segments, vetoes, and tisi files. + 3. Generate/copy template bank and then split this into sub-banks. + 4. Run gstlal_inspiral_pipe to produce offline analysis dag and sub files. -The information contained within this page is based off the O2 BNS test dag, an offline analysis focused on 100,000s centered around GW170817. The dag used to perform the analysis can be produced using a `Makefile <https://git.ligo.org/lscsoft/gstlal/blob/master/gstlal-inspiral/share/O3/offline/O2/Makefile.BNS_HL_test_dag_O2>`_ that generates most of the required files. This tutorial will just cover the HL detector pair configuration, though a HLV Makefile can be found `here <https://git.ligo.org/lscsoft/gstlal/blob/master/gstlal-inspiral/share/O3/offline/O2/Makefile.BNS_HLV_test_dag_O2>`_. In this tutorial we detail each stage of the Makefile needed to run an offline analysis. +The information contained within this page is based on the O2 BNS HL test dag, an offline analysis focused on 100,000s centered around GW170817. The dag used to perform the analysis can be produced using a `Makefile <https://git.ligo.org/lscsoft/gstlal/blob/master/gstlal-inspiral/share/O3/offline/O2/Makefile.BNS_HL_test_dag_O2>`_ that generates most of the required files. This tutorial will just cover the HL detector pair configuration, though an HLV Makefile can be found `here <https://git.ligo.org/lscsoft/gstlal/blob/master/gstlal-inspiral/share/O3/offline/O2/Makefile.BNS_HLV_test_dag_O2>`_. In this tutorial we detail each stage of the Makefile needed to run an offline analysis. Analysis variables defined at the top of offline Makefile --------------------------------------------------------- @@ -43,7 +43,7 @@ An accounting tag used to measure LDG computational use. See https://ldas-gridmo GROUP_USER=albert.einstein -This should be your alber.einstein user idenification. This is only needed if using a shared account. 
:: +This should be your albert.einstein user identification. This is only needed if using a shared account. :: IFOS = H1 L1 MIN_IFOS = 2 @@ -53,7 +53,7 @@ Define which detectors to include within the analysis. H1, L1, and V1 are curren START = 1187000000 STOP = 1187100000 -Set start and stop time of the analysis in GPS seconds. The times stated here are 100,000s around GW170817. See https://www.gw-openscience.org/gps/ for conversions. :: +Set start and stop time of the analysis in GPS seconds. The times stated here are 100,000s around GW170817. See https://www.gw-openscience.org/gps/ for GPS time conversions. :: TAG = BNS_test_dag RUN = run_1 @@ -63,11 +63,13 @@ Set output directory for summary page of results. :: MCHIRP_INJECTIONS := 0.5:100.0:1_injections.xml -Used to specify injection file, and chirpmass range over which to filter it. Multiple injection files can be given at once, these should be space separated, with no whitespace at the end of the line. :: +Used to specify the injection file and the chirp mass range over which to filter it. Multiple injection files can be given at once; these should be space separated, with no whitespace at the end of the line. + +**NOTE: an injection file must be passed to gstlal_inspiral_pipe; it is unable to run without one.** :: VETODEF = /path/to/H1L1-CBC_VETO_DEFINER_CLEANED_C02_O2_1164556817-23176801.xml -Veto definer file. Used to determine what data to veto. See https://git.ligo.org/detchar/veto-definitions/tree/master/cbc for all veto definer files. :: +Veto definer file. Used to determine which data to veto. See https://git.ligo.org/detchar/veto-definitions/tree/master/cbc for all veto definer files. :: # GSTLAL_SEGMENTS Options SEG_SERVER=https://segments.ligo.org @@ -84,25 +86,24 @@ Veto definer file. Used to determine what data to veto. See https://git.ligo.org H1_CHANNEL=DCH-CLEAN_STRAIN_C02 L1_CHANNEL=DCH-CLEAN_STRAIN_C02 -Gravitational wave data segment, frame type, and channel name information. 
See https://wiki.ligo.org/LSC/JRPComm/ for full information about all observing runs. :: +Gravitational wave data segment, frame type, and channel name information. See https://wiki.ligo.org/LSC/JRPComm/ for full details about all observing runs. :: include /path/to/Makefile.offline_analysis_rules Full path to [Makefile.offline_analysis_rules](https://git.ligo.org/lscsoft/gstlal/blob/master/gstlal-inspiral/share/Makefile.offline_analysis_rules). This file contains sets of rules for string parsing/manipulation used within the main Makefile and an up-to-date version must be included. - -Generate segments, vetoes, frame cache, and tisi files +Generate frame cache, segments, vetoes, and tisi files ------------------------------------------------------ -Generating frame.cache file -^^^^^^^^^^^^^^^^^^^^^^^^^^^ +frame.cache file +^^^^^^^^^^^^^^^^ The frame.cache file contains the full paths to the Gravitational Wave data .gwf files using the following format: + Detector site identifier, frame type, start GPS time, duration, full path to file :: H H1__H1_CLEANED_HOFT_C02 1186998263 4096 file://localhost/hdfs/frames/O2/hoft_C02_clean/H1/H-H1_CLEANED_HOFT_C02-11869/H-H1_CLEANED_HOFT_C02-1186998263-4096.gwf - If the .gwf data files are stored locally, then you can produce individual detector frame cache files with:: gw_data_find -o H -t $(HANFORD_FRAME_TYPE) -l -s $(START) -e $(STOP) --url-type file | awk '{ print $$1" $*_"$$2" "$$3" "$$4" "$$5}' > H1_frame.cache @@ -120,8 +121,8 @@ And then create a combined frame.cache file with some additional formating:: sed -i s/H\ $(LIGO_FRAME_TYPE)/H\ H1_$(LIGO_FRAME_TYPE)/g frame.cache sed -i s/L\ $(LIGO_FRAME_TYPE)/L\ L1_$(LIGO_FRAME_TYPE)/g frame.cache -Generating segments.xml.gz and vetoes.xml.gz files -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +segments.xml.gz and vetoes.xml.gz files +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The segments.xml.gz file contains a list of all data segments that should be analysed. 
The vetoes.xml.gz file contains a list of all data segments that should be ignored. :: @@ -137,7 +138,6 @@ This returns an initial segments list. This command makes use of some Makefile v This queries the ligo segment server for all veto types (CAT1, CAT2, and CAT3) that are defined within the veto definer file :: - ligolw_add --output CAT1_vetoes.xml.gz $(CAT1_VETOES_FILES) ligolw_cut --delete-column segment:segment_def_cdb --delete-column segment:creator_db --delete-column segment_definer:insertion_time CAT1_vetoes.xml.gz gzip CAT1_vetoes.xml.gz @@ -149,14 +149,14 @@ Produce CAT1 vetoes file. :: gstlal_segments_operations --diff --output-file segments.xml.gz segdb.xml CAT1_vetoes.xml.gz gstlal_segments_trim --trim $(SEGMENT_TRIM) --gps-start-time $(START) --gps-end-time $(STOP) --min-length $(SEGMENT_MIN_LENGTH) --output segments.xml.gz segments.xml.gz -Combine initial segment files with CAT1 vetoe times removed. :: +Combine initial segment files with CAT1 veto times removed to produce the segments.xml.gz file. :: ./lauras_txt_files_to_xml -i $* -c -o $*-gates.xml $*-GATES-1163203217-24537601.txt ligolw_no_ilwdchar $*-gates.xml gstlal_segments_operations --union --segment-name VETO_CAT3_CUMULATIVE --output-file %_vetoes.xml.tmp --output-segment-name vetoes $*-VETOTIME_CAT3-*.xml $*-VETOTIME_CAT3-*.xml gstlal_segments_operations --union --segment-name vetoes --output-file %_vetoes.xml --output-segment-name vetoes %_vetoes.xml.tmp $*-gates.xml -Include gating times into CAT3 veto times files. :: +Include gating times into CAT3 veto times files. The gating files contain additional times to veto that are not included within the veto definer file. The ASCII files are converted into readable XML files with lauras_txt_files_to_xml. 
:: ligolw_add --output vetoes.xml.gz $(VETOES_FILES) ligolw_cut --delete-column segment:segment_def_cdb --delete-column segment:creator_db --delete-column segment_definer:insertion_time vetoes.xml.gz @@ -164,9 +164,9 @@ Include gating times into CAT3 veto times files. :: Combine all vetoe files into single vetoes.xml.gz file. -Generating tisi.xml.gz and inj_tisi.xml.gz file -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -:: +tisi.xml.gz and inj_tisi.xml.gz file +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Tisi (time slide) files are used to offset events within the analysis when calculating the background. :: lalapps_gen_timeslides --instrument=H1=0:0:0 --instrument=L1=0:0:0 inj_tisi.xml @@ -181,7 +181,9 @@ Generate analysis time slides file. Generate/copy template bank and then split this into sub-banks -------------------------------------------------------------- -The next step is to acquire a template bank that will be used to filter the data. The BNS Makefile produces its own BNS template bank containing ~13,500 templates (parameters are shown below) but there are also existing template bank that can be used. If you are using a pre-existing template bank, then much of the next two sections can be ignored/removed. :: +The next step is to acquire a template bank that will be used to filter the data. The BNS Makefile produces its own BNS template bank containing ~13,500 templates (parameters are shown below), but there are also existing template banks that can be used. If you are using a pre-existing template bank, then much of the next two sections can be ignored/removed, though some parameters are still used. + +**Note:** lalapps_tmpltbank is deprecated code and should not be used for actual analyses. It is used here because it is faster to run than more modern codes such as `lalapps_cbc_sbank <https://lscsoft.docs.ligo.org/lalsuite/lalapps/namespacelalapps__cbc__sbank.html>`_.
:: ############################ # Template bank parameters # @@ -220,63 +222,63 @@ The next step is to acquire a template bank that will be used to filter the data Template bank parameters. The bank is then produced with this command:: lalapps_tmpltbank \ - --disable-compute-moments \ - --grid-spacing Hexagonal \ - --dynamic-range-exponent 69.0 \ - --enable-high-pass $(HIGH_PASS_FREQ) \ - --high-pass-order 8 \ - --strain-high-pass-order 8 \ - --minimum-mass $(MIN_MASS) \ - --maximum-mass $(MAX_MASS) \ - --min-total-mass $(MIN_TOTAL_MASS) \ - --max-total-mass $(MAX_TOTAL_MASS) \ - --max-eta $(MAX_ETA) \ - --min-eta $(MIN_ETA) \ - --gps-start-time $(BANKSTART) \ - --gps-end-time $(BANKSTOP) \ - --calibrated-data real_8 \ - --channel-name H1:$(H1_CHANNEL) \ - --space Tau0Tau3 \ - --number-of-segments 15 \ - --minimal-match $(MM) \ - --high-pass-attenuation 0.1 \ - --min-high-freq-cutoff ERD \ - --segment-length 1048576 \ - --low-frequency-cutoff $(LOW_FREQUENCY_CUTOFF) \ - --pad-data 8 \ - --num-freq-cutoffs 1 \ - --sample-rate $(SAMPLE_RATE) \ - --high-frequency-cutoff $(HIGH_FREQUENCY_CUTOFF) \ - --resample-filter ldas \ - --strain-high-pass-atten 0.1 \ - --strain-high-pass-freq $(HIGH_PASS_FREQ) \ - --frame-cache H1_frame.cache \ - --max-high-freq-cutoff ERD \ - --approximant $(APPROXIMANT) \ - --order twoPN \ - --spectrum-type median \ - --verbose + --disable-compute-moments \ + --grid-spacing Hexagonal \ + --dynamic-range-exponent 69.0 \ + --enable-high-pass $(HIGH_PASS_FREQ) \ + --high-pass-order 8 \ + --strain-high-pass-order 8 \ + --minimum-mass $(MIN_MASS) \ + --maximum-mass $(MAX_MASS) \ + --min-total-mass $(MIN_TOTAL_MASS) \ + --max-total-mass $(MAX_TOTAL_MASS) \ + --max-eta $(MAX_ETA) \ + --min-eta $(MIN_ETA) \ + --gps-start-time $(BANKSTART) \ + --gps-end-time $(BANKSTOP) \ + --calibrated-data real_8 \ + --channel-name H1:$(H1_CHANNEL) \ + --space Tau0Tau3 \ + --number-of-segments 15 \ + --minimal-match $(MM) \ + --high-pass-attenuation 0.1 \ + 
--min-high-freq-cutoff ERD \ + --segment-length 1048576 \ + --low-frequency-cutoff $(LOW_FREQUENCY_CUTOFF) \ + --pad-data 8 \ + --num-freq-cutoffs 1 \ + --sample-rate $(SAMPLE_RATE) \ + --high-frequency-cutoff $(HIGH_FREQUENCY_CUTOFF) \ + --resample-filter ldas \ + --strain-high-pass-atten 0.1 \ + --strain-high-pass-freq $(HIGH_PASS_FREQ) \ + --frame-cache H1_frame.cache \ + --max-high-freq-cutoff ERD \ + --approximant $(APPROXIMANT) \ + --order twoPN \ + --spectrum-type median \ + --verbose ligolw_no_ilwdchar H1-TMPLTBANK-$(START)-2048.xml gstlal_inspiral_add_template_ids H1-TMPLTBANK-$(START)-2048.xml -lalapps_tmpltbank is a rather old program and newer ones exist, such as lalapps_cbc_sbank. Which ever program you use to generate the bank, gstlal_inspiral_add_template_ids needs to be run on it in order to work with the mass model used in the main analysis. :: +After obtaining a bank, gstlal_inspiral_add_template_ids needs to be run on it so that it works with the mass model used in the main analysis. 
:: mkdir -p $*_split_bank gstlal_bank_splitter \ - --f-low $(LOW_FREQUENCY_CUTOFF) \ - --group-by-chi $(NUM_CHI_BINS) \ - --output-path $*_split_bank \ - --approximant $(APPROXIMANT1) \ - --approximant $(APPROXIMANT2) \ - --output-cache $@ \ - --overlap $(OVERLAP) \ - --instrument $* \ - --n $(NUM_SPLIT_TEMPLATES) \ - --sort-by mchirp \ - --max-f-final $(HIGH_FREQUENCY_CUTOFF) \ - --write-svd-caches \ - --num-banks $(NUMBANKS) \ - H1-TMPLTBANK-$(START)-2048.xml + --f-low $(LOW_FREQUENCY_CUTOFF) \ + --group-by-chi $(NUM_CHI_BINS) \ + --output-path $*_split_bank \ + --approximant $(APPROXIMANT1) \ + --approximant $(APPROXIMANT2) \ + --output-cache $@ \ + --overlap $(OVERLAP) \ + --instrument $* \ + --n $(NUM_SPLIT_TEMPLATES) \ + --sort-by mchirp \ + --max-f-final $(HIGH_FREQUENCY_CUTOFF) \ + --write-svd-caches \ + --num-banks $(NUMBANKS) \ + H1-TMPLTBANK-$(START)-2048.xml This program needs to be run on the template bank being used to split it up into sub banks that will be passed to the singular value decompositon code within the pipeline. @@ -286,35 +288,35 @@ Run gstlal_inspiral_pipe to produce offline analysis dag The final stage of the Makefile that produces the analysis dag. 
:: gstlal_inspiral_pipe \ - --data-source frames \ - --gps-start-time $(START) \ - --gps-end-time $(STOP) \ - --frame-cache frame.cache \ - --frame-segments-file segments.xml.gz \ - --vetoes vetoes.xml.gz \ - --frame-segments-name datasegments \ - --control-peak-time $(PEAK) \ - --template-bank H1-TMPLTBANK-$(START)-2048.xml \ - --num-banks $(NUMBANKS) \ - --fir-stride 1 \ - --web-dir $(WEBDIR) \ - --time-slide-file tisi.xml \ - --inj-time-slide-file inj_tisi.xml \ - $(INJECTION_LIST) \ - --bank-cache $(BANK_CACHE_STRING) \ - --tolerance 0.9999 \ - --overlap $(OVERLAP) \ - --flow $(LOW_FREQUENCY_CUTOFF) \ - $(CHANNEL_NAMES) \ - --autocorrelation-length $(AC_LENGTH) \ - $(ADDITIONAL_DAG_OPTIONS) \ - $(CONDOR_COMMANDS) \ - --ht-gate-threshold-linear 0.8:15.0-45.0:100.0 \ - --request-cpu 2 \ - --request-memory 5GB \ - --min-instruments $(MIN_IFOS) \ - --ranking-stat-samples 4194304 \ - --mass-model=ligo + --data-source frames \ + --gps-start-time $(START) \ + --gps-end-time $(STOP) \ + --frame-cache frame.cache \ + --frame-segments-file segments.xml.gz \ + --vetoes vetoes.xml.gz \ + --frame-segments-name datasegments \ + --control-peak-time $(PEAK) \ + --template-bank H1-TMPLTBANK-$(START)-2048.xml \ + --num-banks $(NUMBANKS) \ + --fir-stride 1 \ + --web-dir $(WEBDIR) \ + --time-slide-file tisi.xml \ + --inj-time-slide-file inj_tisi.xml \ + $(INJECTION_LIST) \ + --bank-cache $(BANK_CACHE_STRING) \ + --tolerance 0.9999 \ + --overlap $(OVERLAP) \ + --flow $(LOW_FREQUENCY_CUTOFF) \ + $(CHANNEL_NAMES) \ + --autocorrelation-length $(AC_LENGTH) \ + $(ADDITIONAL_DAG_OPTIONS) \ + $(CONDOR_COMMANDS) \ + --ht-gate-threshold-linear 0.8:15.0-45.0:100.0 \ + --request-cpu 2 \ + --request-memory 5GB \ + --min-instruments $(MIN_IFOS) \ + --ranking-stat-samples 4194304 \ + --mass-model=ligo sed -i '1s/^/JOBSTATE_LOG logs\/trigger_pipe.jobstate.log\n/' trigger_pipe.dag Additional commands and submitting the dag