Fix for pipeline hanging when run offline in high latency
Back in December, @avanivikrambhai.patel found that data flow would spontaneously stop in the gstlal calibration pipeline when run offline in high latency. She reported that this occurred with the config option FilterLatency: 1.0, but that if this is set to 0.0, as it is in the online (C00, GDS) pipeline, the problem disappears.
After some investigation, I traced the problem to the gating of the line subtraction with information derived from the existence of the h(t) data. When run online, occasional data dropouts can cause h(t) to be filled in with zeros for brief periods of time. Since we don't want this nonsensical data (or, rather, lack of data) to inform the estimates of the transfer functions (TFs) computed between witness channels and h(t), we use a gate to prevent that. When run offline in high latency (with FilterLatency: 1.0), the queues that precede the gate can sometimes hold on to data unnecessarily, refusing to release it at end of stream. This appears to be due to a race condition, as any single gate does not stall consistently. However, because we subtract a large number of lines, there are many instances of this gate, so the pipeline as a whole fails every time.
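To illustrate what the gate is for, here is a minimal toy sketch in plain Python. It is not the gstlal implementation: the function name is hypothetical, and it assumes a dropout is simply a sample where h(t) is exactly zero.

```python
def samples_for_tf_estimate(witness, hoft):
    """Return (witness, h(t)) pairs with h(t) dropouts gated out.

    Toy assumption: a dropout is a sample where h(t) is exactly 0.0.
    The real pipeline gates streaming buffers, not Python lists.
    """
    return [(w, h) for w, h in zip(witness, hoft) if h != 0.0]


witness = [0.5, 0.4, 0.3, 0.2]
hoft = [1.0e-21, 0.0, 0.0, 2.0e-21]  # middle two samples are a dropout
kept = samples_for_tf_estimate(witness, hoft)
```

Only the first and last pairs survive, so the zero-filled stretch cannot bias a TF estimate computed from the kept samples.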
After trying many other methods, I found that the best solution (the only solution that worked) was to simply remove that gate whenever the config option FilterLatency is nonzero. This will not affect the online pipeline, since FilterLatency is zero when run online. When run offline, the chance of h(t) being absent is much smaller. Moreover, there are three other gates applied to the TFs that should cover every case needed.
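The logic of the fix can be sketched as follows. This is a schematic in plain Python, not the code changed in this MR: the function and stage names are hypothetical placeholders, and only the condition on FilterLatency comes from the description above.

```python
def tf_input_stages(filter_latency):
    """List the (hypothetical) stages feeding one TF estimate.

    The h(t)-presence gate is included only when FilterLatency is zero,
    which is the online configuration.
    """
    stages = ["witness_channel", "queue"]
    if filter_latency == 0.0:
        # Online: dropouts can zero-fill h(t), so keep the gate.
        stages.append("hoft_presence_gate")
    # The three remaining gates on the TFs are applied in both cases.
    stages += ["tf_gate_1", "tf_gate_2", "tf_gate_3"]
    return stages


online = tf_input_stages(0.0)
offline_high_latency = tf_input_stages(1.0)
```

With FilterLatency at 0.0 the stage list is unchanged from before, which is why the online pipeline is unaffected.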
Activity
requested review from @avanivikrambhai.patel, @jameson.rollins, and @madeline-wade
assigned to @aaron-viets
 #! /bin/sh
 # depcomp - compile a program generating dependencies as side-effects

-scriptversion=2024-06-19.01; # UTC
+scriptversion=2018-03-07.03; # UTC

-# Copyright (C) 1999-2024 Free Software Foundation, Inc.
+# Copyright (C) 1999-2018 Free Software Foundation, Inc.

These changes are made automatically. I am not aware of what effect they have on gstlal-calibration, if any. I pushed the changes in this MR from a container that optimizes gstlal, so that container may have had older software installed. In any case, the pipeline ran about 2.5 times faster in the container than it did before I used the container.
changed this line in version 2 of the diff
reset approvals from @jameson.rollins by pushing to the branch
mentioned in commit cf03a9af