lalapps_splitSFTs rounding issue leading to 2 output files where user expects 1
First, for context, lalapps_splitSFTs
effectively has two modes (though both implemented in the same loop, just emergently behaving different for the user):
- If
start-frequency + frequency-bandwidth = end-frequency
, then it is expected to produce a single output file (the parameters are over-specified in this case, just bandwidth or end should be enough, but due to the code's ancient user-input handling history and for compatibility with mode 2, both are required) - If
frequency-bandwidth
is smaller, it will produce multiple output files, e.g. to cover a wide band in 50mHz steps as was the original E@H use case.
Then, as reported by @patrick-meyers : When one or more of the three options aren't integer multiples of frequency bin steps, but clearly intended by the user to produce a single file (start+band=end
in user-given Hz units), then the internal rounding sometimes ends up such that band<end-start
in terms of bin indices, and 2 output files instead of 1 are produced, with just 2 "spill-over" bins in the 2nd file.
@karl-wette I provide here a simplified example to reproduce and understand the issue, with some extra debug output hacked into the code. Below are a proposed solution by replacing the locally used rounding function, but further investigations would be required (request for feedback at the end).
- start with some fake data:
lalapps_Makefakedata_v5 --IFOs H1 --sqrtSX 1e-24 --startTime 1257800000 --duration 1800 --Tsft 1800 --fmin 50 --Band 10 --outSingleSFT=no --outSFTdir broadband
- Some nice round numbers that work as expected:
/home/dkeitel/git/lvc/lalsuite/lalapps/src/pulsar/SFTTools/lalapps_splitSFTs --output-directory narrowband --start-frequency 51 --frequency-bandwidth 1 --end-frequency 52 -- broadband/*.sft
2021-04-20 11:38:04.3590 (2576) [normal]: Tsft=1800.000000, rounded frequency parameters: startBin=91800, endBin=93600, width=1800, overlap=0
2021-04-20 11:38:04.3590 (2576) [normal]: bin=91800, last_input_bin=108000, last_output_bin=93599, max_input_width=1800, max_output_width=1801, this_width=1800
2021-04-20 11:38:04.3590 (2576) [normal]: Looking for SFT: det=H1 timebase=1800 firstbin=91800 binwidth=1800
2021-04-20 11:38:04.3590 (2576) [normal]: New/updated SFT: det=H1 timebase=1800 firstbin=91800 binwidth=1800 filename=narrowband/H-1_H1_1800SFT_NB_F0051Hz0_W0001Hz0-1257800000-1800.sft
- Now a non-integer situation, but still produces a single output file:
/home/dkeitel/git/lvc/lalsuite/lalapps/src/pulsar/SFTTools/lalapps_splitSFTs --output-directory narrowband --start-frequency 51 --frequency-bandwidth 1.2212 --end-frequency 52.2212 -- broadband/*.sft
2021-04-20 11:39:08.6457 (2627) [normal]: Tsft=1800.000000, rounded frequency parameters: startBin=91800, endBin=93998, width=2198, overlap=0
2021-04-20 11:39:08.6457 (2627) [normal]: bin=91800, last_input_bin=108000, last_output_bin=93997, max_input_width=2198, max_output_width=2199, this_width=2198
2021-04-20 11:39:08.6457 (2627) [normal]: Looking for SFT: det=H1 timebase=1800 firstbin=91800 binwidth=2198
2021-04-20 11:39:08.6457 (2627) [normal]: New/updated SFT: det=H1 timebase=1800 firstbin=91800 binwidth=2198 filename=narrowband/H-1_H1_1800SFT_NB_F0051Hz0_W0001Hz398-1257800000-1800.sft
- Same
frequency-bandwidth
, but lowerstart-frequency
and correspondingly movedend-frequency
, which still match exactly at the level of input precision, but now rounding producesendBin = startBin+width+1
instead ofendBin = startBin+width
, and the surprise bonus file appears:
/home/dkeitel/git/lvc/lalsuite/lalapps/src/pulsar/SFTTools/lalapps_splitSFTs --output-directory narrowband --start-frequency 50.9508 --frequency-bandwidth 1.2212 --end-frequency 52.172 -- broadband/*.sft
2021-04-20 11:40:31.7101 (2679) [normal]: Tsft=1800.000000, rounded frequency parameters: startBin=91711, endBin=93910, width=2198, overlap=0
2021-04-20 11:40:31.7104 (2679) [normal]: bin=91711, last_input_bin=108000, last_output_bin=93908, max_input_width=2198, max_output_width=2200, this_width=2198
2021-04-20 11:40:31.7105 (2679) [normal]: Looking for SFT: det=H1 timebase=1800 firstbin=91711 binwidth=2198
2021-04-20 11:40:31.7105 (2679) [normal]: New/updated SFT: det=H1 timebase=1800 firstbin=91711 binwidth=2198 filename=narrowband/H-1_H1_1800SFT_NB_F0050Hz1711_W0001Hz398-1257800000-1800.sft
2021-04-20 11:40:31.7108 (2679) [normal]: bin=93909, last_input_bin=108000, last_output_bin=96106, max_input_width=2198, max_output_width=2, this_width=2
2021-04-20 11:40:31.7108 (2679) [normal]: Looking for SFT: det=H1 timebase=1800 firstbin=93909 binwidth=2
2021-04-20 11:40:31.7108 (2679) [normal]: New/updated SFT: det=H1 timebase=1800 firstbin=93909 binwidth=2 filename=narrowband/H-1_H1_1800SFT_NB_F0052Hz309_W0000Hz2-1257800000-1800.sft
ll narrowband/
total 24
-rw-r--r-- 1 dkeitel dkeitel 18824 Apr 20 11:40 H-1_H1_1800SFT_NB_F0050Hz1711_W0001Hz398-1257800000-1800.sft
-rw-r--r-- 1 dkeitel dkeitel 1256 Apr 20 11:40 H-1_H1_1800SFT_NB_F0052Hz309_W0000Hz2-1257800000-1800.sft
Possible solution
If I replace the locally defined MYROUND
macro by a call to XLALRoundFrequencyDownToSFTBin()
which claims to be an official rounding convention
, the first two calls produce unchanged results, but the third one now also rounds internally to endBin = startBin+width
and only produces 1 file as expected:
/home/dkeitel/git/lvc/lalsuite/lalapps/src/pulsar/SFTTools/lalapps_splitSFTs --output-directory narrowband --start-frequency 51 --frequency-bandwidth 1 --end-frequency 52 -- broadband/*.sft
2021-04-20 11:48:05.9393 (3087) [normal]: Tsft=1800.000000, rounded frequency parameters: startBin=91800, endBin=93600, width=1800, overlap=0
2021-04-20 11:48:05.9395 (3087) [normal]: bin=91800, last_input_bin=108000, last_output_bin=93599, max_input_width=1800, max_output_width=1801, this_width=1800
2021-04-20 11:48:05.9396 (3087) [normal]: Looking for SFT: det=H1 timebase=1800 firstbin=91800 binwidth=1800
2021-04-20 11:48:05.9396 (3087) [normal]: New/updated SFT: det=H1 timebase=1800 firstbin=91800 binwidth=1800 filename=narrowband/H-1_H1_1800SFT_NB_F0051Hz0_W0001Hz0-1257800000-1800.sft
/home/dkeitel/git/lvc/lalsuite/lalapps/src/pulsar/SFTTools/lalapps_splitSFTs --output-directory narrowband --start-frequency 51 --frequency-bandwidth 1.2212 --end-frequency 52.2212 -- broadband/*.sft
2021-04-20 11:47:51.5129 (3067) [normal]: Tsft=1800.000000, rounded frequency parameters: startBin=91800, endBin=93998, width=2198, overlap=0
2021-04-20 11:47:51.5130 (3067) [normal]: bin=91800, last_input_bin=108000, last_output_bin=93997, max_input_width=2198, max_output_width=2199, this_width=2198
2021-04-20 11:47:51.5130 (3067) [normal]: Looking for SFT: det=H1 timebase=1800 firstbin=91800 binwidth=2198
2021-04-20 11:47:51.5130 (3067) [normal]: New/updated SFT: det=H1 timebase=1800 firstbin=91800 binwidth=2198 filename=narrowband/H-1_H1_1800SFT_NB_F0051Hz0_W0001Hz398-1257800000-1800.sft
/home/dkeitel/git/lvc/lalsuite/lalapps/src/pulsar/SFTTools/lalapps_splitSFTs --output-directory narrowband --start-frequency 50.9508 --frequency-bandwidth 1.2212 --end-frequency 52.172 -- broadband/*.sft
2021-04-20 11:46:07.5048 (3026) [normal]: Tsft=1800.000000, rounded frequency parameters: startBin=91711, endBin=93909, width=2198, overlap=0
2021-04-20 11:46:07.5050 (3026) [normal]: bin=91711, last_input_bin=108000, last_output_bin=93908, max_input_width=2198, max_output_width=2199, this_width=2198
2021-04-20 11:46:07.5051 (3026) [normal]: Looking for SFT: det=H1 timebase=1800 firstbin=91711 binwidth=2198
2021-04-20 11:46:07.5051 (3026) [normal]: New/updated SFT: det=H1 timebase=1800 firstbin=91711 binwidth=2198 filename=narrowband/H-1_H1_1800SFT_NB_F0050Hz1711_W0001Hz398-1257800000-1800.sft
ll narrowband/
total 20
-rw-r--r-- 1 dkeitel dkeitel 18824 Apr 20 11:46 H-1_H1_1800SFT_NB_F0050Hz1711_W0001Hz398-1257800000-1800.sft
caveats / discussion
@karl-wette what do you think? This seems like a clean solution for the problem, and as the replacement function is supposed to be an "official convention", it seems it should be used here anyway. However, one probably would need to look very carefully for possible side-effects in other situations, and I don't think I have time for that at the moment.