
Use measured foreground rate in far calculation

Problem Summary

In calculating FAR, we assume that the false alarm probability was calculated with 100 backgrounds for every foreground. However, while we do initially generate that many backgrounds, we then filter both foregrounds and backgrounds based on cohsnr & snglsnr, with additional changes planned in signal removal (see !148 (merged)).

This change should correct that assumption.

I've added Patrick for code review, and Manoj & Sunil to confirm the science side before any merge.

Problem detail

In cohfar_assignfar.c, when calculating FAR, we do the following:

false_alarm_rate = false_alarm_probability * num_background_triggers / (livetime * backgrounds_per_foreground)
In code, that's: gen_fap_from_feature(snr, chisq, stats) * stats->nevent / (stats->livetime * hist_trials)

hist_trials is the number of backgrounds we generate per foreground.
i.e. for each single-IFO peak with SNR > 4, we run the coherent search between IFOs with 100 different time offsets to generate additional background triggers (each time offset is larger than the light travel time between detectors, guaranteeing no real signal in the result)
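
As a minimal sketch of the formula above: the struct below is a hypothetical, cut-down stand-in for the real stats structure (only nevent and livetime are modelled), and fap stands in for whatever gen_fap_from_feature(snr, chisq, stats) returns. It is not the code in cohfar_assignfar.c, just the same arithmetic in isolation.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical stand-in for the background stats structure; only the
 * two fields that appear in the formula are modelled here. */
typedef struct {
    long nevent;     /* number of background triggers collected */
    double livetime; /* seconds of data the stats cover */
} SketchStats;

/* FAR under the current assumption that exactly hist_trials backgrounds
 * exist for every foreground. */
static double far_assumed_trials(double fap, const SketchStats *bg,
                                 int hist_trials)
{
    assert(bg->livetime > 0.0 && hist_trials > 0);
    return fap * (double)bg->nevent / (bg->livetime * (double)hist_trials);
}
```

The result is only a true per-second rate if bg->nevent / hist_trials really equals the number of foreground triggers, which is the assumption this merge request questions.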

However, we filter both foregrounds and backgrounds back in postcoh, currently with the condition
sqrt(pklist->cohsnr_bg[background_cur]) > 1.414 + pklist->snglsnr[write_ifo][peak_cur] for backgrounds, and
sqrt(pklist->cohsnr[peak_cur]) > 1.414 + pklist->snglsnr[ifo_id][peak_cur] for foregrounds

That skews the number of backgrounds per foreground. In most of my tests, it's around 50-60 rather than 100 backgrounds per foreground.

Solution

Load up the zerolag (foreground) stats in cohfar_assignfar, and use the actual foreground rate: zl_stats->nevent / zl_stats->livetime (livetime is the same for zl & bg, and hist_trials is effectively bg_stats->nevent / zl_stats->nevent)
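
A minimal sketch of the proposed calculation, reading the description above as FAR = FAP * foreground rate. The function names and flat arguments are hypothetical; the actual change reads these values from zl_stats and bg_stats.

```c
#include <assert.h>
#include <math.h>

/* FAR using the measured foreground (zerolag) rate instead of
 * bg nevent / hist_trials. */
static double far_measured_rate(double fap, long zl_nevent, double livetime)
{
    assert(livetime > 0.0);
    return fap * (double)zl_nevent / livetime;
}

/* Ratio of the new FAR to the old one for the same FAP and livetime:
 *   (fap * zl / T) / (fap * bg / (T * hist_trials))
 *     = hist_trials * zl / bg
 * i.e. hist_trials divided by the effective backgrounds per foreground. */
static double far_correction_factor(int hist_trials, long zl_nevent,
                                    long bg_nevent)
{
    assert(bg_nevent > 0);
    return (double)hist_trials * (double)zl_nevent / (double)bg_nevent;
}
```

With purely illustrative counts (hist_trials = 100 and 55 surviving backgrounds per foreground, inside the 50-60 range reported above), the factor is 100/55, roughly 1.8, so the same FAP maps to a higher FAR than before.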

I don't like that this requires an additional file read (the new column is from the same file as the bg_stats), especially with #92 unresolved, but I think reworking trigger_stats_xml_from_xml is its own task, and shouldn't be a prereq for this change.

I've been inconsistent with zlstats vs zl_stats, remind me to fix that.

Tests

I've run an 8000-second injection test, and we get slightly worse FARs as a result: [image]

Seeing as this affects science results, it should be tested more thoroughly, on O3 data and with pretty plots showing how our FARs are closer to the theoretical line. Or something like that.

But this should make our FARs more consistent overall, and help with planned changes in signal removal.

Edited by Timothy Davies
