implement preliminary chisq signal model derived from templates' auto-correlation function
Overview
This MR introduces a new SNR-chisq signal model derived from templates' auto-correlation function.
While Andre is still working on his data-driven approach, we have implemented the preliminary version.
The whole development consists of the following features:
- `svd_bank.py`: compute lambda-eta related quantities for each template and store them in the `Gamma2-4` columns.
- `gstlal_inspiral_create_prior_diststat`: take these four quantities from sngl_inspiral_table in a given svd bank file and pass them to the new `add_signal_model_analytic` to construct an ifo-dependent signal model (note that the auto-correlation depends on the ifo through its PSD, so the resultant signal model technically varies over ifos).
- `inspiral_lr.py`: accommodate the ifo-dependent signal model. Also, since the new signal model is supposed to be more accurate without KDE, the KDE smoothing is disabled in `finish()` only for the `signaldensity` class.
- `inspiral_extrinsics.py`: add `add_signal_model_analytic()` to the `NumeratorSNRCHIPDF` class. It constructs a new signal model for the given lambda-eta related quantities and marginalizes over the given mismatch range.
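To illustrate what "marginalize over a given mismatch range" means numerically, here is a minimal sketch. It is not the code in this MR: `conditional_pdf` is a hypothetical placeholder density (a Gaussian in chisq/SNR^2 whose centre shifts with mismatch), the uniform weighting in mismatch is an assumption, and `marginalize_over_mismatch` is an illustrative helper name.

```python
import math


def conditional_pdf(x, snr, mismatch):
    """Hypothetical stand-in for p(x | snr, mismatch), with x = chisq / snr^2.

    A Gaussian centred on the mismatch with width 1/snr. This is only a
    placeholder shape, NOT the auto-correlation-based model of this MR.
    """
    sigma = 1.0 / snr
    norm = sigma * math.sqrt(2.0 * math.pi)
    return math.exp(-0.5 * ((x - mismatch) / sigma) ** 2) / norm


def marginalize_over_mismatch(pdf, x, snr, m_lo, m_hi, n=200):
    """Average pdf(x, snr, m) over m in [m_lo, m_hi] with uniform weight.

    Uses the composite trapezoid rule with n equal steps (assumed weighting;
    a different prior on mismatch would change the weights).
    """
    if not m_hi > m_lo:
        raise ValueError("m_hi must be larger than m_lo")
    if n < 1:
        raise ValueError("n must be at least 1")
    dm = (m_hi - m_lo) / n
    total = 0.5 * (pdf(x, snr, m_lo) + pdf(x, snr, m_hi))
    for i in range(1, n):
        total += pdf(x, snr, m_lo + i * dm)
    # integral divided by the range = uniform-weight average
    return total * dm / (m_hi - m_lo)


# Example: marginalize the placeholder density over a 0.1% - 30% mismatch range.
p = marginalize_over_mismatch(conditional_pdf, x=0.05, snr=10.0, m_lo=0.001, m_hi=0.3)
```

Widening the mismatch range spreads the marginalized density over a larger span of chisq/SNR^2 in this toy, which is the qualitative reason the result depends on the range that is passed in.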
things to discuss
- Since this new signal model is not supposed to be KDE-ed, I disabled the KDE smoothing when calling `finish()` for the numerator PDF (see here). As expected, this created a region in SNR-chisq space where lnP cannot be evaluated, e.g. the white (empty) region in the histogram plots pasted below. This caused the program to crash when sampling such a region, as it would have returned a `NaN` value. I avoided this by adding a fast-cut here so that `rankingstat.numerator(**kwargs)` returns `NegInf` for samples falling in that region. Since those samples don't really contribute to our background model anyway, this fast-cut shouldn't have much impact on FAR estimates, etc. But I would like to make sure that this is the right thing to do, or whether there is a smarter way to do it, e.g. adding a negligible yet non-zero prior over the entire parameter space.
- Or maybe I should just have expanded this limited region so that the boundaries are relaxed in chisq and SNR = `1e10`?
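For discussion, the two options above (the fast-cut versus a negligible non-zero floor) can be sketched as follows. This is a standalone illustration, not the code in this MR: `toy_lookup`, `fast_cut_ln_numerator`, `floored_ln_numerator` and the `ln_floor` value are all hypothetical, and the floor variant does not renormalize the PDF.

```python
import math

NEG_INF = float("-inf")


def toy_lookup(snr, chisq_over_snr2):
    """Hypothetical un-smoothed ln PDF: defined in a band, NaN elsewhere."""
    if 4.0 <= snr <= 100.0 and 1e-3 <= chisq_over_snr2 <= 1.0:
        return -0.5 * math.log(chisq_over_snr2) ** 2  # arbitrary shape
    return float("nan")


def fast_cut_ln_numerator(ln_pdf_lookup, snr, chisq_over_snr2):
    """Option 1: return -inf wherever the ln PDF cannot be evaluated."""
    lnp = ln_pdf_lookup(snr, chisq_over_snr2)
    if math.isnan(lnp):
        return NEG_INF
    return lnp


def floored_ln_numerator(ln_pdf_lookup, snr, chisq_over_snr2, ln_floor=-700.0):
    """Option 2: add a tiny constant density, i.e. return ln(P + exp(ln_floor)).

    ln_floor is an assumed, arbitrary value; the result is not renormalized.
    """
    lnp = fast_cut_ln_numerator(ln_pdf_lookup, snr, chisq_over_snr2)
    if lnp == NEG_INF:
        return ln_floor
    hi = max(lnp, ln_floor)
    # log-sum-exp of the two terms, stable for very negative values
    return hi + math.log(math.exp(lnp - hi) + math.exp(ln_floor - hi))


inside = fast_cut_ln_numerator(toy_lookup, 8.0, 0.1)    # finite
outside = fast_cut_ln_numerator(toy_lookup, 8.0, 10.0)  # -inf
floored = floored_ln_numerator(toy_lookup, 8.0, 10.0)   # ln_floor
```

With the fast-cut, samples in the empty region are rejected outright; with the floor, they stay finite but carry a negligible weight, so sampling never sees `NaN` or `-inf`.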
Test results
I tested this new signal model with a rerank dag using Cort's offline dag with the manifold bank + mu sorting.
Cort's offline open box results
New offline open box results
VT-FAR plot
Cort's original run
New run
ratio
VT-total SNR plot
Cort's original run
new run
ratio
The VT-FAR plot suggests a consistent improvement of 10-20% for BNS and lighter BBHs, and the VT-SNR plot seems to imply that the improvement mostly comes from SNR ~ 8.
In these results, the ifo-dependent horizon factor was also implemented.
injection recovery table
The injection recovery table suggests that the run with the new signal model found more injections than the old run in most categories.
old
new
$\xi^2$ histogram plots (H1 + no KDE)
SNR-BNS bin
NSBH/BBH bin
BBH bin
IMBH bin
In general, the new signal model shows a larger width at SNR ~ 10 but a narrower width at SNR ~ 100 throughout these bins.
Also note that the width depends on the mismatch range one gives to the `gstlal_inspiral_create_prior_diststat` program. For the above plots, a mismatch range of 0.1 - 30% was given.
analytic VT comparison
As suggested by Kipp, here is the comparison between the analytic and measured VTs. The new signal model seems to improve the accuracy of the analytic VT compared to that from Cort's run for the BNS and lightest BBH categories, which is consistent with the fact that the new signal model improved VT for these two categories. Note that both the analytic and measured VTs vary between the two runs, as the injection database and diststat PDF are different.
$M_\mathrm{c}\in[0.50 - 2.00]M_\odot$
new signal model
old signal model
$M_\mathrm{c}\in[2.00 - 4.50]M_\odot$
new signal model
old signal model
$M_\mathrm{c}\in[4.50 - 45.0]M_\odot$
new signal model
old signal model
$M_\mathrm{c}\in[45.0 - 450.0]M_\odot$
new signal model
old signal model
reference
summary notes
detailed note by Aaron