The relative binning (heterodyned) likelihood ([Zackay _et al._](https://arxiv.org/abs/1806.08792)) offers a method to accelerate the likelihood for arbitrary frequency-domain models.
While it is more widely applicable than ROQ bases, more care must be taken with tuning.
For this review, we have focused on demonstrating two things:
- when a good fiducial point is provided, the results obtained with this approximation are high fidelity.
- when a bad fiducial point is used, or the approximation otherwise fails, we can identify this in a programmatic way.
To establish this, we importance sample the results obtained with relative binning using the regular likelihood.
If the mismatches, defined as the log of the absolute difference between the (natural) log-likelihoods obtained with the two methods, are small, the approximation is good.
By rejection sampling using the weights (the ratios of the true to the approximate likelihood), we can find the fraction of samples that survive.
If the rejection sampling efficiency is small, then we can say that the approximation failed and we should repeat with a more robust method.
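
The check described above can be sketched as follows. This is a minimal illustration, not the code used for this review: the function names `log_mismatch` and `rejection_sample` are hypothetical, the base-10 outer logarithm is an assumption (the text only says "log"), and the synthetic log-likelihood arrays stand in for values evaluated on real posterior samples.

```python
import numpy as np


def log_mismatch(ln_l_true, ln_l_approx, floor=1e-300):
    """Per-sample log10 of |ln L_true - ln L_approx|.

    The base-10 logarithm is an assumption; `floor` avoids log10(0)
    when the two likelihoods agree exactly.
    """
    diff = np.abs(np.asarray(ln_l_true) - np.asarray(ln_l_approx))
    return np.log10(np.maximum(diff, floor))


def rejection_sample(ln_l_true, ln_l_approx, rng):
    """Rejection sample with weights L_true / L_approx.

    Returns a boolean mask of surviving samples and the surviving fraction.
    """
    ln_weights = np.asarray(ln_l_true) - np.asarray(ln_l_approx)
    # Scale the weights so the largest acceptance probability is 1.
    accept_prob = np.exp(ln_weights - np.max(ln_weights))
    keep = rng.uniform(size=accept_prob.size) < accept_prob
    return keep, keep.mean()


# Synthetic stand-in for a well-behaved run: the approximate
# log-likelihood differs from the true one by a small scatter.
rng = np.random.default_rng(0)
ln_l_approx = rng.normal(100.0, 2.0, size=5000)
ln_l_true = ln_l_approx + rng.normal(0.0, 0.01, size=5000)

mismatch = log_mismatch(ln_l_true, ln_l_approx)
keep, efficiency = rejection_sample(ln_l_true, ln_l_approx, rng)
print(f"median log10 mismatch: {np.median(mismatch):.2f}")
print(f"rejection sampling efficiency: {efficiency:.3f}")
```

A single badly underestimated sample dominates the maximum weight and drives the efficiency toward zero, which is what makes the surviving fraction a useful programmatic failure flag.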
The fiducial BNS injection has been analyzed with the relative binning likelihood.
In all cases where a suitable starting point was provided, we see good agreement with the ROQ-likelihood runs and good resampling efficiency.
Here is the distribution of likelihood mismatches for two identical analyses of the fiducial BNS signal with a precessing spin prior with magnitudes up to 0.4 and tidal deformability up to 5000.
The legend entries show the fraction of samples surviving rejection sampling.
While we did not perform large-scale testing in this regime, these findings are consistent with the tests with the NSBH waveform model: we can get good results in a representative use case and identify bad results.