Changes

Colm Talbot · 7f4dc2b5
--- a/O4-review/relative-binning.md
+++ b/O4-review/relative-binning.md
-## Likelihood differences
+The relative binning (heterodyned) likelihood ([Zackay _et al._](https://arxiv.org/abs/1806.08792)) offers a method to accelerate the likelihood for arbitrary frequency-domain models.
+While it is more widely applicable than ROQ bases, more care must be taken with tuning.
-Compute likelihood difference for
+For this review, we have focused on demonstrating two things:
+- when a good fiducial point is provided, the results obtained with this approximation are high fidelity.
+- when a bad fiducial point is used, or the approximation otherwise fails, we can identify this in a programmatic way.
+To establish this, we importance sample the results obtained with relative binning using the regular likelihood.
+If the mismatches, defined as the log of the absolute difference between the (natural) log-likelihoods obtained with the two methods, are small, the approximation is good.
+By rejection sampling using the weights (the ratios of the true to the approximate likelihood), we can find the fraction of samples that survive.
+If the rejection sampling efficiency is small, then we can say that the approximation failed and we should repeat with a more robust method.
 ## [Unit testing](https://git.ligo.org/lscsoft/bilby/-/blob/master/test/gw/likelihood/relative_binning_test.py)
@@ -42,6 +50,17 @@ For some successful cases, the mean ln likelihood is > 0.1 in the tails, however
 The fiducial BNS injection has been analyzed with the relative binning likelihood.
-In all cases, we see good agreement with the ROQ-likelihood runs and good resampling efficiency.
+In all cases where a suitable starting point was provided, we see good agreement with the ROQ-likelihood runs and good resampling efficiency.
+Here is the distribution of likelihood mismatches for two identical analyses of the fiducial BNS signal with a precessing spin prior with magnitudes up to 0.4 and tidal deformability up to 5000.
+The legend entries show the fraction of samples surviving rejection sampling.
+It is very close to 1.
+![image](uploads/b3dc0d4f87076ec6c34844adc6f645b5/image.png)
+By accident, we performed some runs with fiducial parameters that are a very bad fit to the actual signal.
+In these cases, we found that the rejection sampling efficiency was very small and the mismatches were large.
+![image](uploads/40f2dcd4e4d95c293222f6e36e230092/image.png)
-*TODO*: add figures and numbers here
+While we did not perform large-scale testing in this regime, these findings are consistent with the tests with the NSBH waveform model: we can get good results in a representative use case and identify bad results.
\ No newline at end of file
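
The reweighting check described in the added review text can be sketched as follows. This is a minimal illustration and not the code used for the review: the function name `reweighting_diagnostics` is hypothetical, the inputs are assumed to be arrays of log-likelihoods evaluated at the same posterior samples with the relative binning and the regular likelihood, and the base-10 logarithm for the mismatch is an assumption, since the text does not state the base.

```python
import numpy as np


def reweighting_diagnostics(ln_l_approx, ln_l_full, rng=None):
    """Hypothetical helper: compare approximate and full log-likelihoods.

    Returns the log mismatch per sample, a boolean mask of samples that
    survive rejection sampling, and the rejection sampling efficiency.
    """
    rng = np.random.default_rng() if rng is None else rng
    ln_l_approx = np.asarray(ln_l_approx, dtype=float)
    ln_l_full = np.asarray(ln_l_full, dtype=float)

    # Difference of natural log-likelihoods, i.e. the log importance weight.
    delta = ln_l_full - ln_l_approx

    # Log of the absolute difference. Base 10 is an assumption; the floor
    # avoids taking the log of zero when the two likelihoods agree exactly.
    log_mismatch = np.log10(np.maximum(np.abs(delta), 1e-300))

    # Weights scaled so the largest is 1, then standard rejection sampling:
    # keep each sample with probability equal to its scaled weight.
    weights = np.exp(delta - np.max(delta))
    keep = rng.uniform(size=weights.size) < weights

    # Fraction of samples surviving rejection sampling.
    efficiency = keep.mean()
    return log_mismatch, keep, efficiency
```

With this sketch, an efficiency close to 1 corresponds to the good-fiducial-point case, while an efficiency near zero flags a run that should be repeated with a more robust method. The threshold separating the two is a choice that this example does not fix.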