Review tests so far have used the phase-marginalised likelihood to speed up sampling. For waveforms that include higher-order modes or precession, phase marginalisation is not valid, and therefore the likelihood without phase marginalisation also needs to be reviewed.
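As a rough illustration of why the analytic marginalisation applies to 22-mode-only, non-precessing signals, the sketch below compares the closed-form Bessel-function result with a brute-force average over phase. It is a toy model, not bilby's implementation: the complex inner product `kappa`, the optimal SNR squared `rho_opt_sq`, and the sign convention of the `exp(2j * phase)` factor are all assumptions made for this example. When additional modes or precession are present, the phase dependence is no longer a single `exp(2j * phase)` factor, so the closed form no longer applies.

```python
import numpy as np


def log_l_fixed_phase(kappa, rho_opt_sq, phase):
    """Log likelihood ratio at fixed phase for a toy 22-mode-only signal.

    kappa is the (assumed) complex inner product <d|h> at zero phase and
    rho_opt_sq is the (assumed) optimal SNR squared <h|h>.
    """
    return np.real(kappa * np.exp(2j * phase)) - 0.5 * rho_opt_sq


def log_l_marg_analytic(kappa, rho_opt_sq):
    """Closed-form phase marginalisation: ln I0(|kappa|) - rho_opt_sq / 2."""
    return np.log(np.i0(np.abs(kappa))) - 0.5 * rho_opt_sq


def log_l_marg_numeric(kappa, rho_opt_sq, n_phase=4096):
    """Brute-force average of the likelihood over a uniform phase prior."""
    phases = np.linspace(0.0, 2.0 * np.pi, n_phase, endpoint=False)
    log_l = log_l_fixed_phase(kappa, rho_opt_sq, phases)
    peak = log_l.max()  # subtract the peak for numerical stability
    return peak + np.log(np.mean(np.exp(log_l - peak)))


# Illustrative values only, not taken from any of the runs on this page.
kappa = 30.0 * np.exp(1.3j)
rho_opt_sq = 64.0

analytic = log_l_marg_analytic(kappa, rho_opt_sq)
numeric = log_l_marg_numeric(kappa, rho_opt_sq)
print(analytic, numeric)
```

The two values should agree to numerical precision for this toy signal, which is the property the phase-marginalised likelihood relies on.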
Possible review items:
- comparison of bilby posteriors for a standard (22-mode, non-precessing) simulated event, using exactly the same priors, with one run using phase marginalisation and one without.
- comparison of bilby posteriors for GW150914, using exactly the same priors, with one run using phase marginalisation and one without.
- P-P plots for a waveform model (IMRPhenomXHM?) including (some) higher order modes without using phase marginalised likelihoods. If speed is an issue, maybe hold unimportant parameters (sky positions?) fixed.
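One possible way to make the posterior comparisons in the items above quantitative is to compute a per-parameter Jensen-Shannon (JS) divergence between the two sets of samples. The sketch below is an assumption about how such a comparison could be done, not a method stated on this page; the histogram binning, the sample sizes and the Gaussian toy samples are all hypothetical stand-ins for real posterior samples.

```python
import numpy as np


def js_divergence(samples_a, samples_b, bins=50):
    """Histogram-based Jensen-Shannon divergence in bits (0 = identical)."""
    lo = min(samples_a.min(), samples_b.min())
    hi = max(samples_a.max(), samples_b.max())
    edges = np.linspace(lo, hi, bins + 1)  # shared bin edges for both sets
    p, _ = np.histogram(samples_a, bins=edges)
    q, _ = np.histogram(samples_b, bins=edges)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # empty bins contribute zero to the sum
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Toy stand-ins for one parameter's posterior samples from two runs.
rng = np.random.default_rng(42)
run_marginalised = rng.normal(0.0, 1.0, 5000)
run_unmarginalised = rng.normal(0.0, 1.0, 5000)  # same distribution
run_biased = rng.normal(2.0, 1.0, 5000)          # deliberately shifted

js_same = js_divergence(run_marginalised, run_unmarginalised)
js_shifted = js_divergence(run_marginalised, run_biased)
print(js_same, js_shifted)
```

Two runs sampling the same posterior should give a JS divergence close to zero, limited by the finite number of samples, while a shifted posterior gives a clearly larger value. Any acceptance threshold would need to be agreed as part of the review.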
IMRPhenomPv2 TD vs TDP comparison for GW150914 using parallel bilby
Sampling times using 320 cores (20x16) on ozstar (sstar): ~2 hr for nact=50 and ~1 hr for nact=20, with the TD marginalization.
Link to MR implementing improvement which resolves shredding issue without increasing the runtime too substantially
PP tests using IMRPhenomPv2 using serial bilby
Results using bilby0.6.4-4436ec59-CLEAN_bilby_pipe0.3.9-82325e0-CLEAN
Link to an err page demonstrating phase marginalization is turned off (note that the phase-prior is not a Delta function as it would be when sampling with an analytically-phase-marginalized likelihood)
As of 14/02/2020, the set of injections is not complete. However, the nact=1, n-parallel=1 case (top row of the table below) has completed 98/100 injections. The two failures were in the generation step, and the two injections (including data98) have low SNR (network SNR < 5). As such, they are highly unlikely to change the overall PP test. Once the DAG is completed, these jobs will be resubmitted in order to finish the test.
17/02/2020: Removed unnecessary nact=5 runs from the table below (still available on CIT here).
PP tests and meta data
Corner plots for high-SNR events:
The corner plots below show that, for the highest-SNR event in the PP test above (data70, network SNR = 20.8), the posteriors are not clean. This is believed to be due to the "shredding" seen for high-SNR events.
|Intrinsic, data70, network SNR=20.8|
|Extrinsic, data70, network SNR=20.8|
|Spins, data70, network SNR=20.8|
Conclusion: PP tests using IMRPhenomPv2 using serial bilby
The PP tests indicate that, for the range of SNRs tested, the posteriors are not biased at the level the test can resolve. However, corner plots of high-SNR events show that the posteriors are not "clean": they exhibit signs of shredding, a known issue.
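For readers unfamiliar with what a PP test checks, the following toy sketch reproduces the logic on a one-dimensional Gaussian problem where the posterior is known exactly. It does not use bilby or the injections above; the prior width, noise level and sample counts are hypothetical. For each simulated injection it records the fraction of posterior samples below the true value; if the inference is unbiased, these credible levels are uniformly distributed, which is summarised here by a Kolmogorov-Smirnov statistic.

```python
import numpy as np

rng = np.random.default_rng(1234)
n_injections, n_samples = 100, 2000
prior_sigma, noise_sigma = 1.0, 0.5  # hypothetical toy values

credible_levels = np.empty(n_injections)
for i in range(n_injections):
    # Draw the true value from the prior and simulate one noisy datum.
    truth = rng.normal(0.0, prior_sigma)
    datum = truth + rng.normal(0.0, noise_sigma)

    # Exact conjugate Gaussian posterior for a zero-mean Gaussian prior.
    post_var = 1.0 / (1.0 / prior_sigma**2 + 1.0 / noise_sigma**2)
    post_mean = post_var * datum / noise_sigma**2
    samples = rng.normal(post_mean, np.sqrt(post_var), n_samples)

    # Credible level at which the truth is recovered.
    credible_levels[i] = np.mean(samples < truth)

# KS statistic against the uniform distribution (the diagonal of a PP plot).
levels = np.sort(credible_levels)
ecdf_hi = np.arange(1, n_injections + 1) / n_injections
ecdf_lo = np.arange(0, n_injections) / n_injections
ks_stat = max(np.max(ecdf_hi - levels), np.max(levels - ecdf_lo))
print(ks_stat)
```

With 100 injections the test can only resolve deviations from the diagonal of roughly 0.1 in cumulative fraction, which is why a passing PP test does not rule out smaller problems such as the shredding visible in individual corner plots.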