
BUGFIX: fix random number generation with parallelization

Colm Talbot requested to merge rng-fix into master

This MR fixes a fairly long-standing bug in how we use parallelization; see, e.g., here for a related discussion. Currently, we naively use the legacy `np.random` interface, which has been discouraged for a while in favour of a generator-based approach, because the same seed is used across all spawned parallel processes. This means we get essentially the same stream of random numbers for Bilby operations in parallel threads.
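To illustrate the failure mode, here is a minimal, self-contained sketch (not bilby code; the worker functions are invented for demonstration). When every worker seeds, or inherits, the same legacy global state, all "parallel" streams are identical, whereas `numpy.random.SeedSequence.spawn` derives independent child streams from a single parent seed.

```python
import numpy as np

# Legacy pattern: each worker seeds (or forks with) the same global state,
# so every "parallel" stream produces the same numbers.
def legacy_worker(seed):
    np.random.seed(seed)
    return np.random.uniform(size=3)

same = [legacy_worker(42) for _ in range(3)]
print(all(np.array_equal(same[0], s) for s in same))  # True: identical streams

# Generator pattern: spawn independent, non-overlapping child streams
# from one parent SeedSequence, one per worker.
def generator_worker(seed_seq):
    rng = np.random.default_rng(seed_seq)
    return rng.uniform(size=3)

children = np.random.SeedSequence(42).spawn(3)
streams = [generator_worker(ss) for ss in children]
print(any(np.array_equal(streams[0], s) for s in streams[1:]))  # False
```

The spawned streams remain reproducible from the single parent seed, which is what makes this approach suitable for pool-based parallelism.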

There are quite a lot of small changes here: basically, I changed every instance of `np.random` to use a new generator and added a small extra module to facilitate this.

This does not impact dynesty sampling, as that has its own parallel RNG handling.

This may impact bilby_mcmc, I haven't checked in detail (@gregory.ashton).

This is responsible for the weird behaviour we've seen a lot recently, where the optimal SNR and distance have posterior chains that look like

[image: posterior trace showing bunched samples]

The bunching occurs because the same random number is used in the marginalized-parameter reconstruction for each block of `npool` calls. With the changes here, the same job becomes (the difference in the number of samples is due to a separate inconsistent use of random seeds)

[image: posterior trace after the fix]

This means that for any event where the distance traces look like the first plot, the sky localization will be unreliable, because we have a much smaller effective sample size.
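The loss of effective samples can be reproduced in miniature (a hypothetical sketch; the pool and block sizes are made up): if every block of `npool` reconstructions reuses one random draw, only one distinct value per block survives, while independent per-worker streams keep every sample distinct.

```python
import numpy as np

npool, nblocks = 8, 100

# Buggy pattern: each block of npool reconstructed samples shares
# a single random draw, so samples within a block are duplicates.
bugged = np.concatenate([
    np.full(npool, np.random.default_rng(block).uniform())
    for block in range(nblocks)
])

# Fixed pattern: one independent child stream per reconstruction.
children = np.random.SeedSequence(0).spawn(npool * nblocks)
fixed = np.array([np.random.default_rng(s).uniform() for s in children])

# Distinct values: nblocks (bugged) vs npool * nblocks (fixed).
print(np.unique(bugged).size, np.unique(fixed).size)
```

With the bug, the number of distinct samples collapses by a factor of `npool`, which is exactly the reduced effective sample size seen in the traces above.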

The equivalent feature in the optimal SNR arises from the same effect.
