Scheduled tests failing

I'm not sure why this is happening now. It might be caused by my recently merged changes to the test structure, although I haven't made any changes to the syntax of the test itself.

Alternatively, this might have been caused by an update to one of our dependencies. The outcome of the test is random, though we do fix the seed with np.random.seed(8817023). Resetting to a version before the changes to the test suite on my own machine throws the same error, "AssertionError: 0.009432960657697387 not greater than 0.01", so the test-suite changes are probably not the cause.

I suggest changing the seed to 8817020, which seems to pass the test. More generally, it is not desirable to test random behaviour, even with fixed seeds, precisely because of these issues.
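To illustrate the seed sensitivity, here is a rough sketch of the same two-level KS check on purely synthetic data. It assumes a perfect sampler: uniform draws stand in for both the bilby prior samples and the dynesty posterior, and scipy.stats.ks_2samp is called directly in place of ks_2samp_wrapper. The function names and sample sizes are made up for the illustration. Even in this ideal case, the final p-value is itself roughly uniformly distributed, so a small fraction of seeds (on the order of 1%) should fall below the 0.01 threshold by chance alone.

```python
import numpy as np
from scipy.stats import ks_2samp, kstest


def two_level_ks_pvalue(rng, n_params=15, n_samples=1000):
    """Synthetic stand-in for the check in test_fifteen_dimensional_cbc.

    Assumption: "prior" and "posterior" samples come from the same
    distribution, i.e. the sampler is perfect. Sample sizes are illustrative.
    """
    # One two-sample KS p-value per (hypothetical) parameter.
    pvalues = [
        ks_2samp(rng.uniform(size=n_samples), rng.uniform(size=n_samples)).pvalue
        for _ in range(n_params)
    ]
    # Second level: are those p-values consistent with a uniform distribution?
    return kstest(pvalues, "uniform").pvalue


def failure_rate(n_seeds=200, threshold=0.01):
    """Fraction of seeds for which the assertGreater-style check would fail."""
    failures = 0
    for seed in range(n_seeds):
        rng = np.random.default_rng(seed)
        if not two_level_ks_pvalue(rng) > threshold:
            failures += 1
    return failures / n_seeds


if __name__ == "__main__":
    print("fraction of failing seeds:", failure_rate())
```

If this picture is right, a failing seed is not by itself evidence of a bug, and switching to a passing seed only hides the same chance of failure until the random stream changes again.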

=================================== FAILURES ===================================
______________________ Test.test_fifteen_dimensional_cbc _______________________
self = <test.gw.sample_from_the_prior_test.Test testMethod=test_fifteen_dimensional_cbc>
    def test_fifteen_dimensional_cbc(self):
        duration = 4.0
        sampling_frequency = 2048.0
        label = "full_15_parameters"
        np.random.seed(8817023)
    
        waveform_arguments = dict(
            waveform_approximant="IMRPhenomPv2",
            reference_frequency=50.0,
            minimum_frequency=20.0,
        )
        waveform_generator = bilby.gw.WaveformGenerator(
            duration=duration,
            sampling_frequency=sampling_frequency,
            frequency_domain_source_model=bilby.gw.source.lal_binary_black_hole,
            parameter_conversion=bilby.gw.conversion.convert_to_lal_binary_black_hole_parameters,
            waveform_arguments=waveform_arguments,
        )
    
        ifos = bilby.gw.detector.InterferometerList(["H1", "L1"])
        ifos.set_strain_data_from_power_spectral_densities(
            sampling_frequency=sampling_frequency, duration=duration, start_time=0
        )
    
        priors = bilby.gw.prior.BBHPriorDict()
        priors.pop("mass_1")
        priors.pop("mass_2")
        priors["chirp_mass"] = bilby.prior.Uniform(
            name="chirp_mass",
            latex_label="$M$",
            minimum=10.0,
            maximum=100.0,
            unit="$M_{\\odot}$",
        )
        priors["mass_ratio"] = bilby.prior.Uniform(
            name="mass_ratio", latex_label="$q$", minimum=0.5, maximum=1.0
        )
        priors["geocent_time"] = bilby.core.prior.Uniform(minimum=-0.1, maximum=0.1)
    
        likelihood = bilby.gw.GravitationalWaveTransient(
            interferometers=ifos,
            waveform_generator=waveform_generator,
            priors=priors,
            distance_marginalization=False,
            phase_marginalization=False,
            time_marginalization=False,
        )
    
        likelihood = bilby.core.likelihood.ZeroLikelihood(likelihood)
    
        result = bilby.run_sampler(
            likelihood=likelihood,
            priors=priors,
            sampler="dynesty",
            npoints=1000,
            walks=100,
            outdir=self.outdir,
            label=label,
        )
        pvalues = [
            ks_2samp_wrapper(
                result.priors[key].sample(10000), result.posterior[key].values
            ).pvalue
            for key in priors.keys()
        ]
        print("P values per parameter")
        for key, p in zip(priors.keys(), pvalues):
            print(key, p)
>       self.assertGreater(kstest(pvalues, "uniform").pvalue, 0.01)
E       AssertionError: 0.00943296065769672 not greater than 0.01