
Clean

Reed Essick requested to merge clean into master

NOTE: this has become a bit of a dumping ground for a large number of changes. Most of the "big" changes are documented here, but I have abandoned the hope of thoroughly documenting all of the smaller ones.


This merge request cleans up a lot of code that hasn't been thoroughly tested. In part, this is done by removing distributions that we have not tested. The plan is that these may be added back after testing is implemented.

This merge request also implements "everything needed to generate injection sets" and the supporting logic with an executable.

Finally, this request builds unit tests so that individual distributions can be quickly sanity-checked and further testing can be completely automated.


A more detailed summary of changes is as follows:

bin/gwdistributions-test-distributions

Added this executable to allow command-line testing of individual SamplingDistribution objects. Currently, our tests only work for 1D distributions. For such distributions, this executable produces several plots to demonstrate that the functionality is correct.

Associated command lines for all 1D distributions are within test/test-distributions.

etc/example.ini

Renamed example redshift distribution and added a section to demonstrate how to configure arbitrary conditionals within the config. The latter will be useful when specifying "hopeless" injections based on a simple SNR cut.

gwdistributions/backends.py

Added a few more modules that we attempt to import in order to support new functionality.

gwdistributions/generators.py

Added support to count the number of attempted injections along with the number of retained injections. These will automatically be written to disk alongside the injections if the file format supports metadata annotations (currently only HDF does).
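The bookkeeping described above can be pictured with a minimal sketch. The function name `generate`, the `keep` callback, and the metadata keys are hypothetical and are not taken from gwdistributions/generators.py.

```python
import random


def generate(draw, keep, num_retained, rng):
    """Draw samples until num_retained of them pass keep, counting every attempt.

    All names here are illustrative assumptions, not the package's API.
    """
    retained = []
    attempted = 0
    while len(retained) < num_retained:
        sample = draw(rng)
        attempted += 1  # count every draw, whether or not it is kept
        if keep(sample):
            retained.append(sample)
    # these counts are what would be stored alongside the injections
    metadata = {"num_attempted": attempted, "num_retained": len(retained)}
    return retained, metadata


rng = random.Random(0)
samples, meta = generate(
    lambda r: r.uniform(0.0, 1.0),  # toy "injection" draw
    lambda x: x > 0.5,              # toy selection cut
    100,
    rng,
)
```

Keeping both counts allows the retained fraction to be recovered later without regenerating the rejected draws.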

gwdistributions/io.py

Cleaned up some logic around how we write to and read from XML files. Also added support for writing metadata to, and reading it from, HDF files.

gwdistributions/parse.py

Added support to parse arbitrary conditionals out of INI files.
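As a rough illustration of the idea, the sketch below parses simple threshold conditionals with the standard-library configparser. The section name `[conditionals]`, the `variate op threshold` syntax, and the helper `parse_conditional` are all assumptions made for this example. The syntax actually supported by gwdistributions/parse.py may differ and is more general than a single threshold.

```python
import configparser
import operator

# comparison operators supported by this toy grammar
OPERATORS = {">=": operator.ge, "<=": operator.le, ">": operator.gt, "<": operator.lt}


def parse_conditional(text):
    """Parse 'variate op threshold' into a callable mapping a sample dict to a bool."""
    for symbol in (">=", "<=", ">", "<"):  # check two-character operators first
        if symbol in text:
            name, threshold = text.split(symbol, 1)
            compare = OPERATORS[symbol]
            name, threshold = name.strip(), float(threshold)
            return lambda sample: compare(sample[name], threshold)
    raise ValueError("could not parse conditional: %r" % text)


# hypothetical config fragment; not the syntax used in etc/example.ini
EXAMPLE_INI = """
[conditionals]
detectable = snr >= 8.0
"""

config = configparser.ConfigParser()
config.read_string(EXAMPLE_INI)
conditionals = {
    key: parse_conditional(value) for key, value in config["conditionals"].items()
}
```

A conditional of this kind could then be applied to each drawn sample, for example to flag "hopeless" injections that fall below an SNR threshold.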

gwdistributions/distributions/

throughout

Removed jacobian and hessian methods, as they have no immediate use and are hard to test and maintain.

base.py

Changed the way SamplingDistribution delegates the computation of prob and logprob so that both eventually run through static methods. In practice, this just means that logprob also passes self.params when it delegates to _logprob. This will allow us to implement auto-conditioning, in which the parameters of one distribution depend on the variates of another, in a much more computationally efficient way.
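One way to picture this delegation is the minimal sketch below. The class layout, the keyword-dictionary form of `params`, and the `Exponential` example are assumptions for illustration only. The real base class in base.py may store and pass parameters differently.

```python
import math


class SamplingDistribution:
    """Toy base class: instance methods delegate to static methods."""

    def __init__(self, **params):
        self.params = params

    def logprob(self, x):
        # pass the stored parameters explicitly to the static implementation
        return self._logprob(x, **self.params)

    def prob(self, x):
        return math.exp(self.logprob(x))

    @staticmethod
    def _logprob(x, **params):
        raise NotImplementedError


class Exponential(SamplingDistribution):
    @staticmethod
    def _logprob(x, rate):
        if x < 0:
            return -math.inf
        return math.log(rate) - rate * x


dist = Exponential(rate=2.0)
through_instance = dist.logprob(1.0)

# Because _logprob is static, it can be evaluated with parameters supplied
# from elsewhere (e.g. another distribution's variates) without building
# a new object for each parameter value.
through_static = Exponential._logprob(1.0, rate=0.5)
```

The last line hints at why this helps with auto-conditioning: the parameters become ordinary function arguments rather than object state.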

Also implemented a "domain" function so that SamplingDistributions declare the domain over which their variates are defined. This is useful within the testing infrastructure, where we compute PDFs and CDFs on a grid.
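A minimal sketch of how a declared domain might be turned into a grid follows. The return format of `domain` (a dictionary mapping variate names to bounds), the helper `grid_from_domain`, and the fixed fallback width used for infinite bounds are all assumptions. The actual testing code may handle infinite domains differently.

```python
import math


def grid_from_domain(low, high, num=101, fallback=10.0):
    """Return num evenly spaced points, replacing infinite bounds with finite ones.

    The fallback width is an arbitrary choice made for this sketch.
    """
    if math.isinf(low) and math.isinf(high):
        low, high = -fallback, fallback
    elif math.isinf(low):
        low = high - fallback
    elif math.isinf(high):
        high = low + fallback
    step = (high - low) / (num - 1)
    return [low + i * step for i in range(num)]


class Exponential:
    """Toy distribution that declares the domain of its single variate."""

    def __init__(self, rate):
        self.rate = rate

    def domain(self):
        return {"x": (0.0, math.inf)}

    def prob(self, x):
        return self.rate * math.exp(-self.rate * x) if x >= 0 else 0.0


dist = Exponential(rate=1.0)
low, high = dist.domain()["x"]
grid = grid_from_domain(low, high)
pdf = [dist.prob(x) for x in grid]
```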

Also declared a few other "helper" attributes.

the rest of the distributions

The remaining distributions have been stripped down to only those deemed "most necessary" at the current moment. As much as possible (ie, for all 1D distributions), we tested the new implementation to make sure the objects behave as expected.

gwdistributions/test

Implemented some basic testing logic for 1D SamplingDistributions. A similar approach should work for multivariate distributions, although we do not currently have a good statistic with which to quantify the agreement between histograms of drawn samples and predicted PDFs.

For 1D distributions, the tests essentially do the following:

  • draw hyperparameters for the model from hyperpriors. The distribution should be well behaved for all hyperparameters, and this helps us guarantee that (modulo the support of the hyperpriors chosen).
  • for each set of hyperparameters
    • draw a set of samples from the sampling distribution.
    • compute the predicted PDF and CDF on a grid (boundaries defined by calls to dist.domain, with logic to handle infinite domains).
    • compare the samples to the predicted CDF with a two-sided KS test.
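The steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's test code: the exponential model, the uniform hyperprior on `rate`, and the helper names are hypothetical, the CDF is evaluated analytically rather than on a grid, and the p-value uses the standard asymptotic Kolmogorov series rather than whatever routine the tests actually call.

```python
import math
import random


def ks_statistic(samples, cdf):
    """Two-sided KS statistic: largest gap between empirical and model CDFs."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # the empirical CDF jumps from i/n to (i+1)/n at x
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def ks_pvalue(d, n, terms=100):
    """Asymptotic two-sided p-value from the Kolmogorov series."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:  # the p-value is 1 to high precision in this regime
        return 1.0
    total = 0.0
    for j in range(1, terms + 1):
        total += (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
    return min(1.0, max(0.0, 2.0 * total))


rng = random.Random(12345)
pvalues = []
for _ in range(50):
    rate = rng.uniform(0.5, 5.0)  # draw a hyperparameter from a toy hyperprior
    samples = [rng.expovariate(rate) for _ in range(1000)]  # draw samples
    cdf = lambda x, rate=rate: 1.0 - math.exp(-rate * x)  # predicted CDF
    d = ks_statistic(samples, cdf)
    pvalues.append(ks_pvalue(d, len(samples)))
```

If sampling and the predicted CDF agree, the resulting p-values should be roughly uniform on [0, 1]. A mismatch between the two pushes them toward zero.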

Cumulative and differential histograms are computed for each realization of hyperparameters (if requested). At the same time, a p-p plot of the KS p-values is generated, showing whether the observed variation follows the expected scatter.
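The data behind such a p-p plot can be sketched as follows. The helper `pp_curve` is hypothetical, the plotting itself is omitted, and uniform random numbers stand in for real KS p-values.

```python
import random


def pp_curve(pvalues):
    """Return (expected, observed) cumulative fractions for a p-p plot.

    If the p-values are uniform, the fraction of p-values at or below p
    should be close to p, so the curve follows the diagonal.
    """
    expected = sorted(pvalues)
    n = len(expected)
    observed = [(i + 1) / n for i in range(n)]
    return expected, observed


rng = random.Random(1)
pvalues = [rng.random() for _ in range(200)]  # stand-in for KS p-values
expected, observed = pp_curve(pvalues)

# largest vertical departure from the diagonal
deviation = max(abs(o - e) for e, o in zip(expected, observed))
```

Plotting `observed` against `expected` gives the p-p curve. A curve that sags well away from the diagonal indicates that the p-values are not uniformly distributed.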

gwdistributions/transforms

STILL IN PROGRESS

Began to tweak these transforms to make sure we have everything we need to generate injections. This includes being able to

  • predict the SNR of injections. We want to plug in "easily" with different detector network models, which will require a bit more refactoring to make it obvious.
  • map the redshift to the luminosity distance (i.e., support nontrivial cosmology)
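For the redshift-to-distance mapping, a minimal sketch assuming a flat ΛCDM cosmology without radiation is given below. The function name, the default parameter values, and the use of Simpson's rule are illustrative choices and do not describe the transforms in this package. The relation used is D_L = (1 + z) (c / H0) ∫ dz' / E(z') from 0 to z, with E(z) = sqrt(Ω_m (1 + z)^3 + 1 − Ω_m).

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def luminosity_distance(z, hubble=70.0, omega_m=0.3, num=1000):
    """Luminosity distance in Mpc for flat LambdaCDM (radiation neglected).

    hubble is H0 in km/s/Mpc; the defaults are illustrative values only.
    """
    if z == 0:
        return 0.0

    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + (1.0 - omega_m))

    if num % 2:  # Simpson's rule needs an even number of intervals
        num += 1
    h = z / num
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, num):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    comoving = (C_KM_S / hubble) * total * h / 3.0
    return (1.0 + z) * comoving


nearby = luminosity_distance(0.001)
distant = luminosity_distance(1.0)
```

At low redshift the result approaches the Hubble-law value c z / H0, which provides a quick sanity check.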

Also cleaned up some stuff we don't anticipate needing.

gwdistributions/DEPRECATED

Moved many older implementations here. These are either incompatible with some of the other syntax changes introduced here or simply have not been tested as thoroughly as the SamplingDistributions that remain available. The idea is to slowly move distributions from DEPRECATED into "production" as they are tested and/or needed.

Edited by Reed Essick
