Popsummary support
Adds the ability to automatically save gwpopulation data products to popsummary format.
Some working notes:
- Currently only saves hypersamples and parts of the metadata.
- Popsummary has a single metadata field for models (i.e. there's no field for vt models). Tentatively, I've just prepended 'vt:' to the string for each vt model, and then added these to the popsummary model list.
- Adding most of the data products to the file should be straightforward, but some metadata (e.g. hyperparameter descriptions) as well as draws from the PPD will require additional inputs/computing.
- Have added a toggle that allows the popsummary reformatting to be turned off. Considering this reformatting doesn't delete any old data products and is just a quick post-processing step, this may be unnecessary.
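For reference, the 'vt:' prefixing described in the notes above could look roughly like the sketch below. The helper names and the example model names are hypothetical and not taken from the branch; the sketch only assumes that no population model name itself starts with the prefix.

```python
# Hypothetical helpers: merge population and VT model names into the single
# model list that popsummary provides, and split them apart again.
VT_PREFIX = "vt:"


def combine_model_names(models, vt_models, prefix=VT_PREFIX):
    """Return one flat list, with each VT model name tagged by `prefix`."""
    return list(models) + [f"{prefix}{name}" for name in vt_models]


def split_model_names(combined, prefix=VT_PREFIX):
    """Invert combine_model_names.

    Assumes no population model name starts with `prefix`.
    """
    models = [name for name in combined if not name.startswith(prefix)]
    vt_models = [name[len(prefix):] for name in combined if name.startswith(prefix)]
    return models, vt_models


# Placeholder model names, for illustration only.
combined = combine_model_names(["mass_model", "redshift_model"], ["vt_model_a"])
```

Keeping the prefix in one constant means anything reading the popsummary file can recover the two lists without guessing at the convention.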
Activity
@jacob.golomb My test run ran into a weird bug earlier in common_format.py (during resample_events_per_population_sample; there's a negative probability somewhere, probably user error and definitely unrelated to the popsummary changes). I just commented out this section and re-ran on the head node and found the popsummary formatting works as expected. You can check the output at /home/christian.adamcewicz/projects/tests/outdir_popsummary on CIT if you're interested.
- Resolved by Colm Talbot
@christian.adamcewicz would you be able to rebase to avoid merge conflicts? There have been some changes to data_collection and main since you submitted this MR.
added 10 commits
- 78396293...d3ade5ee - 9 commits from branch RatesAndPopulations:master
- b8a6ce71 - resolve conflicts with main
When I try to run with this branch, the post_plots step fails because it cannot find "models" in the result.meta_data (https://git.ligo.org/RatesAndPopulations/gwpopulation_pipe/-/blob/master/gwpopulation_pipe/post_plots.py?ref_type=heads#L270). I'm not sure if this is due to something specific to this branch or a broader issue with gwpopulation_pipe. When I look at result.meta_data.keys() for a result saved with this version of gwpopulation_pipe, "models" is not listed as one of the keys. @colm.talbot
Edited by Jacob Golomb
The failure was due to an error before reaching the final steps of data_analysis. The problem is fixed in !68 (merged).
I was able to fix this, but now it gets to this line: https://git.ligo.org/RatesAndPopulations/gwpopulation_pipe/-/blob/d7ac7753b91a291182eed346c773442c9f4d0d3a/gwpopulation_pipe/common_format.py#L506 and raises an error that nothing is stored in the result.meta_data["likelihood"]. I don't remember, is gwpopulation_pipe/bilby supposed to save a copy of the likelihood info in the result file? Specifically, I know it usually does save something into this field, but it is empty in this case.
Edited by Jacob Golomb
Could this have something to do with jaxified likelihoods? The test I ran used cupy as the backend and didn't have this issue.
Unless the parameters are already stored somewhere else in the result object that I'm not aware of, we could just add something like this to data_analysis:
result.meta_data["parameters"] = args.parameters
and get the parameters from there in common_format.
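A minimal sketch of that round trip, using stand-in objects instead of the real args and result (only the attributes touched here are modelled). The get_parameters helper and the parameter names are hypothetical.

```python
from types import SimpleNamespace

# Stand-ins for the real objects; placeholder hyperparameter names.
args = SimpleNamespace(parameters=["alpha", "beta", "mmax"])
result = SimpleNamespace(meta_data={})

# In data_analysis: record the parameter names alongside the result.
result.meta_data["parameters"] = list(args.parameters)


def get_parameters(result):
    """Hypothetical lookup for common_format that fails with a clear message."""
    try:
        return result.meta_data["parameters"]
    except KeyError:
        raise KeyError(
            "result.meta_data has no 'parameters' entry; "
            "the result may have been written before this field was added"
        ) from None


parameters = get_parameters(result)
```

The explicit error is there so that results written before the field existed fail with a readable message.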
To get the correct likelihood metadata we can add this after creating the jitted likelihood
likelihood.meta_data = likelihood._likelihood.meta_data
Now I look at it, I don't think that should matter though, I don't see where it is querying the likelihood metadata.
I was looking at the wrong branch.
Edited by Colm Talbot
It's possible the data is being converted to JAX impl arrays and so a deep copy would be needed.
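As a rough illustration of the two suggestions above (copying the metadata from the wrapped likelihood, and making that copy deep), here is a sketch with stand-in classes. The class names and metadata keys are placeholders; the only detail taken from the discussion is that the jitted wrapper keeps the original likelihood as _likelihood. Whether a deep copy actually fixes the empty field is untested.

```python
import copy


class InnerLikelihood:
    """Stand-in for the original likelihood, which carries meta_data."""

    def __init__(self):
        # Placeholder metadata keys, for illustration only.
        self.meta_data = {"backend": "jax", "n_events": 3}


class JittedLikelihood:
    """Stand-in wrapper that keeps the original likelihood as `_likelihood`."""

    def __init__(self, likelihood):
        self._likelihood = likelihood


likelihood = JittedLikelihood(InnerLikelihood())

# Deep copy so that later changes to the inner object's metadata do not
# alter what ends up stored with the result.
likelihood.meta_data = copy.deepcopy(likelihood._likelihood.meta_data)

# Mutating the inner metadata afterwards leaves the copy untouched.
likelihood._likelihood.meta_data["backend"] = "changed"
```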
result.meta_data["parameters"] = args.parameters
@christian.adamcewicz this sounds like a good idea.
@christian.adamcewicz Do you know where the rates_on_grids are stored? I'm told that this is only saving the hyperparameter samples.
@jacob.golomb I'm trying to run some tests, but I'm running into a cryptic JAX-related error at compute_rate_posterior in data_analysis.py. Unless there's known issues with this, I suspect this is probably an environment issue, as this branch doesn't alter anything in compute_rate_posterior or its inputs yet. Would you be able to point me to a suitable environment on CIT I could clone?
In the meantime, I'll switch the backend over to cupy and continue testing with that.
- Resolved by Colm Talbot
@jacob.golomb I've pushed some edits that save the gridded rates computed in post_plots to the popsummary output, along with the reweighted posteriors and injections computed in common_format. The code also inverse-transform samples these grids to store fair_population_draws. This loses covariance between parameters, so it might be worth thinking of a different way to do this moving forward (maybe doing the sampling in post_plots makes the most sense).
The only thing I'm not 100% certain of with these edits is the ordering of jobs in the dag file -- I've had to tweak this a little so that post_plots runs before common_format, as common_format now relies on some of its outputs.
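To make the covariance point concrete, here is a generic sketch of inverse transform sampling from a rate tabulated on a uniform 1D grid. It is not the code on the branch: the function name, grid and rate values are made up, and it uses only the standard library where the real code would more likely use array operations.

```python
import bisect
import itertools
import random


def inverse_transform_sample(grid, density, n_draws, rng):
    """Draw values from a density tabulated on a uniform 1D grid.

    Piecewise-constant approximation: choose a grid cell through the
    cumulative weights, then place the draw uniformly within that cell.
    """
    step = grid[1] - grid[0]
    cumulative = list(itertools.accumulate(density))
    total = cumulative[-1]
    draws = []
    for _ in range(n_draws):
        u = rng.random() * total
        # First cell whose cumulative weight exceeds u.
        index = min(bisect.bisect_right(cumulative, u), len(grid) - 1)
        draws.append(grid[index] + (rng.random() - 0.5) * step)
    return draws


rng = random.Random(0)
# Made-up grid and rate values, for illustration only.
mass_grid = [5.0, 15.0, 25.0, 35.0]
mass_rate = [0.0, 3.0, 1.0, 0.0]
mass_draws = inverse_transform_sample(mass_grid, mass_rate, 1000, rng)
```

Each parameter is drawn from its own 1D grid independently of the others, so pairing the resulting draws carries no information about correlations in the joint distribution.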
added 1 commit
- d3c027ea - Added exceptions/warnings for failing to save rates
added 1 commit
- d096af5b - added rejection sampling for fair population draws
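The commit itself is not shown here, so the following is only a generic sketch of rejection sampling on weighted joint samples, not the implementation in d096af5b. The function name, the toy proposal and the weights are all hypothetical.

```python
import random


def rejection_sample(samples, weights, rng):
    """Keep each sample with probability weight / max(weights)."""
    max_weight = max(weights)
    if min(weights) < 0 or max_weight <= 0:
        raise ValueError("weights must be non-negative with at least one positive value")
    return [s for s, w in zip(samples, weights) if rng.random() < w / max_weight]


rng = random.Random(1)
# Toy joint proposal in two parameters, with weights favouring x close to y.
proposal = [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(5000)]
weights = [1.0 if abs(x - y) < 0.1 else 0.0 for x, y in proposal]
kept = rejection_sample(proposal, weights, rng)
```

Because whole joint samples are accepted or rejected together, the kept draws retain the correlation between parameters that independent per-parameter sampling discards.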
- Resolved by Colm Talbot
This is working for me. @colm.talbot is there anything else before we can get it merged?