
calibration development

Reed Essick requested to merge clean-calibration_development into master

This merge will bring in the recent developments within calibration. In particular, FixedBandwidth1DKDE has been shown to scale reasonably well to large numbers of samples and to provide faithful error estimates for the value of the KDE (the pdf, not the cdf).
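For reviewers who want a concrete picture of what "error estimates for the value of the KDE" can mean, here is a minimal sketch of a fixed-bandwidth Gaussian KDE that also returns a pointwise standard error. It is not the code in FixedBandwidth1DKDE: the function name, the Gaussian kernel, and the error estimate (the standard error of the mean of the per-sample kernel values, which ignores smoothing bias) are all assumptions made for illustration.

```python
import numpy as np

def fixed_bandwidth_kde(samples, points, bandwidth):
    """Hypothetical helper: Gaussian KDE with one fixed bandwidth.

    Returns the pdf estimate at `points` and a pointwise standard error,
    taken as the standard error of the mean of the kernel contributions.
    """
    samples = np.asarray(samples, dtype=float)
    points = np.asarray(points, dtype=float)
    # z has shape (num_points, num_samples)
    z = (points[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / (np.sqrt(2.0 * np.pi) * bandwidth)
    pdf = kernels.mean(axis=1)  # the KDE is the average kernel value
    stderr = kernels.std(axis=1, ddof=1) / np.sqrt(samples.size)
    return pdf, stderr

# Example: samples drawn from the normal test distribution (mean=0.5, stdv=0.1)
rng = np.random.default_rng(0)
samples = rng.normal(0.5, 0.1, size=1000)
points = np.linspace(0.0, 1.0, 101)
pdf, stderr = fixed_bandwidth_kde(samples, points, bandwidth=0.03)
```

The bandwidth of 0.03 is an arbitrary choice for the example, not a recommended value.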

I've attached 2 files showing the coverage behavior of the error estimates for 2 different "known distributions": a beta distribution and a normal distribution.

beta distribution (alpha=2, beta=5): sanitycheck_calibrate-coverage_beta

normal distribution (mean=0.5, stdv=0.1): sanitycheck_calibrate-coverage_gaussian

Such plots can be generated relatively quickly with the included sanitycheck_calibrate-coverage via

    sanitycheck_calibrate-coverage -v --size 100 --Ntrial 50 --num_points 101

Importantly, they show that the KDE representation matches the expectation from the true pdf (left-most panels) and that the error estimates have diagonal coverage plots. In particular, the two right panels show the coverage for pdf(rank) at each rank separately as red lines. The saturation of each curve is set by the value of the pdf at that rank, so ranks that will be "visited more often" are drawn darker and ranks that are visited "less often" are drawn lighter. The blue line averages the coverage over all ranks, weighting each rank by pdf(rank). The blue and red coverage estimates typically stay within the expected 1-sigma error region for the coverage plot (grey shaded region), which is based on simplistic binomial error estimates for a cumulative histogram.
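The coverage calculation described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in sanitycheck_calibrate-coverage: the function names are hypothetical, the error bars are treated as Gaussian, and the binomial error is the simple sqrt(p(1-p)/Ntrial) estimate.

```python
import math
import numpy as np

def coverage_per_rank(estimates, stderrs, truth, levels):
    """Hypothetical helper: empirical coverage at each rank.

    estimates, stderrs: arrays of shape (Ntrial, Nrank)
    truth: array of shape (Nrank,), the true pdf at each rank
    levels: nominal two-sided confidence levels in (0, 1)
    Returns an array of shape (Nrank, Nlevel).
    """
    estimates = np.asarray(estimates, dtype=float)
    stderrs = np.asarray(stderrs, dtype=float)
    truth = np.asarray(truth, dtype=float)
    levels = np.asarray(levels, dtype=float)
    z = np.abs(estimates - truth[None, :]) / stderrs
    # Smallest Gaussian confidence level whose interval contains the truth
    p = np.vectorize(math.erf)(z / math.sqrt(2.0))
    # Fraction of trials whose interval at each nominal level contains the truth
    return (p[:, :, None] <= levels[None, None, :]).mean(axis=0)

def binomial_sigma(levels, ntrial):
    """Simple 1-sigma binomial error on the coverage at each nominal level."""
    levels = np.asarray(levels, dtype=float)
    return np.sqrt(levels * (1.0 - levels) / ntrial)

# Toy check with well-calibrated Gaussian errors (made-up inputs)
rng = np.random.default_rng(0)
truth = np.array([1.0, 2.0, 3.0])
stderrs = 0.1 * np.ones((2000, 3))
estimates = truth + stderrs * rng.standard_normal((2000, 3))
levels = np.linspace(0.1, 0.9, 9)

red = coverage_per_rank(estimates, stderrs, truth, levels)  # one curve per rank
blue = np.average(red, axis=0, weights=truth)               # weighted by pdf(rank)
grey = binomial_sigma(levels, ntrial=2000)                  # expected scatter
```

A diagonal coverage plot corresponds to `red` and `blue` tracking `levels` to within a few times `grey`.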

One can show that the errors all scale together as expected when we increase the number of observations fed into the KDE (--size).
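As a rough illustration of that scaling, the sketch below evaluates a pointwise error estimate at a single rank for several sample sizes. It assumes a Gaussian kernel and the standard error of the mean of the kernel values as the error estimate (neither is taken from the merged code), in which case the error should fall off roughly as 1/sqrt(size) at fixed bandwidth. The helper name is hypothetical.

```python
import numpy as np

def kde_stderr_at(x, samples, bandwidth):
    """Hypothetical helper: standard error of a fixed-bandwidth Gaussian KDE at x."""
    samples = np.asarray(samples, dtype=float)
    z = (x - samples) / bandwidth
    kernels = np.exp(-0.5 * z**2) / (np.sqrt(2.0 * np.pi) * bandwidth)
    return kernels.std(ddof=1) / np.sqrt(samples.size)

rng = np.random.default_rng(0)
sizes = [100, 1000, 10000]
errors = [
    kde_stderr_at(0.5, rng.normal(0.5, 0.1, size=n), bandwidth=0.03)
    for n in sizes
]
for n, err in zip(sizes, errors):
    # err * sqrt(n) should be roughly constant if the error scales as 1/sqrt(n)
    print(n, err, err * np.sqrt(n))
```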

Note: the git history may have gotten mucked up again. If you need me to clean it up, please let me know and I will, but I'm tired and wanted to get the ball rolling before I took a late lunch.
