Hi @kevin.turbang, this looks really good. I had a look at your comment on issue 109 and that does seem like it would account for some of the contribution. I'm pretty sure the variance bias term from overlapping segments is calculated in pygwb as w1w2squaredbar. It would be interesting to see if applying this factor and the Welch bias term gives the correct overall bias.
I also think it's interesting that the observed bias of 1.1 is about 1.05^2. Perhaps that is a coincidence, or perhaps it's a sign that the bias is being applied twice somewhere. Were you able to check if the sigmas used in the KS-test have already had the bias correction applied earlier in the code?
The normal Erf in the KS test is labelled as "Erf with \sigma = ..." where sigma is the bias factor applied to the Delta-SNR variates.
https://git.ligo.org/pygwb/pygwb/-/blob/master/pygwb/statistical_checks.py?ref_type=heads#L1379
I think this label could be misleading. The Erf being plotted appears to be a standard normal N(0, 1) for comparison with the plot of the Delta-SNR data (bins_count) which has been corrected by the bias. So it might be clearer if the Erf was just labelled something like "Normal CDF" or similar. Perhaps the bias or correcting factor could be included in the data plot instead.
The use of \sigma, I feel, is also easy to confuse with the stochastic sigma. It is not obvious that this is the bias (or whatever standard deviation is used to correct the Delta-SNR).
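To make the distinction concrete, here is a minimal sketch of the comparison as I understand it: the bias-corrected Delta-SNR values are compared against a standard normal CDF, which is written in terms of the Erf. The function names are hypothetical, and dividing by the bias is an assumption about how the correction is applied; this is not the pygwb code.

```python
import math

def normal_cdf(x):
    """CDF of the standard normal N(0, 1), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ks_statistic(delta_snr, bias=1.0):
    """One-sample KS statistic of bias-corrected values against N(0, 1).

    Assumption: the correction is a simple division by `bias`.
    """
    scaled = sorted(value / bias for value in delta_snr)
    n = len(scaled)
    d = 0.0
    for i, x in enumerate(scaled):
        cdf = normal_cdf(x)
        # Largest gap between the empirical step function and the normal CDF,
        # checked just after and just before each step.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d
```

In this picture the reference curve never depends on the bias, which is why a label like "Normal CDF" would fit it, with the bias quoted alongside the data instead.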
Philip Roy Charlton (e61ebc3e) at 24 Jan 10:00
I have sent off a quick note to Andrew via his Gmail account - will update when/if I get a response.
Hi @arianna.renzini I was about to raise an issue about this but I see one already exists - great! It would be very nice if we could have the time axes in stochmon correspond exactly to the start and end of each day or week so we can easily compare it to the network status. Do you have some thoughts about how that might be achieved? I suppose the start and end time of each period from the config file would need to be passed into the statistical checks module in some fashion.
Stochmon seems to be creating the K-S statistic plot but it's not visible in the summary pages.
Hi @arianna.renzini, ignoring other considerations like scripts outside of pygwb that use this function, I think the "ideal" would be to have separate functions that return the point estimate and sigma. A lesser change would be to just change the return order. Either way, if we made it consistent with all internal pygwb calls, do you think it would affect many external scripts? As written, I would consider the current behaviour "buggy" or, at best, likely to lead to bugs in the future, so maybe it's better to accept the pain and fix any dependent scripts now rather than later.
I would also suggest putting the S_h into its own function - it's useful to have that separate anyway in case you want to plot it, for example. Almost the same function appears in util.py as well, so there is some justification for giving it its own function.
I am not sure if the calculation of n_segs in the coherence plot is quite right on line 485:
https://git.ligo.org/pygwb/pygwb/-/blob/master/pygwb/statistical_checks.py#L485
n_segs = len(self.sliding_omega_cut) * int(np.floor(self.params.segment_duration/(fftlength))-1)
To me it looks like the part calculating the number of FFT segments per segment is not accounting for the overlap used in calculating the PSDs and CSDs.
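As a rough sketch of what I mean, this is how the count of FFT chunks per segment changes once overlap is included. The function name and the default 50% overlap are assumptions for illustration only; I have not checked which overlap pygwb actually uses here.

```python
import math

def n_fft_per_segment(segment_duration, fftlength, overlap_fraction=0.5):
    """Number of FFT-length chunks that fit in one segment when consecutive
    chunks overlap by `overlap_fraction` of their length (hypothetical helper)."""
    if not 0.0 <= overlap_fraction < 1.0:
        raise ValueError("overlap_fraction must be in [0, 1)")
    if fftlength > segment_duration:
        raise ValueError("fftlength must not exceed segment_duration")
    # Each new chunk starts one stride after the previous one.
    stride = fftlength * (1.0 - overlap_fraction)
    return int(math.floor((segment_duration - fftlength) / stride)) + 1
```

For example, with a 192 s segment and 32 s FFTs this gives 6 chunks with no overlap and 11 chunks with 50% overlap, so the overlap roughly doubles the count.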
Yes, we can close this now.
Hi @max.lalleman, there is no change - I still don't think this part of the bias correction is actually correct, at least as I interpret Andrew's comment, but it has very little effect on the bias.
I could try contacting Andrew about it.
This should fix the x-axis of the coherence histograms being too long and the y-axis being too short.
For the x-axis, the limit now rounds up to the nearest multiple of a power of 10, e.g. 0.012 "rounds up" to 0.02.
For the y-axis, the limit rounds up to the nearest power of 10 since this is a log axis, e.g. 110 "rounds up" to 1000.
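The two rounding rules can be sketched roughly as follows. The function names are hypothetical and this is not the code from the merge request, just an illustration of the intended behaviour for positive limits.

```python
import math

def round_up_linear_limit(value):
    """Round a positive value up to the next multiple of its leading power of 10."""
    exponent = math.floor(math.log10(value))
    base = 10.0 ** exponent
    return math.ceil(value / base) * base

def round_up_log_limit(value):
    """Round a positive value up to the next power of 10 (for a log axis)."""
    return 10.0 ** math.ceil(math.log10(value))
```

Values that fall exactly on a boundary may be affected by floating-point rounding, so a small tolerance might be needed in practice.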
Philip Roy Charlton (e61ebc3e) at 12 Dec 05:50
Adjusted the calculations of the axis limits
Currently if there is no coincident data over a day (or week), stochmon doesn't produce a web page/HTML files so it's not clear to a user if this is "normal" or if something has failed. It would be very helpful if it still produced an informative web page to say, for example, that no coincident data was found for the period.
Thanks Arianna - it looks like things are working well now - the new plots are working.