From what Niko told me earlier, the refinement works by measuring the standard deviation of AS_A_NSUM something like 15 seconds before the lockloss; the first time the signal deviates (or increases?) by a certain number of standard deviations is set as the refine time. It seems like in both of the cases here, simply increasing the number of standard deviations we allow would improve the timing.
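If I'm understanding the method right, it could be sketched roughly like this (a minimal numpy sketch of my understanding, not the production lockloss code; the quiet window and `n_std` values are just illustrative):

```python
import numpy as np

def refine_time(data, times, lockloss_gps, n_std=10, window=(-20, -15)):
    """Sketch of the std-based refinement: measure mean/std of the
    AS_A_NSUM channel in a quiet window before the lockloss, then take
    the first sample that deviates by more than n_std standard
    deviations as the refined lockloss time. Names are illustrative."""
    quiet = data[(times >= lockloss_gps + window[0]) &
                 (times < lockloss_gps + window[1])]
    mu, sigma = quiet.mean(), quiet.std()
    # search after the quiet window for the first out-of-band sample
    search = times >= lockloss_gps + window[1]
    crossed = np.abs(data[search] - mu) > n_std * sigma
    if not crossed.any():
        return None  # threshold never crossed -> no refinement
    return times[search][np.argmax(crossed)]
```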
Since we have been through several iterations of this refinement code, perhaps we could check any proposed change before making it. For example, if we take the last ~100 or so locklosses and overlay the refine time plots for them (maybe only 10 per plot for legibility) using whatever new refine time algorithm we are suggesting, that would be a good way to check that the change works for a variety of locklosses.
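A sketch of what that batched overlay check could look like (the `events` tuple format here is hypothetical, not the real lockloss pipeline's data structure):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless, for batch plotting
import matplotlib.pyplot as plt

def overlay_refine_plots(events, per_plot=10):
    """Overlay traces around each candidate refine time, a handful of
    locklosses per figure for legibility. `events` is a list of
    (times, data, refined_gps) tuples -- an assumed format."""
    figs = []
    for start in range(0, len(events), per_plot):
        fig, ax = plt.subplots()
        for times, data, refined in events[start:start + per_plot]:
            # plot each trace relative to its own refined time
            ax.plot(times - refined, data, alpha=0.5)
        ax.axvline(0, color="k", linestyle="--", label="refined time")
        ax.set_xlabel("time from refined lockloss [s]")
        ax.legend()
        figs.append(fig)
    return figs
```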
I think we will have to be more clever than just a standard deviation check. I have been testing different values of std for H1:ASC-AS_A_DC_NSUM_OUT_DQ, and for lower std values we get too many early refined times (e.g. for our current std = 10, 60+ out of 429 are too early, often during EQs). But if we increase the std, lots of events aren't getting refined at all, as their threshold is pushed too high.
Quick plot to show we can't win (higher std = more "no refinements", lower std = more "wrong refinements"):
(Note the very few data points.)
We should think about whether adding fixed thresholds or a combination of things would work better.
PDF showing some examples; scroll to the bottom to see how the threshold is pushed too high. In some examples it is pushed so high that the refinement fails even though it shouldn't: Refining_refinement.pdf
One idea: in the case where there are multiple threshold crossings, use the one that is closest to the guardian lockloss time, rather than the first one?
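A minimal sketch of that idea (illustrative names; `threshold` here stands in for whatever band the refinement uses): of all samples above threshold at or before the guardian time, take the last one rather than the first.

```python
import numpy as np

def closest_crossing(times, data, threshold, guardian_gps):
    """Of the threshold crossings, use the one closest to the guardian
    lockloss time (the last one at or before it) instead of the
    first. A sketch of the idea, not production code."""
    hits = times[(np.abs(data) > threshold) & (times <= guardian_gps)]
    return hits[-1] if hits.size else None
```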
There might be some other drawback to this, but it looks like it would work in the examples in your PDF.
Yes @sheila-dwyer that's a good idea, but I think that alone wouldn't give us a refined enough time if you look at a very zoomed-in version (blue line):
If the refinement is too late I think it would affect the other plots, i.e. saturating suspensions, and isn't what we want. What do you think @yannick.lecoeuche? It would mean some refinements would be ~0.05 s late. I could make the FAST tag use a different method, so I don't think that would be a problem.
I was considering whether looking at the gradient could be a good method (lockloss when the gradient is steep), though maybe a little confusing.
hmmm. This is getting more complicated, but what about using a high threshold to identify the time of the big spike in AS, and then looking at the closest crossing of a lower threshold before that?
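A sketch of that two-stage idea (threshold values and names are illustrative): find the first sample above the high threshold, then walk back to the closest upward crossing of the lower threshold before it.

```python
import numpy as np

def two_stage_refine(times, data, high_thresh, low_thresh):
    """Use a high threshold to locate the big spike in AS, then take
    the closest upward crossing of a lower threshold before that
    spike. A sketch of the idea, not production code."""
    spikes = np.where(np.abs(data) > high_thresh)[0]
    if spikes.size == 0:
        return None
    spike_i = spikes[0]  # first sample of the big spike
    above = np.abs(data) > low_thresh
    # indices where the signal rises above the low threshold
    ups = np.where(above[1:] & ~above[:-1])[0] + 1
    ups = ups[ups <= spike_i]
    return times[ups[-1]] if ups.size else times[spike_i]
```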
I tried using the std of the gradient of the AS channel. This actually works pretty well. With std +/- 50, 7 out of 426 lock losses didn't refine correctly.
When lock is lost, the gradient of the AS channel gets very big (~10^8).
However, I don't really like this method: if there is a little blip in the channel, the gradient reacts a lot (only saw this happen once):
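The gradient method could be sketched like this (my understanding only; the window and `n_std` are illustrative, and `np.gradient` stands in for however the production code differentiates):

```python
import numpy as np

def gradient_refine(times, data, lockloss_gps, n_std=50, window=(-20, -15)):
    """Take the gradient of the AS channel, measure its mean/std in a
    quiet window before the lockloss, and refine to the first sample
    where the gradient leaves the +/- n_std band. A sketch only."""
    grad = np.gradient(data, times)
    quiet = grad[(times >= lockloss_gps + window[0]) &
                 (times < lockloss_gps + window[1])]
    mu, sigma = quiet.mean(), quiet.std()
    search = times >= lockloss_gps + window[1]
    crossed = np.abs(grad[search] - mu) > n_std * sigma
    return times[search][np.argmax(crossed)] if crossed.any() else None
```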
I actually think having hard thresholds for all the channels (IMC, POP, AS) would work well. For AS we could have a high and a low threshold.
This would actually be more consistent than a std check, as we've seen the standard deviation checks fail too often. I'll have a quick go and see if it looks viable!
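The hard-threshold idea could look something like this (the threshold values and channel keys here are made up for illustration and would need tuning against real locklosses):

```python
import numpy as np

# Illustrative fixed per-channel thresholds -- made-up values.
HARD_THRESHOLDS = {"IMC": 0.1, "POP": 0.05, "AS_LOW": 2.0, "AS_HIGH": 50.0}

def hard_threshold_refine(times, channels):
    """Refine to the earliest time any channel crosses its fixed
    threshold. `channels` maps a key in HARD_THRESHOLDS to that
    channel's data array (a hypothetical format)."""
    candidates = []
    for name, data in channels.items():
        hit = times[np.abs(data) > HARD_THRESHOLDS[name]]
        if hit.size:
            candidates.append(hit[0])
    return min(candidates) if candidates else None
```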
That seems promising. @camilla.compton can you remind me what std is? Are you saying that you take the gradient of AS power, then look a certain time before the lockloss to measure the standard deviation of the gradient, then look for when the gradient goes above (or below) a number of standard deviations?
Yes, sorry @sheila-dwyer , that is exactly what I am saying.
The main refinement method calculates the mean and standard deviation of the data between [-20, -15] seconds* before the lockloss (so 5 seconds of data). The threshold is then a number (which I referred to as std, here 50) multiplied by the standard deviation, added to (or subtracted from) the mean.
e.g. if we used std_threshold = 50.
Upper threshold = calculated mean + (50 * calculated standard deviation)
Lower threshold = calculated mean - (50 * calculated standard deviation)
* @yannick.lecoeuche and I actually spoke about whether [-20, -15] is too soon before the lockloss, but I haven't played with those numbers yet. I think [-10, -5] might make more sense and capture the ground, and so the PD signal, shaking more if an EQ is coming. But then we risk the threshold never being crossed if the standard deviation is too high.
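That threshold calculation, as a small worked sketch (same [-20, -15] window and std_threshold = 50 as described above; not the production code):

```python
import numpy as np

def refine_thresholds(data, times, lockloss_gps,
                      std_threshold=50, window=(-20, -15)):
    """Mean and std of the quiet window before the lockloss, turned
    into upper/lower thresholds as described above."""
    quiet = data[(times >= lockloss_gps + window[0]) &
                 (times < lockloss_gps + window[1])]
    mean, std = quiet.mean(), quiet.std()
    return mean + std_threshold * std, mean - std_threshold * std
```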