LHO has recently been having a string of TURN_ON_BS_STAGE_2 locklosses, and in searching for some to test on I realized that I wasn't able to find any locklosses from that state or nearby states happening later than December last year. I used ndscope to look at the ISC_LOCK state for a lockloss that was marked in Jenne's re-acquisition survey (by me) as a TURN_ON_BS_STAGE_2 loss, and it turns out that ISC_LOCK was bumped back to ACQUIRE_DRMI_1F for a fraction of a second before losing lock (see attached).
This would normally register as 2 locklosses, but the integer precision of the lockloss ID system writes over the first (correct) lockloss. I believe we have a lot of recent locklosses that have been miscategorized as ACQUIRE_DRMI_1F locklosses. We should tweak our state-finding method and ideally re-run on previous locklosses.
I'm not really understanding the explanation here. Is the claim that ISC_LOCK went from LOCKLOSS to ACQUIRE_DRMI_1F? If so, how does that make sense? I don't see how that's possible, or desirable. It seems like that's the issue that needs to be fixed.
I think it's fair to say that we can't actually physically have two lock losses in one second. If there are two within one second we should just go with the first and ignore the second. If that's not happening then that's something to be fixed.
But I think there are other questions here as well.
I've identified a change made in November 2019 (e074d18a) that would cause a second event in a second to overwrite the first. I'm looking into how to fix this.
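For reference, the "keep the first event in a given second" behavior could look roughly like the sketch below. This is only an illustration under the assumption that events are keyed by integer GPS second; the function name and the tuple layout are hypothetical and are not taken from the lockloss code.

```python
def keep_first_per_second(events):
    """Keep only the earliest event in each integer GPS second.

    `events` is an iterable of (gps_time, from_state, to_state) tuples,
    assumed to be sorted by time. This layout is hypothetical.
    """
    first_seen = {}
    for gps_time, from_state, to_state in events:
        event_id = int(gps_time)  # integer-second ID, as described above
        # setdefault only stores the event if this ID has not been seen yet,
        # so a later event in the same second cannot overwrite the first
        first_seen.setdefault(event_id, (gps_time, from_state, to_state))
    return [first_seen[key] for key in sorted(first_seen)]


events = [
    (1262304000.25, 104, 3),
    (1262304000.75, 101, 2),  # same integer second as the event above
    (1262304005.00, 600, 2),
]
kept = keep_first_per_second(events)
```

Here `kept` holds the 104 -> 3 event and the 600 -> 2 event; the 101 -> 2 event is dropped because it shares an integer second with an earlier one.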
@jameson.rollins It's possible to go from LOCKLOSS_DRMI to ACQUIRE_DRMI_1F (which I believe is the case above, though hard to pick out). We included both LOCKLOSS_DRMI (state number 3) and LOCKLOSS (state number 2) as valid lockloss state options since the IFO would sometimes go from its pre-lockloss state to 3, then to 2.
Actually, we've been completely ignoring transitions to LOCKLOSS_DRMI for
a long time, and in fact I removed that code entirely in
49dda054. So if this is really due to a
transition to LOCKLOSS_DRMI then I'm more confused...
@jameson.rollins Talking to TJ and Sheila, it seems like it is going to LOCKLOSS_DRMI (and not having a lockloss event created for it), and then goes to ACQUIRE_DRMI_1F and back down to LOCKLOSS. So in this case there is no event overwriting, the only logged lockloss is 101 -> 2.
The commissioners seem comfortable with the way that GUARDIAN goes between states, so my suggestion would be to record additional previous state values (if they occurred within a few seconds) and add logic to choose the correct previous state.
Sorry, can you be more explicit about what the suggestion is exactly?
It was my understanding that we didn't want to record the LOCKLOSS_DRMI
transitions.
I agree, we don't want to record LOCKLOSS_DRMI transitions. In data.gen_transitions we find every state transition in the chosen ISC_LOCK_STATE_N segment. I was hoping to check a few seconds before transitions to LOCKLOSS to see if there was a transition up from LOCKLOSS_DRMI. If so, choose the state before LOCKLOSS_DRMI and call that the lockloss state (even though the refined lockloss time would still be from ACQUIRE_DRMI_1F to LOCKLOSS).
However, looking at the code I'm remembering that the analysis of the transition states is done in the search followup, and the transition states are being yielded in sets of two by data.gen_transitions. So I'll have to think more about the ideal way to handle this.
I agree, we don't want to record LOCKLOSS_DRMI transitions. In
data.gen_transitions we find every state transition in the chosen
ISC_LOCK_STATE_N segment, I was hoping to check a few seconds before
transitions to LOCKLOSS to see if there was a transition up from
LOCKLOSS_DRMI. If so, choose the state before LOCKLOSS_DRMI and call
that the lockloss state (even though the refined lockloss time would
still be from ACQUIRE_DRMI_1F to LOCKLOSS).
I guess I'm confused about this. If we want to record the state before
a LOCKLOSS_DRMI transition, why aren't we just recording LOCKLOSS_DRMI
transitions?
However, looking at the code I'm remembering that the analysis of the
transition states is done in the search followup, and the transition
states are being yielded in sets of two by data.gen_transitions. So
I'll have to think more about the ideal way to handle this.
The search for lock loss state(s) records the state that we transition
from, which is also re-confirmed in the state history plugin.
Hmm, I'm not sure I'm following all of this comment thread. We do sometimes lose lock from only DRMI; if that happens, the guardian goes back to acquire DRMI without redoing the ALS sequence. Sometimes we go to lockloss DRMI even though ALS has also lost lock, in which case we quickly transition from lockloss DRMI to acquire DRMI to lockloss. This is then recorded as a lockloss from acquire DRMI. I think that we should note this as a lockloss DRMI lockloss.
I don't see a problem with treating LOCKLOSS_DRMI in the same way that we treat LOCKLOSS, and agree that if they happen within the same second we should just use the first event. I think that in terms of summary plots they could be grouped in with other locklosses. This kind of DRMI-only lockloss doesn't happen after we start the CARM offset reduction, so this won't be applicable to later guardian states.
Can you somehow make the check for ALS lock come before the check for
DRMI lock, so that these kinds of lock losses will be recorded as
LOCKLOSS instead of LOCKLOSS_DRMI?
It looks like the order of the checks is IMC, X arm green, Y arm green, DRMI. So nominally it is checking for ALS first. But I could imagine that the IFO could unlock at any time, and it could be that our thresholds for DRMI are generally faster to respond to a lockloss. I don't want to try to mess with those thresholds, and I don't see an easy way to avoid this situation.
My suggestion is we pass state transitions to the search tool in sets of 4 (ex: 104 -> 3 -> 101 -> 2) and, if the last number is 2, check if the first two transitions happened within a short amount of time.
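A rough sketch of what that check could look like is below. It assumes transitions arrive as (gps_time, from_state, to_state) tuples in time order, which is not necessarily what data.gen_transitions yields. The function name and the `max_gap` threshold are hypothetical. Note that it requires the whole chain (e.g. 104 -> 3 -> 101 -> 2) to fit within `max_gap`, which is slightly stricter than only timing the first two transitions.

```python
LOCKLOSS = 2       # ISC_LOCK state number for LOCKLOSS
LOCKLOSS_DRMI = 3  # ISC_LOCK state number for LOCKLOSS_DRMI


def find_lockloss_states(transitions, max_gap=5.0):
    """Return (lockloss_time, attributed_state) for each transition to LOCKLOSS.

    `transitions` is a time-ordered list of (gps_time, from_state, to_state).
    `max_gap` (seconds) is an assumed threshold, not a tuned value.
    """
    results = []
    for i, (t_loss, from_state, to_state) in enumerate(transitions):
        if to_state != LOCKLOSS:
            continue
        attributed = from_state  # default: the state we dropped from
        if i >= 2:
            t_first, first_from, first_to = transitions[i - 2]
            _, mid_from, mid_to = transitions[i - 1]
            # Look for the chain X -> 3 -> Y -> 2 happening quickly
            is_chain = (
                first_to == LOCKLOSS_DRMI
                and mid_from == LOCKLOSS_DRMI
                and mid_to == from_state
            )
            if is_chain and (t_loss - t_first) <= max_gap:
                # Attribute the lockloss to the state before LOCKLOSS_DRMI
                attributed = first_from
        results.append((t_loss, attributed))
    return results


quick = [(100.0, 104, 3), (100.2, 3, 101), (100.4, 101, 2)]
slow = [(50.0, 104, 3), (50.2, 3, 101), (80.0, 101, 2)]
quick_result = find_lockloss_states(quick)
slow_result = find_lockloss_states(slow)
```

In the `quick` case the lockloss at 100.4 is attributed to state 104 rather than 101, while the refined lockloss time stays at the 101 -> 2 transition. In the `slow` case the chain takes 30 seconds, so the lockloss stays attributed to 101.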