guardian issues
https://git.ligo.org/cds/software/guardian/-/issues
2021-01-11T21:47:55Z

https://git.ligo.org/cds/software/guardian/-/issues/75
Python2ism left in worker.py
2021-01-11T21:47:55Z
Thomas Shaffer

When trying to run an SPM snapshot, we crashed our test node. Upon further investigation we found that there were two instances of "iteritems" in worker.py (lines 403 and 406 in version 1.4.3) that caused the crash.
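
For context, `dict.iteritems()` exists only in Python 2; under Python 3 the call raises `AttributeError`, and `dict.items()` is the spelling that works on both. A minimal sketch, assuming a plain dict (the `channels` name and its contents are made up for illustration and are not taken from worker.py):

```python
# Hypothetical data; not taken from guardian's worker.py.
channels = {"STATE": 10, "REQUEST": 20}

# Python 3 dicts have no iteritems(); this is the kind of call that crashes.
try:
    channels.iteritems()
    has_iteritems = True
except AttributeError:
    has_iteritems = False

# items() works on both Python 2 and Python 3.
pairs = sorted(channels.items())
```
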
See LHO alog for reference and log screenshot: https://alog.ligo-wa.caltech.edu/aLOG/index.php?callRep=57595

https://git.ligo.org/cds/software/guardian/-/issues/74
logging all activities from MEDM GUIs
2020-11-20T16:39:16Z
Franco Carbognani

While trying to classify the causes of lock losses during O3 for Virgo, we are realizing that it is sometimes not trivial to tell when a given change in a guardian node's status happened because of an explicit transition requested from an MEDM user interface.

Is it already possible (or could it be easily implemented) to log every request coming from an MEDM GUI? If this logging also reported the user under which the GUI is running, it would be even more informative.

https://git.ligo.org/cds/software/guardian/-/issues/49
GuardUtil Graph not Generating Network Graph due to unmet Networkx dependency.
2020-05-14T18:21:20Z
Nathan Holland

On a fresh Guardian installation the command *guardutil graph* fails. The problem is that the module *Networkx* can't import the module *PyDot*. The traceback is below:
```
$ guardutil graph SUS_LOCK -o SUS_LOCK_rev1__20200402.pdf
Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/usr/lib/python2.7/dist-packages/guardian/guardutil.py", line 507, in <module>
main()
File "/usr/lib/python2.7/dist-packages/guardian/guardutil.py", line 504, in main
args.func(args)
File "/usr/lib/python2.7/dist-packages/guardian/guardutil.py", line 153, in draw_graph
edge_constraints=args.constraints,
File "/usr/lib/python2.7/dist-packages/guardian/graph.py", line 208, in sys2dot
dot = to_pydot(G)
File "/usr/lib/python2.7/dist-packages/networkx/drawing/nx_pydot.py", line 199, in to_pydot
pydot = _import_pydot()
File "/usr/lib/python2.7/dist-packages/networkx/drawing/nx_pydot.py", line 362, in _import_pydot
import pydot
ImportError: No module named pydot
```
A look at the installed module reveals the following:
```
$ apt search pydot
Sorting... Done
Full Text Search... Done
python-pydot/stable 1.4.1-1 all
Python interface to Graphviz's dot
python-pydot-ng/stable 1.0.0-3 all
interface to Graphviz's Dot - Python 2.7
python-pydotplus/stable,now 2.0.2-2 all [installed,automatic]
interface to Graphviz's Dot language - Python 2.7
python-pydotplus-doc/stable 2.0.2-2 all
interface to Graphviz's Dot language - doc
python3-pydot/stable 1.4.1-1 all
Python interface to Graphviz's dot (Python 3)
python3-pydotplus/stable,now 2.0.2-2 all [installed,automatic]
interface to Graphviz's Dot language - Python 3.x
```
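
The listing suggests that only `pydotplus` is installed, while the traceback shows networkx trying to `import pydot`. A quick way to check which of these modules a given interpreter can actually import is sketched below. It uses only `importlib.import_module` from the standard library; the helper name `importable` is made up for this example, and the module names are assumed from the package names in the apt listing.

```python
import importlib


def importable(name):
    """Return True if the module `name` can be imported by this interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


# Module names assumed from the package names in the apt listing above.
for candidate in ("pydot", "pydotplus", "pydot_ng"):
    print("%s: %s" % (candidate, "ok" if importable(candidate) else "missing"))
```

Run it with the same interpreter that runs `guardutil` (Python 2.7 in the traceback above), since each Python version has its own set of installed packages.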
The code for the specific Guardian is attached: [SUS_LOCK.py](/uploads/b497d04f965cd23600b8505d5d2f6a7a/SUS_LOCK.py)

https://git.ligo.org/cds/software/guardian/-/issues/11
Unusual guardian message related to errors in user python code
2020-05-14T18:20:30Z
Michael Thomas

When user python code has syntax errors, the guardian node fails with a cryptic error message:

guardian -p TCS_KAL (code=exited, status=1/FAILURE)

Running this command on the l1guardian host itself shows a little more detail:
```
$ guardian -p TCS_KAL
System error: Module 'TCS_KAL' not found in any of the following search paths:
/opt/rtcds/userapps/release/als/common/guardian
/opt/rtcds/userapps/release/asc/common/guardian
/opt/rtcds/userapps/release/cal/common/guardian
/opt/rtcds/userapps/release/hpi/common/guardian
/opt/rtcds/userapps/release/ioo/common/guardian
/opt/rtcds/userapps/release/isc/common/guardian
/opt/rtcds/userapps/release/isi/common/guardian
/opt/rtcds/userapps/release/lsc/common/guardian
/opt/rtcds/userapps/release/omc/common/guardian
/opt/rtcds/userapps/release/psl/common/guardian
/opt/rtcds/userapps/release/sus/common/guardian
/opt/rtcds/userapps/release/sqz/common/guardian
/opt/rtcds/userapps/release/sys/common/guardian
/opt/rtcds/userapps/release/tcs/common/guardian
```
However, in no case do we get any indication that the node is failing because of a syntax error.
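
Until the error message is improved, one way to rule out a syntax error is to parse the module file directly. The following is only a sketch using the standard-library `ast` module; the function name `check_syntax` is invented here, and nothing like it is claimed to exist in guardian.

```python
import ast


def check_syntax(path):
    """Return None if the file parses, else a short description of the error."""
    with open(path) as f:
        source = f.read()
    try:
        ast.parse(source, filename=path)
    except SyntaxError as err:
        return "%s:%s: %s" % (err.filename, err.lineno, err.msg)
    return None
```

Calling `check_syntax` on a node's module file (for example the file that defines TCS_KAL, wherever it lives on the host) returns a `file:line: message` string when the file does not parse.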
See https://services.ligo-la.caltech.edu/FRS/show_bug.cgi?id=11435 for the original LLO error report.

https://git.ligo.org/cds/software/guardian/-/issues/51
guardian package install does not drag in awg for INJ_TRANS
2020-05-14T18:19:38Z
Keith Thorne

The INJ_TRANS Guardian node at both sites requires the python awg bindings from gds_crtools (or it may be in cds_crtools). They are not being pulled in by the 'guardian' package in 1.4.0 (this was also true for earlier releases). See [TST log 12928](https://alog.ligo-la.caltech.edu/TST/index.php?callRep=12928).

Assignee: Jameson Rollins <jameson.rollins@ligo.org>

https://git.ligo.org/cds/software/guardian/-/issues/52
Guardian nodes timeout with 1.4.1 point release
2020-05-11T15:43:44Z
Keith Thorne

After the upgrade from 1.4.0 to 1.4.1 of guardian and guardctrl, many previously working nodes (SUS, SEI, etc.) now time out on startup. A 'guardctrl disable' on these nodes also does not complete; see [TST log 13468](https://alog.ligo-la.caltech.edu/TST/index.php?callRep=13468).

Milestone: guardian 1.5
Assignee: Jameson Rollins <jameson.rollins@ligo.org>

https://git.ligo.org/cds/software/guardian/-/issues/54
guardian node disable does not complete with 1.4.1
2020-05-11T15:34:58Z
Keith Thorne

Also noticed with point release 1.4.1 is that 'guardctrl disable <node>' starts but never completes. This happens even on nodes that had started successfully (and were stopped to attempt the disable):
```
[keith.thorne@x2portal ~]$ guardctrl disable TCS_SIM_COPY
Really disable the following nodes?
TCS_SIM_COPY
This will remove these nodes from the site list.
Type 'yes' to disable: yes
```
This may also have been true with release 1.4.0, as that had not been tested.

Milestone: guardian 1.5
Assignee: Jameson Rollins <jameson.rollins@ligo.org>

https://git.ligo.org/cds/software/guardian/-/issues/53
Python2-ezca not included in Debian10 guardian 1.3.4 dependencies.
2020-05-11T15:21:21Z
Nathan Holland

Guardian, as of 1.3.4, still runs on Python2. Python2-Ezca is not included in the dependencies when following the installation instructions provided [here](https://git.ligo.org/cds/guardian/-/wikis/guardctrl). It is not possible to simply change `/usr/bin/guardian` to call Python3-Guardian.

https://git.ligo.org/cds/software/guardian/-/issues/50
Wrong return values for _data_for_key in system.py
2020-05-08T16:03:30Z
Keith Thorne

When attempting to start some Guardian nodes after the upgrade to Debian 10 and Python 3, we get the following error:
```
2020-05-06_21:56:49.639195Z Starting Advanced LIGO Guardian service: FAST_SHUTTER...
2020-05-06_21:56:50.035708Z Traceback (most recent call last):
2020-05-06_21:56:50.035708Z File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
2020-05-06_21:56:50.036789Z "__main__", mod_spec)
2020-05-06_21:56:50.036789Z File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
2020-05-06_21:56:50.036789Z exec(code, run_globals)
2020-05-06_21:56:50.036789Z File "/usr/lib/python3/dist-packages/guardian/__main__.py", line 291, in <module>
2020-05-06_21:56:50.036789Z main()
2020-05-06_21:56:50.036789Z File "/usr/lib/python3/dist-packages/guardian/__main__.py", line 152, in main
2020-05-06_21:56:50.036789Z system.load()
2020-05-06_21:56:50.036789Z File "/usr/lib/python3/dist-packages/guardian/system.py", line 460, in load
2020-05-06_21:56:50.036789Z self.add_state(key, obj)
2020-05-06_21:56:50.036789Z File "/usr/lib/python3/dist-packages/guardian/system.py", line 330, in add_state
2020-05-06_21:56:50.036789Z if index in self.indices and name != self.index(index):
2020-05-06_21:56:50.036789Z File "/usr/lib/python3/dist-packages/guardian/system.py", line 766, in index
2020-05-06_21:56:50.036789Z return self._data_for_key(key)[0]
2020-05-06_21:56:50.036789Z File "/usr/lib/python3/dist-packages/guardian/system.py", line 732, in _data_for_key
2020-05-06_21:56:50.037791Z return state, data
2020-05-06_21:56:50.037791Z UnboundLocalError: local variable 'data' referenced before assignment
2020-05-06_21:56:50.076765Z guardian@FAST_SHUTTER.service: Control process exited, code=exited, status=1/FAILURE
2020-05-06_21:56:50.076901Z guardian@FAST_SHUTTER.service: Failed with result 'exit-code'.
2020-05-06_21:56:50.077156Z Failed to start Advanced LIGO Guardian service: FAST_SHUTTER.
```
Looking at the relevant section of system.py, it appears the return should be 'state, index', not 'state, data'.
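
The failure can be reproduced outside guardian. The sketch below stands in a plain dict for the graph (the names `NODES`, `data_for_key_buggy`, and `data_for_key_fixed` are invented for illustration). It shows one possible repair that keeps the documented (key, data) return shape by binding the data explicitly, as an alternative to returning the index as suggested in this report. It is not necessarily the fix adopted upstream; the method as it appears in system.py is quoted after the sketch.

```python
# Stand-in for the graph's node table: state name -> attribute dict.
# Hypothetical data, not taken from any real guardian system.
NODES = {"INIT": {"index": 0}, "DOWN": {"index": 10}}


def data_for_key_buggy(key):
    """Mirror the reported logic: `data` is bound only in the str branch."""
    if isinstance(key, str):
        data = NODES[key]
        return data["index"], data
    for state, attrs in NODES.items():
        if attrs["index"] == key:
            return state, data  # raises UnboundLocalError for integer keys
    raise KeyError("%s is not a state index" % key)


def data_for_key_fixed(key):
    """Bind `data` in the loop so both branches return (key, data)."""
    if isinstance(key, str):
        data = NODES[key]
        return data["index"], data
    for state, data in NODES.items():
        if data["index"] == key:
            return state, data
    raise KeyError("%s is not a state index" % key)
```
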
```
def _data_for_key(self, key):
    """Return state (key, data) tuple for state.
    If a string is provided, key is the state index. If a number
    is provided, key is the state name.
    KeyError or TypeError exceptions are raised where appropriate.
    """
    if isinstance(key, str):
        try:
            data = self._graph.nodes[key]
            return data['index'], data
        except KeyError:
            raise KeyError("%s is not a state name" % key)
    elif isinstance(key, int):
        for state, index in self._graph.nodes(data='index'):
            if index == key:
                return state, data
        raise KeyError("%s is not a state index" % key)
    else:
        raise TypeError("item must be state name string or index integer.")
```

Milestone: guardian 1.5
Assignee: Jameson Rollins <jameson.rollins@ligo.org>

https://git.ligo.org/cds/software/guardian/-/issues/47
/usr/bin/guardian -> python3
2020-05-01T03:43:03Z
Jameson Rollins <jameson.rollins@ligo.org>

Milestone: 1.4 release

https://git.ligo.org/cds/software/guardian/-/issues/30
guardctrl: remove header from global channel file
2020-04-29T22:15:43Z
Jameson Rollins <jameson.rollins@ligo.org>

I think only the following line needs to be removed: https://git.ligo.org/cds/guardian/blob/master/lib/guardctrl/util.py#L68

Milestone: 1.4 release

https://git.ligo.org/cds/software/guardian/-/issues/35
Unclear why manager node is not OK
2020-01-27T20:46:04Z
Thomas Shaffer

It would be great if the reason a manager node is not OK were made clearer. [Issue28](https://git.ligo.org/cds/guardian/issues/28) could be a good solution, but a notification could work as well. To be more specific, a notification when the manager is in its nominal state but its subordinates are not.
This came up today as we were testing going to Observation without the squeezer. We changed the nominal state of SQZ_MANAGER, but forgot about the SQZ manager's subordinates. The SQZ_MANAGER node was not OK, but gave no obvious reason why (an old MEDM screen didn't help, since it was missing one of the SQZ nodes). Eventually all of the subordinate nodes received their new nominal state, and all was well.

Milestone: 1.4 release

https://git.ligo.org/cds/software/guardian/-/issues/9
new request during same state redirect results in original state not being executed to completion
2020-01-27T18:45:13Z
Jameson Rollins <jameson.rollins@ligo.org>

Milestone: 1.4 release

https://git.ligo.org/cds/software/guardian/-/issues/37
guardian DAQ INI file needs to use new external-edcu name/format
2020-01-27T18:32:49Z
David Barker

Guardian needs to create the new style DAQ INI file. Its name should be [L,H]1EPICS_GRD.ini and its content the same as the old EDCU_GRD.ini minus the header block.

Milestone: 1.4 release
Assignee: Jameson Rollins <jameson.rollins@ligo.org>

https://git.ligo.org/cds/software/guardian/-/issues/18
remove same-state redirect?
2019-02-15T19:16:28Z
Jameson Rollins <jameson.rollins@ligo.org>

This behavior seems a bit counter-intuitive to people, and can sometimes produce confusing results (see #9). Should it just be removed altogether? What would be the impact?

https://git.ligo.org/cds/software/guardian/-/issues/10
manager.Node "completed" property
2019-02-04T02:20:51Z
Jameson Rollins <jameson.rollins@ligo.org>

It would be useful if the Node and NodeManager objects included a "completed" property that was the conjunction of .arrived and .done. Should be straightforward to implement:
```python
class Node:
    @property
    def completed(self):
        return self.arrived and self.done

class NodeManager:
    @property
    def completed(self):
        for node in self:
            if not node.completed:
                return False
        return True
```

https://git.ligo.org/cds/software/guardian/-/issues/21
User Code Archiving
2019-01-11T18:50:32Z
Rana Adhikari

https://cds.docs.ligo.org/guardian/archive.html
When I reload ISC_LOCK, it says "code changes detected and committed".
Where are they, and **how** can I diff them?

https://git.ligo.org/cds/software/guardian/-/issues/1
GuardSystem incompatible with networkx 2.0
2018-06-18T23:25:53Z
Duncan Macleod <duncan.macleod@ligo.org>

Trying to generate a graph with `guardutil==1.0.4` and `networkx==2.0` generates the following error:
```
$ SITE=llo IFO=L1 guardutil graph ISC_LOCK
Traceback (most recent call last):
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/Users/duncan/Library/Python/2.7/lib/python/site-packages/guardutil/__main__.py", line 541, in <module>
args.func(args)
File "/Users/duncan/Library/Python/2.7/lib/python/site-packages/guardutil/__main__.py", line 74, in graph
system = cli.init_system(args, load=True)
File "/Users/duncan/Library/Python/2.7/lib/python/site-packages/guardian/cli.py", line 49, in init_system
sys.load()
File "/Users/duncan/Library/Python/2.7/lib/python/site-packages/guardian/system.py", line 455, in load
self.add_state(key, obj)
File "/Users/duncan/Library/Python/2.7/lib/python/site-packages/guardian/system.py", line 330, in add_state
if index in self.indices and name != self.index(index):
File "/Users/duncan/Library/Python/2.7/lib/python/site-packages/guardian/system.py", line 730, in indices
return [data['index'] for state, data in self._graph.nodes_iter(data=True)]
AttributeError: 'DiGraph' object has no attribute 'nodes_iter'
```
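
The `nodes_iter` method was removed in networkx 2.0, while `nodes(data=True)` is available in both 1.x and 2.x. A version-tolerant lookup could be sketched as follows; the helper names `iter_node_data` and `indices` are invented for this example, and this is not presented as the change made in guardian.

```python
def iter_node_data(graph):
    """Return an iterator over (node, data) pairs on networkx 1.x or 2.x."""
    # networkx 1.x offered nodes_iter(); 2.0 removed it in favour of nodes().
    nodes_iter = getattr(graph, "nodes_iter", None)
    if nodes_iter is not None:
        return iter(nodes_iter(data=True))
    return iter(graph.nodes(data=True))


def indices(graph):
    """Collect the 'index' attribute of every node, as the failing line does."""
    return [data["index"] for _state, data in iter_node_data(graph)]
```

Calling `graph.nodes(data=True)` unconditionally would also work on both versions; the `getattr` fallback only avoids building a full list on 1.x.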
I am able to manually `pip install "networkx<2"`, so this isn't a critical problem yet, but the macports version is 2.0, so this will likely start to hit other users, presuming anybody else tries to use `guardutil` on their laptop.

https://git.ligo.org/cds/software/guardian/-/issues/4
GitPython leaks system resources
2018-03-20T23:59:56Z
Jameson Rollins <jameson.rollins@ligo.org>

I just noticed this:
https://github.com/gitpython-developers/GitPython#limitations
This is directly relevant to our usage of python-git for the system git archives in the long-running daemon processes. This should be audited to make sure we're not leaking memory.

https://git.ligo.org/cds/software/guardian/-/issues/3
cdsutils import fail from node code
2018-02-13T03:00:46Z
Lee McCuller

With an
`import cdsutils`
at the start of my Guardian state code on 1.0.5, I'm getting an import error:
```
[controlsP50 guardian]$guardian SQZ_OPO
Traceback (most recent call last):
File "/bin/guardian", line 9, in <module>
load_entry_point('guardian==1.0.5', 'console_scripts', 'guardian')()
File "/usr/lib/python2.7/site-packages/guardian/__main__.py", line 130, in main
system = cli.init_system(args, load=True)
File "/usr/lib/python2.7/site-packages/guardian/cli.py", line 50, in init_system
sys.load()
File "/usr/lib/python2.7/site-packages/guardian/system.py", line 400, in load
module = self._load_module()
File "/usr/lib/python2.7/site-packages/guardian/system.py", line 288, in _load_module
self._module = self._import(self._modname)
File "/usr/lib/python2.7/site-packages/guardian/system.py", line 162, in _import
module = _builtin__import__(name, globals, locals, fromlist, level=level)
File "/opt/rtcds/userapps/cds_user_apps/trunk/sqz/m1/guardian/SQZ_OPO.py", line 9, in <module>
import cdsutils
File "/usr/lib/python2.7/site-packages/guardian/system.py", line 162, in _import
module = _builtin__import__(name, globals, locals, fromlist, level=level)
File "/usr/lib/python2.7/site-packages/cdsutils/__init__.py", line 1, in <module>
from _version import __version__
File "/usr/lib/python2.7/site-packages/guardian/system.py", line 162, in _import
module = _builtin__import__(name, globals, locals, fromlist, level=level)
ImportError: No module named _version
```
Other imports do not seem to trigger it. If I modify the `__init__.py` file of cdsutils to read
`from ._version import __version__`
instead of
`from _version import __version__`
the import error is fixed (actually just moves to another import line with a similar error). Looks like the monkeypatched import code is not distinguishing the two kinds of imports in py2.7 quite right.
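
For reference, the explicit relative form (`from ._version import ...`) is resolved against the enclosing package and works on both Python 2.7 and Python 3, whereas the implicit form (`from _version import ...`) only works on Python 2 without `absolute_import`. The sketch below builds a throwaway package on disk to show the explicit form importing cleanly. The package name `demo_pkg` is made up, and the sketch does not reproduce guardian's patched import hook.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk (the name demo_pkg is hypothetical).
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)

with open(os.path.join(pkg_dir, "_version.py"), "w") as f:
    f.write("__version__ = '0.0.1'\n")

# Explicit relative import: resolved against the enclosing package.
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("from ._version import __version__\n")

# Make the temporary directory importable, then import the package.
sys.path.insert(0, root)
demo_pkg = importlib.import_module("demo_pkg")
print(demo_pkg.__version__)
```
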