# GUARDCTRL systemd daemon supervision

Guardian comes with `guardctrl`, an interface for controlling and supervising guardian nodes on a host system. It is a wrapper around [systemd](https://www.freedesktop.org/wiki/Software/systemd/), the init and service supervision system that is standard on all major Linux distributions. systemd handles starting, stopping, and logging of the guardian daemons. `guardctrl` is essentially a convenient wrapper around `systemctl` and `journalctl` that lets you specify nodes by name rather than by their slightly more cumbersome underlying systemd service names (e.g. `guardian@SUS_BS.service`).

Each guardian node is handled by a systemd templated service unit, `guardian@.service`, which describes how the processes should be supervised.
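
As a rough illustration of that name mapping, the sketch below converts a node name into its templated unit name and wraps the plain systemd commands one would otherwise type. The helpers `node_to_unit`, `node_status`, and `node_log` are hypothetical and not part of `guardctrl`; the `--user` invocations assume the commands are run as the supervising user.

```shell
# Hypothetical helper (not part of guardctrl): map a guardian node name
# to the name of its instance of the guardian@.service template.
node_to_unit() {
    printf 'guardian@%s.service\n' "$1"
}

# Hypothetical helper: show the status of a node via plain systemctl.
node_status() {
    systemctl --user status "$(node_to_unit "$1")"
}

# Hypothetical helper: follow a node's log via plain journalctl.
node_log() {
    journalctl --user-unit="$(node_to_unit "$1")" --follow
}
```

For example, `node_to_unit SUS_BS` prints `guardian@SUS_BS.service`.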
## host setup

### install

The `guardctrl` package is available through the [LIGO Debian apt archives](http://apt.ligo-wa.caltech.edu/debian/), so once that archive is enabled the package can be installed directly:
```shell
# apt install guardctrl
```
The `guardctrl` package depends on the `guardian` package, so you'll automatically get them both. `guardctrl` installs the command line interface, as well as all the needed systemd service unit files.

### create guardctrl user

`guardctrl` expects to use the `systemd --user` instance of the invoking user. This means that `guardctrl` should always be invoked as the same user so that processes are managed in a unified way. The `guardctrl` interface knows that it's running as the correct user by the presence of a `~/.guardctrl-home` file. If this file is not present, `guardctrl` will assume it's running remotely and will try to ssh to `GUARDCTRL_USER@GUARDCTRL_HOST` to issue the command.

For the LIGO site installations we want to run everything under a `guardian` user, so we create that user as follows:
```shell
# adduser --gecos '' --uid 1010 --ingroup controls --disabled-password guardian
```
We use `uid=1010` to avoid colliding with any of the other standard system users, and we add the user to the `controls` group. (NOTE: For a site setup where guardctrl will be accessed through the passwordless SSH interface (see below), the guardian user should not have a password. Otherwise the guardian user can have a password as usual.)

Once we've got the user that will handle supervision, we touch the `~/.guardctrl-home` file in the user's home directory:
```shell
# su guardian -c "touch ~/.guardctrl-home"
```
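
The local-versus-remote decision that this marker file drives can be sketched as follows. This is only an illustration of the documented behaviour, not the actual `guardctrl` implementation: the function name `guardctrl_mode` and its optional home-directory argument are made up for the example.

```shell
# Hypothetical sketch (not guardctrl's real code): decide whether commands
# would be handled locally or forwarded over ssh, based on the marker file.
# $1: home directory to inspect (defaults to $HOME).
guardctrl_mode() {
    home_dir="${1:-$HOME}"
    if [ -e "$home_dir/.guardctrl-home" ]; then
        echo "local"
    else
        # Per the text, guardctrl would ssh to GUARDCTRL_USER@GUARDCTRL_HOST here.
        echo "remote"
    fi
}
```

Run as the `guardian` user after the `touch` step, `guardctrl_mode` prints `local`; on an account without the marker file it prints `remote`.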
### configure user systemd

We need to inform the system-wide systemd that the guardctrl user is "persistent", so that its `systemd --user` process won't be shut down when the user is not logged in. We do this with `loginctl enable-linger`. If we're running under the `guardian` user, the command is:
```shell
# loginctl enable-linger guardian
```
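
To confirm that lingering took effect, one option is to query logind. The sketch below wraps `loginctl show-user` in two hypothetical helpers, `linger_enabled` and `report_linger` (neither is part of `guardctrl`), and treats any failure of the query as lingering being disabled.

```shell
# Hypothetical helper: succeed only if logind reports Linger=yes for user $1.
linger_enabled() {
    [ "$(loginctl show-user "$1" --property=Linger 2>/dev/null)" = "Linger=yes" ]
}

# Hypothetical helper: print a one-line summary for user $1.
report_linger() {
    if linger_enabled "$1"; then
        echo "linger enabled for $1"
    else
        echo "linger NOT enabled for $1"
    fi
}

# Example, for the guardian user created earlier:
#   report_linger guardian
```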
Because we want caRepeater to be running system-wide before any of the guardian nodes start, we declare a dependency of the guardian user's systemd instance on the caRepeater service. This is done by dropping the following conf file into the `user@1010.service.d` directory:
```
$ cat /etc/systemd/system/user\@1010.service.d/ca.conf
[Unit]
Wants=caRepeater.service
After=caRepeater.service
```
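
One way to put that drop-in in place is sketched below. The helper name `install_ca_dropin` and its optional root-prefix argument (handy for trying it out in a scratch directory) are assumptions made for this example; the file contents are the ones shown above. After writing the real file as root, `systemctl daemon-reload` makes systemd pick up the change.

```shell
# Hypothetical helper: write the caRepeater drop-in for the uid-1010 user
# manager. $1 is an optional root prefix (empty means the real filesystem).
install_ca_dropin() {
    dropin_dir="${1:-}/etc/systemd/system/user@1010.service.d"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/ca.conf" <<'EOF'
[Unit]
Wants=caRepeater.service
After=caRepeater.service
EOF
}

# Typical use, as root, on the guardctrl host:
#   install_ca_dropin && systemctl daemon-reload
```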
### configure journald for persistent logs

For the LIGO sites, we want to store logs from all guardian processes in perpetuity. To do this, the journald system logger needs to be configured to store all logs indefinitely. This is done by setting `Storage=persistent` in `/etc/systemd/journald.conf`:
... | ... | @@ -45,7 +55,7 @@ Storage=persistent |
# systemctl force-reload systemd-journald
```
### specify local environment

The `guardian@.service` unit expects an `/etc/guardian/local-env` environment file to exist, which provides any needed environment variables to the supervised guardian processes. Here's an example of the file for H1 at LHO:
```shell
... | ... | @@ -60,7 +70,7 @@ NDSSERVER=h1nds0:8088,h1nds1:8088 |
### passwordless SSH interface

The best way to allow remote control of guardctrl is via ssh.
For a site install on a protected network, where you want to allow "remote" users (i.e. users on the same network but on hosts other than the guardctrl host) to control the nodes without entering a password, you can set up a passwordless ssh "ForceCommand" for the guardctrl user.

First, we need to modify the system PAM stack to allow passwordless login via ssh. Usually PAM is configured to not allow passwordless login on anything except for special TTYs. To loosen that restriction on Debian systems, we modify `/etc/pam.d/common-auth` to change the following line:
```
... | ... | @@ -71,7 +81,7 @@ to: |
auth [success=1 default=ignore] pam_unix.so nullok
```
We then add a special "Match" stanza to the sshd_config for the guardctrl user, which specifies that it may log in without a password but is forced to execute only a single command (`guardctrl`). On most systems this would go in `/etc/ssh/sshd_config`:
```/etc/ssh/sshd_config
Match User guardian
PermitEmptyPasswords yes
... | ... | |