... | ... | @@ -2,7 +2,7 @@ |
|
|
|
|
|
Guardian comes with `guardctrl`, a tool for controlling and managing guardian node processes. It is a wrapper around [systemd](https://www.freedesktop.org/wiki/Software/systemd/), the init and service supervision system standard on all major Linux distributions. systemd handles stopping, starting, and logging of the guardian daemons. `guardctrl` is essentially a convenience layer over `systemctl` and `journalctl` that lets you specify nodes by name rather than by their underlying systemd service names, which are more cumbersome to work with.
|
|
|
|
|
|
Each guardian node is handled by a systemd templated service unit, `guardian@.service`, which describes how the processes should be supervised.
|
|
|
Each guardian process is handled by a systemd templated service unit, `guardian@.service`, which describes how the processes should be supervised.
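
As a rough illustration of the name-to-unit mapping described above, the sketch below prints the `systemctl` and `journalctl` invocations that correspond to a node name. The helper functions and the node name `SUS_ETMX` are hypothetical, chosen for illustration only; they are not part of `guardctrl` itself.

```shell
#!/bin/sh
# Hypothetical helper: map a guardian node name to its systemd unit name,
# following the guardian@.service template.
node_unit() {
    printf 'guardian@%s.service\n' "$1"
}

# Print (rather than run) the systemd commands that address this node directly.
show_commands() {
    unit=$(node_unit "$1")
    echo "systemctl --user status $unit"
    echo "journalctl --user-unit $unit --follow"
}

# SUS_ETMX is an assumed example node name
show_commands SUS_ETMX
```

Run as the supervising user, the printed commands act on that user's `systemd --user` instance.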
|
|
|
|
|
|
## host setup
|
|
|
|
... | ... | @@ -21,28 +21,41 @@ Once that archive is enabled the package can be installed directly: |
|
|
```
|
|
|
The `guardctrl` package depends on the `guardian` package, so you'll automatically get them both. `guardctrl` will install the command line interface, as well as all the needed systemd service unit files.
|
|
|
|
|
|
### create guardctrl user
|
|
|
### create and configure guardctrl user
|
|
|
|
|
|
`guardctrl` expects to be using the `systemd --user` instance of the invoking user. This means that `guardctrl` should always be invoked as the same user so that processes are managed in a unified way. The `guardctrl` interface knows that it's running as the correct user by the presence of a `~/.guardctrl-home` file. If this file is not present, guardctrl will assume it's running remotely and will try to ssh to GUARDCTRL_USER@GUARDCTRL_HOST to issue the command.
|
|
|
`guardctrl` expects to be using the `systemd --user` instance of the invoking user. This means that `guardctrl` should always be invoked as the same user so that processes are managed in a unified way. The `guardctrl` interface knows it's running as the correct user by the presence of the `~/.guardctrl-home` file. If this file is not present, guardctrl will assume it's running remotely and will try to ssh to GUARDCTRL_USER@GUARDCTRL_HOST to issue the command.
|
|
|
|
|
|
For the LIGO site installations we want to run everything under a `guardian` user, and therefore create that user thusly:
|
|
|
For the LIGO site installations we want to run everything under the `guardian` user. We therefore start by creating the `guardian` user account on the machine:
|
|
|
```shell
|
|
|
# adduser --gecos '' --uid 1010 --ingroup controls --disabled-password guardian
|
|
|
```
|
|
|
We use `uid=1010` to avoid colliding with any of the other standard system users, and we add the user to the `controls` group. (NOTE: For a site setup where guardctrl will be accessed through a passwordless SSH interface, the guardian user should not have a password. Otherwise the guardian user can have a password as usual.)
|
|
|
|
|
|
Once we've got the user that will handle supervision, we touch the `~/.guardctrl-home` file in the user's home directory:
|
|
|
Once we've got the user that will handle supervision, we touch the `~/.guardctrl-home` file (as the `guardian` user) in the user's home directory:
|
|
|
```shell
|
|
|
# su guardian -c "touch ~/.guardctrl-home"
|
|
|
$ -c "touch ~/.guardctrl-home"
|
|
|
```
|
|
|
|
|
|
### configure user systemd
|
|
|
Finally, create and enable a `guardian.target` unit for auto-starting nodes on startup:
|
|
|
```shell
|
|
|
$ cat > ~/.config/systemd/user/guardian.target
|
|
|
[Unit]
|
|
|
Description=Advanced LIGO Guardian target
|
|
|
|
|
|
[Install]
|
|
|
WantedBy=default.target
|
|
|
$ systemctl --user daemon-reload
|
|
|
$ systemctl --user enable guardian.target
|
|
|
```
|
|
|
(This can probably just be (and should be) provided as part of the `guardctrl` package.)
|
|
|
|
|
|
We need to inform the system systemd that the `guardctrl` user is "persistent", so that its `systemd --user` process won't be shut down if the user is not logged in. We do this with `loginctl enable-linger`. So if we're running under the `guardian` user the command is:
|
|
|
### user systemd persistence
|
|
|
|
|
|
Once the desired guardian user account is ready, we need to inform the system systemd instance that the user is "persistent", so that its `systemd --user` process won't be shut down when the user is not logged in. We do this with `loginctl enable-linger`. So if we intend to run under the `guardian` user, the correct command is:
|
|
|
```shell
|
|
|
# loginctl enable-linger guardian
|
|
|
```
|
|
|
In addition, you'll need to extend the startup timeout for this user, since starting all the guardian processes at boot takes a while. 10 minutes should be enough, but this can be adjusted. We handle this with a "drop-in" for the guardctrl user service:
|
|
|
In addition, you'll need to extend the startup timeout for this user, since starting all the guardian processes at boot takes a while. 10 minutes should be enough, but this can be adjusted. We handle this with a system-level "drop-in" for the relevant user's service:
|
|
|
```
|
|
|
# mkdir /etc/systemd/system/user\@1010.service.d
|
|
|
# cat > /etc/systemd/system/user\@1010.service.d/timeout.conf
|
... | ... | @@ -50,6 +63,8 @@ In addition, you'll need to extend the startup timeout for this user, since star |
|
|
TimeoutStartSec=10min
|
|
|
```
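
To confirm that lingering and the timeout drop-in have taken effect, a check along the following lines may help. This is a minimal sketch: the function name `check_guardian_host` is hypothetical, and the defaults assume the `guardian` user and `uid=1010` used in this guide. A `systemctl daemon-reload` may be needed before the drop-in is picked up.

```shell
#!/bin/sh
# Sketch: report whether lingering and the start-timeout drop-in are in effect.
# check_guardian_host is a hypothetical helper; user and uid default to the
# values used in this guide.
check_guardian_host() {
    user=${1:-guardian}
    uid=${2:-1010}
    # Should report Linger=yes once `loginctl enable-linger` has been applied
    loginctl show-user "$user" --property=Linger
    # Reports the effective start timeout of the per-user service manager
    systemctl show "user@${uid}.service" --property=TimeoutStartUSec
}

# Usage: check_guardian_host guardian 1010
```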
|
|
|
|
|
|
### caRepeater service
|
|
|
|
|
|
Because we want caRepeater to be running system-wide before starting any of the guardian nodes, we declare a dependency of the guardian user's service on the caRepeater service. This is also done with a drop-in:
|
|
|
```
|
|
|
# cat > /etc/systemd/system/user\@1010.service.d/ca.conf
|
... | ... | @@ -57,13 +72,35 @@ Because we want caRepeater to be running system-wide before starting any of the |
|
|
Wants=caRepeater.service
|
|
|
After=caRepeater.service
|
|
|
```
|
|
|
NOTE: the above assumes the existence of a `caRepeater.service`, so verify that it's there already, or create it if it doesn't exist:
|
|
|
```shell
|
|
|
# systemctl cat caRepeater.service
|
|
|
# /etc/systemd/system/caRepeater.service
|
|
|
[Unit]
|
|
|
Description=EPICS caRepeater
|
|
|
#Requires=caRepeater.socket
|
|
|
Wants=network-online.target
|
|
|
After=network-online.target
|
|
|
|
|
|
[Service]
|
|
|
ExecStart=/usr/bin/caRepeater
|
|
|
User=nobody
|
|
|
|
|
|
[Install]
|
|
|
WantedBy=multi-user.target
|
|
|
#
|
|
|
```
|
|
|
|
|
|
### configure journald for persistent logs
|
|
|
|
|
|
For the LIGO sites, we want to store logs from all guardian processes in perpetuity. To do this, the journald system logger needs to be configured to store all logs indefinitely. This is done by setting `Storage=persistent` in `/etc/systemd/journald.conf`:
|
|
|
For the LIGO sites, we want to store logs from all guardian processes in perpetuity. To do this, the journald system logger needs to be configured to store all logs indefinitely. This is done by setting `Storage=persistent` in `/etc/systemd/journald.conf` (included below are some other variables for increasing the log rate limit, and for increasing the disk storage limits for the logs):
|
|
|
```
|
|
|
[Journal]
|
|
|
Storage=persistent
|
|
|
RateLimitBurst=100000
|
|
|
SystemMaxUse=200G
|
|
|
SystemMaxFiles=100000
|
|
|
|
|
|
```
|
|
|
```shell
|
|
|
# systemctl force-reload systemd-journald
|
... | ... | @@ -108,7 +145,7 @@ After adding the Match stanza, reload sshd: |
|
|
```shell
|
|
|
# systemctl force-reload sshd
|
|
|
```
|
|
|
If for some reason you need to pass special environment variables to `guardctrl`, you can point the ForceCommand to someething like `/etc/guardian/guardctrl-ssh-bridge` which can be a shell script that sets the needed environment and then execs `/usr/bin/guardctrl` (without arguments). Be sure to set your wrapper script to be executable.
|
|
|
If for some reason you need to pass special environment variables to `guardctrl`, you can point the ForceCommand to something like `/etc/guardian/guardctrl-ssh-bridge` which can be a shell script that sets the needed environment and then execs `/usr/bin/guardctrl` (without arguments). Be sure to set your wrapper script to be executable.
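
A wrapper of the kind described above might look like the following sketch. It writes the script to the current directory for review before installation; `EXAMPLE_SETTING` is a placeholder variable, not something `guardctrl` is known to read, so substitute whatever your site actually needs.

```shell
#!/bin/sh
# Sketch: generate a wrapper script in the current directory for review.
# EXAMPLE_SETTING is a placeholder; replace it with the variables your site needs.
cat > guardctrl-ssh-bridge <<'EOF'
#!/bin/sh
# Set the environment guardctrl should see, then hand over to it.
export EXAMPLE_SETTING=value
exec /usr/bin/guardctrl
EOF

# The wrapper must be executable for ForceCommand to run it
chmod +x guardctrl-ssh-bridge
```

Once reviewed, the file can be copied as root to the path named in the ForceCommand, for example `/etc/guardian/guardctrl-ssh-bridge`.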
|
|
|
|
|
|
#### local guardian user access
|
|
|
|
... | ... | |