This analysis should be performed on the CIT cluster. The exact node shouldn't matter, but the testing was done on `ldas-pcdev6` and it seems like a good choice. Just remember to run all the analyses on the same node to avoid confusing asimov. You can log in with `ssh albert.einstein@ldas-pcdev6.ligo.caltech.edu`, replacing the name with your own credentials.
To grab reviewed and stable versions of common packages, clone the igwn conda environment with:
```
conda create --name mdr --clone igwn-py310
conda activate mdr
```
You can choose a different name, but I will keep using `mdr` in these instructions.
Next, we update the important packages:
`mamba update -c conda-forge bilby bilby_pipe`
For the final two packages it is important to get development versions, so we will clone them with git. I recommend keeping them in the same folder for convenience. In your home directory (you can move into it with `cd ~/`):
```
mkdir mdr_gwtc3
cd mdr_gwtc3
git clone git@git.ligo.org:lscsoft/bilby_tgr.git
git clone git@git.ligo.org:asimov/asimov.git
cd asimov
pip install .
cd ../bilby_tgr
git checkout mdr_review
pip install .
cd ..
```
This will get you the proper package versions.
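To sanity-check that the development branch really got checked out, you can read the branch name back with git. The sketch below demonstrates the command on a throwaway repo (the scratch repo and its branch are just for the demo); on the cluster you would instead run `git -C ~/mdr_gwtc3/bilby_tgr symbolic-ref --short HEAD` and expect `mdr_review`:

```shell
# Self-contained demo: create a scratch repo on a named branch and read
# the branch name back. On the cluster, point git at ~/mdr_gwtc3/bilby_tgr
# instead and expect "mdr_review".
REPO=$(mktemp -d)
git -C "$REPO" init -q -b mdr_review
git -C "$REPO" symbolic-ref --short HEAD
```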
Now, due to cluster changes there have been problems with some packages. The fixes had not been forwarded to the stable versions at the time of writing, so you may need to make 2 manual changes to the code:
```
vi ~/.conda/envs/mdr/lib/python3.10/site-packages/bilby_pipe/utils.py
:919 [enter]
```
This should move you to the line containing `run == "O3"`, which you should change to `run = "O3"`. (If you cannot find it, you can search by typing `/<search string> [enter]`.)
To edit with vim: press `i` to enter insert mode, make your changes, press `[esc]`, then save and exit with `:x` (or close without saving with `:q!`).
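If you prefer not to edit by hand, the same fix can be applied with a `sed` one-liner. The sketch below demonstrates it on a scratch file; on the cluster you would point `FILE` at `~/.conda/envs/mdr/lib/python3.10/site-packages/bilby_pipe/utils.py` (path assumes the env name `mdr`), and it's worth backing the file up first:

```shell
# Demo on a scratch copy; on the cluster, set FILE to the bilby_pipe
# utils.py path instead (and back it up first: cp "$FILE" "$FILE".bak).
FILE=$(mktemp)
echo '        run == "O3"' > "$FILE"
# Turn the comparison into an assignment - the same fix as the manual edit.
sed -i 's/run == "O3"/run = "O3"/' "$FILE"
grep 'run = "O3"' "$FILE"
```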
Before running the ~12 analyses that each of us will do, it would be good to verify that everything is running OK - it took me some time to get rid of all the bugs, so there is a possibility I forgot a step in the setup above.
The `apply_events.py` script applies your set of events to asimov. For the test, run it (`python apply_events.py`) with just one event in the set:
This should populate your project with the selected event. Now run
`asimov manage build` to build config files for your analysis.
**Important:** Before submitting asimov jobs, as below, you have to ensure you have the right credentials. To get them, run:
```
kinit
htgettoken -a vault.ligo.org -i igwn
```
and use your LIGO password. The credentials expire when you log out of the cluster, so you will have to regenerate them next time.
Now, you can submit your job with:
`asimov manage submit`
This first job will be very quick, as it just computes the PSD (it should be done in less than 15 min).
You can check the status of the job with `condor_q`:
1. If you see that the status of this job is idle or running, you have to wait some more.
2. If the status of the job is `held`, it probably needs more resources. Run `condor_q -hold` to see the reason behind the problem. The probable cause is not enough disk or memory. For an individual job, run `condor_qedit jobID RequestMemory 8000` or some other number bigger than the one that caused the problem (`RequestDisk` if the disk is the problem). If you have multiple problematic jobs, I suggest `condor_qedit -constraint 'JobStatus == 5' RequestMemory 8000` to modify all held jobs at once. You then have to release the jobs for them to start again with `condor_release -all`.
3. If the job is not appearing in the queue, then either it finished successfully or an error occurred. Run `asimov monitor` and asimov will check the job completion status and tell you of any errors. It updates information at most every 15 min; if you want to force an earlier update, you have to delete the cache file `.asimov/_cache_jobs.yaml`. If there is an error, you can find the error logs in `working/<eventname>/Prod0/logs/` with the `.err` suffix. Let me know if it happens, as it shouldn't.
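A quick way to pull up the most recent error log is to sort by modification time. The sketch below builds a scratch directory standing in for `working/<eventname>/Prod0/logs/` (the event name used is a placeholder); on the cluster you would run only the final `ls -t ... | head` line against your real logs directory:

```shell
# Scratch layout standing in for working/<eventname>/Prod0/logs/;
# "GW_example" is a placeholder event name for the demo.
LOGS=$(mktemp -d)/working/GW_example/Prod0/logs
mkdir -p "$LOGS"
touch -d '1 hour ago' "$LOGS/older.err"
touch "$LOGS/newest.err"
# Newest .err file first; pipe the result into `less` to read it.
ls -t "$LOGS"/*.err | head -n 1
```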
If `asimov monitor` told you that the jobs finished OK, you can now submit the analysis proper. Run (remember to regenerate your credentials if you haven't done so this session):
`asimov manage build`
`asimov manage submit`
It should inform you that it has submitted 10 proper analyses, which will take around a day to finish. If after ~30 min you don't see active jobs with `condor_q`, some error occurred - let me know (logs are in `working/<eventname>/<productiontype>/log_data_generation/`).
The next day, check the progress with `condor_q` (to check for possible hold reasons) and with `asimov monitor` (this command only works inside your `project` directory). If the analyses are finished, the postprocessing (creation of the webpages) will start. It should be done in a few minutes, and you will have to run `asimov monitor` again to catch it. You can check the webpages at https://ldas-jobs.ligo.caltech.edu/~<albert.einstein>/mdr-gwtc3/. You might need to run `asimov report html` for the updates to catch up.
If the webpages for all analyses are complete, then everything works OK and you can run the full analysis.
## Running full analysis
Essentially you follow the steps above, but now populate multiple events:
Again, ensure you have credentials, then run:
`asimov manage build`
`asimov manage submit`
to compute the PSDs. Then, after some time:
```
asimov monitor
asimov manage build
asimov manage submit
```
to run the analyses proper. You might need to run these multiple times if not all PSD calculations had finished the first time.
After all analyses are launched (i.e. `asimov manage submit` no longer launches anything new), you can switch to running these just once in a while (every day or few):
`condor_q -hold` to monitor if something is held
`asimov monitor` - to check for completion
`asimov report html` - to update webpages when you want
until everything finishes.
If `asimov monitor` tells you there are any problems, let me know and we can try to check what is wrong.
Technically, everything after the `python apply_events.py` step can be replaced with a single `asimov start`, which performs the steps above every 15 min, but condor stops it after ~1 day, so at this point I think just running `asimov monitor` every once in a while is better.