Essential steps to successfully set up and run the pygwb_pipe
=============================================================

For the purposes of this tutorial, we will focus on setting up an environment and running the pipeline on the CIT cluster. This allows us to avoid version issues across different machines and, most importantly, to make use of the CIT computing resources and run `pygwb_pipe` jobs in parallel.
**DISCLAIMER**

Please refrain from modifying the code locally and/or hardcoding things in; if you run into issues with the code, please open an issue through [this page](https://git.ligo.org/pygwb/pygwb/-/issues), or get in touch with the team!
## 1. **Log into CIT**

This may be done either through the [JupyterLab portal](https://jupyter.ligo.caltech.edu/login), or through a terminal window by running

```
ssh {name.surname}@ldas-pcdev2.ligo.caltech.edu
```
Please substitute `{name.surname}` with your LIGO login credentials. Note that you may choose any other login node; the one used here is `pcdev2`. You will be prompted to enter your password: this is the regular password associated with your LIGO credentials.
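For example, with the hypothetical credentials `albert.einstein`, logging into the `pcdev1` node instead would read

```
ssh albert.einstein@ldas-pcdev1.ligo.caltech.edu
```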
## 2. **Create a dedicated `conda` environment for your `pygwb` installation**

To avoid issues such as conflicting packages, it is best to create a fresh conda environment in which to install `pygwb`. To do this, run

```
conda create --name {pygwb_environment} python=3.9
conda activate {pygwb_environment}
```
Please substitute `{pygwb_environment}` with your desired environment name -- ideally something you will recognise in the future. Note that we have specified the `python` version for the new conda environment to be `3.9`. `pygwb` is only maintained for `python` versions `>=3.8`; if you try to install `pygwb` on an older version, you will get an error.
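As a concrete example (with a hypothetical environment name, `pygwb_env`), the two commands above would read

```
conda create --name pygwb_env python=3.9
conda activate pygwb_env
python --version   # should report Python 3.9.x
```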
## 3. **Clone and install pygwb**

We are now ready to download and install `pygwb`. Navigate to a convenient folder in your home directory, and then run

```
git clone git@git.ligo.org:pygwb/pygwb.git {pygwb_main_folder}
cd {pygwb_main_folder}
```
substituting `{pygwb_main_folder}` with the name you wish to give your main `pygwb` folder. If you do not specify a name, the default will be the name of the repository, i.e. `pygwb`. This may not be what you want, for example if you have multiple installations of the package (which, by the way, is *not* encouraged). Now run

```
pip install .
```
to install the `pygwb` version present in that specific folder. This copies the necessary files over to your `conda` environment's package directory, so you will not need to run `pygwb` directly from that folder.
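As a quick sanity check (not part of the official instructions), you can confirm that the installation is visible from your new environment:

```
pip show pygwb            # prints the installed version and install location
python -c "import pygwb"  # should complete without errors
```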
## 4. **Testing the pipeline**

You can view the pipeline run options by executing

```
pygwb_pipe --help
```
To test the pipeline, simply run a command like

```
pygwb_pipe --param_file {path_to_param_file}
```
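For instance, from inside the `{pygwb_main_folder}` you cloned in step 3, you can point the pipeline at the example parameter file shipped with the repository:

```
pygwb_pipe --param_file pygwb_pipe/parameters.ini
```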
When running on the file `pygwb_pipe/parameters.ini` in the repo, one should get the following as final result

```
2022-04-19 18:55:13.330 | SUCCESS | __main__:main:124 - Ran stochastic search over times 1247644138-1247645038
2022-04-19 18:55:13.330 | SUCCESS | __main__:main:127 - POINT ESIMATE: -2.966117e-06
2022-04-19 18:55:13.330 | SUCCESS | __main__:main:128 - SIGMA: 1.229361e-06
```
## 5. **Writing and submitting a `dag` file**

We are now ready to condorise the pipeline and run a batch of jobs, each just like the one run in step 4. For the purposes of this example, we'll run on mock data, using the `local` data option available in the package. Let's take this in steps:
* *writing the `dag` file*

To start, let's copy the `DAG` folder to the location from which you want to launch your jobs. I suggest staying out of the `pygwb` installation folder and creating a dedicated `pygwb_run` folder somewhere completely different (this could even be in your `public_html` folder!). Once you have navigated to the folder you want to start from, run

```
cp -r {path-to-pygwb_main_folder}/pygwb_pipe/DAG/* .
```
You should now see some new files and folders in your run folder. Amongst these is a handy script to prepare a `dag` file for the mock data analysis submission. To see how to use it, run

```
./make_DAG_pygwb_pipe -h
```
As you can see, it expects the following arguments:

```
--subfile SUBFILE         Submission file.
--data_path DATA_PATH     Path to data files folder.
--parentdir PARENTDIR     Starting folder.
--param_file PARAM_FILE   Path to parameters.ini file.
--dag_name DAG_NAME       Dag file name.
```
not all of which are necessary. For a basic condor run, you can use the following recipe to compile your `dag`:

```
./make_DAG_pygwb_pipe --subfile {full-path-to-your-run-dir}/condor/Simulated_Data_New_Pipeline.sub --data_path /home/arianna.renzini/PROJECTS/SMDC_2021/100_day_test_pygwb/MDC_Generation_2/output/ --param_file {full-path-to-your-installation-dir}/pygwb_pipe/parameters_mock.ini
```
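For reference, a fully spelled-out (hypothetical) invocation in the placeholder style of this tutorial, including the optional `--parentdir` and `--dag_name` arguments listed above, would look something like

```
./make_DAG_pygwb_pipe \
    --subfile {full-path-to-your-run-dir}/condor/Simulated_Data_New_Pipeline.sub \
    --data_path {full-path-to-your-data-folder} \
    --parentdir {full-path-to-your-run-dir} \
    --param_file {full-path-to-your-installation-dir}/pygwb_pipe/parameters_mock.ini \
    --dag_name {your-dag-file.dag}
```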
* *submitting the job*

The `dag` file is now created in the `output` folder. To submit the job, navigate to that folder and run

```
condor_submit_dag {your-dag-file.dag}
```
If you have not specified the `dag` name in the previous step, the current default name is `condor_simulated_100_day_MDC_2.dag`.
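For example, submitting the default `dag` and keeping an eye on its progress (using standard HTCondor tools, which are not part of `pygwb` itself) could look like

```
cd output
condor_submit_dag condor_simulated_100_day_MDC_2.dag
condor_q                                             # check the status of your jobs
tail -f condor_simulated_100_day_MDC_2.dag.dagman.out  # follow the DAGMan log
```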