Skip to content
Snippets Groups Projects

CI development

Merged Aaron Viets requested to merge aaron-viets/gstlal-calibration:CI-development into main
1 unresolved thread

This MR will include many useful unit tests in the continuous integration. This long overdue upgrade will save a lot of manual effort in making plots for reviews, reduce the occurrence of bugs in the code, and make bugs easier to catch right away. Here are the tests that are now included:

  • Run 3 versions of the calibration pipeline, and check that the right number of calibrated frames are produced
  • Check the ASDs of GDS-CALIB_STRAIN and GDS-CALIB_STRAIN_CLEAN, and compare them to a "standard", nominally correct ASD. Plots are included in the CI job artifacts.
  • Check the TDCFs, and compare them to nominally correct TDCFs. Plots included.
  • Check the calibration state vector, and compare each bit to a nominally correct state vector. Plots included.
  • Measure ratios of h(t) / Pcal and h(t) / Actuation injections, and compare to a nominally correct version. Time-series plots included.
  • Use FrDiff to compare GDS-CALIB_STRAIN_CLEAN in frames produced by two otherwise identical pipelines with different start times. They are required to be identical.
  • Measure the latency in a simulated low-latency-like configuration. Check that the mean latency is not too large, that there are not frequent or large excursions, and that the latency does not increase steadily with time.

Job artifacts from the most recent commit (at the time of writing) can be seen here, including plots and txt files with the data being plotted.

Getting all stages of the CI to work also required me to address some packaging issues. If I did things correctly, this MR Closes #7 (closed), #12 (closed), #14 (closed), #15 (closed), #26 (closed). I was not able to add @duncanmmacleod as a reviewer, but he created several of these issues.

Merge request reports

Merge request pipeline #653289 failed

Merge request pipeline failed for 8b5144a3

Merged by Aaron VietsAaron Viets 6 months ago (Aug 20, 2024 5:42pm UTC)

Merge details

Pipeline #653676 passed

Pipeline passed for 682cbe80 on main

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • requested review from @madeline-wade and @jameson.rollins

  • I strongly support having good unit tests, so I approve of this MR in principle. However it's a large change that I'm not going to be able to review in detail. I will leave it up to @madeline-wade to sign off on this.

    We don't expect that anything here would change the performance of the running pipeline, do we?

    It seems like some of the changes addressing some of the linked issues could have been in their own MR.

    Edited by Jameson Rollins
  • @jameson.rollins For the most part, you are correct that this should not affect the performance of the pipeline. There are two small changes that involve pipeline code, which were necessary to get the unit tests to pass:

    • There was a pipeline startup transient issue in the line subtraction that I needed to fix. Basically, a lot of noise was added to GDS-CALIB_STRAIN_NOLINES due to the startup behavior of the multiplier element. I made a change that would avoid this startup noise. This only affects the pipeline right after startup.
    • Some real-time updates in the line subtraction were occurring at timestamps that were not precisely controlled. This is not a significant issue for accuracy or quality of subtraction, but it makes it impossible to exactly reproduce calibrated data. Since one of the tests requires producing identical data from pipelines with differing start times, I had to use the GstController to control time exact timestamps of these updates. This would be a more relevant issue for offline calibration.

    The reason I addressed the other issues was because some of them were necessary to get the CI to pass.

  • Madeline Wade
  • 58 parser = OptionParser()
    59
    60 parser.add_option("--gps-start-time", metavar = "seconds", type = int, default = 1370674240, help = "GPS time at which to start processing data")
    61 parser.add_option("--gps-end-time", metavar = "seconds", type = int, default = 1370678400, help = "GPS time at which to stop processing data")
    62 parser.add_option("--ifo", metavar = "name", default = "H1", help = "Name of the interferometer (IFO), e.g., H1, L1")
    50 from tests.tests_pytest.plot import plot_deltal_over_inj_timeseries
    63 51
    64 options, filenames = parser.parse_args()
    65 52
    66 ifo = options.ifo
    67
    68
    69 #ifo = 'H1'
    53 gps_start_time = 1404304248
    54 gps_end_time = 1404308216
    55 ifo = 'H1'
    • Should there be an equivalent L1 test? (Or perhaps there is and I haven't gotten to it yet?)

    • @madeline-wade I limited the CI to use only one data set for 2 reasons:

      1. I would like to keep the number of raw frames in the repo to a minimum, to save on storage.
      2. The CI currently takes ~90 minutes to finish, so I did not want to increase that.

      I tried to choose a relatively brief time period that was also "interesting," so that many things can be tested with one data set. The time at LHO that I chose is about 4000s in duration, and has these "interesting" features:

      • Transition from lock loss to lock (needed for state vector test)
      • Observation ready occurs quite soon after lock acquisition.
      • Rapid thermalization (needed for pcal-to-DARM test, etc.)
      • Multiple TDCFs are not equal to their nominal values, making it harder to pass any tests by chance without actually doing everything right.
    • @aaron-viets 90 minutes is really long for a CI job. maybe too long. Remember, this job will run every time anyone pushes anything to this repo, or any fork of it, so be very weary of the potential resources used.

      Why does it need to be that long? Can you trim it down?

    • @jameson.rollins I managed to trim it down some by reducing the amount of data that gets calibrated. The current version takes a little over an hour. I could maybe reduce this more by further reducing pipeline startup time. The artifacts from the most recent job are here.

      The reason it takes long is that many of the common types of bugs we want to catch with the unit tests require calibrating at least ~30 minutes of data, partly due to the medians and averages in the TDCFs and the line subtraction. We also are testing latency, and many hard-to-catch but common bugs involve a gradually increasing latency, which can be detected by calibrating a stretch of data like this. We also want to see how the time-dependent model handles thermalization and how the state vector handles transitions between lock-loss and lock. So, I had to find a single ~30-minute stretch of data with just the right features. Finally, in order to test the reproducibility of the strain data and the cleaned data, I needed enough data to let one pipeline start a little later than the other and still allow for running medians and averages to settle. Having more than a few minutes of data in the plots (CI artifacts) is also helpful in diagnosing problems when the CI fails.

    • @aaron-viets this all sounds great, and I don't think you should sacrifice the test coverage just to make it shorter. But I still think we don't want really long running tests in gitlab CI. So my recommendation is to keep all the tests, so that developers can run them when they're developing code, but just don't hook them into the CI. Is there some smaller set of sanity tests that we could hook into the CI instead?

    • @jameson.rollins @madeline-wade I changed it so that the CI tests must be run manually for ordinary pushes, but are automatically run only when a branch is merged. That way, it only runs the computationally-expensive tests occasionally. The most recent job artifacts can be seen here (they all look the same as before). I ran this one manually just for demonstration.

      Edited by Aaron Viets
    • When @jameson.rollins is okay with the structure and time of the unit tests I am okay to have this merged.

    • having the CI only run when an MR is merged into the main branch sacrifices the main purpose of the CI to catch issues before we merge, but I guess having it at all is better than not. we just need to be extra vigilant to check and make sure the CI passed after we merged, or we'll leave the main branch in a broken state. i don't know if there's a way to configure the job to run only right before we merge...

    • @jameson.rollins Right, that is what I ideally wanted to do. However, we can manually run the CI on any commit at any time on gitlab (MUCH easier that manually running tests used to be). Here is an example of where to run it manually if needed. Here is an example of a job that I ran manually in this way.

      I thought about using rules to make jobs run only if the diff includes files that affect the pipeline, but that seemed complicated, and it would have taken me a while to figure out how to do it. This solution is simple.

    • this discussion on mattermost lead to this link from the lalsuite folks that have set up the same thing (CI on merge only):

      https://git.ligo.org/lscsoft/lalsuite/-/blob/ff17a18ad2c408f188c6f455330103f92db0174d/.gitlab/ci/rules.yml#L49

    • @aaron-viets how do you run the test manually from gitlab?

    • @jameson.rollins See the links in my comment above. The first link takes you to the test stage of the CI pipeline. There is a button that says "Run job."

    • ah ok, I guess I don't have access to trigger it, but that's fine. I was not aware of this feature.

      I approved this MR, so feel free to merge when @madeline-wade approves.

    • Please register or sign in to reply
  • Aaron Viets added 1 commit

    added 1 commit

    • 09a5f482 - CI calibrate less data to finish faster, and make it easier to modify amount of data calibrated

    Compare with previous version

  • Aaron Viets added 1 commit

    added 1 commit

    • 73776549 - moved gps_start_time into functions

    Compare with previous version

  • Aaron Viets added 1 commit

    added 1 commit

    • ed7bea62 - adjust error threshold for CI to pass

    Compare with previous version

  • Aaron Viets added 1 commit

    added 1 commit

    • 986e7332 - Make unit tests manual except for merges

    Compare with previous version

  • Aaron Viets added 1 commit

    added 1 commit

    • 8b5144a3 - CI allow failure in case of manual job not running

    Compare with previous version

  • Jameson Rollins approved this merge request

    approved this merge request

  • I'll approve this for now and we can work on tweaking the CI later.

  • merged

  • Aaron Viets mentioned in commit 682cbe80

    mentioned in commit 682cbe80

  • Aaron Viets mentioned in merge request !2 (merged)

    mentioned in merge request !2 (merged)

  • Please register or sign in to reply
    Loading