Update 071222. - minutes authored by Rachael Huxford
......@@ -15,7 +15,10 @@ Rachael
Chair / Minute Taker / Focus Session [Rota](https://git.ligo.org/groups/gstlal/-/wikis/Meetings/West-call/Rota-table)
## Action Items for next week
(**Leave action items here for next week.**)
- [ ] Chad: put together notes on table definitions to prepare for incorporating new injection file format.
- [ ] Ron to give us dedicated slots for online analysis on ICDS
- [ ] Cort to make a ticket for offline condor issue involving file transfer. Others to bring it up internally.
- [ ] All to look into scald retention policies for Ed + Jake to decrease Grafana query times
## Agenda / Minutes
......@@ -23,28 +26,112 @@ Char / Minute Taker / Focus Session [Rota](https://git.ligo.org/groups/gstlal/-/
- Please check the rota for next week's call
- Confirmation of next week's focus session
* Last week's call
- Neither Chad nor Ron were on the call to give updates.
- [ ] Chad: put together notes on table definitions to prepare for incorporating new injection file format.
- [ ] Ron to give us dedicated slots for online analysis on ICDS
* Last week's East call - [agenda](https://wiki.ligo.org/CBC/Searches/GstLALEastAgenda20220707)
A few small MRs that were approved and merged.
* Quick updates (45 minutes)
- Operations (5 minutes)
- LL CBC operations
- Becca: FAR history not appearing in Edward dash for some reason. Needs follow-up
- Becca: Seeing some spikes in latency. Not sure if this is b/c of the large ranking stat files, or for some other reason. After restarting analysis today, may be able to rule out large dist_stat files.
- Surabhi: EW still seeing high latency. Try to change flow and max dur to see if that fixes it.
- Surabhi: will iDQ be packaged with EW frames? Patrick: No.
- Surabhi: Cannot make dag w/o iDQ. iDQ compression option gets set to None, and then fails. Rachael: This is a bug. Will fix.
- Surabhi: Did we figure out what the issue was with the high RAM usage? Becca: We're assuming that this was due to the large dist_stats files. We will know for sure this afternoon when we relaunch with the smaller files.
- Offline CBC operations
- Cort: Having permission issues w/ offline analysis. Also seeing that marg dist stat files are missing even though those jobs supposedly finished properly.
- Chad: this might be an issue w/ singularity and file transfer. We should ask around internally b/c it's a serious problem.
- Patrick: see this related [link](https://chat.ligo.org/ligo/pl/8u8pzn73apya7b4xionb4ke6sh)
- LL IDQ operations
- Rachael: Still working on MR. Mainly showing tests and proof that it does what we think it does.
- O4 Dev (30 minutes)
- Low latency integrated testing and Monitoring
- Becca: Log LR plots still don't look quite correct. Will look into it more. Sankey diagrams look good now.
- Template bank
- Rachael:
- Likelihood ratio, background and foreground sampling
- Anarya: Background sampling: running some more tests to investigate the effects of restricted sampling in detail and documenting them. But my jobs on ics have been sitting idle for a long time, so it's taking a while to get results. This started early last week; similar jobs were running smoothly before.
- Prathamesh: Chisq and bankchisq from snr data and orthogonalize. Calculated them and made [plots](https://ldas-jobs.ligo.caltech.edu/~prathamesh.joshi/bankchisq_orthogonalization/make/data/plots/).
- Chad: Is this something new, or just chisq and bankchisq? Prathamesh: Just the normal ones. Not orthogonalized.
- Chad: Maybe plotting log will look better? It's a positive definite number, so they won't be normally distributed. There is a limit, though, where they start to get close-ish to normal.
- Optimization and throughput benchmarking
- Data format wrangling
- Enabling running on OSG/IGWN grid
- DQ dev
- HM search
- Exploratory development (5 minutes)
- Misc projects
- Alex: setting up to run some rocky tests
- Chad: has branch that speeds up itacac by improving mem management. Improves overall speed of GstLAL by ~ 10%. Gives same answers to floating point precision. Could help EW?
- Surabhi: Added that patch, but latency is still high. Were seeing high RAM, maybe for the same reason as what is in Ed? Also tried upping flow from 10 -> 15, but didn't seem to help. These are settings that we don't want to use in EW, but are helping track down the issues.
- Chad: Maybe we should run a job in perf? Surabhi: Already has.
- Chad: Going to generally improve mem management across GstLAL. Alex has built some containers w/ aligned mem. Probably some malloc & calloc around which we could switch out w/ aligned versions.
- Chad: we should test by clearing out a piece of hardware, and running just a single problem job. This will tell us whether the program itself is the bottleneck; if it runs fine in isolation, then maybe we just don't have the hardware resources we need. Surabhi: Will reach out to Stuart about getting a dedicated node for this. Chad and Surabhi will work together to do all the testing.
- Data format wrangling - nothing
- Enabling running on OSG/IGWN grid - nothing
- DQ dev - nothing
- HM search - nothing
- Exploratory development - nothing
- Misc projects - nothing
- Paper Updates
- Chad: Preparing poster for NSF PI meeting for a grant that supports Chad, Maddie Wade, and Erin Feats (? I apologize for spelling).
- Chad: If you have comments, etc., please let me know. [Poster here](https://docs.google.com/presentation/d/18WkBacc6RmNb20BQ-FCTb8ONxkPdLELmQcg1xh82zag/edit#slide=id.p)
* Focus session
- None
* AOB
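
Note on the chisq discussion above (Chad's comment about plotting the log): a minimal sketch, not taken from the GstLAL code or from Prathamesh's plots, of why a positive definite statistic is skewed and how taking the log, or going to a large number of degrees of freedom, brings it closer to normal. It assumes the statistic is roughly chi-square distributed; the `dof` values, sample size, and helper names are hypothetical and chosen only for illustration.

```python
import math
import random


def skewness(values):
    """Sample skewness: third central moment over the cubed standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def chisq_samples(dof, n, rng):
    """Draw n chi-square variates; chi-square(dof) is Gamma(shape=dof/2, scale=2)."""
    return [rng.gammavariate(dof / 2.0, 2.0) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the comparison is repeatable
results = {}
for dof in (4, 40, 400):  # hypothetical values, not from the analysis
    raw = chisq_samples(dof, 50_000, rng)
    logged = [math.log(v) for v in raw]
    results[dof] = (skewness(raw), skewness(logged))
    print(f"dof={dof:4d}  skew(raw)={results[dof][0]:+.2f}  skew(log)={results[dof][1]:+.2f}")
```

A skewness near zero means a roughly symmetric, normal-looking histogram. Under these assumptions the raw values are most skewed at low `dof`, the log reduces the magnitude of that skew, and both approach zero as `dof` grows.
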
## Chat log
<09:31:25> "Becca Ewing": https://git.ligo.org/groups/gstlal/-/wikis/West-call/071222
<09:33:11> "Becca Ewing": https://git.ligo.org/groups/gstlal/-/wikis/West-call/071222
<09:34:43> "Rachael Huxford": yepp
<09:40:28> "SurabhiSachdev": hand up
<09:41:35> "SurabhiSachdev": https://dcc.ligo.org/DocDB/0173/T2100015/005/O4-calibration-requirements.pdf
<09:42:57> "chad.hanna": sorry I am late
<09:43:03> "SurabhiSachdev": it still needs me to give it the compress idq option
<09:43:14> "patrick.godwin": no plans to package with EW, since that h(t) path is lower latency
<09:43:19> "patrick.godwin": oh that's a bug then
<09:43:30> "Cort Posnansky": Here's the agenda: https://git.ligo.org/groups/gstlal/-/wikis/West-call/071222
<09:43:45> "SurabhiSachdev": thank you!
<09:45:26> "patrick.godwin": that needs to be brought up to the condor call
<09:45:34> "patrick.godwin": that's a bug in either condor itself or config settings
<09:45:46> "Becca Ewing": is this on the ICDS dag you mean?
<09:46:07> "patrick.godwin": other people have seen it before, I think Ron is aware but not certain
<09:46:29> "patrick.godwin": 10am tuesday
<09:46:47> "patrick.godwin": oops, 11am but cancelled
<09:46:48> "alexander.pace": short of the condor call, you can make a support ticket: https://git.ligo.org/computing/helpdesk/-/issues
<09:46:54> "patrick.godwin": ^^
<09:46:56> "chad.hanna": hand u
<09:47:11> "patrick.godwin": it's serious enough it makes dags kinda useless
<09:47:53> "Rachael Huxford": Patrick you sound bonkers on my end
<09:47:59> "alexander.pace": is that patrick or darth vader
<09:47:59> "SurabhiSachdev": same here
<09:47:59> "Becca Ewing": patrick is a robot
<09:48:00> "Cort Posnansky": Me too haha
<09:48:01> "Rachael Huxford": Like you're talking through a fan
<09:48:04> "SurabhiSachdev": pretty cool though
<09:48:17> "patrick.godwin": yeah
<09:48:33> "patrick.godwin": execute job correctly has the file, it fails to gets transfered but gets marked as done
<09:48:55> "patrick.godwin": see https://chat.ligo.org/ligo/pl/8u8pzn73apya7b4xionb4ke6sh as well
<09:49:17> "Cort Posnansky": I'll ping the offline dev mattermost channel later today with a zoom link
<09:50:22> "Becca Ewing": https://gstlal.ligo.caltech.edu/grafana/d/edward-testsuite-mdc05/edward-mdc05-test-suite-dashboard?orgId=1&from=now-15h&to=now
<09:51:21> "chad.hanna": hand up
<09:52:07> "chad.hanna": oh, sorry
<09:52:09> "SurabhiSachdev": hand up
<09:52:30> "chad.hanna": I remembered the other two things now too if I can go after Surabhi (they are fast)
<09:54:05> "SurabhiSachdev": got it, thanks
<09:54:13> "Rachael Huxford": Hopefully it solves the RAM and latency issues 
<09:54:20> "SurabhiSachdev": 
<09:55:03> "Becca Ewing": it looked abnormal to me this morning
<09:55:16> "Becca Ewing": i'm hoping that the dist stats compressing will fix everything 
<09:57:20> "Anarya": Background sampling : running some more tests to investigate the effects of restricted sampling in detail and documenting them. But my Jobs on ics are sitting idle for a long time so its taking long to get results. This started happening since early last week, similar jobs were running smoothly before.
<09:58:02> "Prathamesh Joshi": Hand up
<09:59:09> "SurabhiSachdev": I added that patch
<09:59:17> "SurabhiSachdev": hand up
<10:00:39> "SurabhiSachdev": I did
<10:04:30> "chad.hanna": sounds good
<10:05:06> "Prathamesh Joshi": https://ldas-jobs.ligo.caltech.edu/~prathamesh.joshi/bankchisq_orthogonalization/make/data/plots/
<10:08:07> "Prathamesh Joshi": Right
<10:08:26> "chad.hanna": nothing from me...
<10:08:47> "Rachael Huxford": nothing
<10:09:27> "chad.hanna": hand up
<10:10:09> "chad.hanna": https://docs.google.com/presentation/d/18WkBacc6RmNb20BQ-FCTb8ONxkPdLELmQcg1xh82zag/edit#slide=id.p
<10:10:39> "Rachael Huxford": When is the poster being presented?