
Proposal to hide exposed hourly MDC superevents on production

Description: Moving into O4, I've been monitoring the load on the production database, and I noticed that the highest load (more than two orders of magnitude more CPU usage than other requests) occurs under a very specific circumstance: when an unauthenticated user makes a request to view public data products. Examples include a member of the public viewing a public superevent page, or a script scraping for public skymaps.

I traced this down to the SQL that's generated by a django-guardian function called get_objects_for_user. There has to be an underlying bug with GraceDB's public viewexposed permission, but I haven't been able to find it yet.

That being said, there are a couple of Stack Overflow posts and GitHub issues about this function, and this statement matches what I've seen:

Also, if possible, i suggest you don't use get_objects_for_user shortcut when project gets bigger. Its VERY slow query once you get more objects/permissions in the database.

That seems consistent with some of the testing I've seen this week.

So why wasn't this an issue before? At the end of O3, there were 80 exposed (public) superevents. That's a trivial number of items from a database standpoint. But in the three years since O3 ended, the hourly first-two-years MDC uploads have been exposed to the public. Multiply 24 daily superevents by three years and all of a sudden....

In [11]: Superevent.objects.filter(is_exposed=True).filter(category='M').count()
Out[11]: 35354

There are over 35,000 exposed MDC superevents, and the number is growing by the hour.
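As a rough sanity check on that number, here is the back-of-envelope arithmetic. It assumes exactly one MDC superevent per hour and 365-day years, which are simplifications, so it should only be read as an order-of-magnitude estimate next to the observed count.

```python
# Back-of-envelope estimate of exposed MDC superevents.
# Assumptions (simplifications): one upload per hour, 365-day years.
UPLOADS_PER_DAY = 24
DAYS_PER_YEAR = 365
YEARS = 3

expected = UPLOADS_PER_DAY * DAYS_PER_YEAR * YEARS

observed = 35354   # count returned by the query above
o3_public = 80     # exposed superevents at the end of O3

print(expected)                        # 26280
print(round(observed / o3_public))     # 442
```

Either way, the exposed set is now several hundred times larger than it was at the end of O3.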

A quick test is to open this file list: https://gracedb.ligo.org/superevents/S200316bj/files/

as an authenticated user (243ms):

Screen_Shot_2023-05-03_at_11.37.54_AM

and in an incognito window, i.e. unauthenticated (13.5s 😭):

Screen_Shot_2023-05-03_at_11.39.53_AM
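For anyone who wants to reproduce the unauthenticated timing without a browser, something like the following should work. This is only a minimal sketch: `time_request` is a helper name made up here, it measures wall-clock time for one fetch, and it does not cover the authenticated case (which would need credentials).

```python
import time
import urllib.request

URL = "https://gracedb.ligo.org/superevents/S200316bj/files/"


def time_request(url, opener=urllib.request.urlopen, timeout=60):
    """Return the wall-clock seconds needed to fetch and fully read `url`.

    `opener` is injectable so the helper can be exercised without
    touching the network.
    """
    start = time.perf_counter()
    with opener(url, timeout=timeout) as response:
        response.read()
    return time.perf_counter() - start


# Example (performs a real, unauthenticated request against production):
# print(f"{time_request(URL):.2f} s")
```

Running it a few times before and after hiding the MDC uploads would give a simple before/after comparison.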

Proposal:

  1. Unless there are objections, I'm going to hide the exposed MDC uploads and measure the performance impact.
  2. If that works, I'm going to set up a tool to hide all (or a subset..?) of the MDC superevents (which is a band-aid).
  3. Figure out what's wrong with the permissions, because finding the bug might have wider-ranging performance implications.
  4. Unless there is a desire to keep the test uploads public, modify GWCelery so that it does not expose them. We can revisit this request based on the results of 1-3.
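For step 2, one possible shape for the tool is sketched below. Everything here is hypothetical: `chunked`, `hide_in_batches`, and the `hide_one` callable are illustrative names, and `hide_one` stands in for whatever GraceDB routine actually hides a single superevent. The idea is just to work through the IDs in small batches, with an optional pause, so the cleanup itself doesn't hammer the production database.

```python
import time
from typing import Callable, Iterable, Iterator, List


def chunked(ids: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[str] = []
    for superevent_id in ids:
        batch.append(superevent_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def hide_in_batches(
    ids: Iterable[str],
    hide_one: Callable[[str], None],  # hypothetical: hides one superevent
    size: int = 500,
    pause: float = 0.0,
    dry_run: bool = True,
) -> int:
    """Call `hide_one` on each ID, batch by batch; return how many were visited.

    With dry_run=True (the default) nothing is hidden; IDs are only counted.
    """
    visited = 0
    for batch in chunked(ids, size):
        for superevent_id in batch:
            if not dry_run:
                hide_one(superevent_id)
            visited += 1
        if pause:
            time.sleep(pause)  # give the database room between batches
    return visited
```

Defaulting to a dry run means the first invocation only reports how many superevents would be hidden, which also makes it easy to try on a subset first.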