Improve performance of public alerts page
Issue:
When I was revamping the O3 Public Alerts page for O4, I failed to anticipate the number of public triggers that would be displayed. When a user visits a site in their browser, that counts as one request. When the page is rendered in their browser, loading each individual image also counts as a request. As of this writing, there are 286 public events, and showing all of them at once means a burst of hundreds of requests to GraceDB every time the page is loaded. Exposing the less-significant events by default makes the problem even worse. And since the majority of users likely only care about updates at the very top of the page, that is a serious and avoidable performance issue for all users (browser and API) of GraceDB.
So how much of an issue is this? Below are the top requests on production GraceDB in the last 24 hours:
There were over 3000 hits to the public alerts page, and with each one of those, hundreds more hits to the individual skymaps. There are a few processes that refresh the public alerts page every so often (I think it's a raspberry-pi display wall, or something), so given the rate of new events, there's always an unnecessary baseline load on GraceDB, which is only going to get worse over time without intervention.
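A rough back-of-the-envelope sketch of that load, in Python. It assumes exactly one image (skymap) request per displayed event and ignores any browser-side caching of images, both of which are simplifications on my part; the function names are purely illustrative. Since there were "over 3000" page hits, the daily figure is a lower bound under those assumptions.

```python
def requests_per_page_load(events_shown: int, images_per_event: int = 1) -> int:
    """One request for the page itself, plus one per embedded image.

    images_per_event=1 is an assumption (one skymap per event).
    """
    return 1 + events_shown * images_per_event


def daily_requests(page_hits: int, events_shown: int, images_per_event: int = 1) -> int:
    """Total requests generated by `page_hits` loads of the page in a day."""
    return page_hits * requests_per_page_load(events_shown, images_per_event)


if __name__ == "__main__":
    # 286 public events, all shown at once.
    print(requests_per_page_load(286))  # 287
    # 3000 page hits in 24 hours.
    print(daily_requests(3000, 286))    # 861000
```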
Fix:
There are a few separate changes to review here.
- Caching improvements: This was already merged, but it's worth mentioning again here for completeness. The cache max-age on the public alerts page (and just that page) was bumped up to 300s, and it is a configurable parameter that can be changed without rebuilding the server code and redeploying. That means any given user may see a version of that page that is at most 5 minutes old. That seems like a reasonable compromise for server stability.
- Less-significant events are hidden by default: Described here. Presumably, most users are interested in the significant events first, so they can get that information faster and with fewer interactions/requests.
- Paginated views: Now only the 15 most recent events (a configurable parameter) are shown in the table, with "previous/next" links to show the next batch. So if you're refreshing the table to see changes, you're only requesting the most recent events, and that request is likely to be served from a cached version of the page.
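To make the pagination and caching behavior concrete, here is a minimal standalone sketch in plain Python. It is not the actual GraceDB code: the setting names, the `Page` class, and both functions are hypothetical, and it only assumes that events arrive sorted most recent first.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Hypothetical setting names; the real configurable parameters may differ.
PUBLIC_PAGE_CACHE_MAX_AGE = 300  # seconds
PUBLIC_EVENTS_PER_PAGE = 15


@dataclass
class Page:
    items: list
    number: int
    previous: Optional[int]  # page number for the "previous" link, if any
    next: Optional[int]      # page number for the "next" link, if any


def paginate(events: Sequence, page_number: int,
             per_page: int = PUBLIC_EVENTS_PER_PAGE) -> Page:
    """Return one page of events; `events` is assumed sorted most recent first."""
    total_pages = max(1, -(-len(events) // per_page))    # ceiling division
    page_number = min(max(page_number, 1), total_pages)  # clamp out-of-range pages
    start = (page_number - 1) * per_page
    return Page(
        items=list(events[start:start + per_page]),
        number=page_number,
        previous=page_number - 1 if page_number > 1 else None,
        next=page_number + 1 if page_number < total_pages else None,
    )


def cache_control_header(max_age: int = PUBLIC_PAGE_CACHE_MAX_AGE) -> str:
    """Value for the Cache-Control response header on the public alerts page."""
    return f"max-age={max_age}"


if __name__ == "__main__":
    events = list(range(286))  # stand-ins for 286 public events
    first = paginate(events, 1)
    print(len(first.items), first.previous, first.next)  # 15 None 2
    print(cache_control_header())                        # max-age=300
```

Under the simplifying assumption of one image per event, loading the first page then triggers at most 1 + 15 = 16 requests instead of 1 + 286 = 287, and a repeat load within the 300s window can be answered from cache.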
Other Notes:
The table contents and viewing logic are all the same as before. There was a question about what text to show on the button that shows/hides insignificant events. Right now, it's the same as it was before, but I'm open to suggestions. I also added the following bullet:
Which hopefully makes it more clear.
In the meantime, this is living on gracedb-dev1 (https://gracedb-dev1.ligo.org/superevents/public/O4/). I'll get it up on gracedb-dev (https://gracedb-dev.ligo.org/superevents/public/O4/) later today and expose some test superevents.
I'd like to request a statement from @keita.kawabe and @brian.oreilly on the changes before I merge and deploy.