Skip to content
Snippets Groups Projects

improve performance of public page

Merged Alexander Pace requested to merge public-alerts-improvements into master
3 unresolved threads

Streamline the view that generates the public page by reducing how many times it hits the db.

  • remove unnecessary queries
  • precache queries
  • take queries out of the main object loop
  • take auth check for display far out of loop
Edited by Alexander Pace

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • Alexander Pace added 3 commits

    added 3 commits

    • b3db4eaf - take auth check out of event loop
    • 05244a21 - turn off public page caching for development
    • e5ee32cc - improve efficiency of voevent query

    Compare with previous version

  • Alexander Pace added 1 commit

    added 1 commit

    Compare with previous version

  • Alexander Pace added 1 commit

    added 1 commit

    • a7808a23 - update public alert comments outside of main event loop

    Compare with previous version

  • Alexander Pace marked the checklist item take auth check for display far out of loop as completed

    marked the checklist item take auth check for display far out of loop as completed

  • Alexander Pace added 1 commit

    added 1 commit

    • e4c040ec - reduce redundant skymap image queries

    Compare with previous version

  • Alexander Pace added 2 commits

    added 2 commits

    Compare with previous version

  • Alexander Pace added 2 commits

    added 2 commits

    • 209a3754 - 1 commit from branch master
    • 7ebdaaee - Merge branch 'master' into public-alerts-improvements

    Compare with previous version

  • Alexander Pace marked the checklist item precache queries as completed

    marked the checklist item precache queries as completed

  • I've been running the public alerts page on gracedb-dev1 through django's debug mode, which shows the number and type of queries that are taking place under the hood. I was able to find a number of inefficiencies. For context, when gathering data to render on the page, there's a bit of overhead at first for counting significant, insignificant, and retracted events, and then there's a main loop that loops over either significant or insignificant events and gathers data for each row of the table. Then this data is paginated and sent back to the user's browser.

    The biggest gains were the result of:

    • taking redundant queries out of the main loop
    • prefetching queries and moving data out of the main loop and storing it in memory (to avoid hitting the db each iteration)
    • more efficient use of django querysets

    So i ran debug mode, on the public page on gracedb-dev1, which has 65 total public events, and before the optimizations, there were 521 hits to the database:

    Screen_Shot_2023-10-20_at_2.57.35_PM

    After the changes above, I got it down to 121 queries:

    Screen_Shot_2023-10-20_at_2.58.13_PM

    Keep in mind that this is on a system with just 65 public (significant and insignificant) events. On the production system, which has thousands and counting of low-significance triggers, I'm very confident that these changes will make a significant reduction in load times.

    I also verified that the table contents, retracted/non-retracted status, skymap image and url, and total event count (significant/insigificicant/retracted) were were same before and after.

20 # /year value for use in tables and such.
21
22 def display_far_hr_to_yr(display_far):
23
24 # Determine "human-readable" FAR to display
25 display_far_hr = display_far
26 if display_far:
27 # FAR in units of yr^-1
28 far_yr = display_far * far_hr_to_yr
29 if (far_yr < 1):
30 display_far_hr = "1 per {0:0.5g} years".format(1.0/far_yr)
31 else:
32 display_far_hr = "{0:0.5g} per year".format(far_yr)
33
34 return display_far_hr
35
  • 265 264 # Modify get_queryset() to filter for significant vs insignificant:
    266 265 def get_queryset(self, showall=False, **kwargs):
    267 266 if showall:
    268 return self.get_public_superevents()
    267 return self.get_public_superevents()\
    268 .prefetch_related('voevent_set', 'log_set') \
    269 .select_related('preferred_event')
    269 270 else:
    270 return self.significant_events(self.get_public_superevents())
    271 return self.significant_events(self.get_public_superevents())\
    272 .prefetch_related('voevent_set', 'log_set') \
    273 .select_related('preferred_event')
  • 409 # Construct a list of unique superevent ids that have public analyst comments.
    410 # This hits the database once:
    411 public_comments_sids = list(public_comments_set.values_list('superevent__superevent_id', flat=True).distinct())
    372 412
    373 # Filter and loop over exposed superevents for the given run:
    413 # Construct a dictionary where the key is a superevent_id and the value is the
    414 # associated concatenated comment. This hits the database once for each value
    415 # in public_comments_sids. But again this is less than O(1%) or less than the
    416 # total number of public superevents:
    417
    418 comments_dict = {}
    419
    420 for sid in public_comments_sids:
    421 comments_dict.update(
    422 {sid: ' ** '.join(list(public_comments_set.filter(superevent__superevent_id=sid)\
    423 .values_list('comment', flat=True)))})
    • Comment on lines +422 to +423

      Is this cast to list necessary? str.join works on any iterable, so I think it should be safe here to just drop the cast, which should make things slightly faster and lower memory usage.

    • Alexander Pace changed this line in version 8 of the diff

      changed this line in version 8 of the diff

    • Also a good catch. There are a few other querysets that i intentionally casted to lists because that got data out of the database and into memory to reduce the number of queries. I sort of robotically added this one too, but you're right that it wasn't necessary.

      The total number of hits to the db is the same before and after this change. thanks!

    • Please register or sign in to reply
  • Alexander Pace mentioned in commit 935010af

    mentioned in commit 935010af

  • Alexander Pace mentioned in commit 2bd2c179

    mentioned in commit 2bd2c179

  • Alexander Pace mentioned in commit a9add43a

    mentioned in commit a9add43a

  • Alexander Pace added 3 commits

    added 3 commits

    Compare with previous version

  • Alexander Pace marked this merge request as ready

    marked this merge request as ready

  • merged

  • Please register or sign in to reply
    Loading