improve performance of public page
Streamline the view that generates the public page by reducing how many times it hits the db.
-
remove unnecessary queries -
precache queries -
take queries out of the main object loop -
take auth check for display far out of loop
Merge request reports
Activity
added 1 commit
- a7808a23 - update public alert comments outside of main event loop
added 2 commits
I've been running the public alerts page on
gracedb-dev1
through django's debug mode, which shows the number and type of queries that are taking place under the hood. I was able to find a number of inefficiencies. For context, when gathering data to render on the page, there's a bit of overhead at first for counting significant, insignificant, and retracted events, and then there's a main loop that loops over either significant or insignificant events and gathers data for each row of the table. Then this data is paginated and sent back to the user's browser.The biggest gains were the result of:
- taking redundant queries out of the main loop
- prefetching queries and moving data out of the main loop and storing it in memory (to avoid hitting the db each iteration)
- more efficient use of django querysets
So i ran debug mode, on the public page on
gracedb-dev1
, which has 65 total public events, and before the optimizations, there were 521 hits to the database:After the changes above, I got it down to 121 queries:
Keep in mind that this is on a system with just 65 public (significant and insignificant) events. On the production system, which has thousands and counting of low-significance triggers, I'm very confident that these changes will make a significant reduction in load times.
I also verified that the table contents, retracted/non-retracted status, skymap image and url, and total event count (significant/insigificicant/retracted) were were same before and after.
20 # /year value for use in tables and such. 21 22 def display_far_hr_to_yr(display_far): 23 24 # Determine "human-readable" FAR to display 25 display_far_hr = display_far 26 if display_far: 27 # FAR in units of yr^-1 28 far_yr = display_far * far_hr_to_yr 29 if (far_yr < 1): 30 display_far_hr = "1 per {0:0.5g} years".format(1.0/far_yr) 31 else: 32 display_far_hr = "{0:0.5g} per year".format(far_yr) 33 34 return display_far_hr 35 changed this line in version 8 of the diff
265 264 # Modify get_queryset() to filter for significant vs insignificant: 266 265 def get_queryset(self, showall=False, **kwargs): 267 266 if showall: 268 return self.get_public_superevents() 267 return self.get_public_superevents()\ 268 .prefetch_related('voevent_set', 'log_set') \ 269 .select_related('preferred_event') 269 270 else: 270 return self.significant_events(self.get_public_superevents()) 271 return self.significant_events(self.get_public_superevents())\ 272 .prefetch_related('voevent_set', 'log_set') \ 273 .select_related('preferred_event') - Comment on lines 266 to +273
You should make this a bit DRYer.
if showall: superevents = self.get_public_superevents() else: superevents = self.significant_events(self.get_public_superevents()) return ( superevents .prefetch_related('voevent_set', 'log_set') .select_related('preferred_event') )
Edited by Daniel Wysocki changed this line in version 8 of the diff
409 # Construct a list of unique superevent ids that have public analyst comments. 410 # This hits the database once: 411 public_comments_sids = list(public_comments_set.values_list('superevent__superevent_id', flat=True).distinct()) 372 412 373 # Filter and loop over exposed superevents for the given run: 413 # Construct a dictionary where the key is a superevent_id and the value is the 414 # associated concatenated comment. This hits the database once for each value 415 # in public_comments_sids. But again this is less than O(1%) or less than the 416 # total number of public superevents: 417 418 comments_dict = {} 419 420 for sid in public_comments_sids: 421 comments_dict.update( 422 {sid: ' ** '.join(list(public_comments_set.filter(superevent__superevent_id=sid)\ 423 .values_list('comment', flat=True)))}) changed this line in version 8 of the diff
Also a good catch. There are a few other querysets that i intentionally casted to lists because that got data out of the database and into memory to reduce the number of queries. I sort of robotically added this one too, but you're right that it wasn't necessary.
The total number of hits to the db is the same before and after this change. thanks!
mentioned in commit 935010af
mentioned in commit 2bd2c179
mentioned in commit a9add43a