querying on labels.all() vs labelling_set.all()
Here's a demonstration of something I noticed while doing some debugging and profiling eventToDict.
Let's say that someone wants to create a dictionary of the label names of an event/superevent. So fetch the event object:
In [2]: from events.models import Event, Label, Labelling, EventLog
In [3]: e = Event.getByGraceid('G422230')
Now let's examine two methods for constructing the dictionary and then look at the sql that gets executed. Operating off the labelling_set
:
In [4]: {'labels': [l.label.name for l in e.labelling_set.all()]}
Out[4]: {'labels': ['EMBRIGHT_READY', 'PASTRO_READY']}
Executes the sql:
2024-10-10 10:49:30.315 CDT [2556018] gracedb@gracedb LOG: statement: SELECT "events_labelling"."id", "events_labelling"."creator_id", "events_labelling"."created", "events_labelling"."event_id", "events_labelling"."label_id" FROM "events_labelling" WHERE "events_labelling"."event_id" = 422230
2024-10-10 10:49:30.318 CDT [2556018] gracedb@gracedb LOG: statement: SELECT "events_label"."id", "events_label"."name", "events_label"."defaultColor", "events_label"."description", "events_label"."protected" FROM "events_label" WHERE "events_label"."id" = 31 LIMIT 21
2024-10-10 10:49:30.319 CDT [2556018] gracedb@gracedb LOG: statement: SELECT "events_label"."id", "events_label"."name", "events_label"."defaultColor", "events_label"."description", "events_label"."protected" FROM "events_label" WHERE "events_label"."id" = 32 LIMIT 21
There's three calls to the database: 1) fetching the labelling objects (rows), and then two subsequent ones to get the two labels for the event (which gets the names).
Okay, now let's do the same thing, except with the event's labels.all()
:
In [5]: {'labels': [l.name for l in e.labels.all()]}
Out[5]: {'labels': ['EMBRIGHT_READY', 'PASTRO_READY']}
Which executes the following sql:
2024-10-10 10:53:57.080 CDT [2556018] gracedb@gracedb LOG: statement: SELECT "events_label"."id", "events_label"."name", "events_label"."defaultColor", "events_label"."description", "events_label"."protected" FROM "events_label" INNER JOIN "events_labelling" ON ("events_label"."id" = "events_labelling"."label_id") WHERE "events_labelling"."event_id" = 422230
tada!
Note that there are legitimate times when you would want to get the labelling
object: if you need to know when the label was added, and who added it (like for the floating css tooltip) then that information is in the labelling
object. But, for the above example you can select_related
on the labelling_set
to get it down to one db call. For example:
In [11]: {'labels': [l.label.name for l in e.labelling_set.all().select_related()]}
Out[11]: {'labels': ['EMBRIGHT_READY', 'PASTRO_READY']}
results in:
SELECT "events_labelling"."id", "events_labelling"."creator_id", "events_labelling"."created", "events_labelling"."event_id", "events_labelling"."label_id", "auth_user"."id", "auth_user"."password", "auth_user"."last_login", "auth_user"."is_superuser", "auth_user"."username", "auth_user"."first_name", "auth_user"."last_name", "auth_user"."email", "auth_user"."is_staff", "auth_user"."is_active", "auth_user"."date_joined", "events_event"."id", "events_event"."submitter_id", "events_event"."created", "events_event"."group_id", "events_event"."superevent_id", "events_event"."pipeline_preferred_id", "events_event"."pipeline_id", "events_event"."search_id", "events_event"."instruments", "events_event"."nevents", "events_event"."far", "events_event"."likelihood", "events_event"."gpstime", "events_event"."perms", "events_event"."offline", "events_event"."graceid", "events_event"."reporting_latency", T4."id", T4."password", T4."last_login", T4."is_superuser", T4."username", T4."first_name", T4."last_name", T4."email", T4."is_staff", T4."is_active", T4."date_joined", "events_group"."id", "events_group"."name", "events_pipeline"."id", "events_pipeline"."name", "events_pipeline"."enabled", "events_pipeline"."pipeline_type", "events_label"."id", "events_label"."name", "events_label"."defaultColor", "events_label"."description", "events_label"."protected" FROM "events_labelling" INNER JOIN "events_event" ON ("events_labelling"."event_id" = "events_event"."id") INNER JOIN "auth_user" ON ("events_labelling"."creator_id" = "auth_user"."id") INNER JOIN "auth_user" T4 ON ("events_event"."submitter_id" = T4."id") INNER JOIN "events_group" ON ("events_event"."group_id" = "events_group"."id") INNER JOIN "events_pipeline" ON ("events_event"."pipeline_id" = "events_pipeline"."id") INNER JOIN "events_label" ON ("events_labelling"."label_id" = "events_label"."id") WHERE "events_labelling"."event_id" = 422230
a longer call, but still it hits the database only once. I'm going to go in and make some code changes and see if it makes an appreciable difference.