Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimise N+1 end points #676

Open
6 of 15 tasks
philipbelesky opened this issue Jul 3, 2018 · 21 comments
Open
6 of 15 tasks

Optimise N+1 end points #676

philipbelesky opened this issue Jul 3, 2018 · 21 comments
Assignees
Labels
help wanted performance Issues related to the performance

Comments

@philipbelesky
Copy link
Member

philipbelesky commented Jul 3, 2018

This is an ongoing tracker for identifying/discussing these views, although scout doesn't actually identify many over the 150ms threshold so they could definitely all be ticked off. None are egregious in terms of execution time (except for /database/participants/adjudicator/) but it would be nice to improve them:

  • results.views.PublicResultsForRoundView
  • participants.views.AdjudicatorRecordView + participants.views.PublicTeamRecordView
  • tournaments.views.TournamentAdminHomeView + tournaments.views.TournamentAssistantHomeView
  • adjallocation.views.CreateAutoAllocation
  • draw.views.CreateDrawView
  • results.views.AdminNewBallotSetView + results.views.AdminEditBallotSetView
  • adjallocation.views.SaveDebatePanel
  • django.contrib.admin.options.changelist_view for /database/participants/adjudicator/
  • participants.views.UpdateEligibilityEditView
  • /django.contrib.admin.options.change_view aka /database/adjfeedback/adjudicatorfeedback/740/change/
  • venues.views.AutoAllocateVenuesView
  • adjfeedback.views.FeedbackByTargetView
  • adjfeedback.views.FeedbackBySourceView
  • printing.views.AdminPrintScoresheetsView
  • adjfeedback.views.PublicAddFeedbackByRandomisedUrlView
@philipbelesky philipbelesky added the performance Issues related to the performance label Jul 3, 2018
@philipbelesky philipbelesky added this to the Peformance milestone Jul 3, 2018
@czlee czlee self-assigned this Jul 3, 2018
@czlee
Copy link
Member

czlee commented Jul 5, 2018

I think the cause of the TournamentAdminHomeView and TournamentAssistantHomeView is grabbing the content object in the action logs, which I think can't be done easily and actually just indicates that our schema for action logs is poorly designed.

I just realized we can get some insight into some of the POST requests by intercepting redirects with the debug toolbar, so might try that some time.

@philipbelesky
Copy link
Member Author

philipbelesky commented Jul 5, 2018 via email

@czlee
Copy link
Member

czlee commented Jul 5, 2018

Re serialisation, you might remember this discussion from here: 3545827#commitcomment-28040144

I had the feeling our serialisation framework causes a lot of duplicate and redundant queries, I think we need to think more thoroughly through how to get it to fetch everything we want, and only want we want, efficiently?

@philipbelesky
Copy link
Member Author

Yea definitely. Although it might be worth delaying until it is pushing changes back in an async fashion (although this would probably use a similar serialisation method so maybe not). When done to produce a serialised draw bulk I think the current methods aren't that inefficient (maybe I'm wrong here) but when done to update just a single debate it seems inefficient.

Either the fetch/re-serialise query could be combined into a singular query and/or we could pre-populated 'expensive' attributes like adj scores/liveness from the JSON provided which would remove the need to query them (although at the cost of information not being completely live).

@czlee
Copy link
Member

czlee commented Jul 6, 2018

Hmm, I think it's the other way round. You generally can't make updating a single debate any more efficient, because (a) for the debate itself, there's no concept of an N+1 problem if you're only dealing with one row and (b) for things the debate has many of, like teams, speaker scores and team scores, in general UPDATE statements don't really work well with multiple rows anyway.

(If you have an N+1 problem with an update, there are ways in SQL to make it more efficient, but as far as I can tell you can't collapse it into a single statement, I think you have to use transaction.atomic() or something. Suffice to say Django's not great with this. Or, lots of people have pointed to this: https://github.com/aykut/django-bulk-update/)

I'm not fully confident about this assessment, but last time I looked at it, I felt like serialising large amounts of data had a bunch of N+1 issues (3545827). I think we just want to revise the framework—having serialise() as a method of individual models doesn't really seem amenable to performance improvements. I'm imagining instead some module that is called upon with requests for information, and gathers all the relevant data using a prefetch strategy similar to Round.debate_set_with_prefetches(), but returning the requested JSON response. With tactful prefetches I suspect even "expensive" attributes can be made to run in reasonable time—some of the existing functions in prefetch.py files aren't exactly cheap, but once implemented, they seemed to make the things they affected pretty smooth.

@philipbelesky
Copy link
Member Author

While I think information could be definitely prefetched better in whatever method is used to produce serialisations, for all intents and purposes hitting the SaveDebatePanel is in effect going to end up serialising only a single model so mass-prefetching probably isn't necessary or efficient.

Any performance problems with the mass-serialisation (i.e. on initial page load) are probably less impactful as they aren't being accessed constantly. Or at least they don't have the N+1 issues flagged by scout.

@czlee
Copy link
Member

czlee commented Jul 7, 2018

Do you remember what the N+1 queries for SaveDebatePanel were? Scout doesn't show it more than 24 hours after the last one, and from looking at the code I feel like it's probably the DebateAdjudicator.objects.update_or_create() call, which is called once for each debate-adjudicator in the request, and can't really be done in bulk. I remember seeing Scout highlighting some UPDATE queries in the N+1 detector, which as I said above we can't really do much about, save for more drastic approaches to bulk updating.

I get a little nervous about bulk updates, or even bulk creates, because they bypass the .save() function of the model in question. For example, the change in draw/manager.py in 102fc4a helps a lot with how long it takes to generate the draw, but I'm still deciding if I want to push through with it.

@philipbelesky
Copy link
Member Author

Um I don't have a copy saved, but looking at the queries in debug toolbar (using the Request History addon linked above) the duplicated queries are:

SELECT "venues_venuecategory"."id", "venues_venuecategory"."name", "venues_venuecategory"."description", "venues_venuecategory"."display_in_venue_name", "venues_venuecategory"."display_in_public_tooltip" FROM "venues_venuecategory" INNER JOIN "venues_venuecategory_venues" ON ("venues_venuecategory"."id" = "venues_venuecategory_venues"."venuecategory_id") WHERE "venues_venuecategory_venues"."venue_id" = 19
  Duplicated 2 times.

SELECT "venues_venuecategory"."id", "venues_venuecategory"."name", "venues_venuecategory"."description", "venues_venuecategory"."display_in_venue_name", "venues_venuecategory"."display_in_public_tooltip" FROM "venues_venuecategory" INNER JOIN "venues_venuecategory_venues" ON ("venues_venuecategory"."id" = "venues_venuecategory_venues"."venuecategory_id") WHERE "venues_venuecategory_venues"."venue_id" = 19
  Duplicated 2 times.

SELECT "participants_institution"."id", "participants_institution"."name", "participants_institution"."code", "participants_institution"."region_id" FROM "participants_institution" WHERE "participants_institution"."id" = 9
  Duplicated 3 times.

SELECT "participants_person"."id", "participants_person"."name", "participants_person"."email", "participants_person"."phone", "participants_person"."anonymous", "participants_person"."notes", "participants_person"."gender", "participants_person"."pronoun", "participants_speaker"."person_ptr_id", "participants_speaker"."team_id" FROM "participants_speaker" INNER JOIN "participants_person" ON ("participants_speaker"."person_ptr_id" = "participants_person"."id") WHERE "participants_speaker"."team_id" = 16
  Duplicated 2 times.

SELECT "breakqual_breakcategory"."id", "breakqual_breakcategory"."tournament_id", "breakqual_breakcategory"."name", "breakqual_breakcategory"."slug", "breakqual_breakcategory"."seq", "breakqual_breakcategory"."break_size", "breakqual_breakcategory"."is_general", "breakqual_breakcategory"."priority", "breakqual_breakcategory"."limit", "breakqual_breakcategory"."rule" FROM "breakqual_breakcategory" INNER JOIN "participants_team_break_categories" ON ("breakqual_breakcategory"."id" = "participants_team_break_categories"."breakcategory_id") INNER JOIN "tournaments_tournament" ON ("breakqual_breakcategory"."tournament_id" = "tournaments_tournament"."id") WHERE "participants_team_break_categories"."team_id" = 16 ORDER BY "tournaments_tournament"."seq" ASC, "breakqual_breakcategory"."seq" ASC
  Duplicated 2 times.

SELECT COUNT(*) AS "__count" FROM "results_teamscore" INNER JOIN "results_ballotsubmission" ON ("results_teamscore"."ballot_submission_id" = "results_ballotsubmission"."id") INNER JOIN "draw_debateteam" ON ("results_teamscore"."debate_team_id" = "draw_debateteam"."id") WHERE ("results_ballotsubmission"."confirmed" = true AND "draw_debateteam"."team_id" = 16 AND "results_teamscore"."win" = true)
  Duplicated 2 times.

SELECT SUM("results_teamscore"."points") AS "points__sum" FROM "results_teamscore" INNER JOIN "results_ballotsubmission" ON ("results_teamscore"."ballot_submission_id" = "results_ballotsubmission"."id") INNER JOIN "draw_debateteam" ON ("results_teamscore"."debate_team_id" = "draw_debateteam"."id") WHERE ("results_ballotsubmission"."confirmed" = true AND "draw_debateteam"."team_id" = 16)
  Duplicated 2 times.

SELECT "participants_institution"."id", "participants_institution"."name", "participants_institution"."code", "participants_institution"."region_id" FROM "participants_institution" WHERE "participants_institution"."id" = 1
  Duplicated 3 times.

SELECT "participants_person"."id", "participants_person"."name", "participants_person"."email", "participants_person"."phone", "participants_person"."anonymous", "participants_person"."notes", "participants_person"."gender", "participants_person"."pronoun", "participants_speaker"."person_ptr_id", "participants_speaker"."team_id" FROM "participants_speaker" INNER JOIN "participants_person" ON ("participants_speaker"."person_ptr_id" = "participants_person"."id") WHERE "participants_speaker"."team_id" = 1
  Duplicated 2 times.

SELECT "breakqual_breakcategory"."id", "breakqual_breakcategory"."tournament_id", "breakqual_breakcategory"."name", "breakqual_breakcategory"."slug", "breakqual_breakcategory"."seq", "breakqual_breakcategory"."break_size", "breakqual_breakcategory"."is_general", "breakqual_breakcategory"."priority", "breakqual_breakcategory"."limit", "breakqual_breakcategory"."rule" FROM "breakqual_breakcategory" INNER JOIN "participants_team_break_categories" ON ("breakqual_breakcategory"."id" = "participants_team_break_categories"."breakcategory_id") INNER JOIN "tournaments_tournament" ON ("breakqual_breakcategory"."tournament_id" = "tournaments_tournament"."id") WHERE "participants_team_break_categories"."team_id" = 1 ORDER BY "tournaments_tournament"."seq" ASC, "breakqual_breakcategory"."seq" ASC
  Duplicated 2 times.

SELECT COUNT(*) AS "__count" FROM "results_teamscore" INNER JOIN "results_ballotsubmission" ON ("results_teamscore"."ballot_submission_id" = "results_ballotsubmission"."id") INNER JOIN "draw_debateteam" ON ("results_teamscore"."debate_team_id" = "draw_debateteam"."id") WHERE ("results_ballotsubmission"."confirmed" = true AND "draw_debateteam"."team_id" = 1 AND "results_teamscore"."win" = true)
  Duplicated 2 times.

SELECT SUM("results_teamscore"."points") AS "points__sum" FROM "results_teamscore" INNER JOIN "results_ballotsubmission" ON ("results_teamscore"."ballot_submission_id" = "results_ballotsubmission"."id") INNER JOIN "draw_debateteam" ON ("results_teamscore"."debate_team_id" = "draw_debateteam"."id") WHERE ("results_ballotsubmission"."confirmed" = true AND "draw_debateteam"."team_id" = 1)
  Duplicated 2 times.

SELECT "participants_institution"."id", "participants_institution"."name", "participants_institution"."code", "participants_institution"."region_id" FROM "participants_institution" WHERE "participants_institution"."id" = 3
  Duplicated 3 times.

@czlee
Copy link
Member

czlee commented Jul 7, 2018

Whoa okay, the duplication's not really bothersome but that's a lot of different tables for a view that's just meant to save a debate panel.

@philipbelesky
Copy link
Member Author

Just added participants.views.UpdateEligibilityEditView, the flagged query was:

SELECT (1) AS "a" FROM "participants_speakercategory" INNER JOIN "participants_speaker_categories" ON ("participants_speakercategory"."id" = "participants_speaker_categories"."speakercategory_id") WHERE ("participants_speaker_categories"."speaker_id" = %s AND "participants_speakercategory"."id" = %s) LIMIT 1

Called 372 times (presumably once per speaker), I presume from:

self.set_category_eligibility(speaker, posted_info[str(speaker_id)])

I guess there is some optimisation possible where the updates are issued in bulk rather than through looping.

@philipbelesky
Copy link
Member Author

philipbelesky commented Aug 3, 2018

  • For adjfeedback.views.FeedbackByTargetView

  • 188 x SELECT COUNT(*) AS "__count" FROM "adjfeedback_adjudicatorfeedback" WHERE "adjfeedback_adjudicatorfeedback"."adjudicator_id" = %s

  • For adjfeedback.views.FeedbackBySourceView

  • 194 x SELECT COUNT(*) AS "__count" FROM "adjfeedback_adjudicatorfeedback" INNER JOIN "draw_debateteam" ON ("adjfeedback_adjudicatorfeedback"."source_team_id" = "draw_debateteam"."id") WHERE "draw_debateteam"."team_id" = %s

  • 188 x SELECT COUNT(*) AS "__count" FROM "adjfeedback_adjudicatorfeedback" INNER JOIN "adjallocation_debateadjudicator" ON ("adjfeedback_adjudicatorfeedback"."source_adjudicator_id" = "adjallocation_debateadjudicator"."id") WHERE "adjallocation_debateadjudicator"."adjudicator_id" = %s

  • 194 x SELECT "participants_person"."id", "participants_person"."name", "participants_person"."email", "participants_person"."phone", "participants_person"."anonymous", "participants_person"."notes", "participants_person"."gender", "participants_person"."pronoun", "participants_speaker"."person_ptr_id", "participants_speaker"."team_id" FROM "participants_speaker" INNER JOIN "participants_person" ON ("participants_speaker"."person_ptr_id" = "participants_person"."id") WHERE "participants_speaker"."team_id" = %s

@philipbelesky
Copy link
Member Author

philipbelesky commented Aug 3, 2018

  • For printing.views.AdminPrintScoresheetsView this seems to have been due to the new behaviour that creates CheckIn identifiers for ballots. This currently happens one by one rather than in bulk, so it ends up the following queries repeating for the NUMBER_OF_ROOMS:

  • INSERT INTO "checkins_identifier" ("barcode") VALUES (%s) RETURNING "checkins_identifier"."id"
    INSERT INTO "checkins_debateidentifier" ("identifier_ptr_id", "debate_id") VALUES (%s, %s)

  • SELECT "checkins_identifier"."id", "checkins_identifier"."barcode", "checkins_debateidentifier"."identifier_ptr_id", "checkins_debateidentifier"."debate_id" FROM "checkins_debateidentifier" INNER JOIN "checkins_identifier" ON ("checkins_debateidentifier"."identifier_ptr_id" = "checkins_identifier"."id") WHERE "checkins_debateidentifier"."debate_id" = %s
    SELECT COUNT(*) AS "__count" FROM "checkins_identifier" WHERE "checkins_identifier"."barcode" = %s

Update: I have constructed gotten the query part into a single filter() rather than an iterated get(); however the creation part isn't compatible with bulk_create because its a multi-table model. I have however wrapped it in transaction.atomic() so it should be faster to execute.

@philipbelesky
Copy link
Member Author

philipbelesky commented Aug 3, 2018

  • For adjfeedback.views.PublicAddFeedbackByRandomisedUrlView

  • Repeated 10 times: SELECT "tournaments_round"."id", "tournaments_round"."tournament_id", "tournaments_round"."seq", "tournaments_round"."name", "tournaments_round"."abbreviation", "tournaments_round"."stage", "tournaments_round"."draw_type", "tournaments_round"."break_category_id", "tournaments_round"."draw_status", "tournaments_round"."feedback_weight", "tournaments_round"."silent", "tournaments_round"."motions_released", "tournaments_round"."starts_at" FROM "tournaments_round" WHERE "tournaments_round"."id" = %s

@philipbelesky
Copy link
Member Author

Also btw @tienne-B or @tfpk if you are handy with optimising SQL query usage (I suspect most cases just need a strategic select_related) feel free to have a crack at any of the above views mentioned above. Enabling the Django Debug Toolbar (there should be a flag in local_settings.py) will provide insight into the repeated queries within a view.

@tienne-B
Copy link
Member

tienne-B commented Aug 3, 2018 via email

philipbelesky added a commit that referenced this issue Aug 3, 2018
Improve queries in adjudicator feedback views as part of #676
@philipbelesky
Copy link
Member Author

philipbelesky commented Aug 3, 2018

  • For venues.views.AutoAllocateVenuesView, the flagged queries are:

  • 195 x SELECT "participants_institution"."id", "participants_institution"."name", "participants_institution"."code", "participants_institution"."region_id" FROM "participants_institution" WHERE "participants_institution"."id" = %s

  • 46 x UPDATE "draw_debate" SET "round_id" = %s, "venue_id" = %s, "division_id" = NULL, "bracket" = %s, "room_rank" = %s, "time" = NULL, "flags" = %s, "importance" = %s, "result_status" = %s, "sides_confirmed" = %s WHERE "draw_debate"."id" = %s

  • 25 x SELECT "venues_venue"."id", "venues_venue"."name", "venues_venue"."priority", "venues_venue"."tournament_id" FROM "venues_venue" INNER JOIN "venues_venuecategory_venues" ON ("venues_venue"."id" = "venues_venuecategory_venues"."venue_id") WHERE "venues_venuecategory_venues"."venuecategory_id" = %s ORDER BY "venues_venue"."name" ASC

  • 25 x SELECT "tournaments_tournament"."id", "tournaments_tournament"."name", "tournaments_tournament"."short_name", "tournaments_tournament"."seq", "tournaments_tournament"."slug", "tournaments_tournament"."current_round_id", "tournaments_tournament"."active" FROM "tournaments_tournament" WHERE "tournaments_tournament"."id" = %s

  • 25 x SELECT "venues_venuecategory"."id", "venues_venuecategory"."name", "venues_venuecategory"."description", "venues_venuecategory"."display_in_venue_name", "venues_venuecategory"."display_in_public_tooltip" FROM "venues_venuecategory" WHERE "venues_venuecategory"."id" = %s

(Reposted this comment to tag @czlee)

For whenever you had time to look at this:

  • The SELECT "participants_institution" query is related to the logger.info("Constraints on %s: %s", debate, constraints) line — in constructing the debate information it needs to pull in a bunch of participant data. Could we remove the logging function here, or only add it on DEBUG? Alternately perhaps we could just log the debate.id
  • The 25 x SELECT "venues_venue"."id" query is related to the eligible_venues = set(highest_constraint.category.venues.all()) & set(self._all_venues). I'm not super familiar with this code but given its happening within a loop, can highest_constraint.category.venues.all() be called outside the loop, or does it need to update live? There is a similar issue with constraint_venues = set(constraint.category.venues.all())
  • For the iterated debate.save() I wonder if its possible to use a a F() expression to update these? Probably not unless maybe the new venue was written as an annotation? Otherwise I think the appropriate method would be to use an atomic transaction for that loop to minimise round-tripping.

@tienne-B
Copy link
Member

tienne-B commented Aug 4, 2018

@czlee
Copy link
Member

czlee commented Aug 4, 2018

Regarding atomic transactions, just posting a link to the comment thread at 7a67893#r29956843. TL;DR: Atomic transactions won't reduce round trips to the database and we shouldn't use them for performance purposes.

More generally, I sort of think we shouldn't try to optimise N+1 issues involving INSERTs or UPDATEs, for the reasons discussed in 7a67893#r29957930, if you'll pardon the quoting myself as I try to move that comment thread here:

Where possible, bulk_creates are (I'm sure you'd agree) the best way to reduce N+1 issues with INSERT statements. But I'm a little reluctant to make this a standard pattern of ours. The bulk_create() method also bypasses the save() function, which for some of our models does important things, and unless we have good systems for tracking what bulk_creates happen where, I worry that one day we'll add a save() method to a model that doesn't already have one, and break some bulk_create() in a file we forgot was there, but it'll be latent for ages and it'll show up as a really, really obscure bug and we'll have to spend ages tracking it down, if we succeed at all. Weighing up that against some slow transactions when a lot of things are being saved—I think I'm okay with transactions that involve inserts/updates (as opposed to selects) taking quite a long time, if it gives me more confidence that it's being done right.

Regarding the debug line, I definitely agree that debug lines shouldn't necessitate prefetches, I'll have a look at it and consider removing or simplifying the debug line. The debate.id idea sounds good.

@philipbelesky
Copy link
Member Author

philipbelesky commented Sep 2, 2018

Details for a few recent additions (tournament rounds = 6; rooms = 9):

results.views.PublicResultsForRoundView

Repeat 49 times SELECT "tournaments_tournament"."id", "tournaments_tournament"."name", "tournaments_tournament"."short_name", "tournaments_tournament"."seq", "tournaments_tournament"."slug", "tournaments_tournament"."current_round_id", "tournaments_tournament"."active" FROM "tournaments_tournament" WHERE "tournaments_tournament"."id" = %s

participants.views.PublicTeamRecordView / participants.views.AdjudicatorRecordView

Repeated 9 times SELECT "tournaments_tournament"."id", "tournaments_tournament"."name", "tournaments_tournament"."short_name", "tournaments_tournament"."seq", "tournaments_tournament"."slug", "tournaments_tournament"."current_round_id", "tournaments_tournament"."active" FROM "tournaments_tournament" WHERE "tournaments_tournament"."id" = %s

Repeated 6 times SELECT "tournaments_round"."id", "tournaments_round"."tournament_id", "tournaments_round"."seq", "tournaments_round"."name", "tournaments_round"."abbreviation", "tournaments_round"."stage", "tournaments_round"."draw_type", "tournaments_round"."break_category_id", "tournaments_round"."draw_status", "tournaments_round"."feedback_weight", "tournaments_round"."silent", "tournaments_round"."motions_released", "tournaments_round"."starts_at" FROM "tournaments_round" WHERE "tournaments_round"."id" = %s

tournaments.views.TournamentAdminHomeView

Repeated 7 times SELECT "draw_debate"."id", "draw_debate"."round_id", "draw_debate"."venue_id", "draw_debate"."division_id", "draw_debate"."bracket", "draw_debate"."room_rank", "draw_debate"."time", "draw_debate"."flags", "draw_debate"."importance", "draw_debate"."result_status", "draw_debate"."sides_confirmed" FROM "draw_debate" WHERE "draw_debate"."id" = %s

Repeated 9 times SELECT "draw_debateteam"."id", "draw_debateteam"."debate_id", "draw_debateteam"."team_id", "draw_debateteam"."side", "draw_debateteam"."flags", "draw_debate"."id", "draw_debate"."round_id", "draw_debate"."venue_id", "draw_debate"."division_id", "draw_debate"."bracket", "draw_debate"."room_rank", "draw_debate"."time", "draw_debate"."flags", "draw_debate"."importance", "draw_debate"."result_status", "draw_debate"."sides_confirmed", "participants_team"."id", "participants_team"."reference", "participants_team"."short_reference", "participants_team"."code_name", "participants_team"."short_name", "participants_team"."long_name", "participants_team"."institution_id", "participants_team"."tournament_id", "participants_team"."division_id", "participants_team"."use_institution_prefix", "participants_team"."type", "participants_team"."emoji" FROM "draw_debateteam" INNER JOIN "draw_debate" ON ("draw_debateteam"."debate_id" = "draw_debate"."id") INNER JOIN "participants_team" ON ("draw_debateteam"."team_id" = "participants_team"."id") WHERE "draw_debateteam"."debate_id" = %s

Repeated 11 times SELECT "tournaments_tournament"."id", "tournaments_tournament"."name", "tournaments_tournament"."short_name", "tournaments_tournament"."seq", "tournaments_tournament"."slug", "tournaments_tournament"."current_round_id", "tournaments_tournament"."active" FROM "tournaments_tournament" WHERE "tournaments_tournament"."id" = %s

Repeated 11 times SELECT "tournaments_round"."id", "tournaments_round"."tournament_id", "tournaments_round"."seq", "tournaments_round"."name", "tournaments_round"."abbreviation", "tournaments_round"."stage", "tournaments_round"."draw_type", "tournaments_round"."break_category_id", "tournaments_round"."draw_status", "tournaments_round"."feedback_weight", "tournaments_round"."silent", "tournaments_round"."motions_released", "tournaments_round"."starts_at" FROM "tournaments_round" WHERE "tournaments_round"."id" = %s

draw.views.CreateDrawView

Repeated 48 times INSERT INTO "draw_debateteam" ("debate_id", "team_id", "side", "flags") VALUES (%s, %s, %s, %s) RETURNING "draw_debateteam"."id"

Repeated 12 times UPDATE "draw_debate" SET "round_id" = %s, "venue_id" = %s, "division_id" = NULL, "bracket" = %s, "room_rank" = %s, "time" = NULL, "flags" = %s, "importance" = %s, "result_status" = %s, "sides_confirmed" = %s WHERE "draw_debate"."id" = %s

Repeated 12 times INSERT INTO "draw_debate" ("round_id", "venue_id", "division_id", "bracket", "room_rank", "time", "flags", "importance", "result_status", "sides_confirmed") VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s) RETURNING "draw_debate"."id"

adjallocation.views.CreateAutoAllocation

Various / many.

philipbelesky added a commit that referenced this issue Sep 2, 2018
Currently get_side_names and record_links on record pages lookup objects per team/adj. Given this is always going to be for the tournament that the page is attached to it seems better to pass that value
@czlee
Copy link
Member

czlee commented Sep 3, 2018

Re draw.views.CreateDrawView, just noting my experiments with bulk_create() here: 102fc4a

Despite this proof of concept, as explained above, I'd prefer that for database writes (INSERT and UPDATE), that we don't bother trying to resolve the N+1 inefficiency, and instead just take the hit. I'm open to arguments to the contrary, but for now I think the additional risk of bugs (intrinsic to using bulk_create()) is too high to offset the time cost for functions that are used once per round.

@philipbelesky
Copy link
Member Author

Yea, for both create draw and auto-allocate they aren't exactly commonly-hit functions so there isn't a general need to speed their performance at the cost of reliability. The only concern is if they run up against the dyno timeout but that is probably better handled by shifting their calculation to Celery.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted performance Issues related to the performance
Projects
None yet
Development

No branches or pull requests

3 participants