slow execution of complex query #4281
Comments
@max-hoffman will dig into this one when he gets a chance. |
Some additional notes:
We can force index lookups in other ways if this is a sensitive query that needs to be optimized in a hurry. |
Thanks for the feedback. I had noticed the import error and was going to report that separately. No need now, I suppose :-) Thanks for the The query is not sensitive, but I imagine I will have many other queries which would exhibit the same performance problems. Because actually my regular queries wouldn't do a |
Small status update, I've identified the issue and am making progress in this PR dolthub/go-mysql-server#1247. If I move |
Sounds excellent! I tried doing what you describe (inline CTE in FROM clause) but with my version of Dolt I didn't see much of a difference. Regarding the other two problems we briefly discussed here, do you need issues for them? |
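For readers unfamiliar with the rewrite discussed above, here is a minimal sketch of what "inlining a CTE in the FROM clause" means. It uses Python's built-in sqlite3 module purely as a stand-in engine and a toy two-table schema; the table and column names are borrowed from the query in this issue, but the rows (including the 'WAK-NR' value) are made up for illustration. It only shows that the two forms are semantically equivalent and says nothing about how Dolt plans either one.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in; this is not Dolt.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE kommission (id INTEGER PRIMARY KEY, abkuerzung TEXT);
CREATE TABLE in_kommission (parlamentarier_id INTEGER, kommission_id INTEGER);
-- Hypothetical sample rows, not taken from lobbywatch.zip.
INSERT INTO kommission VALUES (1, 'SGK-NR'), (2, 'SGK-SR'), (3, 'WAK-NR');
INSERT INTO in_kommission VALUES (100, 1), (101, 2), (102, 3), (103, 1);
""")

# Form 1: the filter lives in a CTE that is joined by name.
cte_sql = """
WITH sgk AS (
    SELECT id AS kommission_id FROM kommission
    WHERE abkuerzung IN ('SGK-NR', 'SGK-SR')
)
SELECT count(*) FROM in_kommission ik
JOIN sgk ON sgk.kommission_id = ik.kommission_id
"""

# Form 2: the same subquery inlined as a derived table in the FROM clause.
inline_sql = """
SELECT count(*) FROM in_kommission ik
JOIN (
    SELECT id AS kommission_id FROM kommission
    WHERE abkuerzung IN ('SGK-NR', 'SGK-SR')
) sgk ON sgk.kommission_id = ik.kommission_id
"""

cte_count = conn.execute(cte_sql).fetchone()[0]
inline_count = conn.execute(inline_sql).fetchone()[0]
print(cte_count, inline_count)
```

Both statements return the same count, so any difference in runtime between the two forms on a given engine comes from the planner rather than from the query's meaning.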
Yes the main issue is the join order, which will only be in that feature branch for now. Moving the condition to a We appreciate the bug reports! Keep them coming. |
You should usually file a bug before resorting to this, but you can also manually set the join order with |
I see, Oracle-style hints. Are these documented somewhere? |
No. Maybe we should document them? |
Resolving. Made docs ticket for the join hints. |
Oops. Reopening until PR is merged. |
I need to rewrite join ordering generation to get the final edge cases. I hope to finish by Friday. |
@knutwannheden I ended up rewriting our join ordering algorithm to fix this. I am currently adding our missing join operators (SemiJoin, AntiJoin, and FullOuterJoin) to make it easier to test the rewrite. After that I will redo our index selection, which should be easier now that we only generate correctly indexable plans. Together, these improvements will make us lighter on our feet when reacting to join queries in the future. Thank you for your patience! Keep the bugs coming. |
The new runtime with the joins PR (dolthub/go-mysql-server#1290) and the pending subquery bug fix (dolthub/go-mysql-server#1350) should be more satisfactory, though there is more room for improving the join plan when we get to costing filters:
time dolt sql -q "with kommission_interessengruppe as (select k.id kommission_id, i.id interessengruppe_id from kommission k join branche b on k.id = b.kommission_id join interessengruppe i on b.id = i.branche_id where k.abkuerzung in ('SGK-NR', 'SGK-SR')) select count(*) from parlamentarier pa join partei p on pa.partei_id = p.id join interessenbindung ib on ib.parlamentarier_id = pa.id left join (select * from interessenbindung_jahr where id in (select max(id) from interessenbindung_jahr group by interessenbindung_id)) ibj on ibj.interessenbindung_id = ib.id join organisation o on ib.organisation_id = o.id join in_kommission ik on pa.id = ik.parlamentarier_id join kommission_interessengruppe ki on ki.kommission_id = ik.kommission_id and ki.interessengruppe_id in (o.interessengruppe_id, o.interessengruppe2_id, o.interessengruppe3_id) where now() between pa.im_rat_seit and coalesce(pa.im_rat_bis, now()) and now() between coalesce(ib.von, ib.created_date) and coalesce(ib.bis, now()) and now() between ib.von and coalesce(ik.bis, now()) and ib.art != 'mitglied' and o.rechtsform not in ('Parlamentarische Gruppe', 'Informelle Gruppe') ;"
+----------+
| count(*) |
+----------+
| 62 |
+----------+
________________________________________________________
Executed in 709.37 millis fish external
usr time 733.86 millis 205.00 micros 733.65 millis
sys time 75.79 millis 667.00 micros 75.12 millis |
Sounds like excellent progress! Thanks for the update. |
Release |
The following somewhat complex query is very slow on Dolt compared to MySQL. In MySQL the query runs in about 110 ms (on my machine), while in Dolt it takes around 35 s to complete, a slowdown of more than 300x:
I have attached the SQL script required to create the schema and data here: lobbywatch.zip
Here is the execution plan in MySQL 8:
max(id)
))
When I try to run EXPLAIN for the query in Dolt I get an error:
2022-09-07T15:59:47+02:00 WARN [conn 3] error running query {connectTime=2022-09-07T15:59:11+02:00, connectionDb=lobbywatch_public, error=string ' │ │ │ │ │ │ └─ columns: [id nachname vorname vorname_kurz zweiter_vorname buergerorte rat_id kanton_id kommissionen partei_id parteifunktion fraktion_id fraktionsfunktion im_rat_seit im_rat_bis ratswechsel ratsunterbruch_von ratsunterbruch_bis beruf beruf_fr beruf_interessengruppe_id titel aemter weitere_aemter zivilstand anzahl_kinder militaerischer_grad_id geschlecht geburtstag photo_dateiname photo_dateierweiterung photo_dateiname_voll photo_mime_type kleinbild sitzplatz email_2 homepage homepage_2 parlament_biografie_id parlament_number parlament_beruf_json parlament_interessenbindungen parlament_interessenbindungen_json parlament_interessenbindungen_updated twitter_name instagram_profil youtube_user linkedin_profil_url xing_profil_name facebook_name wikipedia wikidata_qid sprache arbeitssprache adresse_firma adresse_plz adresse_ort erfasst autorisierung_reminder_verschickt_visa autorisierung_reminder_verschickt_datum autorisiert_datum freigabe_datum created_date updated_date]' is too large for column 'varchar(1000)', query=/* ApplicationName=IntelliJ IDEA 2022.2.2 Preview */ explain with kommission_interessengruppe as (select k.id kommission_id, i.id interessengruppe_id from kommission k join branche b on k.id = b.kommission_id join interessengruppe i on b.id = i.branche_id where k.abkuerzung in ('SGK-NR', 'SGK-SR')) select count(*) from parlamentarier pa join partei p on pa.partei_id = p.id join interessenbindung ib on ib.parlamentarier_id = pa.id left join (select * from interessenbindung_jahr where id in (select max(id) from interessenbindung_jahr group by interessenbindung_id)) ibj on ibj.interessenbindung_id = ib.id join organisation o on ib.organisation_id = o.id join in_kommission ik on pa.id = ik.parlamentarier_id join kommission_interessengruppe ki on ki.kommission_id = ik.kommission_id and ki.interessengruppe_id in (o.interessengruppe_id, o.interessengruppe2_id, o.interessengruppe3_id) where now() between pa.im_rat_seit and coalesce(pa.im_rat_bis, now()) and now() between coalesce(ib.von, ib.created_date) and coalesce(ib.bis, now()) and now() between ib.von and coalesce(ik.bis, now()) and ib.art != 'mitglied' and o.rechtsform not in ('Parlamentarische Gruppe', 'Informelle Gruppe')}
I am sure the query can be reduced to a much simpler one that shows the same performance problem. Please let me know if I can help with anything.
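As a starting point for such a reduction, one fragment that can be tested in isolation is the left-joined subquery that keeps only the newest interessenbindung_jahr row per interessenbindung_id. The sketch below runs that fragment on its own against a toy table, again using Python's sqlite3 module as a stand-in engine. The `jahr` column and all rows are hypothetical, and whether this fragment is the slow part on Dolt is an assumption that would need to be checked against the real data.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in; this is not Dolt.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE interessenbindung_jahr (
    id INTEGER PRIMARY KEY,
    interessenbindung_id INTEGER,
    jahr INTEGER  -- hypothetical column, for illustration only
);
-- Hypothetical sample rows: two versions each for 10 and 11, one for 12.
INSERT INTO interessenbindung_jahr VALUES
    (1, 10, 2020), (2, 10, 2021),
    (3, 11, 2021), (4, 11, 2022),
    (5, 12, 2022);
""")

# Same shape as the subquery in the reported query: keep the row with the
# highest id within each interessenbindung_id group.
latest_sql = """
SELECT id, interessenbindung_id, jahr
FROM interessenbindung_jahr
WHERE id IN (
    SELECT max(id) FROM interessenbindung_jahr GROUP BY interessenbindung_id
)
ORDER BY id
"""

latest_rows = conn.execute(latest_sql).fetchall()
for row in latest_rows:
    print(row)
```

Timing this fragment alone, then adding the joins from the full query back one at a time, is one way to narrow down which part dominates the runtime.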