Issue or Question or ?: large gtfs without calendar (only calendar_dates) #79

Open
vingerha opened this issue Nov 6, 2023 · 2 comments

Comments

@vingerha
Contributor

vingerha commented Nov 6, 2023

Hi,
I am remodelling the GTFS solution in Home Assistant and have a use case with a large file from the NL; the resulting SQLite database grows to 7 GB.
As this dataset does not contain calendar entries, I need to rewrite the query to compensate for that, and since sqlite does not allow outer joins I need to run it twice with a UNION ALL. Due to the large amount of data the query is pretty slow (DB Browser: 20-23 sec), and I was wondering if I could optimize this with indexes. This I will do myself, but 2 questions:

  • can you easily add indexes in pygtfs? Otherwise I need to add them in my solution, which I would prefer not to do
  • would you maybe know of a way to construct the calendar with only calendar_dates entries? I may try myself, but asking first
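
One possible approach to the second question, as a rough sketch only: synthesize one calendar row per service_id from the calendar_dates entries, with all weekday flags set to 0 and the date range spanning the first and last added date (exception_type = 1 means "service added" in GTFS). With such rows in place, a query that inner-joins calendar might work without the UNION ALL. The table layout below is a simplified assumption and not necessarily the schema pygtfs creates; `synthesize_calendar` is a hypothetical helper name. As an aside, SQLite has always supported LEFT OUTER JOIN, and RIGHT and FULL OUTER JOIN were added in version 3.39.0.

```python
import sqlite3


def synthesize_calendar(conn: sqlite3.Connection) -> None:
    """Insert a calendar row for every service_id that exists only in calendar_dates.

    Assumed (simplified) schema: calendar(service_id, monday..sunday,
    start_date, end_date) and calendar_dates(service_id, date, exception_type).
    """
    conn.execute(
        """
        INSERT INTO calendar (service_id, monday, tuesday, wednesday, thursday,
                              friday, saturday, sunday, start_date, end_date)
        SELECT cd.service_id, 0, 0, 0, 0, 0, 0, 0, MIN(cd.date), MAX(cd.date)
        FROM calendar_dates AS cd
        WHERE cd.exception_type = 1          -- 1 = service added on that date
          AND NOT EXISTS (SELECT 1 FROM calendar AS c
                          WHERE c.service_id = cd.service_id)
        GROUP BY cd.service_id
        """
    )
    conn.commit()


# Small in-memory demo with made-up data.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE calendar (
        service_id TEXT PRIMARY KEY,
        monday INT, tuesday INT, wednesday INT, thursday INT,
        friday INT, saturday INT, sunday INT,
        start_date TEXT, end_date TEXT
    );
    CREATE TABLE calendar_dates (service_id TEXT, date TEXT, exception_type INT);
    INSERT INTO calendar_dates VALUES
        ('S1', '2023-11-06', 1),
        ('S1', '2023-11-08', 1),
        ('S2', '2023-11-07', 1),
        ('S2', '2023-11-09', 2);
    """
)
synthesize_calendar(conn)
rows = conn.execute(
    "SELECT service_id, start_date, end_date FROM calendar ORDER BY service_id"
).fetchall()
print(rows)
```

Because every weekday flag is 0, the synthesized rows never activate a service on their own; the dates still come only from calendar_dates. Calling the helper a second time inserts nothing, since the NOT EXISTS check skips services that already have a calendar row.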
@vingerha
Contributor Author

vingerha commented Nov 6, 2023

A massive improvement comes from adding an index on stop_times stop_id and trip_id: the query now runs in under 1 s.
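
For reference, a minimal sketch of how such indexes could be added from Python with the standard sqlite3 module. It assumes a stop_times table with stop_id and trip_id columns; the index names and the helper `add_stop_times_indexes` are made up for illustration, and whether two single-column indexes or one composite index works better probably depends on the query.

```python
import sqlite3


def add_stop_times_indexes(conn: sqlite3.Connection) -> None:
    """Create single-column indexes on stop_times if they do not exist yet."""
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_stop_times_stop_id ON stop_times (stop_id)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_stop_times_trip_id ON stop_times (trip_id)"
    )
    conn.commit()


# In-memory demo; with the real file this would be sqlite3.connect(<path to db>).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stop_times (trip_id TEXT, stop_id TEXT, "
    "arrival_time TEXT, stop_sequence INT)"
)
add_stop_times_indexes(conn)

# Ask SQLite how it would run a lookup by stop_id.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM stop_times WHERE stop_id = ?", ("X",)
).fetchall()
for row in plan:
    print(row[-1])  # last column holds the human-readable plan step
```

EXPLAIN QUERY PLAN is a quick way to check that the slow query actually picks up the new indexes instead of scanning the whole table.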

@vingerha vingerha closed this as completed Nov 6, 2023
@vingerha vingerha reopened this Nov 6, 2023
@vingerha
Contributor Author

vingerha commented Nov 6, 2023

Sorry, closed incorrectly
