Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transaction log parsing performance regression #2760

Closed
Tom-Newton opened this issue Aug 12, 2024 · 12 comments
Closed

Transaction log parsing performance regression #2760

Tom-Newton opened this issue Aug 12, 2024 · 12 comments
Labels
binding/rust Issues for the Rust crate bug Something isn't working

Comments

@Tom-Newton
Copy link
Contributor

Tom-Newton commented Aug 12, 2024

Environment

Delta-rs version: 0.18.2

Binding: Python

Environment:

  • Cloud provider: Azure
  • OS: Ubuntu 20.04
  • Other: Python 3.10

Bug

What happened:
Performance regression in transaction log parsing compared to deltalake 0.10.1
image
Y axis is time in seconds.

What you expected to happen:
New versions of deltalake to be the same or better performance than older versions.

How to reproduce it:

deltalake.DeltaTable(
    "abfss://<container>@<storage account>.dfs.core.windows.net/path/to/table/with/long/transaction_log",
    storage_options={"use_azure_cli": "true"},
)

Compare the time taken for this compared to the same thing when using deltalake 0.10.1.

More details:
I think I have identified the 2 reasons for the performance regression.

  1. deltalake now relies on ObjectStore.list_with_offset which uses an inefficient implementation for Azure but is probably advantageous on GCS or S3.
    1. I'm not particularly worried about this. Use reconstructed ListBlobs marker to provide list offset support in MicrosoftAzure store apache/arrow-rs#6174 (comment) is trying to solve it and we're trying to get Azure to help with that.
    2. I have a temporary solution that re-implements the old behaviour of deltalake where it iteratively checks if commit versions exist, instead of using a list operation. This gives us the "0.18.2 modified" results in my graph above.
    3. For the record list_with_offset was introduced by feat: buffered reading of transaction logs #1549
  2. It looks like the latest checkpoint and subsequent log entries are download from blob storage twice.
    1. First downloaded to get the table protocol and metadata, second download to get relevant parquet files
    2. Looks like this behaviour was introduced by feat: arrow backed log replay and table state #2037
    3. This one concerns me a bit more. Is there any plan to try to improve this?

(Reporting as a bug seemed more appropriate than a feature request but is not ideal)

@Tom-Newton Tom-Newton added the bug Something isn't working label Aug 12, 2024
@rtyler rtyler added the binding/rust Issues for the Rust crate label Aug 12, 2024
@rtyler
Copy link
Member

rtyler commented Aug 12, 2024

interesting find, thanks for the report! 👋

@ion-elgreco
Copy link
Collaborator

@Tom-Newton If I am bored next weekend, I can take a look at the second issue ; )

@ion-elgreco
Copy link
Collaborator

@Tom-Newton does this have some negligible change for you? #2764

@Tom-Newton
Copy link
Contributor Author

Thanks @ion-elgreco it looks like this did provide a measurable improvement
image
Middle is what I previously called "0.18.2 modified", right is the same again but additionally with #2764

It looks like its still about 41-55% slower than deltalake 0.10.1 though. I have some network download stats and it looks like its still downloaded about 120MB just to initialise deltalake.DeltaTable() while the same test downloaded about 60MB on 0.10.1.

@ion-elgreco
Copy link
Collaborator

@Tom-Newton awesome! Thanks for checking this!

Yeah so like you mentioned before we read the files twice.

I was thinking of perhaps modifying the commit_stream/checkpoint_stream so that it caches the result prior to returning. Then the stream function should also be able to drain the cache when it's executed for the second time.

@rtyler
Copy link
Member

rtyler commented Aug 14, 2024

@Tom-Newton I'm curious if you can run your test with the latest 0.19.0 that just released, I'm very curious how well @roeap 's optimizations might have also helped your usecase 😄

@Tom-Newton
Copy link
Contributor Author

Its looking good
image
The latest version now slightly outperforms 0.10.1 and it only downloads 30MiB compared to 120MiB previously (30MiB matches up nicely with the size of the checkpoint plus subsequent commits) 🎉.

I assume it was #2772 that made the difference, but I'm a bit confused about how.

@rtyler
Copy link
Member

rtyler commented Aug 15, 2024

✨ magic! ✨

more efficient use of the checkpoints and other metadata in the log is likely causing fewer log files to be fetched from storage. Either way, thank you so much for providing pretty graphs! I'm going to stick a fork in this turkey and call it done!

@rtyler rtyler closed this as completed Aug 15, 2024
@ion-elgreco
Copy link
Collaborator

@rtyler I do believe we can squeeze more performance out though

@Tom-Newton
Copy link
Contributor Author

I think we are still reading the checkpoint and subsequent log files twice but with the new optimisations we're taking advantage of the features of parquet to not download columns and row groups of the checkpoint that we don't need. This seems to be enough to edge out the 0.10.1 performance even when doing that twice.

Anyway, thanks everyone 🙂. I'm impressed how quickly this was resolved.

@ion-elgreco ion-elgreco reopened this Aug 15, 2024
@ion-elgreco
Copy link
Collaborator

ion-elgreco commented Aug 15, 2024

I'll leave it open though, to not forget to look into this


Actually will just create new issue instead

@alexwilcoxson-rel
Copy link
Contributor

saw this late but one thing that may have helped also was #2717 this should result in only reading the columns of parquet necessary from the checkpoint for the actions being queried

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
binding/rust Issues for the Rust crate bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants