
[Enh]: Better benchmarking routine #805

Open · FBruzzesi opened this issue Aug 17, 2024 · 4 comments

@FBruzzesi
Member

We would like to learn about your use case. For example, if this feature is needed to adopt Narwhals in an open source project, could you please enter the link to it below?

No response

Please describe the purpose of the new feature or describe the problem to solve.

Some features require extra attention, or are worth benchmarking to decide whether they are worth implementing at all. For example, I am thinking of #500 and #743.

Suggest a solution if possible.

I checked how other libraries handle this, specifically pydantic. They use CodSpeed, which seems to have a free tier for public repos.

The question is: what to benchmark?! Would running the TPC-H queries on main vs the branch be a reasonable test?
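
For reference, a minimal sketch of what a marked benchmark could look like with pytest-codspeed, assuming its `@pytest.mark.benchmark` marker; the test body and column names are illustrative, not an existing Narwhals test:

```python
# Minimal sketch, assuming pytest-codspeed is installed and the suite is
# run with `pytest --codspeed`. The DataFrame and expression are
# illustrative placeholders.
import pandas as pd
import pytest

import narwhals as nw


@pytest.mark.benchmark
def test_select_sum() -> None:
    df = nw.from_native(pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}))
    df.select((nw.col("a") + nw.col("b")).alias("a_plus_b"))
```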

If you have tried alternatives, please describe them below.

Currently it is a very manual effort on Kaggle.

Additional information that may help us understand your needs.

No response

@DeaMariaLeon
Member

DeaMariaLeon commented Aug 18, 2024

Hey, I installed CodSpeed for benchmarking on pydata/sparse. I'm transforming benchmarks from asv to CodSpeed "as we speak".

> Would running the TPC-H queries on main vs the branch be a reasonable test?

CodSpeed tests the opened PR against main, if that was the question. 🤔
It can be set up to block the merge if there is a regression, and it posts the report as a comment on the PR.
If you need details (or help) let me know.
Those are my 15 cents. 😇
edit:
Benchmarks for CodSpeed run in the CI. The TPC-H queries test with a lot of data, don't they?

@FBruzzesi
Member Author

Hey Dea, thanks for the input. That's what happens when I open issues in a rush. Let me try to clarify some points and ideas.

My understanding is that one can mark some tests for benchmarking, and I am wondering what these tests could be.

> The TPC-H queries test with a lot of data, don't they?

One option is to run the TPC-H queries on the subset of the data we have in the tests/data/ folder. It should not take as long as the actual TPC-H benchmark.
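
A hedged sketch of what that could look like, using pytest-codspeed's `benchmark` fixture; the file path under tests/data/ is hypothetical, and the query is a simplified stand-in loosely following TPC-H q1:

```python
# Hedged sketch: benchmark a TPC-H-q1-style aggregation on a small
# lineitem sample via pytest-codspeed's `benchmark` fixture. The file
# path is hypothetical and the query is a simplified stand-in for q1.
import pandas as pd

import narwhals as nw


def test_q1_small(benchmark) -> None:
    lineitem = pd.read_parquet("tests/data/lineitem.parquet")  # hypothetical path

    def run():
        df = nw.from_native(lineitem)
        return nw.to_native(
            df.group_by("l_returnflag", "l_linestatus").agg(
                nw.col("l_quantity").sum().alias("sum_qty"),
                nw.col("l_extendedprice").mean().alias("avg_price"),
            )
        )

    benchmark(run)
```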

> CodSpeed tests the opened PR against main, if that was the question. 🤔

Yes, that is exactly my point: PR (branch) vs main, so I am getting the process right 👌
I wonder, though, if it could be trigger-only, as it is definitely overkill for most PRs.

> If you need details (or help) let me know.

I have never used it so far; I am happy to give it a spin, but expect to be pinged for help 🙈

@DeaMariaLeon
Member

> I wonder, though, if it could be trigger-only, as it is definitely overkill for most PRs.

It doesn't say in the documentation. I guess one could do it "somehow" with the CI, but I don't think it's an out-of-the-box option. You can choose whether the report is sent to the PR every time, or only if there is a failure/improvement... but that's all they mention.

> but expect to be pinged for help

I truly doubt that you'll ever need help from me 😁 .. but sure!

@FBruzzesi
Member Author

Commenting to discuss an idea: as plotly is understandably concerned about performance, maybe we could use the script they shared to assess whether we have a performance drop.
