Skip to content

[v2] ci: run bm25 and ColBERT test in ci#1829

Merged
Samoed merged 17 commits intoembeddings-benchmark:v2.0.0from
sam-hey:ci/test_colbert_bm25s
Jan 21, 2025
Merged

[v2] ci: run bm25 and ColBERT test in ci#1829
Samoed merged 17 commits intoembeddings-benchmark:v2.0.0from
sam-hey:ci/test_colbert_bm25s

Conversation

@sam-hey
Copy link
Contributor

@sam-hey sam-hey commented Jan 17, 2025

cI: Enable skipped tests for ColBERT and BM25
fix: get model meta from CrossEncoder
ref: use tmp_path for tests
test: add tests for model_meta

Checklist

  • Run tests locally to make sure nothing is broken using make test.
  • Run the formatter to format the code using make lint.

start in: #1759 (comment)

Copy link
Contributor

@KennethEnevoldsen KennethEnevoldsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only really checks if they can be installed. We should add an actual test as well?

@sam-hey
Copy link
Contributor Author

sam-hey commented Jan 17, 2025

Hi @KennethEnevoldsen,

I added two tests in this PR. However, they are not being run in the CI because the dependencies don't exist.

@KennethEnevoldsen
Copy link
Contributor

Ahh thanks for the pointer. Any reason not to remove the skip? (I could later clean up the installs without knowing that I disabled a test).

Will you also add a time diff? (coverage report would also be nice, but not required)

@sam-hey
Copy link
Contributor Author

sam-hey commented Jan 17, 2025

The current issue with PyLate (referenced in GitHub Issue #79) is related to compatibility problems with older Python versions, which is causing the tests for Python 3.9 to fail.

Not skipping would require all developers to install additional dependencies: pylate, bm25s, and pystemmer. While this is feasible, it does introduce more requirements for everyone on the team. This decision is ultimately up to you, and I am fine with proceeding if you decide to include them.

@sam-hey sam-hey marked this pull request as draft January 17, 2025 15:57
@sam-hey
Copy link
Contributor Author

sam-hey commented Jan 17, 2025

Tests conducted on v2.0.0 with Python 3.11:

checks occasionally take up to 12 minutes.
This branch test runtime: 8-9 minutes.

Performance benchmarks:
ColBERT + BM25: Test execution on my machine completes in 15.17 seconds.

Test coverage summary:
Overall test coverage: 66%.
mteb/models/colbert_models.py: 63%
mteb/models/bm25.py: 94%

@sam-hey sam-hey marked this pull request as ready for review January 17, 2025 21:52
@sam-hey
Copy link
Contributor Author

sam-hey commented Jan 18, 2025

I believe I’ve added everything for now and would be happy to get your review of the changes!

  • Skipping pylate: Skipped for Python 3.9.
  • Updates:
    • Added extra tests.
    • Introduced ModelMeta for CrossEncoders passed as models to run().
  • Pending Decision: Awaiting your input on skipping BM25s and ColBERT if the version is not available.

@KennethEnevoldsen

@sam-hey
Copy link
Contributor Author

sam-hey commented Jan 21, 2025

@Samoed can you give me a review if you have some time?

Copy link
Member

@Samoed Samoed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

@sam-hey sam-hey force-pushed the ci/test_colbert_bm25s branch from a29aa75 to 893d804 Compare January 21, 2025 09:33
@sam-hey
Copy link
Contributor Author

sam-hey commented Jan 21, 2025

Why was MaxSim renamed to max_sim? This change breaks ColBERT. @Samoed, I’ll fix it and add a test to prevent similar issues in the future.

@Samoed
Copy link
Member

Samoed commented Jan 21, 2025

max_sim was used in main branch and I thought it was tested and worked with this name

@Samoed Samoed changed the title ci: run bm25 and ColBERT test in ci [v2] ci: run bm25 and ColBERT test in ci Jan 21, 2025
@sam-hey
Copy link
Contributor Author

sam-hey commented Jan 21, 2025

max_sim was used in main branch and I thought it was tested and worked with this name

The max_sim function was updated to MaxSim to align with the upcoming pylate integration in a future release.

@Samoed Samoed merged commit 6da8a13 into embeddings-benchmark:v2.0.0 Jan 21, 2025
11 checks passed
@sam-hey
Copy link
Contributor Author

sam-hey commented Jan 21, 2025

Awesome, thanks a lot!

@Samoed Samoed added the v2 label Sep 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants