Skip to content

refactor: Cache, reuse Implementation._backend_version#2764

Merged
MarcoGorelli merged 34 commits intomainfrom
backend-version-cache
Jul 3, 2025
Merged

refactor: Cache, reuse Implementation._backend_version#2764
MarcoGorelli merged 34 commits intomainfrom
backend-version-cache

Conversation

@dangotbanned
Copy link
Member

@dangotbanned dangotbanned commented Jun 30, 2025

What type of PR is this? (check all applicable)

  • 💾 Refactor
  • ✨ Feature
  • 🐛 Bug Fix
  • 🔧 Optimization
  • 📝 Documentation
  • ✅ Test
  • 🐳 Other

Related issues

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

If you have comments or can explain your changes, please do so below

The diff is big, but the changes boil down to:

  1. Implementation._backend_version is now cached (5fac883)
  2. We don't store and pass around the version tuple on every class

The validation of backend_version occurs before returning the cached value.
Therefore, if we manage to access it - it must be valid - so we don't need to repeat that validation for every DataFrame, LazyFrame, Series.

In the cases where we might be importing a backend for the first time (lazy|collect), the version will either be

  1. Validated for the first time, or
  2. Returned (as the validation was cached)

Tasks

Most changes are factoring-out _backend_version from Compliant* to use Implementation._backend_version() instead:

Now `mypy` understands the same error as `pyright` 😏
- Seemed the easiest way to get coverage
- Will split into another PR
- https://ibis-project.org/reference/expression-tables#ibis.memtable
Was broken on windows
> UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 21406: character maps to <undefined>
@dangotbanned dangotbanned changed the title chore(RFC): Cache Implementation._backend_version chore: Cache Implementation._backend_version Jul 1, 2025
@dangotbanned dangotbanned changed the title chore: Cache Implementation._backend_version refactor: Cache, reuse Implementation._backend_version Jul 1, 2025
@dangotbanned dangotbanned marked this pull request as ready for review July 1, 2025 18:45
Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally on board, probably needs a merge

Comment on lines -491 to +480
if (
self._implementation
in {Implementation.PYSPARK, Implementation.PYSPARK_CONNECT}
and (pyspark := get_pyspark()) is not None
and parse_version(pyspark) < (3, 4)
): # pragma: no cover
if self._implementation in {
Implementation.PYSPARK,
Implementation.PYSPARK_CONNECT,
} and Implementation.PYSPARK._backend_version() < (3, 4): # pragma: no cover
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

@MarcoGorelli
Copy link
Member

thanks - shiny failure is unrelated (and now resolved), i'd say let's wait for this to turn green then we can ship it

@MarcoGorelli MarcoGorelli merged commit 54bf3d3 into main Jul 3, 2025
31 of 32 checks passed
@MarcoGorelli MarcoGorelli deleted the backend-version-cache branch July 3, 2025 13:39
dangotbanned added a commit that referenced this pull request Jul 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants