Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[python] CI broken on main (mitigated: needs follow-up) #3398

Open
johnkerl opened this issue Dec 4, 2024 · 3 comments
Open

[python] CI broken on main (mitigated: needs follow-up) #3398

johnkerl opened this issue Dec 4, 2024 · 3 comments
Assignees
Labels
bug Something isn't working priority python-api

Comments

@johnkerl
Copy link
Member

johnkerl commented Dec 4, 2024

Python full CI manual run:
https://github.com/single-cell-data/TileDB-SOMA/actions/runs/12164118211/job/33924984198

FAILED apis/python/tests/test_basic_xarray_io.py::TestDenseNDDataArray1D::test_getitem_with_steps[key0] - Failed: DID NOT RAISE <class 'NotImplementedError'>
FAILED apis/python/tests/test_basic_xarray_io.py::TestDenseNDDataArray3D::test_getitem_with_steps[key0] - Failed: DID NOT RAISE <class 'NotImplementedError'>
= 2 failed, 6338 passed, 23 skipped, 2 xfailed, 2450 warnings in 433.07s (0:07:13) =

def test_getitem_with_steps(self, xr_soma_data_array, key):
with pytest.raises(NotImplementedError):
xr_soma_data_array[key].data.compute()

This appears to be new as of today.

@XanthosXanthopoulos
Copy link
Collaborator

This is caused by the latest Dask version 2024.11.2 probably this specific PR

@johnkerl
Copy link
Member Author

johnkerl commented Dec 5, 2024

Finding by @XanthosXanthopoulos:

I looked it a bit further and the new version create some extra potential issues down the road. The passed SOMAArray is copied entirely that's why we are not raising a exception. The slicing operation is acting upon the copied array which support the stepped indexing.

A potential issue is now that everything is copied we are going to require more memory.

Adding @jp-dark as assignee to triage after (well-deserved!) PTO.

@XanthosXanthopoulos
Copy link
Collaborator

After some more digging the previously mentioned PR is not causing the issue. The task graph differs between the two versions. The newer version has a read op that reads a chunk of data and then the next node reads back the requested slice compared to the previous version which reads the requested slice directly.

With that in mind the memory issue I mentioned earlier doesn't seem to be actually an issue.

johnkerl added a commit that referenced this issue Dec 5, 2024
* [python/r] Docstring audit for new shape

* work around #3398 CI regression

* work around #3398 CI regression

* run `devtools::document()`

* add some doclinks

* README.md
@johnkerl johnkerl changed the title [python] CI broken on main [python] CI broken on main (mitigated: needs follow-up) Dec 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working priority python-api
Projects
None yet
Development

No branches or pull requests

4 participants