Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug fix: consolidation failure on dense arrays without respecting current domain #5390

Merged
merged 18 commits into from
Dec 4, 2024

Conversation

kounelisagis
Copy link
Member

@kounelisagis kounelisagis commented Nov 28, 2024

This PR addresses a bug in the Domain::expand_to_tiles(NDRange*) function, which previously did not respect the current domain. The lack of this consideration led to errors when the tile extent was larger than the current domain, causing out-of-bounds issues during consolidation. To resolve this, the PR introduces a new function that appropriately accounts for the current domain.

[sc-59793] and single-cell-data/TileDB-SOMA#3383

Huge kudos to @johnkerl for getting 99.9% of this done.


TYPE: BUG
DESC: This PR fixes a bug in Domain::expand_to_tiles(NDRange*), which failed to respect the current domain, causing errors when tile extents exceeded it. A new function is introduced to properly account for the current domain during consolidation.

@kounelisagis kounelisagis marked this pull request as ready for review November 28, 2024 18:13
Copy link
Contributor

@johnkerl johnkerl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, and thank you!!

I left some requested changes. Given timezone skew, and the U.S. holiday, I won't mark these as 'request changes' but will rather just let you make them.

tiledb/sm/array_schema/array_schema.h Outdated Show resolved Hide resolved
tiledb/sm/array_schema/array_schema.h Outdated Show resolved Hide resolved
tiledb/sm/array_schema/array_schema.cc Outdated Show resolved Hide resolved
tiledb/sm/array_schema/array_schema.cc Outdated Show resolved Hide resolved
@ihnorton ihnorton requested a review from ypatia December 2, 2024 13:34
Copy link
Member

@ypatia ypatia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! A few comments on code style mostly.

tiledb/sm/array_schema/array_schema.h Outdated Show resolved Hide resolved
tiledb/sm/array_schema/array_schema.cc Outdated Show resolved Hide resolved
tiledb/sm/array_schema/array_schema.cc Outdated Show resolved Hide resolved
test/src/unit-cppapi-consolidation.cc Outdated Show resolved Hide resolved
@kounelisagis kounelisagis marked this pull request as draft December 2, 2024 18:47
@kounelisagis kounelisagis marked this pull request as ready for review December 3, 2024 11:13
@kounelisagis kounelisagis force-pushed the kerl/dense-consolidate-debug branch from 09b3b9e to 114a6aa Compare December 3, 2024 11:28
@kounelisagis kounelisagis force-pushed the kerl/dense-consolidate-debug branch from 114a6aa to 1565203 Compare December 3, 2024 12:13
@@ -390,12 +392,181 @@ void Domain::expand_ndrange(const NDRange& r1, NDRange* r2) const {
}
}

void Domain::expand_to_tiles(NDRange* ndrange) const {
void Domain::expand_to_tiles_when_no_current_domain(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _when_no_current_domain suffix sounds a bit verbose. I would have kept the existing name, but I understand why we would want to make it more explicit and am fine with keeping the suffix.

@@ -54,7 +54,7 @@ conclude(object_library)
#
commence(object_library domain)
this_target_sources(domain.cc)
this_target_object_libraries(datum dimension math)
this_target_object_libraries(datum dimension math ndrectangle current_domain)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This introduces a cyclic dependency between domain, ndrectangle and current_domain. An immediate-term fix might be to move Domain::expand_to_tiles somewhere in the current_domain object library (a static or freestanding method) and adding a couple friend declarations to Domain if needed.

In the longer term we should re-evaluate our object library modularization approach. Modularization is definitely a good thing, but taking it to the extreme of having object libraries with one file each is unnecessary and causes us self-inflicted pain. For our case at hand I can't see why attribute, dimension, domain, enumeration, ndrectangle, current_domain and array_schema have to be distinct object libraries.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just another somehow related comment on why this problem came up. It is my understanding that expand_to_tiles is actually operating on a Range and should be a method of that class instead and not of Domain. I realized that when I wanted to suggested that we pass NDRange& query_ndrange as const and realized that the method's side effects are in fact in that class so we can't. This is quite old code and has been structured that way since forever so I didn't want to ask for such a large refactoring as part of this PR that is just trying to introduce a bugfix and it's also quite urgent. We could open a ticket as a follow up to restructure stuff internally so that we try to move it to Range from the temporarily solution of CurrentDomain and see if that avoids circular dependencies. In theory such circular dependencies are indicators of wrong design, and IMO that's the case here as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

The commit e631d93:

  • Did not have circular dependencies
  • Fixed the time-sensitive customer-impacting bug

@kounelisagis kounelisagis force-pushed the kerl/dense-consolidate-debug branch from 1565203 to 1918082 Compare December 3, 2024 18:11
@ihnorton ihnorton merged commit ae2c2a6 into dev Dec 4, 2024
62 of 63 checks passed
@ihnorton ihnorton deleted the kerl/dense-consolidate-debug branch December 4, 2024 15:41
@ihnorton
Copy link
Member

ihnorton commented Dec 4, 2024

/backport release-2.27

@teo-tsirpanis
Copy link
Member

/backport to release-2.27

Copy link
Contributor

github-actions bot commented Dec 4, 2024

Started backporting to release-2.27: https://github.com/TileDB-Inc/TileDB/actions/runs/12163311071

ihnorton pushed a commit that referenced this pull request Dec 4, 2024
…s without respecting current domain (#5390) (#5397)

Backport of #5390 to release-2.27

---
TYPE: BUG
DESC: Bug fix: consolidation failure on dense arrays without respecting current domain
---------

Co-authored-by: John Kerl <[email protected]>
Co-authored-by: Agisilaos Kounelis <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants