Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[r] Use libtiledbsoma for array handles #3061

Open
johnkerl opened this issue Sep 24, 2024 · 3 comments
Open

[r] Use libtiledbsoma for array handles #3061

johnkerl opened this issue Sep 24, 2024 · 3 comments
Labels

Comments

@johnkerl
Copy link
Member

johnkerl commented Sep 24, 2024

Currently we're retaining a tiledb-r handle at open:

private$.tiledb_array <- tiledb::tiledb_array_open(self$object, type = mode)

And we use a temporary open-use-close at every single call to libtiledbsoma -- here is just one of many examples:

void set_metadata(std::string& uri, std::string& key, SEXP valuesxp, std::string& type,
bool is_array, Rcpp::XPtr<somactx_wrap_t> ctxxp,
Rcpp::Nullable<Rcpp::DatetimeVector> tsvec = R_NilValue) {
// shared pointer to SOMAContext from external pointer wrapper
std::shared_ptr<tdbs::SOMAContext> sctx = ctxxp->ctxptr;
// SOMA Object unique pointer (aka soup)
auto soup = getObjectUniquePointer(is_array, OpenMode::write, uri, sctx, tsvec);
if (type == "character") {
const tiledb_datatype_t value_type = TILEDB_STRING_UTF8;
std::string value = Rcpp::as<std::string>(valuesxp);
spdl::debug("[set_metadata] key {} value {} is_array {} type {}", key, value, is_array, type);
soup->set_metadata(key, value_type, value.length(), (void*) value.c_str(), true);
} else if (type == "integer64") {
const tiledb_datatype_t value_type = TILEDB_INT64;
double dv = Rcpp::as<double>(valuesxp);
int64_t value = Rcpp::fromInteger64(dv);
spdl::debug("[set_metadata] key {} value {} is_array {} type {}", key, value, is_array, type);
soup->set_metadata(key, value_type, 1, (void*) &value, true);
} else {
Rcpp::stop("Unsupported type '%s'", type);
}
soup->close();
}

On the one hand this might seem lower-pri: the redundant opens are a perf hit but they work.

But as discussed on #3060 we must do this in order to remove the tiledb-r dependency.

See also #3053 which @nguyenv is working on -- this is a case where we do currently require a second open for array reads.

From #3059:

Already done for groups

#2406; [sc-55685].

Steps:

  • Status quo is
    • There is a private$.tiledb_array with connection to self$object
      • This is set on construction/open, and closed on close
    • The libtiledbsoma handles are currently ephemeral
      • Every read / write / get_metadata etc which is libtiledbsoma-capable currently takes a uri and a ctx and does open/op/close
  • Needing to be done:

To check:

  • Currently the array set_metadata does not error out when the array is opened for read -- this is a bug.
    • This should be auto-fixed by this issue, but, it needs to be tested.
@johnkerl johnkerl changed the title [r] Use libtiledbsoma for array handles [r] Use libtiledbsoma for array handles Oct 10, 2024
@johnkerl
Copy link
Member Author

Blocked by #3051 since here
https://github.com/single-cell-data/TileDB-SOMA/blob/1.15.0rc2/apis/r/R/SOMADataFrame.R#L182
the self$object is used. Removing that private$.tiledb_array / self$object from the base class TileDBArray.R causes this query-condition logic to return empty results, in a way that makes unit-test cases fail in non-obvious ways.

We need to move the query-condition logic over from TileDB-R before we can continue here.

@viviannguyen
Copy link

I don't believe I'm the correct Vivian to be tagged here. You're probably looking for @nguyenv.

@johnkerl
Copy link
Member Author

Oh no @viviannguyen sorry about that!! Have a nice day though!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants