Skip to content

Conversation

kylebarron
Copy link

Which issue does this PR close?

  • Closes #.

Rationale for this change

I'd like to consider using InstrumentedObjectStore in obstore, a Python binding to object_store. I prototyped this in developmentseed/obstore#544 but I need a few changes to be able to avoid type erasure to use signing APIs.

None of these changes are backwards-incompatible because InstrumentedObjectStore isn't currently exposed.

What changes are included in this PR?

  • Expose InstrumentedObjectStore publicly. I can't use a generic Arc<dyn ObjectStore> because I also expose signing APIs, and those aren't implemented on the type-erased ObjectStore. Because they're only implemented on the concrete S3/GCS/Azure stores, I need to store the non-type-erased AmazonS3 directly.
  • Make InstrumentedObjectStore generic over T: ObjectStore. As mentioned above, I need to be able to avoid type erasure, so that I can get a reference back to the original &AmazonS3 to use with signing APIs.
  • Expose an fn inner(&self) -> &T so that I can access the original store passed in.

Are these changes tested?

Tested by existing tests.

Are there any user-facing changes?

Only new APIs.

@geoffreyclaude
Copy link
Collaborator

Hi @kylebarron ,
Thanks for your detailed PR description. Your issue is super clear and makes a lot of sense.

I'd really like to avoid exposing the InstrumentedObjectStore publicly though, as that would come with multiple constraints.

What do you think instead about creating a new interface specifically for the multiple trait ObjectStore + Signer? Something like:

/// Instruments the provided `ObjectStore + Signer` with tracing.
pub fn instrument_signed_object_store(
    store: Arc<dyn ObjectStore + Signer>,
    name: &str,
) -> Arc<dyn ObjectStore + Signer>

@kylebarron
Copy link
Author

kylebarron commented Aug 29, 2025

I'd really like to avoid exposing the InstrumentedObjectStore publicly though, as that would come with multiple constraints.

I don't think that's possible for me, as I need to store an InstrumentedObjectStore<T>, so that I can access the original AmazonS3 for use with signing APIs.

Perhaps if I didn't try to support signing, then I could just store a bare, type-erased Arc<dyn ObjectStore>... But regardless I'm looking at a potential Python API that uses PaginatedListStore because that allows for substring prefix instead of just path prefix, and PaginatedListStore similarly is only implemented for the concrete AWS/GCS/Azure stores, not ObjectStore. So either way I need to store the concrete T, not dyn ObjectStore.

The code seems pretty stable, so perhaps it wouldn't be too bad if I vendored this file, but I'd be happy to use datafusion-tracing as a dependency.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants