Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] HTTP API routes overhaul #2954

Merged
merged 29 commits into from
Oct 18, 2024
Merged

Conversation

drewkim
Copy link
Contributor

@drewkim drewkim commented Oct 14, 2024

Improvements & Bug fixes

  • Overhauls the HTTP API routes to follow a more RESTful approach. Now routes follow the patten of /tenants/{tenant}/databases/{database}/collections/{collection_id}.
  • Configures a "handshake" where the client calls the server auth implementation to possible resolve the tenant/database based on the provided API key.
  • Updates clients to work with new route definitions.

Test plan

How are these changes tested?

  • Tests pass locally with pytest for python, yarn test for js, cargo test for rust

Documentation Changes

Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs repository?

Copy link
Contributor Author

drewkim commented Oct 14, 2024

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @drewkim and the rest of your teammates on Graphite Graphite

Copy link

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of a unexpectedly high quality (Readability, Modularity, Intuitiveness)

@drewkim drewkim marked this pull request as ready for review October 14, 2024 23:27
@drewkim drewkim marked this pull request as draft October 14, 2024 23:27
@drewkim drewkim force-pushed the drew/_enh_rest_api_routes_overhaul branch from c763290 to a41e4e6 Compare October 15, 2024 18:44
@@ -94,7 +95,9 @@ export class ChromaClient {
}

/** @ignore */
init(): Promise<void> {
async init(): Promise<void> {
await this.resolveTenantAndDatabases();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not optimal, as it causes us to make a call to the API every time. However, we can't call async code from the constructor so the alternative would be causing a breaking change for anyone using our JS client by forcing them to call a follow-on method upon initialization that sets the tenant/database. I propose we go about this way for now and implement the more efficient method when we apply more sweeping breaking client changes.

@drewkim drewkim marked this pull request as ready for review October 15, 2024 21:57
@drewkim drewkim requested a review from HammadB October 15, 2024 21:59
chromadb/config.py Outdated Show resolved Hide resolved
@@ -88,8 +89,7 @@ def test_valid_key(mock_cloud_server: System, valid_token: str) -> None:
def test_invalid_key(mock_cloud_server: System, valid_token: str) -> None:
# Try to connect to the default tenant and database with an invalid token
invalid_token = valid_token + "_invalid"
# TODO this should raise an auth or more descriptive error
with pytest.raises(ValueError):
with pytest.raises(ChromaAuthError):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice fix

@@ -985,7 +1792,7 @@ async def reset(
)

@trace_method("FastAPI.get_nearest_neighbors", OpenTelemetryGranularity.OPERATION)
async def get_nearest_neighbors(
async def get_nearest_neighbors_v1(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code cleanliness nit - can we somehow bifurcate the v1 code into a sep file for easier readying?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not the most straightforward to do so and includes module-ing FastAPI. I think it should be somewhat more readable now with the split into functions. If it's still tough to read, lmk and I'll go down that route but felt inefficient given that we're immediately deleting the v1 routes.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair but, Isn't it just a matter of creating a wrapper class that extends this and moving the functions there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe it's that straightforward—wouldn't that create two running instances of the server?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, you only instantiate one.

Server() -> Say this is current class
ServerV2(Server) -> Adds v2 routes

or alternatively
Server -> Current class with v2 routes only
ServerV1(Server) -> Adds v1 routes for bw compat.

This is fine as is, but I think it should be simple

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see what you mean. Updated to wrap with class that has v1 routes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decided to revert this and keep all in one file because it will break hosted Chroma otherwise. This is how I implemented your suggestion:

class FastAPI(Server):
# Setup v2 routes

class FastAPIWithV1(FastAPI):
# Setup v1 routes
# Post route setup
self._app.include_router(self.router)
use_route_names_as_operation_ids(self._app)
... 

# In chromadb/app.py:
from chromadb.server.fastapi.v1 import FastAPIWithV1
server = FastAPIWithV1(settings)

There is some configuration that necessarily has to happen post adding routes, so we can only fire it in the wrapper class. We thus also have to update the app to use the wrapper class. This makes it such that hosted Chroma will fail once we deploy changes, since it initializes the server in python/chroma_hosted/chroma_hosted/app.py. It won't have access to the wrapper class, which has the necessary post route configuration.

There is a way around this, by doing something like:

class FastAPI(Server):
...
class FastAPI(FastAPIWithV1):
...

However, in this case we still have to split things up in an awkward way, like moving router initialization and it doesn't make for a clean delete of the old routes, especially if there's some time between and context is lost. Overall, I think it's cleaner to keep things all in one file.

chromadb/api/__init__.py Outdated Show resolved Hide resolved
chromadb/api/client.py Outdated Show resolved Hide resolved

# Get the root system component we want to interact with
self._server = self._system.instance(ServerAPI)

user_identity = self.resolve_tenant_and_databases()

maybe_tenant, maybe_database = SharedSystemClient.maybe_set_tenant_and_database(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the user supplied tenant or database doesn't match?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are only overridden if they are (not provided or the default values) and we can definitely determine what they should be based on the API key (i.e. a specific tenant and only one specific database exists). Thus, if they don't match but the API key is correct, then the request will fail authorization since it's a mismatch of resources.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see where that would happen?

User does: chromadb.Client(tenant=FOO, database=BLAH)
Code does: self.get_user_identity() -> Returns UserIdentity(tenant=NOT_FOO, database=SOMETHING)
Code does: maybe_set_tenant_and_database will just set tenant and database to NOT_FOO, SOMETHING

I'm probably missing something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In maybe_set_tenant_and_database, the logic is:

if (not tenant or tenant == DEFAULT_TENANT) and new_tenant:
    tenant = new_tenant
if (not database or database == DEFAULT_DATABASE) and new_database:
    database = new_database

So the tenant/database only get set to the values pulled from auth if the provided ones don't exist or are the default valeus

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right and I am saying should we make the client init fail, under a case like this, rather than deferring the failure to the first request the user will make?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, got it. Updated to throw an exception in this error case and moved resolution to the auth library under a new utils section (I thought in your other comments were suggesting to throw it into the server auth provider).

chromadb/api/client.py Outdated Show resolved Hide resolved
@drewkim drewkim force-pushed the drew/_enh_rest_api_routes_overhaul branch from 61039f7 to 80d5ade Compare October 16, 2024 17:42
@drewkim drewkim requested a review from HammadB October 16, 2024 17:45
@drewkim drewkim force-pushed the drew/_enh_rest_api_routes_overhaul branch from 37b9720 to 033e670 Compare October 16, 2024 18:00
chromadb/api/types.py Outdated Show resolved Hide resolved
@drewkim drewkim force-pushed the drew/_enh_rest_api_routes_overhaul branch from 51c4efc to 983f11f Compare October 16, 2024 22:45
# values, which might not match the provided values. Thus, it's important
# to ensure that the flag is set to True only when there is an auth provider.
if tenant and tenant != DEFAULT_TENANT and new_tenant and new_tenant != tenant:
raise ValueError(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this error more descriptive to users, they might not know what "resolved" means.

Also - does this expose us to enumeration attacks?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it does any more than already—an attacker could do the same thing with collections currently, just receiving a 401 if their API doesn't match the resource. I can change this to be a more opaque 401 to match.

Copy link
Collaborator

@HammadB HammadB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see:

  • unit tests of the auth overriding logic
  • v1 client bw compat tests with a v2 server
  • @codetheweb or @jeffchuber to TAL at the JS changes.
  • just a quick thought about enumeration attacks

Otherwise looks great - thank you!

@drewkim
Copy link
Contributor Author

drewkim commented Oct 18, 2024

Reviewed with @codetheweb offline

@@ -94,7 +99,12 @@ export class ChromaClient {
}

/** @ignore */
init(): Promise<void> {
async init(): Promise<void> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: might be slightly cleaner to combine the two checks into the single this._initPromise
also should probably mark init() as private, not sure why it wasn't already

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there're some tests relying on it being public

@drewkim drewkim changed the title [ENH] REST API routes overhaul [ENH] HTTP API routes overhaul Oct 18, 2024
@drewkim drewkim merged commit bc26b0a into main Oct 18, 2024
71 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants