Merged
33 changes: 33 additions & 0 deletions .github/workflows/deploy-mkdocs.yaml
@@ -0,0 +1,33 @@
name: Publish docs via GitHub Pages

on:
  push:
    branches:
      - main
    paths:
      # Only rebuild website when docs have changed
      - "README.md"
      - "deployment/**"
      - "docs/**"
      - "src/**"
      - ".github/workflows/deploy-mkdocs.yaml"

jobs:
  build:
    name: Deploy docs
    runs-on: ubuntu-latest
    steps:
      - name: Checkout main
        uses: actions/checkout@v4

      - name: Set up Python 3.11
        uses: actions/setup-python@v5
        with:
          # Quoted so YAML does not parse the version as a float
          python-version: "3.11"

      - uses: astral-sh/setup-uv@v4
        with:
          enable-cache: true

      - name: Deploy docs
        run: uv run mkdocs gh-deploy --force
480 changes: 0 additions & 480 deletions README.md

Large diffs are not rendered by default.

195 changes: 195 additions & 0 deletions docs/architecture/data-filtering.md
@@ -0,0 +1,195 @@
# Data filtering via CQL2

The system can generate CQL2 filters from the request context to provide row-level content filtering. These filters are applied to outgoing requests before they are sent to the upstream API.

> [!IMPORTANT]
> The upstream STAC API must support the [STAC API Filter Extension](https://github.com/stac-api-extensions/filter/blob/main/README.md), including the [Features Filter](http://www.opengis.net/spec/ogcapi-features-3/1.0/conf/features-filter) conformance class on the Features resource (`/collections/{cid}/items`)[^37].

## Filters

### `ITEMS_FILTER`

The [`ITEMS_FILTER`](../configuration.md#items_filter_cls) is applied to the following operations.

> [!WARNING]
> Operations without a check mark are not yet supported. We intend to support them in the future.

- [x] `GET /search`
  - **Action:** Read Item
  - **Strategy:** Append query params with generated CQL2 query.
- [x] `POST /search`
  - **Action:** Read Item
  - **Strategy:** Append body with generated CQL2 query.
- [x] `GET /collections/{collection_id}/items`
  - **Action:** Read Item
  - **Strategy:** Append query params with generated CQL2 query.
- [x] `GET /collections/{collection_id}/items/{item_id}`
  - **Action:** Read Item
  - **Strategy:** Validate response against CQL2 query.
- [ ] `POST /collections/{collection_id}/items`[^21]
  - **Action:** Create Item
  - **Strategy:** Validate body with generated CQL2 query.
- [ ] `PUT /collections/{collection_id}/items/{item_id}`[^21]
  - **Action:** Update Item
  - **Strategy:** Fetch Item and validate CQL2 query; merge Item with body and validate with generated CQL2 query.
- [ ] `DELETE /collections/{collection_id}/items/{item_id}`[^21]
  - **Action:** Delete Item
  - **Strategy:** Fetch Item and validate with CQL2 query.
- [ ] `POST /collections/{collection_id}/bulk_items`[^21]
  - **Action:** Create Items
  - **Strategy:** Validate items in body with generated CQL2 query.
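
The "append" strategies listed above combine the generated filter with whatever filter the client supplied, so a client can narrow its results but never widen them. The following is a minimal sketch of that idea for the `POST /search` body case. It assumes both filters are `cql2-json`; `merge_filter_into_body` is a hypothetical helper written for illustration, not the proxy's actual implementation.

```python
from typing import Any


def merge_filter_into_body(
    body: dict[str, Any], generated: dict[str, Any]
) -> dict[str, Any]:
    """Return a copy of a search body with `generated` AND-ed into its filter.

    Hypothetical helper; assumes any client-supplied filter is cql2-json.
    """
    merged = dict(body)  # shallow copy, the caller's body is left untouched
    existing = body.get("filter")
    if existing is None:
        merged["filter"] = generated
    else:
        # Both conditions must hold, so the client cannot widen access.
        merged["filter"] = {"op": "and", "args": [existing, generated]}
    merged["filter-lang"] = "cql2-json"
    return merged


client_body = {
    "limit": 10,
    "filter": {"op": "<", "args": [{"property": "eo:cloud_cover"}, 20]},
}
generated_filter = {"op": "=", "args": [{"property": "collection"}, "landsat"]}
merged_body = merge_filter_into_body(client_body, generated_filter)
```

Wrapping both filters in a single `and` keeps the client's own constraints intact while guaranteeing that the generated constraint always applies.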

### `COLLECTIONS_FILTER`

The [`COLLECTIONS_FILTER`](../configuration.md#collections_filter_cls) is applied to the following operations.

> [!WARNING]
> Operations without a check mark are not yet supported. We intend to support them in the future.

- [x] `GET /collections`
  - **Action:** Read Collection
  - **Strategy:** Append query params with generated CQL2 query.
- [x] `GET /collections/{collection_id}`
  - **Action:** Read Collection
  - **Strategy:** Validate response against CQL2 query.
- [ ] `POST /collections/`[^22]
  - **Action:** Create Collection
  - **Strategy:** Validate body with generated CQL2 query.
- [ ] `PUT /collections/{collection_id}`[^22]
  - **Action:** Update Collection
  - **Strategy:** Fetch Collection and validate CQL2 query; merge Collection with body and validate with generated CQL2 query.
- [ ] `DELETE /collections/{collection_id}`[^22]
  - **Action:** Delete Collection
  - **Strategy:** Fetch Collection and validate with CQL2 query.

## Example Request Flow for multi-record endpoints

```mermaid
sequenceDiagram
Client->>Proxy: GET /collections
Note over Proxy: EnforceAuth checks credentials
Note over Proxy: BuildCql2Filter creates filter
Note over Proxy: ApplyCql2Filter applies filter to request
Proxy->>STAC API: GET /collections?filter=(collection=landsat)
STAC API->>Client: Response
```
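
The "apply filter to request" step for `GET` endpoints can be sketched as a query string rewrite. This is only an illustration under stated assumptions: `append_filter_to_query` is a hypothetical function name, the filters are treated as `cql2-text`, and repeated query parameters are collapsed to their last value for brevity.

```python
from urllib.parse import parse_qsl, urlencode


def append_filter_to_query(query_string: str, generated: str) -> str:
    """AND a generated cql2-text filter into a URL query string.

    Hypothetical helper; repeated parameters collapse to their last value.
    """
    params = dict(parse_qsl(query_string, keep_blank_values=True))
    existing = params.get("filter")
    if existing:
        # Parenthesize both sides so operator precedence cannot change meaning.
        params["filter"] = f"({existing}) AND ({generated})"
    else:
        params["filter"] = generated
    params["filter-lang"] = "cql2-text"
    return urlencode(params)


upstream_query = append_filter_to_query("limit=10", "collection = 'landsat'")
```

`urlencode` percent-encodes the filter, so the expression survives the trip to the upstream API unchanged once decoded.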

## Example Request Flow for single-record endpoints

The Filter Extension does not apply to fetching individual records. As such, we must validate the record _after_ it is returned from the upstream API but _before_ it is returned to the user:

```mermaid
sequenceDiagram
Client->>Proxy: GET /collections/abc123
Note over Proxy: EnforceAuth checks credentials
Note over Proxy: BuildCql2Filter creates filter
Proxy->>STAC API: GET /collections/abc123
STAC API->>Proxy: Response
Note over Proxy: ApplyCql2Filter validates the response
Proxy->>Client: Response
```
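
To make "validate the response" concrete, here is a standalone sketch that evaluates a very small subset of `cql2-json` (`and`, `=`, `in`) against a returned record. It is an illustration only: the function names are hypothetical, the operator subset is an assumption chosen for brevity, and the proxy itself works with a `cql2.Expr` rather than a hand-written evaluator. How the proxy responds when a record fails validation is not covered here.

```python
from typing import Any


def lookup(record: dict[str, Any], name: str) -> Any:
    """Find a property at the top level or under "properties" (as on STAC Items)."""
    if name in record:
        return record[name]
    return record.get("properties", {}).get(name)


def matches(expr: dict[str, Any], record: dict[str, Any]) -> bool:
    """Evaluate a tiny cql2-json subset ("and", "=", "in") against one record."""
    op, args = expr["op"], expr["args"]
    if op == "and":
        return all(matches(arg, record) for arg in args)
    value = lookup(record, args[0]["property"])
    if op == "=":
        return value == args[1]
    if op == "in":
        return value in args[1]
    raise ValueError(f"unsupported operator: {op}")


collection = {"type": "Collection", "id": "abc123"}
allowed = {"op": "in", "args": [{"property": "id"}, ["abc123", "landsat"]]}
denied = {"op": "=", "args": [{"property": "id"}, "sentinel"]}

is_visible = matches(allowed, collection)
is_hidden = not matches(denied, collection)
```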

## Authoring Filter Generators

The `ITEMS_FILTER_CLS` configuration option specifies a class that generates a CQL2 filter for each request. The class must define a `__call__` method that accepts a single argument, a dictionary containing the request context, and returns a valid `cql2-text` expression (as a `str`) or `cql2-json` expression (as a `dict`).

> [!TIP]
> An example integration can be found in [`examples/custom-integration`](https://github.com/developmentseed/stac-auth-proxy/blob/main/examples/custom-integration).

### Basic Filter Generator

```py
import dataclasses
from typing import Any


@dataclasses.dataclass
class ExampleFilter:
    async def __call__(self, context: dict[str, Any]) -> str:
        # "true" is a cql2-text expression that matches every record.
        return "true"
```

> [!TIP]
> Despite being referred to as a _class_, a filter generator could be written as a function.
>
> <details>
>
> <summary>Example</summary>
>
> ```py
> from typing import Any
>
>
> def example_filter():
>     async def example_filter(context: dict[str, Any]) -> str | dict[str, Any]:
>         return "true"
>
>     return example_filter
> ```
>
> </details>
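
A generator becomes useful once it inspects the request context. The sketch below returns a permissive filter when an `authorization` header is present and a restrictive one otherwise. It assumes the context exposes headers at `context["req"]["headers"]`; the class name and the `public` property are hypothetical and chosen only for illustration.

```python
import asyncio
import dataclasses
from typing import Any


@dataclasses.dataclass
class PublicUnlessAuthenticated:
    """Hypothetical generator: anonymous users only see records flagged public."""

    public_property: str = "public"  # assumed property name on each record

    async def __call__(self, context: dict[str, Any]) -> str:
        headers = context["req"]["headers"]
        if headers.get("authorization"):
            # Credentials are assumed to have been checked earlier in the chain.
            return "true"
        return f"{self.public_property} = true"


generate = PublicUnlessAuthenticated()
anonymous = asyncio.run(generate({"req": {"headers": {}}}))
authenticated = asyncio.run(
    generate({"req": {"headers": {"authorization": "Bearer abc"}}})
)
```

Calling the generator directly with a hand-built context, as done here, is also a convenient way to unit test it without running the proxy.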

### Complex Filter Generator

An example of a more complex filter generator where the filter is generated based on the response of an external API:

```py
import dataclasses
from typing import Any, Literal, Optional

from httpx import AsyncClient
from stac_auth_proxy.utils.cache import MemoryCache


@dataclasses.dataclass
class ApprovedCollectionsFilter:
    api_url: str
    kind: Literal["item", "collection"] = "item"
    client: AsyncClient = dataclasses.field(init=False)
    cache: MemoryCache = dataclasses.field(init=False)

    def __post_init__(self):
        # We keep the client in the class instance to avoid creating a new client for
        # each request, taking advantage of the client's connection pooling.
        self.client = AsyncClient(base_url=self.api_url)
        self.cache = MemoryCache(ttl=30)

    async def __call__(self, context: dict[str, Any]) -> dict[str, Any]:
        # Raw Authorization header value (e.g. "Bearer <token>"), or None
        token = context["req"]["headers"].get("authorization")

        try:
            # Check cache for previously fetched approved collections
            approved_collections = self.cache[token]
        except KeyError:
            # Lookup approved collections from an external API
            approved_collections = await self.lookup(token)
            self.cache[token] = approved_collections

        # Build CQL2 filter
        return {
            "op": "a_containedby",
            "args": [
                {"property": "collection" if self.kind == "item" else "id"},
                approved_collections,
            ],
        }

    async def lookup(self, token: Optional[str]) -> list[str]:
        # Lookup approved collections from an external API. The header value is
        # forwarded unchanged, assuming it already carries its scheme prefix.
        headers = {"Authorization": token} if token else {}
        response = await self.client.get(
            "/get-approved-collections",
            headers=headers,
        )
        response.raise_for_status()
        return response.json()["collections"]
```

> [!TIP]
> Filter generation runs for every relevant request. Consider memoizing external API calls to improve performance.
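
Memoizing with a `try`/`except KeyError` lookup works with any cache that raises `KeyError` on a miss or on an expired entry. The following is a minimal sketch of such a cache, assuming a simple time-to-live policy. `TtlCache` is a hypothetical stand-in for illustration, not the proxy's `MemoryCache`, whose behavior may differ. The clock is injectable so the expiry logic can be exercised without sleeping.

```python
import time
from typing import Any, Callable


class TtlCache:
    """Hypothetical TTL cache; raises KeyError on a miss or an expired entry."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._entries: dict[Any, tuple[float, Any]] = {}

    def __setitem__(self, key: Any, value: Any) -> None:
        # Store the value together with its expiry time.
        self._entries[key] = (self.clock() + self.ttl, value)

    def __getitem__(self, key: Any) -> Any:
        expires_at, value = self._entries[key]  # KeyError on a miss
        if self.clock() >= expires_at:
            del self._entries[key]
            raise KeyError(key)
        return value


now = [0.0]  # fake clock, advanced manually below
cache = TtlCache(ttl=30, clock=lambda: now[0])
cache["token"] = ["landsat"]
fresh = cache["token"]

now[0] = 31.0  # move past the TTL
try:
    cache["token"]
    expired = False
except KeyError:
    expired = True
```

This sketch never evicts entries that are not read again, so a long-running process with many distinct tokens would also want a size bound.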

[^21]: https://github.com/developmentseed/stac-auth-proxy/issues/21
[^22]: https://github.com/developmentseed/stac-auth-proxy/issues/22
[^37]: https://github.com/developmentseed/stac-auth-proxy/issues/37
39 changes: 39 additions & 0 deletions docs/architecture/middleware-stack.md
@@ -0,0 +1,39 @@
# Middleware Stack

Aside from the actual communication with the upstream STAC API, the majority of the proxy's functionality occurs within a chain of middlewares. Each request passes through this chain, wherein each middleware performs a specific task:

1. **[`EnforceAuthMiddleware`][stac_auth_proxy.middleware.EnforceAuthMiddleware]**

   - Handles authentication and authorization
   - Configurable public/private endpoints
   - OIDC integration
   - Places auth token payload in request state

2. **[`Cql2BuildFilterMiddleware`][stac_auth_proxy.middleware.Cql2BuildFilterMiddleware]**

   - Builds CQL2 filters based on request context/state
   - Places [CQL2 expression](http://developmentseed.org/cql2-rs/latest/python/#cql2.Expr) in request state

3. **[`Cql2ApplyFilterQueryStringMiddleware`][stac_auth_proxy.middleware.Cql2ApplyFilterQueryStringMiddleware]**

   - Retrieves [CQL2 expression](http://developmentseed.org/cql2-rs/latest/python/#cql2.Expr) from request state
   - Augments `GET` requests with CQL2 filter by appending to querystring

4. **[`Cql2ApplyFilterBodyMiddleware`][stac_auth_proxy.middleware.Cql2ApplyFilterBodyMiddleware]**

   - Retrieves [CQL2 expression](http://developmentseed.org/cql2-rs/latest/python/#cql2.Expr) from request state
   - Augments `POST`/`PUT`/`PATCH` requests with CQL2 filter by modifying body

5. **[`Cql2ValidateResponseBodyMiddleware`][stac_auth_proxy.middleware.Cql2ValidateResponseBodyMiddleware]**

   - Retrieves [CQL2 expression](http://developmentseed.org/cql2-rs/latest/python/#cql2.Expr) from request state
   - Validates response against CQL2 filter for non-filterable endpoints

6. **[`OpenApiMiddleware`][stac_auth_proxy.middleware.OpenApiMiddleware]**

   - Modifies OpenAPI specification based on endpoint configuration, adding security requirements
   - Only active if `openapi_spec_endpoint` is configured

7. **[`AddProcessTimeHeaderMiddleware`][stac_auth_proxy.middleware.AddProcessTimeHeaderMiddleware]**

   - Adds processing time headers
   - Useful for monitoring/debugging
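
As an illustration of how one link in such a chain wraps the next, here is a minimal pure-ASGI sketch of a timing middleware. It is not the proxy's `AddProcessTimeHeaderMiddleware`: the class name `ProcessTimeHeader`, the `x-process-time` header name, and the demo app are all assumptions made for this example.

```python
import asyncio
import time


class ProcessTimeHeader:
    """Hypothetical ASGI middleware that adds an `x-process-time` response header."""

    def __init__(self, app):
        self.app = app  # the next ASGI app (or middleware) in the chain

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        start = time.perf_counter()

        async def send_with_header(message):
            # Headers can only be added on the response start message.
            if message["type"] == "http.response.start":
                elapsed = f"{time.perf_counter() - start:.3f}".encode()
                headers = list(message.get("headers", []))
                headers.append((b"x-process-time", elapsed))
                message = {**message, "headers": headers}
            await send(message)

        await self.app(scope, receive, send_with_header)


async def hello_app(scope, receive, send):
    """Stand-in for the rest of the chain and the upstream call."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


async def run_once():
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await ProcessTimeHeader(hello_app)({"type": "http"}, receive, send)
    return sent


messages = asyncio.run(run_once())
```

Each middleware receives the next app in its constructor, so the order in which the wrappers are applied determines the order in which a request passes through them.
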
Binary file added docs/assets/ds-symbol-negative-mono.png
Binary file added docs/assets/ds-symbol-positive-mono.png