Merged
Changes from 5 commits
207 changes: 207 additions & 0 deletions site/content/in-dev/unreleased/policy.md
@@ -0,0 +1,207 @@
---
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
title: Policy
type: docs
weight: 425
---

The Polaris Policy framework empowers organizations to centrally define, manage, and enforce fine-grained governance, lifecycle, and operational rules across all data resources in the catalog.

With the policy API, you can:
- Create and manage policies
- Attach policies to specific resources (catalogs, namespaces, tables, or views)
- Check applicable policies for any given resource

## What is a Policy?

A policy in Apache Polaris is a structured entity that defines rules governing actions on specified resources under
predefined conditions. Each policy contains:

- **Name**: A unique identifier within a namespace
- **Type**: Determines the semantics and expected format of the policy content
- **Description**: Explains the purpose of the policy
- **Content**: Contains the actual rules defining the policy behavior
- **Version**: An automatically tracked revision number
- **Inheritable**: Whether child resources can inherit the policy; this is determined by the policy type
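
As a rough illustration of the fields listed above, a policy could be modeled as follows. This is a sketch only: the `Policy` class, the `INHERITABLE_TYPES` set and its entries, and the starting version value are assumptions made for the example, not part of the Polaris API.

```python
from dataclasses import dataclass

# Hypothetical lookup of inheritable policy types. The entries are assumptions
# for illustration and are not taken from the Polaris source.
INHERITABLE_TYPES = {"system.data-compaction", "system.snapshot-expiry"}


@dataclass
class Policy:
    name: str              # unique within its namespace
    type: str              # e.g. "system.data-compaction"
    description: str = ""  # explains the purpose of the policy
    content: str = ""      # rules, in the format expected by the type
    version: int = 0       # revision number; the starting value is arbitrary here

    @property
    def inheritable(self) -> bool:
        # Inheritability follows from the type; it is not set per policy.
        return self.type in INHERITABLE_TYPES


policy = Policy(name="compaction-policy", type="system.data-compaction")
print(policy.inheritable)
```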

### Policy Types

Polaris supports several predefined system policy types (prefixed with `system.`):

- **`system.data-compaction`**: Defines rules for data compaction operations
  - Schema Definition: https://polaris.apache.org/schemas/policies/system/data-compaction/2025-02-03.json
Contributor

question: what does the "@" do for links in this framework?

Contributor Author

Converted it to a table, in which @ is removed.

  - Controls file compaction to optimize storage and query performance

Contributor

Suggested change
Applicable resources: Iceberg table, namespace, catalog

- **`system.metadata-compaction`**: Defines rules for metadata compaction operations
  - Schema Definition: https://polaris.apache.org/schemas/policies/system/metadata-compaction/2025-02-03.json
  - Optimizes table metadata for improved performance
  - Applicable resources: Iceberg table, namespace, catalog

- **`system.orphan-file-removal`**: Defines rules for removing orphaned files
  - Schema Definition: https://polaris.apache.org/schemas/policies/system/orphan-file-removal/2025-02-03.json
  - Identifies and safely removes files that are no longer referenced by the table metadata
  - Applicable resources: Iceberg table, namespace, catalog

- **`system.snapshot-expiry`**: Defines rules for snapshot expiration
  - Schema Definition: https://polaris.apache.org/schemas/policies/system/snapshot-expiry/2025-02-03.json
  - Controls how long snapshots are retained before removal
  - Applicable resources: Iceberg table, namespace, catalog

- **Custom policy types**: Can be defined for specific organizational needs (WIP)

- **FGAC (Fine-Grained Access Control) policies**: Row filtering, column masking, column hiding (WIP)
Contributor

IMHO, the site contains only the features and concepts that have been finalized and implemented, but I also think that mentioning these features here can help form a better understanding of the general scope of policy. Should we phrase this more generally, such as: 'Support for additional predefined system policy types and custom policy type definitions is in progress. For more details, please refer to the roadmap.' This way, we avoid over-sharing implementation details while still giving users a clear sense of the feature scope.

Contributor Author

I think calling FGAC out explicitly will provide a clear picture of what we are trying to do, as questions about FGAC policy will come up naturally from anyone who understands policies. However, I'm OK with either way. Let me know if you feel strongly about which way is better, and I can make the changes accordingly.

Contributor

That makes total sense—thanks for the clear explanation! My main concern was that listing those three specific FGAC policy types might give the impression that the community has already made concrete decisions on the direction. Maybe we could just mark them as tentative examples to make it clearer they're still under discussion (i.e. extends the WIP a little bit)?

Contributor Author

Made the change per suggestion


### Policy Inheritance

The entity hierarchy in Polaris is structured as follows:

```
                Catalog
                   |
               Namespace
                   |
       +-----------+----------+
       |           |          |
    Iceberg     Iceberg    Generic
     Table       View       Table
```

Policies can be attached at any level, and inheritance flows from catalog down to namespace, then to tables and views.

Policies can be inheritable or non-inheritable:

- **Inheritable policies**: Apply to the target resource and all its applicable child resources
- **Non-inheritable policies**: Apply only to the specific target resource

The inheritance follows an override mechanism:
1. Table-level policies override namespace and catalog policies
2. Namespace-level policies override parent namespace and catalog policies
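
The override rules above can be sketched as a small resolution function. This is a minimal illustration, not the Polaris implementation: it assumes that only inheritable policies are passed in and that overriding happens per policy type, and the function name `effective_policies` and the policy names are hypothetical.

```python
def effective_policies(catalog, namespaces, table):
    """Resolve one policy per type; the most specific level wins.

    Each argument holds (policy_type, policy_name) pairs attached at that
    level. `namespaces` is ordered from the outermost to the innermost
    namespace. Assumption: overriding is applied per policy type.
    """
    resolved = {}
    # Walk from least to most specific so later levels overwrite earlier ones.
    for level in [catalog, *namespaces, table]:
        for policy_type, policy_name in level:
            resolved[policy_type] = policy_name
    return resolved


result = effective_policies(
    catalog=[
        ("system.data-compaction", "catalog-compaction"),
        ("system.snapshot-expiry", "catalog-expiry"),
    ],
    namespaces=[[("system.snapshot-expiry", "ns-expiry")]],
    table=[("system.data-compaction", "table-compaction")],
)
print(result)
```

Here the table-level compaction policy replaces the catalog-level one, and the namespace-level expiry policy replaces the catalog-level one, so each type resolves to exactly one policy.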

## Working with Policies

### Creating a Policy

To create a policy, you need to provide a name, type, and optionally a description and content:

```json
POST /polaris/v1/{prefix}/namespaces/{namespace}/policies
{
  "name": "compaction-policy",
  "type": "system.data-compaction",
  "description": "Policy for optimizing table storage",
  "content": "{\"version\": \"2025-02-03\", \"enable\": true, \"config\": {\"target_file_size_bytes\": 134217728}}"
}
```
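
Note that `content` in the request above is a JSON-encoded string rather than a nested object. The following sketch shows one way to build such a body with the Python standard library; the variable names are illustrative, and sending the request (authentication, base URL, HTTP client) is left out.

```python
import json

# The policy rules as a regular Python dictionary.
content = {
    "version": "2025-02-03",
    "enable": True,
    "config": {"target_file_size_bytes": 134217728},
}

body = {
    "name": "compaction-policy",
    "type": "system.data-compaction",
    "description": "Policy for optimizing table storage",
    # Serialize the rules separately so the field holds a JSON string.
    "content": json.dumps(content),
}

# Serializing the whole body escapes the inner quotes, as in the example above.
print(json.dumps(body, indent=2))
```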

The policy content is validated against a schema specific to its type. Here are a few policy content examples:
Contributor

Suggested change
The policy content is validated against a schema specific to its type. Here are a few policy content examples
The policy content is validated against a schema specific to its type. Here are a few policy content examples:

- Data Compaction Policy
```json
{
  "version": "2025-02-03",
  "enable": true,
  "config": {
    "target_file_size_bytes": 134217728,
    "compaction_strategy": "bin-pack",
    "max-concurrent-file-group-rewrites": 5
  }
}
```
- Orphan File Removal Policy
```json
{
  "version": "2025-02-03",
  "enable": true,
  "max_orphan_file_age_in_days": 30,
  "locations": ["s3://my-bucket/my-table-location"],
  "config": {
    "prefix_mismatch_mode": "ignore"
  }
}
```

### Attaching Policies to Resources

Policies can be attached to different resource levels:

1. **Catalog level**: Applies to the entire catalog
2. **Namespace level**: Applies to a specific namespace
3. **Table-like level**: Applies to individual tables or views

Example of attaching a policy to a table:

```json
PUT /polaris/v1/{prefix}/namespaces/{namespace}/policies/{policy-name}/mappings
{
  "target": {
    "type": "table-like",
    "path": ["NS1", "NS2", "test_table_1"]
  }
}
```
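
Judging from the example above, the `path` of a table-like target is the list of namespace levels followed by the table name. A small helper along these lines could build the request body; `table_target` is a hypothetical name used only for this sketch.

```python
import json


def table_target(namespace_levels, table_name):
    """Build the attach request body for a table or view.

    Assumption (inferred from the example): the path is the namespace
    levels, outermost first, followed by the table name.
    """
    return {
        "target": {
            "type": "table-like",
            "path": [*namespace_levels, table_name],
        }
    }


payload = table_target(["NS1", "NS2"], "test_table_1")
print(json.dumps(payload, indent=2))
```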

For inheritable policies, only one policy of a given type can be attached to a resource. For non-inheritable policies, multiple policies of the same type can be attached.
Contributor

[not requesting change]: is there a document about why this is the case?

Contributor Author

Added a note at the section "Policy Inheritance"


### Retrieving Applicable Policies
A user can view applicable policies on a resource (e.g., table, namespace, or catalog) as long as they have
read permission on that resource. The permission model may be enhanced in the future when Fine-Grained Access Control
policy is introduced, which will provide more granular control over policy visibility and management.
Contributor

@singhpk234 singhpk234 May 20, 2025

Should we skip this part for now? It's a bit too much info.

Suggested change
read permission on that resource. The permission model may be enhanced in the future when Fine-Grained Access Control
policy is introduced, which will provide more granular control over policy visibility and management.
read permission on that resource.


Here is an example to find all policies that apply to a specific resource (including inherited policies):
Contributor

Should we also mention that RBAC comes into the picture? Not everyone can see all the policies applicable to a resource. For example, I think the current implementation requires permission on a resource to see the policies applicable to it.

Contributor Author

Added it

```
GET /polaris/v1/catalog/applicable-policies?namespace=finance%1Fquarterly&target-name=transactions
```
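
In the request above, `%1F` is the percent-encoded unit separator character (U+001F), which joins the namespace levels `finance` and `quarterly`. The sketch below builds the same request line with the Python standard library. The helper name is hypothetical, and treating `catalog` in the path as the `{prefix}` value is an assumption based on the other endpoints in this document.

```python
from urllib.parse import urlencode


def applicable_policies_path(prefix, namespace_levels, target_name):
    """Build the path and query string for the applicable-policies request."""
    query = urlencode({
        # Namespace levels are joined with the unit separator (0x1F),
        # which urlencode percent-encodes as %1F.
        "namespace": "\x1f".join(namespace_levels),
        "target-name": target_name,
    })
    return f"/polaris/v1/{prefix}/applicable-policies?{query}"


path = applicable_policies_path("catalog", ["finance", "quarterly"], "transactions")
print(path)
```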

**Sample response:**
```json
{
  "policies": [
    {
      "name": "snapshot-expiry-policy",
      "type": "system.snapshot-expiry",
      "appliedAt": "namespace",
      "content": {
        "version": "2025-02-03",
        "enable": true,
        "config": {
          "min_snapshot_to_keep": 1,
          "max_snapshot_age_days": 2,
          "max_ref_age_days": 3
        }
      }
    },
    {
      "name": "compaction-policy",
      "type": "system.data-compaction",
      "appliedAt": "catalog",
      "content": {
        "version": "2025-02-03",
        "enable": true,
        "config": {
          "target_file_size_bytes": 134217728
        }
      }
    }
  ]
}
```
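
A client might want to see which level each applicable policy comes from. The following sketch parses a trimmed copy of the sample response and groups policy names by `appliedAt`. It assumes the response has exactly the shape shown above; the variable names are illustrative.

```python
import json
from collections import defaultdict

# Trimmed copy of the sample response; the content fields are omitted.
response_text = """
{
  "policies": [
    {"name": "snapshot-expiry-policy", "type": "system.snapshot-expiry", "appliedAt": "namespace"},
    {"name": "compaction-policy", "type": "system.data-compaction", "appliedAt": "catalog"}
  ]
}
"""

# Group policy names by the level at which they were attached.
by_level = defaultdict(list)
for policy in json.loads(response_text)["policies"]:
    by_level[policy["appliedAt"]].append(policy["name"])

for level, names in by_level.items():
    print(level, names)
```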

### API Reference

For the complete and up-to-date API specification, see [policy-apis.yaml](https://github.com/apache/polaris/blob/main/spec/polaris-catalog-apis/policy-apis.yaml).