diff --git a/site/docs/metadata.md b/site/docs/metadata.md
new file mode 100644
index 000000000000..c83a7858d8e1
--- /dev/null
+++ b/site/docs/metadata.md
@@ -0,0 +1,134 @@
+# Metadata Tables
+
+This page describes the internal metadata tables maintained by Iceberg. Please refer to the [definitions page](terms.md)
+for more information on terms and definitions, and to the [specifications page](spec.md) for more information on Iceberg's
+table specification. The complete metadata table schemas can be found on the [Spark Queries page](spark-queries.md#metadata-table-schema).
+
+| Name | Description |
+| --------------------------------------------------| ------------|
+| [`AllDataFilesTable`](#AllDataFilesTable) | Contains rows representing all of the data files in the table. Each row contains metadata as well as path information stored by Iceberg. This differs from the `DataFilesTable` because it contains all files currently referenced by any existing snapshot of this table rather than just the current one.
+| [`AllEntriesTable`](#AllEntriesTable) | Contains a table's manifest entries as rows, for both delete and data files. Please note that this table exposes internal details, like files that have been deleted. For a table of the live data files, please use `DataFilesTable`.
+| [`AllManifestsTable`](#AllManifestsTable) | Contains a table's valid manifest files as rows. A valid manifest file is referenced from any snapshot currently tracked by the table. This table may contain duplicate rows.
+| [`DataFilesTable`](#DataFilesTable) | Contains a table's data files as rows.
+| [`HistoryTable`](#HistoryTable) | Contains a table's history as rows. History is based on the table's snapshot log, which logs each update to the table's current snapshot.
+| [`ManifestEntriesTable`](#ManifestEntriesTable) | Contains a table's manifest entries as rows, for both delete and data files. Please note that this table exposes internal details, like files that have been deleted. For a table of the live data files, please use `DataFilesTable`.
+| [`ManifestsTable`](#ManifestsTable) | Contains a table's manifest files as rows.
+| [`PartitionsTable`](#PartitionsTable) | Contains a table's partitions as rows.
+| [`SnapshotsTable`](#SnapshotsTable) | Contains a table's known snapshots as rows. This does not include snapshots that have been expired using [`ExpireSnapshots`](https://iceberg.apache.org/javadoc/master/org/apache/iceberg/ExpireSnapshots.html).
+
+
+## Table Schema
+
+### 1. `AllDataFilesTable`
+
+| Column name | Required | Data type | Description |
+|-----------------------|-----------|-------------------|-------------|
+| content | | int | Contents of the file: 0=data, 1=position deletes, 2=equality deletes
+| file_path | ✔️ | string | Location URI with FS scheme
+| file_format | ✔️ | string | File format name: avro, orc, or parquet
+| partition | ✔️ | `struct<...>` | Partition data tuple, schema based on the partition spec
+| record_count | ✔️ | long | Number of records in the file
+| file_size_in_bytes | ✔️ | long | Total file size in bytes
+| column_sizes | | `map<int, long>` | Map of column id to total size on disk
+| value_counts | | `map<int, long>` | Map of column id to total count, including null and NaN
+| null_value_counts | | `map<int, long>` | Map of column id to null value count
+| nan_value_counts | | `map<int, long>` | Map of column id to number of NaN values in the column
+| lower_bounds | | `map<int, binary>` | Map of column id to lower bound
+| upper_bounds | | `map<int, binary>` | Map of column id to upper bound
+| key_metadata | | binary | Encryption key metadata blob
+| split_offsets | | `list<long>` | Splittable offsets
+| equality_ids | | `list<int>` | Equality comparison field IDs
+| sort_order_id | | int | Sort order ID
+
+### 2. `AllEntriesTable`
+
+| Column name | Required | Data type | Description |
+|-------------------|----------|------------------------|-------------|
+| status | ✔️ | int | Used to track additions and deletions: `0: EXISTING` `1: ADDED` `2: DELETED`
+| snapshot_id | | long | Snapshot id where the file was added, or deleted if status is 2. Inherited when null.
+| sequence_number | | long | Sequence number when the file was added. Inherited when null.
+| data_file | ✔️ | `data_file` `struct` | File path, partition tuple, metrics, ...
+
+### 3. `AllManifestsTable`
+
+| Column name | Required | Data type | Description |
+|---------------------------|----------|--------------------|-------------|
+| path | ✔️ | string | Location of the manifest file
+| length | ✔️ | long | Length of the manifest file
+| partition_spec_id | | int | ID of a partition spec used to write the manifest; must be listed in table metadata `partition-specs`
+| added_snapshot_id | | long | ID of the snapshot where the manifest file was added
+| added_data_files_count | | int | Number of entries in the manifest that have status `ADDED` (1); when `null`, this is assumed to be non-zero
+| existing_data_files_count | | int | Number of entries in the manifest that have status `EXISTING` (0); when `null`, this is assumed to be non-zero
+| deleted_data_files_count | | int | Number of entries in the manifest that have status `DELETED` (2); when `null`, this is assumed to be non-zero
+| partition_summaries | | `list<struct<...>>` | Partition summary information: contains null/NaN flags, optional lower and upper bounds
+
+### 4. `DataFilesTable`
+
+| Column name | Required | Data type | Description |
+|-----------------------|-------|-------------------|-------------|
+| content | | int | Contents of the file: 0=data, 1=position deletes, 2=equality deletes
+| file_path | ✔️ | string | Location URI with FS scheme
+| file_format | ✔️ | string | File format name: avro, orc, or parquet
+| partition | ✔️ | `struct<...>` | Partition data tuple, schema based on the partition spec
+| record_count | ✔️ | long | Number of records in the file
+| file_size_in_bytes | ✔️ | long | Total file size in bytes
+| column_sizes | | `map<int, long>` | Map of column id to total size on disk
+| value_counts | | `map<int, long>` | Map of column id to total count, including null and NaN
+| null_value_counts | | `map<int, long>` | Map of column id to null value count
+| nan_value_counts | | `map<int, long>` | Map of column id to number of NaN values in the column
+| lower_bounds | | `map<int, binary>` | Map of column id to lower bound
+| upper_bounds | | `map<int, binary>` | Map of column id to upper bound
+| key_metadata | | binary | Encryption key metadata blob
+| split_offsets | | `list<long>` | Splittable offsets
+| equality_ids | | `list<int>` | Equality comparison field IDs
+| sort_order_id | | int | Sort order ID
+
+### 5. `HistoryTable`
+
+| Column name | Required | Data type | Description |
+|-----------------------|-----------|-----------|-------------|
+| made_current_at | ✔️ | timestamptz | Timestamp (with timezone) when this snapshot was promoted to current, i.e. when the first writer to this snapshot committed.
+| snapshot_id | ✔️ | long | A unique ID
+| parent_id | | long | ID of parent snapshot
+| is_current_ancestor | ✔️ | boolean | True if this snapshot is an ancestor of the current snapshot; false otherwise
+
+### 6. `ManifestEntriesTable`
+
+| Column name | Required | Data type | Description |
+|-------------------|----------|------------------------|-------------|
+| status | ✔️ | int | Used to track additions and deletions: `0: EXISTING` `1: ADDED` `2: DELETED`
+| snapshot_id | | long | Snapshot id where the file was added, or deleted if status is 2. Inherited when null.
+| sequence_number | | long | Sequence number when the file was added. Inherited when null.
+| data_file | ✔️ | `data_file` `struct` | File path, partition tuple, metrics, ...
+
+### 7. `ManifestsTable`
+
+| Column name | Required | Data type | Description |
+|---------------------------|----------|--------------------|-------------|
+| path | ✔️ | string | Location of the manifest file
+| length | ✔️ | long | Length of the manifest file
+| partition_spec_id | ✔️ | int | ID of a partition spec used to write the manifest; must be listed in table metadata `partition-specs`
+| added_snapshot_id | ✔️ | long | ID of the snapshot where the manifest file was added
+| added_data_files_count | ✔️ | int | Number of entries in the manifest that have status `ADDED` (1); when `null`, this is assumed to be non-zero
+| existing_data_files_count | ✔️ | int | Number of entries in the manifest that have status `EXISTING` (0); when `null`, this is assumed to be non-zero
+| deleted_data_files_count | ✔️ | int | Number of entries in the manifest that have status `DELETED` (2); when `null`, this is assumed to be non-zero
+| partition_summaries | ✔️ | `list<struct<...>>` | Partition summary information: contains null/NaN flags, optional lower and upper bounds
+
+### 8. `PartitionsTable`
+
+| Column name | Required | Data type | Description |
+|---------------|----------|----------------|-------------|
+| partition | ✔️ | `struct<...>` | Partition data tuple, with a schema determined by the table's partition spec
+| record_count | ✔️ | long | Aggregated number of records in this partition
+| file_count | ✔️ | int | Total number of data files in this partition
+
+### 9. `SnapshotsTable`
+
+| Column name | Required | Data type | Description |
+|---------------------------|----------|-------------|-------------|
+| committed_at | ✔️ | timestamptz | Commit timestamp with timezone
+| snapshot_id | ✔️ | long | A unique ID
+| parent_id | | long | The snapshot ID of the snapshot's parent. Omitted for any snapshot with no parent
+| operation | | string | Used by some operations, like snapshot expiration, to skip processing certain snapshots. Possible `operation` values are: `append`, `replace`, `overwrite`, `delete`
+| manifest_list | | string | The location of a manifest list for this snapshot that tracks manifest files with additional metadata
+| summary | | `map<string, string>` | A string map that summarizes the snapshot changes |
diff --git a/site/docs/spark-queries.md b/site/docs/spark-queries.md
index f7a78b566beb..644a15e7f2f5 100644
--- a/site/docs/spark-queries.md
+++ b/site/docs/spark-queries.md
@@ -234,6 +234,189 @@ SELECT * FROM prod.db.table.manifests
+----------------------------------------------------------------------+--------+-------------------+---------------------+------------------------+---------------------------+--------------------------+---------------------------------+
```
+### Metadata Table Schema
+
+1. `AllDataFilesTable`
+
+```json
+table {
+ 134: content: optional int (Contents of the file: 0=data, 1=position deletes, 2=equality deletes)
+ 100: file_path: required string (Location URI with FS scheme)
+ 101: file_format: required string (File format name: avro, orc, or parquet)
+ 102: partition: required struct<1000: data_bucket: optional int> (Partition data tuple, schema based on the partition spec)
+ 103: record_count: required long (Number of records in the file)
+ 104: file_size_in_bytes: required long (Total file size in bytes)
+ 108: column_sizes: optional map<int, long> (Map of column id to total size on disk)
+ 109: value_counts: optional map<int, long> (Map of column id to total count, including null and NaN)
+ 110: null_value_counts: optional map<int, long> (Map of column id to null value count)
+ 137: nan_value_counts: optional map<int, long> (Map of column id to number of NaN values in the column)
+ 125: lower_bounds: optional map<int, binary> (Map of column id to lower bound)
+ 128: upper_bounds: optional map<int, binary> (Map of column id to upper bound)
+ 131: key_metadata: optional binary (Encryption key metadata blob)
+ 132: split_offsets: optional list<long> (Splittable offsets)
+ 135: equality_ids: optional list<int> (Equality comparison field IDs)
+ 140: sort_order_id: optional int (Sort order ID)
+}
+```
+
+2. `AllEntriesTable`
+
+```json
+table {
+ 0: status: required int
+ 1: snapshot_id: optional long
+ 3: sequence_number: optional long
+ 2: data_file: required struct<
+ 134: content: optional int (Contents of the file: 0=data, 1=position deletes, 2=equality deletes),
+ 100: file_path: required string (Location URI with FS scheme),
+ 101: file_format: required string (File format name: avro, orc, or parquet),
+ 102: partition: required struct<1000: data_bucket: optional int> (Partition data tuple, schema based on the partition spec),
+ 103: record_count: required long (Number of records in the file),
+ 104: file_size_in_bytes: required long (Total file size in bytes),
+ 108: column_sizes: optional map<int, long> (Map of column id to total size on disk),
+ 109: value_counts: optional map<int, long> (Map of column id to total count, including null and NaN),
+ 110: null_value_counts: optional map<int, long> (Map of column id to null value count),
+ 137: nan_value_counts: optional map<int, long> (Map of column id to number of NaN values in the column),
+ 125: lower_bounds: optional map<int, binary> (Map of column id to lower bound),
+ 128: upper_bounds: optional map<int, binary> (Map of column id to upper bound),
+ 131: key_metadata: optional binary (Encryption key metadata blob),
+ 132: split_offsets: optional list<long> (Splittable offsets),
+ 135: equality_ids: optional list<int> (Equality comparison field IDs),
+ 140: sort_order_id: optional int (Sort order ID)
+ >
+}
+```
+
+3. `AllManifestsTable`
+
+```json
+table {
+ 1: path: required string
+ 2: length: required long
+ 3: partition_spec_id: optional int
+ 4: added_snapshot_id: optional long
+ 5: added_data_files_count: optional int
+ 6: existing_data_files_count: optional int
+ 7: deleted_data_files_count: optional int
+ 8: partition_summaries: optional list<
+ struct<
+ 10: contains_null: required boolean,
+ 11: contains_nan: required boolean,
+ 12: lower_bound: optional string,
+ 13: upper_bound: optional string
+ >
+ >
+}
+```
+
+4. `DataFilesTable`
+
+```json
+table {
+ 134: content: optional int (Contents of the file: 0=data, 1=position deletes, 2=equality deletes)
+ 100: file_path: required string (Location URI with FS scheme)
+ 101: file_format: required string (File format name: avro, orc, or parquet)
+ 102: partition: required struct<1000: data_bucket: optional int> (Partition data tuple, schema based on the partition spec)
+ 103: record_count: required long (Number of records in the file)
+ 104: file_size_in_bytes: required long (Total file size in bytes)
+ 108: column_sizes: optional map<int, long> (Map of column id to total size on disk)
+ 109: value_counts: optional map<int, long> (Map of column id to total count, including null and NaN)
+ 110: null_value_counts: optional map<int, long> (Map of column id to null value count)
+ 137: nan_value_counts: optional map<int, long> (Map of column id to number of NaN values in the column)
+ 125: lower_bounds: optional map<int, binary> (Map of column id to lower bound)
+ 128: upper_bounds: optional map<int, binary> (Map of column id to upper bound)
+ 131: key_metadata: optional binary (Encryption key metadata blob)
+ 132: split_offsets: optional list<long> (Splittable offsets)
+ 135: equality_ids: optional list<int> (Equality comparison field IDs)
+ 140: sort_order_id: optional int (Sort order ID)
+}
+```
+
+5. `HistoryTable`
+
+```java
+private static final Schema HISTORY_SCHEMA = new Schema(
+ Types.NestedField.required(1, "made_current_at", Types.TimestampType.withZone()),
+ Types.NestedField.required(2, "snapshot_id", Types.LongType.get()),
+ Types.NestedField.optional(3, "parent_id", Types.LongType.get()),
+ Types.NestedField.required(4, "is_current_ancestor", Types.BooleanType.get())
+);
+```
+
+6. `ManifestEntriesTable`
+
+```json
+table {
+ 0: status: required int
+ 1: snapshot_id: optional long
+ 3: sequence_number: optional long
+ 2: data_file: required struct<
+ 134: content: optional int (Contents of the file: 0=data, 1=position deletes, 2=equality deletes),
+ 100: file_path: required string (Location URI with FS scheme),
+ 101: file_format: required string (File format name: avro, orc, or parquet),
+ 102: partition: required struct<1000: data_bucket: optional int> (Partition data tuple, schema based on the partition spec),
+ 103: record_count: required long (Number of records in the file),
+ 104: file_size_in_bytes: required long (Total file size in bytes),
+ 108: column_sizes: optional map<int, long> (Map of column id to total size on disk),
+ 109: value_counts: optional map<int, long> (Map of column id to total count, including null and NaN),
+ 110: null_value_counts: optional map<int, long> (Map of column id to null value count),
+ 137: nan_value_counts: optional map<int, long> (Map of column id to number of NaN values in the column),
+ 125: lower_bounds: optional map<int, binary> (Map of column id to lower bound),
+ 128: upper_bounds: optional map<int, binary> (Map of column id to upper bound),
+ 131: key_metadata: optional binary (Encryption key metadata blob),
+ 132: split_offsets: optional list<long> (Splittable offsets),
+ 135: equality_ids: optional list<int> (Equality comparison field IDs),
+ 140: sort_order_id: optional int (Sort order ID)
+ >
+}
+```
+
+7. `ManifestsTable`
+
+```json
+table {
+ 1: path: required string
+ 2: length: required long
+ 3: partition_spec_id: required int
+ 4: added_snapshot_id: required long
+ 5: added_data_files_count: required int
+ 6: existing_data_files_count: required int
+ 7: deleted_data_files_count: required int
+ 8: partition_summaries: required list<
+ struct<
+ 10: contains_null: required boolean,
+ 11: contains_nan: required boolean,
+ 12: lower_bound: optional string,
+ 13: upper_bound: optional string
+ >
+ >
+}
+```
+
+8. `PartitionsTable`
+
+```java
+this.schema = new Schema(
+ Types.NestedField.required(1, "partition", table.spec().partitionType()),
+ Types.NestedField.required(2, "record_count", Types.LongType.get()),
+ Types.NestedField.required(3, "file_count", Types.IntegerType.get())
+);
+```
+
+9. `SnapshotsTable`
+
+```java
+private static final Schema SNAPSHOT_SCHEMA = new Schema(
+ Types.NestedField.required(1, "committed_at", Types.TimestampType.withZone()),
+ Types.NestedField.required(2, "snapshot_id", Types.LongType.get()),
+ Types.NestedField.optional(3, "parent_id", Types.LongType.get()),
+ Types.NestedField.optional(4, "operation", Types.StringType.get()),
+ Types.NestedField.optional(5, "manifest_list", Types.StringType.get()),
+ Types.NestedField.optional(6, "summary",
+ Types.MapType.ofRequired(7, 8, Types.StringType.get(), Types.StringType.get()))
+);
+```
+
## Inspecting with DataFrames
Metadata tables can be loaded in Spark 2.4 or Spark 3 using the DataFrameReader API:
diff --git a/site/docs/spec.md b/site/docs/spec.md
index 6bcfd379d7f8..f6b9321d0efa 100644
--- a/site/docs/spec.md
+++ b/site/docs/spec.md
@@ -375,7 +375,7 @@ A snapshot consists of the following fields:
| _optional_ | _optional_ | **`parent-snapshot-id`** | The snapshot ID of the snapshot's parent. Omitted for any snapshot with no parent |
| | _required_ | **`sequence-number`** | A monotonically increasing long that tracks the order of changes to a table |
| _required_ | _required_ | **`timestamp-ms`** | A timestamp when the snapshot was created, used for garbage collection and table inspection |
-| _optional_ | _required_ | **`manifest-list`** | The location of a manifest list for this snapshot that tracks manifest files with additional meadata |
+| _optional_ | _required_ | **`manifest-list`** | The location of a manifest list for this snapshot that tracks manifest files with additional metadata |
| _optional_ | | **`manifests`** | A list of manifest file locations. Must be omitted if `manifest-list` is present |
| _optional_ | _required_ | **`summary`** | A string map that summarizes the snapshot changes, including `operation` (see below) |
diff --git a/site/mkdocs.yml b/site/mkdocs.yml
index f7746bb50501..eaf27fd0e463 100644
--- a/site/mkdocs.yml
+++ b/site/mkdocs.yml
@@ -49,12 +49,13 @@ nav:
- How to Release: how-to-release.md
- Tables:
- Configuration: configuration.md
- - Schemas: schemas.md
- - Partitioning: partitioning.md
- - Table evolution: evolution.md
- Maintenance: maintenance.md
+ - Metadata: metadata.md
+ - Partitioning: partitioning.md
- Performance: performance.md
- Reliability: reliability.md
+ - Schemas: schemas.md
+ - Table evolution: evolution.md
- Spark:
- Getting Started: getting-started.md
- Configuration: spark-configuration.md