Merged
1 change: 0 additions & 1 deletion hadoop-hdds/common/src/main/resources/ozone-default.xml
@@ -2343,7 +2343,6 @@
<tag>OZONE, OM, MANAGEMENT</tag>
<description>
The maximum number of filesystem snapshot allowed in an Ozone Manager.
This limit is set to 65000 because the ext4 filesystem limits the number of hard links per file to 65,000.
Member:

Curious why you removed this?

Contributor Author:

Because we ended up setting the limit to 10k.

</description>
</property>

57 changes: 51 additions & 6 deletions hadoop-hdds/docs/content/feature/Snapshot.md
@@ -158,12 +158,57 @@ This section covers key configurations and monitoring for Ozone snapshots. Tune

**Snapshot-Related Configuration Parameters:**

* **`ozone.om.fs.snapshot.max.limit`**: Max snapshots per bucket (Default: 10000). Safety limit.
* **`ozone.om.snapshot.compaction.dag.max.time.allowed`**: Window for efficient SnapshotDiff (Default: 30 days). Older diffs may be slower.
* **`ozone.om.snapshot.diff.db.dir`**: Directory for SnapshotDiff job data. Defaults to OM metadata dir. Use a spacious location for large diffs.
* **`ozone.om.snapshot.rocksdb.metrics.enabled`**: Enable detailed RocksDB metrics for snapshots (Default: false). Use for debugging/monitoring.
* **`ozone.om.snapshot.load.native.lib`**: Use native RocksDB library for snapshot operations (Default: true). Set to false as a workaround for native library issues.
* **`ozone.om.snapshot.diff.concurrent.max`**: Max concurrent SnapshotDiff jobs per OM (Default: 10). Increase if OM resources allow.
* **General Snapshot Management**
  * `ozone.om.fs.snapshot.max.limit`: Max snapshots per bucket (Default: 10000). Safety limit.
  * `ozone.om.ratis.snapshot.dir`: The directory where OM Ratis snapshots are stored (Default: ratis-snapshot under OM DB dir).
  * `ozone.om.ratis.snapshot.max.total.sst.size`: The maximum total size of SST files to be included in a Ratis snapshot (Default: 100000000).
  * `ozone.om.snapshot.load.native.lib`: Use native RocksDB library for snapshot operations (Default: true). Set to false as a workaround for native library issues.
  * `ozone.om.snapshot.checkpoint.dir.creation.poll.timeout`: Timeout for polling the creation of the snapshot checkpoint directory (Default: 20s).

* **SnapshotDiff Service**
  * `ozone.om.snapshot.diff.db.dir`: Directory for SnapshotDiff job data. Defaults to OM metadata dir. Use a spacious location for large diffs.
  * `ozone.om.snapshot.force.full.diff`: Force a full diff for all snapshot diff jobs (Default: false).
  * `ozone.om.snapshot.diff.disable.native.libs`: Disable native libraries for snapshot diff (Default: false).
  * `ozone.om.snapshot.diff.max.page.size`: Maximum page size for snapshot diff (Default: 1000).
  * `ozone.om.snapshot.diff.thread.pool.size`: Thread pool size for snapshot diff (Default: 10).
  * `ozone.om.snapshot.diff.job.default.wait.time`: Default wait time for a snapshot diff job (Default: 1m).
  * `ozone.om.snapshot.diff.max.allowed.keys.changed.per.job`: Maximum number of keys allowed to be changed per snapshot diff job (Default: 10000000).

* **Snapshot Compaction and Cleanup**
  * `ozone.snapshot.key.deleting.limit.per.task`: The maximum number of keys scanned by the snapshot deleting service in a single run (Default: 20000).
  * `ozone.om.snapshot.compact.non.snapshot.diff.tables`: When enabled, allows compaction of tables not tracked by snapshot diffs after snapshots are evicted from the cache (Default: false).
  * `ozone.om.snapshot.compaction.dag.max.time.allowed`: Window for efficient SnapshotDiff (Default: 30 days). Older diffs may be slower.
  * `ozone.om.snapshot.prune.compaction.backup.batch.size`: Batch size for pruning compaction backups (Default: 2000).
  * `ozone.om.snapshot.compaction.dag.prune.daemon.run.interval`: Interval for the compaction DAG pruning daemon (Default: 1h).
  * `ozone.om.snapshot.diff.max.jobs.purge.per.task`: Maximum number of snapshot diff jobs to purge per task (Default: 100).
  * `ozone.om.snapshot.diff.job.report.persistent.time`: Persistence time for snapshot diff job reports (Default: 7d).
  * `ozone.om.snapshot.diff.cleanup.service.run.interval`: Interval for the snapshot diff cleanup service (Default: 1m).
  * `ozone.om.snapshot.diff.cleanup.service.timeout`: Timeout for the snapshot diff cleanup service (Default: 5m).
  * `ozone.om.snapshot.cache.cleanup.service.run.interval`: Interval for the snapshot cache cleanup service (Default: 1m).
  * `ozone.snapshot.filtering.limit.per.task`: The maximum number of snapshots to be filtered in a single run of the snapshot filtering service (Default: 2).
  * `ozone.snapshot.deleting.limit.per.task`: The maximum number of snapshots to be deleted in a single run of the snapshot deleting service (Default: 10).
  * `ozone.snapshot.filtering.service.interval`: Interval for the snapshot filtering service (Default: 60s).
  * `ozone.snapshot.deleting.service.timeout`: Timeout for the snapshot deleting service (Default: 300s).
  * `ozone.snapshot.deleting.service.interval`: Interval for the snapshot deleting service (Default: 30s).
  * `ozone.snapshot.deep.cleaning.enabled`: Enable deep cleaning of snapshots (Default: false).

* **Performance and Resource Management**
  * `ozone.om.snapshot.rocksdb.metrics.enabled`: Enable detailed RocksDB metrics for snapshots (Default: false). Use for debugging/monitoring.
  * `ozone.om.snapshot.cache.max.size`: Maximum size of the snapshot cache soft limit (Default: 10).
  * `ozone.om.snapshot.db.max.open.files`: Maximum number of open files for the snapshot database (Default: 100).

* **Snapshot Provider (Internal)**
  * `ozone.om.snapshot.provider.socket.timeout`: Socket timeout for the snapshot provider (Default: 5s).
  * `ozone.om.snapshot.provider.connection.timeout`: Connection timeout for the snapshot provider (Default: 5s).
  * `ozone.om.snapshot.provider.request.timeout`: Request timeout for the snapshot provider (Default: 5m).
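
As an illustration of overriding these defaults, a minimal `ozone-site.xml` sketch might look like the following. The property names come from the list above; the directory path and the thread pool value are hypothetical, chosen only for the example, and are not recommendations.

```xml
<!-- ozone-site.xml: illustrative overrides only; the values are assumptions -->
<configuration>
  <property>
    <name>ozone.om.snapshot.diff.db.dir</name>
    <!-- Hypothetical path on a volume with ample free space for large diffs -->
    <value>/data/ozone/om/snapdiff</value>
  </property>
  <property>
    <name>ozone.om.snapshot.diff.thread.pool.size</name>
    <!-- Listed default is 10; 20 is an arbitrary example value -->
    <value>20</value>
  </property>
</configuration>
```

Any property not set in `ozone-site.xml` keeps its default from `ozone-default.xml`.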

### Recon-Specific Settings

These settings, defined in `ozone-default.xml`, apply specifically to Recon.

* `ozone.recon.om.snapshot.task.initial.delay`: Initial delay for the OM snapshot task in Recon (Default: 1m).
* `ozone.recon.om.snapshot.task.interval.delay`: Interval for the OM snapshot task in Recon (Default: 5s).
* `ozone.recon.om.snapshot.task.flush.param`: Flush parameter for the OM snapshot task in Recon (Default: false).

Monitor OM heap usage with many snapshots or large diffs. Enable Ozone Native ACLs or Ranger for access control.
