HDDS-13378. [Docs] Add a Production page under Getting Started #8734
Merged · +158 −0 · 4 commits

* b69df00 docs: Add production deployment guide (jojochuang)
* 1c41951 Update Hugo header. (jojochuang)
* ae74cd7 docs: Update production deployment guide based on feedback (jojochuang)
* ed1285e docs: HDDS-13378. Add production deployment recommendations (jojochuang)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,89 @@ | ||
| --- | ||
| title: Production Deployment | ||
| weight: 6 | ||
| menu: | ||
| main: | ||
| parent: Getting Started | ||
| --- | ||
| <!-- | ||
| Licensed to the Apache Software Foundation (ASF) under one or more | ||
| contributor license agreements. See the NOTICE file distributed with | ||
| this work for additional information regarding copyright ownership. | ||
| The ASF licenses this file to You under the Apache License, Version 2.0 | ||
| (the "License"); you may not use this file except in compliance with | ||
| the License. You may obtain a copy of the License at | ||
|
|
||
| http://www.apache.org/licenses/LICENSE-2.0 | ||
|
|
||
| Unless required by applicable law or agreed to in writing, software | ||
| distributed under the License is distributed on an "AS IS" BASIS, | ||
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| See the License for the specific language governing permissions and | ||
| limitations under the License. | ||
| --> | ||
|
|
||
| This document provides guidance on the requirements and best practices for a production deployment of Apache Ozone. | ||
|
|
||
| ## Ozone Components | ||
|
|
||
| A typical production Ozone cluster includes the following services: | ||
|
|
||
| * **Ozone Manager (OM)**: Manages the namespace and metadata of the Ozone cluster. A production cluster requires 3 OM instances for high availability. | ||
| * **Storage Container Manager (SCM)**: Manages the data nodes and pipelines. A production cluster requires 3 SCM instances for high availability. | ||
| * **DataNode**: Stores the actual data in containers. A production cluster requires at least 3 DataNodes. | ||
| * **Recon**: A web-based UI for monitoring and managing the Ozone cluster. A Recon server is strongly recommended, though not required. | ||
| * **S3 Gateway (S3G)**: An S3-compatible gateway for accessing Ozone. Multiple S3 Gateway instances are strongly recommended to load balance S3 traffic. | ||
| * **HttpFs**: An HTTP gateway that exposes an HDFS-compatible (WebHDFS-style) REST API for accessing Ozone. This component is optional. | ||
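
The instance counts above can be turned into a quick sanity check for a planned topology. The following is a minimal sketch: `ClusterPlan` and `check_plan` are hypothetical helpers invented for illustration and are not part of Ozone. Treating fewer than two S3 Gateways as a finding is one reading of "multiple instances".

```python
from dataclasses import dataclass


@dataclass
class ClusterPlan:
    """Planned instance counts per service (hypothetical helper type)."""
    om: int
    scm: int
    datanode: int
    recon: int = 0
    s3g: int = 0


def check_plan(plan: ClusterPlan) -> list[str]:
    """Return findings; an empty list means the plan matches the guidance."""
    findings = []
    if plan.om != 3:
        findings.append(f"OM: expected 3 instances for HA, got {plan.om}")
    if plan.scm != 3:
        findings.append(f"SCM: expected 3 instances for HA, got {plan.scm}")
    if plan.datanode < 3:
        findings.append(f"DataNode: expected at least 3, got {plan.datanode}")
    if plan.recon < 1:
        findings.append("Recon: strongly recommended, none planned")
    if plan.s3g < 2:
        # Assumption: "multiple" is read as "two or more".
        findings.append("S3G: multiple instances recommended for load balancing")
    return findings


if __name__ == "__main__":
    for finding in check_plan(ClusterPlan(om=3, scm=1, datanode=5, recon=1, s3g=2)):
        print(finding)
```
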
|
|
||
| ## Requirements | ||
|
|
||
| ### System Requirements | ||
|
|
||
| * **Hardware**: Bare metal machines are recommended for optimal performance. Virtual machines or containers are not recommended for production deployments. | ||
| * **Operating System**: Linux (recommended distributions: Red Hat 8/Rocky 8+, Ubuntu, SUSE; supported architectures: x86/ARM). | ||
| * **Java Development Kit (JDK)**: Version 8 or higher. | ||
| * **Time Synchronization**: A time synchronization service such as Chrony or ntpd must be enabled to prevent time drift. | ||
|
|
||
| ### Memory Requirements | ||
|
|
||
| * **Ozone Manager (OM), Storage Container Manager (SCM), and Recon**: For large production clusters, a heap size of 64 GB is recommended. | ||
| * **DataNode, S3 Gateway, and HttpFs**: A heap size of 31 GB is recommended. | ||
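
As an illustration, the heap recommendations above can be expressed as standard JVM `-Xmx` flags. This is a sketch only: the role keys and the `heap_flag` helper are hypothetical, and how the flag reaches each service (for example, through an environment file) depends on the deployment. The 31 GB figure presumably keeps the heap under the roughly 32 GB threshold at which HotSpot stops using compressed object pointers. That rationale is an assumption and is not stated in the source.

```python
# Heap recommendations copied from the list above, in GB.
# The role keys are hypothetical labels, not Ozone identifiers.
RECOMMENDED_HEAP_GB = {
    "om": 64,
    "scm": 64,
    "recon": 64,
    "datanode": 31,
    "s3g": 31,
    "httpfs": 31,
}


def heap_flag(role: str) -> str:
    """Return the JVM maximum-heap flag for a role, e.g. '-Xmx64g'."""
    return f"-Xmx{RECOMMENDED_HEAP_GB[role]}g"


if __name__ == "__main__":
    for role in RECOMMENDED_HEAP_GB:
        print(role, heap_flag(role))
```
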
|
|
||
| ### Storage Requirements | ||
|
|
||
| * **Ozone Manager (OM), Storage Container Manager (SCM), and Recon Metadata Storage**: Use SAS SSD or NVMe SSD for metadata (RocksDB and Ratis) to ensure optimal performance. It is recommended to use RAID 1 (disk mirroring) for the metadata disks to protect against disk failures. | ||
| * **DataNode Storage**: | ||
| * **Ratis Log**: Use SAS SSD or NVMe SSD for the Ratis log directory for low latency writes. | ||
| * **Container Data**: Hard disks are acceptable for container data storage. | ||
| * **Disk Configuration**: It is recommended to use a JBOD (Just a Bunch Of Disks) configuration instead of RAID. Ozone is a replicated distributed storage system and handles data redundancy. Using RAID can decrease performance without providing additional data protection benefits. | ||
| * **Storage Type**: Use direct-attached storage. Do not use Network Attached Storage (NAS) or Storage Area Network (SAN). | ||
|
|
||
| ### Network Requirements | ||
|
|
||
| * **Network Bandwidth**: Network interface cards with at least 25 Gbps of bandwidth are recommended. | ||
| * **Network Topology**: A leaf-spine network topology with an oversubscription ratio below 3:1 is recommended for predictable performance. | ||
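
The oversubscription ratio of a leaf switch is its total server-facing bandwidth divided by its total spine-facing bandwidth. The sketch below works through one example. The port counts and speeds are made-up illustration values, not a recommended design.

```python
def oversubscription_ratio(downlinks: int, downlink_gbps: float,
                           uplinks: int, uplink_gbps: float) -> float:
    """Server-facing bandwidth divided by spine-facing bandwidth for one leaf."""
    if uplinks <= 0 or uplink_gbps <= 0:
        raise ValueError("a leaf switch needs at least one uplink")
    return (downlinks * downlink_gbps) / (uplinks * uplink_gbps)


if __name__ == "__main__":
    # Hypothetical leaf: 48 x 25 Gbps server ports, 6 x 100 Gbps uplinks.
    ratio = oversubscription_ratio(48, 25, 6, 100)
    verdict = "within guidance" if ratio < 3 else "too high"
    print(f"{ratio:.1f}:1 ({verdict})")
```

With only four uplinks instead of six, the same leaf would sit at exactly 3:1, which no longer satisfies "below 3:1".
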
|
|
||
| ### Security Requirements (Optional but Recommended) | ||
|
|
||
| * **Kerberos**: A Kerberos environment, including a Key Distribution Center (KDC), is recommended for enhanced security. | ||
|
|
||
| ## Recommended Configurations | ||
|
|
||
| ### Linux Kernel | ||
|
|
||
| * **CPU Governor**: Set the CPU scaling governor to `performance` to maximize performance. | ||
| * **Transparent Huge Pages**: Disable Transparent Huge Pages (THP) to avoid performance issues. | ||
| * **SELinux**: Disable SELinux. | ||
| * **Swappiness**: Set `vm.swappiness=1` to minimize swapping. | ||
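
The kernel settings above can be verified on a host by reading the usual procfs and sysfs files. The following is a minimal sketch: the paths are the standard Linux locations, `cpu0` stands in for all cores, and the helper names are hypothetical. SELinux is not covered here.

```python
import re
from pathlib import Path

# Standard Linux locations; the governor file may be absent on virtual machines.
SWAPPINESS = Path("/proc/sys/vm/swappiness")
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")
GOVERNOR = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor")


def active_thp_mode(text: str) -> str:
    """Extract the bracketed mode from text such as 'always madvise [never]'."""
    match = re.search(r"\[(\w+)\]", text)
    if match is None:
        raise ValueError(f"unexpected THP format: {text!r}")
    return match.group(1)


def kernel_findings(swappiness: str, thp: str, governor: str) -> list[str]:
    """Compare raw file contents with the recommendations; empty list means OK."""
    findings = []
    if int(swappiness.strip()) != 1:
        findings.append(f"vm.swappiness is {swappiness.strip()}, expected 1")
    if active_thp_mode(thp) != "never":
        findings.append(f"THP mode is {active_thp_mode(thp)}, expected never")
    if governor.strip() != "performance":
        findings.append(f"governor is {governor.strip()}, expected performance")
    return findings


def findings_for_this_host() -> list[str]:
    """Read the live values; raises OSError if a file is missing."""
    return kernel_findings(SWAPPINESS.read_text(),
                           THP_ENABLED.read_text(),
                           GOVERNOR.read_text())
```

Keeping the parsing separate from the file reads makes the check easy to exercise with captured output from other machines.
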
|
|
||
| ### Local File System | ||
|
|
||
| * **LVM**: Disable Logical Volume Manager (LVM) for data drives. | ||
| * **File System**: Use `ext4` or `xfs` file systems. | ||
| * **Mount Options**: Mount drives with the `noatime` option to reduce unnecessary disk writes. For SSDs, also add the `discard` option. | ||
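
To confirm that a data drive is mounted as described, the options column of `/proc/mounts` can be compared against the required set. This sketch assumes the usual whitespace-separated `/proc/mounts` layout (device, mount point, file system type, options). The helper name and the sample mount points are hypothetical.

```python
def missing_mount_options(mounts_text: str, mount_point: str,
                          required: set[str]) -> set[str]:
    """Return the required options absent from mount_point's /proc/mounts entry."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            return required - set(fields[3].split(","))
    raise LookupError(f"{mount_point} is not mounted")


if __name__ == "__main__":
    # Hypothetical sample in /proc/mounts format.
    sample = (
        "/dev/sdb1 /data/disk1 ext4 rw,noatime 0 0\n"
        "/dev/nvme0n1p1 /data/ratis xfs rw,relatime 0 0\n"
    )
    print(missing_mount_options(sample, "/data/disk1", {"noatime"}))
    print(missing_mount_options(sample, "/data/ratis", {"noatime", "discard"}))
```
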
|
|
||
| ### Ozone Configuration | ||
|
|
||
| * **Monitoring**: Install Prometheus and Grafana to monitor the Ozone cluster. For audit logs, consider a log ingestion framework such as the ELK Stack (Elasticsearch, Logstash, and Kibana) with Filebeat, or a similar framework. Alternatively, you can use Apache Ranger to manage audit logs. | ||
| * **Pipeline Limits**: Increase the number of allowed write pipelines to better suit your workload by adjusting `ozone.scm.datanode.pipeline.limit` and `ozone.scm.ec.pipeline.minimum`. | ||
| * **Heap Sizes**: Configure sufficient heap sizes for Ozone Manager (OM), Storage Container Manager (SCM), Recon, DataNode, S3 Gateway (S3G), and HttpFs services to ensure stability. | ||
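
The two pipeline properties named above are set in `ozone-site.xml`, which uses the Hadoop-style `<configuration>`/`<property>` layout. The sketch below renders such a fragment. The values `5` and `10` are placeholders chosen purely for illustration and are not recommendations from the source. The `render_properties` helper is hypothetical.

```python
import xml.etree.ElementTree as ET


def render_properties(props: dict[str, str]) -> str:
    """Render name/value pairs as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    ET.indent(root)  # pretty-print; requires Python 3.9+
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values: tune them for your own workload.
    overrides = {
        "ozone.scm.datanode.pipeline.limit": "5",
        "ozone.scm.ec.pipeline.minimum": "10",
    }
    print(render_properties(overrides))
```
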
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,69 @@ | ||
| --- | ||
| title: Production Deployment | ||
| weight: 6 | ||
| menu: | ||
| main: | ||
| parent: Getting Started | ||
| --- | ||
| <!-- | ||
| Licensed to the Apache Software Foundation (ASF) under one or more | ||
| contributor license agreements. See the NOTICE file distributed with | ||
| this work for additional information regarding copyright ownership. | ||
| The ASF licenses this file to You under the Apache License, Version 2.0 | ||
| (the "License"); you may not use this file except in compliance with | ||
| the License. You may obtain a copy of the License at | ||
|
|
||
| http://www.apache.org/licenses/LICENSE-2.0 | ||
|
|
||
| Unless required by applicable law or agreed to in writing, software | ||
| distributed under the License is distributed on an "AS IS" BASIS, | ||
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| See the License for the specific language governing permissions and | ||
| limitations under the License. | ||
| --> | ||
|
|
||
| This document provides guidance on the requirements and best practices for a production deployment of Apache Ozone. | ||
|
|
||
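| ## Requirements | ||
|
|
||
| ### System Requirements | ||
|
|
||
| * **Operating System**: Linux (recommended distributions: Red Hat 8/Rocky 8+, Ubuntu, SUSE; supported architectures: x86/ARM). | ||
| * **Java Development Kit (JDK)**: Version 8 or higher. | ||
| * **Time Synchronization**: A time synchronization service such as Chrony or ntpd must be enabled to prevent time drift. | ||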
|
|
||
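| ### Storage Requirements | ||
|
|
||
| * **Metadata Storage**: Use SAS SSD or NVMe SSD for metadata (RocksDB and Ratis) to ensure optimal performance. | ||
| * **DataNode Storage**: Hard disks are acceptable for DataNode data storage. | ||
| * **Storage Type**: Use direct-attached storage. Do not use Network Attached Storage (NAS) or Storage Area Network (SAN). | ||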
|
|
||
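| ### Network Requirements | ||
|
|
||
| * **Network Bandwidth**: Network interface cards with at least 25 Gbps of bandwidth are recommended. | ||
| * **Network Topology**: A leaf-spine network topology with an oversubscription ratio below 3:1 is recommended for predictable performance. | ||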
|
|
||
| ### Security Requirements (Optional but Recommended) | ||
|
|
||
| * **Kerberos**: A Kerberos environment, including a Key Distribution Center (KDC), is recommended for enhanced security. | ||
|
|
||
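| ## Recommended Configurations | ||
|
|
||
| ### Linux Kernel | ||
|
|
||
| * **CPU Governor**: Set the CPU scaling governor to `performance` to maximize performance. | ||
| * **Transparent Huge Pages**: Disable Transparent Huge Pages (THP) to avoid performance issues. | ||
| * **SELinux**: Disable SELinux. | ||
| * **Swappiness**: Set `vm.swappiness=1` to minimize swapping. | ||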
|
|
||
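| ### Local File System | ||
|
|
||
| * **LVM**: Disable Logical Volume Manager (LVM) for data drives. | ||
| * **File System**: Use `ext4` or `xfs` file systems. | ||
| * **Mount Options**: Mount drives with the `noatime` option to reduce unnecessary disk writes. For SSDs, also add the `discard` option. | ||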
|
|
||
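| ### Ozone Configuration | ||
|
|
||
| * **Monitoring**: Install Prometheus and Grafana to monitor the Ozone cluster. | ||
| * **Pipeline Limits**: Increase the number of allowed write pipelines to better suit your workload by adjusting `ozone.scm.datanode.pipeline.limit` and `ozone.scm.ec.pipeline.minimum`. | ||
| * **Heap Sizes**: Configure sufficient heap sizes for Ozone Manager (OM), Storage Container Manager (SCM), Recon, DataNode, S3 Gateway (S3G), and HttpFs services to ensure stability. |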