diff --git a/hadoop-hdds/docs/content/_index.md b/hadoop-hdds/docs/content/_index.md index 52e190cf99a4..9bc7a7ae695a 100644 --- a/hadoop-hdds/docs/content/_index.md +++ b/hadoop-hdds/docs/content/_index.md @@ -1,4 +1,5 @@ --- +name: Ozone title: Overview menu: main weight: -10 @@ -29,7 +30,7 @@ Apart from scaling to billions of objects of varying sizes, Ozone can function effectively in containerized environments like Kubernetes._*

-Applications like Apache Spark, Hive and YARN, work without any modifications when using Ozone. Ozone comes with a [Java client library]({{< ref "JavaApi.md" >}}), [S3 protocol support]({{< ref "S3.md" >}}), and a [command line interface]({{< ref "shell/_index.md" >}}) which makes it easy to use Ozone. +Applications like Apache Spark, Hive and YARN, work without any modifications when using Ozone. Ozone comes with a [Java client library]({{< ref "JavaApi.md" >}}), [S3 protocol support]({{< ref "S3.md" >}}), and a [command line interface]({{< ref "Cli.md" >}}) which makes it easy to use Ozone. Ozone consists of volumes, buckets, and keys: diff --git a/hadoop-hdds/docs/content/_index.zh.md b/hadoop-hdds/docs/content/_index.zh.md index 8bdcf5044454..689490be11ad 100644 --- a/hadoop-hdds/docs/content/_index.zh.md +++ b/hadoop-hdds/docs/content/_index.zh.md @@ -28,7 +28,7 @@ weight: -10 Ozone 不仅能存储数十亿个不同大小的对象，还支持在容器化环境（比如 Kubernetes）中运行。_*

Apache Spark、Hive 和 YARN 等应用无需任何修改即可使用 Ozone。Ozone 提供了 [Java API]({{< -ref "JavaApi.zh.md" >}})、[S3 接口]({{< ref "S3.zh.md" >}})和[命令行接口]({{< ref "shell/_index.zh.md" >}}),极大地方便了 Ozone +ref "JavaApi.zh.md" >}})、[S3 接口]({{< ref "S3.zh.md" >}})和命令行接口,极大地方便了 Ozone 在不同应用场景下的的使用。 Ozone 的管理由卷、桶和键组成: diff --git a/hadoop-hdds/docs/content/beyond/Containers.md b/hadoop-hdds/docs/content/beyond/Containers.md deleted file mode 100644 index 13a66d801f5d..000000000000 --- a/hadoop-hdds/docs/content/beyond/Containers.md +++ /dev/null @@ -1,234 +0,0 @@ ---- -title: "Ozone Containers" -summary: Ozone uses containers extensively for testing. This page documents the usage and best practices of Ozone. -weight: 2 ---- - - -Docker heavily is used at the ozone development with three principal use-cases: - -* __dev__: - * We use docker to start local pseudo-clusters (docker provides unified environment, but no image creation is required) -* __test__: - * We create docker images from the dev branches to test ozone in kubernetes and other container orchestrator system - * We provide _apache/ozone_ images for each release to make it easier for evaluation of Ozone. - These images are __not__ created __for production__ usage. - -

- -* __production__: - * We have documentation on how you can create your own docker image for your production cluster. - -Let's check out each of the use-cases in more detail: - -## Development - -Ozone artifact contains example docker-compose directories to make it easier to start Ozone cluster in your local machine. - -From distribution: - -```bash -cd compose/ozone -docker-compose up -d -``` - -After a local build: - -```bash -cd hadoop-ozone/dist/target/ozone-*/compose -docker-compose up -d -``` - -These environments are very important tools to start different type of Ozone clusters at any time. - -To be sure that the compose files are up-to-date, we also provide acceptance test suites which start -the cluster and check the basic behaviour. - -The acceptance tests are part of the distribution, and you can find the test definitions in `smoketest` directory. - -You can start the tests from any compose directory: - -For example: - -```bash -cd compose/ozone -./test.sh -``` - -### Implementation details - -`compose` tests are based on the apache/hadoop-runner docker image. The image itself does not contain -any Ozone jar file or binary just the helper scripts to start ozone. - -hadoop-runner provdes a fixed environment to run Ozone everywhere, but the ozone distribution itself -is mounted from the including directory: - -(Example docker-compose fragment) - -``` - scm: - image: apache/hadoop-runner:jdk11 - volumes: - - ../..:/opt/hadoop - ports: - - 9876:9876 - -``` - -The containers are configured based on environment variables, but because the same environment -variables should be set for each containers we maintain the list of the environment variables -in a separated file: - -``` - scm: - image: apache/hadoop-runner:jdk11 - #... - env_file: - - ./docker-config -``` - -The docker-config file contains the list of the required environment variables: - -``` -OZONE-SITE.XML_ozone.om.address=om -OZONE-SITE.XML_ozone.om.http-address=om:9874 -OZONE-SITE.XML_ozone.scm.names=scm -#... -``` - -As you can see we use naming convention. Based on the name of the environment variable, the -appropriate hadoop config XML (`ozone-site.xml` in our case) will be generated by a -[script](https://github.com/apache/hadoop/tree/docker-hadoop-runner-latest/scripts) which is -included in the `hadoop-runner` base image. - -The [entrypoint](https://github.com/apache/hadoop/blob/docker-hadoop-runner-latest/scripts/starter.sh) -of the `hadoop-runner` image contains a helper shell script which triggers this transformation and -can do additional actions (eg. initialize scm/om storage, download required keytabs, etc.) -based on environment variables. - -## Test/Staging - -The `docker-compose` based approach is recommended only for local test, not for multi node cluster. -To use containers on a multi-node cluster we need a Container Orchestrator like Kubernetes. - -Kubernetes example files are included in the `kubernetes` folder. - -*Please note*: all the provided images are based the `hadoop-runner` image which contains all the -required tool for testing in staging environments. For production we recommend to create your own, -hardened image with your own base image. - -### Test the release - -The release can be tested with deploying any of the example clusters: - -```bash -cd kubernetes/examples/ozone -kubectl apply -f -``` - -Plese note that in this case the latest released container will be downloaded from the dockerhub. 
- -### Test the development build - -To test a development build you can create your own image and upload it to your own docker registry: - - -```bash -mvn clean install -DskipTests -Pdocker-build,docker-push -Ddocker.image=myregistry:9000/name/ozone -``` - -The configured image will be used in all the generated kubernetes resources files (`image:` keys are adjusted during the build) - -```bash -cd kubernetes/examples/ozone -kubectl apply -f -``` - -## Production - - - -You can use the source of our development images as an example: - - * [Base image](https://github.com/apache/hadoop/blob/docker-hadoop-runner-jdk11/Dockerfile) - * [Docker image](https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/dist/src/main/docker/Dockerfile) - - Most of the elements are optional and just helper function but to use the provided example - kubernetes resources you may need the scripts from - [here](https://github.com/apache/hadoop/tree/docker-hadoop-runner-jdk11/scripts) - - * The two python scripts convert environment variables to real hadoop XML config files - * The start.sh executes the python scripts (and other initialization) based on environment variables. - -## Containers - -Ozone related container images and source locations: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
#ContainerRepositoryBaseBranchTagsComments
1apache/ozonehttps://github.com/apache/hadoop-docker-ozoneozone-... hadoop-runner0.3.0,0.4.0,0.4.1For each Ozone release we create new release tag.
2apache/hadoop-runner https://github.com/apache/hadoopdocker-hadoop-runnercentosjdk11,jdk8,latestThis is the base image used for testing Hadoop Ozone. - This is a set of utilities that make it easy for us run ozone.
diff --git a/hadoop-hdds/docs/content/beyond/Containers.zh.md b/hadoop-hdds/docs/content/beyond/Containers.zh.md deleted file mode 100644 index c06902e04a36..000000000000 --- a/hadoop-hdds/docs/content/beyond/Containers.zh.md +++ /dev/null @@ -1,203 +0,0 @@ ---- -title: "Ozone 中的容器技术" -summary: Ozone 广泛地使用容器来进行测试,本页介绍 Ozone 中容器的使用及其最佳实践。 -weight: 2 ---- - - -Ozone 的开发中大量地使用了 Docker,包括以下三种主要的应用场景: - -* __开发__: - * 我们使用 docker 来启动本地伪集群(docker 可以提供统一的环境,但是不需要创建镜像)。 -* __测试__: - * 我们从开发分支创建 docker 镜像,然后在 kubernetes 或其它容器编排系统上测试 ozone。 - * 我们为每个发行版提供了 _apache/ozone_ 镜像,以方便用户体验 Ozone。 - 这些镜像 __不__ 应当在 __生产__ 中使用。 - - - -* __生产__: - * 我们提供了如何为生产集群创建 docker 镜像的文档。 - -下面我们来详细地介绍一下各种应用场景: - -## 开发 - -Ozone 安装包中包含了 docker-compose 的示例目录,用于方便地在本地机器启动 Ozone 集群。 - -使用官方提供的发行包: - -```bash -cd compose/ozone -docker-compose up -d -``` - -本地构建方式: - -```bash -cd hadoop-ozone/dist/target/ozone-*/compose -docker-compose up -d -``` - -这些 compose 环境文件是重要的工具,可以用来随时启动各种类型的 Ozone 集群。 - -为了确保 compose 文件是最新的,我们提供了验收测试套件,套件会启动集群并检查其基本行为是否正常。 - -验收测试也包含在发行包中,你可以在 `smoketest` 目录下找到各个测试的定义。 - -你可以在任意 compose 目录进行测试,比如: - -```bash -cd compose/ozone -./test.sh -``` - -### 实现细节 - -`compose` 测试都基于 apache/hadoop-runner 镜像,这个镜像本身并不包含任何 Ozone 的 jar 包或二进制文件,它只是提供其了启动 Ozone 的辅助脚本。 - -hadoop-runner 提供了一个随处运行 Ozone 的固定环境,Ozone 分发包通过目录挂载包含在其中。 - -(docker-compose 示例片段) - -``` - scm: - image: apache/hadoop-runner:jdk11 - volumes: - - ../..:/opt/hadoop - ports: - - 9876:9876 - -``` - -容器应该通过环境变量来进行配置,由于每个容器都应当设置相同的环境变量,我们在单独的文件中维护了一个环境变量列表: - -``` - scm: - image: apache/hadoop-runner:jdk11 - #... - env_file: - - ./docker-config -``` - -docker-config 文件中包含了所需环境变量的列表: - -``` -OZONE-SITE.XML_ozone.om.address=om -OZONE-SITE.XML_ozone.om.http-address=om:9874 -OZONE-SITE.XML_ozone.scm.names=scm -#... -``` - -你可以看到我们所使用的命名规范,根据这些环境变量的名字,`hadoop-runner` 基础镜像中的[脚本](https://github.com/apache/hadoop/tree/docker-hadoop-runner-latest/scripts) 会生成合适的 hadoop XML 配置文件(在我们这种情况下就是 `ozone-site.xml`)。 - -`hadoop-runner` 镜像的[入口点](https://github.com/apache/hadoop/blob/docker-hadoop-runner-latest/scripts/starter -.sh)包含了一个辅助脚本,这个辅助脚本可以根据环境变量触发上述的配置文件生成以及其它动作(比如初始化 SCM 和 OM 的存储、下载必要的 keytab 等)。 - -## 测试 - -`docker-compose` 的方式应当只用于本地测试,不适用于多节点集群。要在多节点集群上使用容器,我们需要像 Kubernetes 这样的容器编排系统。 - -Kubernetes 示例文件在 `kubernetes` 文件夹中。 - -*请注意*:所有提供的镜像都使用 `hadoop-runner` 作为基础镜像,这个镜像中包含了所有测试环境所需的测试工具。对于生产环境,我们推荐用户使用自己的基础镜像创建可靠的镜像。 - -### 发行包测试 - -可以通过部署任意的示例集群来测试发行包: - -```bash -cd kubernetes/examples/ozone -kubectl apply -f -``` - -注意,在这个例子中会从 Docker Hub 下载最新的镜像。 - -### 开发构建测试 - -为了测试开发中的构建,你需要创建自己的镜像并上传到自己的 docker 仓库中: - - -```bash -mvn clean install -DskipTests -Pdocker-build,docker-push -Ddocker.image=myregistry:9000/name/ozone -``` - -所有生成的 kubernetes 资源文件都会使用这个镜像 (`image:` keys are adjusted during the build) - -```bash -cd kubernetes/examples/ozone -kubectl apply -f -``` - -## 生产 - - - -你可以使用我们开发中所用的镜像作为示例: - - * [基础镜像] (https://github.com/apache/hadoop/blob/docker-hadoop-runner-jdk11/Dockerfile) - * [完整镜像] (https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/dist/src/main/docker/Dockerfile) - - Dockerfile 中大部分内容都是可选的辅助功能,但如果要使用我们提供的 kubernetes 示例资源文件,你可能需要[这里](https://github.com/apache/hadoop/tree/docker-hadoop-runner-jdk11/scripts)的脚本。 - - * 两个 python 脚本将环境变量转化为实际的 hadoop XML 配置文件 - * start.sh 根据环境变量执行 python 脚本(以及其它初始化工作) - -## 容器 - -Ozone 相关的容器镜像和 Dockerfile 位置: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
#容器仓库基础镜像分支标签说明
1apache/ozonehttps://github.com/apache/hadoop-docker-ozoneozone-... hadoop-runner0.3.0,0.4.0,0.4.1每个 Ozone 发行版都对应一个新标签。
2apache/hadoop-runner https://github.com/apache/hadoopdocker-hadoop-runnercentosjdk11,jdk8,latest这是用于测试 Hadoop Ozone 的基础镜像,包含了一系列可以让我们更加方便地运行 Ozone 的工具。 -
diff --git a/hadoop-hdds/docs/content/beyond/DockerCheatSheet.md b/hadoop-hdds/docs/content/beyond/DockerCheatSheet.md deleted file mode 100644 index f4f5492cf177..000000000000 --- a/hadoop-hdds/docs/content/beyond/DockerCheatSheet.md +++ /dev/null @@ -1,88 +0,0 @@ ---- -title: "Docker Cheat Sheet" -date: 2017-08-10 -summary: Docker Compose cheat sheet to help you remember the common commands to control an Ozone cluster running on top of Docker. -weight: 4 ---- - - - -In the `compose` directory of the ozone distribution there are multiple pseudo-cluster setup which -can be used to run Ozone in different way (for example: secure cluster, with tracing enabled, -with prometheus etc.). - -If the usage is not document in a specific directory the default usage is the following: - -```bash -cd compose/ozone -docker-compose up -d -``` - -The data of the container is ephemeral and deleted together with the docker volumes. -```bash -docker-compose down -``` - -## Useful Docker & Ozone Commands - -If you make any modifications to ozone, the simplest way to test it is to run freon and unit tests. - -Here are the instructions to run freon in a docker-based cluster. - -{{< highlight bash >}} -docker-compose exec datanode bash -{{< /highlight >}} - -This will open a bash shell on the data node container. -Now we can execute freon for load generation. - -{{< highlight bash >}} -ozone freon randomkeys --numOfVolumes=10 --numOfBuckets 10 --numOfKeys 10 -{{< /highlight >}} - -Here is a set of helpful commands for working with docker for ozone. -To check the status of the components: - -{{< highlight bash >}} -docker-compose ps -{{< /highlight >}} - -To get logs from a specific node/service: - -{{< highlight bash >}} -docker-compose logs scm -{{< /highlight >}} - - -As the WebUI ports are forwarded to the external machine, you can check the web UI: - -* For the Storage Container Manager: http://localhost:9876 -* For the Ozone Manager: http://localhost:9874 -* For the Datanode: check the port with `docker ps` (as there could be multiple data nodes, ports are mapped to the ephemeral port range) - -You can start multiple data nodes with: - -{{< highlight bash >}} -docker-compose scale datanode=3 -{{< /highlight >}} - -You can test the commands from the [Ozone CLI]({{< ref "shell/_index.md" >}}) after opening a new bash shell in one of the containers: - -{{< highlight bash >}} -docker-compose exec datanode bash -{{< /highlight >}} diff --git a/hadoop-hdds/docs/content/beyond/DockerCheatSheet.zh.md b/hadoop-hdds/docs/content/beyond/DockerCheatSheet.zh.md deleted file mode 100644 index 0a37f9ba0714..000000000000 --- a/hadoop-hdds/docs/content/beyond/DockerCheatSheet.zh.md +++ /dev/null @@ -1,85 +0,0 @@ ---- -title: "Docker 速查表" -date: 2017-08-10 -summary: Docker Compose 速查表帮助你记住一些操作在 Docker 上运行的 Ozone 集群的常用命令。 -weight: 4 ---- - - - -Ozone 发行包中的 `compose` 目录包含了多种伪集群配置,可以用来以多种方式运行 Ozone(比如:安全集群,启用追踪功能,启用 prometheus 等)。 - -如果目录下没有额外的使用说明,默认的用法如下: - -```bash -cd compose/ozone -docker-compose up -d -``` - -容器中的数据没有持久化,在集群关闭时会和 docker 卷一起被删除。 -```bash -docker-compose down -``` - -## Docker 和 Ozone 实用命令 - -如果你对 Ozone 做了修改,最简单的测试方法是运行 freon 和单元测试。 - -下面是在基于 docker 的集群中运行 freon 的命令。 - -{{< highlight bash >}} -docker-compose exec datanode bash -{{< /highlight >}} - -这会在数据节点的容器中打开一个 bash shell,接下来我们执行 freon 来生成负载。 - -{{< highlight bash >}} -ozone freon randomkeys --numOfVolumes=10 --numOfBuckets 10 --numOfKeys 10 -{{< /highlight >}} - -下面是一些与 docker 有关的实用命令。 -检查各组件的状态: - -{{< highlight bash >}} -docker-compose ps -{{< 
/highlight >}} - -获取指定节点/服务中的日志: - -{{< highlight bash >}} -docker-compose logs scm -{{< /highlight >}} - - -因为 WebUI 的端口已经被转发到外部机器,你可以查看 web UI: - -* 对于 Storage Container Manager:http://localhost:9876 -* 对于 Ozone Manager:http://localhost:9874 -* 对于 数据节点:使用 `docker ps` 查看端口(因为可能会有多个数据节点,它们的端口被映射到一个临时的端口) - -你也可以启动多个数据节点: - -{{< highlight bash >}} -docker-compose scale datanode=3 -{{< /highlight >}} - -在一个容器中打开 bash shell 后,你也可以对 [Ozone 命令行接口]({{< ref "shell/_index.zh.md" >}})中的命令进行测试。 - -{{< highlight bash >}} -docker-compose exec datanode bash -{{< /highlight >}} diff --git a/hadoop-hdds/docs/content/beyond/_index.md b/hadoop-hdds/docs/content/beyond/_index.md deleted file mode 100644 index 2a29a5810aab..000000000000 --- a/hadoop-hdds/docs/content/beyond/_index.md +++ /dev/null @@ -1,30 +0,0 @@ ---- -title: "Beyond Basics" -date: "2017-10-10" -menu: main -weight: 7 - ---- - - -{{}} - Beyond Basics pages go into custom configurations of Ozone, including how - to run Ozone concurrently with an existing HDFS cluster. These pages also - take deep into how to run profilers and leverage tracing support built into - Ozone. -{{}} diff --git a/hadoop-hdds/docs/content/beyond/_index.zh.md b/hadoop-hdds/docs/content/beyond/_index.zh.md deleted file mode 100644 index b7f6775674e2..000000000000 --- a/hadoop-hdds/docs/content/beyond/_index.zh.md +++ /dev/null @@ -1,27 +0,0 @@ ---- -title: "进阶" -date: "2017-10-10" -menu: main -weight: 7 - ---- - - -{{}} - 本部分介绍 Ozone 的自定义配置,包括如何将 Ozone 以并存的方式部署到已有的 HDFS 集群,以及如何运行 Ozone 内置的 profilers 和 tracing 功能。 -{{}} diff --git a/hadoop-hdds/docs/content/concept/Containers.md b/hadoop-hdds/docs/content/concept/Containers.md new file mode 100644 index 000000000000..4e46acc5a280 --- /dev/null +++ b/hadoop-hdds/docs/content/concept/Containers.md @@ -0,0 +1,47 @@ +--- +title: Containers +weight: 5 +menu: + main: + parent: Architecture +summary: Description of the Containers, the replication unit of Ozone. +--- + + + +Containers are the fundamental replication unit of Ozone/HDDS, they are managed by the Storage Container Manager (SCM) service. + +Containers are big binary units (5Gb by default) which can contain multiple blocks: + +![Containers](Containers.png) + +Blocks are local information and not managed by SCM. Therefore even if billions of small files are created in the system (which means billions of blocks are created), only of the status of the containers will be reported by the Datanodes and containers will be replicated. + +When Ozone Manager requests a new Block allocation from the SCM, SCM will identify the suitable container and generate a block id which contains `ContainerId` + `LocalId`. Client will connect to the Datanode which stores the Container, and datanode can manage the separated block based on the `LocalId`. + +## Open vs. Closed containers + +When a container is created it starts in an OPEN state. When it's full (~5GB data is written), container will be closed and becomes a CLOSED container. 
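
To make the `ContainerId` + `LocalId` split described above more tangible, here is a sketch of how those ids surface in key metadata. The field names match the `ozone sh key info` output documented on the CLI page added by this patch; the concrete values below are illustrative only.

```shell
# Illustrative only: a key's metadata records which container (containerID)
# and which block inside that container (localID) hold the data.
$ ozone sh key info /vol1/bucket1/README.md
...
  "ozoneKeyLocations" : [ {
    "containerID" : 1,
    "localID" : 104591670688743424,
    "length" : 3841,
    "offset" : 0
  } ],
...
```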
+ +The fundamental differences between OPEN and CLOSED containers: + +OPEN | CLOSED +-----------------------------------|----------------------------------------- +mutable | immutable +replicated with RAFT (Ratis) | Replicated with async container copy +Raft leader is used to READ / WRITE | All the nodes can be used to READ diff --git a/hadoop-hdds/docs/content/concept/Containers.png b/hadoop-hdds/docs/content/concept/Containers.png new file mode 100644 index 000000000000..3d2df0f313bc Binary files /dev/null and b/hadoop-hdds/docs/content/concept/Containers.png differ diff --git a/hadoop-hdds/docs/content/concept/Datanodes.md b/hadoop-hdds/docs/content/concept/Datanodes.md index f7b27297eda5..1910d6c6b53b 100644 --- a/hadoop-hdds/docs/content/concept/Datanodes.md +++ b/hadoop-hdds/docs/content/concept/Datanodes.md @@ -1,7 +1,10 @@ --- title: "Datanodes" date: "2017-09-14" -weight: 4 +weight: 7 +menu: + main: + parent: Architecture summary: Ozone supports Amazon's Simple Storage Service (S3) protocol. In fact, You can use S3 clients and S3 SDK based applications without any modifications with Ozone. --- - -Storage container manager provides multiple critical functions for the Ozone -cluster. SCM acts as the cluster manager, Certificate authority, Block -manager and the Replica manager. - -{{}} -SCM is in charge of creating an Ozone cluster. When an SCM is booted up via init command, SCM creates the cluster identity and root certificates needed for the SCM certificate authority. SCM manages the life cycle of a data node in the cluster. -{{}} - -{{}} -SCM's Ceritificate authority is in -charge of issuing identity certificates for each and every -service in the cluster. This certificate infrastructre makes -it easy to enable mTLS at network layer and also the block -token infrastructure depends on this certificate infrastructure. -{{}} - -{{}} -SCM is the block manager. SCM -allocates blocks and assigns them to data nodes. Clients -read and write these blocks directly. -{{}} - - -{{}} -SCM keeps track of all the block -replicas. If there is a loss of data node or a disk, SCM -detects it and instructs data nodes make copies of the -missing blocks to ensure high avialablity. -{{}} diff --git a/hadoop-hdds/docs/content/concept/Overview.md b/hadoop-hdds/docs/content/concept/Overview.md index 23fcda2325ae..f478734124ec 100644 --- a/hadoop-hdds/docs/content/concept/Overview.md +++ b/hadoop-hdds/docs/content/concept/Overview.md @@ -2,6 +2,11 @@ title: Overview date: "2017-10-10" weight: 1 +menu: + main: + name: "ArchitectureOverview" + title: "Overview" + parent: Architecture summary: Ozone's overview and components that make up Ozone. --- @@ -29,7 +34,7 @@ scale to billions of objects. Ozone separates namespace management and block space management; this helps ozone to scale much better. The namespace is managed by a daemon called [Ozone Manager ]({{< ref "OzoneManager.md" >}}) (OM), and block space is -managed by [Storage Container Manager]({{< ref "Hdds.md" >}}) (SCM). +managed by [Storage Container Manager]({{< ref "StorageContainerManager.md" >}}) (SCM). Ozone consists of volumes, buckets, and keys. 
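
As a quick, hedged illustration of the volume/bucket/key hierarchy mentioned above (these are the same commands documented on the CLI page added by this patch; the names are examples only):

```shell
# Example only: an administrator creates a volume, a bucket is created inside
# it, and a key (the actual object data) is uploaded into the bucket.
ozone sh volume create /vol1
ozone sh bucket create /vol1/bucket1
ozone sh key put /vol1/bucket1/README.md README.md
```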
diff --git a/hadoop-hdds/docs/content/concept/Overview.zh.md b/hadoop-hdds/docs/content/concept/Overview.zh.md index de16738a423c..042651ed1b2f 100644 --- a/hadoop-hdds/docs/content/concept/Overview.zh.md +++ b/hadoop-hdds/docs/content/concept/Overview.zh.md @@ -24,7 +24,7 @@ summary: 介绍 Ozone 的整体和各个组件。 Ozone 是一个分布式、多副本的对象存储系统,并针对大数据场景进行了专门的优化。Ozone 主要围绕可扩展性进行设计,目标是十亿数量级以上的对象存储。 -Ozone 通过对命名空间与块空间的管理进行分离,大大增加了其可扩展性,其中命名空间由 [Ozone Manager ]({{< ref "OzoneManager.zh.md" >}})(OM)管理,块空间由 [Storage Container Manager] ({{< ref "Hdds.zh.md" >}})(SCM)管理。 +Ozone 通过对命名空间与块空间的管理进行分离,大大增加了其可扩展性,其中命名空间由 [Ozone Manager ]({{< ref "OzoneManager.zh.md" >}})(OM)管理,块空间由 [Storage Container Manager] ({{< ref "StorageContainerManager.zh.md" >}})(SCM)管理。 Ozone 的管理由卷、桶和键组成。卷类似于个人主目录,只有管理员可以创建。 diff --git a/hadoop-hdds/docs/content/concept/OzoneManager-ReadPath.png b/hadoop-hdds/docs/content/concept/OzoneManager-ReadPath.png new file mode 100644 index 000000000000..5e68f6fc1cd6 Binary files /dev/null and b/hadoop-hdds/docs/content/concept/OzoneManager-ReadPath.png differ diff --git a/hadoop-hdds/docs/content/concept/OzoneManager-WritePath.png b/hadoop-hdds/docs/content/concept/OzoneManager-WritePath.png new file mode 100644 index 000000000000..924b61c31a23 Binary files /dev/null and b/hadoop-hdds/docs/content/concept/OzoneManager-WritePath.png differ diff --git a/hadoop-hdds/docs/content/concept/OzoneManager.md b/hadoop-hdds/docs/content/concept/OzoneManager.md index 1ebdd4951d20..f0711ed21d0d 100644 --- a/hadoop-hdds/docs/content/concept/OzoneManager.md +++ b/hadoop-hdds/docs/content/concept/OzoneManager.md @@ -2,6 +2,9 @@ title: "Ozone Manager" date: "2017-09-14" weight: 2 +menu: + main: + parent: Architecture summary: Ozone Manager is the principal name space service of Ozone. OM manages the life cycle of volumes, buckets and Keys. --- +![Ozone Manager](OzoneManager.png) + Ozone Manager (OM) is the namespace manager for Ozone. This means that when you want to write some data, you ask Ozone @@ -55,6 +60,8 @@ understood if we trace what happens during a key write and key read. ### Key Write +![Write Path](OzoneManager-WritePath.png) + * To write a key to Ozone, a client tells Ozone manager that it would like to write a key into a bucket that lives inside a specific volume. Once Ozone Manager determines that you are allowed to write a key to the specified bucket, @@ -73,15 +80,65 @@ to the client. the block and writes data to the data node. * Once the write is complete on the data node, the client will update the block -information on -Ozone manager. - +information on Ozone manager. ### Key Reads +![Read Path](OzoneManager-ReadPath.png) + * Key reads are simpler, the client requests the block list from the Ozone Manager * Ozone manager will return the block list and block tokens which allows the client to read the data from data nodes. * Client connects to the data node and presents the block token and reads the data from the data node. + +## Main components of the Ozone Manager + +For a detailed view of Ozone Manager this section gives a quick overview about the provided network services and the stored persisted data. + +**Network services provided by Ozone Manager:** + +Ozone provides a network service for the client and for administration commands. 
The main service calls are: + + * Key, Bucket, Volume / CRUD + * Multipart upload (Initiate, Complete…) + * Supports upload of huge files in multiple steps + * FS related calls (optimized for hierarchical queries instead of a flat ObjectStore namespace) + * GetFileStatus, CreateDirectory, CreateFile, LookupFile + * ACL related + * Managing ACLs if [internal ACLs]({{< ref "security/SecurityAcls.md" >}}) are used instead of [Ranger]({{< ref "security/SecurityWithRanger.md" >}}) + * Delegation token (Get / Renew / Cancel) + * For security + * Admin APIs + * Get S3 secret + * ServiceList (used for service discovery) + * DBUpdates (used by [Recon]({{< ref "feature/Recon.md" >}}) to download snapshots) + +**Persisted state** + +The following data is persisted on the Ozone Manager side in a specific RocksDB directory: + + * Volume / Bucket / Key tables + * This is the main responsibility of OM + * Key metadata contains the block id (which includes container id) to find the data + * OpenKey table + * for keys which are created, but not yet committed + * Delegation token table + * for security + * PrefixInfo table + * specific index table to store directory level ACL and to provide better performance for hierarchical queries + * S3 secret table + * For S3 secret management + * Multipart info table + * In-flight multipart uploads are tracked here + * Deleted table + * To track the blocks which should be deleted from the datanodes + +## Notable configuration + +key | default | description | +----|-------------|-------- +ozone.om.address | 0.0.0.0:9862 | RPC address of the OM. Required by the client. +ozone.om.http-address | 0.0.0.0:9874 | Default port of the HTTP server. +ozone.metadata.dirs | none | Directory to store persisted data (RocksDB). diff --git a/hadoop-hdds/docs/content/concept/OzoneManager.png b/hadoop-hdds/docs/content/concept/OzoneManager.png new file mode 100644 index 000000000000..f71bfacc4121 Binary files /dev/null and b/hadoop-hdds/docs/content/concept/OzoneManager.png differ diff --git a/hadoop-hdds/docs/content/concept/OzoneManager.zh.md b/hadoop-hdds/docs/content/concept/OzoneManager.zh.md index 5e9ab7f23d0e..27b33c5fe8db 100644 --- a/hadoop-hdds/docs/content/concept/OzoneManager.zh.md +++ b/hadoop-hdds/docs/content/concept/OzoneManager.zh.md @@ -21,6 +21,12 @@ summary: Ozone Manager 是 Ozone 主要的命名空间服务，它管理了卷 limitations under the License. --> +
+ +注意:本页面翻译的信息可能滞后,最新的信息请参看英文版的相关页面。 + +
+ Ozone Manager(OM)管理 Ozone 的命名空间。 当向 Ozone 写入数据时,你需要向 OM 请求一个块,OM 会返回一个块并记录下相关信息。当你想要读取那个文件时,你也需要先通过 OM 获取那个块的地址。 diff --git a/hadoop-hdds/docs/content/concept/StorageContainerManager.md b/hadoop-hdds/docs/content/concept/StorageContainerManager.md new file mode 100644 index 000000000000..68953ced24d3 --- /dev/null +++ b/hadoop-hdds/docs/content/concept/StorageContainerManager.md @@ -0,0 +1,102 @@ +--- +title: "Storage Container Manager" +date: "2017-09-14" +weight: 3 +menu: + main: + parent: Architecture +summary: Storage Container Manager or SCM is the core metadata service of Ozone. SCM provides a distributed block layer for Ozone. +--- + + +Storage Container Manager (SCM) is the leader node of the *block space management*. The main responsibility is to create and manage [containers]({{}}) which is the main replication unit of Ozone. + + +![Storage Container Manager](StorageContainerManager.png) + +## Main responsibilities + +Storage container manager provides multiple critical functions for the Ozone +cluster. SCM acts as the cluster manager, Certificate authority, Block +manager and the Replica manager. + +SCM is in charge of creating an Ozone cluster. When an SCM is booted up via `init` command, SCM creates the cluster identity and root certificates needed for the SCM certificate authority. SCM manages the life cycle of a data node in the cluster. + + 1. SCM is the block manager. SCM +allocates blocks and assigns them to data nodes. Clients +read and write these blocks directly. + + 2. SCM keeps track of all the block +replicas. If there is a loss of data node or a disk, SCM +detects it and instructs data nodes make copies of the +missing blocks to ensure high availability. + + 3. **SCM's Ceritificate authority** is in +charge of issuing identity certificates for each and every +service in the cluster. This certificate infrastructure makes +it easy to enable mTLS at network layer and the block +token infrastructure depends on this certificate infrastructure. + +## Main components + +For a detailed view of Storage Container Manager this section gives a quick overview about the provided network services and the stored persisted data. + +**Network services provided by Storage Container Manager:** + + * Pipelines: List/Delete/Activate/Deactivate + * pipelines are set of datanodes to form replication groups + * Raft groups are planned by SCM + * Containers: Create / List / Delete containers + * Admin related requests + * Safemode status/modification + * Replication manager start / stop + * CA authority service + * Required by other sever components + * Datanode HeartBeat protocol + * From Datanode to SCM (30 sec by default) + * Datanodes report the status of containers, node... + * SCM can add commands to the response + +Note: client doesn't connect directly to the SCM + +**Persisted state** + + +The following data is persisted in Storage Container Manager side in a specific RocksDB directory + + * Pipelines + * Replication group of servers. Maintained to find a group for new container/block allocations. + * Containers + * Containers are the replication units. Data is required to act in case of data under/over replicated. + * Deleted blocks + * Block data is deleted in the background. Need a list to follow the progress. 
+ * Valid certs, Revoked certs + * Used by the internal Certificate Authority to authorize other Ozone services + +## Notable configuration + +key | default | description | +----|-------------|-------- +ozone.scm.container.size | 5GB | Default container size used by Ozone +ozone.scm.block.size | 256MB | The default size of a data block. +hdds.scm.safemode.min.datanode | 1 | Minimum number of registered datanodes required for SCM to exit safe mode. +ozone.scm.http-address | 0.0.0.0:9876 | HTTP address of the SCM server. +ozone.metadata.dirs | none | Directory to store persisted data (RocksDB). \ No newline at end of file diff --git a/hadoop-hdds/docs/content/concept/StorageContainerManager.png b/hadoop-hdds/docs/content/concept/StorageContainerManager.png new file mode 100644 index 000000000000..605c48c355f8 Binary files /dev/null and b/hadoop-hdds/docs/content/concept/StorageContainerManager.png differ diff --git a/hadoop-hdds/docs/content/concept/Hdds.zh.md b/hadoop-hdds/docs/content/concept/StorageContainerManager.zh.md similarity index 93% rename from hadoop-hdds/docs/content/concept/Hdds.zh.md rename to hadoop-hdds/docs/content/concept/StorageContainerManager.zh.md index d53090646cc0..da29869808c7 100644 --- a/hadoop-hdds/docs/content/concept/Hdds.zh.md +++ b/hadoop-hdds/docs/content/concept/StorageContainerManager.zh.md @@ -21,6 +21,12 @@ summary: Storage Container Manager（SCM）是 Ozone 的核心元数据服务 limitations under the License. --> +
+ +注意:本页面翻译的信息可能滞后,最新的信息请参看英文版的相关页面。 + +
+ SCM 为 Ozone 集群提供了多种重要功能,包括:集群管理、证书管理、块管理和副本管理等。 {{}} diff --git a/hadoop-hdds/docs/content/concept/_index.md b/hadoop-hdds/docs/content/concept/_index.md index 8f0aeb07c965..1441b00f2115 100644 --- a/hadoop-hdds/docs/content/concept/_index.md +++ b/hadoop-hdds/docs/content/concept/_index.md @@ -1,8 +1,8 @@ --- -title: Concepts +title: "Architecture" date: "2017-10-10" menu: main -weight: 6 +weight: 3 --- diff --git a/hadoop-hdds/docs/content/design/ec.md b/hadoop-hdds/docs/content/design/ec.md new file mode 100644 index 000000000000..415796d57597 --- /dev/null +++ b/hadoop-hdds/docs/content/design/ec.md @@ -0,0 +1,39 @@ +--- +title: Erasure Coding in Ozone +summary: Use Erasure Coding algorithm for efficient storage +date: 2020-06-30 +jira: HDDS-3816 +status: draft +author: Uma Maheswara Rao Gangumalla, Marton Elek, Stephen O'Donnell +--- + + +# Abstract + + Support Erasure Coding for read and write pipeline of Ozone. + +# Status + + The design doc describes two main methods to implement EC: + + * Container level, async Erasure Coding, to encode closed containers in the background + * Block level, striped Erasure Coding + + Second option can work only with new, dedicated write-path. Details of possible implementation will be included in the next version. + +# Link + + https://issues.apache.org/jira/secure/attachment/13006245/Erasure%20Coding%20in%20Apache%20Hadoop%20Ozone.pdf + diff --git a/hadoop-hdds/docs/content/design/namespace-support.md b/hadoop-hdds/docs/content/design/namespace-support.md index 0317b46b29a0..5dbd289e9d76 100644 --- a/hadoop-hdds/docs/content/design/namespace-support.md +++ b/hadoop-hdds/docs/content/design/namespace-support.md @@ -1,9 +1,9 @@ --- -title: Ozone Manager HA -summary: Support HA for Ozone Manager with the help of RATIS +title: Ozone FS namespace +summary: Use additional prefix table for indexed data retrieval date: 2020-01-20 jira: HDDS-2939 -status: accepted +status: implementing author: Supratim Deka, Anu Engineer --- +# Abstract + +Proposal suggest to introduce a new storage-class abstraction which can be used to define different replication strategies (factor, type, ...) for different bucket/keys. + +# Link + +https://hackmd.io/4kxufJBOQNaKn7PKFK_6OQ?view diff --git a/hadoop-hdds/docs/content/design/topology.md b/hadoop-hdds/docs/content/design/topology.md new file mode 100644 index 000000000000..edd5a90662ea --- /dev/null +++ b/hadoop-hdds/docs/content/design/topology.md @@ -0,0 +1,29 @@ +--- +title: Topology-awareness +summary: Placement policy to use rack information for read and write +date: 2018-11-16 +jira: HDDS-698 +status: implemented +author: junping, xiaoyu, junjie, jitendra, anu, nicholas +--- + + +# Abstract + + Adjust read/write path to consider rack information for proper data placement. 
+ +# Link + + * https://docs.google.com/document/d/1HsZqlBcEmlezU6HriUaIOFE9SFdcBoaiz15Qt_ng0P8/edit \ No newline at end of file diff --git a/hadoop-hdds/docs/content/design/ozone-volume-management.md b/hadoop-hdds/docs/content/design/volume-management.md similarity index 100% rename from hadoop-hdds/docs/content/design/ozone-volume-management.md rename to hadoop-hdds/docs/content/design/volume-management.md diff --git a/hadoop-hdds/docs/content/feature/GDPR.md b/hadoop-hdds/docs/content/feature/GDPR.md new file mode 100644 index 000000000000..47424844d946 --- /dev/null +++ b/hadoop-hdds/docs/content/feature/GDPR.md @@ -0,0 +1,80 @@ +--- +title: "GDPR in Ozone" +date: "2019-September-17" +weight: 4 +summary: GDPR in Ozone +icon: user +menu: + main: + parent: Features +summary: Support to implement the "Right to be Forgotten" requirement of GDPR +--- + +--- + + +The General Data Protection Regulation (GDPR) is a law that governs how personal data should be handled. +This is an European Union law, but due to the nature of software oftentimes spills into other geographies. + +**Ozone supports GDPR's Right to Erasure(Right to be Forgotten) feature** + +When GDPR support is enabled all the keys are encrypt, by default. The encryption key is stored on the metadata server and used to encrypt the data for each of the requests. + +In case of a key deletion, Ozone deletes the metadata immediately but the binary data is deleted at the background in an async way. With GDPR support enabled, the encryption key is deleted immediately and as is, the data won't be possible to read any more even if the related binary (blocks or containers) are not yet deleted by the background process). + +Once you create a GDPR compliant bucket, any key created in that bucket will +automatically be GDPR compliant. + +Enabling GDPR compliance in Ozone is very straight forward. During bucket +creation, you can specify `--enforcegdpr=true` or `-g=true` and this will +ensure the bucket is GDPR compliant. Thus, any key created under this bucket +will automatically be GDPR compliant. + +GDPR can only be enabled on a new bucket. For existing buckets, you would +have to create a new GDPR compliant bucket and copy data from old bucket into + new bucket to take advantage of GDPR. + +Example to create a GDPR compliant bucket: + +```shell +ozone sh bucket create --enforcegdpr=true /hive/jan + +ozone sh bucket create -g=true /hive/jan +``` + +If you want to create an ordinary bucket then you can skip `--enforcegdpr` +and `-g` flags. + +## References + + * [Design doc]({{< ref "design/gdpr.md" >}}) diff --git a/hadoop-hdds/docs/content/gdpr/GDPR in Ozone.zh.md b/hadoop-hdds/docs/content/feature/GDPR.zh.md similarity index 91% rename from hadoop-hdds/docs/content/gdpr/GDPR in Ozone.zh.md rename to hadoop-hdds/docs/content/feature/GDPR.zh.md index e44957f537be..af0684dcfe08 100644 --- a/hadoop-hdds/docs/content/gdpr/GDPR in Ozone.zh.md +++ b/hadoop-hdds/docs/content/feature/GDPR.zh.md @@ -22,6 +22,11 @@ icon: user limitations under the License. --> +
+ +注意:本页面翻译的信息可能滞后,最新的信息请参看英文版的相关页面。 + +
在 Ozone 中遵守 GDPR 规范非常简单,只需要在创建桶时指定 `--enforcegdpr=true` 或 `-g=true` 参数,这样创建出的桶都是符合 GDPR 规范的,当然,在桶中创建的键也都自动符合。 diff --git a/hadoop-hdds/docs/content/feature/HA-OM-doublebuffer.png b/hadoop-hdds/docs/content/feature/HA-OM-doublebuffer.png new file mode 100644 index 000000000000..a71adce40a63 Binary files /dev/null and b/hadoop-hdds/docs/content/feature/HA-OM-doublebuffer.png differ diff --git a/hadoop-hdds/docs/content/feature/HA-OM.png b/hadoop-hdds/docs/content/feature/HA-OM.png new file mode 100644 index 000000000000..b1ff506f7860 Binary files /dev/null and b/hadoop-hdds/docs/content/feature/HA-OM.png differ diff --git a/hadoop-hdds/docs/content/feature/HA.md b/hadoop-hdds/docs/content/feature/HA.md new file mode 100644 index 000000000000..116cbb72be4b --- /dev/null +++ b/hadoop-hdds/docs/content/feature/HA.md @@ -0,0 +1,115 @@ +--- +title: "High Availability" +weight: 1 +menu: + main: + parent: Features +summary: HA setup for Ozone to avoid any single point of failure. +--- + + +Ozone has two leader nodes (*Ozone Manager* for key space management and *Storage Container Management* for block space management) and storage nodes (Datanode). Data is replicated between datanodes with the help of RAFT consensus algorithm. + +To avoid any single point of failure the leader nodes also should have a HA setup. + + 1. HA of Ozone Manager is implemented with the help of RAFT (Apache Ratis) + 2. HA of Storage Container Manager is [under implementation]({{< ref "scmha.md">}}) + +## Ozone Manager HA + +A single Ozone Manager uses [RocksDB](https://github.com/facebook/rocksdb/) to persiste metadata (volumes, buckets, keys) locally. HA version of Ozone Manager does exactly the same but all the data is replicated with the help of the RAFT consensus algorithm to follower Ozone Manager instances. + +![OM HA](HA-OM.png) + +Client connects to the Leader Ozone Manager which process the request and schedule the replication with RAFT. When the request is replicated to all the followers the leader can return with the response. + +## Configuration + +HA mode of Ozone Manager can be enabled with the following settings in `ozone-site.xml`: + +```XML + + ozone.om.ratis.enable + true + +``` +One Ozone configuration (`ozone-site.xml`) can support multiple Ozone HA cluster. To select between the available HA clusters a logical name is required for each of the clusters which can be resolved to the IP addresses (and domain names) of the Ozone Managers. + +This logical name is called `serviceId` and can be configured in the `ozone-site.xml` + + ``` + + ozone.om.service.ids + cluster1,cluster2 + +``` + +For each of the defined `serviceId` a logical configuration name should be defined for each of the servers. + +```XML + + ozone.om.nodes.cluster1 + om1,om2,om3 + +``` + +The defined prefixes can be used to define the address of each of the OM services: + +```XML + + ozone.om.address.cluster1.om1 + host1 + + + ozone.om.address.cluster1.om2 + host2 + + + ozone.om.address.cluster1.om3 + host3 + +``` + +The defined `serviceId` can be used instead of a single OM host using [client interfaces]({{< ref "interface/_index.md" >}}) + +For example with `o3fs://` + +```shell +hdfs dfs -ls o3fs://bucket.volume.cluster1/prefix/ +``` + +Or with `ofs://`: + +```shell +hdfs dfs -ls ofs://cluster1/volume/bucket/prefix/ +``` + +## Implementation details + +Raft can guarantee the replication of any request if the request is persisted to the RAFT log on the majority of the nodes. 
+ To achieve high throughput with Ozone Manager, it returns the response even if the request is persisted only to the RAFT logs. + +RocksDB instances are updated by a background thread with batching transactions (the so-called "double buffer": while one of the buffers is used to commit the data, the other one collects all the new requests for the next commit). To make all data available for the next request even if the background process has not yet written it, the key data is also cached in memory. + +![Double buffer](HA-OM-doublebuffer.png) + +The details of this approach are discussed in a separate [design doc]({{< ref "design/omha.md" >}}), but it's an integral part of the OM HA design. + +## References + + * Check [this page]({{< ref "design/omha.md" >}}) for the links to the original design docs + * The Ozone distribution contains an example OM HA configuration under the `compose/ozone-om-ha` directory, which can be tested with the help of [docker-compose]({{< ref "start/RunningViaDocker.md" >}}). \ No newline at end of file diff --git a/hadoop-hdds/docs/content/feature/Observability.md b/hadoop-hdds/docs/content/feature/Observability.md new file mode 100644 index 000000000000..2913abd4b125 --- /dev/null +++ b/hadoop-hdds/docs/content/feature/Observability.md @@ -0,0 +1,224 @@ +--- +title: "Observability" +weight: 8 +menu: + main: + parent: Features +summary: Different tools for Ozone to increase Observability +--- + + +Ozone provides multiple tools to get more information about the current state of the cluster. + +## Prometheus + +Ozone has native Prometheus support. All internal metrics (collected by the Hadoop metrics framework) are published under the `/prom` HTTP context. (For example under http://localhost:9876/prom for SCM). + +The Prometheus endpoint is turned on by default but can be turned off by the `hdds.prometheus.endpoint.enabled` configuration variable. + +In a secure environment the page is guarded with SPNEGO authentication, which is not supported by Prometheus. To enable monitoring in a secure environment, a specific authentication token can be configured. + +Example `ozone-site.xml`: + +```XML +<property> + <name>hdds.prometheus.endpoint.token</name> + <value>putyourtokenhere</value> +</property> +``` + +Example prometheus configuration: + +```YAML +scrape_configs: + - job_name: ozone + bearer_token: putyourtokenhere + metrics_path: /prom + static_configs: + - targets: + - "127.0.0.1:9876" +``` + +## Distributed tracing + +Distributed tracing can help to understand performance bottlenecks by visualizing end-to-end performance. + +Ozone uses the [jaeger](https://jaegertracing.io) tracing library to collect traces, which can send tracing data to any compatible backend (Zipkin, ...). + +Tracing is turned off by default, but can be turned on with `hdds.tracing.enabled` in `ozone-site.xml`: + +```XML +<property> + <name>hdds.tracing.enabled</name> + <value>true</value> +</property> +``` + +The Jaeger client can be configured with environment variables as documented [here](https://github.com/jaegertracing/jaeger-client-java/blob/master/jaeger-core/README.md): + +For example: + +```shell +JAEGER_SAMPLER_PARAM=0.01 +JAEGER_SAMPLER_TYPE=probabilistic +JAEGER_AGENT_HOST=jaeger +``` + +This configuration will record 1% of the requests to limit the performance overhead. For more information about Jaeger sampling, [check the documentation](https://www.jaegertracing.io/docs/1.18/sampling/#client-sampling-configuration). + +## ozone insight + +Ozone insight is a swiss-army-knife tool for checking the current state of an Ozone cluster. It can show logging, metrics and configuration for a particular component. 
+ +To check the available components use `ozone insight list`: + +```shell +> ozone insight list + +Available insight points: + + scm.node-manager SCM Datanode management related information. + scm.replica-manager SCM closed container replication manager + scm.event-queue Information about the internal async event delivery + scm.protocol.block-location SCM Block location protocol endpoint + scm.protocol.container-location SCM Container location protocol endpoint + scm.protocol.security SCM Block location protocol endpoint + om.key-manager OM Key Manager + om.protocol.client Ozone Manager RPC endpoint + datanode.pipeline More information about one ratis datanode ring. +``` + +### Configuration + +`ozone insight config` can show configuration related to a specific component (supported only for selected components). + +```shell +> ozone insight config scm.replica-manager + +Configuration for `scm.replica-manager` (SCM closed container replication manager) + +>>> hdds.scm.replication.thread.interval + default: 300s + current: 300s + +There is a replication monitor thread running inside SCM which takes care of replicating the containers in the cluster. This property is used to configure the interval in which that thread runs. + + +>>> hdds.scm.replication.event.timeout + default: 30m + current: 30m + +Timeout for the container replication/deletion commands sent to datanodes. After this timeout the command will be retried. + +``` + +### Metrics + +`ozone insight metrics` can show metrics related to a specific component (supported only for selected components). + + +```shell +> ozone insight metrics scm.protocol.block-location +Metrics for `scm.protocol.block-location` (SCM Block location protocol endpoint) + +RPC connections + + Open connections: 0 + Dropped connections: 0 + Received bytes: 1267 + Sent bytes: 2420 + + +RPC queue + + RPC average queue time: 0.0 + RPC call queue length: 0 + + +RPC performance + + RPC processing time average: 0.0 + Number of slow calls: 0 + + +Message type counters + + Number of AllocateScmBlock: ??? + Number of DeleteScmKeyBlocks: ??? + Number of GetScmInfo: ??? + Number of SortDatanodes: ??? +``` + +### Logs + +`ozone insight logs` can connect to the required service and show the DEBUG/TRACE log related to one specific component. 
For example to display RPC message: + +```shell +>ozone insight logs om.protocol.client + +[OM] 2020-07-28 12:31:49,988 [DEBUG|org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB|OzoneProtocolMessageDispatcher] OzoneProtocol ServiceList request is received +[OM] 2020-07-28 12:31:50,095 [DEBUG|org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB|OzoneProtocolMessageDispatcher] OzoneProtocol CreateVolume request is received +``` + +Using `-v` flag the content of the protobuf message can also be displayed (TRACE level log): + +```shell +ozone insight logs -v om.protocol.client + +[OM] 2020-07-28 12:33:28,463 [TRACE|org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB|OzoneProtocolMessageDispatcher] [service=OzoneProtocol] [type=CreateVolume] request is received: +cmdType: CreateVolume +traceID: "" +clientId: "client-A31DF5C6ECF2" +createVolumeRequest { + volumeInfo { + adminName: "hadoop" + ownerName: "hadoop" + volume: "vol1" + quotaInBytes: 1152921504606846976 + volumeAcls { + type: USER + name: "hadoop" + rights: "200" + aclScope: ACCESS + } + volumeAcls { + type: GROUP + name: "users" + rights: "200" + aclScope: ACCESS + } + creationTime: 1595939608460 + objectID: 0 + updateID: 0 + modificationTime: 0 + } +} + +[OM] 2020-07-28 12:33:28,474 [TRACE|org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB|OzoneProtocolMessageDispatcher] [service=OzoneProtocol] [type=CreateVolume] request is processed. Response: +cmdType: CreateVolume +traceID: "" +success: false +message: "Volume already exists" +status: VOLUME_ALREADY_EXISTS +``` + + \ No newline at end of file diff --git a/hadoop-hdds/docs/content/shell/_index.md b/hadoop-hdds/docs/content/feature/Recon.md similarity index 52% rename from hadoop-hdds/docs/content/shell/_index.md rename to hadoop-hdds/docs/content/feature/Recon.md index 3cb1a9f61672..7234b809bc7b 100644 --- a/hadoop-hdds/docs/content/shell/_index.md +++ b/hadoop-hdds/docs/content/feature/Recon.md @@ -1,8 +1,10 @@ --- -title: Command Line Interface +title: "Recon" +weight: 7 menu: main: - weight: 3 + parent: Features +summary: Recon is the Web UI and analysis service for Ozone --- +Recon is the Web UI and analytics service for Ozone. It's an optional component, but strongly recommended as it can add additional visibility. -{{}} - Ozone shell is the primary interface to interact with Ozone. - It provides a command shell interface to work against Ozone. -{{}} +Recon collects all the data from an Ozone cluster and **store** them in a SQL database for further analyses. + + 1. Ozone Manager data is downloaded in the background by an async process. A RocksDB snapshots are created on OM side periodically, and the incremental data is copied to Recon and processed. + 2. Datanodes can send Heartbeats not just to SCM but Recon. Recon can be a read-only listener of the Heartbeats and updates the local database based on the received information. \ No newline at end of file diff --git a/hadoop-hdds/docs/content/feature/Topology.md b/hadoop-hdds/docs/content/feature/Topology.md new file mode 100644 index 000000000000..71c289c56d4a --- /dev/null +++ b/hadoop-hdds/docs/content/feature/Topology.md @@ -0,0 +1,108 @@ +--- +title: "Topology awareness" +weight: 2 +menu: + main: + parent: Features +summary: Configuration for rack-awarness for improved read/write +--- + + +Ozone can use topology related information (for example rack placement) to optimize read and write pipelines. 
To get full rack-aware cluster, Ozone requires three different configuration. + + 1. The topology information should be configured by Ozone. + 2. Topology related information should be used when Ozone chooses 3 different datanodes for a specific pipeline/container. (WRITE) + 3. When Ozone reads a Key it should prefer to read from the closest node. + + + +## Topology hierarchy + +Topology hierarchy can be configured with using `net.topology.node.switch.mapping.impl` configuration key. This configuration should define an implementation of the `org.apache.hadoop.net.CachedDNSToSwitchMapping`. As this is a Hadoop class, the configuration is exactly the same as the Hadoop Configuration + +### Static list + +Static list can be configured with the help of ```TableMapping```: + +```XML + + net.topology.node.switch.mapping.impl + org.apache.hadoop.net.TableMapping + + + net.topology.table.file.name + /opt/hadoop/compose/ozone-topology/network-config + +``` + +The second configuration option should point to a text file. The file format is a two column text file, with columns separated by whitespace. The first column is a DNS or IP address and the second column specifies the rack where the address maps. If no entry corresponding to a host in the cluster is found, then `/default-rack` is assumed. + +### Dynamic list + +Rack information can be identified with the help of an external script: + + +```XML + + net.topology.node.switch.mapping.impl + org.apache.hadoop.net.TableMapping + + + org.apache.hadoop.net.ScriptBasedMapping + /usr/local/bin/rack.sh + +``` + +If implementing an external script, it will be specified with the `net.topology.script.file.name` parameter in the configuration files. Unlike the java class, the external topology script is not included with the Ozone distribution and is provided by the administrator. Ozone will send multiple IP addresses to ARGV when forking the topology script. The number of IP addresses sent to the topology script is controlled with `net.topology.script.number.args` and defaults to 100. If `net.topology.script.number.args` was changed to 1, a topology script would get forked for each IP submitted. + +## Write path + +Placement of the closed containers can be configured with `ozone.scm.container.placement.impl` configuration key. The available container placement policies can be found in the `org.apache.hdds.scm.container.placement` [package](https://github.com/apache/hadoop-ozone/tree/master/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/placement/algorithms). + +By default the `SCMContainerPlacementRandom` is used for topology-awareness the `SCMContainerPlacementRackAware` can be used: + +```XML + + ozone.scm.container.placement.impl + org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackAware + +``` + +This placement policy complies with the algorithm used in HDFS. With default 3 replica, two replicas will be on the same rack, the third one will on a different rack. + +This implementation applies to network topology like "/rack/node". Don't recommend to use this if the network topology has more layers. + +## Read path + +Finally the read path also should be configured to read the data from the closest pipeline. 
+ +```XML + + ozone.network.topology.aware.read + true + +``` + +## References + + * Hadoop documentation about `net.topology.node.switch.mapping.impl`: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/RackAwareness.html + * [Design doc]({{< ref "design/topology.md">}}) \ No newline at end of file diff --git a/hadoop-hdds/docs/content/gdpr/_index.md b/hadoop-hdds/docs/content/feature/_index.md similarity index 80% rename from hadoop-hdds/docs/content/gdpr/_index.md rename to hadoop-hdds/docs/content/feature/_index.md index 017206e9fbcd..2b30d3ddb37c 100644 --- a/hadoop-hdds/docs/content/gdpr/_index.md +++ b/hadoop-hdds/docs/content/feature/_index.md @@ -1,9 +1,8 @@ --- -title: GDPR -name: GDPR -identifier: gdpr +title: Features +name: Features menu: main -weight: 5 +weight: 4 --- - - -Enabling GDPR compliance in Ozone is very straight forward. During bucket -creation, you can specify `--enforcegdpr=true` or `-g=true` and this will -ensure the bucket is GDPR compliant. Thus, any key created under this bucket -will automatically be GDPR compliant. - -GDPR can only be enabled on a new bucket. For existing buckets, you would -have to create a new GDPR compliant bucket and copy data from old bucket into - new bucket to take advantage of GDPR. - -Example to create a GDPR compliant bucket: - -`ozone sh bucket create --enforcegdpr=true /hive/jan` - -`ozone sh bucket create -g=true /hive/jan` - -If you want to create an ordinary bucket then you can skip `--enforcegdpr` -and `-g` flags. \ No newline at end of file diff --git a/hadoop-hdds/docs/content/interface/CSI.md b/hadoop-hdds/docs/content/interface/CSI.md index c7046d09f898..d1971c14b7f2 100644 --- a/hadoop-hdds/docs/content/interface/CSI.md +++ b/hadoop-hdds/docs/content/interface/CSI.md @@ -1,6 +1,9 @@ --- title: CSI Protocol -weight: 3 +weight: 6 +menu: + main: + parent: "Client Interfaces" summary: Ozone supports Container Storage Interface(CSI) protocol. You can use Ozone by mounting an Ozone volume by Ozone CSI. --- @@ -21,10 +24,18 @@ summary: Ozone supports Container Storage Interface(CSI) protocol. You can use O limitations under the License. --> -`Container Storage Interface` (CSI) will enable storage vendors (SP) to develop a plugin once and have it work across a number of container orchestration (CO) systems. +`Container Storage Interface` (CSI) will enable storage vendors (SP) to develop a plugin once and have it work across a number of container orchestration (CO) systems like Kubernetes or Yarn. To get more information about CSI at [SCI spec](https://github.com/container-storage-interface/spec/blob/master/spec.md) +CSI defined a simple GRPC interface with 3 interfaces (Identity, Controller, Node). It defined how the Container Orchestrator can request the creation of a new storage space or the mount of the newly created storage but doesn't define how the storage can be mounted. + +![CSI](CSI.png) + +By default Ozone CSI service uses a S3 fuse driver ([goofys](https://github.com/kahing/goofys)) to mount the created Ozone bucket. Implementation of other mounting options such as a dedicated NFS server or native Fuse driver is work in progress. + + + Ozone CSI is an implementation of CSI, it can make possible of using Ozone as a storage volume for a container. 
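
As a rough, hedged sketch of how an Ozone CSI deployment can be verified from the orchestrator side (assuming Kubernetes, and that the CSI plugin plus a matching StorageClass have already been deployed from example manifests; the resource names below are placeholders, not fixed Ozone names):

```shell
# Assumption: Ozone CSI controller/node plugins and a StorageClass are deployed.
kubectl get csidrivers               # the registered Ozone CSI driver should appear here
kubectl get pvc                      # a Bound PVC indicates a bucket was provisioned for the claim
kubectl describe pvc my-ozone-claim  # "my-ozone-claim" is a placeholder claim name
```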
## Getting started diff --git a/hadoop-hdds/docs/content/interface/CSI.png b/hadoop-hdds/docs/content/interface/CSI.png new file mode 100644 index 000000000000..38720c3019cf Binary files /dev/null and b/hadoop-hdds/docs/content/interface/CSI.png differ diff --git a/hadoop-hdds/docs/content/interface/Cli.md b/hadoop-hdds/docs/content/interface/Cli.md new file mode 100644 index 000000000000..d65d573c0074 --- /dev/null +++ b/hadoop-hdds/docs/content/interface/Cli.md @@ -0,0 +1,208 @@ +--- +title: Command Line Interface +weight: 4 +menu: + main: + parent: "Client Interfaces" +--- + + + +Ozone shell is the primary interface to interact with Ozone from the command line. Behind the scenes it uses the [Java API]({{< ref "interface/JavaApi.md">}}). + + There are some functionality which couldn't be accessed without using `ozone sh` commands. For example: + + 1. Creating volumes with quota + 2. Managing internal ACLs + 3. Creating buckets with encryption key + +All of these are one-time, administration tasks. Applications can use Ozone without this CLI using other interface like Hadoop Compatible File System (o3fs or ofs) or S3 interface. + + +Ozone shell help can be invoked at _object_ level or at _action_ level. + +For example: + +```bash +ozone sh volume --help +``` + +will show all possible actions for volumes. + +Or it can be invoked to explain a specific action like: + +```bash +ozone sh volume create --help +``` + +which will print the command line options of the `create` command for volumes. + +## General Command Format + +Ozone shell commands take the following form: + +> _ozone sh object action url_ + +**ozone** script is used to invoke all Ozone sub-commands. The ozone shell is +invoked via ```sh``` command. + +Object can be volume, bucket or key. Actions are various verbs like +create, list, delete etc. + +Depending on the action, Ozone URL can point to a volume, bucket or key in the following format: + +_\[schema\]\[server:port\]/volume/bucket/key_ + + +Where, + +1. **Schema** - This should be `o3` which is the native RPC protocol to access + Ozone API. The usage of the schema is optional. + +2. **Server:Port** - This is the address of the Ozone Manager. If the port is +omitted the default port from ozone-site.xml will be used. + +Please see volume commands, bucket commands, and key commands section for more +detail. + +## Volume operations + +Volume is the top level element of the hierarchy, managed only by administrators. Optionally, quota and the owner user can be specified. + +Example commands: + +```shell +$ ozone sh volume create /vol1 +``` + +```shell +$ ozone sh volume info /vol1 +{ + "metadata" : { }, + "name" : "vol1", + "admin" : "hadoop", + "owner" : "hadoop", + "creationTime" : "2020-07-28T12:31:50.112Z", + "modificationTime" : "2020-07-28T12:31:50.112Z", + "acls" : [ { + "type" : "USER", + "name" : "hadoop", + "aclScope" : "ACCESS", + "aclList" : [ "ALL" ] + }, { + "type" : "GROUP", + "name" : "users", + "aclScope" : "ACCESS", + "aclList" : [ "ALL" ] + } ], + "quota" : 1152921504606846976 +} +``` + +```shell +$ ozone sh volume list / +{ + "metadata" : { }, + "name" : "s3v", + "admin" : "hadoop", + "owner" : "hadoop", + "creationTime" : "2020-07-27T11:32:22.314Z", + "modificationTime" : "2020-07-27T11:32:22.314Z", + "acls" : [ { + "type" : "USER", + "name" : "hadoop", + "aclScope" : "ACCESS", + "aclList" : [ "ALL" ] + }, { + "type" : "GROUP", + "name" : "users", + "aclScope" : "ACCESS", + "aclList" : [ "ALL" ] + } ], + "quota" : 1152921504606846976 +} +.... 
+``` +## Bucket operations + +Bucket is the second level of the object hierarchy, and is similar to AWS S3 buckets. Users can create buckets in volumes, if they have the necessary permissions. + +Command examples: + +```shell +$ ozone sh bucket create /vol1/bucket1 +```shell + +```shell +$ ozone sh bucket info /vol1/bucket1 +{ + "metadata" : { }, + "volumeName" : "vol1", + "name" : "bucket1", + "storageType" : "DISK", + "versioning" : false, + "creationTime" : "2020-07-28T13:14:45.091Z", + "modificationTime" : "2020-07-28T13:14:45.091Z", + "encryptionKeyName" : null, + "sourceVolume" : null, + "sourceBucket" : null +} +``` + +[Transparent Data Encryption]({{< ref "security/SecuringTDE.md" >}}) can be enabled at the bucket level. + +## Key operations + +Key is the object which can store the data. + +```shell +$ ozone sh key put /vol1/bucket1/README.md README.md +``` + + + + + +```shell +$ ozone sh key info /vol1/bucket1/README.md +{ + "volumeName" : "vol1", + "bucketName" : "bucket1", + "name" : "README.md", + "dataSize" : 3841, + "creationTime" : "2020-07-28T13:17:20.749Z", + "modificationTime" : "2020-07-28T13:17:21.979Z", + "replicationType" : "RATIS", + "replicationFactor" : 1, + "ozoneKeyLocations" : [ { + "containerID" : 1, + "localID" : 104591670688743424, + "length" : 3841, + "offset" : 0 + } ], + "metadata" : { }, + "fileEncryptionInfo" : null +} +``` + +```shell +$ ozone sh key get /vol1/bucket1/README.md /tmp/ +``` diff --git a/hadoop-hdds/docs/content/interface/JavaApi.md b/hadoop-hdds/docs/content/interface/JavaApi.md index bb18068f4000..2a97922d7415 100644 --- a/hadoop-hdds/docs/content/interface/JavaApi.md +++ b/hadoop-hdds/docs/content/interface/JavaApi.md @@ -1,7 +1,10 @@ --- title: "Java API" date: "2017-09-14" -weight: 1 +weight: 5 +menu: + main: + parent: "Client Interfaces" summary: Ozone has a set of Native RPC based APIs. This is the lowest level API's on which all other protocols are built. This is the most performant and feature-full of all Ozone protocols. --- The Hadoop compatible file system interface allows storage backends like Ozone -to be easily integrated into Hadoop eco-system. Ozone file system is an -Hadoop compatible file system. Currently, Ozone supports two scheme: o3fs and ofs. -The biggest difference between the o3fs and ofs,is that o3fs supports operations -only at a single bucket, while ofs supports operations across all volumes and buckets. -you can Refer to "Differences from existing o3FS "in ofs.md for details of the differences. +to be easily integrated into Hadoop eco-system. Ozone file system is an +Hadoop compatible file system. + + ## Setting up the o3fs @@ -43,7 +52,7 @@ Once this is created, please make sure that bucket exists via the _list volume_ Please add the following entry to the core-site.xml. -{{< highlight xml >}} +```XML fs.AbstractFileSystem.o3fs.impl org.apache.hadoop.fs.ozone.OzFs @@ -52,7 +61,7 @@ Please add the following entry to the core-site.xml. fs.defaultFS o3fs://bucket.volume -{{< /highlight >}} +``` This will make this bucket to be the default Hadoop compatible file system and register the o3fs file system type. @@ -116,55 +125,3 @@ hdfs dfs -ls o3fs://bucket.volume.om-host.example.com:6789/key Note: Only port number from the config is used in this case, whereas the host name in the config `ozone.om.address` is ignored. -## Setting up the ofs -This is just a general introduction. For more detailed usage, you can refer to ofs.md. - -Please add the following entry to the core-site.xml. 
- -{{< highlight xml >}} - - fs.ofs.impl - org.apache.hadoop.fs.ozone.RootedOzoneFileSystem - - - fs.defaultFS - ofs://om-host.example.com/ - -{{< /highlight >}} - -This will make all the volumes and buckets to be the default Hadoop compatible file system and register the ofs file system type. - -You also need to add the ozone-filesystem-hadoop3.jar file to the classpath: - -{{< highlight bash >}} -export HADOOP_CLASSPATH=/opt/ozone/share/ozonefs/lib/hadoop-ozone-filesystem-hadoop3-*.jar:$HADOOP_CLASSPATH -{{< /highlight >}} - -(Note: with Hadoop 2.x, use the `hadoop-ozone-filesystem-hadoop2-*.jar`) - -Once the default Filesystem has been setup, users can run commands like ls, put, mkdir, etc. -For example: - -{{< highlight bash >}} -hdfs dfs -ls / -{{< /highlight >}} - -Note that ofs works on all buckets and volumes. Users can create buckets and volumes using mkdir, such as create volume named volume1 and bucket named bucket1: - -{{< highlight bash >}} -hdfs dfs -mkdir /volume1 -hdfs dfs -mkdir /volume1/bucket1 -{{< /highlight >}} - - -Or use the put command to write a file to the bucket. - -{{< highlight bash >}} -hdfs dfs -put /etc/hosts /volume1/bucket1/test -{{< /highlight >}} - -For more usage, see: https://issues.apache.org/jira/secure/attachment/12987636/Design%20ofs%20v1.pdf - -## Special note - -Trash is disabled even if `fs.trash.interval` is set on purpose. (HDDS-3982) diff --git a/hadoop-hdds/docs/content/interface/OzoneFS.zh.md b/hadoop-hdds/docs/content/interface/O3fs.zh.md similarity index 97% rename from hadoop-hdds/docs/content/interface/OzoneFS.zh.md rename to hadoop-hdds/docs/content/interface/O3fs.zh.md index 996991962c75..0b2a06f32181 100644 --- a/hadoop-hdds/docs/content/interface/OzoneFS.zh.md +++ b/hadoop-hdds/docs/content/interface/O3fs.zh.md @@ -21,6 +21,12 @@ summary: Hadoop 文件系统兼容使得任何使用类 HDFS 接口的应用无 limitations under the License. --> +
+ +注意:本页面翻译的信息可能滞后,最新的信息请参看英文版的相关页面。 + +
+ Hadoop 的文件系统接口兼容可以让任意像 Ozone 这样的存储后端轻松地整合进 Hadoop 生态系统,Ozone 文件系统就是一个兼容 Hadoop 的文件系统。 目前ozone支持两种协议: o3fs和ofs。两者最大的区别是o3fs只支持在单个bucket上操作,而ofs则支持跨所有volume和bucket的操作。关于两者在操作 上的具体区别可以参考ofs.md中的"Differences from existing o3fs"。 diff --git a/hadoop-hdds/docs/content/interface/Ofs.md b/hadoop-hdds/docs/content/interface/Ofs.md new file mode 100644 index 000000000000..fcc1467a7102 --- /dev/null +++ b/hadoop-hdds/docs/content/interface/Ofs.md @@ -0,0 +1,227 @@ +--- +title: Ofs (Hadoop compatible) +date: 2017-09-14 +weight: 1 +menu: + main: + parent: "Client Interfaces" +summary: Hadoop Compatible file system allows any application that expects an HDFS like interface to work against Ozone with zero changes. Frameworks like Apache Spark, YARN and Hive work against Ozone without needing any change. **Global level view.** +--- + + +The Hadoop compatible file system interface allows storage backends like Ozone +to be easily integrated into Hadoop eco-system. Ozone file system is an +Hadoop compatible file system. + + + + +## The Basics + +Examples of valid OFS paths: + +``` +ofs://om1/ +ofs://om3:9862/ +ofs://omservice/ +ofs://omservice/volume1/ +ofs://omservice/volume1/bucket1/ +ofs://omservice/volume1/bucket1/dir1 +ofs://omservice/volume1/bucket1/dir1/key1 + +ofs://omservice/tmp/ +ofs://omservice/tmp/key1 +``` + +Volumes and mount(s) are located at the root level of an OFS Filesystem. +Buckets are listed naturally under volumes. +Keys and directories are under each buckets. + +Note that for mounts, only temp mount `/tmp` is supported at the moment. + +## Configuration + + +Please add the following entry to the core-site.xml. + +{{< highlight xml >}} + + fs.ofs.impl + org.apache.hadoop.fs.ozone.RootedOzoneFileSystem + + + fs.defaultFS + ofs://om-host.example.com/ + +{{< /highlight >}} + +This will make all the volumes and buckets to be the default Hadoop compatible file system and register the ofs file system type. + +You also need to add the ozone-filesystem-hadoop3.jar file to the classpath: + +{{< highlight bash >}} +export HADOOP_CLASSPATH=/opt/ozone/share/ozonefs/lib/hadoop-ozone-filesystem-hadoop3-*.jar:$HADOOP_CLASSPATH +{{< /highlight >}} + +(Note: with Hadoop 2.x, use the `hadoop-ozone-filesystem-hadoop2-*.jar`) + +Once the default Filesystem has been setup, users can run commands like ls, put, mkdir, etc. +For example: + +{{< highlight bash >}} +hdfs dfs -ls / +{{< /highlight >}} + +Note that ofs works on all buckets and volumes. Users can create buckets and volumes using mkdir, such as create volume named volume1 and bucket named bucket1: + +{{< highlight bash >}} +hdfs dfs -mkdir /volume1 +hdfs dfs -mkdir /volume1/bucket1 +{{< /highlight >}} + + +Or use the put command to write a file to the bucket. + +{{< highlight bash >}} +hdfs dfs -put /etc/hosts /volume1/bucket1/test +{{< /highlight >}} + +For more usage, see: https://issues.apache.org/jira/secure/attachment/12987636/Design%20ofs%20v1.pdf + +## Special note + +Trash is disabled even if `fs.trash.interval` is set on purpose. (HDDS-3982) + +## Differences from [o3fs]({{< ref "interface/O3fs.md" >}}) + +### Creating files + +OFS doesn't allow creating keys(files) directly under root or volumes. +Users will receive an error message when they try to do that: + +``` +$ ozone fs -touch /volume1/key1 +touch: Cannot create file under root or volume. +``` + +### Simplify fs.defaultFS + +With OFS, fs.defaultFS (in core-site.xml) no longer needs to have a specific +volume and bucket in its path like o3fs did. 
+Simply put the OM host or service ID (in case of HA): + +``` + +fs.defaultFS +ofs://omservice + +``` + +The client would then be able to access every volume and bucket on the cluster +without specifying the hostname or service ID. + +``` +$ ozone fs -mkdir -p /volume1/bucket1 +``` + +### Volume and bucket management directly from FileSystem shell + +Admins can create and delete volumes and buckets easily with Hadoop FS shell. +Volumes and buckets are treated similar to directories so they will be created +if they don't exist with `-p`: + +``` +$ ozone fs -mkdir -p ofs://omservice/volume1/bucket1/dir1/ +``` + +Note that the supported volume and bucket name character set rule still applies. +For instance, bucket and volume names don't take underscore(`_`): + +``` +$ ozone fs -mkdir -p /volume_1 +mkdir: Bucket or Volume name has an unsupported character : _ +``` + +## Mounts + +In order to be compatible with legacy Hadoop applications that use /tmp/, +we have a special temp mount located at the root of the FS. +This feature may be expanded in the feature to support custom mount paths. + +Important: To use it, first, an **admin** needs to create the volume tmp +(the volume name is hardcoded for now) and set its ACL to world ALL access. +Namely: + +``` +$ ozone sh volume create tmp +$ ozone sh volume setacl tmp -al world::a +``` + +These commands only needs to be done **once per cluster**. + +Then, **each user** needs to mkdir first to initialize their own temp bucket +once. + +``` +$ ozone fs -mkdir /tmp +2020-06-04 00:00:00,050 [main] INFO rpc.RpcClient: Creating Bucket: tmp/0238 ... +``` + +After that they can write to it just like they would do to a regular +directory. e.g.: + +``` +$ ozone fs -touch /tmp/key1 +``` + +## Delete with trash enabled + +When keys are deleted with trash enabled, they are moved to a trash directory +under each bucket, because keys aren't allowed to be moved(renamed) between +buckets in Ozone. + +``` +$ ozone fs -rm /volume1/bucket1/key1 +2020-06-04 00:00:00,100 [main] INFO fs.TrashPolicyDefault: Moved: 'ofs://id1/volume1/bucket1/key1' to trash at: ofs://id1/volume1/bucket1/.Trash/hadoop/Current/volume1/bucket1/key1 +``` + +This is very similar to how the HDFS encryption zone handles trash location. + +## Recursive listing + +OFS supports recursive volume, bucket and key listing. + +i.e. `ozone fs -ls -R ofs://omservice/`` will recursively list all volumes, +buckets and keys the user has LIST permission to if ACL is enabled. +If ACL is disabled, the command would just list literally everything on that +cluster. + +This feature wouldn't degrade server performance as the loop is on the client. +Think it as a client is issuing multiple requests to the server to get all the +information. + +## Special note + +Trash is disabled even if `fs.trash.interval` is set on purpose. (HDDS-3982) diff --git a/hadoop-hdds/docs/content/interface/S3.md b/hadoop-hdds/docs/content/interface/S3.md index 1be0137942ef..d2145440f06d 100644 --- a/hadoop-hdds/docs/content/interface/S3.md +++ b/hadoop-hdds/docs/content/interface/S3.md @@ -1,6 +1,9 @@ --- title: S3 Protocol weight: 3 +menu: + main: + parent: "Client Interfaces" summary: Ozone supports Amazon's Simple Storage Service (S3) protocol. In fact, You can use S3 clients and S3 SDK based applications without any modifications with Ozone. 
---
@@ -110,6 +113,24 @@ export AWS_SECRET_ACCESS_KEY=c261b6ecabf7d37d5f9ded654b1c724adac9bd9f13e247a235e
aws s3api --endpoint http://localhost:9878 create-bucket --bucket bucket1
```
+## Expose any volume
+
+Ozone has one more element in the name-space hierarchy compared to S3: the volumes. By default, only the (Ozone) buckets of the `/s3v` volume can be accessed with the S3 interface.
+
+To make any other bucket available with the S3 interface, a "symbolic linked" bucket can be created:
+
+```bash
+ozone sh volume create /s3v
+ozone sh volume create /vol1
+
+ozone sh bucket create /vol1/bucket1
+ozone sh bucket link /vol1/bucket1 /s3v/common-bucket
+```
+
+This example exposes the `/vol1/bucket1` Ozone bucket as the S3-compatible `common-bucket` via the S3 interface.
+
+(Note: the implementation details of the bucket-linking feature can be found in the [design doc]({{< ref "design/volume-management.md">}}))
+
## Clients

### AWS Cli
diff --git a/hadoop-hdds/docs/content/interface/_index.md b/hadoop-hdds/docs/content/interface/_index.md
index 254864732fb8..40ca5e7b249b 100644
--- a/hadoop-hdds/docs/content/interface/_index.md
+++ b/hadoop-hdds/docs/content/interface/_index.md
@@ -1,8 +1,8 @@
---
-title: "Programming Interfaces"
+title: "Client Interfaces"
menu:
   main:
-     weight: 4
+     weight: 5
---
-
-
-Ozone shell supports the following bucket commands.
-
- * [create](#create)
- * [delete](#delete)
- * [info](#info)
- * [list](#list)
-
-### Create
-
-The `bucket create` command allows users to create a bucket.
-
-***Params:***
-
-| Arguments | Comment |
-|--------------------------------|-----------------------------------------|
-| -g, \-\-enforcegdpr | Optional, if set to true it creates a GDPR compliant bucket, if not specified or set to false, it creates an ordinary bucket.
-| -k, \-\-bucketKey | Optional, if a bucket encryption key name from the configured KMS server is specified, the files in the bucket will be transparently encrypted. Instruction on KMS configuration can be found from Hadoop KMS document.
-| Uri | The name of the bucket in **/volume/bucket** format.
-
-
-{{< highlight bash >}}
-ozone sh bucket create /hive/jan
-{{< /highlight >}}
-
-The above command will create a bucket called _jan_ in the _hive_ volume.
-Since no scheme was specified this command defaults to O3 (RPC) protocol.
-
-### Delete
-
-The `bucket delete` command allows users to delete a bucket. If the
-bucket is not empty then this command will fail.
-
-***Params:***
-
-| Arguments | Comment |
-|--------------------------------|-----------------------------------------|
-| Uri | The name of the bucket
-
-{{< highlight bash >}}
-ozone sh bucket delete /hive/jan
-{{< /highlight >}}
-
-The above command will delete _jan_ bucket if it is empty.
-
-### Info
-
-The `bucket info` commands returns the information about the bucket.
-
-***Params:***
-
-| Arguments | Comment |
-|--------------------------------|-----------------------------------------|
-| Uri | The name of the bucket.
-
-{{< highlight bash >}}
-ozone sh bucket info /hive/jan
-{{< /highlight >}}
-
-The above command will print out the information about _jan_ bucket.
-
-### List
-
-The `bucket list` command allows users to list the buckets in a volume.
-
-***Params:***
-
-| Arguments | Comment |
-|--------------------------------|-----------------------------------------|
-| -l, \-\-length | Maximum number of results to return.
Default: 100 -| -p, \-\-prefix | Optional, Only buckets that match this prefix will be returned. -| -s, \-\-start | The listing will start from key after the start key. -| Uri | The name of the _volume_. - -{{< highlight bash >}} -ozone sh bucket list /hive -{{< /highlight >}} - -This command will list all buckets on the volume _hive_. diff --git a/hadoop-hdds/docs/content/shell/BucketCommands.zh.md b/hadoop-hdds/docs/content/shell/BucketCommands.zh.md deleted file mode 100644 index 9afd28079c20..000000000000 --- a/hadoop-hdds/docs/content/shell/BucketCommands.zh.md +++ /dev/null @@ -1,98 +0,0 @@ ---- -title: 桶命令 -summary: 用桶命令管理桶的生命周期 -weight: 3 ---- - - -Ozone shell 提供以下桶命令: - - * [创建](#创建) - * [删除](#删除) - * [查看](#查看) - * [列举](#列举) - -### 创建 - -用户使用 `bucket create` 命令来创建桶。 - -***参数:*** - -| 参数名 | 说明 | -|--------------------------------|-----------------------------------------| -| -g, \-\-enforcegdpr | 可选,如果设置为 true 则创建符合 GDPR 规范的桶,设置为 false 或不指定则创建普通的桶| -| -k, \-\-bucketKey | 可选,如果指定了 KMS 服务器中的桶加密密钥名,该桶中的文件都会被自动加密,KMS 的配置说明可以参考 Hadoop KMS 文档。 -| Uri | 桶名,格式为 **/volume/bucket** | - - -{{< highlight bash >}} -ozone sh bucket create /hive/jan -{{< /highlight >}} - -上述命令会在 _hive_ 卷中创建一个名为 _jan_ 的桶,因为没有指定 scheme,默认使用 O3(RPC)协议。 - -### 删除 - -用户使用 `bucket delete` 命令来删除桶,如果桶不为空,此命令将失败。 - -***参数:*** - -| 参数名 | 说明 | -|--------------------------------|-----------------------------------------| -| Uri | 桶名 | - -{{< highlight bash >}} -ozone sh bucket delete /hive/jan -{{< /highlight >}} - -如果 _jan_ 桶不为空,上述命令会将其删除。 - -### 查看 - -`bucket info` 命令返回桶的信息。 - -***参数:*** - -| 参数名 | 说明 | -|--------------------------------|-----------------------------------------| -| Uri | 桶名 | - -{{< highlight bash >}} -ozone sh bucket info /hive/jan -{{< /highlight >}} - -上述命令会打印出 _jan_ 桶的有关信息。 - -### 列举 - -用户通过 `bucket list` 命令列举一个卷下的所有桶。 - -***参数:*** - -| 参数 | 说明 | -|--------------------------------|-----------------------------------------| -| -l, \-\-length | 返回结果的最大数量,默认为 100 -| -p, \-\-prefix | 可选,只有匹配指定前缀的桶会被返回 -| -s, \-\-start | 从指定键开始列举 -| Uri | 卷名 - -{{< highlight bash >}} -ozone sh bucket list /hive -{{< /highlight >}} - -此命令会列出 _hive_ 卷中的所有桶。 diff --git a/hadoop-hdds/docs/content/shell/Format.md b/hadoop-hdds/docs/content/shell/Format.md deleted file mode 100644 index d6c9d2f51802..000000000000 --- a/hadoop-hdds/docs/content/shell/Format.md +++ /dev/null @@ -1,69 +0,0 @@ ---- -title: Shell Overview -summary: Explains the command syntax used by shell command. -weight: 1 ---- - - -Ozone shell help can be invoked at _object_ level or at _action_ level. -For example: - -{{< highlight bash >}} -ozone sh volume --help -{{< /highlight >}} - -This will show all possible actions for volumes. - -or it can be invoked to explain a specific action like -{{< highlight bash >}} -ozone sh volume create --help -{{< /highlight >}} -This command will give you command line options of the create command. - -

- - -### General Command Format - -The Ozone shell commands take the following format. - -> _ozone sh object action url_ - -**ozone** script is used to invoke all Ozone sub-commands. The ozone shell is -invoked via ```sh``` command. - -The object can be a volume, bucket or a key. The action is various verbs like -create, list, delete etc. - - -Ozone URL can point to a volume, bucket or keys in the following format: - -_\[schema\]\[server:port\]/volume/bucket/key_ - - -Where, - -1. **Schema** - This should be `o3` which is the native RPC protocol to access - Ozone API. The usage of the schema is optional. - -2. **Server:Port** - This is the address of the Ozone Manager. If the port is -omitted the default port from ozone-site.xml will be used. - -Depending on the call, the volume/bucket/key names will be part of the URL. -Please see volume commands, bucket commands, and key commands section for more -detail. diff --git a/hadoop-hdds/docs/content/shell/Format.zh.md b/hadoop-hdds/docs/content/shell/Format.zh.md deleted file mode 100644 index edfcbdc24a49..000000000000 --- a/hadoop-hdds/docs/content/shell/Format.zh.md +++ /dev/null @@ -1,65 +0,0 @@ ---- -title: Shell 概述 -summary: shell 命令的语法介绍。 -weight: 1 ---- - - -Ozone shell 的帮助命令既可以在 _对象_ 级别调用,也可以在 _操作_ 级别调用。 -比如: - -{{< highlight bash >}} -ozone sh volume --help -{{< /highlight >}} - -此命令会列出所有对卷的可能操作。 - -你也可以通过它查看特定操作的帮助,比如: - -{{< highlight bash >}} -ozone sh volume create --help -{{< /highlight >}} - -这条命令会给出 create 命令的命令行选项。 - -

- - -### 通用命令格式 - -Ozone shell 命令都遵照以下格式: - -> _ozone sh object action url_ - -**ozone** 脚本用来调用所有 Ozone 子命令,ozone shell 通过 ```sh``` 子命令调用。 - -对象可以是卷、桶或键,操作一般是各种动词,比如 create、list、delete 等等。 - - -Ozone URL 可以指向卷、桶或键,格式如下: - -_\[schema\]\[server:port\]/volume/bucket/key_ - - -其中, - -1. **Schema** - 可选,默认为 `o3`,表示使用原生 RPC 协议来访问 Ozone API。 - -2. **Server:Port** - OM 的地址,如果省略了端口, 则使用 ozone-site.xml 中的默认端口。 - -根据具体的命令不同,卷名、桶名和键名将用来构成 URL,卷、桶和键命令的文档有更多具体的说明。 diff --git a/hadoop-hdds/docs/content/shell/KeyCommands.md b/hadoop-hdds/docs/content/shell/KeyCommands.md deleted file mode 100644 index 11186c422184..000000000000 --- a/hadoop-hdds/docs/content/shell/KeyCommands.md +++ /dev/null @@ -1,177 +0,0 @@ ---- -title: Key Commands -summary: Key commands help you to manage the life cycle of - Keys / Objects. -weight: 4 ---- - - - -Ozone shell supports the following key commands. - - * [get](#get) - * [put](#put) - * [delete](#delete) - * [info](#info) - * [list](#list) - * [rename](#rename) - * [cat](#cat) - * [copy](#cp) - - -### Get - -The `key get` command downloads a key from Ozone cluster to local file system. - -***Params:*** - -| Arguments | Comment | -|--------------------------------|-----------------------------------------| -| Uri | The name of the key in **/volume/bucket/key** format. -| FileName | Local file to download the key to. - - -{{< highlight bash >}} -ozone sh key get /hive/jan/sales.orc sales.orc -{{< /highlight >}} -Downloads the file sales.orc from the _/hive/jan_ bucket and writes to the -local file sales.orc. - -### Put - -The `key put` command uploads a file from the local file system to the specified bucket. - -***Params:*** - - -| Arguments | Comment | -|--------------------------------|-----------------------------------------| -| Uri | The name of the key in **/volume/bucket/key** format. -| FileName | Local file to upload. -| -r, \-\-replication | Optional, Number of copies, ONE or THREE are the options. Picks up the default from cluster configuration. -| -t, \-\-type | Optional, replication type of the new key. RATIS and STAND_ALONE are the options. Picks up the default from cluster configuration. - -{{< highlight bash >}} -ozone sh key put /hive/jan/corrected-sales.orc sales.orc -{{< /highlight >}} -The above command will put the sales.orc as a new key into _/hive/jan/corrected-sales.orc_. - -### Delete - -The `key delete` command removes the key from the bucket. - -***Params:*** - -| Arguments | Comment | -|--------------------------------|-----------------------------------------| -| Uri | The name of the key. - -{{< highlight bash >}} -ozone sh key delete /hive/jan/corrected-sales.orc -{{< /highlight >}} - -The above command deletes the key _/hive/jan/corrected-sales.orc_. - - -### Info - -The `key info` commands returns the information about the key. - -***Params:*** - -| Arguments | Comment | -|--------------------------------|-----------------------------------------| -| Uri | The name of the key. - -{{< highlight bash >}} -ozone sh key info /hive/jan/sales.orc -{{< /highlight >}} - -The above command will print out the information about _/hive/jan/sales.orc_ -key. - -### List - -The `key list` command allows user to list all keys in a bucket. - -***Params:*** - -| Arguments | Comment | -|--------------------------------|-----------------------------------------| -| -l, \-\-length | Maximum number of results to return. Default: 100 -| -p, \-\-prefix | Optional, Only keys that match this prefix will be returned. 
-| -s, \-\-start | The listing will start from key after the start key. -| Uri | The name of the _volume_. - -{{< highlight bash >}} -ozone sh key list /hive/jan -{{< /highlight >}} - -This command will list all keys in the bucket _/hive/jan_. - -### Rename - -The `key rename` command changes the name of an existing key in the specified bucket. - -***Params:*** - -| Arguments | Comment | -|--------------------------------|-----------------------------------------| -| Uri | The name of the bucket in **/volume/bucket** format. -| FromKey | The existing key to be renamed -| ToKey | The new desired name of the key - -{{< highlight bash >}} -ozone sh key rename /hive/jan sales.orc new_name.orc -{{< /highlight >}} -The above command will rename _sales.orc_ to _new\_name.orc_ in the bucket _/hive/jan_. - -### Cat - -The `key cat` command displays the contents of a specific Ozone key to standard output. - -***Params:*** - -| Arguments | Comment | -|--------------------------------|-----------------------------------------| -| Uri | The name of the key in **/volume/bucket/key** format. - - -{{< highlight bash >}} -ozone sh key cat /hive/jan/hello.txt -{{< /highlight >}} -Displays the contents of the key hello.txt from the _/hive/jan_ bucket to standard output. - -### Cp - -The `key cp` command copies a key to another one in the specified bucket. - -***Params:*** - -| Arguments | Comment | -|--------------------------------|-----------------------------------------| -| Uri | The name of the bucket in **/volume/bucket** format. -| FromKey | The existing key to be copied -| ToKey | The name of the new key -| -r, \-\-replication | Optional, Number of copies, ONE or THREE are the options. Picks up the default from cluster configuration. -| -t, \-\-type | Optional, replication type of the new key. RATIS and STAND_ALONE are the options. Picks up the default from cluster configuration. - -{{< highlight bash >}} -ozone sh key cp /hive/jan sales.orc new_one.orc -{{< /highlight >}} -The above command will copy _sales.orc_ to _new\_one.orc_ in the bucket _/hive/jan_. 
\ No newline at end of file diff --git a/hadoop-hdds/docs/content/shell/KeyCommands.zh.md b/hadoop-hdds/docs/content/shell/KeyCommands.zh.md deleted file mode 100644 index 2a36e7324f31..000000000000 --- a/hadoop-hdds/docs/content/shell/KeyCommands.zh.md +++ /dev/null @@ -1,176 +0,0 @@ ---- -title: 键命令 -summary: 用键命令管理键/对象的生命周期 -weight: 4 ---- - - - -Ozone shell 提供以下键命令: - - * [下载](#下载) - * [上传](#上传) - * [删除](#删除) - * [查看](#查看) - * [列举](#列举) - * [重命名](#重命名) - * [Cat](#cat) - * [Cp](#cp) - - -### 下载 - -`key get` 命令从 Ozone 集群下载一个键到本地文件系统。 - -***参数:*** - -| 参数名 | 说明 | -|--------------------------------|-----------------------------------------| -| Uri | 键名,格式为 **/volume/bucket/key** -| FileName | 下载到本地后的文件名 - - -{{< highlight bash >}} -ozone sh key get /hive/jan/sales.orc sales.orc -{{< /highlight >}} - -从 _/hive/jan_ 桶中下载 sales.orc 文件,写入到本地名为 sales.orc 的文件。 - -### 上传 - -`key put` 命令从本地文件系统上传一个文件到指定的桶。 - -***参数:*** - -| 参数名 | 说明 | -|--------------------------------|-----------------------------------------| -| Uri | 键名,格式为 **/volume/bucket/key** -| FileName | 待上传的本地文件 -| -r, \-\-replication | 可选,上传后的副本数,合法值为 ONE 或者 THREE,如果不设置,将采用集群配置中的默认值。 -| -t, \-\-type | 可选,副本类型,合法值为 RATIS 或 STAND_ALONE,如果不设置,将采用集群配置中的默认值。 - -{{< highlight bash >}} -ozone sh key put /hive/jan/corrected-sales.orc sales.orc -{{< /highlight >}} - -上述命令将 sales.orc 文件作为新键上传到 _/hive/jan/corrected-sales.orc_ 。 - -### 删除 - -`key delete` 命令用来从桶中删除指定键。 - -***参数:*** - -| 参数名 | 说明 | -|--------------------------------|-----------------------------------------| -| Uri | 键名 - -{{< highlight bash >}} -ozone sh key delete /hive/jan/corrected-sales.orc -{{< /highlight >}} - -上述命令会将 _/hive/jan/corrected-sales.orc_ 这个键删除。 - - -### 查看 - -`key info` 命令返回指定键的信息。 - -***参数:*** - -| 参数名 | 说明 | -|--------------------------------|-----------------------------------------| -| Uri | 键名 - -{{< highlight bash >}} -ozone sh key info /hive/jan/sales.orc -{{< /highlight >}} - -上述命令会打印出 _/hive/jan/sales.orc_ 键的相关信息。 - -### 列举 - -用户通过 `key list` 命令列出一个桶中的所有键。 - -***参数:*** - -| 参数名 | 说明 | -|--------------------------------|-----------------------------------------| -| -l, \-\-length | 返回结果的最大数量,默认值为 100 -| -p, \-\-prefix | 可选,只有匹配指定前缀的键会被返回 -| -s, \-\-start | 从指定键开始列举 -| Uri | 桶名 - -{{< highlight bash >}} -ozone sh key list /hive/jan -{{< /highlight >}} - -此命令会列出 _/hive/jan_ 桶中的所有键。 - -### 重命名 - -`key rename` 命令用来修改指定桶中的已有键的键名。 - -***参数:*** - -| 参数名 | 说明 | -|--------------------------------|-----------------------------------------| -| Uri | 桶名,格式为 **/volume/bucket** -| FromKey | 旧的键名 -| ToKey | 新的键名 - -{{< highlight bash >}} -ozone sh key rename /hive/jan sales.orc new_name.orc -{{< /highlight >}} - -上述命令会将 _/hive/jan_ 桶中的 _sales.orc_ 重命名为 _new\_name.orc_ 。 - -### Cat - -`key cat` 命令用来把指定的键的内容输出到终端。 - -***参数:*** - -| 参数名 | 说明 | -|--------------------------------|-----------------------------------------| -| Uri | 键名,格式为 **/volume/bucket/key** - - -{{< highlight bash >}} -ozone sh key cat /hive/jan/hello.txt -{{< /highlight >}} -上述命令会将 _/hive/jan_ 桶中的 hello.txt 的内容输出到标准输出中来。 - -### Cp - -`key cp` 命令用来在同一个bucket下,从一个key复制出另一个key。 - -***Params:*** - -| 参数名 | 说明 | -|--------------------------------|-----------------------------------------| -| Uri | 桶名 格式为**/volume/bucket**。 -| FromKey | 现有的键名 -| ToKey | 新的键名 -| -r, \-\-replication | 可选,上传后的副本数,合法值为 ONE 或者 THREE,如果不设置,将采用集群配置中的默认值。 -| -t, \-\-type | 可选,副本类型,合法值为 RATIS 或 STAND_ALONE,如果不设置,将采用集群配置中的默认值。 - -{{< highlight bash >}} -ozone sh key cp /hive/jan sales.orc new_one.orc -{{< /highlight >}} -上述命令会将 
_/hive/jan_ 桶中的 _sales.orc_ 复制到 _new\_one.orc_ 。 \ No newline at end of file diff --git a/hadoop-hdds/docs/content/shell/VolumeCommands.md b/hadoop-hdds/docs/content/shell/VolumeCommands.md deleted file mode 100644 index fe459f313352..000000000000 --- a/hadoop-hdds/docs/content/shell/VolumeCommands.md +++ /dev/null @@ -1,114 +0,0 @@ ---- -title: Volume Commands -weight: 2 -summary: Volume commands help you to manage the life cycle of a volume. ---- - - -Volume commands generally need administrator privileges. The ozone shell supports the following volume commands. - - * [create](#create) - * [delete](#delete) - * [info](#info) - * [list](#list) - * [update](#update) - -### Create - -The `volume create` command allows an administrator to create a volume and -assign it to a user. - -***Params:*** - -| Arguments | Comment | -|--------------------------------|-----------------------------------------| -| -q, \-\-quota | Optional, This argument that specifies the maximum size this volume can use in the Ozone cluster. | -| -u, \-\-user | Required, The name of the user who owns this volume. This user can create, buckets and keys on this volume. | -| Uri | The name of the volume. | - -{{< highlight bash >}} -ozone sh volume create --quota=1TB --user=bilbo /hive -{{< /highlight >}} - -The above command will create a volume called _hive_ on the ozone cluster. This -volume has a quota of 1TB, and the owner is _bilbo_. - -### Delete - -The `volume delete` command allows an administrator to delete a volume. If the -volume is not empty then this command will fail. - -***Params:*** - -| Arguments | Comment | -|--------------------------------|-----------------------------------------| -| Uri | The name of the volume. - -{{< highlight bash >}} -ozone sh volume delete /hive -{{< /highlight >}} - -The above command will delete the volume hive, if the volume has no buckets -inside it. - -### Info - -The `volume info` commands returns the information about the volume including -quota and owner information. - -***Params:*** - -| Arguments | Comment | -|--------------------------------|-----------------------------------------| -| Uri | The name of the volume. - -{{< highlight bash >}} -ozone sh volume info /hive -{{< /highlight >}} - -The above command will print out the information about hive volume. - -### List - -The `volume list` command will list the volumes accessible by a user. - -{{< highlight bash >}} -ozone sh volume list --user hadoop -{{< /highlight >}} - -When ACL is enabled, the above command will print out volumes that the user -hadoop has LIST permission to. When ACL is disabled, the above command will -print out all the volumes owned by the user hadoop. - -### Update - -The volume update command allows changing of owner and quota on a given volume. - -***Params:*** - -| Arguments | Comment | -|--------------------------------|-----------------------------------------| -| -q, \-\-quota | Optional, This argument that specifies the maximum size this volume can use in the Ozone cluster. | -| -u, \-\-user | Optional, The name of the user who owns this volume. This user can create, buckets and keys on this volume. | -| Uri | The name of the volume. | - -{{< highlight bash >}} -ozone sh volume update --quota=10TB /hive -{{< /highlight >}} - -The above command updates the volume quota to 10TB. 
diff --git a/hadoop-hdds/docs/content/shell/VolumeCommands.zh.md b/hadoop-hdds/docs/content/shell/VolumeCommands.zh.md deleted file mode 100644 index 190e0994e74c..000000000000 --- a/hadoop-hdds/docs/content/shell/VolumeCommands.zh.md +++ /dev/null @@ -1,108 +0,0 @@ ---- -title: 卷命令 -weight: 2 -summary: 用卷命令管理卷的生命周期 ---- - - -卷命令通常需要管理员权限,ozone shell 支持以下卷命令: - - * [创建](#创建) - * [删除](#删除) - * [查看](#查看) - * [列举](#列举) - * [更新](#更新) - -### 创建 - -管理员可以通过 `volume create` 命令创建一个卷并分配给一个用户。 - -***参数:*** - -| 参数名 | 说明 | -|--------------------------------|-----------------------------------------| -| -q, \-\-quota | 可选,指明该卷在 Ozone 集群所能使用的最大空间,即限额。 | -| -u, \-\-user | 必需,指明该卷的所有者,此用户可以在该卷中创建桶和键。 | -| Uri | 卷名 | - -{{< highlight bash >}} -ozone sh volume create --quota=1TB --user=bilbo /hive -{{< /highlight >}} - -上述命令会在 ozone 集群中创建名为 _hive_ 的卷,卷的限额为 1TB,所有者为 _bilbo_ 。 - -### 删除 - -管理员可以通过 `volume delete` 命令删除一个卷,如果卷不为空,此命令将失败。 - -***参数*** - -| 参数名 | 说明 | -|--------------------------------|-----------------------------------------| -| Uri | 卷名 | - -{{< highlight bash >}} -ozone sh volume delete /hive -{{< /highlight >}} - -如果 hive 卷中不包含任何桶,上述命令将删除 hive 卷。 - -### 查看 - -通过 `volume info` 命令可以获取卷的限额和所有者信息。 - -***参数:*** - -| 参数名 | 说明 | -|--------------------------------|-----------------------------------------| -| Uri | 卷名 | - -{{< highlight bash >}} -ozone sh volume info /hive -{{< /highlight >}} - -上述命令会打印出 hive 卷的相关信息。 - -### 列举 - -`volume list` 命令用来列举一个用户可以访问的所有卷。 - -{{< highlight bash >}} -ozone sh volume list --user hadoop -{{< /highlight >}} - -若 ACL 已启用,上述命令会打印出 hadoop 用户有 LIST 权限的所有卷。 -若 ACL 被禁用,上述命令会打印出 hadoop 用户拥有的所有卷。 - -### 更新 - -`volume update` 命令用来修改卷的所有者和限额。 - -***参数*** - -| 参数名 | 说明 | -|--------------------------------|-----------------------------------------| -| -q, \-\-quota | 可选,重新指定该卷在 Ozone 集群中的限额。 | -| -u, \-\-user | 可选,重新指定该卷的所有者 | -| Uri | 卷名 | - -{{< highlight bash >}} -ozone sh volume update --quota=10TB /hive -{{< /highlight >}} - -上述命令将 hive 卷的限额更新为 10TB。 diff --git a/hadoop-hdds/docs/content/shell/_index.zh.md b/hadoop-hdds/docs/content/shell/_index.zh.md deleted file mode 100644 index 0f6220b5f0e6..000000000000 --- a/hadoop-hdds/docs/content/shell/_index.zh.md +++ /dev/null @@ -1,27 +0,0 @@ ---- -title: 命令行接口 -menu: - main: - weight: 3 ---- - - - -{{}} - Ozone shell 是用户与 Ozone 进行交互的主要接口,它提供了操作 Ozone 的命令行接口。 -{{}} diff --git a/hadoop-hdds/docs/content/start/FromSource.md b/hadoop-hdds/docs/content/start/FromSource.md index 9ce0cc4b6a8f..80f47fb78f0b 100644 --- a/hadoop-hdds/docs/content/start/FromSource.md +++ b/hadoop-hdds/docs/content/start/FromSource.md @@ -22,18 +22,21 @@ weight: 30 {{< requirements >}} * Java 1.8 * Maven - * Protoc (2.5) {{< /requirements >}} - +planning to build sources yourself, you can safely skip this page. + + If you are a Hadoop ninja, and wise in the ways of Apache, you already know that a real Apache release is a source release. -If you want to build from sources, Please untar the source tarball and run -the ozone build command. This instruction assumes that you have all the +If you want to build from sources, Please untar the source tarball (or clone the latest code +from the [git repository](https://github.com/apache/hadoop-ozone)) and run the ozone build command. This instruction assumes that you have all the dependencies to build Hadoop on your build machine. If you need instructions on how to build Hadoop, please look at the Apache Hadoop Website. @@ -41,28 +44,27 @@ on how to build Hadoop, please look at the Apache Hadoop Website. 
mvn clean package -DskipTests=true
```

-This will build an ozone-\<version\>.tar.gz in your `hadoop-ozone/dist/target` directory.
+This will build an `ozone-\<version\>` directory in your `hadoop-ozone/dist/target` directory.

You can copy this tarball and use this instead of binary artifacts that are provided along with the official release.

-## How to test the build
-
-You can run the acceptance tests in the hadoop-ozone directory to make sure
-that your build is functional. To launch the acceptance tests, please follow
- the instructions in the **README.md** in the `smoketest` directory.
+To create a tar file distribution, use the `-Pdist` profile:

```bash
-cd smoketest
-./test.sh
+mvn clean package -DskipTests=true -Pdist
```

- You can also execute only a minimal subset of the tests:
+## How to run Ozone from build
+
+When you have the new distribution, you can start a local cluster [with docker-compose]({{< ref "start/RunningViaDocker.md">}}).

```bash
-cd smoketest
-./test.sh --env ozone basic
+cd hadoop-ozone/dist/target/ozone-X.X.X...
+cd compose/ozone
+docker-compose up -d
```

-Acceptance tests will start a small ozone cluster and verify that ozone shell and ozone file
- system is fully functional.
+## How to test the build
+
+The `compose` subfolder contains multiple types of example setups (secure, non-secure, HA, Yarn). They can be tested with the help of [robotframework](http://robotframework.org/) by executing `test.sh` in any of the directories.
\ No newline at end of file
diff --git a/hadoop-hdds/docs/content/start/FromSource.zh.md b/hadoop-hdds/docs/content/start/FromSource.zh.md
index a1b9f372e5e8..ab740af73828 100644
--- a/hadoop-hdds/docs/content/start/FromSource.zh.md
+++ b/hadoop-hdds/docs/content/start/FromSource.zh.md
@@ -19,10 +19,15 @@ weight: 30
 limitations under the License.
-->
+ +注意:本页面翻译的信息可能滞后,最新的信息请参看英文版的相关页面。 + +
+ {{< requirements >}} * Java 1.8 * Maven - * Protoc (2.5) {{< /requirements >}}