apache · elek · Jun 25, 2020 · Jun 19, 2020 · Jun 22, 2020 · Jun 23, 2020
diff --git a/hadoop-hdds/docs/content/interface/OzoneFS.md b/hadoop-hdds/docs/content/interface/OzoneFS.md
@@ -25,7 +25,7 @@ The Hadoop compatible file system interface allows storage backends like Ozone
 to be easily integrated into Hadoop eco-system.  Ozone file system is an
 Hadoop compatible file system.
 
-## Setting up the Ozone file system
+## Setting up the Ozone file system (o3fs)
 
 To create an ozone file system, we have to choose a bucket where the file system would live. This bucket will be used as the backend store for OzoneFileSystem. All the files and directories will be stored as keys in this bucket.
 
@@ -41,10 +41,6 @@ Once this is created, please make sure that bucket exists via the _list volume_
 Please add the following entry to the core-site.xml.
 
 {{< highlight xml >}}
-<property>
-  <name>fs.o3fs.impl</name>
-  <value>org.apache.hadoop.fs.ozone.OzoneFileSystem</value>
-</property>
 <property>
   <name>fs.AbstractFileSystem.o3fs.impl</name>
   <value>org.apache.hadoop.fs.ozone.OzFs</value>
@@ -57,12 +53,14 @@ Please add the following entry to the core-site.xml.
 
 This will make this bucket to be the default file system for HDFS dfs commands and register the o3fs file system type.
 
-You also need to add the ozone-filesystem.jar file to the classpath:
+You also need to add the ozone-filesystem-hadoop3.jar file to the classpath:
 
 {{< highlight bash >}}
-export HADOOP_CLASSPATH=/opt/ozone/share/ozonefs/lib/hadoop-ozone-filesystem-lib-current*.jar:$HADOOP_CLASSPATH
+export HADOOP_CLASSPATH=/opt/ozone/share/ozonefs/lib/hadoop-ozone-filesystem-hadoop3-*.jar:$HADOOP_CLASSPATH
 {{< /highlight >}}
 
+(Note: with Hadoop 2.x, use `hadoop-ozone-filesystem-hadoop2-*.jar` instead.)
+
 Once the default Filesystem has been setup, users can run commands like ls, put, mkdir, etc.
 For example,
 
@@ -81,6 +79,7 @@ Or put command etc. In other words, all programs like Hive, Spark, and Distcp wi
 Please note that any keys created/deleted in the bucket using methods apart from OzoneFileSystem will show up as directories and files in the Ozone File System.
 
 Note: Bucket and volume names are not allowed to have a period in them.
+
 Moreover, the filesystem URI can take a fully qualified form with the OM host and an optional port as a part of the path following the volume name.
 For example, you can specify both host and port:
 
@@ -114,43 +113,3 @@ hdfs dfs -ls o3fs://bucket.volume.om-host.example.com:6789/key
 Note: Only port number from the config is used in this case, 
 whereas the host name in the config `ozone.om.address` is ignored.
 
-
-## Supporting older Hadoop version (Legacy jar, BasicOzoneFilesystem)
-
-There are two ozonefs files, both of them include all the dependencies:
-
- * share/ozone/lib/hadoop-ozone-filesystem-lib-current-VERSION.jar
- * share/ozone/lib/hadoop-ozone-filesystem-lib-legacy-VERSION.jar
-
-The first one contains all the required dependency to use ozonefs with a
- compatible hadoop version (hadoop 3.2).
-
-The second one contains all the dependency in an internal, separated directory,
- and a special class loader is used to load all the classes from the location.
-
-With this method the hadoop-ozone-filesystem-lib-legacy.jar can be used from
- any older hadoop version (eg. hadoop 3.1, hadoop 2.7 or spark+hadoop 2.7)
-
-Similar to the dependency jar, there are two OzoneFileSystem implementation.
-
-For hadoop 3.0 and newer, you can use `org.apache.hadoop.fs.ozone.OzoneFileSystem`
- which is a full implementation of the Hadoop compatible File System API.
-
-For Hadoop 2.x you should use the Basic version: `org.apache.hadoop.fs.ozone.BasicOzoneFileSystem`.
-
-This is the same implementation but doesn't include the features/dependencies which are added with
- Hadoop 3.0. (eg. FS statistics, encryption zones).
-
-### Summary
-
-The following table summarize which jar files and implementation should be used:
-
-Hadoop version | Required jar            | FileSystem implementation | AbstractFileSystem implementation
----------------|-------------------------|-------------------------------------------------|-------------------------------------
-3.2            | filesystem-lib-current  | org.apache.hadoop.fs.ozone.OzoneFileSystem      | org.apache.hadoop.fs.ozone.OzFs
-3.1            | filesystem-lib-legacy   | org.apache.hadoop.fs.ozone.OzoneFileSystem      | org.apache.hadoop.fs.ozone.OzFs
-2.9            | filesystem-lib-legacy   | org.apache.hadoop.fs.ozone.BasicOzoneFileSystem | org.apache.hadoop.fs.ozone.BasicOzFs
-2.7            | filesystem-lib-legacy   | org.apache.hadoop.fs.ozone.BasicOzoneFileSystem | org.apache.hadoop.fs.ozone.BasicOzFs
-
-With this method the hadoop-ozone-filesystem-lib-legacy.jar can be used from
- any older hadoop version (eg. hadoop 2.7 or spark+hadoop 2.7)
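
Reviewer sketch (not part of the patch): OzoneFS.md describes a fully qualified URI of the form `o3fs://bucket.volume.om-host:port/key`. Because bucket and volume names may not contain a period, the first two dot-separated labels of the authority are always the bucket and the volume, and whatever follows is the OM host. The Python below only illustrates that reading. The names `parse_o3fs_uri` and `O3fsAuthority` are hypothetical, and this is not how Ozone itself parses the URI.

```python
from typing import NamedTuple, Optional


class O3fsAuthority(NamedTuple):
    """Parts of an o3fs URI authority (hypothetical helper type)."""
    bucket: str
    volume: str
    om_host: Optional[str]
    om_port: Optional[int]


def parse_o3fs_uri(uri: str) -> O3fsAuthority:
    """Split an o3fs URI into bucket, volume, OM host and OM port."""
    prefix = "o3fs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an o3fs URI: {uri}")
    # The authority is everything between the scheme and the first '/'.
    authority = uri[len(prefix):].split("/", 1)[0]
    host_part, sep, port_part = authority.partition(":")
    labels = host_part.split(".")
    if len(labels) < 2 or not all(labels[:2]):
        raise ValueError(f"expected bucket.volume in: {uri}")
    bucket, volume = labels[0], labels[1]
    # Bucket and volume names contain no periods, so any remaining
    # labels belong to the OM host name.
    om_host = ".".join(labels[2:]) or None
    om_port = int(port_part) if sep else None
    return O3fsAuthority(bucket, volume, om_host, om_port)
```

With the short form `o3fs://test.s3v/alice.txt`, both `om_host` and `om_port` come back as `None`, which corresponds to the case where the OM address is taken from the configuration.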
diff --git a/hadoop-hdds/docs/content/interface/OzoneFS.zh.md b/hadoop-hdds/docs/content/interface/OzoneFS.zh.md
@@ -39,10 +39,6 @@ ozone sh bucket create /volume/bucket
 请在 core-site.xml 中添加以下条目：
 
 {{< highlight xml >}}
-<property>
-  <name>fs.o3fs.impl</name>
-  <value>org.apache.hadoop.fs.ozone.OzoneFileSystem</value>
-</property>
 <property>
   <name>fs.AbstractFileSystem.o3fs.impl</name>
   <value>org.apache.hadoop.fs.ozone.OzFs</value>
@@ -58,9 +54,11 @@ ozone sh bucket create /volume/bucket
-你还需要将 ozone-filesystem.jar 文件加入 classpath：
+你还需要将 ozone-filesystem-hadoop3.jar 文件加入 classpath：
 
 {{< highlight bash >}}
-export HADOOP_CLASSPATH=/opt/ozone/share/ozonefs/lib/hadoop-ozone-filesystem-lib-current*.jar:$HADOOP_CLASSPATH
+export HADOOP_CLASSPATH=/opt/ozone/share/ozonefs/lib/hadoop-ozone-filesystem-hadoop3-*.jar:$HADOOP_CLASSPATH
 {{< /highlight >}}
 
+(注意：当使用Hadoop 2.x时，应该在classpath上添加hadoop-ozone-filesystem-hadoop2-*.jar)
+
 当配置了默认的文件系统之后，用户可以运行 ls、put、mkdir 等命令，比如：
 
 {{< highlight bash >}}
@@ -109,32 +107,3 @@ hdfs dfs -ls o3fs://bucket.volume.om-host.example.com:6789/key
 注意：在这种情况下，`ozone.om.address` 配置中只有端口号会被用到，主机名是被忽略的。
 
 
-## 兼容旧版本 Hadoop（Legacy jar 和 BasicOzoneFilesystem）
-
-Ozone 文件系统的 jar 包有两种类型，它们都包含了所有的依赖：
-
- * share/ozone/lib/hadoop-ozone-filesystem-lib-current-VERSION.jar
- * share/ozone/lib/hadoop-ozone-filesystem-lib-legacy-VERSION.jar
-
-第一种 jar 包包含了在一个版本兼容的 hadoop（hadoop 3.2）中使用 Ozone 文件系统需要的所有依赖。
-
-第二种 jar 包将所有依赖单独放在一个内部的目录，并且这个目录下的类会用一个特殊的类加载器来加载这些类。通过这种方法，旧版本的 hadoop 就可以使用 hadoop-ozone-filesystem-lib-legacy.jar（比如hadoop 3.1、hadoop 2.7 或者 spark+hadoop 2.7）。
-
-和依赖的 jar 包类似， OzoneFileSystem 也有两种实现。
-
-对于 Hadoop 3.0 之后的版本，你应当使用 `org.apache.hadoop.fs.ozone.OzoneFileSystem`，它是兼容 Hadoop 文件系统 API 的完整实现。
-
-对于 Hadoop 2.x 的版本，你应该使用基础版本 `org.apache.hadoop.fs.ozone.BasicOzoneFileSystem`，两者实现基本相同，但是不包含在 Hadoop 3.0 中引入的特性和依赖（比如文件系统统计信息、加密桶等）。
-
-### 总结
-
-下表总结了各个版本 Hadoop 应当使用的 jar 包和文件系统实现：
-
-Hadoop 版本 | 需要的 jar            | FileSystem 实现  | AbstractFileSystem 实现
----------------|-------------------------|-------------------------------------------------|---------------------------
-3.2            | filesystem-lib-current  | org.apache.hadoop.fs.ozone.OzoneFileSystem      | org.apache.hadoop.fs.ozone.OzFs
-3.1            | filesystem-lib-legacy   | org.apache.hadoop.fs.ozone.OzoneFileSystem      | org.apache.hadoop.fs.ozone.OzFs
-2.9            | filesystem-lib-legacy   | org.apache.hadoop.fs.ozone.BasicOzoneFileSystem | org.apache.hadoop.fs.ozone.BasicOzFs
-2.7            | filesystem-lib-legacy   | org.apache.hadoop.fs.ozone.BasicOzoneFileSystem | org.apache.hadoop.fs.ozone.BasicOzFs
-
-由此可知，低版本的 Hadoop 可以使用 hadoop-ozone-filesystem-lib-legacy.jar（比如 hadoop 2.7 或者 spark+hadoop 2.7）。
diff --git a/hadoop-hdds/docs/content/recipe/SparkOzoneFSK8S.md b/hadoop-hdds/docs/content/recipe/SparkOzoneFSK8S.md
@@ -31,7 +31,7 @@ This recipe shows how Ozone object store can be used from Spark using:
 ## Requirements
 
 Download latest Spark and Ozone distribution and extract them. This method is
-tested with the `spark-2.4.0-bin-hadoop2.7` distribution.
+tested with the `spark-2.4.6-bin-hadoop2.7` distribution.
 
 You also need the following:
 
@@ -47,13 +47,13 @@ First of all create a docker image with the Spark image creator.
 Execute the following from the Spark distribution
 
 ```bash
-./bin/docker-image-tool.sh -r myrepo -t 2.4.0 build
+./bin/docker-image-tool.sh -r myrepo -t 2.4.6 build
 ```
 
 _Note_: if you use Minikube add the `-m` flag to use the docker daemon of the Minikube image:
 
 ```bash
-./bin/docker-image-tool.sh -m -r myrepo -t 2.4.0 build
+./bin/docker-image-tool.sh -m -r myrepo -t 2.4.6 build
 ```
 
 `./bin/docker-image-tool.sh` is an official Spark tool to create container images and this step will create multiple Spark container images with the name _myrepo/spark_. The first container will be used as a base container in the following steps.
@@ -72,41 +72,35 @@ And create a custom `core-site.xml`.
 
 ```xml
 <configuration>
-    <property>
-        <name>fs.o3fs.impl</name>
-        <value>org.apache.hadoop.fs.ozone.BasicOzoneFileSystem</value>
-    </property>
     <property>
         <name>fs.AbstractFileSystem.o3fs.impl</name>
         <value>org.apache.hadoop.fs.ozone.OzFs</value>
      </property>
 </configuration>
 ```
 
-_Note_: You may also use `org.apache.hadoop.fs.ozone.OzoneFileSystem` without the `Basic` prefix. The `Basic` version doesn't support FS statistics and encryption zones but can work together with older hadoop versions.
-
-Copy the `ozonefs.jar` file from an ozone distribution (__use the legacy version!__)
+Copy the Ozone filesystem jar from an Ozone distribution (__use the hadoop2 version!__):
 
 ```
-kubectl cp om-0:/opt/hadoop/share/ozone/lib/hadoop-ozone-filesystem-lib-legacy-0.4.0-SNAPSHOT.jar .
+kubectl cp om-0:/opt/hadoop/share/ozone/lib/hadoop-ozone-filesystem-hadoop2-VERSION.jar hadoop-ozone-filesystem-hadoop2.jar
 ```
 
 
 Create a new Dockerfile and build the image:
 ```
-FROM myrepo/spark:2.4.0
+FROM myrepo/spark:2.4.6
 ADD core-site.xml /opt/hadoop/conf/core-site.xml
 ADD ozone-site.xml /opt/hadoop/conf/ozone-site.xml
 ENV HADOOP_CONF_DIR=/opt/hadoop/conf
 ENV SPARK_EXTRA_CLASSPATH=/opt/hadoop/conf
-ADD hadoop-ozone-filesystem-lib-legacy-0.4.0-SNAPSHOT.jar /opt/hadoop-ozone-filesystem-lib-legacy.jar
+ADD hadoop-ozone-filesystem-hadoop2.jar /opt/hadoop-ozone-filesystem-hadoop2.jar
 ```
 
 ```bash
-docker build -t myrepo/spark-ozone
+docker build -t myrepo/spark-ozone .
 ```
 
-For remote kubernetes cluster you may need to push it:
+For a remote Kubernetes cluster you may need to push the image:
 
 ```bash
 docker push myrepo/spark-ozone
@@ -133,8 +127,8 @@ kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount
 
 Execute the following spark-submit command, but change at least the following values:
 
- * the kubernetes master url (you can check your _~/.kube/config_ to find the actual value)
- * the kubernetes namespace (_yournamespace_ in this example)
+ * the Kubernetes master URL (you can check your _~/.kube/config_ to find the actual value)
+ * the Kubernetes namespace (_yournamespace_ in this example)
  * serviceAccountName (you can use the _spark_ value if you followed the previous steps)
  * container.image (in this example this is _myrepo/spark-ozone_. This is pushed to the registry in the previous steps)
 
@@ -149,9 +143,9 @@ bin/spark-submit \
     --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
     --conf spark.kubernetes.container.image=myrepo/spark-ozone \
     --conf spark.kubernetes.container.image.pullPolicy=Always \
-    --jars /opt/hadoop-ozone-filesystem-lib-legacy.jar \
+    --jars /opt/hadoop-ozone-filesystem-hadoop2.jar \
-    local:///opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar \
+    local:///opt/spark/examples/jars/spark-examples_2.11-2.4.6.jar \
-    o3fs://test.s3v/alice.txt
+    o3fs://test.s3v.ozone-om-0.ozone-om:9862/alice.txt
 ```
 
 Check the available `spark-word-count-...` pods with `kubectl get pod`
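
Reviewer sketch (not part of the patch): the recipe uses the hadoop2 jar because the tested Spark distribution, `spark-2.4.6-bin-hadoop2.7`, bundles Hadoop 2.7. The Python below shows one way to derive the jar pattern from the distribution name. It assumes the name follows the `-bin-hadoop<major>.<minor>` convention, and `ozonefs_jar_for_spark` is a hypothetical helper, not something shipped with Ozone or Spark.

```python
import re


def ozonefs_jar_for_spark(distribution: str) -> str:
    """Return the Ozone filesystem jar pattern for a Spark distribution name."""
    # Assumption: the bundled Hadoop version is encoded as "-bin-hadoop<major>".
    match = re.search(r"-bin-hadoop(\d+)", distribution)
    if match is None:
        raise ValueError(f"no bundled Hadoop version in: {distribution}")
    major = match.group(1)
    if major not in ("2", "3"):
        raise ValueError(f"unsupported Hadoop major version: {major}")
    # The patch names the jars hadoop-ozone-filesystem-hadoop2-* and -hadoop3-*.
    return f"hadoop-ozone-filesystem-hadoop{major}-*.jar"
```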

diff --git a/hadoop-hdds/docs/content/recipe/SparkOzoneFSK8S.zh.md b/hadoop-hdds/docs/content/recipe/SparkOzoneFSK8S.zh.md
@@ -30,7 +30,7 @@ summary: 如何在 K8s 上通过 Apache Spark 使用 Ozone ?
 
 ## 准备
 
-下载 Spark 和 Ozone 的最新发行包并解压，本方法使用 `spark-2.4.0-bin-hadoop2.7` 进行了测试。
+下载 Spark 和 Ozone 的最新发行包并解压，本方法使用 `spark-2.4.6-bin-hadoop2.7` 进行了测试。
 
 你还需要准备以下内容：
 
@@ -46,13 +46,13 @@ summary: 如何在 K8s 上通过 Apache Spark 使用 Ozone ?
 在 Spark 发行包中运行以下命令：
 
 ```bash
-./bin/docker-image-tool.sh -r myrepo -t 2.4.0 build
+./bin/docker-image-tool.sh -r myrepo -t 2.4.6 build
 ```
 
 _注意_: 如果你使用 Minikube，需要加上 `-m` 参数来使用 Minikube 镜像的 docker 进程。
 
 ```bash
-./bin/docker-image-tool.sh -m -r myrepo -t 2.4.0 build
+./bin/docker-image-tool.sh -m -r myrepo -t 2.4.6 build
 ```
 
 `./bin/docker-image-tool.sh` 是 Spark 用来创建镜像的官方工具，上面的步骤会创建多个名为 _myrepo/spark_ 的 Spark 镜像，其中的第一个镜像用作接下来步骤的基础镜像。
@@ -67,38 +67,32 @@ _注意_: 如果你使用 Minikube，需要加上 `-m` 参数来使用 Minikube
 kubectl cp om-0:/opt/hadoop/etc/hadoop/ozone-site.xml .
 ```
 
 然后创建一个包含以下内容的 core-site.xml：
 
 ```xml
 <configuration>
-    <property>
-        <name>fs.o3fs.impl</name>
-        <value>org.apache.hadoop.fs.ozone.BasicOzoneFileSystem</value>
-    </property>
     <property>
         <name>fs.AbstractFileSystem.o3fs.impl</name>
         <value>org.apache.hadoop.fs.ozone.OzFs</value>
      </property>
 </configuration>
 ```
 
-_注意_: 你也可以使用不带 `Basic` 前缀的 `org.apache.hadoop.fs.ozone.OzoneFileSystem`，带 `Basic` 的版本不支持 FS statistics 和加密空间，但可以兼容旧版本的 Hadoop。
-
-从 Ozone 目录中拷贝 `ozonefs.jar`（__使用 legacy 版本！__）
+从 Ozone 目录中拷贝 `ozonefs.jar`（__使用 hadoop2 版本！__）
 
 ```
-kubectl cp om-0:/opt/hadoop/share/ozone/lib/hadoop-ozone-filesystem-lib-legacy-0.4.0-SNAPSHOT.jar .
+kubectl cp om-0:/opt/hadoop/share/ozone/lib/hadoop-ozone-filesystem-hadoop2-VERSION.jar hadoop-ozone-filesystem-hadoop2.jar
 ```
 
 
 编写新的 Dockerfile 并构建镜像：
 ```
-FROM myrepo/spark:2.4.0
+FROM myrepo/spark:2.4.6
 ADD core-site.xml /opt/hadoop/conf/core-site.xml
 ADD ozone-site.xml /opt/hadoop/conf/ozone-site.xml
 ENV HADOOP_CONF_DIR=/opt/hadoop/conf
 ENV SPARK_EXTRA_CLASSPATH=/opt/hadoop/conf
-ADD hadoop-ozone-filesystem-lib-legacy-0.4.0-SNAPSHOT.jar /opt/hadoop-ozone-filesystem-lib-legacy.jar
+ADD hadoop-ozone-filesystem-hadoop2.jar /opt/hadoop-ozone-filesystem-hadoop2.jar
 ```
 
 ```bash
@@ -119,14 +113,6 @@ docker push myrepo/spark-ozone
 kubectl port-forward s3g-0 9878:9878
 aws s3api --endpoint http://localhost:9878 create-bucket --bucket=test
 aws s3api --endpoint http://localhost:9878 put-object --bucket test --key alice.txt --body /tmp/alice.txt
-kubectl exec -it scm-0 ozone s3 path test
-```
-
-最后一个命令的输出形如：
-
-```
-Volume name for S3Bucket is : s3asdlkjqiskjdsks
-Ozone FileSystem Uri is : o3fs://test.s3asdlkjqiskjdsks
 ```
 
 记下 Ozone 文件系统的 URI，在接下来的 spark-submit 命令中会用到它。
@@ -158,9 +144,9 @@ bin/spark-submit \
     --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
     --conf spark.kubernetes.container.image=myrepo/spark-ozone \
     --conf spark.kubernetes.container.image.pullPolicy=Always \
-    --jars /opt/hadoop-ozone-filesystem-lib-legacy.jar \
+    --jars /opt/hadoop-ozone-filesystem-hadoop2.jar \
-    local:///opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar \
+    local:///opt/spark/examples/jars/spark-examples_2.11-2.4.6.jar \
-    o3fs://bucket.volume/alice.txt
+    o3fs://test.s3v.ozone-om-0.ozone-om:9862/alice.txt
 ```
 
 使用 `kubectl get pod` 命令查看可用的 `spark-word-count-...` pod。