ZEPPELIN-198 HDFS File Interpreter #752
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,84 @@ | ||
| --- | ||
| layout: page | ||
| title: "HDFS File Interpreter" | ||
| description: "" | ||
| group: manual | ||
| --- | ||
| {% include JB/setup %} | ||
| .add(HDFS_USER, "hdfs", "The WebHDFS user") | ||
| .add(HDFS_MAXLENGTH, "1000", "Maximum number of lines of results fetched").build()); | ||
|
|
||
| ## HDFS File Interpreter for Apache Zeppelin | ||
|
|
||
| [Hadoop File System](http://hadoop.apache.org/) is a distributed, fault-tolerant file system that is part of the Hadoop project. It is often used as storage for distributed processing engines like [Hadoop MapReduce](http://hadoop.apache.org/) and [Apache Spark](http://spark.apache.org/), or as the underlying file system for systems like [Alluxio](http://www.alluxio.org/). | ||
|
|
||
| ## Configuration | ||
| <table class="table-configuration"> | ||
| <tr> | ||
| <th>Property</th> | ||
| <th>Default</th> | ||
| <th>Description</th> | ||
| </tr> | ||
| <tr> | ||
| <td>hdfs.url</td> | ||
| <td>http://localhost:50070/webhdfs/v1/</td> | ||
| <td>The URL for WebHDFS</td> | ||
| </tr> | ||
| <tr> | ||
| <td>hdfs.user</td> | ||
| <td>hdfs</td> | ||
| <td>The WebHDFS user</td> | ||
| </tr> | ||
| <tr> | ||
| <td>hdfs.maxlength</td> | ||
| <td>1000</td> | ||
| <td>Maximum number of lines of results fetched</td> | ||
| </tr> | ||
| </table> | ||
|
|
||
| <br/> | ||
| This interpreter connects to HDFS using the HTTP WebHDFS interface. | ||
| It supports the basic shell file commands applied to HDFS. Currently, it only supports browsing. | ||
|
Contributor
Hi @runyontr : )
||
| * You can use <i>ls [PATH]</i> and <i>ls -l [PATH]</i> to list a directory. If the path is missing, then the current directory is listed. <i>ls</i> supports a <i>-h</i> flag for human-readable file sizes. | ||
|
||
| * You can use <i>cd [PATH]</i> to change your current directory by giving a relative or an absolute path. | ||
| * You can invoke <i>pwd</i> to see your current directory. | ||
|
|
||
| > **Tip :** Use ( Ctrl + . ) for autocompletion. | ||
|
|
||
| ### Create Interpreter | ||
|
|
||
| In a notebook, to enable the **HDFS** interpreter, click the **Gear** icon and select **HDFS**. | ||
|
|
||
|
|
||
| ### Configuration | ||
| You can modify the configuration of HDFS from the `Interpreter` section. The HDFS interpreter exposes the following properties: | ||
|
|
||
| <table class="table-configuration"> | ||
|
Contributor
Do we need this property table too? It's duplicated with the above one.
||
| <tr> | ||
| <th>Property Name</th> | ||
| <th>Description</th> | ||
| <th>Default Value</th> | ||
| </tr> | ||
| <tr> | ||
| <td>hdfs.url</td> | ||
| <td>The URL for WebHDFS</td> | ||
| <td>http://localhost:50070/webhdfs/v1/</td> | ||
| </tr> | ||
| <tr> | ||
| <td>hdfs.user</td> | ||
| <td>The WebHDFS user</td> | ||
| <td>hdfs</td> | ||
| </tr> | ||
| <tr> | ||
| <td>hdfs.maxlength</td> | ||
| <td>Maximum number of lines of results fetched</td> | ||
| <td>1000</td> | ||
| </tr> | ||
| </table> | ||
|
|
||
|
|
||
| #### WebHDFS REST API | ||
| You can confirm that you're able to access the WebHDFS API by running a curl command against the WebHDFS endpoint provided to the interpreter. | ||
|
|
||
| Here is an example: | ||
| $> curl "http://localhost:50070/webhdfs/v1/?op=LISTSTATUS" | ||
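
As a rough sketch of how the `hdfs.url` and `hdfs.user` settings from the configuration table could combine into a request like the curl example above, the helper below builds a `LISTSTATUS` URL for a given path. The function name `webhdfs_url` is hypothetical, and the sketch assumes the configured user is passed through the WebHDFS `user.name` query parameter (simple authentication). How the interpreter itself builds its requests is not shown in this diff.

```shell
# Hypothetical helper: build a WebHDFS LISTSTATUS URL.
#   $1 = base URL (the hdfs.url property, e.g. http://localhost:50070/webhdfs/v1/)
#   $2 = absolute HDFS path (e.g. /tmp)
#   $3 = user name (the hdfs.user property; assumed to map to user.name)
webhdfs_url() {
  # Drop one trailing slash from the base so it joins cleanly with the path.
  webhdfs_base="${1%/}"
  printf '%s%s?op=LISTSTATUS&user.name=%s\n' "$webhdfs_base" "$2" "$3"
}
```

With the default settings, `curl "$(webhdfs_url "http://localhost:50070/webhdfs/v1/" "/tmp" "hdfs")"` would request a listing of `/tmp` as the `hdfs` user, provided the cluster accepts simple authentication.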
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,146 @@ | ||
| <?xml version="1.0" encoding="UTF-8"?> | ||
| <!-- | ||
| ~ Licensed to the Apache Software Foundation (ASF) under one or more | ||
| ~ contributor license agreements. See the NOTICE file distributed with | ||
| ~ this work for additional information regarding copyright ownership. | ||
| ~ The ASF licenses this file to You under the Apache License, Version 2.0 | ||
| ~ (the "License"); you may not use this file except in compliance with | ||
| ~ the License. You may obtain a copy of the License at | ||
| ~ | ||
| ~ http://www.apache.org/licenses/LICENSE-2.0 | ||
| ~ | ||
| ~ Unless required by applicable law or agreed to in writing, software | ||
| ~ distributed under the License is distributed on an "AS IS" BASIS, | ||
| ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| ~ See the License for the specific language governing permissions and | ||
| ~ limitations under the License. | ||
| --> | ||
|
|
||
| <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd"> | ||
| <modelVersion>4.0.0</modelVersion> | ||
|
|
||
| <parent> | ||
| <artifactId>zeppelin</artifactId> | ||
| <groupId>org.apache.zeppelin</groupId> | ||
| <version>0.6.0-incubating-SNAPSHOT</version> | ||
| </parent> | ||
|
|
||
| <groupId>org.apache.zeppelin</groupId> | ||
| <artifactId>zeppelin-file</artifactId> | ||
| <packaging>jar</packaging> | ||
| <version>0.6.0-incubating-SNAPSHOT</version> | ||
| <name>Zeppelin File System Interpreters</name> | ||
| <url>http://www.apache.org</url> | ||
|
|
||
| <dependencies> | ||
| <dependency> | ||
| <groupId>org.apache.zeppelin</groupId> | ||
| <artifactId>zeppelin-interpreter</artifactId> | ||
| <version>${project.version}</version> | ||
| <scope>provided</scope> | ||
| </dependency> | ||
|
|
||
| <dependency> | ||
| <groupId>javax.ws.rs</groupId> | ||
| <artifactId>javax.ws.rs-api</artifactId> | ||
| <version>2.0</version> | ||
| </dependency> | ||
|
|
||
| <dependency> | ||
| <groupId>org.slf4j</groupId> | ||
| <artifactId>slf4j-api</artifactId> | ||
| </dependency> | ||
|
|
||
| <dependency> | ||
| <groupId>org.slf4j</groupId> | ||
| <artifactId>slf4j-log4j12</artifactId> | ||
| </dependency> | ||
|
|
||
| <dependency> | ||
| <groupId>org.glassfish.jersey.core</groupId> | ||
| <artifactId>jersey-common</artifactId> | ||
| <version>2.22.2</version> | ||
| </dependency> | ||
|
Member
Licenses of new binary dependencies (including transitive dependencies) need to be addressed in zeppelin-distribution/src/bin_license/LICENSE and/or zeppelin-distribution/src/bin_license/NOTICE. Could you take care of them? Here is the dependency tree.
Contributor
Author
I pushed up a new commit last night. Could you please check to make sure the format of the license change is correct?
||
|
|
||
|
|
||
| <dependency> | ||
| <groupId>junit</groupId> | ||
| <artifactId>junit</artifactId> | ||
| <scope>test</scope> | ||
| </dependency> | ||
| </dependencies> | ||
|
|
||
| <build> | ||
| <plugins> | ||
| <plugin> | ||
| <groupId>org.apache.maven.plugins</groupId> | ||
| <artifactId>maven-deploy-plugin</artifactId> | ||
| <version>2.7</version> | ||
| <configuration> | ||
| <skip>true</skip> | ||
| </configuration> | ||
| </plugin> | ||
|
|
||
| <plugin> | ||
| <groupId>org.apache.maven.plugins</groupId> | ||
| <artifactId>maven-surefire-plugin</artifactId> | ||
| <version>2.18.1</version> | ||
| </plugin> | ||
|
|
||
| <plugin> | ||
| <artifactId>maven-enforcer-plugin</artifactId> | ||
| <version>1.3.1</version> | ||
| <executions> | ||
| <execution> | ||
| <id>enforce</id> | ||
| <phase>none</phase> | ||
| </execution> | ||
| </executions> | ||
| </plugin> | ||
|
|
||
| <plugin> | ||
| <artifactId>maven-dependency-plugin</artifactId> | ||
| <version>2.8</version> | ||
| <executions> | ||
| <execution> | ||
| <id>copy-dependencies</id> | ||
| <phase>package</phase> | ||
| <goals> | ||
| <goal>copy-dependencies</goal> | ||
| </goals> | ||
| <configuration> | ||
| <outputDirectory>${project.build.directory}/../../interpreter/file</outputDirectory> | ||
| <overWriteReleases>false</overWriteReleases> | ||
| <overWriteSnapshots>false</overWriteSnapshots> | ||
| <overWriteIfNewer>true</overWriteIfNewer> | ||
| <includeScope>runtime</includeScope> | ||
| </configuration> | ||
| </execution> | ||
| <execution> | ||
| <id>copy-artifact</id> | ||
| <phase>package</phase> | ||
| <goals> | ||
| <goal>copy</goal> | ||
| </goals> | ||
| <configuration> | ||
| <outputDirectory>${project.build.directory}/../../interpreter/file</outputDirectory> | ||
| <overWriteReleases>false</overWriteReleases> | ||
| <overWriteSnapshots>false</overWriteSnapshots> | ||
| <overWriteIfNewer>true</overWriteIfNewer> | ||
| <!--<includeScope>runtime</includeScope>--> | ||
| <artifactItems> | ||
| <artifactItem> | ||
| <groupId>${project.groupId}</groupId> | ||
| <artifactId>${project.artifactId}</artifactId> | ||
| <version>${project.version}</version> | ||
| <type>${project.packaging}</type> | ||
| </artifactItem> | ||
| </artifactItems> | ||
| </configuration> | ||
| </execution> | ||
| </executions> | ||
| </plugin> | ||
| </plugins> | ||
| </build> | ||
|
|
||
| </project> | ||


Is it intended to be here?
No, it's absolutely not supposed to be there.