~~ Licensed under the Apache License, Version 2.0 (the "License");
~~ you may not use this file except in compliance with the License.
~~ You may obtain a copy of the License at
~~
~~   http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing, software
~~ distributed under the License is distributed on an "AS IS" BASIS,
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~~ See the License for the specific language governing permissions and
~~ limitations under the License.

  ---
  Hadoop HDFS over HTTP - Documentation Sets ${project.version}
  ---
  ---
  ${maven.build.timestamp}

Hadoop HDFS over HTTP - Documentation Sets ${project.version}

  HttpFS is a server that provides a REST HTTP gateway supporting all HDFS
  File System operations (read and write). It is interoperable with the
  <<webhdfs>> REST HTTP API.
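
  Because the API is the same, a <<webhdfs>> client can switch between going
  directly to the NameNode and going through HttpFS by changing only the URL.
  A minimal sketch, assuming pseudo authentication; the host names, the HttpFS
  port 14000 and the NameNode HTTP port are placeholders for your deployment:

+---+
# file status directly from the NameNode webhdfs endpoint
$ curl "http://namenode-host:50070/webhdfs/v1/user/foo?op=GETFILESTATUS&user.name=foo"

# the same request through the HttpFS gateway
$ curl "http://httpfs-host:14000/webhdfs/v1/user/foo?op=GETFILESTATUS&user.name=foo"
+---+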

  HttpFS can be used to transfer data between clusters running different
  versions of Hadoop (overcoming RPC versioning issues), for example using
  Hadoop DistCp.
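
  For example, a copy can be pulled over HTTP through HttpFS using the
  <<webhdfs>> client, avoiding RPC version mismatches between the two
  clusters. A sketch, assuming an HttpFS server at <<<httpfs-host:14000>>>
  in front of the source cluster; hosts, ports and paths are placeholders:

+---+
# run DistCp on the destination cluster, reading the source data
# through the HttpFS gateway
$ hadoop distcp webhdfs://httpfs-host:14000/user/foo/data \
    hdfs://dest-namenode:8020/user/foo/data
+---+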

  HttpFS can be used to access data in HDFS on a cluster behind a firewall
  (the HttpFS server acts as a gateway and is the only system that is allowed
  to cross the firewall into the cluster).

  HttpFS can be used to access data in HDFS using HTTP utilities (such as
  curl and wget) and HTTP libraries from languages other than Java (Perl,
  for example).
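
  For instance, with pseudo authentication any HTTP client can drive HDFS
  operations by passing the caller in the <<<user.name>>> query parameter.
  A sketch; host, paths and user are placeholders:

+---+
# list a directory with curl
$ curl "http://httpfs-host:14000/webhdfs/v1/tmp?op=LISTSTATUS&user.name=foo"

# download a file with wget
$ wget "http://httpfs-host:14000/webhdfs/v1/user/foo/data.txt?op=OPEN&user.name=foo" \
    -O data.txt
+---+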

  The <<webhdfs>> client FileSystem implementation can be used to access
  HttpFS using the Hadoop filesystem command line tool (<<<hadoop fs>>>) as
  well as from Java applications using the Hadoop FileSystem Java API.
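
  For example, pointing the <<webhdfs>> client at the HttpFS port makes the
  usual shell commands work unchanged. A sketch; the host is a placeholder:

+---+
# list and read HDFS content through HttpFS with the webhdfs client
$ hadoop fs -ls webhdfs://httpfs-host:14000/user/foo
$ hadoop fs -cat webhdfs://httpfs-host:14000/user/foo/README.txt
+---+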

  HttpFS has built-in security supporting Hadoop pseudo authentication,
  HTTP SPNEGO Kerberos and other pluggable authentication mechanisms. It
  also provides Hadoop proxy user support.
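
  For example, with pseudo authentication a configured proxy user can perform
  operations on behalf of another user through the <<<doas>>> query parameter.
  A sketch; the proxy user must be allowed in the HttpFS configuration, and
  all names are placeholders:

+---+
# 'proxyadmin' authenticates, but the operation runs as user 'bar'
$ curl "http://httpfs-host:14000/webhdfs/v1/user/bar?op=LISTSTATUS&user.name=proxyadmin&doas=bar"
+---+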

* How Does HttpFS Work?

  HttpFS is a separate service from the Hadoop NameNode.

  HttpFS itself is a Java web application and it runs using a preconfigured
  Tomcat bundled with the HttpFS binary distribution.

  HttpFS HTTP web-service API calls are HTTP REST calls that map to an HDFS
  file system operation. For example, using the <<<curl>>> Unix command:

  * <<<$ curl http://httpfs-host:14000/webhdfs/v1/user/foo/README.txt?op=OPEN>>>
  returns the contents of the HDFS <<</user/foo/README.txt>>> file.

  * <<<$ curl http://httpfs-host:14000/webhdfs/v1/user/foo?op=LISTSTATUS>>>
  returns the contents of the HDFS <<</user/foo>>> directory in JSON format.

  * <<<$ curl -X PUT http://httpfs-host:14000/webhdfs/v1/user/foo/bar?op=MKDIRS>>>
  creates the HDFS <<</user/foo/bar>>> directory.
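
  When HttpFS runs with pseudo authentication, the caller identity goes in
  the <<<user.name>>> query parameter. The same operations again, spelled
  out as full commands; host, port and user are placeholders:

+---+
$ curl "http://httpfs-host:14000/webhdfs/v1/user/foo/README.txt?op=OPEN&user.name=foo"
$ curl "http://httpfs-host:14000/webhdfs/v1/user/foo?op=LISTSTATUS&user.name=foo"
$ curl -X PUT "http://httpfs-host:14000/webhdfs/v1/user/foo/bar?op=MKDIRS&user.name=foo"
+---+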

* How Do HttpFS and Hadoop HDFS Proxy Differ?

  HttpFS was inspired by Hadoop HDFS proxy.

  HttpFS can be seen as a full rewrite of Hadoop HDFS proxy.

  Hadoop HDFS proxy provides a subset of file system operations (read only),
  while HttpFS provides support for all file system operations.

  HttpFS uses a clean HTTP REST API making its use with HTTP tools more
  intuitive.

  HttpFS supports Hadoop pseudo authentication, Kerberos SPNEGO authentication
  and Hadoop proxy users. Hadoop HDFS proxy did not.
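
  With Kerberos enabled, a client holding a ticket can authenticate to HttpFS
  via SPNEGO; <<<curl>>> can do this when built with GSS support. A sketch;
  the principal and host are placeholders:

+---+
# obtain a Kerberos ticket, then let curl negotiate SPNEGO
$ kinit foo@EXAMPLE.COM
$ curl --negotiate -u : "http://httpfs-host:14000/webhdfs/v1/user/foo?op=LISTSTATUS"
+---+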

* User and Developer Documentation

  * {{{./ServerSetup.html}HttpFS Server Setup}}

  * {{{./UsingHttpTools.html}Using HTTP Tools}}

* Current Limitations

  <<<GETDELEGATIONTOKEN>>>, <<<RENEWDELEGATIONTOKEN>>> and
  <<<CANCELDELEGATIONTOKEN>>> operations are not supported.