
~~ Licensed under the Apache License, Version 2.0 (the "License");
~~ you may not use this file except in compliance with the License.
~~ You may obtain a copy of the License at
~~
~~ http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing, software
~~ distributed under the License is distributed on an "AS IS" BASIS,
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~~ See the License for the specific language governing permissions and
~~ limitations under the License.

  ---
  Hadoop HDFS over HTTP - Documentation Sets ${project.version}
  ---
  ---
  ${maven.build.timestamp}

Hadoop HDFS over HTTP - Documentation Sets ${project.version}

  HttpFS is a server that provides a REST HTTP gateway supporting all HDFS
  File System operations (read and write), and it is interoperable with the
  <<webhdfs>> REST HTTP API.

  HttpFS can be used to transfer data between clusters running different
  versions of Hadoop (overcoming RPC versioning issues), for example using
  Hadoop DistCp.

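  As a minimal sketch of such a transfer (the host names and paths below are
  placeholders, and the HttpFS server is assumed to listen on its default
  port 14000), DistCp can read from the remote cluster through its HttpFS
  gateway using the <<<webhdfs://>>> scheme:

+---+
# Copy /user/foo from a remote cluster, reached only through its HttpFS
# gateway, into this cluster's HDFS. Host names and paths are placeholders.
$ hadoop distcp webhdfs://httpfs-host:14000/user/foo \
    hdfs://namenode-host:8020/user/foo
+---+
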
  HttpFS can be used to access data in HDFS on a cluster behind a firewall
  (the HttpFS server acts as a gateway and is the only system that is allowed
  to cross the firewall into the cluster).

  HttpFS can be used to access data in HDFS using HTTP utilities (such as curl
  and wget) and HTTP libraries from languages other than Java (for example,
  Perl).

  The <<webhdfs>> client FileSystem implementation can be used to access HttpFS
  using the Hadoop filesystem command line tool (<<<hadoop fs>>>) as well as
  from Java applications using the Hadoop FileSystem Java API.

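  For example (a sketch only; <<<httpfs-host>>> and the paths are
  placeholders), the <<<hadoop fs>>> tool can address an HttpFS server
  directly through a <<<webhdfs://>>> URI:

+---+
# List a directory and read a file through the HttpFS gateway, using the
# webhdfs client FileSystem. Host name and paths are placeholders.
$ hadoop fs -ls webhdfs://httpfs-host:14000/user/foo
$ hadoop fs -cat webhdfs://httpfs-host:14000/user/foo/README.txt
+---+
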
  HttpFS has built-in security supporting Hadoop pseudo authentication,
  HTTP SPNEGO Kerberos authentication and other pluggable authentication
  mechanisms. It also provides Hadoop proxy user support.

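  As a sketch of what this looks like from an HTTP client (host name, user and
  path are placeholders; see the {{{./UsingHttpTools.html}Using HTTP Tools}}
  page for details):

+---+
# Pseudo authentication: identify the caller with the user.name parameter.
$ curl "http://httpfs-host:14000/webhdfs/v1/user/foo?op=LISTSTATUS&user.name=foo"

# Kerberos SPNEGO authentication: negotiate using the caller's Kerberos
# credentials and keep the signed authentication cookie for reuse.
$ curl --negotiate -u : -b ~/.httpfsauth -c ~/.httpfsauth \
    "http://httpfs-host:14000/webhdfs/v1/user/foo?op=LISTSTATUS"
+---+
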
* How Does HttpFS Work?

  HttpFS is a separate service from the Hadoop NameNode.

  HttpFS itself is a Java web application and it runs using a preconfigured
  Tomcat bundled with the HttpFS binary distribution.

  HttpFS HTTP web-service API calls are HTTP REST calls that map to an HDFS
  file system operation. For example, using the <<<curl>>> Unix command:

  * <<<$ curl http://httpfs-host:14000/webhdfs/v1/user/foo/README.txt>>>
  returns the contents of the HDFS <<</user/foo/README.txt>>> file.

  * <<<$ curl http://httpfs-host:14000/webhdfs/v1/user/foo?op=LISTSTATUS>>>
  returns the contents of the HDFS <<</user/foo>>> directory in JSON format.

  * <<<$ curl -X POST http://httpfs-host:14000/webhdfs/v1/user/foo/bar?op=MKDIRS>>>
  creates the HDFS <<</user/foo/bar>>> directory.

* How Do HttpFS and Hadoop HDFS Proxy Differ?

  HttpFS was inspired by Hadoop HDFS proxy, and it can be seen as a full
  rewrite of Hadoop HDFS proxy.

  Hadoop HDFS proxy provides a subset of file system operations (read only),
  while HttpFS provides support for all file system operations.

  HttpFS uses a clean HTTP REST API, making its use with HTTP tools more
  intuitive.

  HttpFS supports Hadoop pseudo authentication, Kerberos SPNEGO authentication
  and Hadoop proxy users. Hadoop HDFS proxy did not.

* User and Developer Documentation

  * {{{./ServerSetup.html}HttpFS Server Setup}}

  * {{{./UsingHttpTools.html}Using HTTP Tools}}

* Current Limitations

  The <<<GETDELEGATIONTOKEN>>>, <<<RENEWDELEGATIONTOKEN>>> and
  <<<CANCELDELEGATIONTOKEN>>> operations are not supported.