
~~ Licensed under the Apache License, Version 2.0 (the "License");
~~ you may not use this file except in compliance with the License.
~~ You may obtain a copy of the License at
~~
~~   http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing, software
~~ distributed under the License is distributed on an "AS IS" BASIS,
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~~ See the License for the specific language governing permissions and
~~ limitations under the License. See accompanying LICENSE file.

  ---
  Native Libraries Guide
  ---
  ---
  ${maven.build.timestamp}

Native Libraries Guide

%{toc|section=1|fromDepth=0}
* Overview

  This guide describes the native hadoop library and includes a small
  discussion about native shared libraries.

  Note: Depending on your environment, the term "native libraries" could
  refer to all *.so's you need to compile; and, the term "native
  compression" could refer to all *.so's you need to compile that are
  specifically related to compression. Currently, however, this document
  only addresses the native hadoop library (<<<libhadoop.so>>>). The
  document for the libhdfs library (<<<libhdfs.so>>>) is
  {{{../hadoop-hdfs/LibHdfs.html}here}}.
* Native Hadoop Library

  Hadoop has native implementations of certain components for performance
  reasons and because Java implementations are not available. These
  components are available in a single, dynamically-linked native library
  called the native hadoop library. On the *nix platforms the library is
  named <<<libhadoop.so>>>.
* Usage

  It is fairly easy to use the native hadoop library:

  [[1]] Review the components.

  [[2]] Review the supported platforms.

  [[3]] Either download a hadoop release, which will include a pre-built
        version of the native hadoop library, or build your own version of
        the native hadoop library. Whether you download or build, the name
        for the library is the same: libhadoop.so

  [[4]] Install the compression codec development packages (>zlib-1.2,
        >gzip-1.2):

        * If you download the library, install one or more development
          packages - whichever compression codecs you want to use with
          your deployment.

        * If you build the library, it is mandatory to install both
          development packages.

  [[5]] Check the runtime log files.
* Components

  The native hadoop library includes various components:

  * Compression Codecs (bzip2, lz4, snappy, zlib)

  * Native IO utilities for {{{../hadoop-hdfs/ShortCircuitLocalReads.html}
    HDFS Short-Circuit Local Reads}} and
    {{{../hadoop-hdfs/CentralizedCacheManagement.html}Centralized Cache
    Management in HDFS}}

  * CRC32 checksum implementation
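  The CRC32 component accelerates the checksums that Hadoop otherwise
  computes in pure Java. As a rough illustration of what the native code
  replaces (this is a standalone JDK sketch, not Hadoop's actual
  implementation), the equivalent pure-Java checksum can be computed with
  <<<java.util.zip.CRC32>>>:

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class Crc32Demo {
    public static void main(String[] args) {
        // Pure-Java CRC32 from the JDK; libhadoop provides a native
        // equivalent that Hadoop can use instead for speed.
        CRC32 crc = new CRC32();
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        crc.update(data, 0, data.length);
        // The standard CRC-32 check value for "123456789" is 0xcbf43926.
        System.out.println(Long.toHexString(crc.getValue()));
    }
}
```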
* Supported Platforms

  The native hadoop library is supported on *nix platforms only. The
  library does not work with Cygwin or the Mac OS X platform.

  The native hadoop library is mainly used on the GNU/Linux platform and
  has been tested on these distributions:

  * RHEL4/Fedora

  * Ubuntu

  * Gentoo

  On all the above distributions a 32/64 bit native hadoop library will
  work with a respective 32/64 bit jvm.
* Download

  The pre-built 32-bit i386-Linux native hadoop library is available as
  part of the hadoop distribution and is located in the <<<lib/native>>>
  directory. You can download the hadoop distribution from Hadoop Common
  Releases.

  Be sure to install the zlib and/or gzip development packages -
  whichever compression codecs you want to use with your deployment.
* Build

  The native hadoop library is written in ANSI C and is built using the
  GNU autotools-chain (autoconf, autoheader, automake, autoscan,
  libtool). This means it should be straightforward to build the library
  on any platform with a standards-compliant C compiler and the GNU
  autotools-chain (see the supported platforms).

  The packages you need to install on the target platform are:

  * C compiler (e.g. GNU C Compiler)

  * GNU Autotools Chain: autoconf, automake, libtool

  * zlib-development package (stable version >= 1.2.0)

  * openssl-development package (e.g. libssl-dev)

  Once you have installed the prerequisite packages, use the standard
  hadoop pom.xml file and pass along the native flag to build the native
  hadoop library:

----
$ mvn package -Pdist,native -DskipTests -Dtar
----

  You should see the newly-built library in:

----
$ hadoop-dist/target/hadoop-${project.version}/lib/native
----

  Please note the following:

  * It is mandatory to install both the zlib and gzip development
    packages on the target platform in order to build the native hadoop
    library; however, for deployment it is sufficient to install just
    one package if you wish to use only one codec.

  * It is necessary to have the correct 32/64 libraries for zlib,
    depending on the 32/64 bit jvm for the target platform, in order to
    build and deploy the native hadoop library.
* Runtime

  The bin/hadoop script ensures that the native hadoop library is on the
  library path via the system property:
  <<<-Djava.library.path=<path> >>>
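  To see which directories a particular JVM will actually search, you can
  print the property yourself. This standalone snippet is illustrative
  and not part of Hadoop:

```java
import java.io.File;

public class LibraryPathDemo {
    public static void main(String[] args) {
        // Directories searched by System.loadLibrary(), separated by
        // the platform path separator (':' on *nix).
        String path = System.getProperty("java.library.path");
        for (String dir : path.split(File.pathSeparator)) {
            System.out.println(dir);
        }
    }
}
```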
  During runtime, check the hadoop log files for your MapReduce tasks.

  * If everything is all right, then:

    <<<DEBUG util.NativeCodeLoader - Trying to load the custom-built native-hadoop library...>>>

    <<<INFO util.NativeCodeLoader - Loaded the native-hadoop library>>>

  * If something goes wrong, then:

    <<<INFO util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable>>>
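  The messages above come from Hadoop's NativeCodeLoader, which attempts
  <<<System.loadLibrary>>> and falls back to the built-in Java
  implementations when that fails. A minimal standalone sketch of the
  same pattern (this is not Hadoop's actual loader, just the idiom):

```java
public class FallbackDemo {
    public static void main(String[] args) {
        boolean nativeLoaded;
        try {
            // "hadoop" resolves to libhadoop.so on Linux; loadLibrary
            // throws UnsatisfiedLinkError if the library is not found
            // on java.library.path.
            System.loadLibrary("hadoop");
            nativeLoaded = true;
        } catch (UnsatisfiedLinkError e) {
            nativeLoaded = false;
        }
        System.out.println(nativeLoaded
            ? "Loaded the native-hadoop library"
            : "Unable to load native-hadoop library... "
              + "using builtin-java classes where applicable");
    }
}
```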
* Native Shared Libraries

  You can load any native shared library using DistributedCache for
  distributing and symlinking the library files.

  This example shows you how to distribute a shared library, mylib.so,
  and load it from a MapReduce task.

  [[1]] First copy the library to HDFS:

        <<<bin/hadoop fs -copyFromLocal mylib.so.1 /libraries/mylib.so.1>>>

  [[2]] The job launching program should contain the following:

        <<<DistributedCache.createSymlink(conf);>>>
        <<<DistributedCache.addCacheFile("hdfs://host:port/libraries/mylib.so.1#libmylib.so", conf);>>>

  [[3]] The MapReduce task can contain:

        <<<System.loadLibrary("mylib");>>>
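  Note that <<<System.loadLibrary>>> takes the bare library name; the JVM
  expands it to a platform-specific filename, which you can inspect with
  <<<System.mapLibraryName>>>. A small standalone illustration (the name
  "mylib" is just the example library above):

```java
public class MapLibraryNameDemo {
    public static void main(String[] args) {
        // On Linux this prints "libmylib.so"; on other platforms the
        // prefix and extension differ (e.g. "libmylib.dylib" on Mac OS X).
        System.out.println(System.mapLibraryName("mylib"));
    }
}
```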
  Note: If you downloaded or built the native hadoop library, you don't
  need to use DistributedCache to make the library available to your
  MapReduce tasks.