- ~~ Licensed under the Apache License, Version 2.0 (the "License");
- ~~ you may not use this file except in compliance with the License.
- ~~ You may obtain a copy of the License at
- ~~
- ~~ http://www.apache.org/licenses/LICENSE-2.0
- ~~
- ~~ Unless required by applicable law or agreed to in writing, software
- ~~ distributed under the License is distributed on an "AS IS" BASIS,
- ~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- ~~ See the License for the specific language governing permissions and
- ~~ limitations under the License. See accompanying LICENSE file.
- ---
- Native Libraries Guide
- ---
- ---
- ${maven.build.timestamp}
- Native Libraries Guide
- %{toc|section=1|fromDepth=0}
- * Overview
- This guide describes the native hadoop library and includes a small
- discussion about native shared libraries.
- Note: Depending on your environment, the term "native libraries" could
- refer to all *.so's you need to compile; and, the term "native
- compression" could refer to all *.so's you need to compile that are
- specifically related to compression. Currently, however, this document
- only addresses the native hadoop library (<<<libhadoop.so>>>).
- The documentation for the libhdfs library (<<<libhdfs.so>>>) is
- {{{../hadoop-hdfs/LibHdfs.html}here}}.
- * Native Hadoop Library
- Hadoop has native implementations of certain components, either for
- performance reasons or because no Java implementation is available. These
- components are available in a single, dynamically-linked native library
- called the native hadoop library. On the *nix platforms the library is
- named <<<libhadoop.so>>>.
- * Usage
- It is fairly easy to use the native hadoop library:
- [[1]] Review the components.
- [[2]] Review the supported platforms.
- [[3]] Either download a hadoop release, which will include a pre-built
- version of the native hadoop library, or build your own version of
- the native hadoop library. Whether you download or build, the name
- for the library is the same: <<<libhadoop.so>>>.
- [[4]] Install the compression codec development packages (>zlib-1.2,
- >gzip-1.2):
- * If you download the library, install one or more development
- packages - whichever compression codecs you want to use with
- your deployment.
- * If you build the library, it is mandatory to install both
- development packages.
- [[5]] Check the runtime log files.
- * Components
- The native hadoop library includes various components:
- * Compression Codecs (bzip2, lz4, snappy, zlib)
- * Native IO utilities for {{{../hadoop-hdfs/ShortCircuitLocalReads.html}
- HDFS Short-Circuit Local Reads}} and
- {{{../hadoop-hdfs/CentralizedCacheManagement.html}Centralized Cache
- Management in HDFS}}
- * CRC32 checksum implementation
- * Supported Platforms
- The native hadoop library is supported on *nix platforms only. The
- library does not work with Cygwin or the Mac OS X platform.
- The native hadoop library is mainly used on the GNU/Linux platform and
- has been tested on these distributions:
- * RHEL4/Fedora
- * Ubuntu
- * Gentoo
- On all the above distributions, a 32-bit or 64-bit native hadoop library
- works with the corresponding 32-bit or 64-bit JVM.
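- Because the library and the JVM must agree on word size, it can help to
- confirm which kind of JVM is running before picking a build. The sketch
- below is only a best-effort check: <<<sun.arch.data.model>>> is a
- HotSpot-specific property that other JVMs may not set, and the class name
- <<<JvmArchCheck>>> and the fallback rules on <<<os.arch>>> are assumptions
- made for this illustration.

```java
final class JvmArchCheck {

    /**
     * Best-effort guess of the JVM data model: "32", "64", or "unknown".
     * sunArchDataModel is the value of the HotSpot-specific property
     * "sun.arch.data.model" (may be null); osArch is the value of "os.arch".
     */
    static String dataModel(String sunArchDataModel, String osArch) {
        // Prefer the explicit property when the JVM provides it.
        if ("32".equals(sunArchDataModel) || "64".equals(sunArchDataModel)) {
            return sunArchDataModel;
        }
        if (osArch == null) {
            return "unknown";
        }
        // Heuristic fallback: amd64, x86_64, aarch64, ppc64le, ...
        if (osArch.contains("64")) {
            return "64";
        }
        // Common 32-bit x86 names; anything else is left undecided.
        if (osArch.matches("x86|i[3-6]86")) {
            return "32";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        String model = dataModel(System.getProperty("sun.arch.data.model"),
                                 System.getProperty("os.arch"));
        System.out.println("JVM data model: " + model);
    }
}
```

- An "unknown" result means the heuristic could not decide, in which case
- the JVM vendor's documentation is the better source.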
- * Download
- The pre-built 32-bit i386-Linux native hadoop library is available as
- part of the hadoop distribution and is located in the <<<lib/native>>>
- directory. You can download the hadoop distribution from Hadoop Common
- Releases.
- Be sure to install the zlib and/or gzip development packages -
- whichever compression codecs you want to use with your deployment.
- * Build
- The native hadoop library is written in ANSI C and is built using the
- GNU autotools-chain (autoconf, autoheader, automake, autoscan,
- libtool). This means it should be straightforward to build the library
- on any platform with a standards-compliant C compiler and the GNU
- autotools-chain (see the supported platforms).
- The packages you need to install on the target platform are:
- * C compiler (e.g. GNU C Compiler)
- * GNU Autotools Chain: autoconf, automake, libtool
- * zlib-development package (stable version >= 1.2.0)
- * openssl-development package (e.g. libssl-dev)
- Once you have installed the prerequisite packages, use the standard
- hadoop pom.xml file and pass along the native flag to build the native
- hadoop library:
- ----
- $ mvn package -Pdist,native -DskipTests -Dtar
- ----
- You should see the newly-built library in:
- ----
- $ hadoop-dist/target/hadoop-${project.version}/lib/native
- ----
- Please note the following:
- * It is mandatory to install both the zlib and gzip development
- packages on the target platform in order to build the native hadoop
- library; however, for deployment it is sufficient to install just
- one package if you wish to use only one codec.
- * It is necessary to have the correct 32/64 libraries for zlib,
- depending on the 32/64 bit jvm for the target platform, in order to
- build and deploy the native hadoop library.
- * Runtime
- The bin/hadoop script ensures that the native hadoop library is on the
- library path via the system property:
- <<<-Djava.library.path=<path> >>>
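- To see what the JVM actually received, the property can be inspected
- from Java. The following is a minimal sketch, not part of Hadoop: the
- class <<<LibraryPathCheck>>> and its method <<<findOnPath>>> are
- hypothetical names. It splits <<<java.library.path>>> on the platform
- path separator and reports each directory that contains the file name
- produced by <<<System.mapLibraryName("hadoop")>>>.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class LibraryPathCheck {

    /** Returns every regular file named fileName found in the directories of libraryPath. */
    static List<File> findOnPath(String libraryPath, String fileName) {
        List<File> hits = new ArrayList<>();
        if (libraryPath == null || libraryPath.isEmpty()) {
            return hits;
        }
        // java.library.path uses the platform path separator (":" on *nix).
        for (String dir : libraryPath.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue;
            }
            File candidate = new File(dir, fileName);
            if (candidate.isFile()) {
                hits.add(candidate);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String libraryPath = System.getProperty("java.library.path");
        // Maps "hadoop" to the platform-specific file name (libhadoop.so on Linux).
        String fileName = System.mapLibraryName("hadoop");
        System.out.println("java.library.path = " + libraryPath);
        List<File> hits = findOnPath(libraryPath, fileName);
        if (hits.isEmpty()) {
            System.out.println(fileName + " not found on java.library.path");
        } else {
            for (File hit : hits) {
                System.out.println("found " + hit.getAbsolutePath());
            }
        }
    }
}
```

- Finding the file only shows that it is on the path; whether it loads
- still depends on the platform and word size of the JVM.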
- During runtime, check the hadoop log files for your MapReduce tasks.
- * If everything is all right, then:
- <<<DEBUG util.NativeCodeLoader - Trying to load the custom-built native-hadoop library...>>>
- <<<INFO util.NativeCodeLoader - Loaded the native-hadoop library>>>
- * If something goes wrong, then:
- <<<INFO util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable>>>
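- Scanning task logs for these messages can be automated. The sketch below
- matches only the message texts quoted above; the class
- <<<NativeLoaderLogCheck>>> and its <<<Status>>> values are hypothetical,
- and the exact line layout in a real deployment depends on the logging
- configuration.

```java
import java.util.Arrays;
import java.util.List;

final class NativeLoaderLogCheck {

    enum Status { LOADED, FALLBACK, UNKNOWN }

    /** Returns the status implied by the last NativeCodeLoader message in the log. */
    static Status classify(Iterable<String> logLines) {
        Status status = Status.UNKNOWN;
        for (String line : logLines) {
            if (!line.contains("util.NativeCodeLoader")) {
                continue; // not a message from the native code loader
            }
            if (line.contains("Loaded the native-hadoop library")) {
                status = Status.LOADED;
            } else if (line.contains("Unable to load native-hadoop library")) {
                status = Status.FALLBACK;
            }
        }
        return status;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "DEBUG util.NativeCodeLoader - Trying to load the custom-built native-hadoop library...",
            "INFO util.NativeCodeLoader - Loaded the native-hadoop library");
        System.out.println(classify(sample)); // prints LOADED
    }
}
```

- A result of <<<UNKNOWN>>> means no matching message was found, which may
- simply indicate that the relevant log level is not enabled.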
- * Native Shared Libraries
- You can load any native shared library using DistributedCache for
- distributing and symlinking the library files.
- This example shows you how to distribute a shared library, mylib.so,
- and load it from a MapReduce task.
- [[1]] First copy the library to HDFS:
- <<<bin/hadoop fs -copyFromLocal mylib.so.1 /libraries/mylib.so.1>>>
- [[2]] The job launching program should contain the following:
- <<<DistributedCache.createSymlink(conf);>>>
- <<<DistributedCache.addCacheFile(new URI("hdfs://host:port/libraries/mylib.so.1#mylib.so"), conf);>>>
- [[3]] The MapReduce task can contain:
- <<<System.load(new File("mylib.so").getAbsolutePath());>>>
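- For step 3, a task may want to confirm that the symlink is present
- before loading it. The sketch below assumes the symlink <<<mylib.so>>>
- is created in the task's current working directory; the class
- <<<SymlinkedLibraryLoader>>> and its methods are hypothetical. It uses
- <<<System.load>>>, which takes an absolute file path, whereas
- <<<System.loadLibrary>>> takes a bare library name and searches
- <<<java.library.path>>>.

```java
import java.io.File;
import java.io.FileNotFoundException;

final class SymlinkedLibraryLoader {

    /** Resolves linkName against the current working directory and checks that it exists. */
    static File resolve(String linkName) throws FileNotFoundException {
        File file = new File(linkName).getAbsoluteFile();
        if (!file.isFile()) {
            throw new FileNotFoundException("no such library file: " + file);
        }
        return file;
    }

    /** Loads the library from its resolved absolute path. */
    static void load(String linkName) throws FileNotFoundException {
        System.load(resolve(linkName).getPath());
    }

    public static void main(String[] args) {
        // "mylib.so" is the symlink name used in the example above.
        String linkName = args.length > 0 ? args[0] : "mylib.so";
        try {
            System.out.println("would load " + resolve(linkName));
        } catch (FileNotFoundException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

- Failing early with a clear message tends to be easier to diagnose than
- an <<<UnsatisfiedLinkError>>> raised later inside the task.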
- Note: If you downloaded or built the native hadoop library, you don’t
- need to use DistributedCache to make the library available to your
- MapReduce tasks.