Build instructions for Hadoop

----------------------------------------------------------------------------------
Requirements:

* Unix System
* JDK 1.6+
* Maven 3.0 or later
* Findbugs 1.3.9 (if running findbugs)
* ProtocolBuffer 2.5.0
* CMake 2.6 or newer (if compiling native code)
* Zlib devel (if compiling native code)
* openssl devel (if compiling native hadoop-pipes)
* Internet connection for first build (to fetch all Maven and Hadoop dependencies)

----------------------------------------------------------------------------------
Maven main modules:

  hadoop                          (Main Hadoop project)
    - hadoop-project              (Parent POM for all Hadoop Maven modules.
                                   All plugin & dependency versions are defined here.)
    - hadoop-project-dist         (Parent POM for modules that generate distributions.)
    - hadoop-annotations          (Generates the Hadoop doclet used to generate the Javadocs.)
    - hadoop-assemblies           (Maven assemblies used by the different modules.)
    - hadoop-common-project       (Hadoop Common)
    - hadoop-hdfs-project         (Hadoop HDFS)
    - hadoop-mapreduce-project    (Hadoop MapReduce)
    - hadoop-tools                (Hadoop tools like Streaming, Distcp, etc.)
    - hadoop-dist                 (Hadoop distribution assembler)

----------------------------------------------------------------------------------
Where to run Maven from?

Maven can be run from any module. The only catch is that if it is not run from
the trunk (top-level) directory, all modules that are not part of the build
must already be installed in the local Maven cache or available in a Maven
repository.
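
For example, once all modules are available in the local cache (see 'Building
components separately' below), a single module such as hadoop-common can be
built on its own:

  $ cd hadoop-common-project/hadoop-common
  $ mvn test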
----------------------------------------------------------------------------------
Maven build goals:

 * Clean                     : mvn clean
 * Compile                   : mvn compile [-Pnative]
 * Run tests                 : mvn test [-Pnative]
 * Create JAR                : mvn package
 * Run findbugs              : mvn compile findbugs:findbugs
 * Run checkstyle            : mvn compile checkstyle:checkstyle
 * Install JAR in M2 cache   : mvn install
 * Deploy JAR to Maven repo  : mvn deploy
 * Run clover                : mvn test -Pclover [-DcloverLicenseLocation=${user.name}/.clover.license]
 * Run Rat                   : mvn apache-rat:check
 * Build javadocs            : mvn javadoc:javadoc
 * Build distribution        : mvn package [-Pdist][-Pdocs][-Psrc][-Pnative][-Dtar]
 * Change Hadoop version     : mvn versions:set -DnewVersion=NEWVERSION
 Build options:

  * Use -Pnative to compile/bundle native code
  * Use -Pdocs to generate & bundle the documentation in the distribution (using -Pdist)
  * Use -Psrc to create a project source TAR.GZ
  * Use -Dtar to create a TAR with the distribution (using -Pdist)
 Snappy build options:

   Snappy is a compression library that can be utilized by the native code.
   It is currently an optional component, meaning that Hadoop can be built with
   or without this dependency.

  * Use -Drequire.snappy to fail the build if libsnappy.so is not found.
    If this option is not specified and the snappy library is missing,
    we silently build a version of libhadoop.so that cannot make use of snappy.
    This option is recommended if you plan on making use of snappy and want
    to get more repeatable builds.

  * Use -Dsnappy.prefix to specify a nonstandard location for the libsnappy
    header files and library files. You do not need this option if you have
    installed snappy using a package manager.

  * Use -Dsnappy.lib to specify a nonstandard location for the libsnappy library
    files. Similarly to snappy.prefix, you do not need this option if you have
    installed snappy using a package manager.

  * Use -Dbundle.snappy to copy the contents of the snappy.lib directory into
    the final tar file. This option requires that -Dsnappy.lib is also given,
    and it ignores the -Dsnappy.prefix option.
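
   As an illustration, the following invocation (assuming snappy was installed
   under /usr/local rather than via a package manager) fails the build if
   libsnappy.so is missing and bundles the library into the distribution tar:

     $ mvn package -Pdist,native -DskipTests -Dtar \
         -Drequire.snappy -Dsnappy.lib=/usr/local/lib -Dbundle.snappy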
 Tests options:

  * Use -DskipTests to skip tests when running the following Maven goals:
    'package', 'install', 'deploy' or 'verify'
  * -Dtest=<TESTCLASSNAME>,<TESTCLASSNAME#METHODNAME>,....
  * -Dtest.exclude=<TESTCLASSNAME>
  * -Dtest.exclude.pattern=**/<TESTCLASSNAME1>.java,**/<TESTCLASSNAME2>.java
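
   For example, to run only selected test classes, or a single test method (the
   class and method names below are placeholders):

     $ mvn test -Dtest=TestClassOne,TestClassTwo
     $ mvn test -Dtest=TestClassOne#testMethodOne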
----------------------------------------------------------------------------------
Building components separately

If you are building a submodule directory, all of that submodule's Hadoop
dependencies are resolved like any other third-party dependency: from the Maven
cache, or from a Maven repository if they are not available in the cache (or
the SNAPSHOT has 'timed out').

An alternative is to run 'mvn install -DskipTests' once from the Hadoop source
top level, and then work from the submodule. Keep in mind that SNAPSHOTs time
out after a while; using the Maven '-nsu' option stops Maven from trying to
update SNAPSHOTs from external repos.
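
A typical workflow along those lines (hadoop-hdfs-project is used here only as
an example submodule):

  $ mvn install -DskipTests    (once, from the source top level)
  $ cd hadoop-hdfs-project
  $ mvn test -nsu              (build without re-fetching SNAPSHOTs)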
----------------------------------------------------------------------------------
Protocol Buffer compiler

The version of the Protocol Buffer compiler, protoc, must match the version of
the protobuf JAR.

If you have multiple versions of protoc on your system, you can set the
HADOOP_PROTOC_PATH environment variable in your build shell to point to the one
you want to use for the Hadoop build. If you don't define this environment
variable, protoc is looked up on the PATH.
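
For instance, if a matching protoc lives under /opt/protobuf-2.5.0 (an
illustrative path), you could select it for the build with:

  $ export HADOOP_PROTOC_PATH=/opt/protobuf-2.5.0/bin/protoc
  $ mvn compile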
----------------------------------------------------------------------------------
Importing projects into Eclipse

When you import the project into Eclipse, first install hadoop-maven-plugins:

  $ cd hadoop-maven-plugins
  $ mvn install

Then, generate the Eclipse project files:

  $ mvn eclipse:eclipse -DskipTests

Finally, import into Eclipse by specifying the root directory of the project via
[File] > [Import] > [Existing Projects into Workspace].

----------------------------------------------------------------------------------
Building distributions:

Create binary distribution without native code and without documentation:

  $ mvn package -Pdist -DskipTests -Dtar

Create binary distribution with native code and with documentation:

  $ mvn package -Pdist,native,docs -DskipTests -Dtar

Create source distribution:

  $ mvn package -Psrc -DskipTests

Create source and binary distributions with native code and documentation:

  $ mvn package -Pdist,native,docs,src -DskipTests -Dtar

Create a local staging version of the website (in /tmp/hadoop-site):

  $ mvn clean site; mvn site:stage -DstagingDirectory=/tmp/hadoop-site

----------------------------------------------------------------------------------
Handling out of memory errors in builds

----------------------------------------------------------------------------------

If the build process fails with an out of memory error, you should be able to
fix it by increasing the memory available to Maven via the MAVEN_OPTS
environment variable.

Here is an example setting that allocates between 256 MB and 512 MB of heap
space to Maven:

  export MAVEN_OPTS="-Xms256m -Xmx512m"

----------------------------------------------------------------------------------
Building on OS/X

----------------------------------------------------------------------------------

A one-time manual step is required to enable building Hadoop on OS X with
Java 7, and it must be repeated every time the JDK is updated.
See: https://issues.apache.org/jira/browse/HADOOP-9350

  $ sudo mkdir `/usr/libexec/java_home`/Classes
  $ sudo ln -s `/usr/libexec/java_home`/lib/tools.jar `/usr/libexec/java_home`/Classes/classes.jar

----------------------------------------------------------------------------------
Building on Windows

----------------------------------------------------------------------------------
Requirements:

* Windows System
* JDK 1.6+
* Maven 3.0 or later
* Findbugs 1.3.9 (if running findbugs)
* ProtocolBuffer 2.5.0
* Windows SDK or Visual Studio 2010 Professional
* Unix command-line tools from GnuWin32 or Cygwin: sh, mkdir, rm, cp, tar, gzip
* zlib headers (if building native code bindings for zlib)
* Internet connection for first build (to fetch all Maven and Hadoop dependencies)

If using Visual Studio, it must be Visual Studio 2010 Professional (not 2012).
Do not use Visual Studio Express. It does not support compiling for 64-bit,
which is problematic if running on a 64-bit system. The Windows SDK is free to
download here:

  http://www.microsoft.com/en-us/download/details.aspx?id=8279

----------------------------------------------------------------------------------
Building:

Keep the source code tree in a short path to avoid running into problems
related to Windows' maximum path length limitation (for example, C:\hdc).

Run builds from a Windows SDK Command Prompt. (Start, All Programs,
Microsoft Windows SDK v7.1, Windows SDK 7.1 Command Prompt.)

JAVA_HOME must be set, and the path must not contain spaces. If the full path
would contain spaces, then use the Windows short path instead.

You must set the Platform environment variable to either x64 or Win32 depending
on whether you're running a 64-bit or 32-bit system. Note that this is
case-sensitive. It must be "Platform", not "PLATFORM" or "platform".
Environment variables on Windows are usually case-insensitive, but Maven treats
them as case-sensitive. Failure to set this environment variable correctly will
cause msbuild to fail while building the native code in hadoop-common.

  set Platform=x64 (when building on a 64-bit system)
  set Platform=Win32 (when building on a 32-bit system)

Several tests require that the user hold the Create Symbolic Links privilege.

All Maven goals are the same as described above, with the exception that
native code is built by enabling the 'native-win' Maven profile. -Pnative-win
is enabled by default when building on Windows, since the native components
are required (not optional) on Windows.

If native code bindings for zlib are required, then the zlib headers must be
deployed on the build machine. Set the ZLIB_HOME environment variable to the
directory containing the headers.

  set ZLIB_HOME=C:\zlib-1.2.7

At runtime, zlib1.dll must be accessible on the PATH. Hadoop has been tested
with zlib 1.2.7, built using Visual Studio 2010 out of contrib\vstudio\vc10 in
the zlib 1.2.7 source tree.

  http://www.zlib.net/
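
For example, assuming zlib1.dll was copied to C:\zlib-1.2.7 (adjust to wherever
your build placed the DLL), it can be made visible to the current shell with:

  set PATH=%PATH%;C:\zlib-1.2.7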
----------------------------------------------------------------------------------
Building distributions:

 * Build distribution with native code : mvn package [-Pdist][-Pdocs][-Psrc][-Dtar]
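
For example, to create a binary distribution with native code but without
documentation (run from the Windows SDK Command Prompt as described above):

  mvn package -Pdist -DskipTests -Dtar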