Build instructions for Hadoop

----------------------------------------------------------------------------------
Requirements:

* Unix System
* JDK 1.6
* Maven 3.0
* Findbugs 1.3.9 (if running findbugs)
* ProtocolBuffer 2.4.1+ (for MapReduce and HDFS)
* CMake 2.6 or newer (if compiling native code)
* Internet connection for first build (to fetch all Maven and Hadoop dependencies)
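
Before the first build it can help to verify the toolchain with a quick
sanity check (exact output varies by platform and package):

  $ java -version     # expect 1.6.x
  $ mvn -version      # expect Maven 3.0.x
  $ protoc --version  # expect libprotoc 2.4.1 or newer
  $ cmake --version   # only needed when compiling native code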
----------------------------------------------------------------------------------
Maven main modules:

  hadoop                            (Main Hadoop project)
         - hadoop-project           (Parent POM for all Hadoop Maven modules.)
                                    (All plugins & dependencies versions are defined here.)
         - hadoop-project-dist      (Parent POM for modules that generate distributions.)
         - hadoop-annotations       (Generates the Hadoop doclet used to generate the Javadocs)
         - hadoop-assemblies        (Maven assemblies used by the different modules)
         - hadoop-common-project    (Hadoop Common)
         - hadoop-hdfs-project      (Hadoop HDFS)
         - hadoop-mapreduce-project (Hadoop MapReduce)
         - hadoop-tools             (Hadoop tools like Streaming, Distcp, etc.)
         - hadoop-dist              (Hadoop distribution assembler)
----------------------------------------------------------------------------------
Where to run Maven from?

  It can be run from any module. The only catch is that if not run from trunk,
  all modules that are not part of the build run must be installed in the local
  Maven cache or be available in a Maven repository.
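
  For example, assuming hadoop-common's dependencies are already in the local
  Maven cache (e.g. after a top-level 'mvn install'), the Common module can be
  built on its own:

  $ cd hadoop-common-project/hadoop-common
  $ mvn compile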
----------------------------------------------------------------------------------
Maven build goals:

 * Clean                     : mvn clean
 * Compile                   : mvn compile [-Pnative]
 * Run tests                 : mvn test [-Pnative]
 * Create JAR                : mvn package
 * Run findbugs              : mvn compile findbugs:findbugs
 * Run checkstyle            : mvn compile checkstyle:checkstyle
 * Install JAR in M2 cache   : mvn install
 * Deploy JAR to Maven repo  : mvn deploy
 * Run clover                : mvn test -Pclover [-DcloverLicenseLocation=${user.name}/.clover.license]
 * Run Rat                   : mvn apache-rat:check
 * Build javadocs            : mvn javadoc:javadoc
 * Build distribution        : mvn package [-Pdist][-Pdocs][-Psrc][-Pnative][-Dtar]
 * Change Hadoop version     : mvn versions:set -DnewVersion=NEWVERSION

 Build options:

  * Use -Pnative to compile/bundle native code
  * Use -Pdocs to generate & bundle the documentation in the distribution (using -Pdist)
  * Use -Psrc to create a project source TAR.GZ
  * Use -Dtar to create a TAR with the distribution (using -Pdist)
 Snappy build options:

   Snappy is a compression library that can be utilized by the native code.
   It is currently an optional component, meaning that Hadoop can be built with
   or without this dependency.

  * Use -Drequire.snappy to fail the build if libsnappy.so is not found.
    If this option is not specified and the snappy library is missing,
    we silently build a version of libhadoop.so that cannot make use of snappy.
    This option is recommended if you plan on making use of snappy and want
    to get more repeatable builds.

  * Use -Dsnappy.prefix to specify a nonstandard location for the libsnappy
    header files and library files. You do not need this option if you have
    installed snappy using a package manager.

  * Use -Dsnappy.lib to specify a nonstandard location for the libsnappy library
    files. Similarly to snappy.prefix, you do not need this option if you have
    installed snappy using a package manager.

  * Use -Dbundle.snappy to copy the contents of the snappy.lib directory into
    the final tar file. This option requires that -Dsnappy.lib is also given,
    and it ignores the -Dsnappy.prefix option.
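
   For example, a sketch of a native build that must find snappy in a
   nonstandard location and bundle it into the tar (the /opt/snappy path is
   only an illustration):

   $ mvn package -Pdist,native -DskipTests -Dtar \
       -Drequire.snappy -Dsnappy.lib=/opt/snappy/lib -Dbundle.snappy
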
 Tests options:

  * Use -DskipTests to skip tests when running the following Maven goals:
    'package', 'install', 'deploy' or 'verify'
  * -Dtest=<TESTCLASSNAME>,<TESTCLASSNAME#METHODNAME>,....
  * -Dtest.exclude=<TESTCLASSNAME>
  * -Dtest.exclude.pattern=**/<TESTCLASSNAME1>.java,**/<TESTCLASSNAME2>.java
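
   For example (the test class names here are only placeholders for real
   test classes):

   $ mvn test -Dtest=TestFoo              # run a single test class
   $ mvn test -Dtest=TestFoo#testBar      # run a single test method
   $ mvn test -Dtest.exclude=TestFoo      # run everything except TestFoo
   $ mvn test -Dtest.exclude.pattern=**/TestFoo.java,**/TestBar.java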
----------------------------------------------------------------------------------
Building components separately

If you are building a submodule directory, all the Hadoop dependencies this
submodule has will be resolved just like any other third-party dependency: from
the Maven cache or from a Maven repository (if not available in the cache or if
the SNAPSHOT has 'timed out').
An alternative is to run 'mvn install -DskipTests' from the Hadoop source top
level once, and then work from the submodule. Keep in mind that SNAPSHOTs
time out after a while; using the Maven '-nsu' option will stop Maven from
trying to update SNAPSHOTs from external repos.
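
For example, to iterate on the HDFS submodule after a one-time top-level
install (module path taken from the layout above):

  $ mvn install -DskipTests
  $ cd hadoop-hdfs-project
  $ mvn test -nsu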
----------------------------------------------------------------------------------
Importing projects to eclipse

When you import the project to eclipse, install hadoop-maven-plugins first.

  $ cd hadoop-maven-plugins
  $ mvn install

Then, generate eclipse project files.

  $ mvn eclipse:eclipse -DskipTests

Finally, import to eclipse by specifying the root directory of the project via
[File] > [Import] > [Existing Projects into Workspace].
----------------------------------------------------------------------------------
Building distributions:

Create binary distribution without native code and without documentation:

  $ mvn package -Pdist -DskipTests -Dtar

Create binary distribution with native code and with documentation:

  $ mvn package -Pdist,native,docs -DskipTests -Dtar

Create source distribution:

  $ mvn package -Psrc -DskipTests

Create source and binary distributions with native code and documentation:

  $ mvn package -Pdist,native,docs,src -DskipTests -Dtar

Create a local staging version of the website (in /tmp/hadoop-site)

  $ mvn clean site; mvn site:stage -DstagingDirectory=/tmp/hadoop-site

----------------------------------------------------------------------------------