INSTALL

To compile Hadoop MapReduce next, do the following:

Step 1) Install dependencies for yarn

See http://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-yarn/README
Make sure the protobuf library is in your library path or set: export LD_LIBRARY_PATH=/usr/local/lib
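Before building, it can help to confirm the protobuf toolchain is actually visible; this is a hypothetical sanity check, not part of the official steps:

```shell
# Sanity check (hypothetical): is the protobuf compiler on PATH?
command -v protoc >/dev/null 2>&1 || echo "protoc not found; install protobuf first"
# Prepend /usr/local/lib so the protobuf shared library is found at runtime
export LD_LIBRARY_PATH=/usr/local/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
```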
Step 2) Checkout

svn checkout http://svn.apache.org/repos/asf/hadoop/common/trunk

Step 3) Build

Go to the common directory and run your regular common build command. For example:

export MAVEN_OPTS=-Xmx512m
mvn clean package -Pdist -Dtar -DskipTests -Pnative

You can omit -Pnative if you don't want to build native packages.
Step 4) Untar the tarball from hadoop-dist/target/ into a clean and different
directory, say HADOOP_YARN_HOME.

Step 5) Start HDFS
To run Hadoop MapReduce next applications:

Step 6) Export the following variables, pointing at where you have things installed.
You probably want to export these in hadoop-env.sh and yarn-env.sh as well.

export HADOOP_MAPRED_HOME=<mapred loc>
export HADOOP_COMMON_HOME=<common loc>
export HADOOP_HDFS_HOME=<hdfs loc>
export HADOOP_YARN_HOME=<directory where you untarred yarn>
export HADOOP_CONF_DIR=<conf loc>
export YARN_CONF_DIR=$HADOOP_CONF_DIR
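As a concrete illustration, if everything lives in a single untarred distribution under /opt/hadoop (a hypothetical path; substitute your own locations), the exports might look like:

```shell
# Hypothetical layout: one untarred distribution at /opt/hadoop
export HADOOP_COMMON_HOME=/opt/hadoop
export HADOOP_HDFS_HOME=/opt/hadoop
export HADOOP_MAPRED_HOME=/opt/hadoop
export HADOOP_YARN_HOME=/opt/hadoop
export HADOOP_CONF_DIR=/opt/hadoop/etc/hadoop
export YARN_CONF_DIR=$HADOOP_CONF_DIR
```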
Step 7) Set up the configuration. To run mapreduce applications, which now run in
user land, you need to configure the nodemanager with the following in your
yarn-site.xml before you start the nodemanager:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce.shuffle</value>
</property>

<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
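Taken together, the two properties above amount to a yarn-site.xml whose relevant section looks like the sketch below; any other properties you already have stay alongside them inside the same configuration element:

```xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce.shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
```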
Step 8) Modify mapred-site.xml to use the yarn framework:

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
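Similarly, a minimal mapred-site.xml carrying just this one setting would be:

```xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```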
Step 9) cd $HADOOP_YARN_HOME

Step 10) sbin/yarn-daemon.sh start resourcemanager

Step 11) sbin/yarn-daemon.sh start nodemanager

Step 12) sbin/mr-jobhistory-daemon.sh start historyserver
Step 13) You are all set. An example of how to run a mapreduce job:

cd $HADOOP_MAPRED_HOME
ant examples -Dresolvers=internal
$HADOOP_COMMON_HOME/bin/hadoop jar $HADOOP_MAPRED_HOME/build/hadoop-mapreduce-examples-*.jar randomwriter -Dmapreduce.job.user.name=$USER -Dmapreduce.randomwriter.bytespermap=10000 -Ddfs.blocksize=536870912 -Ddfs.block.size=536870912 -libjars $HADOOP_YARN_HOME/modules/hadoop-mapreduce-client-jobclient-*.jar output

The output on the command line should be similar to what you see in the JT/TT setup (Hadoop 0.20/0.21).