@@ -18,210 +18,7 @@
Single Node Setup
 
-%{toc|section=1|fromDepth=0}
+   This page will be removed in the next major release.
 
-* Purpose
-
-   This document describes how to set up and configure a single-node
-   Hadoop installation so that you can quickly perform simple operations
-   using Hadoop MapReduce and the Hadoop Distributed File System (HDFS).
-
-* Prerequisites
-
-** Supported Platforms
-
-   * GNU/Linux is supported as a development and production platform.
-     Hadoop has been demonstrated on GNU/Linux clusters with 2000 nodes.
-
-   * Windows is also a supported platform.
-
-** Required Software
-
-   Required software for Linux and Windows includes the following; a quick
-   way to check for both is sketched after the list:
-
-   [[1]] Java^TM 1.6.x, preferably from Sun, must be installed.
-
-   [[2]] ssh must be installed and sshd must be running to use the Hadoop
-   scripts that manage remote Hadoop daemons.
-
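-   A minimal sanity check for both prerequisites (the exact version
-   strings printed will vary by system):
-
------
-  $ java -version
-  $ ssh -V
------
-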
-** Installing Software
-
-   If your cluster doesn't have the requisite software, you will need to
-   install it.
-
-   For example, on Ubuntu Linux:
-
------
-  $ sudo apt-get install ssh
-  $ sudo apt-get install rsync
------
-
-* Download
-
-   To get a Hadoop distribution, download a recent stable release from one
-   of the Apache Download Mirrors.
-
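-   For illustration only (hadoop-1.2.1 and the archive URL below are
-   assumptions; pick a current stable release from a mirror near you):
-
------
-  $ wget http://archive.apache.org/dist/hadoop/core/hadoop-1.2.1/hadoop-1.2.1.tar.gz
-  $ tar xzf hadoop-1.2.1.tar.gz
------
-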
-* Prepare to Start the Hadoop Cluster
-
-   Unpack the downloaded Hadoop distribution. In the distribution, edit
-   the file <<<conf/hadoop-env.sh>>> to define at least <<<JAVA_HOME>>> to
-   be the root of your Java installation.
-
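-   For example (the JDK path below is an assumption; point it at the root
-   of your own installation):
-
------
-  # in conf/hadoop-env.sh
-  export JAVA_HOME=/usr/lib/jvm/java-6-sun
------
-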
-   Try the following command:
-
------
-  $ bin/hadoop
------
-
-   This will display the usage documentation for the hadoop script.
-
-   Now you are ready to start your Hadoop cluster in one of the three
-   supported modes:
-
-   * Local (Standalone) Mode
-
-   * Pseudo-Distributed Mode
-
-   * Fully-Distributed Mode
-
-* Standalone Operation
-
-   By default, Hadoop is configured to run in a non-distributed mode, as a
-   single Java process. This is useful for debugging.
-
-   The following example copies the unpacked conf directory to use as
-   input and then finds and displays every match of the given regular
-   expression. Output is written to the given output directory.
-
------
-  $ mkdir input
-  $ cp conf/*.xml input
-  $ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
-  $ cat output/*
------
-
-* Pseudo-Distributed Operation
-
-   Hadoop can also be run on a single node in a pseudo-distributed mode
-   where each Hadoop daemon runs in a separate Java process.
-
-** Configuration
-
-   Use the following:
-
-   conf/core-site.xml:
-
------
-<configuration>
-  <property>
-    <name>fs.defaultFS</name>
-    <value>hdfs://localhost:9000</value>
-  </property>
-</configuration>
------
-
-   conf/hdfs-site.xml:
-
------
-<configuration>
-  <property>
-    <name>dfs.replication</name>
-    <value>1</value>
-  </property>
-</configuration>
------
-
-   conf/mapred-site.xml:
-
------
-<configuration>
-  <property>
-    <name>mapred.job.tracker</name>
-    <value>localhost:9001</value>
-  </property>
-</configuration>
------
-
-** Setup passphraseless ssh
-
-   Now check that you can ssh to the localhost without a passphrase:
-
------
-  $ ssh localhost
------
-
-   If you cannot ssh to localhost without a passphrase, execute the
-   following commands:
-
------
-  $ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
-  $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
------
-
-** Execution
-
-   Format a new distributed filesystem:
-
------
-  $ bin/hadoop namenode -format
------
-
-   Start the Hadoop daemons:
-
------
-  $ bin/start-all.sh
------
-
-   The Hadoop daemon log output is written to the <<<${HADOOP_LOG_DIR}>>>
-   directory (defaults to <<<${HADOOP_PREFIX}/logs>>>).
-
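-   For example, to follow the NameNode log (the exact file name depends on
-   your user and host names, so the glob below is only a sketch):
-
------
-  $ tail -f logs/hadoop-*-namenode-*.log
------
-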
-   Browse the web interface for the NameNode and the JobTracker; by
-   default they are available at:
-
-   * NameNode - <<<http://localhost:50070/>>>
-
-   * JobTracker - <<<http://localhost:50030/>>>
-
-   Copy the input files into the distributed filesystem:
-
------
-  $ bin/hadoop fs -put conf input
------
-
-   Run some of the examples provided:
-
------
-  $ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
------
-
-   Examine the output files:
-
-   Copy the output files from the distributed filesystem to the local
-   filesystem and examine them:
-
------
-  $ bin/hadoop fs -get output output
-  $ cat output/*
------
-
-   or
-
-   View the output files on the distributed filesystem:
-
------
-  $ bin/hadoop fs -cat output/*
------
-
-   When you're done, stop the daemons with:
-
------
-  $ bin/stop-all.sh
------
-
-* Fully-Distributed Operation
-
-   For information on setting up fully-distributed, non-trivial clusters,
-   see {{{./ClusterSetup.html}Cluster Setup}}.
-
-   Java and JNI are trademarks or registered trademarks of Sun
-   Microsystems, Inc. in the United States and other countries.
+   See {{{./SingleCluster.html}Single Cluster Setup}} to set up and configure a
+   single-node Hadoop installation.