
Merge -r 951479:951480 from trunk to branch-0.21. Fixes: HADOOP-6738

git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.21@951482 13f79535-47bb-0310-9956-ffa450edef68
Thomas White
commit 4afae58351

+ 3 - 0
CHANGES.txt

@@ -867,6 +867,9 @@ Release 0.21.0 - Unreleased
     HADOOP-6585.  Add FileStatus#isDirectory and isFile.  (Eli Collins via
     tomwhite)
 
+    HADOOP-6738.  Move cluster_setup.xml from MapReduce to Common.
+    (Tom White via tomwhite)
+
   OPTIMIZATIONS
 
     HADOOP-5595. NameNode does not need to run a replicator to choose a

+ 401 - 135
src/docs/src/documentation/content/xdocs/cluster_setup.xml

@@ -33,20 +33,20 @@
      Hadoop clusters ranging from a few nodes to extremely large clusters with 
      thousands of nodes.</p>
      <p>
-      To play with Hadoop, you may first want to install Hadoop on a single machine (see <a href="single_node_setup.html"> Single Node Setup</a>).
+      To play with Hadoop, you may first want to install Hadoop on a single machine (see <a href="single_node_setup.html"> Hadoop Quick Start</a>).
      </p>
    </section>
    
    <section>
-      <title>Prerequisites</title>
+      <title>Pre-requisites</title>
       
       
      <ol>
        <li>
-          Make sure all <a href="single_node_setup.html#PreReqs">required software</a> 
+          Make sure all <a href="single_node_setup.html#PreReqs">requisite</a> software 
          is installed on all nodes in your cluster.
        </li>
        <li>
-          <a href="single_node_setup.html#Download">Download</a> the Hadoop software.
+          <a href="single_node_setup.html#Download">Get</a> the Hadoop software.
        </li>
      </ol>
    </section>
@@ -81,21 +81,23 @@
        <ol>
          <li>
            Read-only default configuration - 
-            <a href="ext:common-default">src/common/common-default.xml</a>, 
-            <a href="ext:hdfs-default">src/hdfs/hdfs-default.xml</a> and 
-            <a href="ext:mapred-default">src/mapred/mapred-default.xml</a>.
+            <a href="ext:common-default">src/core/core-default.xml</a>, 
+            <a href="ext:hdfs-default">src/hdfs/hdfs-default.xml</a>, 
+            <a href="ext:mapred-default">src/mapred/mapred-default.xml</a> and
+            <a href="ext:mapred-queues">conf/mapred-queues.xml.template</a>.
          </li>
          <li>
            Site-specific configuration - 
-            <em>conf/core-site.xml</em>, 
-            <em>conf/hdfs-site.xml</em> and 
-            <em>conf/mapred-site.xml</em>.
+            <a href="#core-site.xml">conf/core-site.xml</a>, 
+            <a href="#hdfs-site.xml">conf/hdfs-site.xml</a>, 
+            <a href="#mapred-site.xml">conf/mapred-site.xml</a> and
+            <a href="#mapred-queues.xml">conf/mapred-queues.xml</a>.
          </li>
        </ol>
      
        <p>To learn more about how the Hadoop framework is controlled by these 
-        configuration files see
-        <a href="ext:api/org/apache/hadoop/conf/configuration">Class Configuration</a>.</p>
+        configuration files, look 
+        <a href="ext:api/org/apache/hadoop/conf/configuration">here</a>.</p>
       
       
        <p>Additionally, you can control the Hadoop scripts found in the 
        <code>bin/</code> directory of the distribution, by setting site-specific 
@@ -163,9 +165,8 @@
          <title>Configuring the Hadoop Daemons</title>
          
          <p>This section deals with important parameters to be specified in the
-          following:
-          <br/>
-          <code>conf/core-site.xml</code>:</p>
+          following:</p>
+          <anchor id="core-site.xml"/><p><code>conf/core-site.xml</code>:</p>
 
 
		  <table>
  		    <tr>
@@ -180,7 +181,7 @@
            </tr>
          </table>

-      <p><br/><code>conf/hdfs-site.xml</code>:</p>
+      <anchor id="hdfs-site.xml"/><p><code>conf/hdfs-site.xml</code>:</p>
           
           
      <table>   
        <tr>
@@ -212,7 +213,7 @@
		    </tr>
      </table>

-      <p><br/><code>conf/mapred-site.xml</code>:</p>
+      <anchor id="mapred-site.xml"/><p><code>conf/mapred-site.xml</code>:</p>
 
 
      <table>
          <tr>
@@ -221,12 +222,12 @@
          <th>Notes</th>
        </tr>
        <tr>
-          <td>mapred.job.tracker</td>
+          <td>mapreduce.jobtracker.address</td>
          <td>Host or IP and port of <code>JobTracker</code>.</td>
          <td><em>host:port</em> pair.</td>
        </tr>
		    <tr>
-		      <td>mapred.system.dir</td>
+		      <td>mapreduce.jobtracker.system.dir</td>
		      <td>
		        Path on the HDFS where the Map/Reduce framework stores 
		        system files e.g. <code>/hadoop/mapred/system/</code>.
@@ -237,7 +238,7 @@
		      </td>
		    </tr>
		    <tr>
-		      <td>mapred.local.dir</td>
+		      <td>mapreduce.cluster.local.dir</td>
		      <td>
		        Comma-separated list of paths on the local filesystem where 
		        temporary Map/Reduce data is written.
@@ -264,7 +265,7 @@
		      </td>
		    </tr>
		    <tr>
-		      <td>mapred.hosts/mapred.hosts.exclude</td>
+		      <td>mapreduce.jobtracker.hosts.filename/mapreduce.jobtracker.hosts.exclude.filename</td>
		      <td>List of permitted/excluded TaskTrackers.</td>
		      <td>
		        If necessary, use these files to control the list of allowable 
@@ -272,82 +273,331 @@
		      </td>
  		    </tr>
        <tr>
-          <td>mapred.queue.names</td>
-          <td>Comma separated list of queues to which jobs can be submitted.</td>
+          <td>mapreduce.cluster.job-authorization-enabled</td>
+          <td>Boolean, specifying whether job ACLs are supported for 
+              authorizing view and modification of a job</td>
          <td>
-            The Map/Reduce system always supports atleast one queue
-            with the name as <em>default</em>. Hence, this parameter's
-            value should always contain the string <em>default</em>.
-            Some job schedulers supported in Hadoop, like the 
-            <a href="http://hadoop.apache.org/mapreduce/docs/current/capacity_scheduler.html">Capacity Scheduler</a>, 
-            support multiple queues. If such a scheduler is
-            being used, the list of configured queue names must be
-            specified here. Once queues are defined, users can submit
-            jobs to a queue using the property name 
-            <em>mapred.job.queue.name</em> in the job configuration.
-            There could be a separate 
-            configuration file for configuring properties of these 
-            queues that is managed by the scheduler. 
-            Refer to the documentation of the scheduler for information on 
-            the same.
+            If <em>true</em>, job ACLs would be checked while viewing or
+            modifying a job. More details are available at 
+            <a href ="ext:mapred-tutorial/JobAuthorization">Job Authorization</a>. 
          </td>
        </tr>
-        <tr>
-          <td>mapred.acls.enabled</td>
-          <td>Specifies whether ACLs are supported for controlling job
-              submission and administration</td>
-          <td>
-            If <em>true</em>, ACLs would be checked while submitting
-            and administering jobs. ACLs can be specified using the
-            configuration parameters of the form
-            <em>mapred.queue.queue-name.acl-name</em>, defined below.
-          </td>
-        </tr>
-		  </table>
-      
-      <p><br/><code> conf/mapred-queue-acls.xml</code></p>
-      
-      <table>
-       <tr>
-          <th>Parameter</th>
-          <th>Value</th> 
-          <th>Notes</th>
-       </tr>
-        <tr>
-          <td>mapred.queue.<em>queue-name</em>.acl-submit-job</td>
-          <td>List of users and groups that can submit jobs to the
-              specified <em>queue-name</em>.</td>
-          <td>
-            The list of users and groups are both comma separated
-            list of names. The two lists are separated by a blank.
-            Example: <em>user1,user2 group1,group2</em>.
-            If you wish to define only a list of groups, provide
-            a blank at the beginning of the value.
-          </td>
-        </tr>
-        <tr>
-          <td>mapred.queue.<em>queue-name</em>.acl-administer-job</td>
-          <td>List of users and groups that can change the priority
-              or kill jobs that have been submitted to the
-              specified <em>queue-name</em>.</td>
-          <td>
-            The list of users and groups are both comma separated
-            list of names. The two lists are separated by a blank.
-            Example: <em>user1,user2 group1,group2</em>.
-            If you wish to define only a list of groups, provide
-            a blank at the beginning of the value. Note that an
-            owner of a job can always change the priority or kill
-            his/her own job, irrespective of the ACLs.
-          </td>
-        </tr>
-      </table>
-      
+		  </table>
 
 
          <p>Typically all the above parameters are marked as 
          <a href="ext:api/org/apache/hadoop/conf/configuration/final_parameters">
          final</a> to ensure that they cannot be overridden by user-applications.
          </p>

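          <p>For example, a property can be marked final in
          <code>conf/mapred-site.xml</code> as follows (a minimal sketch; the
          value shown is illustrative):</p>
          <source>
          &lt;property&gt;
            &lt;name&gt;mapreduce.jobtracker.system.dir&lt;/name&gt;
            &lt;value&gt;/hadoop/mapred/system&lt;/value&gt;
            &lt;final&gt;true&lt;/final&gt;
          &lt;/property&gt;
          </source>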
+          <anchor id="mapred-queues.xml"/><p><code>conf/mapred-queues.xml
+          </code>:</p>
+          <p>This file is used to configure the queues in the Map/Reduce
+          system. Queues are abstract entities in the JobTracker that can be
+          used to manage collections of jobs. They provide a way for 
+          administrators to organize jobs in specific ways and to enforce 
+          certain policies on such collections, thus providing varying
+          levels of administrative control and management functions on jobs.
+          </p> 
+          <p>One can imagine the following sample scenarios:</p>
+          <ul>
+            <li> Jobs submitted by a particular group of users can all be 
+            submitted to one queue. </li> 
+            <li> Long running jobs in an organization can be submitted to a
+            queue. </li>
+            <li> Short running jobs can be submitted to a queue and the number
+            of jobs that can run concurrently can be restricted. </li> 
+          </ul> 
+          <p>The usage of queues is closely tied to the scheduler configured
+          at the JobTracker via <em>mapreduce.jobtracker.taskscheduler</em>.
+          The degree of support of queues depends on the scheduler used. Some
+          schedulers support a single queue, while others support more complex
+          configurations. Schedulers also implement the policies that apply 
+          to jobs in a queue. Some schedulers, such as the Fairshare scheduler,
+          implement their own mechanisms for collections of jobs and do not rely
+            on queues provided by the framework. Administrators are 
+            encouraged to refer to the documentation of the scheduler they are
+            interested in to determine its level of support for queues.</p>
+          <p>The Map/Reduce framework supports some basic operations on queues
+          such as job submission to a specific queue, access control for queues,
+          queue states, viewing configured queues and their properties
+          and refresh of queue properties. In order to fully implement some of
+            these operations, the framework relies on the configured
+            scheduler.</p>
+          <p>The following types of queue configurations are possible:</p>
+          <ul>
+            <li> Single queue: The default configuration in Map/Reduce comprises
+            a single queue, as supported by the default scheduler. All jobs
+            are submitted to this default queue which maintains jobs in a priority
+            based FIFO order.</li>
+            <li> Multiple single level queues: Multiple queues are defined, and
+            jobs can be submitted to any of these queues. Different policies
+            can be applied to these queues by schedulers that support this 
+            configuration to provide a better level of support. For example,
+            the <a href="ext:capacity-scheduler">capacity scheduler</a>
+            provides ways of configuring different 
+            capacity and fairness guarantees on these queues.</li>
+            <li> Hierarchical queues: Hierarchical queues are a configuration in
+            which queues can contain other queues within them recursively. The
+            queues that contain other queues are referred to as 
+            container queues. Queues that do not contain other queues are 
+            referred to as leaf or job queues. Jobs can only be submitted to leaf
+            queues. Hierarchical queues can potentially offer a higher level 
+            of control to administrators, as schedulers can now build a
+            hierarchy of policies where policies applicable to a container
+            queue can provide context for policies applicable to queues it
+            contains. It also opens up possibilities for delegating queue
+            administration where administration of queues in a container queue
+            can be turned over to a different set of administrators, within
+            the context provided by the container queue. For example, the
+            <a href="ext:capacity-scheduler">capacity scheduler</a>
+            uses hierarchical queues to partition capacity of a cluster
+            among container queues, allowing the queues they contain to divide
+            that capacity in more ways.</li> 
+          </ul>
+
+          <p>Most of the configuration of the queues can be refreshed/reloaded
+          without restarting the Map/Reduce sub-system by editing this
+          configuration file as described in the section on
+          <a href="commands_manual.html#RefreshQueues">reloading queue 
+          configuration</a>.
+          Not all configuration properties can be reloaded, of course,
+          as the description of each property below explains.</p>
+
+          <p>The format of conf/mapred-queues.xml is different from the other 
+          configuration files, supporting nested configuration
+          elements to support hierarchical queues. The format is as follows:
+          </p>
+
+          <source>
+          &lt;queues aclsEnabled="$aclsEnabled"&gt;
+            &lt;queue&gt;
+              &lt;name&gt;$queue-name&lt;/name&gt;
+              &lt;state&gt;$state&lt;/state&gt;
+              &lt;queue&gt;
+                &lt;name&gt;$child-queue1&lt;/name&gt;
+                &lt;properties&gt;
+                   &lt;property key="$key" value="$value"/&gt;
+                   ...
+                &lt;/properties&gt;
+                &lt;queue&gt;
+                  &lt;name&gt;$grand-child-queue1&lt;/name&gt;
+                  ...
+                &lt;/queue&gt;
+              &lt;/queue&gt;
+              &lt;queue&gt;
+                &lt;name&gt;$child-queue2&lt;/name&gt;
+                ...
+              &lt;/queue&gt;
+              ...
+              ...
+              ...
+              &lt;queue&gt;
+                &lt;name&gt;$leaf-queue&lt;/name&gt;
+                &lt;acl-submit-job&gt;$acls&lt;/acl-submit-job&gt;
+                &lt;acl-administer-jobs&gt;$acls&lt;/acl-administer-jobs&gt;
+                &lt;properties&gt;
+                   &lt;property key="$key" value="$value"/&gt;
+                   ...
+                &lt;/properties&gt;
+              &lt;/queue&gt;
+            &lt;/queue&gt;
+          &lt;/queues&gt;
+          </source>
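+          <p>As a concrete illustration (the queue names here are purely
+          hypothetical), a two-level hierarchy with one stopped leaf queue
+          might look like:</p>
+          <source>
+          &lt;queues aclsEnabled="false"&gt;
+            &lt;queue&gt;
+              &lt;name&gt;research&lt;/name&gt;
+              &lt;queue&gt;
+                &lt;name&gt;batch&lt;/name&gt;
+                &lt;state&gt;running&lt;/state&gt;
+              &lt;/queue&gt;
+              &lt;queue&gt;
+                &lt;name&gt;adhoc&lt;/name&gt;
+                &lt;state&gt;stopped&lt;/state&gt;
+              &lt;/queue&gt;
+            &lt;/queue&gt;
+          &lt;/queues&gt;
+          </source>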
+          <table>
+            <tr>
+              <th>Tag/Attribute</th>
+              <th>Value</th>
+              <th>
+              	<a href="commands_manual.html#RefreshQueues">Refresh-able?</a>
+              </th>
+              <th>Notes</th>
+            </tr>
+
+            <tr>
+              <td><anchor id="queues_tag"/>queues</td>
+              <td>Root element of the configuration file.</td>
+              <td>Not applicable</td>
+              <td>All the queues are nested inside this root element of the
+              file. There can be only one root queues element in the file.</td>
+            </tr>
+
+            <tr>
+              <td>aclsEnabled</td>
+              <td>Boolean attribute to the
+              <a href="#queues_tag"><em>&lt;queues&gt;</em></a> tag
+              specifying whether ACLs are supported for controlling job
+              submission and administration for <em>all</em> the queues
+              configured.
+              </td>
+              <td>Yes</td>
+              <td>If <em>false</em>, ACLs are ignored for <em>all</em> the
+              configured queues. <br/><br/>
+              If <em>true</em>, the user and group details of the user
+              are checked against the configured ACLs of the corresponding
+              job-queue while submitting and administering jobs. ACLs can be
+              specified for each queue using the queue-specific tags
+              "acl-$acl_name", defined below. ACLs are checked only against
+              the job-queues, i.e. the leaf-level queues; ACLs configured
+              for the rest of the queues in the hierarchy are ignored.
+              </td>
+            </tr>
+
+            <tr>
+              <td><anchor id="queue_tag"/>queue</td>
+              <td>A child element of the
+              <a href="#queues_tag"><em>&lt;queues&gt;</em></a> tag or another
+              <a href="#queue_tag"><em>&lt;queue&gt;</em></a>. Denotes a queue
+              in the system.
+              </td>
+              <td>Not applicable</td>
+              <td>Queues can be hierarchical and so this element can contain
+              children of this same type.</td>
+            </tr>
+
+            <tr>
+              <td>name</td>
+              <td>Child element of a 
+              <a href="#queue_tag"><em>&lt;queue&gt;</em></a> specifying the
+              name of the queue.</td>
+              <td>No</td>
+              <td>Name of the queue cannot contain the character <em>":"</em>
+              which is reserved as the queue-name delimiter when addressing a
+              queue in a hierarchy.</td>
+            </tr>
+
+            <tr>
+              <td>state</td>
+              <td>Child element of a
+              <a href="#queue_tag"><em>&lt;queue&gt;</em></a> specifying the
+              state of the queue.
+              </td>
+              <td>Yes</td>
+              <td>Each queue has a corresponding state. A queue in
+              <em>'running'</em> state can accept new jobs, while a queue in
+              <em>'stopped'</em> state will stop accepting any new jobs. State
+              is defined and respected by the framework only for the
+              leaf-level queues and is ignored for all other queues.
+              <br/><br/>
+              The state of the queue can be viewed from the command line using
+              <code>'bin/mapred queue'</code> command and also on the Web
+              UI.<br/><br/>
+              Administrators can stop and start queues at runtime using the
+              feature of <a href="commands_manual.html#RefreshQueues">reloading
+              queue configuration</a>. If a queue is stopped at runtime, it
+              will complete all the existing running jobs and will stop
+              accepting any new jobs.
+              </td>
+            </tr>
+
+            <tr>
+              <td>acl-submit-job</td>
+              <td>Child element of a
+              <a href="#queue_tag"><em>&lt;queue&gt;</em></a> specifying the
+              list of users and groups that can submit jobs to the specified
+              queue.</td>
+              <td>Yes</td>
+              <td>
+              Applicable only to leaf-queues.<br/><br/>
+              The lists of users and groups are both comma-separated
+              lists of names. The two lists are separated by a blank.
+              Example: <em>user1,user2 group1,group2</em>.
+              If you wish to define only a list of groups, provide
+              a blank at the beginning of the value.
+              <br/><br/>
+              </td>
+            </tr>
+
+            <tr>
+              <td>acl-administer-job</td>
+              <td>Child element of a
+              <a href="#queue_tag"><em>&lt;queue&gt;</em></a> specifying the
+              list of users and groups that can change the priority of a job
+              or kill a job that has been submitted to the specified queue.
+              </td>
+              <td>Yes</td>
+              <td>
+              Applicable only to leaf-queues.<br/><br/>
+              The lists of users and groups are both comma-separated
+              lists of names. The two lists are separated by a blank.
+              Example: <em>user1,user2 group1,group2</em>.
+              If you wish to define only a list of groups, provide
+              a blank at the beginning of the value. Note that an
+              owner of a job can always change the priority or kill
+              his/her own job, irrespective of the ACLs.
+              </td>
+            </tr>
+
+            <tr>
+              <td><anchor id="properties_tag"/>properties</td>
+              <td>Child element of a 
+              <a href="#queue_tag"><em>&lt;queue&gt;</em></a> specifying the
+              scheduler specific properties.</td>
+              <td>Not applicable</td>
+              <td>The scheduler specific properties are the children of this
+              element specified as a group of &lt;property&gt; tags described
+              below. The JobTracker completely ignores these properties. These
+              can be used as per-queue properties needed by the scheduler
+              being configured. Please look at the scheduler specific
+              documentation as to how these properties are used by that
+              particular scheduler.
+              </td>
+            </tr>
+
+            <tr>
+              <td><anchor id="property_tag"/>property</td>
+              <td>Child element of
+              <a href="#properties_tag"><em>&lt;properties&gt;</em></a> for a
+              specific queue.</td>
+              <td>Not applicable</td>
+              <td>A single scheduler specific queue-property. Ignored by
+              the JobTracker and used by the scheduler that is configured.</td>
+            </tr>
+
+            <tr>
+              <td>key</td>
+              <td>Attribute of a
+              <a href="#property_tag"><em>&lt;property&gt;</em></a> for a
+              specific queue.</td>
+              <td>Scheduler-specific</td>
+              <td>The name of a single scheduler specific queue-property.</td>
+            </tr>
+
+            <tr>
+              <td>value</td>
+              <td>Attribute of a
+              <a href="#property_tag"><em>&lt;property&gt;</em></a> for a
+              specific queue.</td>
+              <td>Scheduler-specific</td>
+              <td>The value of a single scheduler specific queue-property.
+              The value can be anything and is left to be interpreted
+              by the configured scheduler.</td>
+            </tr>
+
+         </table>
+
+          <p>Once the queues are configured properly and the Map/Reduce
+          system is up and running, from the command line one can
+          <a href="commands_manual.html#QueuesList">get the list
+          of queues</a> and
+          <a href="commands_manual.html#QueuesInfo">obtain
+          information specific to each queue</a>. This information is also
+          available from the web UI. On the web UI, queue information can be
+          seen by going to queueinfo.jsp, linked to from the queues table-cell
+          in the cluster-summary table. The queueinfo.jsp prints the hierarchy
+          of queues as well as the specific information for each queue.
+          </p>
+
+          <p> Users can submit jobs only to a
+          leaf-level queue by specifying the fully-qualified queue-name for
+          the property name <em>mapreduce.job.queuename</em> in the job
+          configuration. The character ':' is the queue-name delimiter and so,
+          for example, to submit to a configured job-queue 'Queue-C',
+          which is one of the sub-queues of 'Queue-B', which in turn is a
+          sub-queue of 'Queue-A', the job configuration should set the
+          property <em>mapreduce.job.queuename</em> to
+          <em>Queue-A:Queue-B:Queue-C</em>.</p>
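+          <p>For instance, such a job could be submitted from the command
+          line as sketched below, assuming the job's main class uses
+          <code>ToolRunner</code> (the jar and class names are
+          illustrative):</p>
+          <source>
+          bin/hadoop jar my-job.jar MyJob \
+              -Dmapreduce.job.queuename=Queue-A:Queue-B:Queue-C input output
+          </source>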
+         </section>
          <section>
            <title>Real-World Cluster Configurations</title>
            
@@ -383,7 +633,7 @@
                  </tr>
                  <tr>
                    <td>conf/mapred-site.xml</td>
-                    <td>mapred.reduce.parallel.copies</td>
+                    <td>mapreduce.reduce.shuffle.parallelcopies</td>
                    <td>20</td>
                    <td>
                      Higher number of parallel copies run by reduces to fetch
@@ -392,7 +642,7 @@
                  </tr>
                  <tr>
                    <td>conf/mapred-site.xml</td>
-                    <td>mapred.map.child.java.opts</td>
+                    <td>mapreduce.map.java.opts</td>
                    <td>-Xmx512M</td>
                    <td>
                      Larger heap-size for child jvms of maps. 
@@ -400,7 +650,7 @@
                  </tr>
                  <tr>
                    <td>conf/mapred-site.xml</td>
-                    <td>mapred.reduce.child.java.opts</td>
+                    <td>mapreduce.reduce.java.opts</td>
                    <td>-Xmx512M</td>
                    <td>
                      Larger heap-size for child jvms of reduces. 
@@ -417,13 +667,13 @@
                  </tr>
                  <tr>
                    <td>conf/core-site.xml</td>
-                    <td>io.sort.factor</td>
+                    <td>mapreduce.task.io.sort.factor</td>
                    <td>100</td>
                    <td>More streams merged at once while sorting files.</td>
                  </tr>
                  <tr>
                    <td>conf/core-site.xml</td>
-                    <td>io.sort.mb</td>
+                    <td>mapreduce.task.io.sort.mb</td>
                    <td>200</td>
                    <td>Higher memory-limit while sorting data.</td>
                  </tr>
@@ -448,7 +698,7 @@
 		          </tr>
                  <tr>
                    <td>conf/mapred-site.xml</td>
-                    <td>mapred.job.tracker.handler.count</td>
+                    <td>mapreduce.jobtracker.handler.count</td>
                    <td>60</td>
                    <td>
                      More JobTracker server threads to handle RPCs from large 
@@ -457,13 +707,13 @@
                  </tr>
                  <tr>
                    <td>conf/mapred-site.xml</td>
-                    <td>mapred.reduce.parallel.copies</td>
+                    <td>mapreduce.reduce.shuffle.parallelcopies</td>
                    <td>50</td>
                    <td></td>
                  </tr>
                  <tr>
                    <td>conf/mapred-site.xml</td>
-                    <td>tasktracker.http.threads</td>
+                    <td>mapreduce.tasktracker.http.threads</td>
                    <td>50</td>
                    <td>
                      More worker threads for the TaskTracker's http server. The
@@ -473,7 +723,7 @@
                  </tr>
                  <tr>
                    <td>conf/mapred-site.xml</td>
-                    <td>mapred.map.child.java.opts</td>
+                    <td>mapreduce.map.java.opts</td>
                    <td>-Xmx512M</td>
                    <td>
                      Larger heap-size for child jvms of maps. 
@@ -481,7 +731,7 @@
                  </tr>
                  <tr>
                    <td>conf/mapred-site.xml</td>
-                    <td>mapred.reduce.child.java.opts</td>
+                    <td>mapreduce.reduce.java.opts</td>
                    <td>-Xmx1024M</td>
                    <td>Larger heap-size for child jvms of reduces.</td>
                  </tr>
@@ -500,11 +750,11 @@
        or equal to the -Xmx passed to JavaVM, else the VM might not start. 
        </p>
        
-        <p>Note: <code>mapred.child.java.opts</code> are used only for 
+        <p>Note: <code>mapreduce.{map|reduce}.java.opts</code> are used only for 
        configuring the launched child tasks from task tracker. Configuring 
-        the memory options for daemons is documented under 
+        the memory options for daemons is documented in 
        <a href="cluster_setup.html#Configuring+the+Environment+of+the+Hadoop+Daemons">
-        Configuring the Environment of the Hadoop Daemons</a>.</p>
+        cluster_setup.html </a></p>
         
         
        <p>The memory available to some parts of the framework is also
        configurable. In map and reduce tasks, performance may be influenced
@@ -558,7 +808,7 @@
 
 
    <table>
          <tr><th>Name</th><th>Type</th><th>Description</th></tr>
-          <tr><td>mapred.tasktracker.taskmemorymanager.monitoring-interval</td>
+          <tr><td>mapreduce.tasktracker.taskmemorymanager.monitoringinterval</td>
            <td>long</td>
            <td>The time interval, in milliseconds, between which the TT 
            checks for any memory violation. The default value is 5000 msec
@@ -668,10 +918,11 @@
            the tasks. For maximum security, this task controller 
            sets up restricted permissions and user/group ownership of
            local files and directories used by the tasks such as the
-            job jar files, intermediate files and task log files. Currently
-            permissions on distributed cache files are opened up to be
-            accessible by all users. In future, it is expected that stricter
-            file permissions are set for these files too.
+            job jar files, intermediate files, task log files and distributed
+            cache files. Particularly note that, because of this, except the
+            job owner and tasktracker, no other user can access any of the
+            local files/directories including those localized as part of the
+            distributed cache.
            </td>
            </tr>
            </table>
@@ -684,7 +935,7 @@
            <th>Property</th><th>Value</th><th>Notes</th>
            </tr>
            <tr>
-            <td>mapred.task.tracker.task-controller</td>
+            <td>mapreduce.tasktracker.taskcontroller</td>
            <td>Fully qualified class name of the task controller class</td>
            <td>Currently there are two implementations of task controller
            in the Hadoop system, DefaultTaskController and LinuxTaskController.
@@ -715,21 +966,35 @@
            <p>
            The executable must have specific permissions as follows. The
            executable should have <em>6050 or --Sr-s---</em> permissions
-            user-owned by root(super-user) and group-owned by a group 
-            of which only the TaskTracker's user is the sole group member. 
+            user-owned by root (super-user) and group-owned by a special group 
+            of which the TaskTracker's user is a member and no job 
+            submitter is. If any job submitter belongs to this special group,
+            security will be compromised. This special group name should be
+            specified for the configuration property 
+            <em>"mapreduce.tasktracker.group"</em> in both mapred-site.xml and 
+            <a href="#task-controller.cfg">task-controller.cfg</a>.  
            For example, let's say that the TaskTracker is run as user
            <em>mapred</em> who is part of the groups <em>users</em> and
-            <em>mapredGroup</em> any of them being the primary group.
+            <em>specialGroup</em>, either of which may be the primary group.
            Suppose also that <em>users</em> has both <em>mapred</em> and
-            another user <em>X</em> as its members, while <em>mapredGroup</em>
-            has only <em>mapred</em> as its member. Going by the above
+            another user (job submitter) <em>X</em> as its members, and X does
+            not belong to <em>specialGroup</em>. Going by the above
            description, the setuid/setgid executable should be set
            <em>6050 or --Sr-s---</em> with user-owner as <em>mapred</em> and
-            group-owner as <em>mapredGroup</em> which has
-            only <em>mapred</em> as its member(and not <em>users</em> which has
+            group-owner as <em>specialGroup</em> which has
+            <em>mapred</em> as its member (and not <em>users</em>, which has
            <em>X</em> also as its member besides <em>mapred</em>).
            </p>
+
+            <p>
+            The LinuxTaskController requires that the directories specified in
+            <em>mapreduce.cluster.local.dir</em> and <em>hadoop.log.dir</em>,
+            and all paths leading up to them, have 755 permissions.
+            </p>
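+            <p>Concretely, assuming the executable is installed at the
+            illustrative path <code>/usr/local/hadoop/bin/task-controller</code>,
+            the ownership and permissions could be set as follows (chown is
+            run first, since changing ownership clears setuid/setgid bits):</p>
+            <source>
+            chown root:specialGroup /usr/local/hadoop/bin/task-controller
+            chmod 6050 /usr/local/hadoop/bin/task-controller
+            </source>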
             
             
+            <section>
+            <title>task-controller.cfg</title>
            <p>The executable requires a configuration file called 
            <em>taskcontroller.cfg</em> to be
            present in the configuration directory passed to the ant target 
@@ -747,8 +1012,8 @@
            </p>
            <table><tr><th>Name</th><th>Description</th></tr>
            <tr>
-            <td>mapred.local.dir</td>
-            <td>Path to mapred local directories. Should be same as the value 
+            <td>mapreduce.cluster.local.dir</td>
+            <td>Path to the local directories used by Map/Reduce. Should be the same as the value 
            which was provided for this key in mapred-site.xml. This is required to
            validate paths passed to the setuid executable in order to prevent
            arbitrary paths being passed to it.</td>
@@ -760,14 +1025,16 @@
            permissions on the log files so that they can be written to by the user's
            tasks and read by the TaskTracker for serving on the web UI.</td>
            </tr>
+            <tr>
+            <td>mapreduce.tasktracker.group</td>
+            <td>Group to which the TaskTracker belongs. The group owner of the
+            taskcontroller binary should be this group. Should be same as
+            the value with which the TaskTracker is configured. This 
+            configuration is required for validating the secure access of the
+            task-controller binary.</td>
+            </tr>
            </table>
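            <p>A sketch of a possible <em>taskcontroller.cfg</em>, using the
            keys described above (all paths and the group name are
            illustrative):</p>
            <source>
            mapreduce.cluster.local.dir=/grid/mapred/local
            hadoop.log.dir=/var/log/hadoop
            mapreduce.tasktracker.group=specialGroup
            </source>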
-
-            <p>
-            The LinuxTaskController requires that paths including and leading up to
-            the directories specified in
-            <em>mapred.local.dir</em> and <em>hadoop.log.dir</em> to be set 755
-            permissions.
-            </p>
+            </section>
            </section>
            
          </section>
@@ -800,7 +1067,7 @@
            monitoring script in <em>mapred-site.xml</em>.</p>
            <table>
            <tr><th>Name</th><th>Description</th></tr>
-            <tr><td><code>mapred.healthChecker.script.path</code></td>
+            <tr><td><code>mapreduce.tasktracker.healthchecker.script.path</code></td>
            <td>Absolute path to the script which is periodically run by the 
            TaskTracker to determine if the node is 
            healthy or not. The file should be executable by the TaskTracker.
@@ -809,18 +1076,18 @@
            is not started.</td>
            </tr>
            <tr>
-            <td><code>mapred.healthChecker.interval</code></td>
+            <td><code>mapreduce.tasktracker.healthchecker.interval</code></td>
            <td>Frequency at which the node health script is run, 
            in milliseconds</td>
            </tr>
            <tr>
-            <td><code>mapred.healthChecker.script.timeout</code></td>
+            <td><code>mapreduce.tasktracker.healthchecker.script.timeout</code></td>
            <td>Time after which the node health script will be killed by
            the TaskTracker if unresponsive.
            The node is marked unhealthy if the node health script times out.</td>
            </tr>
            <tr>
-            <td><code>mapred.healthChecker.script.args</code></td>
+            <td><code>mapreduce.tasktracker.healthchecker.script.args</code></td>
            <td>Extra arguments that can be passed to the node health script 
            when launched.
            These should be a comma-separated list of arguments.</td>
@@ -857,17 +1124,17 @@
            <title>History Logging</title>
            
            <p> The job history files are stored in central location 
-            <code> hadoop.job.history.location </code> which can be on DFS also,
+            <code> mapreduce.jobtracker.jobhistory.location </code> which can be on DFS also,
            whose default value is <code>${HADOOP_LOG_DIR}/history</code>. 
            The history web UI is accessible from the job tracker web UI.</p>
            
            <p> The history files are also logged to a user-specified directory
-            <code>hadoop.job.history.user.location</code> 
+            <code>mapreduce.job.userhistorylocation</code> 
            which defaults to the job output directory. The files are stored in
            "_logs/history/" in the specified directory. Hence, by default 
-            they will be in "mapred.output.dir/_logs/history/". User can stop
+            they will be in "mapreduce.output.fileoutputformat.outputdir/_logs/history/". User can stop
            logging by giving the value <code>none</code> for 
-            <code>hadoop.job.history.user.location</code> </p>
+            <code>mapreduce.job.userhistorylocation</code> </p>
             
             
            <p> Users can view the history logs summary in a specified directory 
            using the following command: <br/>
@@ -880,7 +1147,6 @@
            <code>$ bin/hadoop job -history all output-dir</code><br/></p> 
          </section>
        </section>
-      </section>
       
       
      <p>Once all the necessary configuration is complete, distribute the files
      to the <code>HADOOP_CONF_DIR</code> directory on all the machines, 
@@ -891,9 +1157,9 @@
      <section>
        <title>Map/Reduce</title>
        <p>The job tracker restart can recover running jobs if 
-        <code>mapred.jobtracker.restart.recover</code> is set true and 
+        <code>mapreduce.jobtracker.restart.recover</code> is set true and 
        <a href="#Logging">JobHistory logging</a> is enabled. Also 
-        <code>mapred.jobtracker.job.history.block.size</code> value should be 
+        <code>mapreduce.jobtracker.jobhistory.block.size</code> value should be 
        set to an optimal value to dump job history to disk as soon as 
        possible; the typical value is 3145728 (3MB).</p>
      </section>
@@ -951,7 +1217,7 @@
      and starts the <code>TaskTracker</code> daemon on all the listed slaves.
      </p>
    </section>
-    
+
    <section>
      <title>Hadoop Shutdown</title>
      

+ 772 - 0
src/docs/src/documentation/content/xdocs/commands_manual.xml

@@ -0,0 +1,772 @@
+<?xml version="1.0"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd">
+<document>
+	<header>
+		<title>Hadoop Commands Guide</title>
+	</header>
+	
+	<body>
+		<section>
+			<title>Overview</title>
+			<p>
+				All Hadoop commands are invoked by the bin/hadoop script. Running the Hadoop
+				script without any arguments prints the description for all commands.
+			</p>
+			<p>
+				<code>Usage: hadoop [--config confdir] [COMMAND] [GENERIC_OPTIONS] [COMMAND_OPTIONS]</code>
+			</p>
+			<p>
+				Hadoop has an option parsing framework that parses generic options and runs application classes.
+			</p>
+			<table>
+			          <tr><th> COMMAND_OPTION </th><th> Description </th></tr>
+			
+			           <tr>
+			          	<td><code>--config confdir</code></td>
+			            <td>Overrides the default configuration directory. Default is ${HADOOP_HOME}/conf.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>GENERIC_OPTIONS</code></td>
+			            <td>The common set of options supported by multiple commands.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>COMMAND</code><br/><code>COMMAND_OPTIONS</code></td>
+			            <td>Various commands with their options are described in the following sections. The commands 
+			            have been grouped into <a href="commands_manual.html#User+Commands">User Commands</a> 
+			            and <a href="commands_manual.html#Administration+Commands">Administration Commands</a>.</td>
+			           </tr>
+			     </table>
+			 <section>
+				<title>Generic Options</title>
+				<p>
+				  The following options are supported by <a href="commands_manual.html#dfsadmin">dfsadmin</a>, 
+				  <a href="commands_manual.html#fs">fs</a>, <a href="commands_manual.html#fsck">fsck</a> and 
+				  <a href="commands_manual.html#job">job</a>. 
+				  Applications should implement 
+				  <a href="ext:api/org/apache/hadoop/util/tool">Tool</a> to support
+				  <a href="ext:api/org/apache/hadoop/util/genericoptionsparser">
+				  GenericOptions</a>.
+				</p>
+			     <table>
+			          <tr><th> GENERIC_OPTION </th><th> Description </th></tr>
+			
+			           <tr>
+			          	<td><code>-conf &lt;configuration file&gt;</code></td>
+			            <td>Specify an application configuration file.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-D &lt;property=value&gt;</code></td>
+			            <td>Use value for given property.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-fs &lt;local|namenode:port&gt;</code></td>
+			            <td>Specify a namenode.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-jt &lt;local|jobtracker:port&gt;</code></td>
+			            <td>Specify a job tracker. Applies only to <a href="commands_manual.html#job">job</a>.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-files &lt;comma separated list of files&gt;</code></td>
+			            <td>Specify comma separated files to be copied to the map reduce cluster. 
+			            Applies only to <a href="commands_manual.html#job">job</a>.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-libjars &lt;comma separated list of jars&gt;</code></td>
+			            <td>Specify comma separated jar files to include in the classpath. 
+			            Applies only to <a href="commands_manual.html#job">job</a>.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-archives &lt;comma separated list of archives&gt;</code></td>
+			            <td>Specify comma separated archives to be unarchived on the compute machines. 
+			            Applies only to <a href="commands_manual.html#job">job</a>.</td>
+			           </tr>
+				</table>
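+				<p>
+					For example, generic options can be combined with a command as
+					sketched below (the configuration file name and jobtracker
+					address are illustrative):
+				</p>
+				<p>
+					<code>hadoop job -conf conf/my-cluster.xml -jt myjobtracker:8021 -list</code>
+				</p>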
+			</section>	   
+		</section>
+		
+		<section>
+			<title> User Commands </title>
+			<p>Commands useful for users of a Hadoop cluster.</p>
+			<section>
+				<title> archive </title>
+				<p>
+					Creates a Hadoop archive. For more information, see the <a href="ext:hadoop-archives">Hadoop Archives Guide</a>.
+				</p>
+				<p>
+					<code>Usage: hadoop archive -archiveName NAME &lt;src&gt;* &lt;dest&gt;</code>
+				</p>
+				<table>
+			          <tr><th> COMMAND_OPTION </th><th> Description </th></tr>
+					   <tr>
+			          	<td><code>-archiveName NAME</code></td>
+			            <td>Name of the archive to be created.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>src</code></td>
+			            <td>Filesystem pathnames which work as usual with regular expressions.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>dest</code></td>
+			            <td>Destination directory which would contain the archive.</td>
+			           </tr>
+			     </table>
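+				<p>
+					For example (the paths here are illustrative):
+				</p>
+				<p>
+					<code>hadoop archive -archiveName foo.har /user/hadoop/dir1 /user/hadoop/dir2 /user/outputdir</code>
+				</p>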
+			</section>
+			
+			<section>
+				<title> distcp </title>
+				<p>
+					Copies files or directories recursively. More information can be found at the <a href="ext:distcp">DistCp Guide</a>.
+				</p>
+				<p>
+					<code>Usage: hadoop distcp &lt;srcurl&gt; &lt;desturl&gt;</code>
+				</p>
+				<table>
+			          <tr><th> COMMAND_OPTION </th><th> Description </th></tr>
+			
+			           <tr>
+			          	<td><code>srcurl</code></td>
+			            <td>Source URL</td>
+			           </tr>
+			           <tr>
+			          	<td><code>desturl</code></td>
+			            <td>Destination URL</td>
+			           </tr>
+			     </table>
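+				<p>
+					A sketch of a typical invocation (host and path names are illustrative):
+				</p>
+				<p>
+					<code>hadoop distcp hdfs://nn1:8020/foo/bar hdfs://nn2:8020/bar/foo</code>
+				</p>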
+			</section>
+			       
+			<section>
+				<title> fs </title>
+				<p>
+					Runs a generic filesystem user client.
+				</p>
+				<p>
+					<code>Usage: hadoop fs [</code><a href="commands_manual.html#Generic+Options">GENERIC_OPTIONS</a><code>] 
+					[COMMAND_OPTIONS]</code>
+				</p>
+				<p>
+					The various COMMAND_OPTIONS can be found at 
+					<a href="file_system_shell.html">File System Shell Guide</a>.
+				</p>   
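+				<p>
+					For example (the path is illustrative):
+				</p>
+				<p>
+					<code>hadoop fs -ls /user/hadoop</code>
+				</p>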
+			</section>
+			
+			<section>
+				<title> fsck </title>
+				<p>
+					Runs an HDFS filesystem checking utility. See <a href="http://hadoop.apache.org/hdfs/docs/current/hdfs_user_guide.html#Fsck">Fsck</a> for more info.
+				</p> 
+				<p><code>Usage: hadoop fsck [</code><a href="commands_manual.html#Generic+Options">GENERIC_OPTIONS</a><code>] 
+				&lt;path&gt; [-move | -delete | -openforwrite] [-files [-blocks 
+				[-locations | -racks]]]</code></p>
+				<table>
+			          <tr><th> COMMAND_OPTION </th><th> Description </th></tr>
+			          <tr>
+			            <td><code>&lt;path&gt;</code></td>
+			            <td>Start checking from this path.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-move</code></td>
+			            <td>Move corrupted files to /lost+found</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-delete</code></td>
+			            <td>Delete corrupted files.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-openforwrite</code></td>
+			            <td>Print out files opened for write.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-files</code></td>
+			            <td>Print out files being checked.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-blocks</code></td>
+			            <td>Print out block report.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-locations</code></td>
+			            <td>Print out locations for every block.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-racks</code></td>
+			            <td>Print out network topology for data-node locations.</td>
+			           </tr>
+					</table>
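+				<p>
+					For example, the following command (with a hypothetical path) checks a user's home directory and 
+					prints the files and blocks examined:
+				</p>
+				<p>
+					<code>hadoop fsck /user/alice -files -blocks</code>
+				</p>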
+			</section>
+			
+			<section>
+				<title> jar </title>
+				<p>
+					Runs a jar file. Users can bundle their MapReduce code in a jar file and execute it using this command.
+				</p> 
+				<p>
+					<code>Usage: hadoop jar &lt;jar&gt; [mainClass] args...</code>
+				</p>
+				<p>
+					Streaming jobs are run via this command. For examples, see 
+					<a href="ext:streaming">Hadoop Streaming</a>.
+				</p>
+				<p>
+					The WordCount example is also run using the jar command. For examples, see the
+					<a href="ext:mapred-tutorial">MapReduce Tutorial</a>.
+				</p>
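+				<p>
+					For example, the following command (with a hypothetical jar name and paths) runs the bundled 
+					WordCount example:
+				</p>
+				<p>
+					<code>hadoop jar hadoop-examples.jar wordcount /user/alice/input /user/alice/output</code>
+				</p>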
+			</section>
+			
+			<section>
+				<title> job </title>
+				<p>
+					Command to interact with MapReduce jobs.
+				</p>
+				<p>
+					<code>Usage: hadoop job [</code><a href="commands_manual.html#Generic+Options">GENERIC_OPTIONS</a><code>] 
+					[-submit &lt;job-file&gt;] | [-status &lt;job-id&gt;] | 
+					[-counter &lt;job-id&gt; &lt;group-name&gt; &lt;counter-name&gt;] | [-kill &lt;job-id&gt;] | 
+					[-events &lt;job-id&gt; &lt;from-event-#&gt; &lt;#-of-events&gt;] | [-history [all] &lt;historyFile&gt;] |
+					[-list [all]] | [-kill-task &lt;task-id&gt;] | [-fail-task &lt;task-id&gt;] | 
+          [-set-priority &lt;job-id&gt; &lt;priority&gt;]</code>
+				</p>
+				<table>
+			          <tr><th> COMMAND_OPTION </th><th> Description </th></tr>
+			
+			           <tr>
+			          	<td><code>-submit &lt;job-file&gt;</code></td>
+			            <td>Submits the job.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-status &lt;job-id&gt;</code></td>
+			            <td>Prints the map and reduce completion percentage and all job counters.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-counter &lt;job-id&gt; &lt;group-name&gt; &lt;counter-name&gt;</code></td>
+			            <td>Prints the counter value.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-kill &lt;job-id&gt;</code></td>
+			            <td>Kills the job.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-events &lt;job-id&gt; &lt;from-event-#&gt; &lt;#-of-events&gt;</code></td>
+			            <td>Prints the events' details received by jobtracker for the given range.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-history [all] &lt;historyFile&gt;</code></td>
+			            <td>-history &lt;historyFile&gt; prints job details, failed and killed tip details. More details 
+			            about the job such as successful tasks and task attempts made for each task can be viewed by 
+			            specifying the [all] option. </td>
+			           </tr>
+			           <tr>
+			          	<td><code>-list [all]</code></td>
+			            <td>-list all displays all jobs. -list displays only jobs which are yet to complete.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-kill-task &lt;task-id&gt;</code></td>
+			            <td>Kills the task. Killed tasks are NOT counted against failed attempts.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-fail-task &lt;task-id&gt;</code></td>
+			            <td>Fails the task. Failed tasks are counted against failed attempts.</td>
+			           </tr>
+                 <tr>
+                  <td><code>-set-priority &lt;job-id&gt; &lt;priority&gt;</code></td>
+                  <td>Changes the priority of the job. 
+                  Allowed priority values are VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW</td>
+                 </tr>
+					</table>
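+				<p>
+					For example, the following commands (with a hypothetical job id) print a job's status and then kill it:
+				</p>
+				<p>
+					<code>hadoop job -status job_200912312359_0001</code><br/>
+					<code>hadoop job -kill job_200912312359_0001</code>
+				</p>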
+			</section>
+			
+			<section>
+				<title> pipes </title>
+				<p>
+					Runs a pipes job.
+				</p>
+				<p>
+					<code>Usage: hadoop pipes [-conf &lt;path&gt;] [-jobconf &lt;key=value&gt;, &lt;key=value&gt;, ...] 
+					[-input &lt;path&gt;] [-output &lt;path&gt;] [-jar &lt;jar file&gt;] [-inputformat &lt;class&gt;] 
+					[-map &lt;class&gt;] [-partitioner &lt;class&gt;] [-reduce &lt;class&gt;] [-writer &lt;class&gt;] 
+					[-program &lt;executable&gt;] [-reduces &lt;num&gt;] </code>
+				</p>
+				<table>
+			          <tr><th> COMMAND_OPTION </th><th> Description </th></tr>
+			
+			          <tr>
+			          	<td><code>-conf &lt;path&gt;</code></td>
+			            <td>Configuration for job</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-jobconf &lt;key=value&gt;, &lt;key=value&gt;, ...</code></td>
+			            <td>Add/override configuration for job</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-input &lt;path&gt;</code></td>
+			            <td>Input directory</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-output &lt;path&gt;</code></td>
+			            <td>Output directory</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-jar &lt;jar file&gt;</code></td>
+			            <td>Jar filename</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-inputformat &lt;class&gt;</code></td>
+			            <td>InputFormat class</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-map &lt;class&gt;</code></td>
+			            <td>Java Map class</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-partitioner &lt;class&gt;</code></td>
+			            <td>Java Partitioner</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-reduce &lt;class&gt;</code></td>
+			            <td>Java Reduce class</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-writer &lt;class&gt;</code></td>
+			            <td>Java RecordWriter</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-program &lt;executable&gt;</code></td>
+			            <td>Executable URI</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-reduces &lt;num&gt;</code></td>
+			            <td>Number of reduces</td>
+			           </tr>
+					</table>
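+				<p>
+					A minimal invocation (with hypothetical paths; real jobs typically require additional configuration, 
+					such as record reader and writer settings) might look like:
+				</p>
+				<p>
+					<code>hadoop pipes -input /user/alice/input -output /user/alice/output -program bin/cpp-wordcount</code>
+				</p>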
+			</section>
+      <section>
+        <title> queue </title>
+        <p>
+          Command to interact with and view job queue information.
+        </p>
+        <p>
+          <code>Usage: hadoop queue [-list] | [-info &lt;job-queue-name&gt; [-showJobs]] | [-showacls]</code>
+        </p>
+        <table>
+        <tr>
+          <th> COMMAND_OPTION </th><th> Description </th>
+        </tr>
+        <tr>
+          <td><anchor id="QueuesList"/><code>-list</code> </td>
+          <td>Gets the list of job queues configured in the system, along with the scheduling information
+          associated with each queue.
+          </td>
+        </tr>
+        <tr>
+          <td><anchor id="QueuesInfo"/><code>-info &lt;job-queue-name&gt; [-showJobs]</code></td>
+          <td>
+           Displays the job queue information and associated scheduling information of a particular
+           job queue. If the -showJobs option is present, a list of jobs submitted to that job
+           queue is displayed. 
+          </td>
+        </tr>
+        <tr>
+          <td><code>-showacls</code></td>
+          <td>Displays the queue name and associated queue operations allowed for the current user.
+          The list consists of only those queues to which the user has access.
+          </td>
+          </tr>
+        </table>
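+        <p>
+          For example, the following command (assuming a queue named <code>default</code> exists) displays that 
+          queue's scheduling information and the jobs submitted to it:
+        </p>
+        <p>
+          <code>hadoop queue -info default -showJobs</code>
+        </p>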
+      </section>  	
+			<section>
+				<title> version </title>
+				<p>
+					Prints the version.
+				</p> 
+				<p>
+					<code>Usage: hadoop version</code>
+				</p>
+			</section>
+			<section>
+				<title> CLASSNAME </title>
+				<p>
+					 The hadoop script can be used to invoke any class.
+				</p>
+				<p>
+					 Runs the class named CLASSNAME.
+				</p>
+
+				<p>
+					<code>Usage: hadoop CLASSNAME</code>
+				</p>
+
+			</section>
+    </section>
+		<section>
+			<title> Administration Commands </title>
+			<p>Commands useful for administrators of a Hadoop cluster.</p>
+			<section>
+				<title> balancer </title>
+				<p>
+					Runs a cluster balancing utility. An administrator can simply press Ctrl-C to stop the 
+					rebalancing process. For more details see 
+					<a href="http://hadoop.apache.org/hdfs/docs/current/hdfs_user_guide.html#Rebalancer">Rebalancer</a>.
+				</p>
+				<p>
+					<code>Usage: hadoop balancer [-threshold &lt;threshold&gt;]</code>
+				</p>
+				<table>
+			          <tr><th> COMMAND_OPTION </th><th> Description </th></tr>
+			
+			           <tr>
+			          	<td><code>-threshold &lt;threshold&gt;</code></td>
+			            <td>Percentage of disk capacity. This overwrites the default threshold.</td>
+			           </tr>
+			     </table>
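+				<p>
+					For example, the following command runs the balancer until every datanode's utilization is within 
+					5% of the cluster average:
+				</p>
+				<p>
+					<code>hadoop balancer -threshold 5</code>
+				</p>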
+			</section>
+			
+			<section>
+				<title> daemonlog </title>
+				<p>
+					 Get/Set the log level for each daemon.
+				</p> 
+				<p>
+					<code>Usage: hadoop daemonlog  -getlevel &lt;host:port&gt; &lt;name&gt;</code><br/>
+					<code>Usage: hadoop daemonlog  -setlevel &lt;host:port&gt; &lt;name&gt; &lt;level&gt;</code>
+				</p>
+				<table>
+			          <tr><th> COMMAND_OPTION </th><th> Description </th></tr>
+			
+			           <tr>
+			          	<td><code>-getlevel &lt;host:port&gt; &lt;name&gt;</code></td>
+			            <td>Prints the log level of the daemon running at &lt;host:port&gt;. 
+			            This command internally connects to http://&lt;host:port&gt;/logLevel?log=&lt;name&gt;</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-setlevel &lt;host:port&gt; &lt;name&gt; &lt;level&gt;</code></td>
+			            <td>Sets the log level of the daemon running at &lt;host:port&gt;. 
+			            This command internally connects to http://&lt;host:port&gt;/logLevel?log=&lt;name&gt;</td>
+			           </tr>
+			     </table>
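+				<p>
+					For example, the following command (with a hypothetical host, the namenode's default HTTP port, and 
+					the namenode's logger name) queries the namenode's log level:
+				</p>
+				<p>
+					<code>hadoop daemonlog -getlevel nn.example.com:50070 org.apache.hadoop.hdfs.server.namenode.NameNode</code>
+				</p>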
+			</section>
+			
+			<section>
+				<title> datanode</title>
+				<p>
+					Runs an HDFS datanode.
+				</p> 
+				<p>
+					<code>Usage: hadoop datanode [-rollback]</code>
+				</p>
+				<table>
+			          <tr><th> COMMAND_OPTION </th><th> Description </th></tr>
+			
+			           <tr>
+			          	<td><code>-rollback</code></td>
+			            <td>Rolls back the datanode to the previous version. This should be used after stopping the datanode 
+			            and distributing the old Hadoop version.</td>
+			           </tr>
+			     </table>
+			</section>
+			
+			<section>
+				<title> dfsadmin </title>
+				<p>
+					Runs an HDFS dfsadmin client.
+				</p> 
+				<p>
+					<code>Usage: hadoop dfsadmin  [</code><a href="commands_manual.html#Generic+Options">GENERIC_OPTIONS</a><code>] [-report] [-safemode enter | leave | get | wait] [-refreshNodes]
+					 [-finalizeUpgrade] [-upgradeProgress status | details | force] [-metasave filename] 
+					 [-setQuota &lt;quota&gt; &lt;dirname&gt;...&lt;dirname&gt;] [-clrQuota &lt;dirname&gt;...&lt;dirname&gt;] 
+					 [-restoreFailedStorage true|false|check] [-printTopology] 
+					 [-help [cmd]]</code>
+				</p>
+				<table>
+			          <tr><th> COMMAND_OPTION </th><th> Description </th></tr>
+			
+			           <tr>
+			          	<td><code>-report</code></td>
+			            <td>Reports basic filesystem information and statistics.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-safemode enter | leave | get | wait</code></td>
+			            <td>Safe mode maintenance command.
+                Safe mode is a Namenode state in which it <br/>
+                        1.  does not accept changes to the name space (read-only) <br/> 
+                        2.  does not replicate or delete blocks. <br/>
+                Safe mode is entered automatically at Namenode startup, and is
+                left automatically when the configured minimum
+                percentage of blocks satisfies the minimum replication
+                condition.  Safe mode can also be entered manually, but then
+                it can only be turned off manually as well.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-refreshNodes</code></td>
+			            <td>Re-read the hosts and exclude files to update the set
+                of Datanodes that are allowed to connect to the Namenode
+                and those that should be decommissioned or recommissioned.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-finalizeUpgrade</code></td>
+			            <td>Finalize upgrade of HDFS.
+                Datanodes delete their previous version working directories,
+                followed by Namenode doing the same.
+                This completes the upgrade process.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-printTopology</code></td>
+			            <td>Print a tree of the rack/datanode topology of the
+                 cluster as seen by the NameNode.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-upgradeProgress status | details | force</code></td>
+			            <td>Request current distributed upgrade status,
+                a detailed status or force the upgrade to proceed.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-metasave filename</code></td>
+			            <td>Save Namenode's primary data structures
+                to &lt;filename&gt; in the directory specified by hadoop.log.dir property.
+                &lt;filename&gt; will contain one line for each of the following <br/>
+                        1. Datanodes heart beating with Namenode<br/>
+                        2. Blocks waiting to be replicated<br/>
+                        3. Blocks currently being replicated<br/>
+                        4. Blocks waiting to be deleted</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-setQuota &lt;quota&gt; &lt;dirname&gt;...&lt;dirname&gt;</code></td>
+			            <td>Set the quota &lt;quota&gt; for each directory &lt;dirname&gt;.
+                The directory quota is a long integer that puts a hard limit on the number of names in the directory tree.<br/>
+                Best effort for the directory, with faults reported if<br/>
+                1. the quota is not a positive integer, or<br/>
+                2. user is not an administrator, or<br/>
+                3. the directory does not exist or is a file, or<br/>
+                4. the directory would immediately exceed the new quota.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-clrQuota &lt;dirname&gt;...&lt;dirname&gt;</code></td>
+			            <td>Clear the quota for each directory &lt;dirname&gt;.<br/>
+                Best effort for the directory, with faults reported if<br/>
+                1. the directory does not exist or is a file, or<br/>
+                2. user is not an administrator.<br/>
+                It does not fault if the directory has no quota.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-restoreFailedStorage true | false | check</code></td>
+			            <td>Turns automatic attempts to restore failed storage replicas on or off. 
+			            If a failed storage volume becomes available again, the system will attempt to restore 
+			            edits and/or the fsimage during a checkpoint. The 'check' option returns the current setting.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-help [cmd]</code></td>
+			            <td> Displays help for the given command or all commands if none
+                is specified.</td>
+			           </tr>
+			     </table>
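+				<p>
+					For example, the following commands report basic filesystem statistics and query the current 
+					safe mode state:
+				</p>
+				<p>
+					<code>hadoop dfsadmin -report</code><br/>
+					<code>hadoop dfsadmin -safemode get</code>
+				</p>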
+			</section>
+			<section>
+        <title>mradmin</title>
+        <p>Runs the MapReduce admin client.</p>
+        <p><code>Usage: hadoop mradmin  [</code>
+        <a href="commands_manual.html#Generic+Options">GENERIC_OPTIONS</a>
+        <code>] [-refreshServiceAcl] [-refreshQueues] [-refreshNodes] [-help [cmd]] </code></p>
+        <table>
+        <tr>
+        <th> COMMAND_OPTION </th><th> Description </th>
+        </tr>
+        <tr>
+        <td><code>-refreshServiceAcl</code></td>
+        <td> Reload the service-level authorization policies. Jobtracker
+         will reload the authorization policy file.</td>
+        </tr>
+        <tr>
+        <td><anchor id="RefreshQueues"/><code>-refreshQueues</code></td>
+        <td><p> Reload the queues' configuration at the JobTracker.
+          Most of the configuration of the queues can be refreshed/reloaded
+          without restarting the Map/Reduce sub-system. Administrators
+          typically own the
+          <a href="cluster_setup.html#mapred-queues.xml">
+          <em>conf/mapred-queues.xml</em></a>
+          file, can edit it while the JobTracker is still running, and can do
+          a reload by running this command.</p>
+          <p>It should be noted that while trying to refresh queues'
+          configuration, one cannot change the hierarchy of queues itself.
+          This means no operation that involves a change in either the
+          hierarchy structure itself or the queues' names will be allowed.
+          Only selected properties of queues can be changed during refresh.
+          For example, new queues cannot be added dynamically, neither can an
+          existing queue be deleted.</p>
+          <p>If a syntactic or semantic error is introduced while editing the
+          configuration file, the refresh command fails with an exception that
+          is printed on its standard output, informing the
+          requester of what went wrong during
+          the edit/reload. Importantly, the existing queue configuration is
+          untouched and the system is left in a consistent state.
+          </p>
+          <p>As described in the
+          <a href="cluster_setup.html#mapred-queues.xml"><em>
+          conf/mapred-queues.xml</em></a> section, the
+          <a href="cluster_setup.html#properties_tag"><em>
+          &lt;properties&gt;</em></a> tag in the queue configuration file can
+          also be used to specify per-queue properties needed by the scheduler.
+           When the framework's queue configuration is reloaded using this
+          command, this scheduler-specific configuration will also be
+          reloaded, provided the scheduler being configured supports this reload.
+          Please see the documentation of the particular scheduler in use.</p>
+          </td>
+        </tr>
+        <tr>
+        <td><code>-refreshNodes</code></td>
+        <td> Refresh the hosts information at the jobtracker.</td>
+        </tr>
+        <tr>
+        <td><code>-help [cmd]</code></td>
+        <td>Displays help for the given command or all commands if none
+                is specified.</td>
+        </tr>
+        </table>
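+        <p>
+          For example, the following command reloads the queue configuration at the JobTracker:
+        </p>
+        <p>
+          <code>hadoop mradmin -refreshQueues</code>
+        </p>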
+      </section>
+			<section>
+				<title> jobtracker </title>
+				<p>
+					Runs the MapReduce JobTracker node.
+				</p> 
+				<p>
+					<code>Usage: hadoop jobtracker [-dumpConfiguration]</code>
+					</p>
+          <table>
+          <tr>
+          <th>COMMAND_OPTION</th><th> Description</th>
+          </tr>
+          <tr>
+          <td><code>-dumpConfiguration</code></td>
+          <td> Dumps the configuration used by the JobTracker, along with the queue
+          configuration, in JSON format to standard output, 
+          and then exits.</td>
+          </tr>
+          </table>
+				
+			</section>
+			
+			<section>
+				<title> namenode </title>
+				<p>
+					Runs the namenode. For more information about upgrade, rollback and finalize see 
+					<a href="http://hadoop.apache.org/hdfs/docs/current/hdfs_user_guide.html#Upgrade+and+Rollback">Upgrade and Rollback</a>.
+				</p>
+				<p>
+					<code>Usage: hadoop namenode [-regular] | [-format] | [-upgrade] | [-rollback] | [-finalize] | [-importCheckpoint] | [-checkpoint] | [-backup]</code>
+				</p>
+				<table>
+			          <tr><th> COMMAND_OPTION </th><th> Description </th></tr>
+			
+                <tr>
+                  <td><code>-regular</code></td>
+                  <td>Start namenode in standard, active role rather than as backup or checkpoint node. This is the default role.</td>
+                </tr>
+                <tr>
+                  <td><code>-checkpoint</code></td>
+                  <td>Start namenode in checkpoint role, creating periodic checkpoints of the active namenode metadata 
+                  (see <a href="http://hadoop.apache.org/hdfs/docs/current/hdfs_user_guide.html#Checkpoint+Node">Checkpoint Node</a>).</td>
+                </tr>
+                <tr>
+                  <td><code>-backup</code></td>
+                  <td>Start namenode in backup role, maintaining an up-to-date in-memory copy of the namespace and creating periodic checkpoints 
+                  (see <a href="http://hadoop.apache.org/hdfs/docs/current/hdfs_user_guide.html#Backup+Node">Backup Node</a>).</td>
+                </tr>
+			           <tr>
+			          	<td><code>-format</code></td>
+			            <td>Formats the namenode. It starts the namenode, formats it and then shuts it down.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-upgrade</code></td>
+			            <td>The namenode should be started with the upgrade option after a new Hadoop version has been distributed.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-rollback</code></td>
+			            <td>Rolls back the namenode to the previous version. This should be used after stopping the cluster 
+			            and distributing the old Hadoop version.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-finalize</code></td>
+			            <td>Finalize removes the previous state of the file system, making the most recent upgrade permanent. 
+			            The rollback option will no longer be available. After finalization it shuts the namenode down.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-importCheckpoint</code></td>
+			            <td>Loads image from a checkpoint directory and saves it into the current one. Checkpoint directory 
+			            is read from property fs.checkpoint.dir
+			            (see <a href="http://hadoop.apache.org/hdfs/docs/current/hdfs_user_guide.html#Import+checkpoint">Import Checkpoint</a>).
+			            </td>
+			           </tr>
+			     </table>
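+				<p>
+					For example, a new HDFS installation is typically initialized by formatting the namenode before 
+					starting the cluster:
+				</p>
+				<p>
+					<code>hadoop namenode -format</code>
+				</p>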
+			</section>
+			
+			<section>
+				<title> secondarynamenode </title>
+				<note>
+					The Secondary NameNode has been deprecated. Instead, consider using the
+					<a href="http://hadoop.apache.org/hdfs/docs/current/hdfs_user_guide.html#Checkpoint+Node">Checkpoint Node</a> or 
+					<a href="http://hadoop.apache.org/hdfs/docs/current/hdfs_user_guide.html#Backup+Node">Backup Node</a>. 
+				</note>
+				<p>	
+					Runs the HDFS secondary 
+					namenode. See <a href="http://hadoop.apache.org/hdfs/docs/current/hdfs_user_guide.html#Secondary+NameNode">Secondary NameNode</a> 
+					for more info.
+				</p>
+				<p>
+					<code>Usage: hadoop secondarynamenode [-checkpoint [force]] | [-geteditsize]</code>
+				</p>
+				<table>
+			          <tr><th> COMMAND_OPTION </th><th> Description </th></tr>
+			
+			           <tr>
+			          	<td><code>-checkpoint [force]</code></td>
+			            <td>Checkpoints the secondary namenode if the EditLog size is >= fs.checkpoint.size. 
+			            If -force is used, a checkpoint is performed irrespective of the EditLog size.</td>
+			           </tr>
+			           <tr>
+			          	<td><code>-geteditsize</code></td>
+			            <td>Prints the EditLog size.</td>
+			           </tr>
+			     </table>
+			</section>
+			
+			<section>
+				<title> tasktracker </title>
+				<p>
+					Runs a MapReduce TaskTracker node.
+				</p> 
+				<p>
+					<code>Usage: hadoop tasktracker</code>
+				</p>
+			</section>
+			
+		</section>
+		
+		
+		      
+
+	</body>
+</document>      

+ 1445 - 0
src/docs/src/documentation/content/xdocs/hod_scheduler.xml

@@ -0,0 +1,1445 @@
+<?xml version="1.0"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN"
+          "http://forrest.apache.org/dtd/document-v20.dtd">
+<document>
+  <header>
+    <title>
+      HOD Scheduler
+    </title>
+  </header>
+
+<!-- HOD USERS -->
+
+<body>
+
+<section>
+<title>Introduction</title>
+<p>Hadoop On Demand (HOD) is a system for provisioning and managing independent Hadoop MapReduce and 
+Hadoop Distributed File System (HDFS) instances on a shared cluster of nodes. HOD is a tool that makes it easy 
+for administrators and users to quickly set up and use Hadoop. HOD is also a very useful tool for Hadoop developers 
+and testers who need to share a physical cluster for testing their own Hadoop versions. </p>
+
+<p>HOD uses the Torque resource manager to do node allocation. On the allocated nodes, it can start Hadoop 
+MapReduce and HDFS daemons. It automatically generates the appropriate configuration files (hadoop-site.xml) 
+for the Hadoop daemons and client. HOD also has the capability to distribute Hadoop to the nodes in the virtual 
+cluster that it allocates. HOD supports Hadoop from version 0.15 onwards.</p>
+</section>
+
+  <section>
+    <title>HOD Users</title>
+      <p>This section shows users how to get started using HOD, reviews various HOD features and command line options, 
+  and provides detailed troubleshooting help.</p>
+
+  <section>
+		<title> Getting Started</title><anchor id="Getting_Started_Using_HOD_0_4"></anchor>
+  <p>In this section, we shall see a step-by-step introduction to using HOD for the most basic operations. Before 
+  following these steps, it is assumed that HOD and its dependent hardware and software components are set up and 
+  configured correctly. This is a step that is generally performed by system administrators of the cluster.</p>
+  
+  <p>The HOD user interface is a command line utility called <code>hod</code>. It is driven by a configuration file 
+  that is typically set up for users by system administrators. Users can override this configuration when using 
+  <code>hod</code>, as described later in this documentation. The configuration file can be specified in 
+  two ways when using <code>hod</code>, as described below: </p>
+  <ul>
+    <li> Specify it on the command line, using the -c option, for example: 
+    <code>hod &lt;operation&gt; &lt;required-args&gt; -c path-to-the-configuration-file [other-options]</code></li>
+    <li> Set the <em>HOD_CONF_DIR</em> environment variable in the environment where <code>hod</code> will be run. 
+    This should be pointed to a directory on the local file system, containing a file called <em>hodrc</em>. 
+    Note that this is analogous to the <em>HADOOP_CONF_DIR</em> and <em>hadoop-site.xml</em> file for Hadoop. 
+    If no configuration file is specified on the command line, <code>hod</code> shall look for the <em>HOD_CONF_DIR</em> 
+    environment variable and a <em>hodrc</em> file under that.</li>
+    </ul>
+  <p>In examples listed below, we shall not explicitly point to the configuration option, assuming it is correctly specified.</p>
+  
+  <section><title>A Typical HOD Session</title><anchor id="HOD_Session"></anchor>
+  <p>A typical session of HOD will involve at least three steps: allocate, run hadoop jobs, deallocate. In order to do this, 
+  perform the following steps.</p>
+  
+  <p><strong> Create a Cluster Directory </strong></p><anchor id="Create_a_Cluster_Directory"></anchor>
+  
+  <p>The <em>cluster directory</em> is a directory on the local file system where <code>hod</code> will generate the 
+  Hadoop configuration, <em>hadoop-site.xml</em>, corresponding to the cluster it allocates. Pass this directory to the 
+  <code>hod</code> operations as stated below. If the cluster directory passed doesn't already exist, HOD will automatically 
+  try to create it and use it. Once a cluster is allocated, a user can utilize it to run Hadoop jobs by specifying the cluster 
+  directory as the Hadoop --config option. </p>
+  
+  <p><strong>Operation allocate</strong></p><anchor id="Operation_allocate"></anchor>
+  
+  <p>The <em>allocate</em> operation is used to allocate a set of nodes and install and provision Hadoop on them. 
+  It has the following syntax. Note that it requires a cluster_dir (-d, --hod.clusterdir) and the number of nodes 
+  (-n, --hod.nodecount) needed to be allocated:</p>
+    
+      <source>$ hod allocate -d cluster_dir -n number_of_nodes [OPTIONS]</source>    
+    
+  <p>If the command completes successfully, then <code>cluster_dir/hadoop-site.xml</code> will be generated and 
+  will contain information about the allocated cluster. It will also print out the information about the Hadoop web UIs.</p>
+  
+  <p>An example run of this command produces the following output. Note in this example that <code>~/hod-clusters/test</code> 
+  is the cluster directory, and we are allocating 5 nodes:</p>
+   
+<source>
+$ hod allocate -d ~/hod-clusters/test -n 5 
+INFO - HDFS UI on http://foo1.bar.com:53422 
+INFO - Mapred UI on http://foo2.bar.com:55380</source>   
+   
+  <p><strong> Running Hadoop jobs using the allocated cluster </strong></p><anchor id="Running_Hadoop_jobs_using_the_al"></anchor>
+  
+  <p>Now, one can run Hadoop jobs using the allocated cluster in the usual manner. This assumes variables like <em>JAVA_HOME</em> 
+  and the path to the Hadoop installation are set up correctly:</p>
+
+  <source>$ hadoop --config cluster_dir hadoop_command hadoop_command_args</source>
+  <p>or</p>
+
+     <source>
+$ export HADOOP_CONF_DIR=cluster_dir
+$ hadoop hadoop_command hadoop_command_args</source>
+
+  <p>Continuing our example, the following command will run a wordcount example on the allocated cluster:</p>
+ <source>$ hadoop --config ~/hod-clusters/test jar /path/to/hadoop/hadoop-examples.jar wordcount /path/to/input /path/to/output</source>
+ 
+  <p>or</p>
+  
+   <source>
+$ export HADOOP_CONF_DIR=~/hod-clusters/test
+$ hadoop jar /path/to/hadoop/hadoop-examples.jar wordcount /path/to/input /path/to/output</source>
+   
+  <p><strong> Operation deallocate</strong></p><anchor id="Operation_deallocate"></anchor>
+  <p>The <em>deallocate</em> operation is used to release an allocated cluster. When finished with a cluster, deallocate must be 
+  run so that the nodes become free for others to use. The <em>deallocate</em> operation has the following syntax. Note that it 
+  requires the cluster_dir (-d, --hod.clusterdir) argument:</p>
+     <source>$ hod deallocate -d cluster_dir</source>
+     
+  <p>Continuing our example, the following command will deallocate the cluster:</p>
+   <source>$ hod deallocate -d ~/hod-clusters/test</source>
+   
+  <p>As can be seen, HOD allows users to allocate a cluster and use it flexibly for running Hadoop jobs. For example, users 
+  can run multiple jobs in parallel on the same cluster, by running hadoop from multiple shells pointing to the same configuration.</p>
+	</section>
+	
+  <section><title>Running Hadoop Scripts Using HOD</title><anchor id="HOD_Script_Mode"></anchor>
+  <p>The HOD <em>script operation</em> combines the operations of allocating, using and deallocating a cluster into a single operation. 
+  This is very useful for users who want to run a script of hadoop jobs and let HOD handle the cleanup automatically once the script completes. 
+  In order to run hadoop scripts using <code>hod</code>, do the following:</p>
+  
+  <p><strong> Create a script file </strong></p><anchor id="Create_a_script_file"></anchor>
+  
+  <p>This will be a regular shell script that will typically contain hadoop commands, such as:</p>
+
+  <source>$ hadoop jar jar_file options</source>
+  
+  <p>However, the user can add any valid commands as part of the script. HOD will execute this script setting <em>HADOOP_CONF_DIR</em> 
+  automatically to point to the allocated cluster, so users do not need to worry about this. Users, however, need to specify a cluster directory 
+  just like when using the allocate operation.</p>
+  <p><strong> Running the script </strong></p><anchor id="Running_the_script"></anchor>
+  <p>The syntax for the <em>script operation</em> is as follows. Note that it requires a cluster directory (-d, --hod.clusterdir), number of 
+  nodes (-n, --hod.nodecount) and a script file (-s, --hod.script):</p>
+
+     <source>$ hod script -d cluster_directory -n number_of_nodes -s script_file</source>
+  <p>Note that HOD will deallocate the cluster as soon as the script completes, and this means that the script must not complete until the 
+  hadoop jobs themselves are completed. Users must take care of this while writing the script. </p>
+   </section>
+  </section>
+  <section>
+		<title> HOD Features </title><anchor id="HOD_0_4_Features"></anchor>
+  <section><title> Provisioning and Managing Hadoop Clusters </title><anchor id="Provisioning_and_Managing_Hadoop"></anchor>
+  <p>The primary feature of HOD is to provision Hadoop MapReduce and HDFS clusters. This is described above in the Getting Started section. 
+  Also, as long as nodes are available, and organizational policies allow, a user can use HOD to allocate multiple MapReduce clusters simultaneously. 
+  The user would need to specify different paths for the <code>cluster_dir</code> parameter mentioned above for each cluster he/she allocates. 
+  HOD provides the <em>list</em> and the <em>info</em> operations to enable managing multiple clusters.</p>
+  
+  <p><strong> Operation list</strong></p><anchor id="Operation_list"></anchor>
+  
+  <p>The list operation lists all the clusters allocated so far by a user. The cluster directory where the hadoop-site.xml is stored for the cluster, 
+  and its status vis-a-vis connectivity with the JobTracker and/or HDFS is shown. The list operation has the following syntax:</p>
+
+     <source>$ hod list</source>
+     
+  <p><strong> Operation info</strong></p><anchor id="Operation_info"></anchor>
+  <p>The info operation shows information about a given cluster. The information shown includes the Torque job id, and locations of the important 
+  daemons like the HOD Ringmaster process, and the Hadoop JobTracker and NameNode daemons. The info operation has the following syntax. 
+  Note that it requires a cluster directory (-d, --hod.clusterdir):</p>
+
+     <source>$ hod info -d cluster_dir</source>
+     
+  <p>The <code>cluster_dir</code> should be a valid cluster directory specified in an earlier <em>allocate</em> operation.</p>
+  </section>
+  
+  <section><title> Using a Tarball to Distribute Hadoop </title><anchor id="Using_a_tarball_to_distribute_Ha"></anchor>
+  <p>When provisioning Hadoop, HOD can use either a pre-installed Hadoop on the cluster nodes or distribute and install a Hadoop tarball as part 
+  of the provisioning operation. If the tarball option is used, there is no need to have Hadoop pre-installed on the cluster nodes, nor is any 
+  pre-installed version used. This is especially useful in a development / QE environment where individual developers may have different versions of 
+  Hadoop to test on a shared cluster. </p>
+  
+  <p>In order to use a pre-installed Hadoop, you must specify, in the hodrc, the <code>pkgs</code> option in the <code>gridservice-hdfs</code> 
+  and <code>gridservice-mapred</code> sections. This must point to the path where Hadoop is installed on all nodes of the cluster.</p>
+  
+  <p>The syntax for specifying a tarball is as follows:</p>
+  
+ <source>$ hod allocate -d cluster_dir -n number_of_nodes -t hadoop_tarball_location</source>    
+    
+  <p>For example, the following command allocates Hadoop provided by the tarball <code>~/share/hadoop.tar.gz</code>:</p>
+  <source>$ hod allocate -d ~/hadoop-cluster -n 10 -t ~/share/hadoop.tar.gz</source>
+  
+  <p>Similarly, when using hod script, the syntax is as follows:</p>
+    <source>$ hod script -d cluster_directory -s script_file -n number_of_nodes -t hadoop_tarball_location</source> 
+   
+  <p>The hadoop_tarball specified in the syntax above should point to a path on a shared file system that is accessible from all the compute nodes. 
+  Currently, HOD only supports NFS mounted file systems.</p>
+  <p><em>Note:</em></p>
+  <ul>
+    <li> For better distribution performance it is recommended that the Hadoop tarball contain only the libraries and binaries, and not the source or documentation.</li>
+    
+    <li> When you want to run jobs against a cluster allocated using the tarball, you must use a compatible version of hadoop to submit your jobs. 
+    The best would be to untar and use the version that is present in the tarball itself.</li>
+    <li> You need to make sure that there are no Hadoop configuration files, hadoop-env.sh and hadoop-site.xml, present in the conf directory of the
+     tarred distribution. The presence of these files with incorrect values could cause the cluster allocation to fail.</li>
+  </ul>
+  </section>
+  
+  <section><title> Using an External HDFS </title><anchor id="Using_an_external_HDFS"></anchor>
+  <p>In typical Hadoop clusters provisioned by HOD, HDFS is already set up statically (without using HOD). This allows data to persist in HDFS after 
+  the HOD-provisioned clusters are deallocated. To use a statically configured HDFS, your hodrc must point to an external HDFS. Specifically, set the 
+  following options to the correct values in the section <code>gridservice-hdfs</code> of the hodrc:</p>
+  
+  <source>
+external = true
+host = Hostname of the HDFS NameNode
+fs_port = Port number of the HDFS NameNode
+info_port = Port number of the HDFS NameNode web UI
+</source>
+  
+  <p><em>Note:</em> You can also enable this option from the command line. That is, to use a static HDFS, use: <br />
+    </p>
+     <source>$ hod allocate -d cluster_dir -n number_of_nodes --gridservice-hdfs.external</source>
+     
+  <p>HOD can be used to provision an HDFS cluster as well as a MapReduce cluster, if required. To do so, set the following option in the section 
+  <code>gridservice-hdfs</code> of the hodrc:</p>
+  <source>external = false</source>
+  </section>
+  
+  <section><title> Options for Configuring Hadoop </title><anchor id="Options_for_Configuring_Hadoop"></anchor>
+  <p>HOD provides a very convenient mechanism to configure both the Hadoop daemons that it provisions and also the hadoop-site.xml that 
+  it generates on the client side. This is done by specifying Hadoop configuration parameters in either the HOD configuration file, or from the 
+  command line when allocating clusters.</p>
+  
+  <p><strong> Configuring Hadoop Daemons </strong></p><anchor id="Configuring_Hadoop_Daemons"></anchor>
+  
+  <p>For configuring the Hadoop daemons, you can do the following:</p>
+  
+  <p>For MapReduce, specify the options as a comma separated list of key-value pairs to the <code>server-params</code> option in the 
+  <code>gridservice-mapred</code> section. Likewise for a dynamically provisioned HDFS cluster, specify the options in the 
+  <code>server-params</code> option in the <code>gridservice-hdfs</code> section. If these parameters should be marked as 
+  <em>final</em>, then include these in the <code>final-server-params</code> option of the appropriate section.</p>
+  <p>For example:</p>
+<source>
+server-params = mapred.reduce.parallel.copies=20,io.sort.factor=100,io.sort.mb=128,io.file.buffer.size=131072
+final-server-params = mapred.child.java.opts=-Xmx512m,dfs.block.size=134217728,fs.inmemory.size.mb=128   
+</source>
+  <p>In order to provide the options from command line, you can use the following syntax:</p>
+  <p>For configuring the MapReduce daemons use:</p>
+
+    <source>$ hod allocate -d cluster_dir -n number_of_nodes -Mmapred.reduce.parallel.copies=20 -Mio.sort.factor=100</source>
+    
+  <p>In the example above, the <em>mapred.reduce.parallel.copies</em> parameter and the <em>io.sort.factor</em> 
+  parameter will be appended to the other <code>server-params</code> or if they already exist in <code>server-params</code>, 
+  will override them. In order to specify these are <em>final</em> parameters, you can use:</p>
+
+    <source>$ hod allocate -d cluster_dir -n number_of_nodes -Fmapred.reduce.parallel.copies=20 -Fio.sort.factor=100</source>
+    
+  <p>However, note that final parameters cannot be overwritten from command line. They can only be appended if not already specified.</p>
+  
+  <p>Similar options exist for configuring dynamically provisioned HDFS daemons. For doing so, replace -M with -H and -F with -S.</p>
+  
+  <p><strong> Configuring Hadoop Job Submission (Client) Programs </strong></p><anchor id="Configuring_Hadoop_Job_Submissio"></anchor>
+  
+  <p>As mentioned above, if the allocation operation completes successfully then <code>cluster_dir/hadoop-site.xml</code> will be generated 
+  and will contain information about the allocated cluster's JobTracker and NameNode. This configuration is used when submitting jobs to the cluster. 
+  HOD provides an option to include additional Hadoop configuration parameters into this file. The syntax for doing so is as follows:</p>
+  
+    <source>$ hod allocate -d cluster_dir -n number_of_nodes -Cmapred.userlog.limit.kb=200 -Cmapred.child.java.opts=-Xmx512m</source>
+    
+  <p>In this example, the <em>mapred.userlog.limit.kb</em> and <em>mapred.child.java.opts</em> options will be included into 
+  the hadoop-site.xml that is generated by HOD.</p>
+  </section>
+  
+  <section><title> Viewing Hadoop Web-UIs </title><anchor id="Viewing_Hadoop_Web_UIs"></anchor>
+  <p>The HOD allocation operation prints the JobTracker and NameNode web UI URLs. For example:</p>
+
+<source>
+$ hod allocate -d ~/hadoop-cluster -n 10 -c ~/hod-conf-dir/hodrc
+INFO - HDFS UI on http://host242.foo.com:55391
+INFO - Mapred UI on http://host521.foo.com:54874
+</source>    
+    
+  <p>The same information is also available via the <em>info</em> operation described above.</p>
+  </section>
+  
+  <section><title> Collecting and Viewing Hadoop Logs </title><anchor id="Collecting_and_Viewing_Hadoop_Lo"></anchor>
+  <p>To get the Hadoop logs of the daemons running on one of the allocated nodes: </p>
+  <ul>
+    <li> Log into the node of interest. If you want to look at the logs of the JobTracker or NameNode, then you can find the node running these by 
+    using the <em>list</em> and <em>info</em> operations mentioned above.</li>
+    <li> Get the process information of the daemon of interest (for example, <code>ps ux | grep TaskTracker</code>)</li>
+    <li> In the process information, search for the value of the variable <code>-Dhadoop.log.dir</code>. Typically this will be a descendant directory 
+    of the <code>hodring.temp-dir</code> value from the hod configuration file.</li>
+    <li> Change to the <code>hadoop.log.dir</code> directory to view daemon and user logs.</li>
+  </ul>
+  <p>HOD also provides a mechanism to collect logs when a cluster is being deallocated and persist them into a file system, or an externally 
+  configured HDFS. By doing so, these logs can be viewed after the jobs are completed and the nodes are released. In order to do so, configure 
+  the log-destination-uri to a URI as follows:</p>
+    <source>
+log-destination-uri = hdfs://host123:45678/user/hod/logs
+log-destination-uri = file://path/to/store/log/files</source>
+
+  <p>Under the root directory specified above in the path, HOD will create a path user_name/torque_jobid and store gzipped log files for each 
+  node that was part of the job.</p>
+  <p>Note that to store the files to HDFS, you may need to configure the <code>hodring.pkgs</code> option with the Hadoop version that 
+  matches the HDFS mentioned. If not, HOD will try to use the Hadoop version that it is using to provision the Hadoop cluster itself.</p>
+  </section>
+  
+  <section><title> Auto-deallocation of Idle Clusters </title><anchor id="Auto_deallocation_of_Idle_Cluste"></anchor>
+  <p>HOD automatically deallocates clusters that are not running Hadoop jobs for a given period of time. Each HOD allocation includes a 
+  monitoring facility that constantly checks for running Hadoop jobs. If it detects no running Hadoop jobs for a given period, it will automatically 
+  deallocate its own cluster and thus free up nodes which are not being used effectively.</p>
+  
+  <p><em>Note:</em> While the cluster is deallocated, the <em>cluster directory</em> is not cleaned up automatically. The user must 
+  deallocate this cluster through the regular <em>deallocate</em> operation to clean this up.</p>
+	</section>
+  <section><title> Specifying Additional Job Attributes </title><anchor id="Specifying_Additional_Job_Attrib"></anchor>
+  <p>HOD allows the user to specify a wallclock time and a name (or title) for a Torque job. </p>
+  <p>The wallclock time is the estimated amount of time for which the Torque job will be valid. After this time has expired, Torque will 
+  automatically delete the job and free up the nodes. Specifying the wallclock time can also help the job scheduler to better schedule 
+  jobs, and help improve utilization of cluster resources.</p>
+  <p>To specify the wallclock time, use the following syntax:</p>
+
+<source>$ hod allocate -d cluster_dir -n number_of_nodes -l time_in_seconds</source>    
+  <p>The name or title of a Torque job helps in user-friendly identification of the job. The string specified here will show up in all information 
+  where Torque job attributes are displayed, including the <code>qstat</code> command.</p>
+  <p>To specify the name or title, use the following syntax:</p>
+<source>$ hod allocate -d cluster_dir -n number_of_nodes -N name_of_job</source>   
+ 
+  <p><em>Note:</em> Due to a restriction in the underlying Torque resource manager, names that do not start with an alphabetic character 
+  or that contain a space will cause the job to fail. The failure message points to the problem being in the specified job name.</p>
+  </section>
+  
+  <section><title> Capturing HOD Exit Codes in Torque </title><anchor id="Capturing_HOD_exit_codes_in_Torq"></anchor>
+  <p>HOD exit codes are captured in the Torque exit_status field. This will help users and system administrators to distinguish successful 
+  runs from unsuccessful runs of HOD. The exit codes are 0 if allocation succeeded and all hadoop jobs ran on the allocated cluster correctly. 
+  They are non-zero if allocation failed or some of the hadoop jobs failed on the allocated cluster. The exit codes that are possible are 
+  mentioned in the table below. <em>Note: Hadoop job status is captured only if the version of Hadoop used is 0.16 or above.</em></p>
+  <table>
+    
+      <tr>
+        <th> Exit Code </th>
+        <th> Meaning </th>
+      </tr>
+      <tr>
+        <td> 6 </td>
+        <td> Ringmaster failure </td>
+      </tr>
+      <tr>
+        <td> 7 </td>
+        <td> HDFS failure </td>
+      </tr>
+      <tr>
+        <td> 8 </td>
+        <td> Job tracker failure </td>
+      </tr>
+      <tr>
+        <td> 10 </td>
+        <td> Cluster dead </td>
+      </tr>
+      <tr>
+        <td> 12 </td>
+        <td> Cluster already allocated </td>
+      </tr>
+      <tr>
+        <td> 13 </td>
+        <td> HDFS dead </td>
+      </tr>
+      <tr>
+        <td> 14 </td>
+        <td> Mapred dead </td>
+      </tr>
+      <tr>
+        <td> 16 </td>
+        <td> All MapReduce jobs that ran on the cluster failed. Refer to hadoop logs for more details. </td>
+      </tr>
+      <tr>
+        <td> 17 </td>
+        <td> Some of the MapReduce jobs that ran on the cluster failed. Refer to hadoop logs for more details. </td>
+      </tr>
+    
+  </table>
+  </section>
+  <section>
+    <title> Command Line</title><anchor id="Command_Line"></anchor>
+    <p>HOD command line has the following general syntax:</p>
+    <source>hod &lt;operation&gt; [ARGS] [OPTIONS]</source>
+      
+    <p> Allowed operations are 'allocate', 'deallocate', 'info', 'list', 'script' and 'help'. For help with a particular operation do: </p> 
+    <source>hod help &lt;operation&gt;</source>
+      
+      <p>To have a look at possible options do:</p>
+      <source>hod help options</source>
+      
+      <ul>
+
+      <li><em>allocate</em><br />
+      <em>Usage : hod allocate -d cluster_dir -n number_of_nodes [OPTIONS]</em><br />
+        Allocates a cluster on the given number of cluster nodes, and stores the allocation information in cluster_dir for use with subsequent 
+        <code>hadoop</code> commands. Note that if the <code>cluster_dir</code> does not already exist, HOD will try to create it.</li>
+        
+      <li><em>list</em><br/>
+      <em>Usage : hod list [OPTIONS]</em><br />
+       Lists the clusters allocated by this user. Information provided includes the Torque job id corresponding to the cluster, the cluster 
+       directory where the allocation information is stored, and whether the MapReduce daemon is still active or not.</li>
+       
+      <li><em>info</em><br/>
+      <em>Usage : hod info -d cluster_dir [OPTIONS]</em><br />
+        Lists information about the cluster whose allocation information is stored in the specified cluster directory.</li>
+        
+      <li><em>deallocate</em><br/>
+      <em>Usage : hod deallocate -d cluster_dir [OPTIONS]</em><br />
+        Deallocates the cluster whose allocation information is stored in the specified cluster directory.</li>
+        
+      <li><em>script</em><br/>
+      <em>Usage : hod script -s script_file -d cluster_directory -n number_of_nodes [OPTIONS]</em><br />
+        Runs a hadoop script using the HOD <em>script</em> operation. Provisions Hadoop on a given number of nodes, executes the given 
+        script from the submitting node, and deallocates the cluster when the script completes.</li>
+        
+      <li><em>help</em><br/>
+      <em>Usage : hod help [operation | 'options']</em><br/>
+       When no argument is specified, <code>hod help</code> gives the usage and basic options, and is equivalent to 
+       <code>hod --help</code> (See below). When 'options' is given as argument, hod displays only the basic options 
+       that hod takes. When an operation is specified, it displays the usage and description corresponding to that particular 
+       operation. For example, to learn about the allocate operation, run <code>hod help allocate</code>.</li>
+    </ul>
+    
+    
+      <p>Besides the operations, HOD can take the following command line options.</p>
+      
+      <ul>
+
+      <li><em>--help</em><br />
+        Prints out the help message to see the usage and basic options.</li>
+        
+      <li><em>--verbose-help</em><br />
+        All configuration options provided in the hodrc file can be passed on the command line, using the syntax 
+        <code>--section_name.option_name[=value]</code>. When provided this way, the value provided on command line 
+        overrides the option provided in hodrc. The verbose-help command lists all the available options in the hodrc file. 
+        This is also a nice way to see the meaning of the configuration options. <br /></li>
+        </ul>
+         
+       <p>See <a href="#Options_Configuring_HOD">Options Configuring HOD</a> for a description of the most important hod configuration options. 
+       For basic options do <code>hod help options</code> and for all options possible in hod configuration do <code>hod --verbose-help</code>. 
+       See <a href="#HOD+Configuration">HOD Configuration</a> for a description of all options.</p>
+       
+      
+  </section>
+
+  <section><title> Options Configuring HOD </title><anchor id="Options_Configuring_HOD"></anchor>
+  <p>As described above, HOD is configured using a configuration file that is usually set up by system administrators. 
+  This is an INI-style configuration file divided into sections, with options inside each section. Each section relates 
+  to one of the HOD processes: client, ringmaster, hodring, mapreduce or hdfs. The options inside a section consist 
+  of an option name and value. </p>
+  
+  <p>Users can override the default configuration in two ways: </p>
+  <ul>
+    <li> Users can supply their own configuration file to HOD in each of the commands, using the <code>-c</code> option</li>
+    <li> Users can supply specific configuration options to HOD. Options provided on the command line <em>override</em> 
+    the values provided in the configuration file being used.</li>
+  </ul>
+  <p>This section describes some of the most commonly used configuration options. These commonly used options are 
+  provided with a <em>short</em> option for convenience of specification. All other options can be specified using 
+  a <em>long</em> option that is also described below.</p>
+  
+  <ul>
+
+  <li><em>-c config_file</em><br />
+    Provides the configuration file to use. Can be used with all other options of HOD. Alternatively, the 
+    <code>HOD_CONF_DIR</code> environment variable can be defined to specify a directory that contains a file 
+    named <code>hodrc</code>, alleviating the need to specify the configuration file in each HOD command.</li>
+    
+  <li><em>-d cluster_dir</em><br />
+        This is required for most of the hod operations. As described under <a href="#Create_a_Cluster_Directory">Create a Cluster Directory</a>, 
+        the <em>cluster directory</em> is a directory on the local file system where <code>hod</code> will generate the Hadoop configuration, 
+        <em>hadoop-site.xml</em>, corresponding to the cluster it allocates. Pass it to the <code>hod</code> operations as an argument 
+        to -d or --hod.clusterdir. If it doesn't already exist, HOD will automatically try to create it and use it. Once a cluster is allocated, a 
+        user can utilize it to run Hadoop jobs by specifying the cluster directory as the Hadoop --config option.</li>
+        
+  <li><em>-n number_of_nodes</em><br />
+  This is required for the hod 'allocate' operation and for the script operation. This denotes the number of nodes to be allocated.</li>
+  
+  <li><em>-s script-file</em><br/>
+   Required when using script operation, specifies the script file to execute.</li>
+   
+ <li><em>-b 1|2|3|4</em><br />
+    Enables the given debug level. Can be used with all other options of HOD. 4 is most verbose.</li>
+    
+  <li><em>-t hadoop_tarball</em><br />
+    Provisions Hadoop from the given tar.gz file. This option is only applicable to the <em>allocate</em> operation. For better 
+    distribution performance it is strongly recommended that the Hadoop tarball is created <em>after</em> removing the source 
+    or documentation.</li>
+    
+  <li><em>-N job-name</em><br />
+    The name to give to the resource manager job that HOD uses underneath. For example, in the case of Torque, this translates to 
+    the <code>qsub -N</code> option, and can be seen as the job name using the <code>qstat</code> command.</li>
+    
+  <li><em>-l wall-clock-time</em><br />
+    The amount of time for which the user expects to have work on the allocated cluster. This is passed to the resource manager 
+    underneath HOD, and can be used in more efficient scheduling and utilization of the cluster. Note that in the case of Torque, 
+    the cluster is automatically deallocated after this time expires.</li>
+    
+  <li><em>-j java-home</em><br />
+    Path to be set as the JAVA_HOME environment variable. This is used in the <em>script</em> operation. HOD sets the 
+    JAVA_HOME environment variable to this value and launches the user script in that environment.</li>
+    
+  <li><em>-A account-string</em><br />
+    Accounting information to pass to underlying resource manager.</li>
+    
+  <li><em>-Q queue-name</em><br />
+    Name of the queue in the underlying resource manager to which the job must be submitted.</li>
+    
+  <li><em>-Mkey1=value1 -Mkey2=value2</em><br />
+    Provides configuration parameters for the provisioned MapReduce daemons (JobTracker and TaskTrackers). A 
+    hadoop-site.xml is generated with these values on the cluster nodes (see the example after this list). <br />
+    <em>Note:</em> Values containing any of the characters space, comma, equals or semi-colon must be 
+    escaped with a '\' character and enclosed in quotes. A '\' itself can be escaped with another '\'. </li>
+    
+  <li><em>-Hkey1=value1 -Hkey2=value2</em><br />
+    Provides configuration parameters for the provisioned HDFS daemons (NameNode and DataNodes). A hadoop-site.xml 
+    is generated with these values on the cluster nodes. <br />
+    <em>Note:</em> Values containing any of the characters space, comma, equals or semi-colon must be 
+    escaped with a '\' character and enclosed in quotes. A '\' itself can be escaped with another '\'. </li>
+    
+  <li><em>-Ckey1=value1 -Ckey2=value2</em><br />
+    Provides configuration parameters for the client from where jobs can be submitted. A hadoop-site.xml is generated 
+    with these values on the submit node. <br />
+    <em>Note:</em> Values containing any of the characters space, comma, equals or semi-colon must be 
+    escaped with a '\' character and enclosed in quotes. A '\' itself can be escaped with another '\'. </li>
+    
+  <li><em>--section-name.option-name=value</em><br />
+    This is the method to provide options using the <em>long</em> format. For example, you could say <em>--hod.script-wait-time=20</em></li>
+   </ul>
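+  <p>Putting some of these together, a typical allocation might look like the following sketch
+  (node count, parameter names and values are illustrative; note how the value containing
+  commas is escaped and quoted):</p>
+  <source>
+$ hod allocate -d ~/hod-clusters/test -n 10 \
+      -Mmapred.reduce.tasks=4 \
+      -Hdfs.replication=2 \
+      -H"dfs.data.dir=/grid/1/data\,/grid/2/data"
+  </source>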
+    
+    </section>
+	</section>
+	
+	
+	<section>
+	  <title> Troubleshooting </title><anchor id="Troubleshooting"></anchor>
+  <p>The following section identifies some of the most likely error conditions users can run into when using HOD, and ways to troubleshoot them.</p>
+  
+  <section><title>HOD Hangs During Allocation </title><anchor id="_hod_Hangs_During_Allocation"></anchor>
+  <anchor id="hod_Hangs_During_Allocation"></anchor>
+  <p><em>Possible Cause:</em> One of the HOD or Hadoop components has failed to come up. In such a case, the 
+  <code>hod</code> command will return after a few minutes (typically 2-3 minutes) with an error code of either 7 or 8, 
+  as defined in the Error Codes section. Refer to that section for further details. </p>
+  <p><em>Possible Cause:</em> A large allocation is fired with a tarball. Sometimes due to load in the network, or on 
+  the allocated nodes, the tarball distribution might be significantly slowed, and the command can take a few minutes to return. 
+  Wait for completion. Also check that the tarball does not include the Hadoop sources or documentation.</p>
+  <p><em>Possible Cause:</em> A Torque related problem. If the cause is Torque related, the <code>hod</code> 
+  command will not return for more than 5 minutes. Running <code>hod</code> in debug mode may show the 
+  <code>qstat</code> command being executed repeatedly. Executing the <code>qstat</code> command from 
+  a separate shell may show that the job is in the <code>Q</code> (Queued) state. This usually indicates a 
+  problem with Torque. Possible causes could include some nodes being down, or new nodes added that Torque 
+  is not aware of. Generally, system administrator help is needed to resolve this problem.</p>
+    </section>
+    
+  <section><title>HOD Hangs During Deallocation </title>
+  <anchor id="_hod_Hangs_During_Deallocation"></anchor><anchor id="hod_Hangs_During_Deallocation"></anchor>
+  <p><em>Possible Cause:</em> A Torque related problem, usually load on the Torque server, or the allocation is very large. 
+  Generally, waiting for the command to complete is the only option.</p>
+  </section>
+  
+  <section><title>HOD Fails With an Error Code and Error Message </title>
+  <anchor id="hod_Fails_With_an_error_code_and"></anchor><anchor id="_hod_Fails_With_an_error_code_an"></anchor>
+  <p>If the exit code of the <code>hod</code> command is not <code>0</code>, then refer to the following table 
+  of error exit codes to determine why the error may have occurred and how to debug the situation.</p>
+  <p><strong> Error Codes </strong></p><anchor id="Error_Codes"></anchor>
+  <table>
+    
+      <tr>
+        <th>Error Code</th>
+        <th>Meaning</th>
+        <th>Possible Causes and Remedial Actions</th>
+      </tr>
+      <tr>
+        <td> 1 </td>
+        <td> Configuration error </td>
+        <td> Incorrect configuration values specified in hodrc, or other errors related to HOD configuration. 
+        The error messages in this case should be sufficient to help you debug and fix the problem. </td>
+      </tr>
+      <tr>
+        <td> 2 </td>
+        <td> Invalid operation </td>
+        <td> Run <code>hod help</code> for the list of valid operations. </td>
+      </tr>
+      <tr>
+        <td> 3 </td>
+        <td> Invalid operation arguments </td>
+        <td> Run <code>hod help operation</code> to list the usage of a particular operation.</td>
+      </tr>
+      <tr>
+        <td> 4 </td>
+        <td> Scheduler failure </td>
+        <td> 1. Requested more resources than available. Run <code>checknodes cluster_name</code> to see if enough nodes are available. <br />
+          2. Requested resources exceed resource manager limits. <br />
+          3. Torque is misconfigured, the path to Torque binaries is misconfigured, or other Torque problems. Contact system administrator. </td>
+      </tr>
+      <tr>
+        <td> 5 </td>
+        <td> Job execution failure </td>
+        <td> 1. The Torque job was deleted from outside. Execute the Torque <code>qstat</code> command to see if you have any jobs in the 
+        <code>R</code> (Running) state. If none exist, try re-executing HOD. <br />
+          2. Torque problems such as the server momentarily going down, or becoming unresponsive. Contact system administrator. <br/>
+          3. The system administrator might have configured account verification, and an invalid account is specified. Contact system administrator.</td>
+      </tr>
+      <tr>
+        <td> 6 </td>
+        <td> Ringmaster failure </td>
+        <td> HOD prints the message "Cluster could not be allocated because of the following errors on the ringmaster host &lt;hostname&gt;". 
+        The actual error message may indicate one of the following:<br/>
+          1. Invalid configuration on the node running the ringmaster, specified by the hostname in the error message.<br/>
+          2. Invalid configuration in the <code>ringmaster</code> section.<br />
+          3. Invalid <code>pkgs</code> option in the <code>gridservice-mapred</code> or <code>gridservice-hdfs</code> section.<br />
+          4. An invalid Hadoop tarball, or a tarball which has bundled an invalid configuration file in the conf directory.<br />
+          5. Mismatched Hadoop versions between MapReduce and an external HDFS.<br />
+          The Torque <code>qstat</code> command will most likely show a job in the <code>C</code> (Completed) state. <br/>
+          One can log in to the ringmaster host given in the HOD failure message and debug the problem with the help of the error message. 
+          If the error message doesn't give complete information, the ringmaster logs should help in finding the root cause of the problem. 
+          Refer to the section <em>Locating Ringmaster Logs</em> below for more information. </td>
+      </tr>
+      <tr>
+        <td> 7 </td>
+        <td> HDFS failure </td>
+        <td> When HOD fails to allocate due to HDFS failures (or Job tracker failures, error code 8, see below), it prints a failure message 
+        "Hodring at &lt;hostname&gt; failed with following errors:" and then gives the actual error message, which may indicate one of the following:<br/>
+          1. Problems in starting Hadoop clusters. Usually the error message will indicate the actual cause on the hostname mentioned. 
+          Also, review the Hadoop-related configuration in the HOD configuration files, and look at the Hadoop logs using the information in the 
+          <em>Collecting and Viewing Hadoop Logs</em> section above. <br />
+          2. Invalid configuration on the node running the hodring, specified by the hostname in the error message. <br/>
+          3. Invalid configuration in the <code>hodring</code> section of hodrc. <code>ssh</code> to the hostname specified in the 
+          error message and grep for <code>ERROR</code> or <code>CRITICAL</code> in hodring logs. Refer to the section 
+          <em>Locating Hodring Logs</em> below for more information. <br />
+          4. Invalid tarball specified which is not packaged correctly. <br />
+          5. Cannot communicate with an externally configured HDFS.<br/>
+          When such an HDFS or JobTracker failure occurs, one can log in to the host named in the HOD failure message and debug the problem. 
+          While fixing the problem, one should also review other log messages in the ringmaster log to see which other machines might have had problems 
+          bringing up the JobTracker/NameNode, apart from the hostname reported in the failure message. Other machines may also be affected 
+          because HOD continues trying to launch Hadoop daemons on multiple machines, one after another, depending upon the value of the configuration 
+          variable <a href="hod_scheduler.html#ringmaster+options">ringmaster.max-master-failures</a>. 
+          See <a href="hod_scheduler.html#Locating+Ringmaster+Logs">Locating Ringmaster Logs</a> for more information.</td>
+      </tr>
+      <tr>
+        <td> 8 </td>
+        <td> Job tracker failure </td>
+        <td> Similar to the causes in the <em>HDFS failure</em> case. </td>
+      </tr>
+      <tr>
+        <td> 10 </td>
+        <td> Cluster dead </td>
+        <td> 1. Cluster was auto-deallocated because it was idle for a long time. <br />
+          2. Cluster was auto-deallocated because the wallclock time specified by the system administrator or user was exceeded. <br />
+          3. Cannot communicate with the JobTracker and HDFS NameNode which were successfully allocated. Deallocate the cluster, and allocate again. </td>
+      </tr>
+      <tr>
+        <td> 12 </td>
+        <td> Cluster already allocated </td>
+        <td> The cluster directory specified has been used in a previous allocate operation and is not yet deallocated. 
+        Specify a different directory, or deallocate the previous allocation first. </td>
+      </tr>
+      <tr>
+        <td> 13 </td>
+        <td> HDFS dead </td>
+        <td> Cannot communicate with the HDFS NameNode. HDFS NameNode went down. </td>
+      </tr>
+      <tr>
+        <td> 14 </td>
+        <td> Mapred dead </td>
+        <td> 1. Cluster was auto-deallocated because it was idle for a long time. <br />
+          2. Cluster was auto-deallocated because the wallclock time specified by the system administrator or user was exceeded. <br />
+          3. Cannot communicate with the MapReduce JobTracker. JobTracker node went down. <br />
+          </td>
+      </tr>
+      <tr>
+        <td> 15 </td>
+        <td> Cluster not allocated </td>
+        <td> An operation which requires an allocated cluster is given a cluster directory with no state information. </td>
+      </tr>
+   
+      <tr>
+        <td> Any non-zero exit code </td>
+        <td> HOD script error </td>
+        <td> If the hod script option was used, it is likely that the exit code is from the script. Unfortunately, this could clash with the 
+        exit codes of the hod command itself. In order to help users differentiate these two, hod writes the script's exit code to a file 
+        called script.exitcode in the cluster directory, if the script returned an exit code. You can cat this file to determine the script's 
+        exit code. If it does not exist, then it is a hod command exit code.</td> 
+      </tr>
+  </table>
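+  <p>For example, after a <code>hod</code> script operation returns a non-zero exit code, one might
+  check whether the code came from the user script like this (the cluster directory path is
+  illustrative):</p>
+  <source>
+# if this file exists, the exit code came from the user script;
+# otherwise it is an exit code of the hod command itself
+$ cat ~/hod-clusters/test/script.exitcode
+  </source>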
+    </section>
+  <section><title>Hadoop DFSClient Warns with a
+  NotReplicatedYetException</title>
+  <p>Sometimes, when you try to upload a file to the HDFS immediately after
+  allocating a HOD cluster, DFSClient warns with a NotReplicatedYetException. It
+  usually shows a message like this: </p>
+  
+  <source>
+WARN hdfs.DFSClient: NotReplicatedYetException sleeping  &lt;filename&gt; retries left 3
+08/01/25 16:31:40 INFO hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
+File &lt;filename&gt; could only be replicated to 0 nodes, instead of 1</source>
+  
+  <p> This scenario arises when you try to upload a file
+  to the HDFS while the DataNodes are still in the process of contacting the
+  NameNode. This can be resolved by waiting for some time before uploading a new
+  file to the HDFS, so that enough DataNodes start and contact the NameNode.</p>
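+  <p>Rather than waiting blindly, one way to check that enough DataNodes have come up is to run the
+  <code>dfsadmin</code> report against the allocated cluster (a sketch, assuming the cluster
+  directory is <code>~/hod-clusters/test</code>):</p>
+  <source>
+# the report includes the number of live DataNodes; upload once it is non-zero
+$ hadoop --config ~/hod-clusters/test dfsadmin -report
+  </source>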
+  </section>
+  
+  <section><title> Hadoop Jobs Not Running on a Successfully Allocated Cluster </title><anchor id="Hadoop_Jobs_Not_Running_on_a_Suc"></anchor>
+  
+  <p>This scenario generally occurs when a cluster is allocated and left inactive for some time, after which Hadoop jobs 
+  are attempted on it. The Hadoop jobs then fail with the following exception:</p>
+  
+  <source>08/01/25 16:31:40 INFO ipc.Client: Retrying connect to server: foo.bar.com/1.1.1.1:53567. Already tried 1 time(s).</source>
+  
+  <p><em>Possible Cause:</em> No Hadoop jobs were run for a significant portion of time. Thus the cluster would have been 
+  deallocated as described in the section <em>Auto-deallocation of Idle Clusters</em>. Deallocate the cluster and allocate it again.</p>
+  <p><em>Possible Cause:</em> The wallclock limit specified by the Torque administrator or the <code>-l</code> option 
+  defined in the section <em>Specifying Additional Job Attributes</em> was exceeded since allocation time. Thus the cluster 
+  would have been released. Deallocate the cluster and allocate it again.</p>
+  <p><em>Possible Cause:</em> There is a version mismatch between the version of the hadoop being used in provisioning 
+  (typically via the tarball option) and the external HDFS. Ensure compatible versions are being used.</p>
+  <p><em>Possible Cause:</em> There is a version mismatch between the version of the hadoop client being used to submit
+   jobs and the hadoop used in provisioning (typically via the tarball option). Ensure compatible versions are being used.</p>
+  <p><em>Possible Cause:</em> You used one of the options for specifying Hadoop configuration <code>-M or -H</code>, 
+  which had special characters like space or comma that were not escaped correctly. Refer to the section 
+  <em>Options Configuring HOD</em> for checking how to specify such options correctly.</p>
+    </section>
+  <section><title> My Hadoop Job Got Killed </title><anchor id="My_Hadoop_Job_Got_Killed"></anchor>
+  <p><em>Possible Cause:</em> The wallclock limit specified by the Torque administrator or the <code>-l</code> 
+  option defined in the section <em>Specifying Additional Job Attributes</em> was exceeded since allocation time. 
+  Thus the cluster would have been released. Deallocate the cluster and allocate it again, this time with a larger wallclock time.</p>
+  <p><em>Possible Cause:</em> Problems with the JobTracker node. Refer to the section in <em>Collecting and Viewing Hadoop Logs</em> to get more information.</p>
+    </section>
+  <section><title> Hadoop Job Fails with Message: 'Job tracker still initializing' </title><anchor id="Hadoop_Job_Fails_with_Message_Jo"></anchor>
+  <p><em>Possible Cause:</em> The Hadoop job was run as part of the HOD script command, and it started before the JobTracker could come up fully. 
+  Allocate the cluster using a larger value for the configuration option <code>--hod.script-wait-time</code>. 
+  A value of 120 should typically work, though such a large value is usually unnecessary.</p>
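+  <p>For instance, the script operation might be retried with a longer wait; the paths and node
+  count here are illustrative:</p>
+  <source>
+$ hod script -d ~/hod-clusters/test -n 4 -s ~/run-job.sh --hod.script-wait-time=120
+  </source>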
+    </section>
+  <section><title> The Exit Codes For HOD Are Not Getting Into Torque </title><anchor id="The_Exit_Codes_For_HOD_Are_Not_G"></anchor>
+  <p><em>Possible Cause:</em> Version 0.16 of Hadoop is required for this functionality to work, 
+  and the version of Hadoop being used does not match. Use the required version of Hadoop.</p>
+  <p><em>Possible Cause:</em> The deallocation was done without using the <code>hod</code> 
+  command; for example, directly using <code>qdel</code>. When the cluster is deallocated in this manner, 
+  the HOD processes are terminated using signals. This results in the exit code being based on the 
+  signal number, rather than the exit code of the program.</p>
+    </section>
+  <section><title> The Hadoop Logs are Not Uploaded to HDFS </title><anchor id="The_Hadoop_Logs_are_Not_Uploaded"></anchor>
+  <p><em>Possible Cause:</em> There is a version mismatch between the version of the hadoop being used for uploading the logs 
+  and the external HDFS. Ensure that the correct version is specified in the <code>hodring.pkgs</code> option.</p>
+    </section>
+  <section><title> Locating Ringmaster Logs </title><anchor id="Locating_Ringmaster_Logs"></anchor>
+  <p>To locate the ringmaster logs, follow these steps: </p>
+  <ul>
+    <li> Execute hod in debug mode using the -b option. This will print the Torque job id for the current run.</li>
+    <li> Execute <code>qstat -f torque_job_id</code> and look up the value of the <code>exec_host</code> parameter in the output, 
+    as shown in the example after this list. The first host in this list is the ringmaster node.</li>
+    <li> Log in to this node.</li>
+    <li> The ringmaster log location is specified by the <code>ringmaster.log-dir</code> option in the hodrc. The name of the log file will be 
+    <code>username.torque_job_id/ringmaster-main.log</code>.</li>
+    <li> If you don't get enough information, you may want to set the ringmaster debug level to 4. This can be done by passing 
+    <code>--ringmaster.debug 4</code> to the hod command line.</li>
+  </ul>
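+  <p>For example, steps 1 and 2 above might look like this; the Torque job id and hostnames are
+  illustrative:</p>
+  <source>
+$ qstat -f 5432 | grep exec_host
+    exec_host = node001/0+node002/0+node003/0
+# node001, the first host in the list, is the ringmaster node
+  </source>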
+  </section>
+  <section><title> Locating Hodring Logs </title><anchor id="Locating_Hodring_Logs"></anchor>
+  <p>To locate hodring logs, follow the steps below: </p>
+  <ul>
+    <li> Execute hod in debug mode using the -b option. This will print the Torque job id for the current run.</li>
+    <li> Execute <code>qstat -f torque_job_id</code> and look up the value of the <code>exec_host</code> parameter in the output. 
+    All nodes in this list should have a hodring on them.</li>
+    <li> Log in to any of these nodes.</li>
+    <li> The hodring log location is specified by the <code>hodring.log-dir</code> option in the hodrc. The name of the log file will be 
+    <code>username.torque_job_id/hodring-main.log</code>.</li>
+    <li> If you don't get enough information, you may want to set the hodring debug level to 4. This can be done by passing 
+    <code>--hodring.debug 4</code> to the hod command line.</li>
+  </ul>
+  </section>
+	</section>
+	  </section>
+	  
+	  
+	  
+<!-- HOD ADMINISTRATORS -->
+
+  <section>
+    <title>HOD Administrators</title>	  
+   <p>This section shows administrators how to install, configure and run HOD.</p> 
+	  <section>
+<title>Getting Started</title>
+
+<p>The basic system architecture of HOD includes these components:</p>
+<ul>
+  <li>A Resource manager, possibly together with a scheduler (see <a href="hod_scheduler.html#Prerequisites"> Prerequisites</a>) </li>
+  <li>Various HOD components</li>
+  <li>Hadoop MapReduce and HDFS daemons</li>
+</ul>
+
+<p>
+HOD provisions and maintains Hadoop MapReduce and, optionally, HDFS instances 
+through interaction with the above components on a given cluster of nodes. A cluster of
+nodes can be thought of as comprising two sets of nodes:</p>
+<ul>
+  <li>Submit nodes: Users use the HOD client on these nodes to allocate clusters, and then
+use the Hadoop client to submit Hadoop jobs. </li>
+  <li>Compute nodes: Using the resource manager, HOD components are run on these nodes to 
+provision the Hadoop daemons. After that Hadoop jobs run on them.</li>
+</ul>
+
+<p>
+Here is a brief description of the sequence of operations in allocating a cluster and
+running jobs on it.
+</p>
+
+<ul>
+  <li>The user uses the HOD client on the Submit node to allocate a desired number of
+cluster nodes and to provision Hadoop on them.</li>
+  <li>The HOD client uses a resource manager interface (qsub, in Torque) to submit a HOD
+process, called the RingMaster, as a Resource Manager job, to request the user's desired number 
+of nodes. This job is submitted to the central server of the resource manager (pbs_server, in Torque).</li>
+  <li>On the compute nodes, the resource manager slave daemons (pbs_moms in Torque) accept
+and run jobs that they are assigned by the central server (pbs_server in Torque). The RingMaster 
+process is started on one of the compute nodes (mother superior, in Torque).</li>
+  <li>The RingMaster then uses another resource manager interface (pbsdsh, in Torque) to run
+the second HOD component, HodRing, as distributed tasks on each of the compute
+nodes allocated.</li>
+  <li>The HodRings, after initializing, communicate with the RingMaster to get Hadoop commands, 
+and run them accordingly. Once the Hadoop commands are started, they register with the RingMaster,
+giving information about the daemons.</li>
+  <li>All the configuration files needed for the Hadoop instances are generated by HOD itself, 
+some obtained from options given by the user in the HOD configuration file.</li>
+  <li>The HOD client keeps communicating with the RingMaster to find out the location of the 
+JobTracker and HDFS daemons.</li>
+</ul>
+
+</section>
+
+<section>
+<title>Prerequisites</title>
+<p>To use HOD, your system should include the following components.</p>
+
+<ul>
+
+<li>Operating System: HOD is currently tested on RHEL4.</li>
+
+<li>Nodes: HOD requires a minimum of three nodes configured through a resource manager.</li>
+
+<li>Software: The following components must be installed on ALL nodes before using HOD:
+<ul>
+ <li><a href="ext:hod/torque">Torque: Resource manager</a></li>
+ <li><a href="ext:hod/python">Python</a> : HOD requires version 2.5.1 of Python.</li>
+</ul></li>
+
+<li>Software (optional): The following components are optional and can be installed to obtain better
+functionality from HOD:
+<ul>
+ <li><a href="ext:hod/twisted-python">Twisted Python</a>: This can be
+  used for improving the scalability of HOD. If this module is detected to be
+  installed, HOD uses it, else it falls back to default modules.</li>
+ <li><a href="http://hadoop.apache.org/common/docs/current/index.html">Hadoop</a>: HOD can automatically
+ distribute Hadoop to all nodes in the cluster. However, it can also use a
+ pre-installed version of Hadoop, if it is available on all nodes in the cluster.
+  HOD currently supports Hadoop 0.15 and above.</li>
+</ul></li>
+
+</ul>
+
+<p>Note: HOD configuration requires the location of installs of these
+components to be the same on all nodes in the cluster. Using the same
+location on the submit nodes as well will also make the configuration
+simpler.
+</p>
+</section>
+
+<section>
+<title>Resource Manager</title>
+<p>  Currently HOD works with the Torque resource manager, which it uses for its node
+  allocation and job submission. Torque is an open source resource manager from
+  <a href="ext:hod/cluster-resources">Cluster Resources</a>, a community effort
+  based on the PBS project. It provides control over batch jobs and distributed compute nodes. Torque is
+  freely available for download from <a href="ext:hod/torque-download">here</a>.
+  </p>
+
+<p>  All documentation related to Torque can be found under
+  the section TORQUE Resource Manager <a
+  href="ext:hod/torque-docs">here</a>. You can
+  get wiki documentation from <a
+  href="ext:hod/torque-wiki">here</a>.
+  Users may wish to subscribe to TORQUE's mailing list or view the archive for questions
+  and comments <a
+  href="ext:hod/torque-mailing-list">here</a>.
+</p>
+
+<p>To use HOD with Torque:</p>
+<ul>
+ <li>Install Torque components: pbs_server on one node (head node), pbs_mom on all
+  compute nodes, and PBS client tools on all compute nodes and submit
+  nodes. Perform at least a basic configuration so that the Torque system is up and
+  running, that is, pbs_server knows which machines to talk to. Look <a
+  href="ext:hod/torque-basic-config">here</a>
+  for basic configuration.
+
+  For advanced configuration, see <a
+  href="ext:hod/torque-advanced-config">here</a>.</li>
+ <li>Create a queue for submitting jobs on the pbs_server. The name of the queue is the
+  same as the HOD configuration parameter, resource-manager.queue. The HOD client uses this queue to
+  submit the RingMaster process as a Torque job.</li>
+ <li>Specify a cluster name as a property for all nodes in the cluster.
+  This can be done by using the qmgr command. For example:
+  <code>qmgr -c "set node node properties=cluster-name"</code>. The name of the cluster is the same as
+  the HOD configuration parameter, hod.cluster. </li>
+ <li>Make sure that jobs can be submitted to the nodes. This can be done by
+  using the qsub command. For example:
+  <code>echo "sleep 30" | qsub -l nodes=3</code></li>
+</ul>
+
+</section>
+
+<section>
+<title>Installing HOD</title>
+
+<p>Once the resource manager is set up, you can obtain and
+install HOD.</p>
+<ul>
+ <li>If you are getting HOD from the Hadoop tarball, it is available under the 
+  'contrib' section of Hadoop, under the root  directory 'hod'.</li>
+ <li>If you are building from source, you can run <code>ant tar</code> from the Hadoop root
+  directory to generate the Hadoop tarball, and then get HOD from there,
+  as described above.</li>
+ <li>Distribute the files under this directory to all the nodes in the
+  cluster. Note that the location where the files are copied should be
+  the same on all the nodes.</li>
+  <li>Note that compiling Hadoop builds HOD with the appropriate permissions 
+  set on all the required script files in HOD.</li>
+</ul>
+</section>
+
+<section>
+<title>Configuring HOD</title>
+
+<p>You can configure HOD once it is installed. The minimal configuration needed
+to run HOD is described below. More advanced configuration options are discussed
+in <a href="#HOD+Configuration">HOD Configuration</a>.</p>
+<section>
+  <title>Minimal Configuration</title>
+  <p>To get started using HOD, the following minimal configuration is
+  required:</p>
+<ul>
+ <li>On the node from where you want to run HOD, edit the file hodrc
+  located in the &lt;install dir&gt;/conf directory. This file
+  contains the minimal set of values required to run hod.</li>
+ <li>
+<p>Specify values suitable to your environment for the following
+  variables defined in the configuration file. Note that some of these
+  variables are defined at more than one place in the file.</p>
+
+  <ul>
+   <li>${JAVA_HOME}: Location of Java for Hadoop. Hadoop supports Sun JDK
+    1.6.x and above.</li>
+   <li>${CLUSTER_NAME}: Name of the cluster which is specified in the
+    'node property' as mentioned in resource manager configuration.</li>
+   <li>${HADOOP_HOME}: Location of Hadoop installation on the compute and
+    submit nodes.</li>
+   <li>${RM_QUEUE}: Queue configured for submitting jobs in the resource
+    manager configuration.</li>
+   <li>${RM_HOME}: Location of the resource manager installation on the
+    compute and submit nodes.</li>
+    </ul>
+</li>
+
+<li>
+<p>The following environment variables may need to be set depending on
+  your environment. These variables must be defined where you run the
+  HOD client and must also be specified in the HOD configuration file as the
+  value of the key resource_manager.env-vars. Multiple variables can be
+  specified as a comma separated list of key=value pairs.</p>
+
+  <ul>
+   <li>HOD_PYTHON_HOME: If you install python to a non-default location
+    on the compute nodes or submit nodes, then this variable must be
+    defined to point to the python executable in the non-standard
+    location.</li>
+    </ul>
+</li>
+</ul>
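+<p>For illustration, a hodrc with the variables above filled in might look like the following
+sketch. All paths, the cluster name and the queue name are examples only; the option names are
+described under <a href="#HOD+Configuration">HOD Configuration</a>.</p>
+<source>
+[hod]
+java-home       = /usr/java/jdk1.6.0
+cluster         = my-cluster
+
+[resource_manager]
+queue           = hod-queue
+batch-home      = /usr/local/torque
+env-vars        = HOD_PYTHON_HOME=/usr/local/python-2.5.1/bin/python
+
+[gridservice-mapred]
+pkgs            = /usr/local/hadoop
+
+[gridservice-hdfs]
+pkgs            = /usr/local/hadoop
+</source>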
+</section>
+
+  <section>
+    <title>Advanced Configuration</title>
+    <p> You can review and modify other configuration options to suit
+ your specific needs. See <a href="#HOD+Configuration">HOD Configuration</a> for more information.</p>
+  </section>
+</section>
+
+  <section>
+    <title>Running HOD</title>
+    <p>You can run HOD once it is configured. Refer to <a
+    href="#HOD+Users"> HOD Users</a> for more information.</p>
+  </section>
+
+  <section>
+    <title>Supporting Tools and Utilities</title>
+    <p>This section describes supporting tools and utilities that can be used to
+    manage HOD deployments.</p>
+    
+    <section>
+      <title>logcondense.py - Manage Log Files</title>
+      <p>As mentioned under 
+         <a href="hod_scheduler.html#Collecting+and+Viewing+Hadoop+Logs">Collecting and Viewing Hadoop Logs</a>,
+         HOD can be configured to upload
+         Hadoop logs to a statically configured HDFS. Over time, the number of logs uploaded
+         to HDFS could increase. logcondense.py is a tool that helps
+         administrators to remove log files uploaded to HDFS. </p>
+      <section>
+        <title>Running logcondense.py</title>
+        <p>logcondense.py is available under hod_install_location/support folder. You can either
+        run it using python, for example, <em>python logcondense.py</em>, or give execute permissions 
+        to the file, and directly run it as <em>logcondense.py</em>. logcondense.py needs to be 
+        run by a user who has sufficient permissions to remove files from locations where log 
+        files are uploaded in the HDFS, if permissions are enabled. For example, as mentioned under
+        <a href="hod_scheduler.html#hodring+options">hodring options</a>, the logs could
+        be configured to come under the user's home directory in HDFS. In that case, the user
+        running logcondense.py should have super user privileges to remove the files from under
+        all user home directories.</p>
+      </section>
+      <section>
+        <title>Command Line Options for logcondense.py</title>
+        <p>The following command line options are supported for logcondense.py.</p>
+          <table>
+            <tr>
+              <th>Short Option</th>
+              <th>Long option</th>
+              <th>Meaning</th>
+              <th>Example</th>
+            </tr>
+            <tr>
+              <td>-p</td>
+              <td>--package</td>
+              <td>Complete path to the hadoop script. The version of hadoop must be the same as the 
+                  one running HDFS.</td>
+              <td>/usr/bin/hadoop</td>
+            </tr>
+            <tr>
+              <td>-d</td>
+              <td>--days</td>
+              <td>Delete log files older than the specified number of days</td>
+              <td>7</td>
+            </tr>
+            <tr>
+              <td>-c</td>
+              <td>--config</td>
+              <td>Path to the Hadoop configuration directory, under which hadoop-site.xml resides.
+              The hadoop-site.xml must point to the HDFS NameNode from where logs are to be removed.</td>
+              <td>/home/foo/hadoop/conf</td>
+            </tr>
+            <tr>
+              <td>-l</td>
+              <td>--logs</td>
+              <td>An HDFS path; this must be the same HDFS path as specified for the log-destination-uri,
+              as mentioned under <a href="hod_scheduler.html#hodring+options">hodring options</a>,
+              without the hdfs:// URI string</td>
+              <td>/user</td>
+            </tr>
+            <tr>
+              <td>-n</td>
+              <td>--dynamicdfs</td>
+              <td>If true, this will indicate that the logcondense.py script should delete HDFS logs
+              in addition to MapReduce logs. Otherwise, it only deletes MapReduce logs, which is also the
+              default if this option is not specified. This option is useful if
+              dynamic HDFS installations 
+              are being provisioned by HOD, and the static HDFS installation is being used only to collect 
+              logs - a scenario that may be common in test clusters.</td>
+              <td>false</td>
+            </tr>
+            <tr>
+              <td>-r</td>
+              <td>--retain-master-logs</td>
+              <td>If true, this will keep the JobTracker logs of the job in hod-logs inside HDFS and 
+              will delete only the TaskTracker logs. Also, this will keep the NameNode logs along with 
+              the JobTracker logs, and will only delete the DataNode logs if the 'dynamicdfs' option is set 
+              to true. Otherwise, it will delete the complete job directory from hod-logs inside 
+              HDFS. By default it is set to false.</td>
+              <td>false</td>
+            </tr>
+          </table>
+        <p>So, for example, to delete all log files older than 7 days using a hadoop-site.xml stored in
+        ~/hadoop-conf, using the hadoop installation under ~/hadoop-0.17.0, you could say:</p>
+        <p><em>python logcondense.py -p ~/hadoop-0.17.0/bin/hadoop -d 7 -c ~/hadoop-conf -l /user</em></p>
+      </section>
+    </section>
+    <section>
+      <title>checklimits.sh - Monitor Resource Limits</title>
+      <p>checklimits.sh is a HOD tool specific to the Torque/Maui environment
+      (<a href="ext:hod/maui">Maui Cluster Scheduler</a> is an open source job
+      scheduler for clusters and supercomputers, from clusterresources). The
+      checklimits.sh script
+      updates the torque comment field when newly submitted job(s) violate or
+      exceed user limits set up in the Maui scheduler. It uses qstat, does one pass
+      over the torque job-list to determine queued or unfinished jobs, runs Maui
+      tool checkjob on each job to see if user limits are violated and then
+      runs torque's qalter utility to update job attribute 'comment'. Currently
+      it updates the comment as <em>User-limits exceeded. Requested:([0-9]*)
+      Used:([0-9]*) MaxLimit:([0-9]*)</em> for those jobs that violate limits.
+      This comment field is then used by HOD to behave accordingly depending on
+      the type of violation.</p>
+      <section>
+        <title>Running checklimits.sh</title>
+        <p>checklimits.sh is available under the hod_install_location/support
+        folder. This shell script can be run directly as <em>sh
+        checklimits.sh </em>or as <em>./checklimits.sh</em> after enabling
+        execute permissions. Torque and Maui binaries should be available
+        on the machine where the tool is run and should be in the path
+        of the shell script process. To update the
+        comment field of jobs from different users, this tool must be run with
+        torque administrative privileges. This tool must be run repeatedly
+        after specific intervals of time to frequently update jobs violating
+        constraints, for example via cron. Please note that the resource manager
+        and scheduler commands used in this script can be expensive and so
+        it is better not to run this inside a tight loop without sleeping.</p>
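+        <p>For example, a site might run the tool every few minutes from cron; the install path
+        and interval here are illustrative:</p>
+        <source>
+# hypothetical crontab entry for a user with torque administrative privileges
+*/5 * * * * /usr/local/hod/support/checklimits.sh
+        </source>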
+      </section>
+    </section>
+
+    <section>
+      <title>verify-account Script</title>
+      <p>Production systems use accounting packages to charge users for using
+      shared compute resources. HOD supports a parameter 
+      <em>resource_manager.pbs-account</em> to allow users to identify the
+      account under which they would like to submit jobs. It may be necessary
+      to verify that this account is a valid one configured in an accounting
+      system. The <em>hod-install-dir/bin/verify-account</em> script 
+      provides a mechanism to plug-in a custom script that can do this
+      verification.</p>
+      
+      <section>
+        <title>Integrating the verify-account script with HOD</title>
+        <p>HOD runs the <em>verify-account</em> script passing in the
+        <em>resource_manager.pbs-account</em> value as argument to the script,
+        before allocating a cluster. Sites can write a script that verifies this 
+        account against their accounting systems. Returning a non-zero exit 
+        code from this script will cause HOD to fail allocation. Also, in
+        case of an error, HOD will print the output of the script to the user.
+        Any descriptive error message can be passed to the user from the
+        script in this manner.</p>
+        <p>The default script that comes with the HOD installation does not
+        do any validation, and returns a zero exit code.</p>
+        <p>If the verify-account script is not found, then HOD treats
+        account verification as disabled and continues with the allocation as is.</p>
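+        <p>As an illustration, a trivial replacement script might check the account name against a
+        site-maintained list; the file path here is hypothetical:</p>
+        <source>
+#!/bin/sh
+# $1 is the resource_manager.pbs-account value passed in by HOD
+if grep -qx "$1" /etc/hod/valid-accounts; then
+    exit 0
+fi
+# this output is shown to the user by HOD when allocation fails
+echo "Account '$1' is not registered with the accounting system."
+exit 1
+        </source>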
+      </section>
+    </section>
+  </section>
+  </section>
+
+
+<!-- HOD CONFIGURATION -->
+
+   <section>
+    <title>HOD Configuration</title>
+      <p>This section discusses how to work with the HOD configuration options.</p>
+	 
+	  <section>
+      <title>Getting Started</title>
+ 
+      <p>Configuration options can be specified in two ways: as a configuration file 
+      in the INI format and as command line options to the HOD shell, 
+      specified in the format --section.option[=value]. If the same option is 
+      specified in both places, the value specified on the command line 
+      overrides the value in the configuration file.</p>
+      
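+      <p>For example, an option that lives in the <code>ringmaster</code> section of the INI file
+      can equivalently be given on the command line; the values here are illustrative:</p>
+      <source>
+# in the configuration file:
+#   [ringmaster]
+#   debug = 4
+
+# the same option on the command line:
+$ hod allocate -d ~/hod-clusters/test -n 4 --ringmaster.debug=4
+      </source>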
+      <p>To get a simple description of all configuration options use:</p>
+      <source>$ hod --verbose-help</source>
+      
+       </section>
+       
+        <section>
+     <title>Configuration Options</title>
+      <p>HOD organizes configuration options into these sections:</p>
+      
+      <ul>
+        <li>  common: Options that appear in more than one section. Options defined in a section are used by the
+        process for which that section applies. Common options have the same meaning, but can have different values in each section.</li>
+        <li>  hod: Options for the HOD client</li>
+        <li>  resource_manager: Options for specifying which resource manager to use, and other parameters for using that resource manager</li>
+        <li>  ringmaster: Options for the RingMaster process</li>
+        <li>  hodring: Options for the HodRing processes</li>
+        <li>  gridservice-mapred: Options for the MapReduce daemons</li>
+        <li>  gridservice-hdfs: Options for the HDFS daemons.</li>
+      </ul>
+      
+      <section> 
+        <title>common options</title>    
+        <ul>
+          <li>temp-dir: Temporary directory for usage by the HOD processes. Make 
+                      sure that the users who will run hod have rights to create 
+                      directories under the directory specified here. If you
+                      wish to make this directory vary across allocations,
+                      you can make use of the environment variables which will
+                      be made available by the resource manager to the HOD
+                      processes. For example, in a Torque setup, having
+                      --ringmaster.temp-dir=/tmp/hod-temp-dir.$PBS_JOBID would
+                      let ringmaster use different temp-dir for each
+                      allocation; Torque expands this variable before starting
+                      the ringmaster.</li>
+          
+          <li>debug: Numeric value from 1-4. 4 produces the most log information,
+                   and 1 the least.</li>
+          
+          <li>log-dir: Directory where log files are stored. By default, this is
+                     &lt;install-location&gt;/logs/. The restrictions and notes for the
+                     temp-dir variable apply here too.
+          </li>
+          
+          <li>xrs-port-range: Range of ports, among which an available port shall
+                            be picked for use to run an XML-RPC server.</li>
+          
+          <li>http-port-range: Range of ports, among which an available port shall
+                             be picked for use to run an HTTP server.</li>
+          
+          <li>java-home: Location of Java to be used by Hadoop.</li>
+          <li>syslog-address: Address to which a syslog daemon is bound to. The format 
+                              of the value is host:port. If configured, HOD log messages
+                              will be logged to syslog using this value.</li>
+                              
+        </ul>
+      </section>
+      
+      <section>
+        <title>hod options</title>
+        
+        <ul>
+          <li>cluster: Descriptive name given to the cluster. For Torque, this is specified as a 'Node property' for every node in the cluster. 
+          HOD uses this value to compute the number of available nodes.</li>
+          
+          <li>client-params: Comma-separated list of hadoop config parameters specified as key-value pairs. 
+          These will be used to generate a hadoop-site.xml on the submit node that should be used for running MapReduce jobs.</li>
+
+          <li>job-feasibility-attr: Regular expression string that specifies whether and how to check job feasibility against resource 
+          manager or scheduler limits. The current implementation corresponds to the torque job attribute 'comment' and is disabled by default. 
+          When set, HOD uses it to decide what type of limit violation was triggered, and either deallocates the cluster or keeps it in the queued state, 
+          depending on whether the request is beyond maximum limits or the cumulative usage has crossed maximum limits. The torque comment attribute may be updated 
+          periodically by an external mechanism. For example, the comment attribute can be updated by running the 
+          <a href="hod_scheduler.html#checklimits.sh+-+Monitor+Resource+Limits">checklimits.sh</a> script in the hod/support directory; 
+          then setting job-feasibility-attr to the value TORQUE_USER_LIMITS_COMMENT_FIELD, "User-limits exceeded. Requested:([0-9]*) 
+          Used:([0-9]*) MaxLimit:([0-9]*)", will make HOD behave accordingly.</li>
+         </ul>
+      </section>
+      
+      <section>
+        <title>resource_manager options</title>
+      
+        <ul>
+          <li>queue: Name of the queue configured in the resource manager to which
+                   jobs are to be submitted.</li>
+          
+          <li>batch-home: Install directory to which 'bin' is appended and under 
+                        which the executables of the resource manager can be 
+                        found.</li> 
+          
+          <li>env-vars: Comma-separated list of key-value pairs, 
+                      expressed as key=value, which would be passed to the jobs 
+                      launched on the compute nodes. 
+                      For example, if the python installation is 
+                      in a non-standard location, one can set the environment
+                      variable 'HOD_PYTHON_HOME' to the path to the python 
+                      executable. The HOD processes launched on the compute nodes
+                      can then use this variable.</li>
+          <li>options: Comma-separated list of key-value pairs,
+                      expressed as
+                      &lt;option&gt;:&lt;sub-option&gt;=&lt;value&gt;. When
+                      passing to the job submission program, these are expanded
+                      as -&lt;option&gt; &lt;sub-option&gt;=&lt;value&gt;. These
+                      are generally used for specifying additional resource
+                      constraints for scheduling. For instance, with a Torque
+                      setup, one can specify
+                      --resource_manager.options='l:arch=x86_64' for
+                      constraining the nodes being allocated to a particular
+                      architecture; this option will be passed to Torque's qsub
+                      command as "-l arch=x86_64".</li>
+        </ul>
+      </section>
+      
+      <section>
+        <title>ringmaster options</title>
+        
+        <ul>
+          <li>work-dirs: Comma-separated list of paths that will serve
+                       as the root for directories that HOD generates and passes
+                       to Hadoop for use to store DFS and MapReduce data. For
+                       example,
+                       this is where DFS data blocks will be stored. Typically,
+                       as many paths are specified as there are disks available
+                       to ensure all disks are being utilized. The restrictions
+                       and notes for the temp-dir variable apply here too.</li>
+          <li>max-master-failures: Number of times a hadoop master
+                       daemon can fail to launch, beyond which HOD will fail
+                       the cluster allocation altogether. In HOD clusters,
+                       sometimes there might be a single or few "bad" nodes due
+                       to issues like missing java, missing or incorrect version
+                       of Hadoop etc. When this configuration variable is set
+                       to a positive integer, the RingMaster returns an error
+                       to the client only when the number of times a hadoop
+                       master (JobTracker or NameNode) fails to start on these
+                       bad nodes because of the above issues exceeds the specified
+                       value. If the number is not exceeded, the next HodRing
+                       which requests for a command to launch is given the same
+                       hadoop master again. This way, HOD tries its best for a
+                       successful allocation even in the presence of a few bad
+                       nodes in the cluster.
+                       </li>
+          <li>workers_per_ring: Number of workers per service per HodRing.
+                       By default this is set to 1. If this configuration
+                       variable is set to a value 'n', the HodRing will run
+                       'n' instances of the workers (TaskTrackers or DataNodes)
+                       on each node acting as a slave. This can be used to run
+                       multiple workers per HodRing, so that the total number of
+                       workers  in a HOD cluster is not limited by the total
+                       number of nodes requested during allocation. However, note
+                       that this will mean each worker should be configured to use
+                       only a proportional fraction of the capacity of the 
+                       resources on the node. In general, this feature is only
+                       useful for testing and simulation purposes, and not for
+                       production use.</li>
+        </ul>
+      </section>
+      
+      <section>
+        <title>gridservice-hdfs options</title>
+        
+        <ul>
+          <li>external: If false, indicates that an HDFS cluster must be 
+                      brought up by the HOD system, on the nodes which it 
+                      allocates via the allocate command. Note that in that case,
+                      when the cluster is de-allocated, it will bring down the 
+                      HDFS cluster, and all the data will be lost.
+                      If true, it will try and connect to an externally configured
+                      HDFS system.
+                      Typically, because input for jobs is placed into HDFS
+                      before jobs are run, and also the output from jobs in HDFS 
+                      is required to be persistent, an internal HDFS cluster is 
+                      of little value in a production system. However, it allows 
+                      for quick testing.</li>
+          
+          <li>host: Hostname of the externally configured NameNode, if any</li>
+          
+          <li>fs_port: Port to which NameNode RPC server is bound.</li>
+          
+          <li>info_port: Port to which the NameNode web UI server is bound.</li>
+          
+          <li>pkgs: Installation directory, under which bin/hadoop executable is 
+                  located. This can be used to use a pre-installed version of
+                  Hadoop on the cluster.</li>
+          
+          <li>server-params: Comma-separated list of hadoop config parameters
+                           specified as key-value pairs. These will be used to
+                           generate a hadoop-site.xml that will be used by the
+                           NameNode and DataNodes.</li>
+          
+          <li>final-server-params: Same as above, except they will be marked final.</li>
+        </ul>
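+        <p>For example, a hodrc fragment pointing HOD at an externally managed HDFS might look like
+        this sketch; the hostname, ports and install path are illustrative:</p>
+        <source>
+[gridservice-hdfs]
+external   = true
+host       = namenode.example.com
+fs_port    = 50040
+info_port  = 50070
+pkgs       = /usr/local/hadoop
+        </source>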
+      </section>
+      
+      <section>
+        <title>gridservice-mapred options</title>
+        
+        <ul>
+          <li>external: If false, indicates that a MapReduce cluster must be
+                      brought up by the HOD system on the nodes which it allocates
+                      via the allocate command.
+                      If true, it will try and connect to an externally 
+                      configured MapReduce system.</li>
+          
+          <li>host: Hostname of the externally configured JobTracker, if any</li>
+          
+          <li>tracker_port: Port to which the JobTracker RPC server is bound</li>
+          
+          <li>info_port: Port to which the JobTracker web UI server is bound.</li>
+          
+          <li>pkgs: Installation directory, under which bin/hadoop executable is 
+                  located</li>
+          
+          <li>server-params: Comma-separated list of hadoop config parameters
+                           specified as key-value pairs. These will be used to
+                           generate a hadoop-site.xml that will be used by the
+                           JobTracker and TaskTrackers</li>
+          
+          <li>final-server-params: Same as above, except they will be marked final.</li>
+        </ul>
+      </section>
+
+      <section>
+        <title>hodring options</title>
+
+        <ul>
+          <li>mapred-system-dir-root: Directory in the DFS under which HOD will
+                                      generate sub-directory names and pass the full path
+                                      as the value of the 'mapred.system.dir' configuration 
+                                      parameter to Hadoop daemons. The format of the full 
+                                      path will be value-of-this-option/userid/mapredsystem/cluster-id.
+                                      Note that the directory specified here should be such
+                                      that all users can create directories under this, if
+                                      permissions are enabled in HDFS. Setting the value of
+                                      this option to /user will make HOD use the user's
+                                      home directory to generate the mapred.system.dir value.</li>
+
+          <li>log-destination-uri: URL describing a path in an external, static DFS or the 
+                                   cluster node's local file system where HOD will upload 
+                                   Hadoop logs when a cluster is deallocated. To specify a 
+                                   DFS path, use the format 'hdfs://path'. To specify a 
+                                   cluster node's local file path, use the format 'file://path'.
+
+                                   When clusters are deallocated by HOD, the hadoop logs will
+                                   be deleted as part of HOD's cleanup process. To ensure these
+                                   logs persist, you can use this configuration option.
+
+                                   The format of the path is 
+                                   value-of-this-option/userid/hod-logs/cluster-id
+
+                                   Note that the directory you specify here must be such that all
+                                   users can create sub-directories under this. Setting this value
+                                   to hdfs://user will make the logs come in the user's home directory
+                                   in DFS.</li>
+
+          <li>pkgs: Installation directory, under which bin/hadoop executable is located. This will
+                    be used by HOD to upload logs if a HDFS URL is specified in log-destination-uri
+                    option. Note that this is useful if the users are using a tarball whose version
+                    may differ from the external, static HDFS version.</li>
+
+          <li>hadoop-port-range: Range of ports, among which an available port shall
+                             be picked for use to run a Hadoop Service, like JobTracker or TaskTracker. </li>
+          
+                                      
+        </ul>
+      </section>
+    </section>
+   </section>
+   
+   
+</body>
+</document>

+ 1 - 1
src/docs/src/documentation/content/xdocs/single_node_setup.xml

@@ -97,7 +97,7 @@
       
       
     </section>
     </section>
     
     
-    <section>
+    <section id="Download">
       <title>Download</title>
       <title>Download</title>
       
       
       <p>
       <p>

+ 11 - 0
src/docs/src/documentation/content/xdocs/site.xml

@@ -39,9 +39,11 @@ See http://forrest.apache.org/docs/linking.html for more info.
   </docs>	
   </docs>	
 		
 		
  <docs label="Guides">
  <docs label="Guides">
+		<commands_manual 				label="Hadoop Commands"  href="commands_manual.html" />
 		<fsshell				        label="File System Shell"               href="file_system_shell.html" />
 		<fsshell				        label="File System Shell"               href="file_system_shell.html" />
 		<SLA					 	label="Service Level Authorization" 	href="service_level_auth.html"/>
 		<SLA					 	label="Service Level Authorization" 	href="service_level_auth.html"/>
 		<native_lib    				label="Native Libraries" 					href="native_libraries.html" />
 		<native_lib    				label="Native Libraries" 					href="native_libraries.html" />
+		<hod_scheduler 			label="Hadoop On Demand"            href="hod_scheduler.html"/>
    </docs>
    </docs>
 
 
    <docs label="Miscellaneous"> 
    <docs label="Miscellaneous"> 
@@ -68,6 +70,15 @@ See http://forrest.apache.org/docs/linking.html for more info.
     <hdfs-default href="http://hadoop.apache.org/hdfs/docs/current/hdfs-default.html" />
     <hdfs-default href="http://hadoop.apache.org/hdfs/docs/current/hdfs-default.html" />
     <mapred-default href="http://hadoop.apache.org/mapreduce/docs/current/mapred-default.html" />
     <mapred-default href="http://hadoop.apache.org/mapreduce/docs/current/mapred-default.html" />
     
     
+    <mapred-queues href="http://hadoop.apache.org/mapreduce/docs/current/mapred_queues.xml" />
+    <capacity-scheduler href="http://hadoop.apache.org/mapreduce/docs/current/capacity_scheduler.html" />
+    <mapred-tutorial href="http://hadoop.apache.org/mapreduce/docs/current/mapred_tutorial.html" >
+        <JobAuthorization href="#Job+Authorization" />
+    </mapred-tutorial>
+    <streaming href="http://hadoop.apache.org/mapreduce/docs/current/streaming.html" />
+    <distcp href="http://hadoop.apache.org/mapreduce/docs/current/distcp.html" />
+    <hadoop-archives href="http://hadoop.apache.org/mapreduce/docs/current/hadoop_archives.html" />
+    
     <zlib      href="http://www.zlib.net/" />
     <zlib      href="http://www.zlib.net/" />
     <gzip      href="http://www.gzip.org/" />
     <gzip      href="http://www.gzip.org/" />
     <bzip      href="http://www.bzip.org/" />
     <bzip      href="http://www.bzip.org/" />