16 years ago · 3045710ce4
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -13,6 +13,10 @@ Release 0.20.1 - Unreleased
 
				 
			
 
				     HADOOP-5711. Change Namenode file close log to info. (szetszwo)
			
 
				 
			
 
				+    HADOOP-5736. Update the capacity scheduler documentation for features
			
 
				+    like memory based scheduling, job initialization and removal of pre-emption.
			
 
				+    (Sreekanth Ramakrishnan via yhemanth)
			
 
				+
			
 
				   OPTIMIZATIONS
			
 
				 
			
 
				   BUG FIXES
			
--- a/conf/capacity-scheduler.xml.template
+++ b/conf/capacity-scheduler.xml.template
@@ -60,16 +60,13 @@
 
				   <property>
			
 
				     <name>mapred.capacity-scheduler.task.default-pmem-percentage-in-vmem</name>
			
 
				     <value>-1</value>
			
 
				-    <description>If mapred.task.maxpmem is set to -1, this configuration will
			
 
				-      be used to calculate job's physical memory requirements as a percentage of
			
 
				-      the job's virtual memory requirements set via mapred.task.maxvmem. This
			
 
				-      property thus provides default value of physical memory for job's that
			
 
				-      don't explicitly specify physical memory requirements.
			
 
				+    <description>A percentage (float) of the default VM limit for jobs
			
 
				+   	  (mapred.task.default.maxvmem). This is the default RAM task-limit 
			
 
				+   	  associated with a task. Unless overridden by a job's setting, this 
			
 
				+   	  number defines the RAM task-limit.
			
 
				 
			
 
				-      If not explicitly set to a valid value, scheduler will not consider
			
 
				-      physical memory for scheduling even if virtual memory based scheduling is
			
 
				-      enabled(by setting valid values for both mapred.task.default.maxvmem and
			
 
				-      mapred.task.limit.maxvmem).
			
 
				+      If this property is missing, or set to an invalid value, scheduling 
			
 
				+      based on physical memory, RAM, is disabled.  
			
 
				     </description>
			
 
				   </property>
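As a rough illustration of the new description, the default RAM task-limit could be derived from the default VMEM limit as sketched below. This is not the scheduler's code: the function name is hypothetical, and reading the value as a percentage on a 0-100 scale (with anything outside that range treated as invalid) is an assumption the description does not spell out.

```python
from typing import Optional


def default_ram_task_limit(default_vmem_limit: int,
                           pmem_percentage: float) -> Optional[int]:
    """Return the default RAM task-limit in bytes, or None when RAM-based
    scheduling would be disabled.

    Assumption: the percentage is on a 0-100 scale, and a negative VMEM
    limit or an out-of-range percentage (such as -1) counts as invalid.
    """
    if default_vmem_limit < 0:
        return None
    if not 0 <= pmem_percentage <= 100:
        return None
    return int(default_vmem_limit * pmem_percentage / 100)
```

With the template's default value of -1, this sketch returns None, which matches the statement that RAM-based scheduling is disabled when the property is invalid.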
			
 
				 
			
--- a/src/docs/src/documentation/content/xdocs/capacity_scheduler.xml
+++ b/src/docs/src/documentation/content/xdocs/capacity_scheduler.xml
@@ -28,7 +28,9 @@
 
				     <section>
			
 
				       <title>Purpose</title>
			
 
				       
			
 
				-      <p>This document describes the Capacity Scheduler, a pluggable Map/Reduce scheduler for Hadoop which provides a way to share large clusters.</p>
			
 
				+      <p>This document describes the Capacity Scheduler, a pluggable 
			
 
				+      Map/Reduce scheduler for Hadoop which provides a way to share 
			
 
				+      large clusters.</p>
			
 
				     </section>
			
 
				     
			
 
				     <section>
			
@@ -40,19 +42,17 @@
 
				           Support for multiple queues, where a job is submitted to a queue.
			
 
				         </li>
			
 
				         <li>
			
 
				-          Queues are guaranteed a fraction of the capacity of the grid (their 
			
 
				- 	      'guaranteed capacity') in the sense that a certain capacity of 
			
 
				- 	      resources will be at their disposal. All jobs submitted to a 
			
 
				- 	      queue will have access to the capacity guaranteed to the queue.
			
 
				+          Queues are allocated a fraction of the capacity of the grid in the 
			
 
				+          sense that a certain capacity of resources will be at their 
			
 
				+          disposal. All jobs submitted to a queue will have access to the 
			
 
				+          capacity allocated to the queue.
			
 
				         </li>
			
 
				         <li>
			
 
				-          Free resources can be allocated to any queue beyond its guaranteed 
			
 
				-          capacity. These excess allocated resources can be reclaimed and made 
			
 
				-          available to another queue in order to meet its capacity guarantee.
			
 
				-        </li>
			
 
				-        <li>
			
 
				-          The scheduler guarantees that excess resources taken from a queue 
			
 
				-          will be restored to it within N minutes of its need for them.
			
 
				+          Free resources can be allocated to any queue beyond its capacity. 
			
 
				+          When there is demand for these resources from queues running below 
			
 
				+          capacity at a future point in time, as tasks scheduled on these 
			
 
				+          resources complete, they will be assigned to jobs on queues 
			
 
				+          running below the capacity.
			
 
				         </li>
			
 
				         <li>
			
 
				           Queues optionally support job priorities (disabled by default).
			
@@ -60,7 +60,9 @@
 
				         <li>
			
 
				           Within a queue, jobs with higher priority will have access to the 
			
 
				           queue's resources before jobs with lower priority. However, once a 
			
 
				-          job is running, it will not be preempted for a higher priority job.
			
 
				+          job is running, it will not be preempted for a higher priority job,
			
 
				+          though new tasks from the higher priority job will be 
			
 
				+          preferentially scheduled.
			
 
				         </li>
			
 
				         <li>
			
 
				           In order to prevent one or more users from monopolizing its 
			
@@ -83,59 +85,34 @@
 
				       <p>Note that many of these steps can be, and will be, enhanced over time
			
 
				       to provide better algorithms.</p>
			
 
				       
			
 
				-      <p>Whenever a TaskTracker is free, the Capacity Scheduler first picks a 
			
 
				-      queue that needs to reclaim any resources the earliest (this is a queue
			
 
				-      whose resources were temporarily being used by some other queue and now
			
 
				-      needs access to those resources). If no such queue is found, it then picks
			
 
				+      <p>Whenever a TaskTracker is free, the Capacity Scheduler picks 
			
 
				       a queue which has the most free space (whose ratio of # of running slots to 
			
 
				-      guaranteed capacity is the lowest).</p>
			
 
				+      capacity is the lowest).</p>
			
 
				       
			
 
				-      <p>Once a queue is selected, the scheduler picks a job in the queue. Jobs
			
 
				+      <p>Once a queue is selected, the Scheduler picks a job in the queue. Jobs
			
 
				       are sorted based on when they're submitted and their priorities (if the 
			
 
				       queue supports priorities). Jobs are considered in order, and a job is 
			
 
				       selected if its user is within the user-quota for the queue, i.e., the 
			
 
				       user is not already using queue resources above his/her limit. The 
			
 
				-      scheduler also makes sure that there is enough free memory in the 
			
 
				+      Scheduler also makes sure that there is enough free memory in the 
			
 
				       TaskTracker to run the job's task, in case the job has special memory
			
 
				       requirements.</p>
			
 
				       
			
 
				-      <p>Once a job is selected, the scheduler picks a task to run. This logic 
			
 
				+      <p>Once a job is selected, the Scheduler picks a task to run. This logic 
			
 
				       to pick a task remains unchanged from earlier versions.</p> 
			
 
				       
			
 
				     </section>
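The queue and job selection described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the scheduler's implementation: the `Queue` and `Job` classes, the `user_limit_slots` field, and the "higher number means higher priority" encoding are all hypothetical, and capacity is assumed to be greater than zero.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Job:
    name: str
    user: str
    priority: int        # assumption: larger value means higher priority
    submit_time: float


@dataclass
class Queue:
    name: str
    capacity: float               # slots allocated to the queue (> 0)
    running_slots: int            # slots currently used by the queue
    supports_priority: bool = False
    user_limit_slots: int = 0     # hypothetical per-user slot limit
    slots_by_user: Dict[str, int] = field(default_factory=dict)
    jobs: List[Job] = field(default_factory=list)


def pick_queue(queues: List[Queue]) -> Queue:
    """Pick the queue whose ratio of running slots to capacity is lowest."""
    return min(queues, key=lambda q: q.running_slots / q.capacity)


def pick_job(queue: Queue) -> Optional[Job]:
    """Consider jobs in order and return the first whose user is within
    the per-user limit, or None if no job qualifies."""
    if queue.supports_priority:
        ordered = sorted(queue.jobs, key=lambda j: (-j.priority, j.submit_time))
    else:
        ordered = sorted(queue.jobs, key=lambda j: j.submit_time)
    for job in ordered:
        if queue.slots_by_user.get(job.user, 0) < queue.user_limit_slots:
            return job
    return None
```

The memory check mentioned in the text is left out here to keep the sketch focused on ordering.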
			
 
				     
			
 
				-    <section>
			
 
				-      <title>Reclaiming capacity</title>
			
 
				-
			
 
				-	  <p>Periodically, the scheduler determines:</p>
			
 
				-	  <ul>
			
 
				-	    <li>
			
 
				-	      if a queue needs to reclaim capacity. This happens when a queue has
			
 
				-	      at least one task pending and part of its guaranteed capacity is 
			
 
				-	      being used by some other queue. If this happens, the scheduler notes
			
 
				-	      the amount of resources it needs to reclaim for this queue within a 
			
 
				-	      specified period of time (the reclaim time). 
			
 
				-	    </li>
			
 
				-	    <li>
			
 
				-	      if a queue has not received all the resources it needed to reclaim,
			
 
				-	      and its reclaim time is about to expire. In this case, the scheduler
			
 
				-	      needs to kill tasks from queues running over capacity. This it does
			
 
				-	      by killing the tasks that started the latest.
			
 
				-	    </li>
			
 
				-	  </ul>   
			
 
				-
			
 
				-    </section>
			
 
				-
			
 
				     <section>
			
 
				       <title>Installation</title>
			
 
				       
			
 
				-        <p>The capacity scheduler is available as a JAR file in the Hadoop
			
 
				+        <p>The Capacity Scheduler is available as a JAR file in the Hadoop
			
 
				         tarball under the <em>contrib/capacity-scheduler</em> directory. The name of 
			
 
				         the JAR file would be on the lines of hadoop-*-capacity-scheduler.jar.</p>
			
 
				-        <p>You can also build the scheduler from source by executing
			
 
				+        <p>You can also build the Scheduler from source by executing
			
 
				         <em>ant package</em>, in which case it would be available under
			
 
				         <em>build/contrib/capacity-scheduler</em>.</p>
			
 
				-        <p>To run the capacity scheduler in your Hadoop installation, you need 
			
 
				+        <p>To run the Capacity Scheduler in your Hadoop installation, you need 
			
 
				         to put it on the <em>CLASSPATH</em>. The easiest way is to copy the 
			
 
				         <code>hadoop-*-capacity-scheduler.jar</code> from 
			
 
				         to <code>HADOOP_HOME/lib</code>. Alternatively, you can modify 
			
@@ -147,9 +124,9 @@
 
				       <title>Configuration</title>
			
 
				 
			
 
				       <section>
			
 
				-        <title>Using the capacity scheduler</title>
			
 
				+        <title>Using the Capacity Scheduler</title>
			
 
				         <p>
			
 
				-          To make the Hadoop framework use the capacity scheduler, set up
			
 
				+          To make the Hadoop framework use the Capacity Scheduler, set up
			
 
				           the following property in the site configuration:</p>
			
 
				           <table>
			
 
				             <tr>
			
@@ -167,7 +144,7 @@
 
				         <title>Setting up queues</title>
			
 
				         <p>
			
 
				           You can define multiple queues to which users can submit jobs with
			
 
				-          the capacity scheduler. To define multiple queues, you should edit
			
 
				+          the Capacity Scheduler. To define multiple queues, you should edit
			
 
				           the site configuration for Hadoop and modify the
			
 
				           <em>mapred.queue.names</em> property.
			
 
				         </p>
			
@@ -185,8 +162,8 @@
 
				       <section>
			
 
				         <title>Configuring properties for queues</title>
			
 
				 
			
 
				-        <p>The capacity scheduler can be configured with several properties
			
 
				-        for each queue that control the behavior of the scheduler. This
			
 
				+        <p>The Capacity Scheduler can be configured with several properties
			
 
				+        for each queue that control the behavior of the Scheduler. This
			
 
				         configuration is in the <em>conf/capacity-scheduler.xml</em>. By
			
 
				         default, the configuration is set up for one queue, named 
			
 
				         <em>default</em>.</p>
			
@@ -194,10 +171,10 @@
 
				         configuration, you should use the property name as
			
 
				         <em>mapred.capacity-scheduler.queue.&lt;queue-name&gt;.&lt;property-name&gt;</em>.
			
 
				         </p>
			
 
				-        <p>For example, to define the property <em>guaranteed-capacity</em>
			
 
				+        <p>For example, to define the property <em>capacity</em>
			
 
				         for queue named <em>research</em>, you should specify the property
			
 
				         name as 
			
 
				-        <em>mapred.capacity-scheduler.queue.research.guaranteed-capacity</em>.
			
 
				+        <em>mapred.capacity-scheduler.queue.research.capacity</em>.
			
 
				         </p>
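The naming rule above can be sketched as a small helper. The function names here are hypothetical and only illustrate the `mapred.capacity-scheduler.queue.<queue-name>.<property-name>` pattern, together with the documented rule that queue capacities should sum to at most 100.

```python
from typing import Dict


def queue_property(queue_name: str, property_name: str) -> str:
    """Build the full property name for a per-queue setting."""
    return f"mapred.capacity-scheduler.queue.{queue_name}.{property_name}"


def capacities_valid(capacities: Dict[str, float]) -> bool:
    """Check that the capacities of all queues sum to at most 100 percent."""
    return sum(capacities.values()) <= 100
```

For the queue named research, `queue_property("research", "capacity")` yields the property name quoted in the text.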
			
 
				 
			
 
				         <p>The properties defined for queues and their descriptions are
			
@@ -205,15 +182,10 @@
 
				 
			
 
				         <table>
			
 
				           <tr><th>Name</th><th>Description</th></tr>
			
 
				-          <tr><td>mapred.capacity-scheduler.queue.&lt;queue-name&gt;.guaranteed-capacity</td>
			
 
				-          	<td>Percentage of the number of slots in the cluster that are
			
 
				-          	guaranteed to be available for jobs in this queue. 
			
 
				-          	The sum of guaranteed capacities for all queues should be less 
			
 
				-          	than or equal 100.</td>
			
 
				-          </tr>
			
 
				-          <tr><td>mapred.capacity-scheduler.queue.&lt;queue-name&gt;.reclaim-time-limit</td>
			
 
				-          	<td>The amount of time, in seconds, before which resources 
			
 
				-          	distributed to other queues will be reclaimed.</td>
			
 
				+          <tr><td>mapred.capacity-scheduler.queue.&lt;queue-name&gt;.capacity</td>
			
 
				+          	<td>Percentage of the number of slots in the cluster that are made 
			
 
				+            available for jobs in this queue. The sum of capacities 
			
 
				+            for all queues should be less than or equal to 100.</td>
			
 
				           </tr>
			
 
				           <tr><td>mapred.capacity-scheduler.queue.&lt;queue-name&gt;.supports-priority</td>
			
 
				           	<td>If true, priorities of jobs will be taken into account in scheduling 
			
@@ -236,27 +208,133 @@
 
				       </section>
			
 
				       
			
 
				       <section>
			
 
				-        <title>Configuring the capacity scheduler</title>
			
 
				-        <p>The capacity scheduler's behavior can be controlled through the 
			
 
				-          following properties. 
			
 
				+        <title>Memory management</title>
			
 
				+      
			
 
				+        <p>The Capacity Scheduler supports scheduling of tasks on a
			
 
				+        <code>TaskTracker</code> (TT) based on a job's memory requirements
			
 
				+        and the availability of RAM and Virtual Memory (VMEM) on the TT node.
			
 
				+        See the <a href="mapred_tutorial.html#Memory+monitoring">Hadoop 
			
 
				+        Map/Reduce tutorial</a> for details on how the TT monitors
			
 
				+        memory usage.</p>
			
 
				+        <p>Currently, memory-based scheduling is supported only
			
 
				+        on the Linux platform.</p>
			
 
				+        <p>Memory-based scheduling works as follows:</p>
			
 
				+        <ol>
			
 
				+          <li>Memory-based scheduling is disabled, just as memory monitoring 
			
 
				+          is disabled for a TT, if any of the following three config 
			
 
				+          parameters is missing or set to -1: 
			
 
				+          <code>mapred.tasktracker.vmem.reserved</code>, 
			
 
				+          <code>mapred.task.default.maxvmem</code>, or
			
 
				+          <code>mapred.task.limit.maxvmem</code>. These
			
 
				+          config parameters are described in the 
			
 
				+          <a href="mapred_tutorial.html#Memory+monitoring">Hadoop Map/Reduce 
			
 
				+          tutorial</a>. The value of  
			
 
				+          <code>mapred.tasktracker.vmem.reserved</code> is 
			
 
				+          obtained from the TT via its heartbeat. 
			
 
				+          </li>
			
 
				+          <li>If all the three mandatory parameters are set, the Scheduler 
			
 
				+          enables VMEM-based scheduling. First, the Scheduler computes the free
			
 
				+          VMEM on the TT. This is the difference between the available VMEM on the
			
 
				+          TT (the node's total VMEM minus the offset, both of which are sent by 
			
 
				+          the TT on each heartbeat) and the sum of VMEM already allocated to 
			
 
				+          running tasks (i.e., sum of the VMEM task-limits). Next, the Scheduler
			
 
				+          looks at the VMEM requirements for the job that's first in line to 
			
 
				+          run. If the job's VMEM requirements are less than the available VMEM on 
			
 
				+          the node, the job's task can be scheduled. If not, the Scheduler 
			
 
				+          ensures that the TT does not get a task to run (provided the job 
			
 
				+          has tasks to run). This way, the Scheduler ensures that jobs with 
			
 
				+          high memory requirements are not starved, as eventually, the TT 
			
 
				+          will have enough VMEM available. If the high-mem job does not have 
			
 
				+          any task to run, the Scheduler moves on to the next job. 
			
 
				+          </li>
			
 
				+          <li>In addition to VMEM, the Capacity Scheduler can also consider 
			
 
				+          RAM on the TT node. RAM is considered the same way as VMEM. TTs report
			
 
				+          the total RAM available on their node, and an offset. If both are
			
 
				+          set, the Scheduler computes the available RAM on the node. Next, 
			
 
				+          the Scheduler figures out the RAM requirements of the job, if any. 
			
 
				+          As with VMEM, users can optionally specify a RAM limit for their job
			
 
				+          (<code>mapred.task.maxpmem</code>, described in the Map/Reduce 
			
 
				+          tutorial). The Scheduler also maintains a limit for this value 
			
 
				+          (<code>mapred.capacity-scheduler.task.default-pmem-percentage-in-vmem</code>, 
			
 
				+          described below). All these three values must be set for the 
			
 
				+          Scheduler to schedule tasks based on RAM constraints.
			
 
				+          </li>
			
 
				+          <li>The Scheduler ensures that jobs cannot ask for RAM or VMEM higher
			
 
				+          than configured limits. If this happens, the job is failed when it
			
 
				+          is submitted. 
			
 
				+          </li>
			
 
				+        </ol>
			
 
				+        
			
 
				+        <p>As described above, the additional scheduler-based config 
			
 
				+        parameters are as follows:</p>
			
 
				+
			
 
				+        <table>
			
 
				+          <tr><th>Name</th><th>Description</th></tr>
			
 
				+          <tr><td>mapred.capacity-scheduler.task.default-pmem-percentage-in-vmem</td>
			
 
				+          	<td>A percentage of the default VMEM limit for jobs
			
 
				+          	(<code>mapred.task.default.maxvmem</code>). This is the default 
			
 
				+          	RAM task-limit associated with a task. Unless overridden by a 
			
 
				+          	job's setting, this number defines the RAM task-limit.</td>
			
 
				+          </tr>
			
 
				+          <tr><td>mapred.capacity-scheduler.task.limit.maxpmem</td>
			
 
				+          <td>An upper limit on the physical memory that a job can
			
 
				+           request. A job that requires more physical memory than 
			
 
				+           this limit is 
			
 
				+           rejected.</td>
			
 
				+          </tr>
			
 
				+        </table>
			
 
				+      </section>
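The VMEM check described in the numbered list above can be sketched as follows. This is an illustration under stated assumptions rather than the scheduler's code: the function names are hypothetical, missing parameters are modelled as `None`, and the strict "less than" comparison simply follows the wording of the text.

```python
from typing import Iterable, Optional

DISABLED = -1  # value that disables memory-based scheduling


def vmem_scheduling_enabled(reserved: Optional[int],
                            default_maxvmem: Optional[int],
                            limit_maxvmem: Optional[int]) -> bool:
    """All three parameters must be present and not set to -1."""
    values = (reserved, default_maxvmem, limit_maxvmem)
    return all(v is not None and v != DISABLED for v in values)


def free_vmem(total_vmem: int, reserved: int,
              running_task_limits: Iterable[int]) -> int:
    """Available VMEM (total minus offset) minus the sum of the VMEM
    task-limits of tasks already running on the TT."""
    return (total_vmem - reserved) - sum(running_task_limits)


def can_schedule(job_vmem: int, total_vmem: int, reserved: int,
                 running_task_limits: Iterable[int]) -> bool:
    """A task fits if the job's VMEM requirement is less than the free VMEM."""
    return job_vmem < free_vmem(total_vmem, reserved, running_task_limits)
```

When `can_schedule` is false for the job that is first in line, the text says the TT gets no task at all, which is what keeps high-memory jobs from starving.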
			
 
				+   <section>
			
 
				+        <title>Job Initialization Parameters</title>
			
 
				+        <p>The Capacity Scheduler initializes jobs lazily, before they are
			
 
				+        scheduled, to reduce the memory footprint on the JobTracker. 
			
 
				+        The following parameters, which control how lazily jobs are
			
 
				+        initialized, can be 
			
 
				+        configured in capacity-scheduler.xml.
			
 
				         </p>
			
 
				+        
			
 
				         <table>
			
 
				+          <tr><th>Name</th><th>Description</th></tr>
			
 
				           <tr>
			
 
				-          <th>Name</th><th>Description</th>
			
 
				+            <td>
			
 
				+              mapred.capacity-scheduler.queue.&lt;queue-name&gt;.maximum-initialized-jobs-per-user
			
 
				+            </td>
			
 
				+            <td>
			
 
				+              Maximum number of jobs which are allowed to be pre-initialized for
			
 
				+              a particular user in the queue. Once a job is scheduled, i.e., 
			
 
				+              it starts running, it is no longer counted
			
 
				+              when the scheduler computes the number of jobs a user is allowed to
			
 
				+              initialize. 
			
 
				+            </td>
			
 
				           </tr>
			
 
				           <tr>
			
 
				-          <td>mapred.capacity-scheduler.reclaimCapacity.interval</td>
			
 
				-          <td>The time interval, in seconds, between which the scheduler 
			
 
				-          periodically determines whether capacity needs to be reclaimed for 
			
 
				-          any queue. The default value is 5 seconds.
			
 
				-          </td>
			
 
				+            <td>
			
 
				+              mapred.capacity-scheduler.init-poll-interval
			
 
				+            </td>
			
 
				+            <td>
			
 
				+              Amount of time in milliseconds which is used to poll the scheduler
			
 
				+              job queue to look for jobs to be initialized.
			
 
				+            </td>
			
 
				+          </tr>
			
 
				+          <tr>
			
 
				+            <td>
			
 
				+              mapred.capacity-scheduler.init-worker-threads
			
 
				+            </td>
			
 
				+            <td>
			
 
				+              Number of worker threads used by the initialization
			
 
				+              poller to initialize jobs in a set of queues. If the number 
			
 
				+              is equal to the number of job queues, each thread is 
			
 
				+              assigned jobs from one queue. If the number is less than the
			
 
				+              number of queues, a thread can get jobs from more than one queue,
			
 
				+              which it initializes in a round-robin fashion. If the number
			
 
				+              is greater than the number of queues, the number of threads spawned
			
 
				+              equals the number of job queues.
			
 
				+            </td>
			
 
				           </tr>
			
 
				         </table>
			
 
				-        
			
 
				-      </section>
			
 
				-
			
 
				+      </section>   
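The thread-count rule for `mapred.capacity-scheduler.init-worker-threads` can be sketched as below. The function name is hypothetical, and dealing queues out to threads in turn is an assumption: the text only says that a thread may serve more than one queue and initializes them in a round-robin fashion.

```python
from typing import List


def assign_queues_to_workers(queue_names: List[str],
                             configured_threads: int) -> List[List[str]]:
    """Return one list of queue names per worker thread.

    The number of threads is capped at the number of queues, as the
    description states. Assumption: queues are dealt out to threads
    in turn when there are fewer threads than queues.
    """
    if configured_threads < 1 or not queue_names:
        return []
    thread_count = min(configured_threads, len(queue_names))
    workers: List[List[str]] = [[] for _ in range(thread_count)]
    for index, queue in enumerate(queue_names):
        workers[index % thread_count].append(queue)
    return workers
```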
			
 
				       <section>
			
 
				-        <title>Reviewing the configuration of the capacity scheduler</title>
			
 
				+        <title>Reviewing the configuration of the Capacity Scheduler</title>
			
 
				         <p>
			
 
				           Once the installation and configuration is completed, you can review
			
 
				           it after starting the Map/Reduce cluster from the admin UI.
			
@@ -270,7 +348,8 @@
 
				               Information</em> column against each queue.</li>
			
 
				         </ul>
			
 
				       </section>
			
 
				-    </section>
			
 
				+      
			
 
				+   </section>
			
 
				   </body>
			
 
				   
			
 
				 </document>
			
--- a/src/docs/src/documentation/content/xdocs/cluster_setup.xml
+++ b/src/docs/src/documentation/content/xdocs/cluster_setup.xml
@@ -463,6 +463,120 @@
 
				           </section>
			
 
				           
			
 
				         </section>
			
 
				+        <section>
			
 
				+        <title>Memory monitoring</title>
			
 
				+        <p>A <code>TaskTracker</code> (TT) can be configured to monitor memory 
			
 
				+        usage of tasks it spawns, so that badly-behaved jobs do not bring 
			
 
				+        down a machine due to excess memory consumption. With monitoring 
			
 
				+        enabled, every task is assigned a task-limit for virtual memory (VMEM). 
			
 
				+        In addition, every node is assigned a node-limit for VMEM usage. 
			
 
				+        A TT ensures that a task is killed if it, and 
			
 
				+        its descendants, use VMEM over the task's per-task limit. It also 
			
 
				+        ensures that one or more tasks are killed if the sum total of VMEM 
			
 
				+        usage by all tasks, and their descendants, crosses the node-limit.</p>
			
 
				+        
			
 
				+        <p>Users can, optionally, specify the VMEM task-limit per job. If no
			
 
				+        such limit is provided, a default limit is used. A node-limit can be 
			
 
				+        set per node.</p>   
			
 
				+        <p>Currently, memory monitoring and management are supported only
			
 
				+        on the Linux platform.</p>
			
 
				+        <p>To enable monitoring for a TT, the 
			
 
				+        following parameters all need to be set:</p> 
			
 
				+
			
 
				+        <table>
			
 
				+          <tr><th>Name</th><th>Type</th><th>Description</th></tr>
			
 
				+          <tr><td>mapred.tasktracker.vmem.reserved</td><td>long</td>
			
 
				+            <td>A number, in bytes, that represents an offset. The total VMEM on 
			
 
				+            the machine, minus this offset, is the VMEM node-limit for all 
			
 
				+            tasks, and their descendants, spawned by the TT. 
			
 
				+          </td></tr>
			
 
				+          <tr><td>mapred.task.default.maxvmem</td><td>long</td>
			
 
				+            <td>A number, in bytes, that represents the default VMEM task-limit 
			
 
				+            associated with a task. Unless overridden by a job's setting, 
			
 
				+            this number defines the VMEM task-limit.   
			
 
				+          </td></tr>
			
 
				+          <tr><td>mapred.task.limit.maxvmem</td><td>long</td>
			
 
				+            <td>A number, in bytes, that represents the upper VMEM task-limit 
			
 
				+            associated with a task. Users, when specifying a VMEM task-limit 
			
 
				+            for their tasks, should not specify a limit which exceeds this amount. 
			
 
				+          </td></tr>
			
 
				+        </table>
			
 
				+        
			
 
				+        <p>In addition, the following parameters can also be configured.</p>
			
 
				+
			
 
				+    <table>
			
 
				+          <tr><th>Name</th><th>Type</th><th>Description</th></tr>
			
 
				+          <tr><td>mapred.tasktracker.taskmemorymanager.monitoring-interval</td>
			
 
				+            <td>long</td>
			
 
				+            <td>The time interval, in milliseconds, between which the TT 
			
 
				+            checks for any memory violation. The default value is 5000 msec
			
 
				+            (5 seconds). 
			
 
				+          </td></tr>
			
 
				+        </table>
			
 
				+        
			
 
				+        <p>Here's how the memory monitoring works for a TT.</p>
			
 
				+        <ol>
			
 
				+          <li>If one or more of the configuration parameters described 
			
 
				+          above are missing or set to -1, memory monitoring is 
			
 
				+          disabled for the TT.
			
 
				+          </li>
			
 
				+          <li>In addition, monitoring is disabled if 
			
 
				+          <code>mapred.task.default.maxvmem</code> is greater than 
			
 
				+          <code>mapred.task.limit.maxvmem</code>. 
			
 
				+          </li>
			
 
				+          <li>If a TT receives a task whose task-limit is set by the user
			
 
				+          to a value larger than <code>mapred.task.limit.maxvmem</code>, it 
			
 
				+          logs a warning but executes the task.
			
 
				+          </li> 
			
 
				+          <li>Periodically, the TT checks the following: 
			
 
				+          <ul>
			
 
				+            <li>If any task's current VMEM usage is greater than that task's
			
 
				+            VMEM task-limit, the task is killed and the reason for killing 
			
 
				+            the task is logged in the task diagnostics. Such a task is considered 
			
 
				+            failed, i.e., the killing counts towards the task's failure count.
			
 
				+            </li> 
			
 
				+            <li>If the sum total of VMEM used by all tasks and descendants is 
			
 
				+            greater than the node-limit, the TT kills enough tasks, in the
			
 
				+            order of least progress made, till the overall VMEM usage falls
			
 
				+            below the node-limit. Such killed tasks are not considered failed
			
 
				+            and their killing does not count towards the tasks' failure counts.
			
 
				+            </li>
			
 
				+          </ul>
			
 
				+          </li>
			
 
				+        </ol>
			
 
				+        
			
 
				+        <p>Schedulers can choose to ease the monitoring pressure on the TT by 
			
 
				+        preventing too many tasks from running on a node and by scheduling 
			
 
				+        tasks only if the TT has enough VMEM free. In addition, Schedulers may 
			
 
				+        choose to consider the physical memory (RAM) available on the node
			
 
				+        as well. To enable Scheduler support, TTs report their memory settings 
			
 
				+        to the JobTracker in every heartbeat. Before getting into details, 
			
 
				+        consider the following additional memory-related parameters that can be 
			
 
				+        configured to enable better scheduling:</p> 
			
 
				+
			
 
				+        <table>
			
 
				+          <tr><th>Name</th><th>Type</th><th>Description</th></tr>
			
 
				+          <tr><td>mapred.tasktracker.pmem.reserved</td><td>int</td>
			
 
				+            <td>A number, in bytes, that represents an offset. The total 
			
 
				+            physical memory (RAM) on the machine, minus this offset, is the 
			
 
				+            recommended RAM node-limit. The RAM node-limit is a hint to a
			
 
				+            Scheduler to schedule only so many tasks such that the sum 
			
 
				+            total of their RAM requirements does not exceed this limit. 
			
 
				+            RAM usage is not monitored by a TT.   
			
 
				+          </td></tr>
			
 
				+        </table>
			
 
				+        
			
 
				+        <p>A TT reports the following memory-related numbers in every 
			
 
				+        heartbeat:</p>
			
 
				+        <ul>
			
 
				+          <li>The total VMEM available on the node.</li>
			
 
				+          <li>The value of <code>mapred.tasktracker.vmem.reserved</code>,
			
 
				+           if set.</li>
			
 
				+          <li>The total RAM available on the node.</li> 
			
 
				+          <li>The value of <code>mapred.tasktracker.pmem.reserved</code>,
			
 
				+           if set.</li>
			
 
				+         </ul>
			
 
				+        </section>
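The periodic check described in this section can be sketched as follows. This is a simplified model, not the TaskTracker's code: the `Task` class and function name are hypothetical, each task's `vmem_used` is assumed to already include its descendants, and removing per-task violators before computing the node total is an assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Task:
    task_id: str
    vmem_used: int     # current VMEM usage in bytes, descendants included
    vmem_limit: int    # the task's VMEM task-limit in bytes
    progress: float    # fraction of work completed, 0.0 to 1.0


def check_memory(tasks: List[Task], node_limit: int) -> Tuple[List[str], List[str]]:
    """Return (failed, killed) task ids for one monitoring pass.

    failed: tasks over their own task-limit (counts as a task failure).
    killed: tasks killed to get back under the node-limit, least
            progress first (does not count as a task failure).
    """
    failed = [t.task_id for t in tasks if t.vmem_used > t.vmem_limit]
    survivors = [t for t in tasks if t.vmem_used <= t.vmem_limit]

    killed: List[str] = []
    total = sum(t.vmem_used for t in survivors)
    for task in sorted(survivors, key=lambda t: t.progress):
        if total <= node_limit:
            break
        killed.append(task.task_id)
        total -= task.vmem_used
    return failed, killed
```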
			
 
				         
			
 
				         <section>
			
 
				           <title>Slaves</title>
			
--- a/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml
+++ b/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml
@@ -1104,8 +1104,26 @@
 
				         counters for a job- particularly relative to byte counts from the map
			
 
				         and into the reduce- is invaluable to the tuning of these
			
 
				         parameters.</p>
			
 
				+        
			
 
				+        <p>Users can choose to override default limits of Virtual Memory and RAM 
			
 
				+          enforced by the task tracker, if memory management is enabled. 
			
 
				+          Users can set the following parameters per job:</p>
			
 
				+           
			
 
				+          <table>
			
 
				+          <tr><th>Name</th><th>Type</th><th>Description</th></tr>
			
 
				+          <tr><td><code>mapred.task.maxvmem</code></td><td>int</td>
			
 
				+            <td>A number, in bytes, that represents the maximum Virtual Memory
			
 
				+            task-limit for each task of the job. A task will be killed if 
			
 
				+            it consumes more Virtual Memory than this number. 
			
 
				+          </td></tr>
			
 
				+          <tr><td>mapred.task.maxpmem</td><td>int</td>
			
 
				+            <td>A number, in bytes, that represents the maximum RAM task-limit
			
 
				+            for each task of the job. This number can be optionally used by
			
 
				+            Schedulers to prevent over-scheduling of tasks on a node based 
			
 
				+            on RAM needs.  
			
 
				+          </td></tr>
			
 
				+        </table>       
			
 
				         </section>
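How a per-job override interacts with the default limit can be sketched as below. The function name and the use of a plain dictionary for job configuration are hypothetical; the only behavior taken from the docs is that a job's own setting, when present, replaces the default task-limit.

```python
from typing import Mapping


def effective_task_limit(job_conf: Mapping[str, str], key: str,
                         default_limit: int) -> int:
    """Return the job's own task-limit in bytes if the job sets one,
    otherwise the cluster default."""
    value = job_conf.get(key)
    if value is None:
        return default_limit
    return int(value)
```

For example, a job that sets `mapred.task.maxvmem` gets that value, while a job that leaves it unset falls back to the default VMEM task-limit.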
			
 
				-
			
 
				         <section>
			
 
				           <title>Map Parameters</title>