
HADOOP-4145. Add an accounting plugin (script) for HOD. Contributed by Hemanth Yamijala.

git-svn-id: https://svn.apache.org/repos/asf/hadoop/core/trunk@694702 13f79535-47bb-0310-9956-ffa450edef68
Nigel Daley 16 years ago
parent commit 965b473f6e

+ 22 - 4
docs/changes.html

@@ -56,7 +56,7 @@
 </a></h2>
 <ul id="trunk_(unreleased_changes)_">
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._incompatible_changes_')">  INCOMPATIBLE CHANGES
-</a>&nbsp;&nbsp;&nbsp;(12)
+</a>&nbsp;&nbsp;&nbsp;(13)
     <ol id="trunk_(unreleased_changes)_._incompatible_changes_">
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3595">HADOOP-3595</a>. Remove deprecated methods for mapred.combine.once
 functionality, which was necessary to providing backwards
@@ -91,10 +91,14 @@ omalley)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3150">HADOOP-3150</a>. Moves task promotion to tasks. Defines a new interface for
 committing output files. Moves job setup to jobclient, and moves jobcleanup
 to a separate task.<br />(Amareshwari Sriramadasu via ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3446">HADOOP-3446</a>. Keep map outputs in memory during the reduce. Remove
+fs.inmemory.size.mb and replace with properties defining in memory map
+output retention during the shuffle and reduce relative to maximum heap
+usage.<br />(cdouglas)</li>
     </ol>
   </li>
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._new_features_')">  NEW FEATURES
-</a>&nbsp;&nbsp;&nbsp;(29)
+</a>&nbsp;&nbsp;&nbsp;(30)
     <ol id="trunk_(unreleased_changes)_._new_features_">
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3341">HADOOP-3341</a>. Allow streaming jobs to specify the field separator for map
 and reduce input and output. The new configuration values are:
@@ -157,10 +161,12 @@ chains of Maps and Reduces in a single Map/Reduce job, something like
 MAP+ / REDUCE MAP*.<br />(Alejandro Abdelnur via ddas)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3445">HADOOP-3445</a>. Add capacity scheduler that provides guaranteed capacities to
 queues as a percentage of the cluster.<br />(Vivek Ratan via omalley)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3992">HADOOP-3992</a>. Add a synthetic load generation facility to the test
+directory.<br />(hairong via szetszwo)</li>
     </ol>
   </li>
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._improvements_')">  IMPROVEMENTS
-</a>&nbsp;&nbsp;&nbsp;(49)
+</a>&nbsp;&nbsp;&nbsp;(50)
     <ol id="trunk_(unreleased_changes)_._improvements_">
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3908">HADOOP-3908</a>. Fuse-dfs: better error message if llibhdfs.so doesn't exist.<br />(Pete Wyckoff through zshao)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3732">HADOOP-3732</a>. Delay intialization of datanode block verification till
@@ -250,6 +256,8 @@ components.<br />(tomwhite)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3361">HADOOP-3361</a>. Implement renames for NativeS3FileSystem.<br />(Albert Chern via tomwhite)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3605">HADOOP-3605</a>. Make EC2 scripts show an error message if AWS_ACCOUNT_ID is
 unset.<br />(Al Hoang via tomwhite)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-4147">HADOOP-4147</a>. Remove unused class JobWithTaskContext from class
+JobInProgress.<br />(Amareshwari Sriramadasu via johan)</li>
     </ol>
   </li>
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._optimizations_')">  OPTIMIZATIONS
@@ -275,7 +283,7 @@ it from a different .crc file.<br />(Jothi Padmanabhan via ddas)</li>
     </ol>
   </li>
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._bug_fixes_')">  BUG FIXES
-</a>&nbsp;&nbsp;&nbsp;(60)
+</a>&nbsp;&nbsp;&nbsp;(66)
     <ol id="trunk_(unreleased_changes)_._bug_fixes_">
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3563">HADOOP-3563</a>.  Refactor the distributed upgrade code so that it is
 easier to identify datanode and namenode related code.<br />(dhruba)</li>
@@ -396,6 +404,16 @@ implementations and moves it to the JobTracker.<br />(Amareshwari Sriramadasu vi
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-4097">HADOOP-4097</a>. Make hive work well with speculative execution turned on.<br />(Joydeep Sen Sarma via dhruba)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-4113">HADOOP-4113</a>. Changes to libhdfs to not exit on its own, rather return
 an error code to the caller.<br />(Pete Wyckoff via dhruba)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-4054">HADOOP-4054</a>. Remove duplicate lease removal during edit log loading.<br />(hairong)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-4071">HADOOP-4071</a>. FSNameSystem.isReplicationInProgress should add an
+underReplicated block to the neededReplication queue using method
+"add" not "update".<br />(hairong)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-4154">HADOOP-4154</a>. Fix type warnings in WritableUtils.<br />(szetszwo via omalley)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-4133">HADOOP-4133</a>. Log files generated by Hive should reside in the
+build directory.<br />(Prasad Chakka via dhruba)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-4094">HADOOP-4094</a>. Hive now has hive-default.xml and hive-site.xml similar
+to core hadoop.<br />(Prasad Chakka via dhruba)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-4112">HADOOP-4112</a>. Handles cleanupTask in JobHistory<br />(Amareshwari Sriramadasu via ddas)</li>
     </ol>
   </li>
 </ul>

+ 35 - 0
docs/hod_admin_guide.html

@@ -245,6 +245,15 @@ document.write("Last Published: " + document.lastModified);
 </li>
 </ul>
 </li>
+<li>
+<a href="#verify-account+-+Script+to+verify+an+account+under+which+%0A+++++++++++++jobs+are+submitted">verify-account - Script to verify an account under which 
+             jobs are submitted</a>
+<ul class="minitoc">
+<li>
+<a href="#Integrating+the+verify-account+script+with+HOD">Integrating the verify-account script with HOD</a>
+</li>
+</ul>
+</li>
 </ul>
 </li>
 </ul>
@@ -646,8 +655,34 @@ in the HOD Configuration Guide.</p>
         constraints, for example via cron. Please note that the resource manager
         and scheduler commands used in this script can be expensive and so
         it is better not to run this inside a tight loop without sleeping.</p>
+<a name="N1022D"></a><a name="verify-account+-+Script+to+verify+an+account+under+which+%0A+++++++++++++jobs+are+submitted"></a>
+<h3 class="h4">verify-account - Script to verify an account under which 
+             jobs are submitted</h3>
+<p>Production systems use accounting packages to charge users for using
+      shared compute resources. HOD supports a parameter 
+      <em>resource_manager.pbs-account</em> to allow users to identify the
+      account under which they would like to submit jobs. It may be necessary
+      to verify that this account is a valid one configured in an accounting
+      system. The <em>hod-install-dir/bin/verify-account</em> script 
+      provides a mechanism to plug in a custom script that can do this
+      verification.</p>
+<a name="N1023C"></a><a name="Integrating+the+verify-account+script+with+HOD"></a>
+<h4>Integrating the verify-account script with HOD</h4>
+<p>HOD runs the <em>verify-account</em> script passing in the
+        <em>resource_manager.pbs-account</em> value as argument to the script,
+        before allocating a cluster. Sites can write a script that verifies this 
+        account against their accounting systems. Returning a non-zero exit 
+        code from this script will cause HOD to fail allocation. Also, in
+        case of an error, HOD will print the output of the script to the user.
+        Any descriptive error message can be passed to the user from the
+        script in this manner.</p>
+<p>The default script that comes with the HOD installation does not
+        do any validation, and returns a zero exit code.</p>
+<p>If the verify-account script is not found, HOD treats
+        verification as disabled, and continues allocation as is.</p>
 </div>
 
+
 </div>
 <!--+
     |end content
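
The admin guide text above specifies only the contract for verify-account: HOD passes the resource_manager.pbs-account value as the single argument, a non-zero exit code fails allocation, and anything the script prints is relayed to the user. As a minimal illustration of that contract, a site-side script might look like the following Python sketch; the allowed-accounts set is a hypothetical stand-in for a query against a real accounting package.

    #!/usr/bin/env python
    # Hypothetical verify-account script for HOD (illustrative sketch only).
    # Contract per the guide above: the account name arrives as argv[1]; a
    # non-zero exit code fails allocation; anything printed here is shown
    # to the user by HOD.
    import sys

    # Stand-in for a lookup against a real accounting system.
    VALID_ACCOUNTS = set(["research", "production"])

    def main():
        if len(sys.argv) < 2 or not sys.argv[1]:
            print("No account given; set resource_manager.pbs-account.")
            return 1
        if sys.argv[1] not in VALID_ACCOUNTS:
            print("Account '%s' is not known to the accounting system." % sys.argv[1])
            return 1
        return 0

    if __name__ == '__main__':
        sys.exit(main())

The default script shipped with HOD behaves like the trivial case of this sketch: it validates nothing and always exits zero.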

File diff suppressed because it is too large
+ 13 - 2
docs/hod_admin_guide.pdf


+ 9 - 8
docs/hod_user_guide.html

@@ -1021,7 +1021,8 @@ document.write("Last Published: " + document.lastModified);
 <td colspan="1" rowspan="1"> 5 </td>
         <td colspan="1" rowspan="1"> Job execution failure </td>
         <td colspan="1" rowspan="1"> 1. Torque Job was deleted from outside. Execute the Torque <span class="codefrag">qstat</span> command to see if you have any jobs in the <span class="codefrag">R</span> (Running) state. If none exist, try re-executing HOD. <br>
-          2. Torque problems such as the server momentarily going down, or becoming unresponsive. Contact system administrator. </td>
+          2. Torque problems such as the server momentarily going down, or becoming unresponsive. Contact system administrator. <br>
+          3. The system administrator might have configured account verification, and an invalid account is specified. Contact system administrator.</td>
       
 </tr>
       
@@ -1119,7 +1120,7 @@ document.write("Last Published: " + document.lastModified);
 </tr>
   
 </table>
-<a name="N10755"></a><a name="Hadoop+Jobs+Not+Running+on+a+Successfully+Allocated+Cluster"></a>
+<a name="N10757"></a><a name="Hadoop+Jobs+Not+Running+on+a+Successfully+Allocated+Cluster"></a>
 <h3 class="h4"> Hadoop Jobs Not Running on a Successfully Allocated Cluster </h3>
 <a name="Hadoop_Jobs_Not_Running_on_a_Suc" id="Hadoop_Jobs_Not_Running_on_a_Suc"></a>
 <p>This scenario generally occurs when a cluster is allocated, and is left inactive for some time, and then hadoop jobs are attempted to be run on them. Then Hadoop jobs fail with the following exception:</p>
@@ -1138,31 +1139,31 @@ document.write("Last Published: " + document.lastModified);
 <em>Possible Cause:</em> There is a version mismatch between the version of the hadoop client being used to submit jobs and the hadoop used in provisioning (typically via the tarball option). Ensure compatible versions are being used.</p>
 <p>
 <em>Possible Cause:</em> You used one of the options for specifying Hadoop configuration <span class="codefrag">-M or -H</span>, which had special characters like space or comma that were not escaped correctly. Refer to the section <em>Options Configuring HOD</em> for checking how to specify such options correctly.</p>
-<a name="N10790"></a><a name="My+Hadoop+Job+Got+Killed"></a>
+<a name="N10792"></a><a name="My+Hadoop+Job+Got+Killed"></a>
 <h3 class="h4"> My Hadoop Job Got Killed </h3>
 <a name="My_Hadoop_Job_Got_Killed" id="My_Hadoop_Job_Got_Killed"></a>
 <p>
 <em>Possible Cause:</em> The wallclock limit specified by the Torque administrator or the <span class="codefrag">-l</span> option defined in the section <em>Specifying Additional Job Attributes</em> was exceeded since allocation time. Thus the cluster would have got released. Deallocate the cluster and allocate it again, this time with a larger wallclock time.</p>
 <p>
 <em>Possible Cause:</em> Problems with the JobTracker node. Refer to the section in <em>Collecting and Viewing Hadoop Logs</em> to get more information.</p>
-<a name="N107AB"></a><a name="Hadoop+Job+Fails+with+Message%3A+%27Job+tracker+still+initializing%27"></a>
+<a name="N107AD"></a><a name="Hadoop+Job+Fails+with+Message%3A+%27Job+tracker+still+initializing%27"></a>
 <h3 class="h4"> Hadoop Job Fails with Message: 'Job tracker still initializing' </h3>
 <a name="Hadoop_Job_Fails_with_Message_Jo" id="Hadoop_Job_Fails_with_Message_Jo"></a>
 <p>
 <em>Possible Cause:</em> The hadoop job was being run as part of the HOD script command, and it started before the JobTracker could come up fully. Allocate the cluster using a large value for the configuration option <span class="codefrag">--hod.script-wait-time</span>. Typically a value of 120 should work, though it is typically unnecessary to be that large.</p>
-<a name="N107BB"></a><a name="The+Exit+Codes+For+HOD+Are+Not+Getting+Into+Torque"></a>
+<a name="N107BD"></a><a name="The+Exit+Codes+For+HOD+Are+Not+Getting+Into+Torque"></a>
 <h3 class="h4"> The Exit Codes For HOD Are Not Getting Into Torque </h3>
 <a name="The_Exit_Codes_For_HOD_Are_Not_G" id="The_Exit_Codes_For_HOD_Are_Not_G"></a>
 <p>
 <em>Possible Cause:</em> Version 0.16 of hadoop is required for this functionality to work. The version of Hadoop used does not match. Use the required version of Hadoop.</p>
 <p>
 <em>Possible Cause:</em> The deallocation was done without using the <span class="codefrag">hod</span> command; for e.g. directly using <span class="codefrag">qdel</span>. When the cluster is deallocated in this manner, the HOD processes are terminated using signals. This results in the exit code to be based on the signal number, rather than the exit code of the program.</p>
-<a name="N107D3"></a><a name="The+Hadoop+Logs+are+Not+Uploaded+to+DFS"></a>
+<a name="N107D5"></a><a name="The+Hadoop+Logs+are+Not+Uploaded+to+DFS"></a>
 <h3 class="h4"> The Hadoop Logs are Not Uploaded to DFS </h3>
 <a name="The_Hadoop_Logs_are_Not_Uploaded" id="The_Hadoop_Logs_are_Not_Uploaded"></a>
 <p>
 <em>Possible Cause:</em> There is a version mismatch between the version of the hadoop being used for uploading the logs and the external HDFS. Ensure that the correct version is specified in the <span class="codefrag">hodring.pkgs</span> option.</p>
-<a name="N107E3"></a><a name="Locating+Ringmaster+Logs"></a>
+<a name="N107E5"></a><a name="Locating+Ringmaster+Logs"></a>
 <h3 class="h4"> Locating Ringmaster Logs </h3>
 <a name="Locating_Ringmaster_Logs" id="Locating_Ringmaster_Logs"></a>
 <p>To locate the ringmaster logs, follow these steps: </p>
@@ -1179,7 +1180,7 @@ document.write("Last Published: " + document.lastModified);
 <li> If you don't get enough information, you may want to set the ringmaster debug level to 4. This can be done by passing <span class="codefrag">--ringmaster.debug 4</span> to the hod command line.</li>
   
 </ul>
-<a name="N1080F"></a><a name="Locating+Hodring+Logs"></a>
+<a name="N10811"></a><a name="Locating+Hodring+Logs"></a>
 <h3 class="h4"> Locating Hodring Logs </h3>
 <a name="Locating_Hodring_Logs" id="Locating_Hodring_Logs"></a>
 <p>To locate hodring logs, follow the steps below: </p>

File diff suppressed because it is too large
+ 7 - 8
docs/hod_user_guide.pdf


+ 213 - 28
docs/mapred_tutorial.html

@@ -247,6 +247,14 @@ document.write("Last Published: " + document.lastModified);
 </li>
 <li>
 <a href="#Task+Execution+%26+Environment">Task Execution &amp; Environment</a>
+<ul class="minitoc">
+<li>
+<a href="#Map+Parameters">Map Parameters</a>
+</li>
+<li>
+<a href="#Shuffle%2FReduce+Parameters">Shuffle/Reduce Parameters</a>
+</li>
+</ul>
 </li>
 <li>
 <a href="#Job+Submission+and+Monitoring">Job Submission and Monitoring</a>
@@ -316,7 +324,7 @@ document.write("Last Published: " + document.lastModified);
 <a href="#Example%3A+WordCount+v2.0">Example: WordCount v2.0</a>
 <ul class="minitoc">
 <li>
-<a href="#Source+Code-N10E46">Source Code</a>
+<a href="#Source+Code-N10F30">Source Code</a>
 </li>
 <li>
 <a href="#Sample+Runs">Sample Runs</a>
@@ -1608,6 +1616,183 @@ document.write("Last Published: " + document.lastModified);
         greater than any value specified for a maximum heap-size
         of the child jvm via <span class="codefrag">mapred.child.java.opts</span>, or a ulimit
         value in <span class="codefrag">mapred.child.ulimit</span>. </p>
+<p>The memory available to some parts of the framework is also
+        configurable. In map and reduce tasks, performance may be influenced
+        by adjusting parameters influencing the concurrency of operations and
+        the frequency with which data will hit disk. Monitoring the filesystem
+        counters for a job- particularly relative to byte counts from the map
+        and into the reduce- is invaluable to the tuning of these
+        parameters.</p>
+<a name="N108E9"></a><a name="Map+Parameters"></a>
+<h4>Map Parameters</h4>
+<p>A record emitted from a map will be serialized into a buffer and
+          metadata will be stored into accounting buffers. As described in the
+          following options, when either the serialization buffer or the
+          metadata exceed a threshold, the contents of the buffers will be
+          sorted and written to disk in the background while the map continues
+          to output records. If either buffer fills completely while the spill
+          is in progress, the map thread will block. When the map is finished,
+          any remaining records are written to disk and all on-disk segments
+          are merged into a single file. Minimizing the number of spills to
+          disk can decrease map time, but a larger buffer also decreases the
+          memory available to the mapper.</p>
+<table class="ForrestTable" cellspacing="1" cellpadding="4">
+            
+<tr>
+<th colspan="1" rowspan="1">Name</th><th colspan="1" rowspan="1">Type</th><th colspan="1" rowspan="1">Description</th>
+</tr>
+            
+<tr>
+<td colspan="1" rowspan="1">io.sort.mb</td><td colspan="1" rowspan="1">int</td>
+                <td colspan="1" rowspan="1">The cumulative size of the serialization and accounting
+                buffers storing records emitted from the map, in megabytes.
+                </td>
+</tr>
+            
+<tr>
+<td colspan="1" rowspan="1">io.sort.record.percent</td><td colspan="1" rowspan="1">float</td>
+                <td colspan="1" rowspan="1">The ratio of serialization to accounting space can be
+                adjusted. Each serialized record requires 16 bytes of
+                accounting information in addition to its serialized size to
+                effect the sort. This percentage of space allocated from
+                <span class="codefrag">io.sort.mb</span> affects the probability of a spill to
+                disk being caused by either exhaustion of the serialization
+                buffer or the accounting space. Clearly, for a map outputting
+                small records, a higher value than the default will likely
+                decrease the number of spills to disk.</td>
+</tr>
+            
+<tr>
+<td colspan="1" rowspan="1">io.sort.spill.percent</td><td colspan="1" rowspan="1">float</td>
+                <td colspan="1" rowspan="1">This is the threshold for the accounting and serialization
+                buffers. When this percentage of either buffer has filled,
+                their contents will be spilled to disk in the background. Let
+                <span class="codefrag">io.sort.record.percent</span> be <em>r</em>,
+                <span class="codefrag">io.sort.mb</span> be <em>x</em>, and this value be
+                <em>q</em>. The maximum number of records collected before the
+                collection thread will spill is <span class="codefrag">r * x * q * 2^16</span>.
+                Note that a higher value may decrease the number of- or even
+                eliminate- merges, but will also increase the probability of
+                the map task getting blocked. The lowest average map times are
+                usually obtained by accurately estimating the size of the map
+                output and preventing multiple spills.</td>
+</tr>
+          
+</table>
+<p>Other notes</p>
+<ul>
+            
+<li>If either spill threshold is exceeded while a spill is in
+            progress, collection will continue until the spill is finished.
+            For example, if <span class="codefrag">io.sort.spill.percent</span> is set
+            to 0.33, and the remainder of the buffer is filled while the spill
+            runs, the next spill will include all the collected records, or
+            0.66 of the buffer, and will not generate additional spills. In
+            other words, the thresholds are defining triggers, not
+            blocking.</li>
+            
+<li>A record larger than the serialization buffer will first
+            trigger a spill, then be spilled to a separate file. It is
+            undefined whether or not this record will first pass through the
+            combiner.</li>
+          
+</ul>
+<a name="N10955"></a><a name="Shuffle%2FReduce+Parameters"></a>
+<h4>Shuffle/Reduce Parameters</h4>
+<p>As described previously, each reduce fetches the output assigned
+          to it by the Partitioner via HTTP into memory and periodically
+          merges these outputs to disk. If intermediate compression of map
+          outputs is turned on, each output is decompressed into memory. The
+          following options affect the frequency of these merges to disk prior
+          to the reduce and the memory allocated to map output during the
+          reduce.</p>
+<table class="ForrestTable" cellspacing="1" cellpadding="4">
+            
+<tr>
+<th colspan="1" rowspan="1">Name</th><th colspan="1" rowspan="1">Type</th><th colspan="1" rowspan="1">Description</th>
+</tr>
+            
+<tr>
+<td colspan="1" rowspan="1">io.sort.factor</td><td colspan="1" rowspan="1">int</td>
+                <td colspan="1" rowspan="1">Specifies the number of segments on disk to be merged at
+                the same time. It limits the number of open files and
+                compression codecs during the merge. If the number of files
+                exceeds this limit, the merge will proceed in several passes.
+                Though this limit also applies to the map, most jobs should be
+                configured so that hitting this limit is unlikely
+                there.</td>
+</tr>
+            
+<tr>
+<td colspan="1" rowspan="1">mapred.inmem.merge.threshold</td><td colspan="1" rowspan="1">int</td>
+                <td colspan="1" rowspan="1">The number of sorted map outputs fetched into memory
+                before being merged to disk. Like the spill thresholds in the
+                preceding note, this is not defining a unit of partition, but
+                a trigger. In practice, this is usually set very high (1000)
+                or disabled (0), since merging in-memory segments is often
+                less expensive than merging from disk (see notes following
+                this table). This threshold influences only the frequency of
+                in-memory merges during the shuffle.</td>
+</tr>
+            
+<tr>
+<td colspan="1" rowspan="1">mapred.job.shuffle.merge.percent</td><td colspan="1" rowspan="1">float</td>
+                <td colspan="1" rowspan="1">The memory threshold for fetched map outputs before an
+                in-memory merge is started, expressed as a percentage of
+                memory allocated to storing map outputs in memory. Since map
+                outputs that can't fit in memory can be stalled, setting this
+                high may decrease parallelism between the fetch and merge.
+                Conversely, values as high as 1.0 have been effective for
+                reduces whose input can fit entirely in memory. This parameter
+                influences only the frequency of in-memory merges during the
+                shuffle.</td>
+</tr>
+            
+<tr>
+<td colspan="1" rowspan="1">mapred.job.shuffle.input.buffer.percent</td><td colspan="1" rowspan="1">float</td>
+                <td colspan="1" rowspan="1">The percentage of memory- relative to the maximum heapsize
+                as typically specified in <span class="codefrag">mapred.child.java.opts</span>-
+                that can be allocated to storing map outputs during the
+                shuffle. Though some memory should be set aside for the
+                framework, in general it is advantageous to set this high
+                enough to store large and numerous map outputs.</td>
+</tr>
+            
+<tr>
+<td colspan="1" rowspan="1">mapred.job.reduce.input.buffer.percent</td><td colspan="1" rowspan="1">float</td>
+                <td colspan="1" rowspan="1">The percentage of memory relative to the maximum heapsize
+                in which map outputs may be retained during the reduce. When
+                the reduce begins, map outputs will be merged to disk until
+                those that remain are under the resource limit this defines.
+                By default, all map outputs are merged to disk before the
+                reduce begins to maximize the memory available to the reduce.
+                For less memory-intensive reduces, this should be increased to
+                avoid trips to disk.</td>
+</tr>
+          
+</table>
+<p>Other notes</p>
+<ul>
+            
+<li>If a map output is larger than 25 percent of the memory
+            allocated to copying map outputs, it will be written directly to
+            disk without first staging through memory.</li>
+            
+<li>When running with a combiner, the reasoning about high merge
+            thresholds and large buffers may not hold. For merges started
+            before all map outputs have been fetched, the combiner is run
+            while spilling to disk. In some cases, one can obtain better
+            reduce times by spending resources combining map outputs- making
+            disk spills small and parallelizing spilling and fetching- rather
+            than aggressively increasing buffer sizes.</li>
+            
+<li>When merging in-memory map outputs to disk to begin the
+            reduce, if an intermediate merge is necessary because there are
+            segments to spill and at least <span class="codefrag">io.sort.factor</span>
+            segments already on disk, the in-memory map outputs will be part
+            of the intermediate merge.</li>
+          
+</ul>
 <p>The task tracker has local directory,
         <span class="codefrag"> ${mapred.local.dir}/taskTracker/</span> to create localized
         cache and localized job. It can define multiple local directories 
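
The io.sort.* descriptions in the new Map Parameters table compose into a simple spill-trigger calculation, as this small Python sketch works through; the buffer size, metadata share, and threshold values here are assumptions for illustration, not defaults asserted by the tutorial.

    # Worked example of the map-side spill trigger described above.
    # Parameter values are illustrative assumptions.
    io_sort_mb = 100        # x: io.sort.mb, collection buffer in megabytes
    record_percent = 0.05   # r: io.sort.record.percent, metadata share
    spill_percent = 0.80    # q: io.sort.spill.percent, spill threshold

    # Each record costs 16 bytes of accounting metadata, so the metadata
    # region holds (r * x MB) / 16 B = r * x * 2**16 record slots.
    records_before_spill = round(record_percent * io_sort_mb * 2 ** 16 * spill_percent)
    # The serialization buffer spills at q of its (1 - r) * x MB capacity.
    bytes_before_spill = round((1 - record_percent) * io_sort_mb * 2 ** 20 * spill_percent)

    print(records_before_spill)  # 262144 records, matching r * x * q * 2^16
    print(bytes_before_spill)    # 79691776 bytes (76 MB) of serialized output
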
@@ -1786,7 +1971,7 @@ document.write("Last Published: " + document.lastModified);
         <a href="native_libraries.html#Loading+native+libraries+through+DistributedCache">
         native_libraries.html</a>
 </p>
-<a name="N10A23"></a><a name="Job+Submission+and+Monitoring"></a>
+<a name="N10B0D"></a><a name="Job+Submission+and+Monitoring"></a>
 <h3 class="h4">Job Submission and Monitoring</h3>
 <p>
 <a href="api/org/apache/hadoop/mapred/JobClient.html">
@@ -1847,7 +2032,7 @@ document.write("Last Published: " + document.lastModified);
 <p>Normally the user creates the application, describes various facets 
         of the job via <span class="codefrag">JobConf</span>, and then uses the 
         <span class="codefrag">JobClient</span> to submit the job and monitor its progress.</p>
-<a name="N10A83"></a><a name="Job+Control"></a>
+<a name="N10B6D"></a><a name="Job+Control"></a>
 <h4>Job Control</h4>
 <p>Users may need to chain Map/Reduce jobs to accomplish complex
           tasks which cannot be done via a single Map/Reduce job. This is fairly
@@ -1883,7 +2068,7 @@ document.write("Last Published: " + document.lastModified);
             </li>
           
 </ul>
-<a name="N10AAD"></a><a name="Job+Input"></a>
+<a name="N10B97"></a><a name="Job+Input"></a>
 <h3 class="h4">Job Input</h3>
 <p>
 <a href="api/org/apache/hadoop/mapred/InputFormat.html">
@@ -1931,7 +2116,7 @@ document.write("Last Published: " + document.lastModified);
         appropriate <span class="codefrag">CompressionCodec</span>. However, it must be noted that
         compressed files with the above extensions cannot be <em>split</em> and 
         each compressed file is processed in its entirety by a single mapper.</p>
-<a name="N10B17"></a><a name="InputSplit"></a>
+<a name="N10C01"></a><a name="InputSplit"></a>
 <h4>InputSplit</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/InputSplit.html">
@@ -1945,7 +2130,7 @@ document.write("Last Published: " + document.lastModified);
           FileSplit</a> is the default <span class="codefrag">InputSplit</span>. It sets 
           <span class="codefrag">map.input.file</span> to the path of the input file for the
           logical split.</p>
-<a name="N10B3C"></a><a name="RecordReader"></a>
+<a name="N10C26"></a><a name="RecordReader"></a>
 <h4>RecordReader</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/RecordReader.html">
@@ -1957,7 +2142,7 @@ document.write("Last Published: " + document.lastModified);
           for processing. <span class="codefrag">RecordReader</span> thus assumes the 
           responsibility of processing record boundaries and presents the tasks 
           with keys and values.</p>
-<a name="N10B5F"></a><a name="Job+Output"></a>
+<a name="N10C49"></a><a name="Job+Output"></a>
 <h3 class="h4">Job Output</h3>
 <p>
 <a href="api/org/apache/hadoop/mapred/OutputFormat.html">
@@ -1982,7 +2167,7 @@ document.write("Last Published: " + document.lastModified);
 <p>
 <span class="codefrag">TextOutputFormat</span> is the default 
         <span class="codefrag">OutputFormat</span>.</p>
-<a name="N10B88"></a><a name="OutputCommitter"></a>
+<a name="N10C72"></a><a name="OutputCommitter"></a>
 <h4>OutputCommitter</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/OutputCommitter.html">
@@ -2026,7 +2211,7 @@ document.write("Last Published: " + document.lastModified);
 <p>
 <span class="codefrag">FileOutputCommitter</span> is the default 
         <span class="codefrag">OutputCommitter</span>.</p>
-<a name="N10BB8"></a><a name="Task+Side-Effect+Files"></a>
+<a name="N10CA2"></a><a name="Task+Side-Effect+Files"></a>
 <h4>Task Side-Effect Files</h4>
 <p>In some applications, component tasks need to create and/or write to
           side-files, which differ from the actual job-output files.</p>
@@ -2067,7 +2252,7 @@ document.write("Last Published: " + document.lastModified);
 <p>The entire discussion holds true for maps of jobs with 
            reducer=NONE (i.e. 0 reduces) since output of the map, in that case, 
            goes directly to HDFS.</p>
-<a name="N10C06"></a><a name="RecordWriter"></a>
+<a name="N10CF0"></a><a name="RecordWriter"></a>
 <h4>RecordWriter</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/RecordWriter.html">
@@ -2075,9 +2260,9 @@ document.write("Last Published: " + document.lastModified);
           pairs to an output file.</p>
 <p>RecordWriter implementations write the job outputs to the 
           <span class="codefrag">FileSystem</span>.</p>
-<a name="N10C1D"></a><a name="Other+Useful+Features"></a>
+<a name="N10D07"></a><a name="Other+Useful+Features"></a>
 <h3 class="h4">Other Useful Features</h3>
-<a name="N10C23"></a><a name="Counters"></a>
+<a name="N10D0D"></a><a name="Counters"></a>
 <h4>Counters</h4>
 <p>
 <span class="codefrag">Counters</span> represent global counters, defined either by 
@@ -2094,7 +2279,7 @@ document.write("Last Published: " + document.lastModified);
           in the <span class="codefrag">map</span> and/or 
           <span class="codefrag">reduce</span> methods. These counters are then globally 
           aggregated by the framework.</p>
-<a name="N10C52"></a><a name="DistributedCache"></a>
+<a name="N10D3C"></a><a name="DistributedCache"></a>
 <h4>DistributedCache</h4>
 <p>
 <a href="api/org/apache/hadoop/filecache/DistributedCache.html">
@@ -2165,7 +2350,7 @@ document.write("Last Published: " + document.lastModified);
           <span class="codefrag">mapred.job.classpath.{files|archives}</span>. Similarly the
           cached files that are symlinked into the working directory of the
           task can be used to distribute native libraries and load them.</p>
-<a name="N10CD5"></a><a name="Tool"></a>
+<a name="N10DBF"></a><a name="Tool"></a>
 <h4>Tool</h4>
 <p>The <a href="api/org/apache/hadoop/util/Tool.html">Tool</a> 
           interface supports the handling of generic Hadoop command-line options.
@@ -2205,7 +2390,7 @@ document.write("Last Published: " + document.lastModified);
             </span>
           
 </p>
-<a name="N10D07"></a><a name="IsolationRunner"></a>
+<a name="N10DF1"></a><a name="IsolationRunner"></a>
 <h4>IsolationRunner</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/IsolationRunner.html">
@@ -2229,7 +2414,7 @@ document.write("Last Published: " + document.lastModified);
 <p>
 <span class="codefrag">IsolationRunner</span> will run the failed task in a single 
           jvm, which can be in the debugger, over precisely the same input.</p>
-<a name="N10D3A"></a><a name="Profiling"></a>
+<a name="N10E24"></a><a name="Profiling"></a>
 <h4>Profiling</h4>
 <p>Profiling is a utility to get a representative (2 or 3) sample
           of built-in java profiler for a sample of maps and reduces. </p>
@@ -2262,7 +2447,7 @@ document.write("Last Published: " + document.lastModified);
           <span class="codefrag">-agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=%s</span>
           
 </p>
-<a name="N10D6E"></a><a name="Debugging"></a>
+<a name="N10E58"></a><a name="Debugging"></a>
 <h4>Debugging</h4>
 <p>Map/Reduce framework provides a facility to run user-provided 
           scripts for debugging. When map/reduce task fails, user can run 
@@ -2273,14 +2458,14 @@ document.write("Last Published: " + document.lastModified);
 <p> In the following sections we discuss how to submit a debug script
           along with the job. To submit a debug script, it first has to be
           distributed. Then the script has to be supplied in the Configuration. </p>
-<a name="N10D7A"></a><a name="How+to+distribute+script+file%3A"></a>
+<a name="N10E64"></a><a name="How+to+distribute+script+file%3A"></a>
 <h5> How to distribute script file: </h5>
 <p>
           The user has to use 
           <a href="mapred_tutorial.html#DistributedCache">DistributedCache</a>
           mechanism to <em>distribute</em> and <em>symlink</em> the
           debug script file.</p>
-<a name="N10D8E"></a><a name="How+to+submit+script%3A"></a>
+<a name="N10E78"></a><a name="How+to+submit+script%3A"></a>
 <h5> How to submit script: </h5>
 <p> A quick way to submit debug script is to set values for the 
           properties "mapred.map.task.debug.script" and 
@@ -2304,17 +2489,17 @@ document.write("Last Published: " + document.lastModified);
 <span class="codefrag">$script $stdout $stderr $syslog $jobconf $program </span>  
           
 </p>
-<a name="N10DB0"></a><a name="Default+Behavior%3A"></a>
+<a name="N10E9A"></a><a name="Default+Behavior%3A"></a>
 <h5> Default Behavior: </h5>
 <p> For pipes, a default script is run to process core dumps under
           gdb, prints stack trace and gives info about running threads. </p>
-<a name="N10DBB"></a><a name="JobControl"></a>
+<a name="N10EA5"></a><a name="JobControl"></a>
 <h4>JobControl</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/jobcontrol/package-summary.html">
           JobControl</a> is a utility which encapsulates a set of Map/Reduce jobs
           and their dependencies.</p>
-<a name="N10DC8"></a><a name="Data+Compression"></a>
+<a name="N10EB2"></a><a name="Data+Compression"></a>
 <h4>Data Compression</h4>
 <p>Hadoop Map/Reduce provides facilities for the application-writer to
           specify compression for both intermediate map-outputs and the
@@ -2328,7 +2513,7 @@ document.write("Last Published: " + document.lastModified);
           codecs for reasons of both performance (zlib) and non-availability of
           Java libraries (lzo). More details on their usage and availability are
           available <a href="native_libraries.html">here</a>.</p>
-<a name="N10DE8"></a><a name="Intermediate+Outputs"></a>
+<a name="N10ED2"></a><a name="Intermediate+Outputs"></a>
 <h5>Intermediate Outputs</h5>
 <p>Applications can control compression of intermediate map-outputs
             via the 
@@ -2337,7 +2522,7 @@ document.write("Last Published: " + document.lastModified);
             <span class="codefrag">CompressionCodec</span> to be used via the
             <a href="api/org/apache/hadoop/mapred/JobConf.html#setMapOutputCompressorClass(java.lang.Class)">
             JobConf.setMapOutputCompressorClass(Class)</a> api.</p>
-<a name="N10DFD"></a><a name="Job+Outputs"></a>
+<a name="N10EE7"></a><a name="Job+Outputs"></a>
 <h5>Job Outputs</h5>
 <p>Applications can control compression of job-outputs via the
             <a href="api/org/apache/hadoop/mapred/FileOutputFormat.html#setCompressOutput(org.apache.hadoop.mapred.JobConf,%20boolean)">
@@ -2357,7 +2542,7 @@ document.write("Last Published: " + document.lastModified);
 </div>
 
     
-<a name="N10E2C"></a><a name="Example%3A+WordCount+v2.0"></a>
+<a name="N10F16"></a><a name="Example%3A+WordCount+v2.0"></a>
 <h2 class="h3">Example: WordCount v2.0</h2>
 <div class="section">
 <p>Here is a more complete <span class="codefrag">WordCount</span> which uses many of the
@@ -2367,7 +2552,7 @@ document.write("Last Published: " + document.lastModified);
       <a href="quickstart.html#SingleNodeSetup">pseudo-distributed</a> or
       <a href="quickstart.html#Fully-Distributed+Operation">fully-distributed</a> 
       Hadoop installation.</p>
-<a name="N10E46"></a><a name="Source+Code-N10E46"></a>
+<a name="N10F30"></a><a name="Source+Code-N10F30"></a>
 <h3 class="h4">Source Code</h3>
 <table class="ForrestTable" cellspacing="1" cellpadding="4">
           
@@ -3577,7 +3762,7 @@ document.write("Last Published: " + document.lastModified);
 </tr>
         
 </table>
-<a name="N115A8"></a><a name="Sample+Runs"></a>
+<a name="N11692"></a><a name="Sample+Runs"></a>
 <h3 class="h4">Sample Runs</h3>
 <p>Sample text-files as input:</p>
 <p>
@@ -3745,7 +3930,7 @@ document.write("Last Published: " + document.lastModified);
 <br>
         
 </p>
-<a name="N1167C"></a><a name="Highlights"></a>
+<a name="N11766"></a><a name="Highlights"></a>
 <h3 class="h4">Highlights</h3>
 <p>The second version of <span class="codefrag">WordCount</span> improves upon the 
         previous one by using some features offered by the Map/Reduce framework:

File diff suppressed because it is too large
+ 3 - 3
docs/mapred_tutorial.pdf


+ 5 - 0
src/contrib/hod/CHANGES.txt

@@ -22,6 +22,11 @@ Release 0.18.1 - Unreleased
     HADOOP-4060. Modified HOD to rotate log files on the client side.
     (Vinod Kumar Vavilapalli via yhemanth)
 
+  IMPROVEMENTS
+
+    HADOOP-4145. Add an accounting plugin (script) for HOD.
+    (Hemanth Yamijala via nigel)
+
   BUG FIXES
 
     HADOOP-4161. Fixed bug in HOD cleanup that had the potential to

+ 36 - 1
src/contrib/hod/hodlib/Hod/hadoop.py

@@ -451,8 +451,43 @@ class hadoopCluster:
       raise Exception("Invalid state: Node pool is not initialized to delete the given job.")
     return ret
          
+  def is_valid_account(self):
+    """Verify if the account being used to submit the job is a valid account.
+       This code looks for a file <install-dir>/bin/verify-account. 
+       If the file is present, it executes the file, passing as argument 
+       the account name. It returns a tuple of the exit code and, when the
+       exit code is non-zero, the output from the script."""
+
+    accountValidationScript = os.path.abspath('./verify-account')
+    if not os.path.exists(accountValidationScript):
+      return (0, None)
+
+    account = self.__nodePool.getAccountString()
+    exitCode = 0
+    errMsg = None
+    try:
+      accountValidationCmd = simpleCommand('Account Validation Command',\
+                                             '%s %s' % (accountValidationScript,
+                                                        account))
+      accountValidationCmd.start()
+      accountValidationCmd.wait()
+      accountValidationCmd.join()
+      exitCode = accountValidationCmd.exit_code()
+      self.__log.debug('account validation script exited with code %d' \
+                          % exitCode)
+      errMsg = None
+      if exitCode != 0:
+        errMsg = accountValidationCmd.output()
+    except Exception, e:
+      exitCode = 0
+      self.__log.warn('Error executing account script: %s ' \
+                         'Accounting is disabled.' \
+                          % get_exception_error_string())
+      self.__log.debug(get_exception_string())
+    return (exitCode, errMsg)
+    
   def allocate(self, clusterDir, min, max=None):
-    status = 0  
+    status = 0
     self.__svcrgyClient = self.__get_svcrgy_client()
         
     self.__log.debug("allocate %s %s %s" % (clusterDir, min, max))

+ 15 - 1
src/contrib/hod/hodlib/Hod/hod.py

@@ -252,7 +252,6 @@ class hodRunner:
     self.__cfg['ringmaster']['max-master-failures'] = \
                               min(maxFailures, maxFailedNodes)
 
-    
   def _op_allocate(self, args):
     operation = "allocate"
     argLength = len(args)
@@ -313,6 +312,21 @@ class hodRunner:
           return
  
       self.__setup_cluster_logger(clusterDir)
+
+      (status, message) = self.__cluster.is_valid_account()
+      if status != 0:
+        if message:
+          for line in message:
+            self.__log.critical("verify-account output: %s" % line)
+        self.__log.critical("Cluster cannot be allocated because account verification failed. " \
+                              + "verify-account returned exit code: %s." % status)
+        self.__opCode = 4
+        return
+      else:
+        self.__log.debug("verify-account returned zero exit code.")
+        if message:
+          self.__log.debug("verify-account output: %s" % message)
+
       if re.match('\d+-\d+', nodes):
         (min, max) = nodes.split("-")
         min = int(min)

+ 4 - 0
src/contrib/hod/hodlib/Hod/nodePool.py

@@ -116,6 +116,10 @@ class NodePool:
     """Update information about the workers started by this NodePool."""
     raise NotImplementedError
 
+  def getAccountString(self):
+    """Return the account string for this job"""
+    raise NotImplementedError
+
   def getNextNodeSetId(self):
     id = self.nextNodeSetId
     self.nextNodeSetId += 1

+ 6 - 0
src/contrib/hod/hodlib/NodePools/torque.py

@@ -51,6 +51,12 @@ class TorquePool(NodePool):
     self.__torque = torqueInterface(
       self._cfg['resource_manager']['batch-home'], environ, self._log)
 
+  def getAccountString(self):
+    account = ''
+    if self._cfg['resource_manager'].has_key('pbs-account'):
+      account = self._cfg['resource_manager']['pbs-account']
+    return account
+
   def __gen_submit_params(self, nodeSet, walltime = None, qosLevel = None, 
                           account = None):
     argList = []

+ 31 - 0
src/docs/src/documentation/content/xdocs/hod_admin_guide.xml

@@ -351,6 +351,37 @@ in the HOD Configuration Guide.</p>
         it is better not to run this inside a tight loop without sleeping.</p>
       </section>
     </section>
+
+    <section>
+      <title>verify-account - Script to verify an account under which 
+             jobs are submitted</title>
+      <p>Production systems use accounting packages to charge users for using
+      shared compute resources. HOD supports a parameter 
+      <em>resource_manager.pbs-account</em> to allow users to identify the
+      account under which they would like to submit jobs. It may be necessary
+      to verify that this account is a valid one configured in an accounting
+      system. The <em>hod-install-dir/bin/verify-account</em> script 
+      provides a mechanism to plug in a custom script that can do this
+      verification.</p>
+      
+      <section>
+        <title>Integrating the verify-account script with HOD</title>
+        <p>HOD runs the <em>verify-account</em> script passing in the
+        <em>resource_manager.pbs-account</em> value as argument to the script,
+        before allocating a cluster. Sites can write a script that verifies this 
+        account against their accounting systems. Returning a non-zero exit 
+        code from this script will cause HOD to fail allocation. Also, in
+        case of an error, HOD will print the output of the script to the user.
+        Any descriptive error message can be passed to the user from the
+        script in this manner.</p>
+        <p>The default script that comes with the HOD installation does not
+        do any validation, and returns a zero exit code.</p>
+        <p>If the verify-account script is not found, HOD treats
+        verification as disabled, and continues allocation as is.</p>
+      </section>
+    </section>
+
   </section>
+
 </body>
 </document>

+ 2 - 1
src/docs/src/documentation/content/xdocs/hod_user_guide.xml

@@ -412,7 +412,8 @@
         <td> 5 </td>
         <td> Job execution failure </td>
         <td> 1. Torque Job was deleted from outside. Execute the Torque <code>qstat</code> command to see if you have any jobs in the <code>R</code> (Running) state. If none exist, try re-executing HOD. <br />
-          2. Torque problems such as the server momentarily going down, or becoming unresponsive. Contact system administrator. </td>
+          2. Torque problems such as the server momentarily going down, or becoming unresponsive. Contact system administrator. <br/>
+          3. The system administrator might have configured account verification, and an invalid account is specified. Contact system administrator.</td>
       </tr>
       <tr>
         <td> 6 </td>

Some files were not shown because too many files changed in this diff