
Merge -r 705429:705430 from trunk to branch-0.18 to fix HADOOP-3786.

git-svn-id: https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.18@705436 13f79535-47bb-0310-9956-ffa450edef68
Arun Murthy 16 years ago
Parent
Commit
86d21ddfcc

+ 49 - 5
docs/changes.html

@@ -36,7 +36,7 @@
     function collapse() {
     function collapse() {
       for (var i = 0; i < document.getElementsByTagName("ul").length; i++) {
       for (var i = 0; i < document.getElementsByTagName("ul").length; i++) {
         var list = document.getElementsByTagName("ul")[i];
         var list = document.getElementsByTagName("ul")[i];
-        if (list.id != 'release_0.18.1_-_2008-09-17_' && list.id != 'release_0.18.0_-_2008-08-19_') {
+        if (list.id != 'release_0.18.2_-_unreleased_' && list.id != 'release_0.18.1_-_2008-09-17_') {
           list.style.display = "none";
           list.style.display = "none";
         }
         }
       }
       }
@@ -52,6 +52,35 @@
 <a href="http://hadoop.apache.org/core/"><img class="logoImage" alt="Hadoop" src="images/hadoop-logo.jpg" title="Scalable Computing Platform"></a>
 <a href="http://hadoop.apache.org/core/"><img class="logoImage" alt="Hadoop" src="images/hadoop-logo.jpg" title="Scalable Computing Platform"></a>
 <h1>Hadoop Change Log</h1>
 <h1>Hadoop Change Log</h1>
 
 
+<h2><a href="javascript:toggleList('release_0.18.2_-_unreleased_')">Release 0.18.2 - Unreleased
+</a></h2>
+<ul id="release_0.18.2_-_unreleased_">
+  <li><a href="javascript:toggleList('release_0.18.2_-_unreleased_._bug_fixes_')">  BUG FIXES
+</a>&nbsp;&nbsp;&nbsp;(9)
+    <ol id="release_0.18.2_-_unreleased_._bug_fixes_">
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-4116">HADOOP-4116</a>. Balancer should provide better resource management.<br />(hairong)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3614">HADOOP-3614</a>. Fix a bug where a Datanode may use an old GenerationStamp to get
+the meta file.<br />(szetszwo)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-4314">HADOOP-4314</a>. Simulated datanodes should not include blocks that are still
+being written in their block report.<br />(Raghu Angadi)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-4228">HADOOP-4228</a>. dfs datanode metrics, bytes_read and bytes_written, overflow
+due to incorrect type used.<br />(hairong)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-4395">HADOOP-4395</a>. The FSEditLog loading is incorrect for the case OP_SET_OWNER.<br />(szetszwo)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-4351">HADOOP-4351</a>. FSNamesystem.getBlockLocationsInternal throws
+ArrayIndexOutOfBoundsException.<br />(hairong)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-4407">HADOOP-4407</a>. <a href="http://issues.apache.org/jira/browse/HADOOP-4395">HADOOP-4395</a> should use a Java 1.5 API for 0.18.<br />(szetszwo)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-4403">HADOOP-4403</a>. Make TestLeaseRecovery and TestFileCreation more robust.<br />(szetszwo)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-4292">HADOOP-4292</a>. Do not support append() for LocalFileSystem.<br />(hairong)</li>
+    </ol>
+  </li>
+  <li><a href="javascript:toggleList('release_0.18.2_-_unreleased_._new_features_')">  NEW FEATURES
+</a>&nbsp;&nbsp;&nbsp;(1)
+    <ol id="release_0.18.2_-_unreleased_._new_features_">
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2421">HADOOP-2421</a>.  Add jdiff output to documentation, listing all API
+changes from the prior release.<br />(cutting)</li>
+    </ol>
+  </li>
+</ul>
 <h2><a href="javascript:toggleList('release_0.18.1_-_2008-09-17_')">Release 0.18.1 - 2008-09-17
 <h2><a href="javascript:toggleList('release_0.18.1_-_2008-09-17_')">Release 0.18.1 - 2008-09-17
 </a></h2>
 </a></h2>
 <ul id="release_0.18.1_-_2008-09-17_">
 <ul id="release_0.18.1_-_2008-09-17_">
@@ -78,8 +107,10 @@ outputs or when the final map outputs are being fetched without contention.<br /
     </ol>
     </ol>
   </li>
   </li>
 </ul>
 </ul>
-<h2><a href="javascript:toggleList('release_0.18.0_-_2008-08-19_')">Release 0.18.0 - 2008-08-19
-</a></h2>
+<h2><a href="javascript:toggleList('older')">Older Releases</a></h2>
+<ul id="older">
+<h3><a href="javascript:toggleList('release_0.18.0_-_2008-08-19_')">Release 0.18.0 - 2008-08-19
+</a></h3>
 <ul id="release_0.18.0_-_2008-08-19_">
 <ul id="release_0.18.0_-_2008-08-19_">
   <li><a href="javascript:toggleList('release_0.18.0_-_2008-08-19_._incompatible_changes_')">  INCOMPATIBLE CHANGES
   <li><a href="javascript:toggleList('release_0.18.0_-_2008-08-19_._incompatible_changes_')">  INCOMPATIBLE CHANGES
 </a>&nbsp;&nbsp;&nbsp;(23)
 </a>&nbsp;&nbsp;&nbsp;(23)
@@ -620,8 +651,21 @@ cdouglas)</li>
     </ol>
     </ol>
   </li>
   </li>
 </ul>
 </ul>
-<h2><a href="javascript:toggleList('older')">Older Releases</a></h2>
-<ul id="older">
+<h3><a href="javascript:toggleList('release_0.17.3_-_unreleased_')">Release 0.17.3 - Unreleased
+</a></h3>
+<ul id="release_0.17.3_-_unreleased_">
+  <li><a href="javascript:toggleList('release_0.17.3_-_unreleased_._bug_fixes_')">  BUG FIXES
+</a>&nbsp;&nbsp;&nbsp;(4)
+    <ol id="release_0.17.3_-_unreleased_._bug_fixes_">
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-4277">HADOOP-4277</a>. Checksum verification was mistakenly disabled for
+LocalFileSystem.<br />(Raghu Angadi)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-4271">HADOOP-4271</a>. Checksum input stream can sometimes return invalid
+data to the user.<br />(Ning Li via rangadi)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-4318">HADOOP-4318</a>. DistCp should use absolute paths for cleanup.<br />(szetszwo)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-4326">HADOOP-4326</a>. ChecksumFileSystem does not override create(...) correctly.<br />(szetszwo)</li>
+    </ol>
+  </li>
+</ul>
 <h3><a href="javascript:toggleList('release_0.17.2_-_2008-08-11_')">Release 0.17.2 - 2008-08-11
 <h3><a href="javascript:toggleList('release_0.17.2_-_2008-08-11_')">Release 0.17.2 - 2008-08-11
 </a></h3>
 </a></h3>
 <ul id="release_0.17.2_-_2008-08-11_">
 <ul id="release_0.17.2_-_2008-08-11_">

+ 3 - 0
docs/cluster_setup.html

@@ -153,6 +153,9 @@ document.write("Last Published: " + document.lastModified);
 <a href="api/index.html">API Docs</a>
 <a href="api/index.html">API Docs</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
+<a href="jdiff/changes.html">API Changes</a>
+</div>
+<div class="menuitem">
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">

+ 3 - 0
docs/commands_manual.html

@@ -153,6 +153,9 @@ document.write("Last Published: " + document.lastModified);
 <a href="api/index.html">API Docs</a>
 <a href="api/index.html">API Docs</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
+<a href="jdiff/changes.html">API Changes</a>
+</div>
+<div class="menuitem">
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">

+ 3 - 0
docs/distcp.html

@@ -153,6 +153,9 @@ document.write("Last Published: " + document.lastModified);
 <a href="api/index.html">API Docs</a>
 <a href="api/index.html">API Docs</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
+<a href="jdiff/changes.html">API Changes</a>
+</div>
+<div class="menuitem">
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">

+ 3 - 0
docs/hadoop_archives.html

@@ -153,6 +153,9 @@ document.write("Last Published: " + document.lastModified);
 <a href="api/index.html">API Docs</a>
 <a href="api/index.html">API Docs</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
+<a href="jdiff/changes.html">API Changes</a>
+</div>
+<div class="menuitem">
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">

+ 3 - 0
docs/hdfs_design.html

@@ -155,6 +155,9 @@ document.write("Last Published: " + document.lastModified);
 <a href="api/index.html">API Docs</a>
 <a href="api/index.html">API Docs</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
+<a href="jdiff/changes.html">API Changes</a>
+</div>
+<div class="menuitem">
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">

+ 3 - 0
docs/hdfs_permissions_guide.html

@@ -155,6 +155,9 @@ document.write("Last Published: " + document.lastModified);
 <a href="api/index.html">API Docs</a>
 <a href="api/index.html">API Docs</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
+<a href="jdiff/changes.html">API Changes</a>
+</div>
+<div class="menuitem">
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">

+ 3 - 0
docs/hdfs_quota_admin_guide.html

@@ -155,6 +155,9 @@ document.write("Last Published: " + document.lastModified);
 <a href="api/index.html">API Docs</a>
 <a href="api/index.html">API Docs</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
+<a href="jdiff/changes.html">API Changes</a>
+</div>
+<div class="menuitem">
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">

+ 3 - 0
docs/hdfs_shell.html

@@ -153,6 +153,9 @@ document.write("Last Published: " + document.lastModified);
 <a href="api/index.html">API Docs</a>
 <a href="api/index.html">API Docs</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
+<a href="jdiff/changes.html">API Changes</a>
+</div>
+<div class="menuitem">
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">

+ 3 - 0
docs/hdfs_user_guide.html

@@ -155,6 +155,9 @@ document.write("Last Published: " + document.lastModified);
 <a href="api/index.html">API Docs</a>
 <a href="api/index.html">API Docs</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
+<a href="jdiff/changes.html">API Changes</a>
+</div>
+<div class="menuitem">
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">

+ 3 - 0
docs/hod.html

@@ -155,6 +155,9 @@ document.write("Last Published: " + document.lastModified);
 <a href="api/index.html">API Docs</a>
 <a href="api/index.html">API Docs</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
+<a href="jdiff/changes.html">API Changes</a>
+</div>
+<div class="menuitem">
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">

+ 20 - 15
docs/hod_admin_guide.html

@@ -155,6 +155,9 @@ document.write("Last Published: " + document.lastModified);
 <a href="api/index.html">API Docs</a>
 <a href="api/index.html">API Docs</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
+<a href="jdiff/changes.html">API Changes</a>
+</div>
+<div class="menuitem">
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
@@ -343,7 +346,9 @@ Nodes : HOD requires a minimum of three nodes configured through a resource mana
 <p>The following components must be installed on ALL nodes before using HOD:</p>
 <p>The following components must be installed on ALL nodes before using HOD:</p>
 <ul>
 <ul>
  
  
-<li>Torque: Resource manager</li>
+<li>
+<a href="http://www.clusterresources.com/pages/products/torque-resource-manager.php">Torque: Resource manager</a>
+</li>
  
  
 <li>
 <li>
 <a href="http://www.python.org">Python</a> : HOD requires version 2.5.1 of Python.</li>
 <a href="http://www.python.org">Python</a> : HOD requires version 2.5.1 of Python.</li>
@@ -373,7 +378,7 @@ nodes.
 </div>
 </div>
 
 
 
 
-<a name="N1008D"></a><a name="Resource+Manager"></a>
+<a name="N1008F"></a><a name="Resource+Manager"></a>
 <h2 class="h3">Resource Manager</h2>
 <h2 class="h3">Resource Manager</h2>
 <div class="section">
 <div class="section">
 <p>  Currently HOD works with the Torque resource manager, which it uses for its node
 <p>  Currently HOD works with the Torque resource manager, which it uses for its node
@@ -418,7 +423,7 @@ nodes.
 </div>
 </div>
 
 
 
 
-<a name="N100CC"></a><a name="Installing+HOD"></a>
+<a name="N100CE"></a><a name="Installing+HOD"></a>
 <h2 class="h3">Installing HOD</h2>
 <h2 class="h3">Installing HOD</h2>
 <div class="section">
 <div class="section">
 <p>Once the resource manager is set up, you can obtain and
 <p>Once the resource manager is set up, you can obtain and
@@ -443,13 +448,13 @@ install HOD.</p>
 </div>
 </div>
 
 
 
 
-<a name="N100E5"></a><a name="Configuring+HOD"></a>
+<a name="N100E7"></a><a name="Configuring+HOD"></a>
 <h2 class="h3">Configuring HOD</h2>
 <h2 class="h3">Configuring HOD</h2>
 <div class="section">
 <div class="section">
 <p>You can configure HOD once it is installed. The minimal configuration needed
 <p>You can configure HOD once it is installed. The minimal configuration needed
 to run HOD is described below. More advanced configuration options are discussed
 to run HOD is described below. More advanced configuration options are discussed
 in the HOD Configuration Guide.</p>
 in the HOD Configuration Guide.</p>
-<a name="N100EE"></a><a name="Minimal+Configuration"></a>
+<a name="N100F0"></a><a name="Minimal+Configuration"></a>
 <h3 class="h4">Minimal Configuration</h3>
 <h3 class="h4">Minimal Configuration</h3>
 <p>To get started using HOD, the following minimal configuration is
 <p>To get started using HOD, the following minimal configuration is
   required:</p>
   required:</p>
@@ -509,7 +514,7 @@ in the HOD Configuration Guide.</p>
 </li>
 </li>
 
 
 </ul>
 </ul>
-<a name="N10122"></a><a name="Advanced+Configuration"></a>
+<a name="N10124"></a><a name="Advanced+Configuration"></a>
 <h3 class="h4">Advanced Configuration</h3>
 <h3 class="h4">Advanced Configuration</h3>
 <p> You can review and modify other configuration options to suit
 <p> You can review and modify other configuration options to suit
  your specific needs. Refer to the <a href="hod_config_guide.html">Configuration
  your specific needs. Refer to the <a href="hod_config_guide.html">Configuration
@@ -517,19 +522,19 @@ in the HOD Configuration Guide.</p>
 </div>
 </div>
 
 
   
   
-<a name="N10131"></a><a name="Running+HOD"></a>
+<a name="N10133"></a><a name="Running+HOD"></a>
 <h2 class="h3">Running HOD</h2>
 <h2 class="h3">Running HOD</h2>
 <div class="section">
 <div class="section">
 <p>You can run HOD once it is configured. Refer to <a href="hod_user_guide.html">the HOD User Guide</a> for more information.</p>
 <p>You can run HOD once it is configured. Refer to <a href="hod_user_guide.html">the HOD User Guide</a> for more information.</p>
 </div>
 </div>
 
 
   
   
-<a name="N1013F"></a><a name="Supporting+Tools+and+Utilities"></a>
+<a name="N10141"></a><a name="Supporting+Tools+and+Utilities"></a>
 <h2 class="h3">Supporting Tools and Utilities</h2>
 <h2 class="h3">Supporting Tools and Utilities</h2>
 <div class="section">
 <div class="section">
 <p>This section describes supporting tools and utilities that can be used to
 <p>This section describes supporting tools and utilities that can be used to
     manage HOD deployments.</p>
     manage HOD deployments.</p>
-<a name="N10148"></a><a name="logcondense.py+-+Manage+Log+Files"></a>
+<a name="N1014A"></a><a name="logcondense.py+-+Manage+Log+Files"></a>
 <h3 class="h4">logcondense.py - Manage Log Files</h3>
 <h3 class="h4">logcondense.py - Manage Log Files</h3>
 <p>As mentioned in the 
 <p>As mentioned in the 
          <a href="hod_user_guide.html#Collecting+and+Viewing+Hadoop+Logs">HOD User Guide</a>,
          <a href="hod_user_guide.html#Collecting+and+Viewing+Hadoop+Logs">HOD User Guide</a>,
@@ -537,7 +542,7 @@ in the HOD Configuration Guide.</p>
          Hadoop logs to a statically configured HDFS. Over time, the number of logs uploaded
          Hadoop logs to a statically configured HDFS. Over time, the number of logs uploaded
          to HDFS could increase. logcondense.py is a tool that helps
          to HDFS could increase. logcondense.py is a tool that helps
          administrators to remove log files uploaded to HDFS. </p>
          administrators to remove log files uploaded to HDFS. </p>
-<a name="N10155"></a><a name="Running+logcondense.py"></a>
+<a name="N10157"></a><a name="Running+logcondense.py"></a>
 <h4>Running logcondense.py</h4>
 <h4>Running logcondense.py</h4>
 <p>logcondense.py is available under hod_install_location/support folder. You can either
 <p>logcondense.py is available under hod_install_location/support folder. You can either
         run it using python, for example, <em>python logcondense.py</em>, or give execute permissions 
         run it using python, for example, <em>python logcondense.py</em>, or give execute permissions 
@@ -548,7 +553,7 @@ in the HOD Configuration Guide.</p>
         be configured to come under the user's home directory in HDFS. In that case, the user
         be configured to come under the user's home directory in HDFS. In that case, the user
         running logcondense.py should have super user privileges to remove the files from under
         running logcondense.py should have super user privileges to remove the files from under
         all user home directories.</p>
         all user home directories.</p>
-<a name="N10169"></a><a name="Command+Line+Options+for+logcondense.py"></a>
+<a name="N1016B"></a><a name="Command+Line+Options+for+logcondense.py"></a>
 <h4>Command Line Options for logcondense.py</h4>
 <h4>Command Line Options for logcondense.py</h4>
 <p>The following command line options are supported for logcondense.py.</p>
 <p>The following command line options are supported for logcondense.py.</p>
 <table class="ForrestTable" cellspacing="1" cellpadding="4">
 <table class="ForrestTable" cellspacing="1" cellpadding="4">
@@ -622,7 +627,7 @@ in the HOD Configuration Guide.</p>
 <p>
 <p>
 <em>python logcondense.py -p ~/hadoop-0.17.0/bin/hadoop -d 7 -c ~/hadoop-conf -l /user</em>
 <em>python logcondense.py -p ~/hadoop-0.17.0/bin/hadoop -d 7 -c ~/hadoop-conf -l /user</em>
 </p>
 </p>
-<a name="N1020C"></a><a name="checklimits.sh+-+Monitor+Resource+Limits"></a>
+<a name="N1020E"></a><a name="checklimits.sh+-+Monitor+Resource+Limits"></a>
 <h3 class="h4">checklimits.sh - Monitor Resource Limits</h3>
 <h3 class="h4">checklimits.sh - Monitor Resource Limits</h3>
 <p>checklimits.sh is a HOD tool specific to the Torque/Maui environment
 <p>checklimits.sh is a HOD tool specific to the Torque/Maui environment
       (<a href="http://www.clusterresources.com/pages/products/maui-cluster-scheduler.php">Maui Cluster Scheduler</a> is an open source job
       (<a href="http://www.clusterresources.com/pages/products/maui-cluster-scheduler.php">Maui Cluster Scheduler</a> is an open source job
@@ -638,7 +643,7 @@ in the HOD Configuration Guide.</p>
       Used:([0-9]*) MaxLimit:([0-9]*)</em> for those jobs that violate limits.
       Used:([0-9]*) MaxLimit:([0-9]*)</em> for those jobs that violate limits.
       This comment field is then used by HOD to behave accordingly depending on
       This comment field is then used by HOD to behave accordingly depending on
       the type of violation.</p>
       the type of violation.</p>
-<a name="N1021C"></a><a name="Running+checklimits.sh"></a>
+<a name="N1021E"></a><a name="Running+checklimits.sh"></a>
 <h4>Running checklimits.sh</h4>
 <h4>Running checklimits.sh</h4>
 <p>checklimits.sh is available under the hod_install_location/support
 <p>checklimits.sh is available under the hod_install_location/support
         folder. This shell script can be run directly as <em>sh
         folder. This shell script can be run directly as <em>sh
@@ -652,7 +657,7 @@ in the HOD Configuration Guide.</p>
         constraints, for example via cron. Please note that the resource manager
         constraints, for example via cron. Please note that the resource manager
         and scheduler commands used in this script can be expensive and so
         and scheduler commands used in this script can be expensive and so
         it is better not to run this inside a tight loop without sleeping.</p>
         it is better not to run this inside a tight loop without sleeping.</p>
-<a name="N1022D"></a><a name="verify-account+-+Script+to+verify+an+account+under+which+%0A+++++++++++++jobs+are+submitted"></a>
+<a name="N1022F"></a><a name="verify-account+-+Script+to+verify+an+account+under+which+%0A+++++++++++++jobs+are+submitted"></a>
 <h3 class="h4">verify-account - Script to verify an account under which 
 <h3 class="h4">verify-account - Script to verify an account under which 
              jobs are submitted</h3>
              jobs are submitted</h3>
 <p>Production systems use accounting packages to charge users for using
 <p>Production systems use accounting packages to charge users for using
@@ -663,7 +668,7 @@ in the HOD Configuration Guide.</p>
       system. The <em>hod-install-dir/bin/verify-account</em> script 
       system. The <em>hod-install-dir/bin/verify-account</em> script 
       provides a mechanism to plug-in a custom script that can do this
       provides a mechanism to plug-in a custom script that can do this
       verification.</p>
       verification.</p>
-<a name="N1023C"></a><a name="Integrating+the+verify-account+script+with+HOD"></a>
+<a name="N1023E"></a><a name="Integrating+the+verify-account+script+with+HOD"></a>
 <h4>Integrating the verify-account script with HOD</h4>
 <h4>Integrating the verify-account script with HOD</h4>
 <p>HOD runs the <em>verify-account</em> script passing in the
 <p>HOD runs the <em>verify-account</em> script passing in the
         <em>resource_manager.pbs-account</em> value as argument to the script,
         <em>resource_manager.pbs-account</em> value as argument to the script,

File diff suppressed because it is too large
+ 1 - 1
docs/hod_admin_guide.pdf
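
The checklimits.sh text in the admin guide diff above says that HOD reads a job comment field containing `Used:([0-9]*) MaxLimit:([0-9]*)`. A minimal Python sketch of extracting those two numbers follows. The function name `parse_limit_comment` is hypothetical, this is not HOD's own parsing code, and only the tail of the pattern that is visible in the diff context is used; whatever precedes `Used:` in the real comment is not assumed.

```python
import re

# Only the visible tail of the comment pattern is used here. Any text that
# precedes "Used:" in the real comment field is deliberately not assumed.
LIMIT_PATTERN = re.compile(r"Used:([0-9]*) MaxLimit:([0-9]*)")


def parse_limit_comment(comment):
    """Return (used, max_limit) as ints, or None if the comment has no usable match.

    Hypothetical helper for illustration; not part of HOD.
    """
    match = LIMIT_PATTERN.search(comment)
    if match is None:
        return None
    used, max_limit = match.groups()
    # [0-9]* can match an empty string, so guard before converting.
    if not used or not max_limit:
        return None
    return int(used), int(max_limit)


if __name__ == "__main__":
    print(parse_limit_comment("Used:12 MaxLimit:10"))
```

A caller could compare the two values to decide how to react to a violation; how HOD itself reacts is described only as "depending on the type of violation" in the guide.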


+ 29 - 5
docs/hod_config_guide.html

@@ -155,6 +155,9 @@ document.write("Last Published: " + document.lastModified);
 <a href="api/index.html">API Docs</a>
 <a href="api/index.html">API Docs</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
+<a href="jdiff/changes.html">API Changes</a>
+</div>
+<div class="menuitem">
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
@@ -290,7 +293,15 @@ document.write("Last Published: " + document.lastModified);
           
           
 <li>temp-dir: Temporary directory for usage by the HOD processes. Make 
 <li>temp-dir: Temporary directory for usage by the HOD processes. Make 
                       sure that the users who will run hod have rights to create 
                       sure that the users who will run hod have rights to create 
-                      directories under the directory specified here.</li>
+                      directories under the directory specified here. If you
+                      wish to make this directory vary across allocations,
+                      you can make use of the environment variables that will
+                      be made available by the resource manager to the HOD
+                      processes. For example, in a Torque setup, having
+                      --ringmaster.temp-dir=/tmp/hod-temp-dir.$PBS_JOBID would
+                      let ringmaster use a different temp-dir for each
+                      allocation; Torque expands this variable before starting
+                      the ringmaster.</li>
           
           
           
           
 <li>debug: Numeric value from 1-4. 4 produces the most log information,
 <li>debug: Numeric value from 1-4. 4 produces the most log information,
@@ -376,9 +387,22 @@ document.write("Last Published: " + document.lastModified);
                       variable 'HOD_PYTHON_HOME' to the path to the python 
                       variable 'HOD_PYTHON_HOME' to the path to the python 
                       executable. The HOD processes launched on the compute nodes
                       executable. The HOD processes launched on the compute nodes
                       can then use this variable.</li>
                       can then use this variable.</li>
+          
+<li>options: Comma-separated list of key-value pairs,
+                      expressed as
+                      &lt;option&gt;:&lt;sub-option&gt;=&lt;value&gt;. When
+                      passed to the job submission program, these are expanded
+                      as -&lt;option&gt; &lt;sub-option&gt;=&lt;value&gt;. These
+                      are generally used for specifying additional resource
+                      constraints for scheduling. For instance, with a Torque
+                      setup, one can specify
+                      --resource_manager.options='l:arch=x86_64' for
+                      constraining the nodes being allocated to a particular
+                      architecture; this option will be passed to Torque's qsub
+                      command as "-l arch=x86_64".</li>
         
         
 </ul>
 </ul>
-<a name="N10095"></a><a name="3.4+ringmaster+options"></a>
+<a name="N10098"></a><a name="3.4+ringmaster+options"></a>
 <h3 class="h4">3.4 ringmaster options</h3>
 <h3 class="h4">3.4 ringmaster options</h3>
 <ul>
 <ul>
           
           
@@ -408,7 +432,7 @@ document.write("Last Published: " + document.lastModified);
                        </li>
                        </li>
         
         
 </ul>
 </ul>
-<a name="N100A5"></a><a name="3.5+gridservice-hdfs+options"></a>
+<a name="N100A8"></a><a name="3.5+gridservice-hdfs+options"></a>
 <h3 class="h4">3.5 gridservice-hdfs options</h3>
 <h3 class="h4">3.5 gridservice-hdfs options</h3>
 <ul>
 <ul>
           
           
@@ -449,7 +473,7 @@ document.write("Last Published: " + document.lastModified);
 <li>final-server-params: Same as above, except they will be marked final.</li>
 <li>final-server-params: Same as above, except they will be marked final.</li>
         
         
 </ul>
 </ul>
-<a name="N100C4"></a><a name="3.6+gridservice-mapred+options"></a>
+<a name="N100C7"></a><a name="3.6+gridservice-mapred+options"></a>
 <h3 class="h4">3.6 gridservice-mapred options</h3>
 <h3 class="h4">3.6 gridservice-mapred options</h3>
 <ul>
 <ul>
           
           
@@ -482,7 +506,7 @@ document.write("Last Published: " + document.lastModified);
 <li>final-server-params: Same as above, except they will be marked final.</li>
 <li>final-server-params: Same as above, except they will be marked final.</li>
         
         
 </ul>
 </ul>
-<a name="N100E3"></a><a name="3.7+hodring+options"></a>
+<a name="N100E6"></a><a name="3.7+hodring+options"></a>
 <h3 class="h4">3.7 hodring options</h3>
 <h3 class="h4">3.7 hodring options</h3>
 <ul>
 <ul>
           
           

File diff suppressed because it is too large
+ 3 - 3
docs/hod_config_guide.pdf
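
The `options` entry added to the configuration guide above describes a comma-separated list of `<option>:<sub-option>=<value>` pairs that are expanded to `-<option> <sub-option>=<value>` for the job submission program. The following Python sketch shows one way that expansion could work. The function name `expand_rm_options` is hypothetical and this is not HOD's actual implementation; it also assumes that values never contain commas, which the guide does not state.

```python
def expand_rm_options(options):
    """Expand 'opt:sub=value,opt2:sub2=value2' into a flat argument list.

    Hypothetical helper for illustration; not part of HOD.
    Assumes values contain no commas.
    """
    args = []
    for pair in options.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate stray or trailing commas
        # Split on the first colon only, so the remainder is kept intact.
        option, sep, rest = pair.partition(":")
        if not sep or not option or not rest:
            raise ValueError("expected <option>:<sub-option>=<value>, got %r" % pair)
        args.extend(["-" + option, rest])
    return args


if __name__ == "__main__":
    # The example from the guide: l:arch=x86_64 becomes "-l arch=x86_64".
    print(" ".join(expand_rm_options("l:arch=x86_64")))
```

Returning a list rather than a single string keeps each argument separate, which avoids quoting problems if the list is later handed to a process-spawning call.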


+ 47 - 16
docs/hod_user_guide.html

@@ -155,6 +155,9 @@ document.write("Last Published: " + document.lastModified);
 <a href="api/index.html">API Docs</a>
 <a href="api/index.html">API Docs</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
+<a href="jdiff/changes.html">API Changes</a>
+</div>
+<div class="menuitem">
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
@@ -256,7 +259,11 @@ document.write("Last Published: " + document.lastModified);
 <a href="#Hangs+During+Deallocation">hod Hangs During Deallocation </a>
 <a href="#Hangs+During+Deallocation">hod Hangs During Deallocation </a>
 </li>
 </li>
 <li>
 <li>
-<a href="#Fails+With+an+error+code+and+error+message">hod Fails With an error code and error message </a>
+<a href="#Fails+With+an+Error+Code+and+Error+Message">hod Fails With an Error Code and Error Message </a>
+</li>
+<li>
+<a href="#Hadoop+DFSClient+Warns+with+a%0A++NotReplicatedYetException">Hadoop DFSClient Warns with a
+  NotReplicatedYetException</a>
 </li>
 </li>
 <li>
 <li>
 <a href="#Hadoop+Jobs+Not+Running+on+a+Successfully+Allocated+Cluster"> Hadoop Jobs Not Running on a Successfully Allocated Cluster </a>
 <a href="#Hadoop+Jobs+Not+Running+on+a+Successfully+Allocated+Cluster"> Hadoop Jobs Not Running on a Successfully Allocated Cluster </a>
@@ -271,7 +278,7 @@ document.write("Last Published: " + document.lastModified);
 <a href="#The+Exit+Codes+For+HOD+Are+Not+Getting+Into+Torque"> The Exit Codes For HOD Are Not Getting Into Torque </a>
 <a href="#The+Exit+Codes+For+HOD+Are+Not+Getting+Into+Torque"> The Exit Codes For HOD Are Not Getting Into Torque </a>
 </li>
 </li>
 <li>
 <li>
-<a href="#The+Hadoop+Logs+are+Not+Uploaded+to+DFS"> The Hadoop Logs are Not Uploaded to DFS </a>
+<a href="#The+Hadoop+Logs+are+Not+Uploaded+to+HDFS"> The Hadoop Logs are Not Uploaded to HDFS </a>
 </li>
 </li>
 <li>
 <li>
 <a href="#Locating+Ringmaster+Logs"> Locating Ringmaster Logs </a>
 <a href="#Locating+Ringmaster+Logs"> Locating Ringmaster Logs </a>
@@ -742,7 +749,7 @@ document.write("Last Published: " + document.lastModified);
 <tr>
 <tr>
         
         
 <td colspan="1" rowspan="1"> 7 </td>
 <td colspan="1" rowspan="1"> 7 </td>
-        <td colspan="1" rowspan="1"> DFS failure </td>
+        <td colspan="1" rowspan="1"> HDFS failure </td>
       
       
 </tr>
 </tr>
       
       
@@ -960,8 +967,8 @@ document.write("Last Published: " + document.lastModified);
 <a name="_hod_Hangs_During_Deallocation" id="_hod_Hangs_During_Deallocation"></a><a name="hod_Hangs_During_Deallocation" id="hod_Hangs_During_Deallocation"></a>
 <a name="_hod_Hangs_During_Deallocation" id="_hod_Hangs_During_Deallocation"></a><a name="hod_Hangs_During_Deallocation" id="hod_Hangs_During_Deallocation"></a>
 <p>
 <p>
 <em>Possible Cause:</em> A Torque related problem, usually load on the Torque server, or the allocation is very large. Generally, waiting for the command to complete is the only option.</p>
 <em>Possible Cause:</em> A Torque related problem, usually load on the Torque server, or the allocation is very large. Generally, waiting for the command to complete is the only option.</p>
-<a name="N105C2"></a><a name="Fails+With+an+error+code+and+error+message"></a>
-<h3 class="h4">hod Fails With an error code and error message </h3>
+<a name="N105C2"></a><a name="Fails+With+an+Error+Code+and+Error+Message"></a>
+<h3 class="h4">hod Fails With an Error Code and Error Message </h3>
 <a name="hod_Fails_With_an_error_code_and" id="hod_Fails_With_an_error_code_and"></a><a name="_hod_Fails_With_an_error_code_an" id="_hod_Fails_With_an_error_code_an"></a>
 <a name="hod_Fails_With_an_error_code_and" id="hod_Fails_With_an_error_code_and"></a><a name="_hod_Fails_With_an_error_code_an" id="_hod_Fails_With_an_error_code_an"></a>
 <p>If the exit code of the <span class="codefrag">hod</span> command is not <span class="codefrag">0</span>, then refer to the following table of error exit codes to determine why the code may have occurred and how to debug the situation.</p>
 <p>If the exit code of the <span class="codefrag">hod</span> command is not <span class="codefrag">0</span>, then refer to the following table of error exit codes to determine why the code may have occurred and how to debug the situation.</p>
 <p>
 <p>
@@ -1041,14 +1048,14 @@ document.write("Last Published: " + document.lastModified);
 <tr>
 <tr>
         
         
 <td colspan="1" rowspan="1"> 7 </td>
-        <td colspan="1" rowspan="1"> DFS failure </td>
-        <td colspan="1" rowspan="1"> When HOD fails to allocate due to DFS failures (or Job tracker failures, error code 8, see below), it prints a failure message "Hodring at &lt;hostname&gt; failed with following errors:" and then gives the actual error message, which may indicate one of the following:<br>
+        <td colspan="1" rowspan="1"> HDFS failure </td>
+        <td colspan="1" rowspan="1"> When HOD fails to allocate due to HDFS failures (or Job tracker failures, error code 8, see below), it prints a failure message "Hodring at &lt;hostname&gt; failed with following errors:" and then gives the actual error message, which may indicate one of the following:<br>
           1. Problem in starting Hadoop clusters. Usually the actual cause in the error message will indicate the problem on the hostname mentioned. Also, review the Hadoop related configuration in the HOD configuration files. Look at the Hadoop logs using information specified in <em>Collecting and Viewing Hadoop Logs</em> section above. <br>
           2. Invalid configuration on the node running the hodring, specified by the hostname in the error message <br>
           3. Invalid configuration in the <span class="codefrag">hodring</span> section of hodrc. <span class="codefrag">ssh</span> to the hostname specified in the error message and grep for <span class="codefrag">ERROR</span> or <span class="codefrag">CRITICAL</span> in hodring logs. Refer to the section <em>Locating Hodring Logs</em> below for more information. <br>
           4. Invalid tarball specified which is not packaged correctly. <br>
           5. Cannot communicate with an externally configured HDFS.<br>
-          When such DFS or Job tracker failure occurs, one can login into the host with hostname mentioned in HOD failure message and debug the problem. While fixing the problem, one should also review other log messages in the ringmaster log to see which other machines also might have had problems bringing up the jobtracker/namenode, apart from the hostname that is reported in the failure message. This possibility of other machines also having problems occurs because HOD continues to try and launch hadoop daemons on multiple machines one after another depending upon the value of the configuration variable <a href="hod_config_guide.html#3.4+ringmaster+options">ringmaster.max-master-failures</a>. Refer to the section <em>Locating Ringmaster Logs</em> below to find more about ringmaster logs.
+          When such HDFS or Job tracker failure occurs, one can login into the host with hostname mentioned in HOD failure message and debug the problem. While fixing the problem, one should also review other log messages in the ringmaster log to see which other machines also might have had problems bringing up the jobtracker/namenode, apart from the hostname that is reported in the failure message. This possibility of other machines also having problems occurs because HOD continues to try and launch hadoop daemons on multiple machines one after another depending upon the value of the configuration variable <a href="hod_config_guide.html#3.4+ringmaster+options">ringmaster.max-master-failures</a>. Refer to the section <em>Locating Ringmaster Logs</em> below to find more about ringmaster logs.
           </td>
           </td>
       
       
 </tr>
 </tr>
@@ -1117,7 +1124,31 @@ document.write("Last Published: " + document.lastModified);
 </tr>
 </tr>
   
   
 </table>
 </table>
-<a name="N10757"></a><a name="Hadoop+Jobs+Not+Running+on+a+Successfully+Allocated+Cluster"></a>
+<a name="N10757"></a><a name="Hadoop+DFSClient+Warns+with+a%0A++NotReplicatedYetException"></a>
+<h3 class="h4">Hadoop DFSClient Warns with a
+  NotReplicatedYetException</h3>
+<p>Sometimes, when you try to upload a file to the HDFS immediately after
+  allocating a HOD cluster, DFSClient warns with a NotReplicatedYetException. It
+  usually shows a message something like - </p>
+<table class="ForrestTable" cellspacing="1" cellpadding="4">
+<tr>
+<td colspan="1" rowspan="1"><span class="codefrag">WARN
+  hdfs.DFSClient: NotReplicatedYetException sleeping &lt;filename&gt; retries
+  left 3</span></td>
+</tr>
+<tr>
+<td colspan="1" rowspan="1"><span class="codefrag">08/01/25 16:31:40 INFO hdfs.DFSClient:
+  org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
+  &lt;filename&gt; could only be replicated to 0 nodes, instead of
+  1</span></td>
+</tr>
+</table>
+<p> This scenario arises when you try to upload a file
+  to the HDFS while the DataNodes are still in the process of contacting the
+  NameNode. This can be resolved by waiting for some time before uploading a new
+  file to the HDFS, so that enough DataNodes start and contact the
+  NameNode.</p>
+<a name="N1076F"></a><a name="Hadoop+Jobs+Not+Running+on+a+Successfully+Allocated+Cluster"></a>
 <h3 class="h4"> Hadoop Jobs Not Running on a Successfully Allocated Cluster </h3>
 <a name="Hadoop_Jobs_Not_Running_on_a_Suc" id="Hadoop_Jobs_Not_Running_on_a_Suc"></a>
 <p>This scenario generally occurs when a cluster is allocated, and is left inactive for sometime, and then hadoop jobs are attempted to be run on them. Then Hadoop jobs fail with the following exception:</p>
@@ -1136,31 +1167,31 @@ document.write("Last Published: " + document.lastModified);
 <em>Possible Cause:</em> There is a version mismatch between the version of the hadoop client being used to submit jobs and the hadoop used in provisioning (typically via the tarball option). Ensure compatible versions are being used.</p>
 <em>Possible Cause:</em> There is a version mismatch between the version of the hadoop client being used to submit jobs and the hadoop used in provisioning (typically via the tarball option). Ensure compatible versions are being used.</p>
 <p>
 <em>Possible Cause:</em> You used one of the options for specifying Hadoop configuration <span class="codefrag">-M or -H</span>, which had special characters like space or comma that were not escaped correctly. Refer to the section <em>Options Configuring HOD</em> for checking how to specify such options correctly.</p>
-<a name="N10792"></a><a name="My+Hadoop+Job+Got+Killed"></a>
+<a name="N107AA"></a><a name="My+Hadoop+Job+Got+Killed"></a>
 <h3 class="h4"> My Hadoop Job Got Killed </h3>
 <a name="My_Hadoop_Job_Got_Killed" id="My_Hadoop_Job_Got_Killed"></a>
 <p>
 <em>Possible Cause:</em> The wallclock limit specified by the Torque administrator or the <span class="codefrag">-l</span> option defined in the section <em>Specifying Additional Job Attributes</em> was exceeded since allocation time. Thus the cluster would have got released. Deallocate the cluster and allocate it again, this time with a larger wallclock time.</p>
 <p>
 <em>Possible Cause:</em> Problems with the JobTracker node. Refer to the section in <em>Collecting and Viewing Hadoop Logs</em> to get more information.</p>
-<a name="N107AD"></a><a name="Hadoop+Job+Fails+with+Message%3A+%27Job+tracker+still+initializing%27"></a>
+<a name="N107C5"></a><a name="Hadoop+Job+Fails+with+Message%3A+%27Job+tracker+still+initializing%27"></a>
 <h3 class="h4"> Hadoop Job Fails with Message: 'Job tracker still initializing' </h3>
 <a name="Hadoop_Job_Fails_with_Message_Jo" id="Hadoop_Job_Fails_with_Message_Jo"></a>
 <p>
 <em>Possible Cause:</em> The hadoop job was being run as part of the HOD script command, and it started before the JobTracker could come up fully. Allocate the cluster using a large value for the configuration option <span class="codefrag">--hod.script-wait-time</span>. Typically a value of 120 should work, though it is typically unnecessary to be that large.</p>
-<a name="N107BD"></a><a name="The+Exit+Codes+For+HOD+Are+Not+Getting+Into+Torque"></a>
+<a name="N107D5"></a><a name="The+Exit+Codes+For+HOD+Are+Not+Getting+Into+Torque"></a>
 <h3 class="h4"> The Exit Codes For HOD Are Not Getting Into Torque </h3>
 <a name="The_Exit_Codes_For_HOD_Are_Not_G" id="The_Exit_Codes_For_HOD_Are_Not_G"></a>
 <p>
 <em>Possible Cause:</em> Version 0.16 of hadoop is required for this functionality to work. The version of Hadoop used does not match. Use the required version of Hadoop.</p>
 <p>
 <em>Possible Cause:</em> The deallocation was done without using the <span class="codefrag">hod</span> command; for e.g. directly using <span class="codefrag">qdel</span>. When the cluster is deallocated in this manner, the HOD processes are terminated using signals. This results in the exit code to be based on the signal number, rather than the exit code of the program.</p>
-<a name="N107D5"></a><a name="The+Hadoop+Logs+are+Not+Uploaded+to+DFS"></a>
-<h3 class="h4"> The Hadoop Logs are Not Uploaded to DFS </h3>
+<a name="N107ED"></a><a name="The+Hadoop+Logs+are+Not+Uploaded+to+HDFS"></a>
+<h3 class="h4"> The Hadoop Logs are Not Uploaded to HDFS </h3>
 <a name="The_Hadoop_Logs_are_Not_Uploaded" id="The_Hadoop_Logs_are_Not_Uploaded"></a>
 <p>
 <em>Possible Cause:</em> There is a version mismatch between the version of the hadoop being used for uploading the logs and the external HDFS. Ensure that the correct version is specified in the <span class="codefrag">hodring.pkgs</span> option.</p>
-<a name="N107E5"></a><a name="Locating+Ringmaster+Logs"></a>
+<a name="N107FD"></a><a name="Locating+Ringmaster+Logs"></a>
 <h3 class="h4"> Locating Ringmaster Logs </h3>
 <a name="Locating_Ringmaster_Logs" id="Locating_Ringmaster_Logs"></a>
 <p>To locate the ringmaster logs, follow these steps: </p>
@@ -1177,7 +1208,7 @@ document.write("Last Published: " + document.lastModified);
 <li> If you don't get enough information, you may want to set the ringmaster debug level to 4. This can be done by passing <span class="codefrag">--ringmaster.debug 4</span> to the hod command line.</li>
   
   
 </ul>
 </ul>
-<a name="N10811"></a><a name="Locating+Hodring+Logs"></a>
+<a name="N10829"></a><a name="Locating+Hodring+Logs"></a>
 <h3 class="h4"> Locating Hodring Logs </h3>
 <a name="Locating_Hodring_Logs" id="Locating_Hodring_Logs"></a>
 <p>To locate hodring logs, follow the steps below: </p>

File diff suppressed because it is too large
+ 13 - 15
docs/hod_user_guide.pdf


+ 3 - 0
docs/index.html

@@ -153,6 +153,9 @@ document.write("Last Published: " + document.lastModified);
 <a href="api/index.html">API Docs</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
+<a href="jdiff/changes.html">API Changes</a>
+</div>
+<div class="menuitem">
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">

+ 9 - 0
docs/linkmap.html

@@ -153,6 +153,9 @@ document.write("Last Published: " + document.lastModified);
 <a href="api/index.html">API Docs</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
+<a href="jdiff/changes.html">API Changes</a>
+</div>
+<div class="menuitem">
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
@@ -306,6 +309,12 @@ document.write("Last Published: " + document.lastModified);
 </li>
 </li>
 </ul>
 </ul>
     
     
+<ul>
+<li>
+<a href="jdiff/changes.html">API Changes</a>&nbsp;&nbsp;___________________&nbsp;&nbsp;<em>jdiff</em>
+</li>
+</ul>
+    
 <ul>
 <ul>
 <li>
 <li>
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>&nbsp;&nbsp;___________________&nbsp;&nbsp;<em>wiki</em>

+ 16 - 16
docs/linkmap.pdf

@@ -5,10 +5,10 @@
 /Producer (FOP 0.20.5) >>
 /Producer (FOP 0.20.5) >>
 endobj
 endobj
 5 0 obj
 5 0 obj
-<< /Length 1176 /Filter [ /ASCII85Decode /FlateDecode ]
+<< /Length 1177 /Filter [ /ASCII85Decode /FlateDecode ]
  >>
  >>
 stream
 stream
-Gatn'?#uJp'Sc)T/%A8+U=D2AkrP<B4&+JZmd.o9B1E+cBS+&$iCs9""\Oc=Y)3H3R7SYYSG1FY3BBL12iBo'8VAOCd:U"@rP*.LUUp\eT_V#Rg5g4\!T?VWfgYpGXHf*TU.ZIRk/kSh<Sd,^IN7!rDl*'<hN$u'?"mI+p/Ua'[/%lnZNt!?m)cLl"rpMk[jb;2Rpc2Y$b%_-jqE=5=@q8uKcE0W^"\lO4!UkJH_SY'-1+72]'2!+2%gn^;1P46+AJH[^LGE91Kaj-[kJ9mrAiWA,-U<]/dr6,s8De81rt/$LgpS9'`06E*5`>TpCJ#>$pErMK^@d'lE&a.FV@s/eQaptM+)K>niHYR0Hf3_0Pp^mcorL]F&;FI2AIHe+RX$p8#'!n&;)5nG2'"_^=At*o@]N(R&ff9R%9nF#oU)5);V=Z=tQ3u^t2J.ZbqfjYj9!c.CRge'I9SdUcsEYHZd"C$("@eQq/:(E&MBu!_sD9[E,i+5tEF4k(h:YV02b5VGm:.\.io%IIYd*qkQC^pQKE:WbkIB[kC-DM0TCS5/V*aO9bC!VnVP5/KE\[^/L:P]<orX96m/L*"\Y8<%#qC,]Pg8lJ9b8)2J/[<1PFM;js$Y_7jN)'U^0,8ML*JJ$05_Q7j!'E=)uJ^@j!^V.mG-RC-Dh?ORJbQZ?Fe2[\71IR](Ms*r*/Z+p2fo.G*E*1P4VX?("iq]1[m]&kj!aF'Ek_-jj&0ZF%LE$!=O:]:%Q>9b%4gY(MP!Fiu1L_IUEMLtZ3er:AU1UoNEm.LbIM6s&..5hi%_1MC)M*jirbkE88-Vf8tY:5K$Vb#(c-WnfgOaaCOp6FFdKC[6:AX;"L0Bs2dS^:bV/O-pipkD/tgr7=OREH6WHSnNQBJ>eP<qg>e:nQ@RouFo(":n+-0L`.2\*)^p`jar9R`:T?:f?'S<SDOXDCqu^^OS$YMMnB7>)Q=@/.Ug[B3@jS3r@lk>+^qd0cdth4N0dUPr1VA8Wn$:\5diTM!FC:8o@[t;dV3TBleOdcctHX\1JO]I1_E(bT]*H6\&=haCr2q?+fTD)6eArSo,DY*)l!)18+VFLU/BTJAkQ&<U[[]lE8Loota_Rd+H5I`m1\RiS6FEpEF\Ul[#TR7H9>3f3OXPhm:+DbrX0o#4:"+9677#a;g.;rrIn0%IF~>
+Gatn'?#uJp'Sc)T/%A8+U=D2AkrMPN4&+JZhV!g1B1E+cBS4,%iCs:M$C!DG=qqQC16Y7<3:A*V*1_^gD\g)nP6b(eUWmr_qg[9#7X#4S61^#/Zle<B!lC.7Z\uP&XHf*TU.ZIRk/kSh<Sd,^IN7!rDl*'<hN$u'?"mI+p/Ua/[/%TfZNt!?m)cLl"rpMk[jb;2Rpc2Y$b%_-jqE=5=@q8uKcE0W^"\lO4!UkJH_SY'-1+72]'2!+2%gn^;1P46+AJH[^LGE90j+YVARQP[qYtBhM7g"g\M,uKs7>9sd3/J*+H@%);s^!\EstEDhEt*B0_DS&&[IV8Wj?SUp.]4#=*)ap,o4`?b3SCqc@C.(^p*VR5l5efbsBIlfi3n1K"/!]+,s$X5n&nVfkmXmqc8VEcY$a2`E\Z5a6WpL9un/?cfeD@f4rut%_e7?S*bC;\6ZjqSmlIe:-in]HKm<-G?<D^O`8uaL%44I$>5!g.H!&Q<Ju5dm=^_0@!qa>J%"dV^?nmK"tg(LEF8W#gchZ4_YkaLSSD@Bg$K/E;ds'bM9g,Z2<;Sunu6Q>=pkR[RI=+ObUG,crDK[l@h&c>I&`Cm6'=2IH1c&fnr1J8,H=M'ZRO-lKOA.C4,Q^ong#'$mWu&E=dC`&eStp%V7uXPX%Li[MDBOng:nII2G;2@_;$GLe)qSdSH%c62e%:4=ls#PPT61<335/Rqth%dqRWOBC1KNQMI8-n5+#l1E&`o$a]&5QqUUHla`]R%Aq1_<&)ourS>'D=X`YkB3=P?$8\9C9Gp;WhFc+F':GAJ*>Jq\]*TjOK0bl'%f[gg-UJmtcp'IV'B;?WMZ$dpB[SWR[Yfnl7l:$-<V@sl,XQ?A;*PS5GS@eTAW4>#FYufCrL^%A2UK\J-\iXEF\VYOubPdp2m<4X(V(<H?J#RcPS#APsI-ZiG\LAS`?A#4iH$^luCO@WRVJ0-P%#,^p8pd3N&&2/khn^->l-E9ZXi2_[Op"/1ZRhqdSJ1lfXj:smQk&"BSmiqfar2;WHa?j;$s*VMG;"P&Lfm+NE<n7V)B!B.[u8l:'a$A?_:',dJn)rn)[=fAkAV35"L-N.Xm5lZ7bnf*]K>nG)&di=_UuD>.%C[#j^)J8IpI!cUM-0[laj.fiF<W_YM&clFp@?bkt=t)+(Pl$WW*4dq[BD~>
 endstream
 endstream
 endobj
 endobj
 6 0 obj
 6 0 obj
@@ -20,10 +20,10 @@ endobj
 >>
 >>
 endobj
 endobj
 7 0 obj
 7 0 obj
-<< /Length 316 /Filter [ /ASCII85Decode /FlateDecode ]
+<< /Length 388 /Filter [ /ASCII85Decode /FlateDecode ]
  >>
  >>
 stream
 stream
-Gaqcq4\rsL&;GE/MAt"F@peTV*be%J_uQer?uW*/7\k#p%#BCo'3Z]EVrr0rIX#AEm1)E\!Oa*Z9#?.Ua$ChGNB&DY;iV(K).'Us(&\JW:ap//)BjI/RZFdj(Gk@eol;Ps$f$I`qBkndGQ[h.mOe5SXUV&IR45-]#WVm[.&$6/OCtj:U@N*LCJ:/UjZ3VkW)f7L:Tdt!j2G`2ap6arCMKc#S2KG,;R;[^74QO]V7*)rmAf^k7m3D<ZWEB=,%OAAA5@Z+]@O[AGoqq=j1<oS(VZ?5McFHSm&[?;(qoeaqqcPSr^`s%qZFKMB>X~>
+Gar?-5uWCi&;BTNMESBQA$TVY349iSE'3>@2sYQp)^o2(Q(j<=h>jY\L"#d(p[-_+S*5#e+HURs=i;JE"E=AJ&?6H[.AjTXF+i_u)Ngn9,6W#dBehai4CJbY]e+;Lp&abZf_g5rn,hb/4/uZ.X';Z].4*#`gM>,.)+)2N=="P^Jh^pN1P'7I[43*HcD<Bk[6g6C']n)sSjo0RYD;XT<Aq]_<Aq!rY3'pTHLKFK[c%A@9G>K"3bLu[434.oB\r/Ur#M"!G]=?`.TrK!;a/o_m#OG+_(YQPl*R8U.RYgf"Guc%7GhiWS\FD6h`"iUUg<s<8IU9Bl[d::%59lsD9/b-CVFU*.*LnNgtAR'8f!ZaVsg>N4pUH"Y5WN)2XKi:]EsU>~>
 endstream
 endstream
 endobj
 endobj
 8 0 obj
 8 0 obj
@@ -87,19 +87,19 @@ endobj
 xref
 xref
 0 14
 0 14
 0000000000 65535 f 
 0000000000 65535 f 
-0000002515 00000 n 
-0000002579 00000 n 
-0000002629 00000 n 
+0000002588 00000 n 
+0000002652 00000 n 
+0000002702 00000 n 
 0000000015 00000 n 
 0000000015 00000 n 
 0000000071 00000 n 
 0000000071 00000 n 
-0000001339 00000 n 
-0000001445 00000 n 
-0000001852 00000 n 
-0000001958 00000 n 
-0000002070 00000 n 
-0000002180 00000 n 
-0000002291 00000 n 
-0000002399 00000 n 
+0000001340 00000 n 
+0000001446 00000 n 
+0000001925 00000 n 
+0000002031 00000 n 
+0000002143 00000 n 
+0000002253 00000 n 
+0000002364 00000 n 
+0000002472 00000 n 
 trailer
 trailer
 <<
 <<
 /Size 14
 /Size 14
@@ -107,5 +107,5 @@ trailer
 /Info 4 0 R
 /Info 4 0 R
 >>
 >>
 startxref
 startxref
-2751
+2824
 %%EOF
 %%EOF

+ 3 - 0
docs/mapred_tutorial.html

@@ -153,6 +153,9 @@ document.write("Last Published: " + document.lastModified);
 <a href="api/index.html">API Docs</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
+<a href="jdiff/changes.html">API Changes</a>
+</div>
+<div class="menuitem">
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">

+ 3 - 0
docs/native_libraries.html

@@ -153,6 +153,9 @@ document.write("Last Published: " + document.lastModified);
 <a href="api/index.html">API Docs</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
+<a href="jdiff/changes.html">API Changes</a>
+</div>
+<div class="menuitem">
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">

+ 3 - 0
docs/quickstart.html

@@ -153,6 +153,9 @@ document.write("Last Published: " + document.lastModified);
 <a href="api/index.html">API Docs</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
+<a href="jdiff/changes.html">API Changes</a>
+</div>
+<div class="menuitem">
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">

+ 3 - 0
docs/streaming.html

@@ -156,6 +156,9 @@ document.write("Last Published: " + document.lastModified);
 <a href="api/index.html">API Docs</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">
+<a href="jdiff/changes.html">API Changes</a>
+</div>
+<div class="menuitem">
 <a href="http://wiki.apache.org/hadoop/">Wiki</a>
 </div>
 </div>
 <div class="menuitem">
 <div class="menuitem">

+ 7 - 0
src/contrib/hod/CHANGES.txt

@@ -1,5 +1,12 @@
 HOD Change Log
 
 
+Release 0.18.2 - Unreleased 
+
+  BUG FIXES
+
+    HADOOP-3786. Use HDFS instead of DFS in all docs and hyperlink to Torque.
+    (Vinod Kumar Vavilapalli via acmurthy)
+
 Release 0.18.1 - 2008-09-17
 
 
   INCOMPATIBLE CHANGES

+ 1 - 1
src/docs/src/documentation/content/xdocs/hod_admin_guide.xml

@@ -89,7 +89,7 @@ Nodes : HOD requires a minimum of three nodes configured through a resource mana
 <p> Software </p>
 <p>The following components must be installed on ALL nodes before using HOD:</p>
 <ul>
 <ul>
- <li>Torque: Resource manager</li>
+ <li><a href="ext:hod/torque">Torque: Resource manager</a></li>
  <li><a href="ext:hod/python">Python</a> : HOD requires version 2.5.1 of Python.</li>
 </ul>
 </ul>
 
 

+ 21 - 1
src/docs/src/documentation/content/xdocs/hod_config_guide.xml

@@ -68,7 +68,15 @@
         <ul>
         <ul>
           <li>temp-dir: Temporary directory for usage by the HOD processes. Make 
           <li>temp-dir: Temporary directory for usage by the HOD processes. Make 
                       sure that the users who will run hod have rights to create 
-                      directories under the directory specified here.</li>
+                      directories under the directory specified here. If you
+                      wish to make this directory vary across allocations,
+                      you can make use of the environmental variables which will
+                      be made available by the resource manager to the HOD
+                      processes. For example, in a Torque setup, having
+                      --ringmaster.temp-dir=/tmp/hod-temp-dir.$PBS_JOBID would
+                      let ringmaster use different temp-dir for each
+                      allocation; Torque expands this variable before starting
+                      the ringmaster.</li>
           
           
           <li>debug: Numeric value from 1-4. 4 produces the most log information,
           <li>debug: Numeric value from 1-4. 4 produces the most log information,
                    and 1 the least.</li>
                    and 1 the least.</li>
@@ -147,6 +155,18 @@
                       variable 'HOD_PYTHON_HOME' to the path to the python 
                       executable. The HOD processes launched on the compute nodes
                       can then use this variable.</li>
                       can then use this variable.</li>
+          <li>options: Comma-separated list of key-value pairs,
+                      expressed as
+                      &lt;option&gt;:&lt;sub-option&gt;=&lt;value&gt;. When
+                      passing to the job submission program, these are expanded
+                      as -&lt;option&gt; &lt;sub-option&gt;=&lt;value&gt;. These
+                      are generally used for specifying additional resource
+                      constraints for scheduling. For instance, with a Torque
+                      setup, one can specify
+                      --resource_manager.options='l:arch=x86_64' for
+                      constraining the nodes being allocated to a particular
+                      architecture; this option will be passed to Torque's qsub
+                      command as "-l arch=x86_64".</li>
         </ul>
         </ul>
       </section>
       </section>
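
The `options` entry added to hod_config_guide.xml in this hunk says that each `<option>:<sub-option>=<value>` pair is expanded to `-<option> <sub-option>=<value>` before being handed to the job submission program. The following is a minimal sketch of that expansion, not HOD's actual code: the function name `expand_rm_options` is hypothetical, and the sketch assumes that values never contain a comma.

```python
def expand_rm_options(options):
    """Expand 'opt:sub=value,opt2:sub2=value2' into submit-program arguments.

    Hypothetical helper for illustration only; it is not part of HOD.
    Assumes no value contains a comma.
    """
    args = []
    for item in options.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty entries such as a trailing comma
        option, sep, rest = item.partition(":")
        if not sep or not option or "=" not in rest:
            raise ValueError(
                "expected <option>:<sub-option>=<value>, got %r" % item
            )
        # 'l:arch=x86_64' becomes the two arguments '-l' and 'arch=x86_64'
        args.extend(["-" + option, rest])
    return args


if __name__ == "__main__":
    print(expand_rm_options("l:arch=x86_64"))
```

With the example from the text, `--resource_manager.options='l:arch=x86_64'`, the sketch produces the arguments `-l arch=x86_64`, which matches the form the documentation says is passed to Torque's qsub.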
       
       

+ 21 - 6
src/docs/src/documentation/content/xdocs/hod_user_guide.xml

@@ -258,7 +258,7 @@
       </tr>
       </tr>
       <tr>
       <tr>
         <td> 7 </td>
-        <td> DFS failure </td>
+        <td> HDFS failure </td>
       </tr>
       </tr>
       <tr>
       <tr>
         <td> 8 </td>
@@ -376,7 +376,7 @@
   <section><title><code>hod</code> Hangs During Deallocation </title><anchor id="_hod_Hangs_During_Deallocation"></anchor><anchor id="hod_Hangs_During_Deallocation"></anchor>
   <p><em>Possible Cause:</em> A Torque related problem, usually load on the Torque server, or the allocation is very large. Generally, waiting for the command to complete is the only option.</p>
   </section>
-  <section><title><code>hod</code> Fails With an error code and error message </title><anchor id="hod_Fails_With_an_error_code_and"></anchor><anchor id="_hod_Fails_With_an_error_code_an"></anchor>
+  <section><title><code>hod</code> Fails With an Error Code and Error Message </title><anchor id="hod_Fails_With_an_error_code_and"></anchor><anchor id="_hod_Fails_With_an_error_code_an"></anchor>
   <p>If the exit code of the <code>hod</code> command is not <code>0</code>, then refer to the following table of error exit codes to determine why the code may have occurred and how to debug the situation.</p>
   <p><strong> Error Codes </strong></p><anchor id="Error_Codes"></anchor>
   <table>
   <table>
@@ -429,14 +429,14 @@
       </tr>
       </tr>
       <tr>
       <tr>
         <td> 7 </td>
-        <td> DFS failure </td>
-        <td> When HOD fails to allocate due to DFS failures (or Job tracker failures, error code 8, see below), it prints a failure message "Hodring at &lt;hostname&gt; failed with following errors:" and then gives the actual error message, which may indicate one of the following:<br/>
+        <td> HDFS failure </td>
+        <td> When HOD fails to allocate due to HDFS failures (or Job tracker failures, error code 8, see below), it prints a failure message "Hodring at &lt;hostname&gt; failed with following errors:" and then gives the actual error message, which may indicate one of the following:<br/>
           1. Problem in starting Hadoop clusters. Usually the actual cause in the error message will indicate the problem on the hostname mentioned. Also, review the Hadoop related configuration in the HOD configuration files. Look at the Hadoop logs using information specified in <em>Collecting and Viewing Hadoop Logs</em> section above. <br />
           2. Invalid configuration on the node running the hodring, specified by the hostname in the error message <br/>
           3. Invalid configuration in the <code>hodring</code> section of hodrc. <code>ssh</code> to the hostname specified in the error message and grep for <code>ERROR</code> or <code>CRITICAL</code> in hodring logs. Refer to the section <em>Locating Hodring Logs</em> below for more information. <br />
           4. Invalid tarball specified which is not packaged correctly. <br />
           5. Cannot communicate with an externally configured HDFS.<br/>
-          When such DFS or Job tracker failure occurs, one can login into the host with hostname mentioned in HOD failure message and debug the problem. While fixing the problem, one should also review other log messages in the ringmaster log to see which other machines also might have had problems bringing up the jobtracker/namenode, apart from the hostname that is reported in the failure message. This possibility of other machines also having problems occurs because HOD continues to try and launch hadoop daemons on multiple machines one after another depending upon the value of the configuration variable <a href="hod_config_guide.html#3.4+ringmaster+options">ringmaster.max-master-failures</a>. Refer to the section <em>Locating Ringmaster Logs</em> below to find more about ringmaster logs.
+          When such HDFS or Job tracker failure occurs, one can login into the host with hostname mentioned in HOD failure message and debug the problem. While fixing the problem, one should also review other log messages in the ringmaster log to see which other machines also might have had problems bringing up the jobtracker/namenode, apart from the hostname that is reported in the failure message. This possibility of other machines also having problems occurs because HOD continues to try and launch hadoop daemons on multiple machines one after another depending upon the value of the configuration variable <a href="hod_config_guide.html#3.4+ringmaster+options">ringmaster.max-master-failures</a>. Refer to the section <em>Locating Ringmaster Logs</em> below to find more about ringmaster logs.
           </td>
       </tr>
       <tr>
@@ -482,6 +482,21 @@
       </tr>
   </table>
     </section>
+  <section><title>Hadoop DFSClient Warns with a
+  NotReplicatedYetException</title>
+  <p>Sometimes, when you try to upload a file to the HDFS immediately after
+  allocating a HOD cluster, DFSClient warns with a NotReplicatedYetException. It
+  usually shows a message something like - </p><table><tr><td><code>WARN
+  hdfs.DFSClient: NotReplicatedYetException sleeping &lt;filename&gt; retries
+  left 3</code></td></tr><tr><td><code>08/01/25 16:31:40 INFO hdfs.DFSClient:
+  org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
+  &lt;filename&gt; could only be replicated to 0 nodes, instead of
+  1</code></td></tr></table><p> This scenario arises when you try to upload a file
+  to the HDFS while the DataNodes are still in the process of contacting the
+  NameNode. This can be resolved by waiting for some time before uploading a new
+  file to the HDFS, so that enough DataNodes start and contact the
+  NameNode.</p>
+  </section>
   <section><title> Hadoop Jobs Not Running on a Successfully Allocated Cluster </title><anchor id="Hadoop_Jobs_Not_Running_on_a_Suc"></anchor>
   <p>This scenario generally occurs when a cluster is allocated, and is left inactive for sometime, and then hadoop jobs are attempted to be run on them. Then Hadoop jobs fail with the following exception:</p>
   <table><tr><td><code>08/01/25 16:31:40 INFO ipc.Client: Retrying connect to server: foo.bar.com/1.1.1.1:53567. Already tried 1 time(s).</code></td></tr></table>
@@ -502,7 +517,7 @@
   <p><em>Possible Cause:</em> Version 0.16 of hadoop is required for this functionality to work. The version of Hadoop used does not match. Use the required version of Hadoop.</p>
   <p><em>Possible Cause:</em> The deallocation was done without using the <code>hod</code> command; for e.g. directly using <code>qdel</code>. When the cluster is deallocated in this manner, the HOD processes are terminated using signals. This results in the exit code to be based on the signal number, rather than the exit code of the program.</p>
     </section>
-  <section><title> The Hadoop Logs are Not Uploaded to DFS </title><anchor id="The_Hadoop_Logs_are_Not_Uploaded"></anchor>
+  <section><title> The Hadoop Logs are Not Uploaded to HDFS </title><anchor id="The_Hadoop_Logs_are_Not_Uploaded"></anchor>
   <p><em>Possible Cause:</em> There is a version mismatch between the version of the hadoop being used for uploading the logs and the external HDFS. Ensure that the correct version is specified in the <code>hodring.pkgs</code> option.</p>
     </section>
   <section><title> Locating Ringmaster Logs </title><anchor id="Locating_Ringmaster_Logs"></anchor>

Some files are not shown because too many files changed in this commit.