
Merge -r 668611:668612 from trunk onto 0.18 branch. Fixes HADOOP-2762.

git-svn-id: https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.18@668617 13f79535-47bb-0310-9956-ffa450edef68
Devaraj Das 17 years ago
parent
commit
32171004eb

+ 3 - 0
CHANGES.txt

@@ -290,6 +290,9 @@ Release 0.18.0 - Unreleased
     HADOOP-3406. Add forrest documentation for Profiling.
     (Amareshwari Sriramadasu via ddas)
 
+    HADOOP-2762. Add forrest documentation for controls of memory limits on 
+    hadoop daemons and Map-Reduce tasks. (Amareshwari Sriramadasu via ddas)
+
   OPTIMIZATIONS
 
     HADOOP-3274. The default constructor of BytesWritable creates empty 

+ 51 - 10
docs/cluster_setup.html

@@ -324,6 +324,45 @@ document.write("Last Published: " + document.lastModified);
 <p>At the very least you should specify the
           <span class="codefrag">JAVA_HOME</span> so that it is correctly defined on each
           remote node.</p>
+<p>Administrators can configure individual daemons using the
+          configuration options <span class="codefrag">HADOOP_*_OPTS</span>. Various options 
+          available are shown below in the table. </p>
+<table class="ForrestTable" cellspacing="1" cellpadding="4">
+          
+<tr>
+<th colspan="1" rowspan="1">Daemon</th><th colspan="1" rowspan="1">Configure Options</th>
+</tr>
+          
+<tr>
+<td colspan="1" rowspan="1">NameNode</td><td colspan="1" rowspan="1">HADOOP_NAMENODE_OPTS</td>
+</tr>
+          
+<tr>
+<td colspan="1" rowspan="1">DataNode</td><td colspan="1" rowspan="1">HADOOP_DATANODE_OPTS</td>
+</tr>
+          
+<tr>
+<td colspan="1" rowspan="1">SecondaryNamenode</td>
+              <td colspan="1" rowspan="1">HADOOP_SECONDARYNAMENODE_OPTS</td>
+</tr>
+          
+<tr>
+<td colspan="1" rowspan="1">JobTracker</td><td colspan="1" rowspan="1">HADOOP_JOBTRACKER_OPTS</td>
+</tr>
+          
+<tr>
+<td colspan="1" rowspan="1">TaskTracker</td><td colspan="1" rowspan="1">HADOOP_TASKTRACKER_OPTS</td>
+</tr>
+          
+</table>
+<p> For example, To configure Namenode to use parallelGC, the
+          following statement should be added in <span class="codefrag">hadoop-env.sh</span> :
+          <br>
+<span class="codefrag">
+          export HADOOP_NAMENODE_OPTS="-XX:+UseParallelGC ${HADOOP_NAMENODE_OPTS}"
+          </span>
+<br>
+</p>
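The table and the parallelGC example above compose as a hadoop-env.sh fragment like the following sketch (the GC flag is the one from the patch; the JobTracker flag is a purely illustrative addition):

```shell
# Illustrative hadoop-env.sh fragment; flags are examples, not recommendations.
# Prepending to the existing ${HADOOP_*_OPTS} value preserves any options
# that were set earlier in the environment.
export HADOOP_NAMENODE_OPTS="-XX:+UseParallelGC ${HADOOP_NAMENODE_OPTS}"
export HADOOP_JOBTRACKER_OPTS="-verbose:gc ${HADOOP_JOBTRACKER_OPTS}"
echo "NameNode opts: ${HADOOP_NAMENODE_OPTS}"
```

Each daemon's start script picks up only its own `HADOOP_*_OPTS` variable, so settings for one daemon do not leak into another.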
 <p>Other useful configuration parameters that you can customize 
           include:</p>
 <ul>
@@ -338,11 +377,13 @@ document.write("Last Published: " + document.lastModified);
 <li>
               
 <span class="codefrag">HADOOP_HEAPSIZE</span> - The maximum amount of heapsize 
-              to use, in MB e.g. <span class="codefrag">2000MB</span>.
+              to use, in MB e.g. <span class="codefrag">1000MB</span>. This is used to 
+              configure the heap size for the hadoop daemon. By default,
+              the value is <span class="codefrag">1000MB</span>.
             </li>
           
 </ul>
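A minimal sketch of the HADOOP_HEAPSIZE behaviour described above, assuming the launcher script derives the daemon's `-Xmx` flag from the value (which is how `bin/hadoop` of this era behaved; 1000 MB is the documented default):

```shell
# HADOOP_HEAPSIZE is a plain number of megabytes; 1000 is the default.
export HADOOP_HEAPSIZE=1000
# The launcher effectively turns it into the daemon JVM's max-heap flag:
JAVA_HEAP_MAX="-Xmx${HADOOP_HEAPSIZE}m"
echo "$JAVA_HEAP_MAX"   # prints -Xmx1000m
```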
-<a name="N100DD"></a><a name="Configuring+the+Hadoop+Daemons"></a>
+<a name="N10130"></a><a name="Configuring+the+Hadoop+Daemons"></a>
 <h4>Configuring the Hadoop Daemons</h4>
 <p>This section deals with important parameters to be specified in the
           <span class="codefrag">conf/hadoop-site.xml</span> for the Hadoop cluster.</p>
@@ -466,7 +507,7 @@ document.write("Last Published: " + document.lastModified);
           <a href="api/org/apache/hadoop/conf/Configuration.html#FinalParams">
           final</a> to ensure that they cannot be overriden by user-applications.
           </p>
-<a name="N101BC"></a><a name="Real-World+Cluster+Configurations"></a>
+<a name="N1020F"></a><a name="Real-World+Cluster+Configurations"></a>
 <h5>Real-World Cluster Configurations</h5>
 <p>This section lists some non-default configuration parameters which 
             have been used to run the <em>sort</em> benchmark on very large 
@@ -618,7 +659,7 @@ document.write("Last Published: " + document.lastModified);
                     
                     
 <td colspan="1" rowspan="1">mapred.child.java.opts</td>
                     <td colspan="1" rowspan="1">-Xmx1024M</td>
-                    <td colspan="1" rowspan="1"></td>
+                    <td colspan="1" rowspan="1">Larger heap-size for child jvms of maps/reduces.</td>
                  
 </tr>
                
@@ -627,7 +668,7 @@ document.write("Last Published: " + document.lastModified);
 </li>
             
 </ul>
-<a name="N102D9"></a><a name="Slaves"></a>
+<a name="N1032D"></a><a name="Slaves"></a>
 <h4>Slaves</h4>
 <p>Typically you choose one machine in the cluster to act as the 
           <span class="codefrag">NameNode</span> and one machine as to act as the 
@@ -636,14 +677,14 @@ document.write("Last Published: " + document.lastModified);
           referred to as <em>slaves</em>.</p>
 <p>List all slave hostnames or IP addresses in your 
           <span class="codefrag">conf/slaves</span> file, one per line.</p>
-<a name="N102F8"></a><a name="Logging"></a>
+<a name="N1034C"></a><a name="Logging"></a>
 <h4>Logging</h4>
 <p>Hadoop uses the <a href="http://logging.apache.org/log4j/">Apache 
           log4j</a> via the <a href="http://commons.apache.org/logging/">Apache 
           Commons Logging</a> framework for logging. Edit the 
           <span class="codefrag">conf/log4j.properties</span> file to customize the Hadoop 
           daemons' logging configuration (log-formats and so on).</p>
-<a name="N1030C"></a><a name="History+Logging"></a>
+<a name="N10360"></a><a name="History+Logging"></a>
 <h5>History Logging</h5>
 <p> The job history files are stored in central location 
             <span class="codefrag"> hadoop.job.history.location </span> which can be on DFS also,
@@ -677,7 +718,7 @@ document.write("Last Published: " + document.lastModified);
 </div>
     
     
-<a name="N10344"></a><a name="Hadoop+Rack+Awareness"></a>
+<a name="N10398"></a><a name="Hadoop+Rack+Awareness"></a>
 <h2 class="h3">Hadoop Rack Awareness</h2>
 <div class="section">
 <p>The HDFS and the Map-Reduce components are rack-aware.</p>
@@ -700,7 +741,7 @@ document.write("Last Published: " + document.lastModified);
 </div>
     
     
-<a name="N1036A"></a><a name="Hadoop+Startup"></a>
+<a name="N103BE"></a><a name="Hadoop+Startup"></a>
 <h2 class="h3">Hadoop Startup</h2>
 <div class="section">
 <p>To start a Hadoop cluster you will need to start both the HDFS and 
@@ -735,7 +776,7 @@ document.write("Last Published: " + document.lastModified);
 </div>
     
     
-<a name="N103B0"></a><a name="Hadoop+Shutdown"></a>
+<a name="N10404"></a><a name="Hadoop+Shutdown"></a>
 <h2 class="h3">Hadoop Shutdown</h2>
 <div class="section">
 <p>

These changes were suppressed because the diff is too large
+ 2 - 2
docs/cluster_setup.pdf


+ 38 - 28
docs/mapred_tutorial.html

@@ -307,7 +307,7 @@ document.write("Last Published: " + document.lastModified);
 <a href="#Example%3A+WordCount+v2.0">Example: WordCount v2.0</a>
 <ul class="minitoc">
 <li>
-<a href="#Source+Code-N10D94">Source Code</a>
+<a href="#Source+Code-N10DA0">Source Code</a>
 </li>
 <li>
 <a href="#Sample+Runs">Sample Runs</a>
@@ -1547,7 +1547,17 @@ document.write("Last Published: " + document.lastModified);
         
         
 </p>
 <p>Users/admins can also specify the maximum virtual memory 
-        of the launched child-task using <span class="codefrag">mapred.child.ulimit</span>.</p>
+        of the launched child-task using <span class="codefrag">mapred.child.ulimit</span>.
+        The value for <span class="codefrag">mapred.child.ulimit</span> should be specified 
+        in kilo bytes (KB). And also the value must be greater than
+        or equal to the -Xmx passed to JavaVM, else the VM might not start. 
+        </p>
+<p>Note: <span class="codefrag">mapred.child.java.opts</span> are used only for 
+        configuring the launched child tasks from task tracker. Configuring 
+        the memory options for daemons is documented in 
+        <a href="cluster_setup.html#Configuring+the+Environment+of+the+Hadoop+Daemons">
+        cluster_setup.html </a>
+</p>
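The KB-versus-`-Xmx` constraint that the added text describes can be sanity-checked with a small script; the property values below are hypothetical examples, not recommended settings:

```shell
# Hypothetical values: a child JVM launched with -Xmx512M and a proposed
# mapred.child.ulimit of 1048576 KB (1 GB of virtual memory).
XMX_MB=512
ULIMIT_KB=1048576
XMX_KB=$((XMX_MB * 1024))   # mapred.child.ulimit is expressed in KB
if [ "$ULIMIT_KB" -ge "$XMX_KB" ]; then
  echo "ok: ulimit ${ULIMIT_KB} KB covers -Xmx (${XMX_KB} KB)"
else
  echo "error: mapred.child.ulimit below -Xmx; the child JVM may not start" >&2
  exit 1
fi
```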
 <p>The task tracker has local directory,
         <span class="codefrag"> ${mapred.local.dir}/taskTracker/</span> to create localized
         cache and localized job. It can define multiple local directories 
@@ -1731,7 +1741,7 @@ document.write("Last Published: " + document.lastModified);
         loaded via <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/lang/System.html#loadLibrary(java.lang.String)">
         System.loadLibrary</a> or <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/lang/System.html#load(java.lang.String)">
         System.load</a>.</p>
-<a name="N109EB"></a><a name="Job+Submission+and+Monitoring"></a>
+<a name="N109F7"></a><a name="Job+Submission+and+Monitoring"></a>
 <h3 class="h4">Job Submission and Monitoring</h3>
 <p>
 <a href="api/org/apache/hadoop/mapred/JobClient.html">
@@ -1792,7 +1802,7 @@ document.write("Last Published: " + document.lastModified);
 <p>Normally the user creates the application, describes various facets 
         of the job via <span class="codefrag">JobConf</span>, and then uses the 
         <span class="codefrag">JobClient</span> to submit the job and monitor its progress.</p>
-<a name="N10A4B"></a><a name="Job+Control"></a>
+<a name="N10A57"></a><a name="Job+Control"></a>
 <h4>Job Control</h4>
 <p>Users may need to chain map-reduce jobs to accomplish complex
           tasks which cannot be done via a single map-reduce job. This is fairly
@@ -1828,7 +1838,7 @@ document.write("Last Published: " + document.lastModified);
             </li>
           
 </ul>
-<a name="N10A75"></a><a name="Job+Input"></a>
+<a name="N10A81"></a><a name="Job+Input"></a>
 <h3 class="h4">Job Input</h3>
 <p>
 <a href="api/org/apache/hadoop/mapred/InputFormat.html">
@@ -1876,7 +1886,7 @@ document.write("Last Published: " + document.lastModified);
         appropriate <span class="codefrag">CompressionCodec</span>. However, it must be noted that
         compressed files with the above extensions cannot be <em>split</em> and 
         each compressed file is processed in its entirety by a single mapper.</p>
-<a name="N10ADF"></a><a name="InputSplit"></a>
+<a name="N10AEB"></a><a name="InputSplit"></a>
 <h4>InputSplit</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/InputSplit.html">
@@ -1890,7 +1900,7 @@ document.write("Last Published: " + document.lastModified);
           FileSplit</a> is the default <span class="codefrag">InputSplit</span>. It sets 
           <span class="codefrag">map.input.file</span> to the path of the input file for the
           logical split.</p>
-<a name="N10B04"></a><a name="RecordReader"></a>
+<a name="N10B10"></a><a name="RecordReader"></a>
 <h4>RecordReader</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/RecordReader.html">
@@ -1902,7 +1912,7 @@ document.write("Last Published: " + document.lastModified);
           for processing. <span class="codefrag">RecordReader</span> thus assumes the 
           responsibility of processing record boundaries and presents the tasks 
           with keys and values.</p>
-<a name="N10B27"></a><a name="Job+Output"></a>
+<a name="N10B33"></a><a name="Job+Output"></a>
 <h3 class="h4">Job Output</h3>
 <p>
 <a href="api/org/apache/hadoop/mapred/OutputFormat.html">
@@ -1927,7 +1937,7 @@ document.write("Last Published: " + document.lastModified);
 <p>
 <span class="codefrag">TextOutputFormat</span> is the default 
         <span class="codefrag">OutputFormat</span>.</p>
-<a name="N10B50"></a><a name="Task+Side-Effect+Files"></a>
+<a name="N10B5C"></a><a name="Task+Side-Effect+Files"></a>
 <h4>Task Side-Effect Files</h4>
 <p>In some applications, component tasks need to create and/or write to
           side-files, which differ from the actual job-output files.</p>
@@ -1966,7 +1976,7 @@ document.write("Last Published: " + document.lastModified);
 <p>The entire discussion holds true for maps of jobs with 
            reducer=NONE (i.e. 0 reduces) since output of the map, in that case, 
            goes directly to HDFS.</p>
-<a name="N10B98"></a><a name="RecordWriter"></a>
+<a name="N10BA4"></a><a name="RecordWriter"></a>
 <h4>RecordWriter</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/RecordWriter.html">
@@ -1974,9 +1984,9 @@ document.write("Last Published: " + document.lastModified);
           pairs to an output file.</p>
 <p>RecordWriter implementations write the job outputs to the 
           <span class="codefrag">FileSystem</span>.</p>
-<a name="N10BAF"></a><a name="Other+Useful+Features"></a>
+<a name="N10BBB"></a><a name="Other+Useful+Features"></a>
 <h3 class="h4">Other Useful Features</h3>
-<a name="N10BB5"></a><a name="Counters"></a>
+<a name="N10BC1"></a><a name="Counters"></a>
 <h4>Counters</h4>
 <p>
 <span class="codefrag">Counters</span> represent global counters, defined either by 
@@ -1990,7 +2000,7 @@ document.write("Last Published: " + document.lastModified);
           Reporter.incrCounter(Enum, long)</a> in the <span class="codefrag">map</span> and/or 
           <span class="codefrag">reduce</span> methods. These counters are then globally 
           aggregated by the framework.</p>
-<a name="N10BE0"></a><a name="DistributedCache"></a>
+<a name="N10BEC"></a><a name="DistributedCache"></a>
 <h4>DistributedCache</h4>
 <p>
 <a href="api/org/apache/hadoop/filecache/DistributedCache.html">
@@ -2024,7 +2034,7 @@ document.write("Last Published: " + document.lastModified);
           <a href="api/org/apache/hadoop/filecache/DistributedCache.html#createSymlink(org.apache.hadoop.conf.Configuration)">
           DistributedCache.createSymlink(Configuration)</a> api. Files 
           have <em>execution permissions</em> set.</p>
-<a name="N10C1E"></a><a name="Tool"></a>
+<a name="N10C2A"></a><a name="Tool"></a>
 <h4>Tool</h4>
 <p>The <a href="api/org/apache/hadoop/util/Tool.html">Tool</a> 
           interface supports the handling of generic Hadoop command-line options.
@@ -2064,7 +2074,7 @@ document.write("Last Published: " + document.lastModified);
             </span>
           
 </p>
-<a name="N10C50"></a><a name="IsolationRunner"></a>
+<a name="N10C5C"></a><a name="IsolationRunner"></a>
 <h4>IsolationRunner</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/IsolationRunner.html">
@@ -2088,7 +2098,7 @@ document.write("Last Published: " + document.lastModified);
 <p>
 <span class="codefrag">IsolationRunner</span> will run the failed task in a single 
           jvm, which can be in the debugger, over precisely the same input.</p>
-<a name="N10C83"></a><a name="Profiling"></a>
+<a name="N10C8F"></a><a name="Profiling"></a>
 <h4>Profiling</h4>
 <p>Profiling is a utility to get a representative (2 or 3) sample
           of built-in java profiler for a sample of maps and reduces. </p>
@@ -2121,7 +2131,7 @@ document.write("Last Published: " + document.lastModified);
           <span class="codefrag">-agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=%s</span>
           
 </p>
-<a name="N10CB7"></a><a name="Debugging"></a>
+<a name="N10CC3"></a><a name="Debugging"></a>
 <h4>Debugging</h4>
 <p>Map/Reduce framework provides a facility to run user-provided 
           scripts for debugging. When map/reduce task fails, user can run 
@@ -2132,7 +2142,7 @@ document.write("Last Published: " + document.lastModified);
 <p> In the following sections we discuss how to submit debug script
           along with the job. For submitting debug script, first it has to
           distributed. Then the script has to supplied in Configuration. </p>
-<a name="N10CC3"></a><a name="How+to+distribute+script+file%3A"></a>
+<a name="N10CCF"></a><a name="How+to+distribute+script+file%3A"></a>
 <h5> How to distribute script file: </h5>
 <p>
           To distribute  the debug script file, first copy the file to the dfs.
@@ -2155,7 +2165,7 @@ document.write("Last Published: " + document.lastModified);
           <a href="api/org/apache/hadoop/filecache/DistributedCache.html#createSymlink(org.apache.hadoop.conf.Configuration)">
           DistributedCache.createSymLink(Configuration) </a> api.
           </p>
-<a name="N10CDC"></a><a name="How+to+submit+script%3A"></a>
+<a name="N10CE8"></a><a name="How+to+submit+script%3A"></a>
 <h5> How to submit script: </h5>
 <p> A quick way to submit debug script is to set values for the 
           properties "mapred.map.task.debug.script" and 
@@ -2179,17 +2189,17 @@ document.write("Last Published: " + document.lastModified);
 <span class="codefrag">$script $stdout $stderr $syslog $jobconf $program </span>  
           
 </p>
-<a name="N10CFE"></a><a name="Default+Behavior%3A"></a>
+<a name="N10D0A"></a><a name="Default+Behavior%3A"></a>
 <h5> Default Behavior: </h5>
 <p> For pipes, a default script is run to process core dumps under
           gdb, prints stack trace and gives info about running threads. </p>
-<a name="N10D09"></a><a name="JobControl"></a>
+<a name="N10D15"></a><a name="JobControl"></a>
 <h4>JobControl</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/jobcontrol/package-summary.html">
           JobControl</a> is a utility which encapsulates a set of Map-Reduce jobs
           and their dependencies.</p>
-<a name="N10D16"></a><a name="Data+Compression"></a>
+<a name="N10D22"></a><a name="Data+Compression"></a>
 <h4>Data Compression</h4>
 <p>Hadoop Map-Reduce provides facilities for the application-writer to
           specify compression for both intermediate map-outputs and the
@@ -2203,7 +2213,7 @@ document.write("Last Published: " + document.lastModified);
           codecs for reasons of both performance (zlib) and non-availability of
           Java libraries (lzo). More details on their usage and availability are
           available <a href="native_libraries.html">here</a>.</p>
-<a name="N10D36"></a><a name="Intermediate+Outputs"></a>
+<a name="N10D42"></a><a name="Intermediate+Outputs"></a>
 <h5>Intermediate Outputs</h5>
 <p>Applications can control compression of intermediate map-outputs
             via the 
@@ -2212,7 +2222,7 @@ document.write("Last Published: " + document.lastModified);
             <span class="codefrag">CompressionCodec</span> to be used via the
             <a href="api/org/apache/hadoop/mapred/JobConf.html#setMapOutputCompressorClass(java.lang.Class)">
             JobConf.setMapOutputCompressorClass(Class)</a> api.</p>
-<a name="N10D4B"></a><a name="Job+Outputs"></a>
+<a name="N10D57"></a><a name="Job+Outputs"></a>
 <h5>Job Outputs</h5>
 <p>Applications can control compression of job-outputs via the
             <a href="api/org/apache/hadoop/mapred/OutputFormatBase.html#setCompressOutput(org.apache.hadoop.mapred.JobConf,%20boolean)">
@@ -2232,7 +2242,7 @@ document.write("Last Published: " + document.lastModified);
 </div>
 
     
-<a name="N10D7A"></a><a name="Example%3A+WordCount+v2.0"></a>
+<a name="N10D86"></a><a name="Example%3A+WordCount+v2.0"></a>
 <h2 class="h3">Example: WordCount v2.0</h2>
 <div class="section">
 <p>Here is a more complete <span class="codefrag">WordCount</span> which uses many of the
@@ -2242,7 +2252,7 @@ document.write("Last Published: " + document.lastModified);
       <a href="quickstart.html#SingleNodeSetup">pseudo-distributed</a> or
       <a href="quickstart.html#Fully-Distributed+Operation">fully-distributed</a> 
       Hadoop installation.</p>
-<a name="N10D94"></a><a name="Source+Code-N10D94"></a>
+<a name="N10DA0"></a><a name="Source+Code-N10DA0"></a>
 <h3 class="h4">Source Code</h3>
 <table class="ForrestTable" cellspacing="1" cellpadding="4">
           
@@ -3452,7 +3462,7 @@ document.write("Last Published: " + document.lastModified);
 </tr>
         
 </table>
-<a name="N114F6"></a><a name="Sample+Runs"></a>
+<a name="N11502"></a><a name="Sample+Runs"></a>
 <h3 class="h4">Sample Runs</h3>
 <p>Sample text-files as input:</p>
 <p>
@@ -3620,7 +3630,7 @@ document.write("Last Published: " + document.lastModified);
 <br>
         
 </p>
-<a name="N115CA"></a><a name="Highlights"></a>
+<a name="N115D6"></a><a name="Highlights"></a>
 <h3 class="h4">Highlights</h3>
 <p>The second version of <span class="codefrag">WordCount</span> improves upon the 
         previous one by using some features offered by the Map-Reduce framework:

These changes were suppressed because the diff is too large
+ 3 - 3
docs/mapred_tutorial.pdf


+ 23 - 2
src/docs/src/documentation/content/xdocs/cluster_setup.xml

@@ -118,6 +118,25 @@
           <code>JAVA_HOME</code> so that it is correctly defined on each
           remote node.</p>
           
+          <p>Administrators can configure individual daemons using the
+          configuration options <code>HADOOP_*_OPTS</code>. Various options 
+          available are shown below in the table. </p>
+          <table>
+          <tr><th>Daemon</th><th>Configure Options</th></tr>
+          <tr><td>NameNode</td><td>HADOOP_NAMENODE_OPTS</td></tr>
+          <tr><td>DataNode</td><td>HADOOP_DATANODE_OPTS</td></tr>
+          <tr><td>SecondaryNamenode</td>
+              <td>HADOOP_SECONDARYNAMENODE_OPTS</td></tr>
+          <tr><td>JobTracker</td><td>HADOOP_JOBTRACKER_OPTS</td></tr>
+          <tr><td>TaskTracker</td><td>HADOOP_TASKTRACKER_OPTS</td></tr>
+          </table>
+          
+          <p> For example, To configure Namenode to use parallelGC, the
+          following statement should be added in <code>hadoop-env.sh</code> :
+          <br/><code>
+          export HADOOP_NAMENODE_OPTS="-XX:+UseParallelGC ${HADOOP_NAMENODE_OPTS}"
+          </code><br/></p>
+          
           <p>Other useful configuration parameters that you can customize 
           include:</p>
           <ul>
@@ -128,7 +147,9 @@
             </li>
             <li>
               <code>HADOOP_HEAPSIZE</code> - The maximum amount of heapsize 
-              to use, in MB e.g. <code>2000MB</code>.
+              to use, in MB e.g. <code>1000MB</code>. This is used to 
+              configure the heap size for the hadoop daemon. By default,
+              the value is <code>1000MB</code>.
             </li>
           </ul>
         </section>
@@ -335,7 +356,7 @@
                   <tr>
                     <td>mapred.child.java.opts</td>
                     <td>-Xmx1024M</td>
-                    <td></td>
+                    <td>Larger heap-size for child jvms of maps/reduces.</td>
                   </tr>
                 </table>
               </li>

+ 11 - 1
src/docs/src/documentation/content/xdocs/mapred_tutorial.xml

@@ -1066,7 +1066,17 @@
         </p>
         
         <p>Users/admins can also specify the maximum virtual memory 
-        of the launched child-task using <code>mapred.child.ulimit</code>.</p>
+        of the launched child-task using <code>mapred.child.ulimit</code>.
+        The value for <code>mapred.child.ulimit</code> should be specified 
+        in kilo bytes (KB). And also the value must be greater than
+        or equal to the -Xmx passed to JavaVM, else the VM might not start. 
+        </p>
+        
+        <p>Note: <code>mapred.child.java.opts</code> are used only for 
+        configuring the launched child tasks from task tracker. Configuring 
+        the memory options for daemons is documented in 
+        <a href="cluster_setup.html#Configuring+the+Environment+of+the+Hadoop+Daemons">
+        cluster_setup.html </a></p>
         
         
         <p>The task tracker has local directory,
         <code> ${mapred.local.dir}/taskTracker/</code> to create localized

Some files were not shown because too many files changed in this diff