Browse Source

HADOOP-3162 related ocumentation changes

git-svn-id: https://svn.apache.org/repos/asf/hadoop/core/trunk@648762 13f79535-47bb-0310-9956-ffa450edef68
Arun Murthy 17 years ago
parent
commit
64b591e3b0

+ 29 - 5
docs/changes.html

@@ -61,15 +61,19 @@
     </ol>
     </ol>
   </li>
   </li>
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._new_features_')">  NEW FEATURES
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._new_features_')">  NEW FEATURES
-</a>&nbsp;&nbsp;&nbsp;(1)
+</a>&nbsp;&nbsp;&nbsp;(2)
     <ol id="trunk_(unreleased_changes)_._new_features_">
     <ol id="trunk_(unreleased_changes)_._new_features_">
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3074">HADOOP-3074</a>. Provides a UrlStreamHandler for DFS and other FS,
+relying on FileSystem<br />(taton)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2585">HADOOP-2585</a>. Name-node imports namespace data from a recent checkpoint
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2585">HADOOP-2585</a>. Name-node imports namespace data from a recent checkpoint
 accessible via a NFS mount.<br />(shv)</li>
 accessible via a NFS mount.<br />(shv)</li>
     </ol>
     </ol>
   </li>
   </li>
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._improvements_')">  IMPROVEMENTS
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._improvements_')">  IMPROVEMENTS
-</a>&nbsp;&nbsp;&nbsp;(none)
+</a>&nbsp;&nbsp;&nbsp;(2)
     <ol id="trunk_(unreleased_changes)_._improvements_">
     <ol id="trunk_(unreleased_changes)_._improvements_">
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-2928">HADOOP-2928</a>. Remove deprecated FileSystem.getContentLength().<br />(Lohit Vjayarenu via rangadi)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3130">HADOOP-3130</a>. Make the connect timeout smaller for getFile.<br />(Amar Ramesh Kamat via ddas)</li>
     </ol>
     </ol>
   </li>
   </li>
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._optimizations_')">  OPTIMIZATIONS
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._optimizations_')">  OPTIMIZATIONS
@@ -78,10 +82,16 @@ accessible via a NFS mount.<br />(shv)</li>
     </ol>
     </ol>
   </li>
   </li>
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._bug_fixes_')">  BUG FIXES
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._bug_fixes_')">  BUG FIXES
-</a>&nbsp;&nbsp;&nbsp;(2)
+</a>&nbsp;&nbsp;&nbsp;(4)
     <ol id="trunk_(unreleased_changes)_._bug_fixes_">
     <ol id="trunk_(unreleased_changes)_._bug_fixes_">
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2905">HADOOP-2905</a>. 'fsck -move' triggers NPE in NameNode.<br />(Lohit Vjayarenu via rangadi)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2905">HADOOP-2905</a>. 'fsck -move' triggers NPE in NameNode.<br />(Lohit Vjayarenu via rangadi)</li>
       <li>Increment ClientProtocol.versionID missed by <a href="http://issues.apache.org/jira/browse/HADOOP-2585">HADOOP-2585</a>.<br />(shv)</li>
       <li>Increment ClientProtocol.versionID missed by <a href="http://issues.apache.org/jira/browse/HADOOP-2585">HADOOP-2585</a>.<br />(shv)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3254">HADOOP-3254</a>. Restructure internal namenode methods that process
+heartbeats to use well-defined BlockCommand object(s) instead of
+using the base java Object. (Tsz Wo (Nicholas), SZE via dhruba)
+</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3176">HADOOP-3176</a>.  Change lease record when a open-for-write-file
+gets renamed.<br />(dhruba)</li>
     </ol>
     </ol>
   </li>
   </li>
 </ul>
 </ul>
@@ -89,7 +99,7 @@ accessible via a NFS mount.<br />(shv)</li>
 </a></h2>
 </a></h2>
 <ul id="release_0.17.0_-_unreleased_">
 <ul id="release_0.17.0_-_unreleased_">
   <li><a href="javascript:toggleList('release_0.17.0_-_unreleased_._incompatible_changes_')">  INCOMPATIBLE CHANGES
   <li><a href="javascript:toggleList('release_0.17.0_-_unreleased_._incompatible_changes_')">  INCOMPATIBLE CHANGES
-</a>&nbsp;&nbsp;&nbsp;(23)
+</a>&nbsp;&nbsp;&nbsp;(24)
     <ol id="release_0.17.0_-_unreleased_._incompatible_changes_">
     <ol id="release_0.17.0_-_unreleased_._incompatible_changes_">
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2786">HADOOP-2786</a>.  Move hbase out of hadoop core
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2786">HADOOP-2786</a>.  Move hbase out of hadoop core
 </li>
 </li>
@@ -136,6 +146,8 @@ added to a cluster on the fly with new nodes starting in the same EC2
 availability zone as the cluster.  Ganglia monitoring and large
 availability zone as the cluster.  Ganglia monitoring and large
 instance sizes have also been added.<br />(Chris K Wensel via tomwhite)</li>
 instance sizes have also been added.<br />(Chris K Wensel via tomwhite)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2826">HADOOP-2826</a>. Deprecated FileSplit.getFile(), LineRecordReader.readLine().<br />(Amareshwari Sriramadasu via ddas)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2826">HADOOP-2826</a>. Deprecated FileSplit.getFile(), LineRecordReader.readLine().<br />(Amareshwari Sriramadasu via ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3239">HADOOP-3239</a>. getFileInfo() returns null for non-existing files instead
+of throwing FileNotFoundException.<br />(Lohit Vijayarenu via shv)</li>
     </ol>
     </ol>
   </li>
   </li>
   <li><a href="javascript:toggleList('release_0.17.0_-_unreleased_._new_features_')">  NEW FEATURES
   <li><a href="javascript:toggleList('release_0.17.0_-_unreleased_._new_features_')">  NEW FEATURES
@@ -265,7 +277,7 @@ records/log).<br />(Zheng Shao via omalley)</li>
     </ol>
     </ol>
   </li>
   </li>
   <li><a href="javascript:toggleList('release_0.17.0_-_unreleased_._bug_fixes_')">  BUG FIXES
   <li><a href="javascript:toggleList('release_0.17.0_-_unreleased_._bug_fixes_')">  BUG FIXES
-</a>&nbsp;&nbsp;&nbsp;(93)
+</a>&nbsp;&nbsp;&nbsp;(99)
     <ol id="release_0.17.0_-_unreleased_._bug_fixes_">
     <ol id="release_0.17.0_-_unreleased_._bug_fixes_">
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2195">HADOOP-2195</a>. '-mkdir' behaviour is now closer to Linux shell in case of
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2195">HADOOP-2195</a>. '-mkdir' behaviour is now closer to Linux shell in case of
 errors.<br />(Mahadev Konar via rangadi)</li>
 errors.<br />(Mahadev Konar via rangadi)</li>
@@ -433,6 +445,18 @@ deserialized Writables.<br />(Enis Soztutar via cdouglas)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-1373">HADOOP-1373</a>. checkPath() should ignore case when it compares authoriy.<br />(Edward J. Yoon via rangadi)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-1373">HADOOP-1373</a>. checkPath() should ignore case when it compares authoriy.<br />(Edward J. Yoon via rangadi)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3204">HADOOP-3204</a>. Fixes a problem to do with ReduceTask's LocalFSMerger not
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3204">HADOOP-3204</a>. Fixes a problem to do with ReduceTask's LocalFSMerger not
 catching Throwable.<br />(Amar Ramesh Kamat via ddas)</li>
 catching Throwable.<br />(Amar Ramesh Kamat via ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3229">HADOOP-3229</a>. Report progress when collecting records from the mapper and
+the combiner.<br />(Doug Cutting via cdouglas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3225">HADOOP-3225</a>. Unwrapping methods of RemoteException should initialize
+detailedMassage field.<br />(Mahadev Konar, shv, cdouglas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3247">HADOOP-3247</a>. Fix gridmix scripts to use the correct globbing syntax and
+change maxentToSameCluster to run the correct number of jobs.<br />(Runping Qi via cdouglas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3242">HADOOP-3242</a>. Fix the RecordReader of SequenceFileAsBinaryInputFormat to
+correctly read from the start of the split and not the beginning of the
+file.<br />(cdouglas via acmurthy)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3256">HADOOP-3256</a>. Encodes the job name used in the filename for history files.<br />(Arun Murthy via ddas)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3162">HADOOP-3162</a>. Ensure that comma-separated input paths are treated correctly
+as multiple input paths.<br />(Amareshwari Sri Ramadasu via acmurthy)</li>
     </ol>
     </ol>
   </li>
   </li>
 </ul>
 </ul>

+ 33 - 30
docs/mapred_tutorial.html

@@ -292,7 +292,7 @@ document.write("Last Published: " + document.lastModified);
 <a href="#Example%3A+WordCount+v2.0">Example: WordCount v2.0</a>
 <a href="#Example%3A+WordCount+v2.0">Example: WordCount v2.0</a>
 <ul class="minitoc">
 <ul class="minitoc">
 <li>
 <li>
-<a href="#Source+Code-N10C76">Source Code</a>
+<a href="#Source+Code-N10C7E">Source Code</a>
 </li>
 </li>
 <li>
 <li>
 <a href="#Sample+Runs">Sample Runs</a>
 <a href="#Sample+Runs">Sample Runs</a>
@@ -944,7 +944,7 @@ document.write("Last Published: " + document.lastModified);
 <td colspan="1" rowspan="1">52.</td>
 <td colspan="1" rowspan="1">52.</td>
             <td colspan="1" rowspan="1">
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               &nbsp;&nbsp;&nbsp;&nbsp;
-              <span class="codefrag">conf.setInputPath(new Path(args[0]));</span>
+              <span class="codefrag">FileInputFormat.setInputPaths(conf, new Path(args[0]));</span>
             </td>
             </td>
           
           
 </tr>
 </tr>
@@ -1466,7 +1466,10 @@ document.write("Last Published: " + document.lastModified);
         <span class="codefrag">Reducer</span>, <span class="codefrag">InputFormat</span> and 
         <span class="codefrag">Reducer</span>, <span class="codefrag">InputFormat</span> and 
         <span class="codefrag">OutputFormat</span> implementations. <span class="codefrag">JobConf</span> also 
         <span class="codefrag">OutputFormat</span> implementations. <span class="codefrag">JobConf</span> also 
         indicates the set of input files 
         indicates the set of input files 
-        (<a href="api/org/apache/hadoop/mapred/JobConf.html#setInputPath(org.apache.hadoop.fs.Path)">setInputPath(Path)</a>/<a href="api/org/apache/hadoop/mapred/JobConf.html#addInputPath(org.apache.hadoop.fs.Path)">addInputPath(Path)</a>)
+        (<a href="api/org/apache/hadoop/mapred/FileInputFormat.html#setInputPaths(org.apache.hadoop.mapred.JobConf,%20org.apache.hadoop.fs.Path[])">setInputPaths(JobConf, Path...)</a>
+        /<a href="api/org/apache/hadoop/mapred/FileInputFormat.html#addInputPath(org.apache.hadoop.mapred.JobConf,%20org.apache.hadoop.fs.Path)">addInputPath(JobConf, Path)</a>)
+        and (<a href="api/org/apache/hadoop/mapred/FileInputFormat.html#setInputPaths(org.apache.hadoop.mapred.JobConf,%20java.lang.String)">setInputPaths(JobConf, String)</a>
+        /<a href="api/org/apache/hadoop/mapred/FileInputFormat.html#addInputPath(org.apache.hadoop.mapred.JobConf,%20java.lang.String)">addInputPaths(JobConf, String)</a>)
         and where the output files should be written
         and where the output files should be written
         (<a href="api/org/apache/hadoop/mapred/FileInputFormat.html#setOutputPath(org.apache.hadoop.mapred.JobConf,%20org.apache.hadoop.fs.Path)">setOutputPath(Path)</a>).</p>
         (<a href="api/org/apache/hadoop/mapred/FileInputFormat.html#setOutputPath(org.apache.hadoop.mapred.JobConf,%20org.apache.hadoop.fs.Path)">setOutputPath(Path)</a>).</p>
 <p>Optionally, <span class="codefrag">JobConf</span> is used to specify other advanced 
 <p>Optionally, <span class="codefrag">JobConf</span> is used to specify other advanced 
@@ -1486,7 +1489,7 @@ document.write("Last Published: " + document.lastModified);
         <a href="api/org/apache/hadoop/conf/Configuration.html#set(java.lang.String, java.lang.String)">set(String, String)</a>/<a href="api/org/apache/hadoop/conf/Configuration.html#get(java.lang.String, java.lang.String)">get(String, String)</a>
         <a href="api/org/apache/hadoop/conf/Configuration.html#set(java.lang.String, java.lang.String)">set(String, String)</a>/<a href="api/org/apache/hadoop/conf/Configuration.html#get(java.lang.String, java.lang.String)">get(String, String)</a>
         to set/get arbitrary parameters needed by applications. However, use the 
         to set/get arbitrary parameters needed by applications. However, use the 
         <span class="codefrag">DistributedCache</span> for large amounts of (read-only) data.</p>
         <span class="codefrag">DistributedCache</span> for large amounts of (read-only) data.</p>
-<a name="N10844"></a><a name="Task+Execution+%26+Environment"></a>
+<a name="N1084C"></a><a name="Task+Execution+%26+Environment"></a>
 <h3 class="h4">Task Execution &amp; Environment</h3>
 <h3 class="h4">Task Execution &amp; Environment</h3>
 <p>The <span class="codefrag">TaskTracker</span> executes the <span class="codefrag">Mapper</span>/ 
 <p>The <span class="codefrag">TaskTracker</span> executes the <span class="codefrag">Mapper</span>/ 
         <span class="codefrag">Reducer</span>  <em>task</em> as a child process in a separate jvm.
         <span class="codefrag">Reducer</span>  <em>task</em> as a child process in a separate jvm.
@@ -1582,7 +1585,7 @@ document.write("Last Published: " + document.lastModified);
         loaded via <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/lang/System.html#loadLibrary(java.lang.String)">
         loaded via <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/lang/System.html#loadLibrary(java.lang.String)">
         System.loadLibrary</a> or <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/lang/System.html#load(java.lang.String)">
         System.loadLibrary</a> or <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/lang/System.html#load(java.lang.String)">
         System.load</a>.</p>
         System.load</a>.</p>
-<a name="N108EA"></a><a name="Job+Submission+and+Monitoring"></a>
+<a name="N108F2"></a><a name="Job+Submission+and+Monitoring"></a>
 <h3 class="h4">Job Submission and Monitoring</h3>
 <h3 class="h4">Job Submission and Monitoring</h3>
 <p>
 <p>
 <a href="api/org/apache/hadoop/mapred/JobClient.html">
 <a href="api/org/apache/hadoop/mapred/JobClient.html">
@@ -1643,7 +1646,7 @@ document.write("Last Published: " + document.lastModified);
 <p>Normally the user creates the application, describes various facets 
 <p>Normally the user creates the application, describes various facets 
         of the job via <span class="codefrag">JobConf</span>, and then uses the 
         of the job via <span class="codefrag">JobConf</span>, and then uses the 
         <span class="codefrag">JobClient</span> to submit the job and monitor its progress.</p>
         <span class="codefrag">JobClient</span> to submit the job and monitor its progress.</p>
-<a name="N1094A"></a><a name="Job+Control"></a>
+<a name="N10952"></a><a name="Job+Control"></a>
 <h4>Job Control</h4>
 <h4>Job Control</h4>
 <p>Users may need to chain map-reduce jobs to accomplish complex
 <p>Users may need to chain map-reduce jobs to accomplish complex
           tasks which cannot be done via a single map-reduce job. This is fairly
           tasks which cannot be done via a single map-reduce job. This is fairly
@@ -1679,7 +1682,7 @@ document.write("Last Published: " + document.lastModified);
             </li>
             </li>
           
           
 </ul>
 </ul>
-<a name="N10974"></a><a name="Job+Input"></a>
+<a name="N1097C"></a><a name="Job+Input"></a>
 <h3 class="h4">Job Input</h3>
 <h3 class="h4">Job Input</h3>
 <p>
 <p>
 <a href="api/org/apache/hadoop/mapred/InputFormat.html">
 <a href="api/org/apache/hadoop/mapred/InputFormat.html">
@@ -1727,7 +1730,7 @@ document.write("Last Published: " + document.lastModified);
         appropriate <span class="codefrag">CompressionCodec</span>. However, it must be noted that
         appropriate <span class="codefrag">CompressionCodec</span>. However, it must be noted that
         compressed files with the above extensions cannot be <em>split</em> and 
         compressed files with the above extensions cannot be <em>split</em> and 
         each compressed file is processed in its entirety by a single mapper.</p>
         each compressed file is processed in its entirety by a single mapper.</p>
-<a name="N109DE"></a><a name="InputSplit"></a>
+<a name="N109E6"></a><a name="InputSplit"></a>
 <h4>InputSplit</h4>
 <h4>InputSplit</h4>
 <p>
 <p>
 <a href="api/org/apache/hadoop/mapred/InputSplit.html">
 <a href="api/org/apache/hadoop/mapred/InputSplit.html">
@@ -1741,7 +1744,7 @@ document.write("Last Published: " + document.lastModified);
           FileSplit</a> is the default <span class="codefrag">InputSplit</span>. It sets 
           FileSplit</a> is the default <span class="codefrag">InputSplit</span>. It sets 
           <span class="codefrag">map.input.file</span> to the path of the input file for the
           <span class="codefrag">map.input.file</span> to the path of the input file for the
           logical split.</p>
           logical split.</p>
-<a name="N10A03"></a><a name="RecordReader"></a>
+<a name="N10A0B"></a><a name="RecordReader"></a>
 <h4>RecordReader</h4>
 <h4>RecordReader</h4>
 <p>
 <p>
 <a href="api/org/apache/hadoop/mapred/RecordReader.html">
 <a href="api/org/apache/hadoop/mapred/RecordReader.html">
@@ -1753,7 +1756,7 @@ document.write("Last Published: " + document.lastModified);
           for processing. <span class="codefrag">RecordReader</span> thus assumes the 
           for processing. <span class="codefrag">RecordReader</span> thus assumes the 
           responsibility of processing record boundaries and presents the tasks 
           responsibility of processing record boundaries and presents the tasks 
           with keys and values.</p>
           with keys and values.</p>
-<a name="N10A26"></a><a name="Job+Output"></a>
+<a name="N10A2E"></a><a name="Job+Output"></a>
 <h3 class="h4">Job Output</h3>
 <h3 class="h4">Job Output</h3>
 <p>
 <p>
 <a href="api/org/apache/hadoop/mapred/OutputFormat.html">
 <a href="api/org/apache/hadoop/mapred/OutputFormat.html">
@@ -1778,7 +1781,7 @@ document.write("Last Published: " + document.lastModified);
 <p>
 <p>
 <span class="codefrag">TextOutputFormat</span> is the default 
 <span class="codefrag">TextOutputFormat</span> is the default 
         <span class="codefrag">OutputFormat</span>.</p>
         <span class="codefrag">OutputFormat</span>.</p>
-<a name="N10A4F"></a><a name="Task+Side-Effect+Files"></a>
+<a name="N10A57"></a><a name="Task+Side-Effect+Files"></a>
 <h4>Task Side-Effect Files</h4>
 <h4>Task Side-Effect Files</h4>
 <p>In some applications, component tasks need to create and/or write to
 <p>In some applications, component tasks need to create and/or write to
           side-files, which differ from the actual job-output files.</p>
           side-files, which differ from the actual job-output files.</p>
@@ -1817,7 +1820,7 @@ document.write("Last Published: " + document.lastModified);
 <p>The entire discussion holds true for maps of jobs with 
 <p>The entire discussion holds true for maps of jobs with 
            reducer=NONE (i.e. 0 reduces) since output of the map, in that case, 
            reducer=NONE (i.e. 0 reduces) since output of the map, in that case, 
            goes directly to HDFS.</p>
            goes directly to HDFS.</p>
-<a name="N10A97"></a><a name="RecordWriter"></a>
+<a name="N10A9F"></a><a name="RecordWriter"></a>
 <h4>RecordWriter</h4>
 <h4>RecordWriter</h4>
 <p>
 <p>
 <a href="api/org/apache/hadoop/mapred/RecordWriter.html">
 <a href="api/org/apache/hadoop/mapred/RecordWriter.html">
@@ -1825,9 +1828,9 @@ document.write("Last Published: " + document.lastModified);
           pairs to an output file.</p>
           pairs to an output file.</p>
 <p>RecordWriter implementations write the job outputs to the 
 <p>RecordWriter implementations write the job outputs to the 
           <span class="codefrag">FileSystem</span>.</p>
           <span class="codefrag">FileSystem</span>.</p>
-<a name="N10AAE"></a><a name="Other+Useful+Features"></a>
+<a name="N10AB6"></a><a name="Other+Useful+Features"></a>
 <h3 class="h4">Other Useful Features</h3>
 <h3 class="h4">Other Useful Features</h3>
-<a name="N10AB4"></a><a name="Counters"></a>
+<a name="N10ABC"></a><a name="Counters"></a>
 <h4>Counters</h4>
 <h4>Counters</h4>
 <p>
 <p>
 <span class="codefrag">Counters</span> represent global counters, defined either by 
 <span class="codefrag">Counters</span> represent global counters, defined either by 
@@ -1841,7 +1844,7 @@ document.write("Last Published: " + document.lastModified);
           Reporter.incrCounter(Enum, long)</a> in the <span class="codefrag">map</span> and/or 
           Reporter.incrCounter(Enum, long)</a> in the <span class="codefrag">map</span> and/or 
           <span class="codefrag">reduce</span> methods. These counters are then globally 
           <span class="codefrag">reduce</span> methods. These counters are then globally 
           aggregated by the framework.</p>
           aggregated by the framework.</p>
-<a name="N10ADF"></a><a name="DistributedCache"></a>
+<a name="N10AE7"></a><a name="DistributedCache"></a>
 <h4>DistributedCache</h4>
 <h4>DistributedCache</h4>
 <p>
 <p>
 <a href="api/org/apache/hadoop/filecache/DistributedCache.html">
 <a href="api/org/apache/hadoop/filecache/DistributedCache.html">
@@ -1874,7 +1877,7 @@ document.write("Last Published: " + document.lastModified);
           <a href="api/org/apache/hadoop/filecache/DistributedCache.html#createSymlink(org.apache.hadoop.conf.Configuration)">
           <a href="api/org/apache/hadoop/filecache/DistributedCache.html#createSymlink(org.apache.hadoop.conf.Configuration)">
           DistributedCache.createSymlink(Configuration)</a> api. Files 
           DistributedCache.createSymlink(Configuration)</a> api. Files 
           have <em>execution permissions</em> set.</p>
           have <em>execution permissions</em> set.</p>
-<a name="N10B1D"></a><a name="Tool"></a>
+<a name="N10B25"></a><a name="Tool"></a>
 <h4>Tool</h4>
 <h4>Tool</h4>
 <p>The <a href="api/org/apache/hadoop/util/Tool.html">Tool</a> 
 <p>The <a href="api/org/apache/hadoop/util/Tool.html">Tool</a> 
           interface supports the handling of generic Hadoop command-line options.
           interface supports the handling of generic Hadoop command-line options.
@@ -1914,7 +1917,7 @@ document.write("Last Published: " + document.lastModified);
             </span>
             </span>
           
           
 </p>
 </p>
-<a name="N10B4F"></a><a name="IsolationRunner"></a>
+<a name="N10B57"></a><a name="IsolationRunner"></a>
 <h4>IsolationRunner</h4>
 <h4>IsolationRunner</h4>
 <p>
 <p>
 <a href="api/org/apache/hadoop/mapred/IsolationRunner.html">
 <a href="api/org/apache/hadoop/mapred/IsolationRunner.html">
@@ -1938,7 +1941,7 @@ document.write("Last Published: " + document.lastModified);
 <p>
 <p>
 <span class="codefrag">IsolationRunner</span> will run the failed task in a single 
 <span class="codefrag">IsolationRunner</span> will run the failed task in a single 
           jvm, which can be in the debugger, over precisely the same input.</p>
           jvm, which can be in the debugger, over precisely the same input.</p>
-<a name="N10B82"></a><a name="Debugging"></a>
+<a name="N10B8A"></a><a name="Debugging"></a>
 <h4>Debugging</h4>
 <h4>Debugging</h4>
 <p>Map/Reduce framework provides a facility to run user-provided 
 <p>Map/Reduce framework provides a facility to run user-provided 
           scripts for debugging. When map/reduce task fails, user can run 
           scripts for debugging. When map/reduce task fails, user can run 
@@ -1949,7 +1952,7 @@ document.write("Last Published: " + document.lastModified);
 <p> In the following sections we discuss how to submit debug script
 <p> In the following sections we discuss how to submit debug script
           along with the job. For submitting debug script, first it has to
           along with the job. For submitting debug script, first it has to
           distributed. Then the script has to supplied in Configuration. </p>
           distributed. Then the script has to supplied in Configuration. </p>
-<a name="N10B8E"></a><a name="How+to+distribute+script+file%3A"></a>
+<a name="N10B96"></a><a name="How+to+distribute+script+file%3A"></a>
 <h5> How to distribute script file: </h5>
 <h5> How to distribute script file: </h5>
 <p>
 <p>
           To distribute  the debug script file, first copy the file to the dfs.
           To distribute  the debug script file, first copy the file to the dfs.
@@ -1972,7 +1975,7 @@ document.write("Last Published: " + document.lastModified);
           <a href="api/org/apache/hadoop/filecache/DistributedCache.html#createSymlink(org.apache.hadoop.conf.Configuration)">
           <a href="api/org/apache/hadoop/filecache/DistributedCache.html#createSymlink(org.apache.hadoop.conf.Configuration)">
           DistributedCache.createSymLink(Configuration) </a> api.
           DistributedCache.createSymLink(Configuration) </a> api.
           </p>
           </p>
-<a name="N10BA7"></a><a name="How+to+submit+script%3A"></a>
+<a name="N10BAF"></a><a name="How+to+submit+script%3A"></a>
 <h5> How to submit script: </h5>
 <h5> How to submit script: </h5>
 <p> A quick way to submit debug script is to set values for the 
 <p> A quick way to submit debug script is to set values for the 
           properties "mapred.map.task.debug.script" and 
           properties "mapred.map.task.debug.script" and 
@@ -1996,17 +1999,17 @@ document.write("Last Published: " + document.lastModified);
 <span class="codefrag">$script $stdout $stderr $syslog $jobconf $program </span>  
 <span class="codefrag">$script $stdout $stderr $syslog $jobconf $program </span>  
           
           
 </p>
 </p>
-<a name="N10BC9"></a><a name="Default+Behavior%3A"></a>
+<a name="N10BD1"></a><a name="Default+Behavior%3A"></a>
 <h5> Default Behavior: </h5>
 <h5> Default Behavior: </h5>
 <p> For pipes, a default script is run to process core dumps under
 <p> For pipes, a default script is run to process core dumps under
           gdb, prints stack trace and gives info about running threads. </p>
           gdb, prints stack trace and gives info about running threads. </p>
-<a name="N10BD4"></a><a name="JobControl"></a>
+<a name="N10BDC"></a><a name="JobControl"></a>
 <h4>JobControl</h4>
 <h4>JobControl</h4>
 <p>
 <p>
 <a href="api/org/apache/hadoop/mapred/jobcontrol/package-summary.html">
 <a href="api/org/apache/hadoop/mapred/jobcontrol/package-summary.html">
           JobControl</a> is a utility which encapsulates a set of Map-Reduce jobs
           JobControl</a> is a utility which encapsulates a set of Map-Reduce jobs
           and their dependencies.</p>
           and their dependencies.</p>
-<a name="N10BE1"></a><a name="Data+Compression"></a>
+<a name="N10BE9"></a><a name="Data+Compression"></a>
 <h4>Data Compression</h4>
 <h4>Data Compression</h4>
 <p>Hadoop Map-Reduce provides facilities for the application-writer to
 <p>Hadoop Map-Reduce provides facilities for the application-writer to
           specify compression for both intermediate map-outputs and the
           specify compression for both intermediate map-outputs and the
@@ -2020,7 +2023,7 @@ document.write("Last Published: " + document.lastModified);
           codecs for reasons of both performance (zlib) and non-availability of
           codecs for reasons of both performance (zlib) and non-availability of
           Java libraries (lzo). More details on their usage and availability are
           Java libraries (lzo). More details on their usage and availability are
           available <a href="native_libraries.html">here</a>.</p>
           available <a href="native_libraries.html">here</a>.</p>
-<a name="N10C01"></a><a name="Intermediate+Outputs"></a>
+<a name="N10C09"></a><a name="Intermediate+Outputs"></a>
 <h5>Intermediate Outputs</h5>
 <h5>Intermediate Outputs</h5>
 <p>Applications can control compression of intermediate map-outputs
 <p>Applications can control compression of intermediate map-outputs
             via the 
             via the 
@@ -2041,7 +2044,7 @@ document.write("Last Published: " + document.lastModified);
             <a href="api/org/apache/hadoop/mapred/JobConf.html#setMapOutputCompressionType(org.apache.hadoop.io.SequenceFile.CompressionType)">
             <a href="api/org/apache/hadoop/mapred/JobConf.html#setMapOutputCompressionType(org.apache.hadoop.io.SequenceFile.CompressionType)">
             JobConf.setMapOutputCompressionType(SequenceFile.CompressionType)</a> 
             JobConf.setMapOutputCompressionType(SequenceFile.CompressionType)</a> 
             api.</p>
             api.</p>
-<a name="N10C2D"></a><a name="Job+Outputs"></a>
+<a name="N10C35"></a><a name="Job+Outputs"></a>
 <h5>Job Outputs</h5>
 <h5>Job Outputs</h5>
 <p>Applications can control compression of job-outputs via the
 <p>Applications can control compression of job-outputs via the
             <a href="api/org/apache/hadoop/mapred/OutputFormatBase.html#setCompressOutput(org.apache.hadoop.mapred.JobConf,%20boolean)">
             <a href="api/org/apache/hadoop/mapred/OutputFormatBase.html#setCompressOutput(org.apache.hadoop.mapred.JobConf,%20boolean)">
@@ -2061,7 +2064,7 @@ document.write("Last Published: " + document.lastModified);
 </div>
 </div>
 
 
     
     
-<a name="N10C5C"></a><a name="Example%3A+WordCount+v2.0"></a>
+<a name="N10C64"></a><a name="Example%3A+WordCount+v2.0"></a>
 <h2 class="h3">Example: WordCount v2.0</h2>
 <h2 class="h3">Example: WordCount v2.0</h2>
 <div class="section">
 <div class="section">
 <p>Here is a more complete <span class="codefrag">WordCount</span> which uses many of the
 <p>Here is a more complete <span class="codefrag">WordCount</span> which uses many of the
@@ -2071,7 +2074,7 @@ document.write("Last Published: " + document.lastModified);
       <a href="quickstart.html#SingleNodeSetup">pseudo-distributed</a> or
       <a href="quickstart.html#SingleNodeSetup">pseudo-distributed</a> or
       <a href="quickstart.html#Fully-Distributed+Operation">fully-distributed</a> 
       <a href="quickstart.html#Fully-Distributed+Operation">fully-distributed</a> 
       Hadoop installation.</p>
       Hadoop installation.</p>
-<a name="N10C76"></a><a name="Source+Code-N10C76"></a>
+<a name="N10C7E"></a><a name="Source+Code-N10C7E"></a>
 <h3 class="h4">Source Code</h3>
 <h3 class="h4">Source Code</h3>
 <table class="ForrestTable" cellspacing="1" cellpadding="4">
 <table class="ForrestTable" cellspacing="1" cellpadding="4">
           
           
@@ -3160,7 +3163,7 @@ document.write("Last Published: " + document.lastModified);
 <td colspan="1" rowspan="1">111.</td>
 <td colspan="1" rowspan="1">111.</td>
             <td colspan="1" rowspan="1">
             <td colspan="1" rowspan="1">
               &nbsp;&nbsp;&nbsp;&nbsp;
               &nbsp;&nbsp;&nbsp;&nbsp;
-              <span class="codefrag">conf.setInputPath(new Path(other_args.get(0)));</span>
+              <span class="codefrag">FileInputFormat.setInputPaths(conf, new Path(other_args.get(0)));</span>
             </td>
             </td>
           
           
 </tr>
 </tr>
@@ -3281,7 +3284,7 @@ document.write("Last Published: " + document.lastModified);
 </tr>
 </tr>
         
         
 </table>
 </table>
-<a name="N113D8"></a><a name="Sample+Runs"></a>
+<a name="N113E0"></a><a name="Sample+Runs"></a>
 <h3 class="h4">Sample Runs</h3>
 <h3 class="h4">Sample Runs</h3>
 <p>Sample text-files as input:</p>
 <p>Sample text-files as input:</p>
 <p>
 <p>
@@ -3449,7 +3452,7 @@ document.write("Last Published: " + document.lastModified);
 <br>
 <br>
         
         
 </p>
 </p>
-<a name="N114AC"></a><a name="Highlights"></a>
+<a name="N114B4"></a><a name="Highlights"></a>
 <h3 class="h4">Highlights</h3>
 <h3 class="h4">Highlights</h3>
 <p>The second version of <span class="codefrag">WordCount</span> improves upon the 
 <p>The second version of <span class="codefrag">WordCount</span> improves upon the 
         previous one by using some features offered by the Map-Reduce framework:
         previous one by using some features offered by the Map-Reduce framework:

File diff suppressed because it is too large
+ 1 - 2
docs/mapred_tutorial.pdf


BIN
docs/skin/images/rc-b-l-15-1body-2menu-3menu.png


BIN
docs/skin/images/rc-b-r-15-1body-2menu-3menu.png


BIN
docs/skin/images/rc-b-r-5-1header-2tab-selected-3tab-selected.png


BIN
docs/skin/images/rc-t-l-5-1header-2searchbox-3searchbox.png


BIN
docs/skin/images/rc-t-l-5-1header-2tab-selected-3tab-selected.png


BIN
docs/skin/images/rc-t-l-5-1header-2tab-unselected-3tab-unselected.png


BIN
docs/skin/images/rc-t-r-15-1body-2menu-3menu.png


BIN
docs/skin/images/rc-t-r-5-1header-2searchbox-3searchbox.png


BIN
docs/skin/images/rc-t-r-5-1header-2tab-selected-3tab-selected.png


BIN
docs/skin/images/rc-t-r-5-1header-2tab-unselected-3tab-unselected.png


Some files were not shown because too many files changed in this diff