
HADOOP-3041. Reverted in favor of moving this change to a later release.

git-svn-id: https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.16@640782 13f79535-47bb-0310-9956-ffa450edef68
Devaraj Das 17 years ago
Parent
Commit
e2850a8b71

+ 0 - 4
CHANGES.txt

@@ -27,10 +27,6 @@ Release 0.16.2 - Unreleased
     HADOOP-3007. Tolerate mirror failures while DataNode is replicating
     blocks as it used to before. (rangadi)
 
-    HADOOP-3041. Deprecates getOutputPath and defines two new APIs
-    getCurrentOutputPath and getFinalOutputPath.
-    (Amareshwari Sriramadasu via ddas)
-
     HADOOP-2944. Fixes a "Run on Hadoop" wizard NPE when creating a
     Location from the wizard. (taton)
 

+ 18 - 22
docs/mapred_tutorial.html

@@ -283,7 +283,7 @@ document.write("Last Published: " + document.lastModified);
 <a href="#Example%3A+WordCount+v2.0">Example: WordCount v2.0</a>
 <ul class="minitoc">
 <li>
-<a href="#Source+Code-N10BC1">Source Code</a>
+<a href="#Source+Code-N10BBE">Source Code</a>
 </li>
 <li>
 <a href="#Sample+Runs">Sample Runs</a>
@@ -1731,15 +1731,11 @@ document.write("Last Published: " + document.lastModified);
 <p>The application-writer can take advantage of this feature by 
           creating any side-files required in <span class="codefrag">${mapred.output.dir}</span> 
           during execution of a task via 
-          <a href="api/org/apache/hadoop/mapred/JobConf.html#getCurrentOutputPath()">
-          JobConf.getCurrentOutputPath()</a>, and the framework will promote them 
+          <a href="api/org/apache/hadoop/mapred/JobConf.html#getOutputPath()">
+          JobConf.getOutputPath()</a>, and the framework will promote them 
           similarly for succesful task-attempts, thus eliminating the need to 
-          pick unique paths per task-attempt. She can get the actual configured 
-          path (final output path) via 
-          <a href="api/org/apache/hadoop/mapred/JobConf.html#getFinalOutputPath()">
-          JobConf.getFinalOutputPath()</a>
-</p>
-<a name="N10A34"></a><a name="RecordWriter"></a>
+          pick unique paths per task-attempt.</p>
+<a name="N10A31"></a><a name="RecordWriter"></a>
 <h4>RecordWriter</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/RecordWriter.html">
@@ -1747,9 +1743,9 @@ document.write("Last Published: " + document.lastModified);
           pairs to an output file.</p>
 <p>RecordWriter implementations write the job outputs to the 
           <span class="codefrag">FileSystem</span>.</p>
-<a name="N10A4B"></a><a name="Other+Useful+Features"></a>
+<a name="N10A48"></a><a name="Other+Useful+Features"></a>
 <h3 class="h4">Other Useful Features</h3>
-<a name="N10A51"></a><a name="Counters"></a>
+<a name="N10A4E"></a><a name="Counters"></a>
 <h4>Counters</h4>
 <p>
 <span class="codefrag">Counters</span> represent global counters, defined either by 
@@ -1763,7 +1759,7 @@ document.write("Last Published: " + document.lastModified);
           Reporter.incrCounter(Enum, long)</a> in the <span class="codefrag">map</span> and/or 
           <span class="codefrag">reduce</span> methods. These counters are then globally 
           aggregated by the framework.</p>
-<a name="N10A7C"></a><a name="DistributedCache"></a>
+<a name="N10A79"></a><a name="DistributedCache"></a>
 <h4>DistributedCache</h4>
 <p>
 <a href="api/org/apache/hadoop/filecache/DistributedCache.html">
@@ -1796,7 +1792,7 @@ document.write("Last Published: " + document.lastModified);
           <a href="api/org/apache/hadoop/filecache/DistributedCache.html#createSymlink(org.apache.hadoop.conf.Configuration)">
           DistributedCache.createSymlink(Path, Configuration)</a> api. Files 
           have <em>execution permissions</em> set.</p>
-<a name="N10ABA"></a><a name="Tool"></a>
+<a name="N10AB7"></a><a name="Tool"></a>
 <h4>Tool</h4>
 <p>The <a href="api/org/apache/hadoop/util/Tool.html">Tool</a> 
           interface supports the handling of generic Hadoop command-line options.
@@ -1836,7 +1832,7 @@ document.write("Last Published: " + document.lastModified);
             </span>
           
 </p>
-<a name="N10AEC"></a><a name="IsolationRunner"></a>
+<a name="N10AE9"></a><a name="IsolationRunner"></a>
 <h4>IsolationRunner</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/IsolationRunner.html">
@@ -1860,13 +1856,13 @@ document.write("Last Published: " + document.lastModified);
 <p>
 <span class="codefrag">IsolationRunner</span> will run the failed task in a single 
           jvm, which can be in the debugger, over precisely the same input.</p>
-<a name="N10B1F"></a><a name="JobControl"></a>
+<a name="N10B1C"></a><a name="JobControl"></a>
 <h4>JobControl</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/jobcontrol/package-summary.html">
           JobControl</a> is a utility which encapsulates a set of Map-Reduce jobs
           and their dependencies.</p>
-<a name="N10B2C"></a><a name="Data+Compression"></a>
+<a name="N10B29"></a><a name="Data+Compression"></a>
 <h4>Data Compression</h4>
 <p>Hadoop Map-Reduce provides facilities for the application-writer to
           specify compression for both intermediate map-outputs and the
@@ -1880,7 +1876,7 @@ document.write("Last Published: " + document.lastModified);
           codecs for reasons of both performance (zlib) and non-availability of
           Java libraries (lzo). More details on their usage and availability are
           available <a href="native_libraries.html">here</a>.</p>
-<a name="N10B4C"></a><a name="Intermediate+Outputs"></a>
+<a name="N10B49"></a><a name="Intermediate+Outputs"></a>
 <h5>Intermediate Outputs</h5>
 <p>Applications can control compression of intermediate map-outputs
             via the 
@@ -1901,7 +1897,7 @@ document.write("Last Published: " + document.lastModified);
             <a href="api/org/apache/hadoop/mapred/JobConf.html#setMapOutputCompressionType(org.apache.hadoop.io.SequenceFile.CompressionType)">
             JobConf.setMapOutputCompressionType(SequenceFile.CompressionType)</a> 
             api.</p>
-<a name="N10B78"></a><a name="Job+Outputs"></a>
+<a name="N10B75"></a><a name="Job+Outputs"></a>
 <h5>Job Outputs</h5>
 <p>Applications can control compression of job-outputs via the
             <a href="api/org/apache/hadoop/mapred/OutputFormatBase.html#setCompressOutput(org.apache.hadoop.mapred.JobConf,%20boolean)">
@@ -1921,7 +1917,7 @@ document.write("Last Published: " + document.lastModified);
 </div>
 
     
-<a name="N10BA7"></a><a name="Example%3A+WordCount+v2.0"></a>
+<a name="N10BA4"></a><a name="Example%3A+WordCount+v2.0"></a>
 <h2 class="h3">Example: WordCount v2.0</h2>
 <div class="section">
 <p>Here is a more complete <span class="codefrag">WordCount</span> which uses many of the
@@ -1931,7 +1927,7 @@ document.write("Last Published: " + document.lastModified);
       <a href="quickstart.html#SingleNodeSetup">pseudo-distributed</a> or
       <a href="quickstart.html#Fully-Distributed+Operation">fully-distributed</a> 
       Hadoop installation.</p>
-<a name="N10BC1"></a><a name="Source+Code-N10BC1"></a>
+<a name="N10BBE"></a><a name="Source+Code-N10BBE"></a>
 <h3 class="h4">Source Code</h3>
 <table class="ForrestTable" cellspacing="1" cellpadding="4">
           
@@ -3141,7 +3137,7 @@ document.write("Last Published: " + document.lastModified);
 </tr>
         
 </table>
-<a name="N11323"></a><a name="Sample+Runs"></a>
+<a name="N11320"></a><a name="Sample+Runs"></a>
 <h3 class="h4">Sample Runs</h3>
 <p>Sample text-files as input:</p>
 <p>
@@ -3309,7 +3305,7 @@ document.write("Last Published: " + document.lastModified);
 <br>
         
 </p>
-<a name="N113F7"></a><a name="Highlights"></a>
+<a name="N113F4"></a><a name="Highlights"></a>
 <h3 class="h4">Highlights</h3>
 <p>The second version of <span class="codefrag">WordCount</span> improves upon the 
         previous one by using some features offered by the Map-Reduce framework:
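The tutorial text above describes side-files: a task writes extra files into `${mapred.output.dir}` (which, during execution, actually points at `${mapred.output.dir}/_temporary/_{$taskid}`), and the framework promotes them to the real output directory on success. A minimal plain-Java sketch of that path arithmetic, as a model only, not the real Hadoop API; the `_temporary` name mirrors `MRConstants.TEMP_DIR_NAME` and the task-id format is illustrative:

```java
// Model of how a task-attempt's side-file path maps to its final location.
public class SideFilePromotion {
    static final String TEMP_DIR_NAME = "_temporary"; // mirrors MRConstants.TEMP_DIR_NAME

    // Path the running task sees via JobConf.getOutputPath().
    static String taskOutputPath(String jobOutputDir, String taskId) {
        return jobOutputDir + "/" + TEMP_DIR_NAME + "/_" + taskId;
    }

    // Where a side-file written by the task lands after a successful attempt:
    // promotion strips the _temporary/_<taskid> component.
    static String promotedPath(String jobOutputDir, String taskId, String sideFile) {
        String tmp = taskOutputPath(jobOutputDir, taskId) + "/" + sideFile;
        return tmp.replace("/" + TEMP_DIR_NAME + "/_" + taskId, "");
    }

    public static void main(String[] args) {
        String out = "/user/alice/out"; // hypothetical job output directory
        String attempt = "task_200803261011_0001_r_000000_0"; // illustrative task id
        System.out.println(taskOutputPath(out, attempt));
        System.out.println(promotedPath(out, attempt, "side.dat"));
    }
}
```

Because failed attempts leave their files stranded under `_temporary/_<taskid>`, the same file name can be written by every attempt without collisions, which is exactly why the docs say unique per-attempt paths are unnecessary.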

File diff suppressed because it is too large
+ 1 - 1
docs/mapred_tutorial.pdf


+ 3 - 6
src/docs/src/documentation/content/xdocs/mapred_tutorial.xml

@@ -1282,13 +1282,10 @@
           <p>The application-writer can take advantage of this feature by 
           creating any side-files required in <code>${mapred.output.dir}</code> 
           during execution of a task via 
-          <a href="ext:api/org/apache/hadoop/mapred/jobconf/getcurrentoutputpath">
-          JobConf.getCurrentOutputPath()</a>, and the framework will promote them 
+          <a href="ext:api/org/apache/hadoop/mapred/jobconf/getoutputpath">
+          JobConf.getOutputPath()</a>, and the framework will promote them 
           similarly for succesful task-attempts, thus eliminating the need to 
-          pick unique paths per task-attempt. She can get the actual configured 
-          path (final output path) via 
-          <a href="ext:api/org/apache/hadoop/mapred/jobconf/getfinaloutputpath">
-          JobConf.getFinalOutputPath()</a></p>
+          pick unique paths per task-attempt.</p>
         </section>
         
         <section>

+ 1 - 2
src/docs/src/documentation/content/xdocs/site.xml

@@ -136,8 +136,7 @@ See http://forrest.apache.org/docs/linking.html for more info.
                 <setoutputvaluegroupingcomparator href="#setOutputValueGroupingComparator(java.lang.Class)" />
                 <setinputpath href="#setInputPath(org.apache.hadoop.fs.Path)" />
                 <addinputpath href="#addInputPath(org.apache.hadoop.fs.Path)" />
-                <getcurrentoutputpath href="#getCurrentOutputPath()" />
-                <getfinaloutputpath href="#getFinalOutputPath()" />
+                <getoutputpath href="#getOutputPath()" />
                 <setoutputpath href="#setOutputPath(org.apache.hadoop.fs.Path)" />
                 <setcombinerclass href="#setCombinerClass(java.lang.Class)" />
                 <setmapdebugscript href="#setMapDebugScript(java.lang.String)" />

+ 1 - 1
src/examples/org/apache/hadoop/examples/RandomWriter.java

@@ -105,7 +105,7 @@ public class RandomWriter extends Configured implements Tool {
     public InputSplit[] getSplits(JobConf job, 
                                   int numSplits) throws IOException {
       InputSplit[] result = new InputSplit[numSplits];
-      Path outDir = job.getCurrentOutputPath();
+      Path outDir = job.getOutputPath();
       for(int i=0; i < result.length; ++i) {
         result[i] = new FileSplit(new Path(outDir, "dummy-split-" + i), 0, 1, job);
       }

+ 1 - 1
src/examples/org/apache/hadoop/examples/Sort.java

@@ -140,7 +140,7 @@ public class Sort extends Configured implements Tool {
         cluster.getTaskTrackers() +
         " nodes to sort from " + 
         jobConf.getInputPaths()[0] + " into " +
-        jobConf.getCurrentOutputPath() + " with " + num_reduces + " reduces.");
+        jobConf.getOutputPath() + " with " + num_reduces + " reduces.");
     Date startTime = new Date();
     System.out.println("Job started: " + startTime);
     JobClient.runJob(jobConf);

+ 6 - 37
src/java/org/apache/hadoop/mapred/JobConf.java

@@ -353,20 +353,7 @@ public class JobConf extends Configuration {
   }
   
   /**
-   * @deprecated Please use {@link #getCurrentOutputPath()} 
-   *             or {@link #getFinalOutputPath()} 
-   *             
-   * @return the {@link Path} to the output directory for the map-reduce job.
-   */
-  @Deprecated
-  public Path getOutputPath() {
-    return getCurrentOutputPath();
-  }
-
-  /**
-   * Get the {@link Path} to the output directory for the map-reduce job
-   * (This is sensitive to the task execution. While executing task, this 
-   * value points to the task's temporary output directory)
+   * Get the {@link Path} to the output directory for the map-reduce job.
    * 
    * <h4 id="SideEffectFiles">Tasks' Side-Effect Files</h4>
    * 
@@ -391,44 +378,28 @@ public class JobConf extends Configuration {
    * 
    * <p>The application-writer can take advantage of this by creating any 
    * side-files required in <tt>${mapred.output.dir}</tt> during execution of his 
-   * reduce-task i.e. via {@link #getCurrentOutputPath()}, 
-   * and the framework will move them out similarly 
-   * - thus she doesn't have to pick unique paths per task-attempt.</p>
+   * reduce-task i.e. via {@link #getOutputPath()}, and the framework will move 
+   * them out similarly - thus she doesn't have to pick unique paths per 
+   * task-attempt.</p>
    * 
    * <p><i>Note</i>: the value of <tt>${mapred.output.dir}</tt> during execution 
    * of a particular task-attempt is actually 
    * <tt>${mapred.output.dir}/_temporary/_{$taskid}</tt>, not the value set by 
    * {@link #setOutputPath(Path)}. So, just create any side-files in the path 
-   * returned by {@link #getCurrentOutputPath()} from map/reduce task to take 
+   * returned by {@link #getOutputPath()} from map/reduce task to take 
    * advantage of this feature.</p>
    * 
    * <p>The entire discussion holds true for maps of jobs with 
    * reducer=NONE (i.e. 0 reduces) since output of the map, in that case, 
    * goes directly to HDFS.</p> 
    * 
-   * @see #getFinalOutputPath()
-   * 
    * @return the {@link Path} to the output directory for the map-reduce job.
    */
-  public Path getCurrentOutputPath() { 
+  public Path getOutputPath() { 
     String name = get("mapred.output.dir");
     return name == null ? null: new Path(name);
   }
 
-  /**
-   * Get the {@link Path} to the output directory for the map-reduce job
-   * 
-   * This is the actual configured output path set 
-   * using {@link #setOutputPath(Path)} for job submission.
-   * 
-   * @see #getCurrentOutputPath()
-   * @return the {@link Path} to the output directory for the map-reduce job.
-   */
-  public Path getFinalOutputPath() { 
-    String name = get("mapred.final.output.dir");
-    return name == null ? null: new Path(name);
-  }
-
   /**
    * Set the {@link Path} of the output directory for the map-reduce job.
    * 
@@ -439,8 +410,6 @@ public class JobConf extends Configuration {
   public void setOutputPath(Path dir) {
     dir = new Path(getWorkingDirectory(), dir);
     set("mapred.output.dir", dir.toString());
-    if (get("mapred.final.output.dir") == null)
-      set("mapred.final.output.dir", dir.toString());
   }
 
   /**
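The JobConf.java hunk above restores the pre-HADOOP-3041 contract: `getOutputPath()` reads `mapred.output.dir` directly (returning null when unset), and `setOutputPath(Path)` no longer mirrors the value into `mapred.final.output.dir`. A toy stand-in for that configuration behaviour, using a plain map instead of the real `Configuration`/`Path` classes:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of JobConf's output-path properties after the revert.
public class OutputPathModel {
    private final Map<String, String> props = new HashMap<>();

    // Reverted behaviour: only mapred.output.dir is written; the
    // mapred.final.output.dir bookkeeping from HADOOP-3041 is gone.
    public void setOutputPath(String dir) {
        props.put("mapred.output.dir", dir);
    }

    // Reads mapred.output.dir; null if the job never set an output path.
    public String getOutputPath() {
        return props.get("mapred.output.dir");
    }

    public String get(String key) {
        return props.get(key);
    }
}
```

The null return matters to the callers changed in this commit (JobInProgress, LocalJobRunner, TaskTracker, GenericMRLoadGenerator): each guards with `if (outputPath != null)` so that jobs without an output directory skip temp-dir creation rather than fail.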

+ 2 - 2
src/java/org/apache/hadoop/mapred/JobInProgress.java

@@ -277,7 +277,7 @@ class JobInProgress {
     }
 
     // create job specific temporary directory in output path
-    Path outputPath = conf.getCurrentOutputPath();
+    Path outputPath = conf.getOutputPath();
     if (outputPath != null) {
       Path tmpDir = new Path(outputPath, MRConstants.TEMP_DIR_NAME);
       FileSystem fileSys = tmpDir.getFileSystem(conf);
@@ -1141,7 +1141,7 @@ class JobInProgress {
       fs.delete(tempDir); 
 
       // delete the temporary directory in output directory
-      Path outputPath = conf.getCurrentOutputPath();
+      Path outputPath = conf.getOutputPath();
       if (outputPath != null) {
         Path tmpDir = new Path(outputPath, MRConstants.TEMP_DIR_NAME);
         FileSystem fileSys = tmpDir.getFileSystem(conf);

+ 1 - 1
src/java/org/apache/hadoop/mapred/LocalJobRunner.java

@@ -114,7 +114,7 @@ class LocalJobRunner implements JobSubmissionProtocol {
           job.setNumReduceTasks(1);
         }
         // create job specific temp directory in output path
-        Path outputPath = job.getCurrentOutputPath();
+        Path outputPath = job.getOutputPath();
         FileSystem outputFs = null;
         Path tmpDir = null;
         if (outputPath != null) {

+ 1 - 1
src/java/org/apache/hadoop/mapred/MapFileOutputFormat.java

@@ -42,7 +42,7 @@ public class MapFileOutputFormat extends OutputFormatBase {
                                       String name, Progressable progress)
     throws IOException {
 
-    Path outputPath = job.getCurrentOutputPath();
+    Path outputPath = job.getOutputPath();
     FileSystem fs = outputPath.getFileSystem(job);
     if (!fs.exists(outputPath)) {
       throw new IOException("Output directory doesnt exist");

+ 1 - 1
src/java/org/apache/hadoop/mapred/OutputFormatBase.java

@@ -100,7 +100,7 @@ public abstract class OutputFormatBase<K extends WritableComparable,
     throws FileAlreadyExistsException, 
            InvalidJobConfException, IOException {
     // Ensure that the output directory is set and not already there
-    Path outDir = job.getCurrentOutputPath();
+    Path outDir = job.getOutputPath();
     if (outDir == null && job.getNumReduceTasks() != 0) {
       throw new InvalidJobConfException("Output directory not set in JobConf.");
     }

+ 1 - 1
src/java/org/apache/hadoop/mapred/SequenceFileOutputFormat.java

@@ -40,7 +40,7 @@ public class SequenceFileOutputFormat extends OutputFormatBase {
                                       String name, Progressable progress)
     throws IOException {
 
-    Path outputPath = job.getCurrentOutputPath();
+    Path outputPath = job.getOutputPath();
     FileSystem fs = outputPath.getFileSystem(job);
     if (!fs.exists(outputPath)) {
       throw new IOException("Output directory doesnt exist");

+ 3 - 3
src/java/org/apache/hadoop/mapred/Task.java

@@ -190,7 +190,7 @@ abstract class Task implements Writable, Configurable {
   public String toString() { return taskId; }
 
   private Path getTaskOutputPath(JobConf conf) {
-    Path p = new Path(conf.getCurrentOutputPath(), 
+    Path p = new Path(conf.getOutputPath(), 
       (MRConstants.TEMP_DIR_NAME + Path.SEPARATOR + "_" + taskId));
     try {
       FileSystem fs = p.getFileSystem(conf);
@@ -212,7 +212,7 @@ abstract class Task implements Writable, Configurable {
     conf.set("mapred.job.id", jobId);
     
     // The task-specific output path
-    if (conf.getCurrentOutputPath() != null) {
+    if (conf.getOutputPath() != null) {
       taskOutputPath = getTaskOutputPath(conf);
       conf.setOutputPath(taskOutputPath);
     }
@@ -397,7 +397,7 @@ abstract class Task implements Writable, Configurable {
       this.conf = (JobConf) conf;
 
       if (taskId != null && taskOutputPath == null && 
-              this.conf.getCurrentOutputPath() != null) {
+              this.conf.getOutputPath() != null) {
         taskOutputPath = getTaskOutputPath(this.conf);
       }
     } else {

+ 1 - 1
src/java/org/apache/hadoop/mapred/TaskTracker.java

@@ -1420,7 +1420,7 @@ public class TaskTracker
       keepFailedTaskFiles = localJobConf.getKeepFailedTaskFiles();
 
       // create _taskid directory in output path temporary directory.
-      Path outputPath = localJobConf.getCurrentOutputPath();
+      Path outputPath = localJobConf.getOutputPath();
       if (outputPath != null) {
         Path jobTmpDir = new Path(outputPath, MRConstants.TEMP_DIR_NAME);
         FileSystem fs = jobTmpDir.getFileSystem(localJobConf);

+ 1 - 1
src/java/org/apache/hadoop/mapred/TextOutputFormat.java

@@ -106,7 +106,7 @@ public class TextOutputFormat<K extends WritableComparable,
                                                   Progressable progress)
     throws IOException {
 
-    Path dir = job.getCurrentOutputPath();
+    Path dir = job.getOutputPath();
     FileSystem fs = dir.getFileSystem(job);
     if (!fs.exists(dir)) {
       throw new IOException("Output directory doesnt exist");

+ 1 - 1
src/test/org/apache/hadoop/io/FileBench.java

@@ -112,7 +112,7 @@ public class FileBench extends Configured implements Tool {
     Text val = new Text();
 
     final String fn = conf.get("test.filebench.name", "");
-    final Path outd = conf.getCurrentOutputPath();
+    final Path outd = conf.getOutputPath();
     OutputFormat outf = conf.getOutputFormat();
     RecordWriter<Text,Text> rw =
       outf.getRecordWriter(outd.getFileSystem(conf), conf, fn,

+ 1 - 1
src/test/org/apache/hadoop/mapred/GenericMRLoadGenerator.java

@@ -140,7 +140,7 @@ public class GenericMRLoadGenerator extends Configured implements Tool {
       return -1;
     }
 
-    if (null == job.getCurrentOutputPath()) {
+    if (null == job.getOutputPath()) {
       // No output dir? No writes
       job.setOutputFormat(NullOutputFormat.class);
     }

+ 1 - 1
src/test/org/apache/hadoop/mapred/MRBench.java

@@ -184,7 +184,7 @@ public class MRBench {
 
       LOG.info("Running job " + i + ":" +
                " input=" + jobConf.getInputPaths()[0] + 
-               " output=" + jobConf.getCurrentOutputPath());
+               " output=" + jobConf.getOutputPath());
       
       // run the mapred task now 
       long curTime = System.currentTimeMillis();

+ 2 - 3
src/test/org/apache/hadoop/mapred/SortValidator.java

@@ -351,7 +351,7 @@ public class SortValidator extends Configured implements Tool {
                          "from " + jobConf.getInputPaths()[0] + " (" + 
                          noSortInputpaths + " files), " + 
                          jobConf.getInputPaths()[1] + " (" + noSortReduceTasks + 
-                         " files) into " + jobConf.getCurrentOutputPath() + 
+                         " files) into " + jobConf.getOutputPath() + 
                          " with 1 reducer.");
       Date startTime = new Date();
       System.out.println("Job started: " + startTime);
@@ -492,8 +492,7 @@ public class SortValidator extends Configured implements Tool {
       System.out.println("\nSortValidator.RecordChecker: Running on " +
                          cluster.getTaskTrackers() +
                          " nodes to validate sort from " + jobConf.getInputPaths()[0] + ", " + 
-                         jobConf.getInputPaths()[1] + " into " +
-                         jobConf.getCurrentOutputPath() + 
+                         jobConf.getInputPaths()[1] + " into " + jobConf.getOutputPath() + 
                          " with " + noReduces + " reduces.");
       Date startTime = new Date();
       System.out.println("Job started: " + startTime);

+ 1 - 1
src/test/org/apache/hadoop/mapred/ThreadedMapBenchmark.java

@@ -78,7 +78,7 @@ public class ThreadedMapBenchmark extends Configured implements Tool {
     public InputSplit[] getSplits(JobConf job, 
                                   int numSplits) throws IOException {
       InputSplit[] result = new InputSplit[numSplits];
-      Path outDir = job.getCurrentOutputPath();
+      Path outDir = job.getOutputPath();
       for(int i=0; i < result.length; ++i) {
         result[i] = new FileSplit(new Path(outDir, "dummy-split-" + i), 0, 1, 
                                   job);
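RandomWriter and ThreadedMapBenchmark (hunks above) share a pattern worth noting: since these jobs generate data rather than read it, `getSplits()` fabricates one synthetic split per map, each naming a dummy file under the job's output directory. A dependency-free sketch of that split fabrication, with plain strings standing in for Hadoop's `Path`/`FileSplit`:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the dummy-split pattern: one fake split per requested map task.
public class DummySplits {
    static List<String> makeDummySplits(String outDir, int numSplits) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < numSplits; ++i) {
            // In the real code each entry is a FileSplit over a nonexistent
            // 1-byte file; only the count matters, to drive numSplits map tasks.
            result.add(outDir + "/dummy-split-" + i);
        }
        return result;
    }
}
```

This is why both classes needed the rename in this commit: the dummy paths are rooted at the job's output directory, so they call the output-path getter even though no input is ever read.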

Some files were not shown because too many files changed in this diff