
Merge -r 755937:755938 from trunk onto 0.20 branch. Fixes HADOOP-5522.

git-svn-id: https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.20@755956 13f79535-47bb-0310-9956-ffa450edef68
Devaraj Das 16 years ago
parent
commit
9a182c0f21

+ 3 - 0
CHANGES.txt

@@ -822,6 +822,9 @@ Release 0.19.2 - Unreleased
     HADOOP-5259. Job with output hdfs:/user/<username>/outputpath (no 
     authority) fails with Wrong FS. (Doug Cutting via hairong)
 
+    HADOOP-5522. Documents the setup/cleanup tasks in the mapred tutorial.
+    (Amareshwari Sriramadasu via ddas)
+
 Release 0.19.1 - 2009-02-23 
 
   IMPROVEMENTS

+ 15 - 1
src/docs/src/documentation/content/xdocs/mapred_tutorial.xml

@@ -1591,13 +1591,20 @@
             Setup the job during initialization. For example, create
             the temporary output directory for the job during the
             initialization of the job. 
+            Job setup is done by a separate task when the job is
+            in PREP state and after initializing tasks. Once the setup task
+            completes, the job will be moved to RUNNING state.
           </li>
           <li>
             Cleanup the job after the job completion. For example, remove the
             temporary output directory after the job completion.
+            Job cleanup is done by a separate task at the end of the job.
+            The job is declared SUCCEEDED/FAILED/KILLED after the cleanup
+            task completes.
           </li>
           <li>
             Setup the task temporary output.
+            Task setup is done as part of the same task, during task initialization.
           </li> 
           <li>
             Check whether a task needs a commit. This is to avoid the commit
@@ -1605,13 +1612,20 @@
           </li>
           <li>
             Commit of the task output. 
+            Once the task is done, it will commit its output if required.  
           </li> 
           <li>
             Discard the task commit.
+            If the task has failed or been killed, its output will be cleaned up. 
+            If the task could not clean up (in the exception block), a separate task 
+            will be launched with the same attempt-id to do the cleanup.
           </li>
         </ol>
         <p><code>FileOutputCommitter</code> is the default 
-        <code>OutputCommitter</code>.</p>
+        <code>OutputCommitter</code>. Job setup/cleanup tasks occupy 
+        map or reduce slots, whichever is free on the TaskTracker. The
+        JobCleanup task, TaskCleanup tasks and JobSetup task have the highest
+        priority, in that order.</p>
         </section>
  
         <section>
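
Reviewer note: the ordering that the added tutorial text describes (job setup while in PREP, task setup and commit or discard inside the task, job cleanup before the final state is declared) may be easier to follow as a toy model. The sketch below is not Hadoop code. `CommitterLifecycleSketch`, its `Committer` interface, `RecordingCommitter` and `runJob` are hypothetical names introduced only for illustration, and the model assumes a job with exactly one task. The method names are chosen to echo the six list items in the tutorial.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the documented lifecycle; this is not Hadoop code. */
public class CommitterLifecycleSketch {

    /** Job states named in the tutorial text. */
    enum JobState { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }

    /** Hypothetical stand-in for the six committer steps in the list. */
    interface Committer {
        void setupJob();
        void cleanupJob();
        void setupTask(String attemptId);
        boolean needsTaskCommit(String attemptId);
        void commitTask(String attemptId);
        void abortTask(String attemptId);
    }

    /** Records every call so the ordering can be inspected afterwards. */
    static class RecordingCommitter implements Committer {
        final List<String> calls = new ArrayList<>();
        @Override public void setupJob() { calls.add("setupJob"); }
        @Override public void cleanupJob() { calls.add("cleanupJob"); }
        @Override public void setupTask(String attemptId) { calls.add("setupTask"); }
        @Override public boolean needsTaskCommit(String attemptId) {
            calls.add("needsTaskCommit");
            return true; // assumption: the task always has output to commit
        }
        @Override public void commitTask(String attemptId) { calls.add("commitTask"); }
        @Override public void abortTask(String attemptId) { calls.add("abortTask"); }
    }

    /** Walks one single-task job through the documented steps. */
    static JobState runJob(Committer committer, String attemptId, boolean taskSucceeds) {
        JobState state = JobState.PREP;
        committer.setupJob();            // separate setup task, job still in PREP
        state = JobState.RUNNING;        // job moves to RUNNING once setup completes

        committer.setupTask(attemptId);  // done as part of the task itself
        if (taskSucceeds) {
            if (committer.needsTaskCommit(attemptId)) {
                committer.commitTask(attemptId);
            }
        } else {
            committer.abortTask(attemptId); // discard output of the failed task
        }

        committer.cleanupJob();          // separate cleanup task at the end of the job
        state = taskSucceeds ? JobState.SUCCEEDED : JobState.FAILED;
        return state;                    // final state is declared only after cleanup
    }

    public static void main(String[] args) {
        RecordingCommitter committer = new RecordingCommitter();
        JobState finalState = runJob(committer, "attempt_0", true);
        System.out.println(finalState + " " + committer.calls);
    }
}
```

The model leaves out the KILLED path, the separate cleanup attempt launched when a task cannot clean up after itself, and slot scheduling. Those depend on JobTracker behaviour that the diff only summarizes.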