Merge -r 755937:755938 from trunk onto 0.20 branch. Fixes HADOOP-5522.

git-svn-id: https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.20@755956 13f79535-47bb-0310-9956-ffa450edef68
Devaraj Das 16 years ago
parent
commit
9a182c0f21
2 changed files with 18 additions and 1 deletion
  1. 3 0
      CHANGES.txt
  2. 15 1
      src/docs/src/documentation/content/xdocs/mapred_tutorial.xml

+ 3 - 0
CHANGES.txt

@@ -822,6 +822,9 @@ Release 0.19.2 - Unreleased
     HADOOP-5259. Job with output hdfs:/user/<username>/outputpath (no 
     authority) fails with Wrong FS. (Doug Cutting via hairong)
 
+    HADOOP-5522. Documents the setup/cleanup tasks in the mapred tutorial.
+    (Amareshwari Sriramadasu via ddas)
+
 Release 0.19.1 - 2009-02-23 
 
   IMPROVEMENTS

+ 15 - 1
src/docs/src/documentation/content/xdocs/mapred_tutorial.xml

@@ -1591,13 +1591,20 @@
             Setup the job during initialization. For example, create
             the temporary output directory for the job during the
             initialization of the job. 
+            Job setup is done by a separate task when the job is
+            in PREP state and after initializing tasks. Once the setup task
+            completes, the job will be moved to RUNNING state.
           </li>
           <li>
             Cleanup the job after the job completion. For example, remove the
             temporary output directory after the job completion.
+            Job cleanup is done by a separate task at the end of the job.
+            The job is declared SUCCEEDED/FAILED/KILLED after the cleanup
+            task completes.
           </li>
           <li>
             Setup the task temporary output.
+            Task setup is done as part of the same task, during task initialization.
           </li> 
           <li>
             Check whether a task needs a commit. This is to avoid the commit
@@ -1605,13 +1612,20 @@
           </li>
           <li>
             Commit of the task output. 
+            Once the task is done, it will commit its output if required.
           </li> 
           <li>
             Discard the task commit.
+            If the task has failed or been killed, its output will be cleaned up.
+            If the task could not clean up (in the exception block), a separate task
+            will be launched with the same attempt-id to do the cleanup.
           </li>
         </ol>
         <p><code>FileOutputCommitter</code> is the default 
-        <code>OutputCommitter</code>.</p>
+        <code>OutputCommitter</code>. Job setup/cleanup tasks occupy 
+        map or reduce slots, whichever is free on the TaskTracker. The
+        JobCleanup task, TaskCleanup tasks and JobSetup task have the highest
+        priority, in that order.</p>
         </section>
  
         <section>