Merge -r 755937:755938 from trunk onto 0.19 branch. Fixes HADOOP-5522.

git-svn-id: https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19@755955 13f79535-47bb-0310-9956-ffa450edef68
Devaraj Das, 16 years ago
Parent
Commit
e9dad147f4
2 files changed, 19 insertions and 2 deletions
  1. CHANGES.txt (+4, -1)
  2. src/docs/src/documentation/content/xdocs/mapred_tutorial.xml (+15, -1)

+ 4 - 1
CHANGES.txt

@@ -77,7 +77,10 @@ Release 0.19.2 - Unreleased
     HADOOP-5479. NameNode should not send empty block replication request to
     DataNode. (hairong)
 
-Release 0.19.1 - 2009-02-23
+    HADOOP-5522. Documents the setup/cleanup tasks in the mapred tutorial.
+    (Amareshwari Sriramadasu via ddas)
+
+Release 0.19.1 - 2009-02-23 
 
     HADOOP-5225. Workaround for tmp file handling in HDFS. sync() is 
     incomplete as a result. committed only to 0.19.x. (Raghu Angadi)

+ 15 - 1
src/docs/src/documentation/content/xdocs/mapred_tutorial.xml

@@ -1591,13 +1591,20 @@
             Setup the job during initialization. For example, create
             the temporary output directory for the job during the
             initialization of the job. 
+            Job setup is done by a separate task when the job is
+            in PREP state and after initializing tasks. Once the setup task
+            completes, the job will be moved to RUNNING state.
           </li>
           <li>
             Cleanup the job after the job completion. For example, remove the
             temporary output directory after the job completion.
+            Job cleanup is done by a separate task at the end of the job.
+            Job is declared SUCCEDED/FAILED/KILLED after the cleanup
+            task completes.
           </li>
           <li>
             Setup the task temporary output.
+            Task setup is done as part of the same task, during task initialization.
           </li> 
           <li>
             Check whether a task needs a commit. This is to avoid the commit
@@ -1605,13 +1612,20 @@
           </li>
           <li>
             Commit of the task output. 
+            Once task is done, the task will commit it's output if required.  
           </li> 
           <li>
             Discard the task commit.
+            If the task has been failed/killed, the output will be cleaned-up. 
+            If task could not cleanup (in exception block), a separate task 
+            will be launched with same attempt-id to do the cleanup.
           </li>
         </ol>
         <p><code>FileOutputCommitter</code> is the default 
-        <code>OutputCommitter</code>.</p>
+        <code>OutputCommitter</code>. Job setup/cleanup tasks occupy 
+        map or reduce slots, whichever is free on the TaskTracker. And
+        JobCleanup task, TaskCleanup tasks and JobSetup task have the highest
+        priority, and in that order.</p>
         </section>
  
         <section>
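
Note on the documented ordering: the tutorial text added by this commit describes a sequence (job setup in PREP, per-task setup, commit or discard, then job cleanup before the final state is declared). The following is a minimal sketch of that sequence as a toy model, not Hadoop code. The class `CommitLifecycleSketch`, the method `runJob`, and the log strings are hypothetical names introduced only for illustration. The step names mirror the six `OutputCommitter` responsibilities listed in the tutorial, with the `needsTaskCommit` check left out. The sketch also assumes that any failed attempt fails the whole job, which is a simplification: real Hadoop retries failed attempts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Toy model of the ordering described in the tutorial text; not Hadoop code. */
final class CommitLifecycleSketch {

    /** Job states named in the documentation text (KILLED omitted for brevity). */
    enum JobState { PREP, RUNNING, SUCCEEDED, FAILED }

    /** Records each step in order so the sequence can be inspected. */
    static List<String> runJob(List<String> attemptIds, Set<String> failingAttempts) {
        List<String> log = new ArrayList<>();
        JobState state = JobState.PREP;

        // Job setup runs as a separate task while the job is still in PREP.
        log.add("setupJob [" + state + "]");
        state = JobState.RUNNING;

        boolean anyFailed = false;
        for (String id : attemptIds) {
            // Task setup happens inside the task itself, during initialization.
            log.add("setupTask " + id);
            if (failingAttempts.contains(id)) {
                // Failed or killed attempt: its output is discarded.
                log.add("abortTask " + id);
                anyFailed = true; // assumption: no retries in this toy model
            } else {
                // Successful attempt: commit (this toy always needs a commit).
                log.add("commitTask " + id);
            }
        }

        // Job cleanup runs as a separate task; the final state comes only afterwards.
        log.add("cleanupJob [" + state + "]");
        state = anyFailed ? JobState.FAILED : JobState.SUCCEEDED;
        log.add("final " + state);
        return log;
    }

    public static void main(String[] args) {
        for (String line : runJob(List.of("attempt_0", "attempt_1"), Set.of("attempt_1"))) {
            System.out.println(line);
        }
    }
}
```

Returning the steps as a list rather than printing them directly keeps the ordering easy to check: the first entry is always the job setup step and the last two are always the job cleanup step and the final state.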