
Merge -r 755937:755938 from trunk onto 0.19 branch. Fixes HADOOP-5522.

git-svn-id: https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19@755955 13f79535-47bb-0310-9956-ffa450edef68
Devaraj Das, 16 years ago
Parent
Commit
e9dad147f4
2 files changed, 19 insertions and 2 deletions
  1. CHANGES.txt (+4, -1)
  2. src/docs/src/documentation/content/xdocs/mapred_tutorial.xml (+15, -1)

+ 4 - 1
CHANGES.txt

@@ -77,7 +77,10 @@ Release 0.19.2 - Unreleased
     HADOOP-5479. NameNode should not send empty block replication request to
     DataNode. (hairong)
 
-Release 0.19.1 - 2009-02-23
+    HADOOP-5522. Documents the setup/cleanup tasks in the mapred tutorial.
+    (Amareshwari Sriramadasu via ddas)
+
+Release 0.19.1 - 2009-02-23 
 
     HADOOP-5225. Workaround for tmp file handling in HDFS. sync() is 
     incomplete as a result. committed only to 0.19.x. (Raghu Angadi)

+ 15 - 1
src/docs/src/documentation/content/xdocs/mapred_tutorial.xml

@@ -1591,13 +1591,20 @@
             Setup the job during initialization. For example, create
             the temporary output directory for the job during the
             initialization of the job. 
+            Job setup is done by a separate task when the job is
+            in PREP state and after initializing tasks. Once the setup task
+            completes, the job will be moved to RUNNING state.
           </li>
           <li>
             Cleanup the job after the job completion. For example, remove the
             temporary output directory after the job completion.
+            Job cleanup is done by a separate task at the end of the job.
+            The job is declared SUCCEEDED/FAILED/KILLED after the cleanup
+            task completes.
           </li>
           <li>
             Setup the task temporary output.
+            Task setup is done as part of the same task, during task initialization.
           </li> 
           <li>
             Check whether a task needs a commit. This is to avoid the commit
@@ -1605,13 +1612,20 @@
           </li>
           <li>
             Commit of the task output. 
+            Once the task is done, it will commit its output if required.
           </li> 
           <li>
             Discard the task commit.
+            If the task has failed or been killed, the output will be cleaned up.
+            If the task could not clean up (in the exception block), a separate task
+            will be launched with the same attempt-id to do the cleanup.
           </li>
         </ol>
         <p><code>FileOutputCommitter</code> is the default 
-        <code>OutputCommitter</code>.</p>
+        <code>OutputCommitter</code>. Job setup/cleanup tasks occupy 
+        map or reduce slots, whichever is free on the TaskTracker. The
+        JobCleanup task, TaskCleanup tasks and JobSetup task have the highest
+        priority, in that order.</p>
         </section>
  
         <section>