
HADOOP-1043. Optimize shuffle, increasing parallelism. Contributed by Devaraj.

git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk@512936 13f79535-47bb-0310-9956-ffa450edef68
Doug Cutting 18 years ago
parent
commit
5aada819c8
2 changed files with 12 additions and 1 deletion
  1. CHANGES.txt (+3 -0)
  2. src/java/org/apache/hadoop/mapred/ReduceTaskRunner.java (+9 -1)

+ 3 - 0
CHANGES.txt

@@ -159,6 +159,9 @@ Trunk (unreleased changes)
 47. HADOOP-972.  Optimize HDFS's rack-aware block placement algorithm.
     (Hairong Kuang via cutting)
 
+48. HADOOP-1043.  Optimize shuffle, increasing parallelism.
+    (Devaraj Das via cutting)
+
 
 Release 0.11.2 - 2007-02-16
 

+ 9 - 1
src/java/org/apache/hadoop/mapred/ReduceTaskRunner.java

@@ -608,9 +608,17 @@ class ReduceTaskRunner extends TaskRunner implements MRConstants {
           numInFlight--;
         }
         
+        boolean busy = true;
         // ensure we have enough to keep us busy
         if (numInFlight < lowThreshold && (numOutputs-numCopied) > probe_sample_size) {
-          break;
+          busy = false;
+        }
+        //Check whether we have more CopyResult to check. If there is none, and
+        //we are not busy enough, break
+        synchronized (copyResults) {
+          if (copyResults.size() == 0 && !busy) {
+            break;
+          }
         }
       }
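
The hunk above replaces an unconditional `break` with a two-part test: the loop now exits only when the runner is not busy enough and no `CopyResult` is waiting to be processed. The following is a minimal, standalone sketch of that decision, not the actual `ReduceTaskRunner` code. The class name `ShuffleBreakSketch` and the method `shouldStopDraining` are hypothetical, and the parameters only mirror the variable names visible in the diff.

```java
import java.util.List;

// Hypothetical helper that isolates the exit condition shown in the diff.
final class ShuffleBreakSketch {

    // Returns true when the result-processing loop should break:
    // too few copies are in flight, enough outputs remain to be fetched,
    // and there is no pending copy result left to examine.
    static boolean shouldStopDraining(int numInFlight, int lowThreshold,
                                      int numOutputs, int numCopied,
                                      int probeSampleSize,
                                      List<?> copyResults) {
        boolean busy = true;
        // Same test as in the diff: not enough work in flight to stay busy.
        if (numInFlight < lowThreshold
                && (numOutputs - numCopied) > probeSampleSize) {
            busy = false;
        }
        // Check the shared result list under its lock, as the patch does.
        synchronized (copyResults) {
            return copyResults.isEmpty() && !busy;
        }
    }
}
```

With this condition, a pending entry in `copyResults` keeps the loop running even when `numInFlight` has dropped below `lowThreshold`, so finished copies are handled before control leaves the loop.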