HADOOP-343. Fix mapred copying so that a failed tasktracker does not slow other copies. Contributed by Sameer.

git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk@452945 13f79535-47bb-0310-9956-ffa450edef68
Doug Cutting 19 years ago
parent
commit
e65689b77c
2 changed files with 17 additions and 0 deletions
  1. CHANGES.txt (+3, -0)
  2. src/java/org/apache/hadoop/mapred/ReduceTaskRunner.java (+14, -0)

+ 3 - 0
CHANGES.txt

@@ -132,6 +132,9 @@ Trunk (unreleased changes)
     permits, e.g., TextInputFormat to again operate on non-UTF-8 data.
     (Hairong and Mahadev via cutting)
 
+32. HADOOP-343.  Fix mapred copying so that a failed tasktracker
+    doesn't cause other copies to slow.  (Sameer Paranjpye via cutting)
+
 
 Release 0.6.2 - 2006-09-18
 

+ 14 - 0
src/java/org/apache/hadoop/mapred/ReduceTaskRunner.java

@@ -455,6 +455,20 @@ class ReduceTaskRunner extends TaskRunner {
             LOG.warn(reduceTask.getTaskId() + " adding host " +
                      cr.getHost() + " to penalty box, next contact in " +
                      ((nextContact-currentTime)/1000) + " seconds");
+
+            // other outputs from the failed host may be present in the
+            // knownOutputs cache, purge them. This is important in case
+            // the failure is due to a lost tasktracker (causes many
+            // unnecessary backoffs). If not, we only take a small hit
+            // polling the jobtracker a few more times
+            ListIterator locIt = knownOutputs.listIterator();
+            while (locIt.hasNext()) {
+              MapOutputLocation loc = (MapOutputLocation)locIt.next();
+              if (cr.getHost().equals(loc.getHost())) {
+                locIt.remove();
+                neededOutputs.add(new Integer(loc.getMapId()));
+              }
+            }
           }
           uniqueHosts.remove(cr.getHost());
           numInFlight--;
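
The purge step added above can be tried in isolation. The following is a minimal sketch, not the Hadoop code itself: `PenaltyBoxPurge`, `purgeHost`, and the simplified `MapOutputLocation` are hypothetical stand-ins, and the collections are assumed to be a plain `List` and `Set`. It shows the same idea as the patch: when a copy from a host fails, every cached location on that host is dropped and its map ID is put back on the needed list, so the reducer asks the jobtracker again instead of backing off once per stale entry.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class PenaltyBoxPurge {

  /** Hypothetical, simplified stand-in for Hadoop's MapOutputLocation. */
  static final class MapOutputLocation {
    private final int mapId;
    private final String host;

    MapOutputLocation(int mapId, String host) {
      this.mapId = mapId;
      this.host = host;
    }

    int getMapId() { return mapId; }
    String getHost() { return host; }
  }

  /**
   * Removes every cached location served by failedHost and re-queues its
   * map ID in neededOutputs. Returns the number of purged entries.
   */
  static int purgeHost(String failedHost,
                       List<MapOutputLocation> knownOutputs,
                       Set<Integer> neededOutputs) {
    int purged = 0;
    Iterator<MapOutputLocation> it = knownOutputs.iterator();
    while (it.hasNext()) {
      MapOutputLocation loc = it.next();
      if (failedHost.equals(loc.getHost())) {
        it.remove();                       // safe removal while iterating
        neededOutputs.add(loc.getMapId()); // look this output up again later
        purged++;
      }
    }
    return purged;
  }

  public static void main(String[] args) {
    List<MapOutputLocation> known = new ArrayList<>();
    known.add(new MapOutputLocation(1, "hostA"));
    known.add(new MapOutputLocation(2, "hostB"));
    known.add(new MapOutputLocation(3, "hostA"));
    Set<Integer> needed = new TreeSet<>();

    int purged = purgeHost("hostA", known, needed);
    // prints "2 purged, needed again: [1, 3], still cached: 1"
    System.out.println(purged + " purged, needed again: " + needed
        + ", still cached: " + known.size());
  }
}
```

Removing through the iterator avoids a `ConcurrentModificationException`, which is why the patch calls `locIt.remove()` rather than `knownOutputs.remove(loc)` inside the loop. If the host failure was transient, the only cost is a few extra jobtracker polls, as the comment in the patch notes.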