
YARN-7249. Fix CapacityScheduler NPE issue when a container preempted while the node is being removed. Contributed by Wangda Tan.

(cherry picked from commit f794adde3bb3488d876ebc8de3796956de503e0d)
Eric Payne 7 years ago
parent
commit
76e053a910

+ 6 - 0
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java

@@ -1585,6 +1585,12 @@ public class CapacityScheduler extends
     
     // Get the node on which the container was allocated
     FiCaSchedulerNode node = getNode(container.getNodeId());
+    if (node == null) {
+      LOG.info("Container=" + container + " of application=" + appId
+          + " completed with event=" + event + " on a node=" + container
+          .getNodeId() + ". However the node might be already removed by RM.");
+      return;
+    }
     
     // Inform the queue
     LeafQueue queue = (LeafQueue)application.getQueue();
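
The guard added in this hunk can be illustrated outside of YARN with a minimal, self-contained sketch. It is not the CapacityScheduler code: the class `NodeRemovalGuardSketch`, its map-based node registry, and the methods `addNode`, `removeNode`, `completedContainer`, and `releasedOn` are hypothetical names chosen for illustration. The sketch only assumes what the diff shows: the node lookup can return null when the node has already been removed, and the completion handler should log and return instead of dereferencing it.

```java
import java.util.HashMap;
import java.util.Map;

public class NodeRemovalGuardSketch {
  // Hypothetical stand-in for the scheduler's node registry:
  // node id -> number of containers released on that node.
  private final Map<String, Integer> releasedByNode = new HashMap<>();

  public void addNode(String nodeId) {
    releasedByNode.put(nodeId, 0);
  }

  public void removeNode(String nodeId) {
    releasedByNode.remove(nodeId);
  }

  /**
   * Handles a container-completion event.
   * Returns true if node bookkeeping ran, false if it was skipped
   * because the node is no longer registered.
   */
  public boolean completedContainer(String containerId, String nodeId) {
    Integer released = releasedByNode.get(nodeId);
    if (released == null) {
      // Same idea as the patch: the node may already be gone, so log
      // and return early rather than dereference a missing node.
      System.out.println("Container=" + containerId + " completed on node="
          + nodeId + ". However the node might be already removed.");
      return false;
    }
    releasedByNode.put(nodeId, released + 1);
    return true;
  }

  public int releasedOn(String nodeId) {
    return releasedByNode.getOrDefault(nodeId, 0);
  }

  public static void main(String[] args) {
    NodeRemovalGuardSketch scheduler = new NodeRemovalGuardSketch();
    scheduler.addNode("node-1");
    scheduler.completedContainer("container-1", "node-1"); // normal path
    scheduler.removeNode("node-1");
    scheduler.completedContainer("container-2", "node-1"); // guarded path
  }
}
```

Without the null check, the second call in `main` would be the analogue of the race described in the commit message: a completion event arriving for a container whose node has just been removed.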