
HADOOP-3328. When client is writing data to DFS, only the last
datanode in the pipeline needs to verify the checksum. Saves around
30% CPU on intermediate datanodes. (rangadi)


git-svn-id: https://svn.apache.org/repos/asf/hadoop/core/trunk@675265 13f79535-47bb-0310-9956-ffa450edef68

Raghu Angadi, 17 years ago
Commit
ac1f5630d1
2 changed files with 14 additions and 1 deletion
  1. +4 −0
      CHANGES.txt
  2. +10 −1
      src/hdfs/org/apache/hadoop/hdfs/server/datanode/DataNode.java

+ 4 - 0
CHANGES.txt

@@ -72,6 +72,10 @@ Trunk (unreleased changes)
     singleton MessageDigester by an instance per Thread using 
     ThreadLocal. (Iván de Prado via omalley)
 
+    HADOOP-3328. When client is writing data to DFS, only the last 
+    datanode in the pipeline needs to verify the checksum. Saves around
+    30% CPU on intermediate datanodes. (rangadi)
+
   BUG FIXES
 
     HADOOP-3563.  Refactor the distributed upgrade code so that it is 

+ 10 - 1
src/hdfs/org/apache/hadoop/hdfs/server/datanode/DataNode.java

@@ -2652,7 +2652,16 @@ public class DataNode extends Configured
 
         buf.position(buf.limit()); // move to the end of the data.
 
-        verifyChunks(pktBuf, dataOff, len, pktBuf, checksumOff);
+        /* Skip verifying the checksum iff this is not the last datanode
+         * in the pipeline and clientName is non-empty; i.e. the checksum
+         * is verified on all the datanodes when the data is being written
+         * by a datanode rather than a client. When a client is writing
+         * the data, the protocol includes acks and only the last
+         * datanode needs to verify the checksum.
+         */
+        if (mirrorOut == null || clientName.length() == 0) {
+          verifyChunks(pktBuf, dataOff, len, pktBuf, checksumOff);
+        }
 
         try {
           if (!finalized) {
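
The decision made by the patched `if` can be sketched in isolation: a datanode verifies the checksum only when it is the last node in the pipeline (no mirror to forward to) or when the writer is another datanode (empty client name, e.g. block replication). The names `mirrorOut` and `clientName` come from the patch, but this standalone class is a hypothetical illustration, not the real `DataNode` code.

```java
// Minimal standalone sketch of the HADOOP-3328 verification policy.
// Not the actual DataNode class; the real check guards a verifyChunks() call.
public class ChecksumPolicy {

    /**
     * Returns true when this datanode should verify the packet checksum:
     * either it is the last node in the pipeline (mirrorOut == null), or
     * the writer is another datanode rather than a client (empty
     * clientName), where the protocol carries no per-packet acks.
     */
    static boolean shouldVerifyChecksum(Object mirrorOut, String clientName) {
        return mirrorOut == null || clientName.length() == 0;
    }

    public static void main(String[] args) {
        Object mirror = new Object(); // stands in for the downstream stream

        // Client write: only the last datanode (no mirror) verifies.
        System.out.println(shouldVerifyChecksum(null, "DFSClient_1"));   // true
        System.out.println(shouldVerifyChecksum(mirror, "DFSClient_1")); // false

        // Datanode-to-datanode replication: every node verifies.
        System.out.println(shouldVerifyChecksum(mirror, ""));            // true
    }
}
```

With intermediate datanodes relieved of checksum computation, the client-side acks still guarantee end-to-end integrity, since the last node in the pipeline always verifies.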