HDFS-9048. DistCp documentation is out-of-dated (Daisuke Kobayashi via iwasakims)

(cherry picked from commit 33a412e8a4ab729d588a9576fb7eb90239c6e383)
Masatake Iwasaki 9 years ago
parent
commit
8095c612a3
2 changed files with 10 additions and 6 deletions
  1. hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt (+3 -0)
  2. hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm (+7 -6)

+ 3 - 0
hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

@@ -1963,6 +1963,9 @@ Release 2.7.3 - UNRELEASED
     HDFS-8791. block ID-based DN storage layout can be very slow for datanode
     on ext4 (Chris Trezzo via kihwal)
 
+    HDFS-9048. DistCp documentation is out-of-dated
+    (Daisuke Kobayashi via iwasakims)
+
   OPTIMIZATIONS
 
   BUG FIXES

+ 7 - 6
hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm

@@ -412,12 +412,13 @@ $H3 Map sizing
 
 $H3 Copying Between Versions of HDFS
 
-  For copying between two different versions of Hadoop, one will usually use
-  HftpFileSystem. This is a read-only FileSystem, so DistCp must be run on the
-  destination cluster (more specifically, on NodeManagers that can write to the
-  destination cluster). Each source is specified as
-  `hftp://<dfs.http.address>/<path>` (the default `dfs.http.address` is
-  `<namenode>:50070`).
+  For copying between two different major versions of Hadoop (e.g. between 1.X
+  and 2.X), one will usually use WebHdfsFileSystem. Unlike the previous
+  HftpFileSystem, as webhdfs is available for both read and write operations,
+  DistCp can be run on both source and destination cluster.
+  Remote cluster is specified as `webhdfs://<namenode_hostname>:<http_port>`.
+  When copying between same major versions of Hadoop cluster (e.g. between 2.X
+  and 2.X), use hdfs protocol for better performance.
 
 $H3 MapReduce and other side-effects
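
As a rough illustration of the updated guidance (not part of the commit itself), the two cases might be invoked as sketched below. The host names and paths are hypothetical placeholders. 50070 is the NameNode HTTP port mentioned in the removed hftp text, and 8020 is only an assumed NameNode RPC port. The script prints the commands instead of running them, so it can be checked without a cluster.

```shell
#!/bin/sh
# Hypothetical values for illustration only; substitute your own clusters.
SRC_NN="nn1.example.com"    # source NameNode host (assumed)
DST_NN="nn2.example.com"    # destination NameNode host (assumed)
HTTP_PORT="50070"           # NameNode HTTP port, as cited in the removed hftp text
RPC_PORT="8020"             # NameNode RPC port (assumed; check fs.defaultFS)
SRC_PATH="/user/data"       # hypothetical source path
DST_PATH="/user/data"       # hypothetical destination path

# Different major versions (e.g. 1.X to 2.X): address the remote cluster
# through webhdfs, which supports both reads and writes.
cross_version_cmd="hadoop distcp webhdfs://${SRC_NN}:${HTTP_PORT}${SRC_PATH} hdfs://${DST_NN}:${RPC_PORT}${DST_PATH}"

# Same major version (e.g. 2.X to 2.X): use the hdfs protocol on both
# sides for better performance.
same_version_cmd="hadoop distcp hdfs://${SRC_NN}:${RPC_PORT}${SRC_PATH} hdfs://${DST_NN}:${RPC_PORT}${DST_PATH}"

# Dry run: show the commands without executing them.
echo "$cross_version_cmd"
echo "$same_version_cmd"
```

Here the job is assumed to run on the destination cluster, so only the source is addressed through webhdfs. Per the new text, it could equally run on the source cluster with the destination given as a `webhdfs://` URI.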