14 years ago · fb8f15f3c9
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -37,6 +37,8 @@ Release 0.20.206.0 - unreleased
 
     MAPREDUCE-3343. TaskTracker Out of Memory because of distributed cache.
     (Zhao Yunjiong).
+    
+    HADOOP-7297. Remove docs for CN and BN, as they aren't present. (harsh)
 
   IMPROVEMENTS
 
--- a/src/docs/src/documentation/content/xdocs/hdfs_user_guide.xml
+++ b/src/docs/src/documentation/content/xdocs/hdfs_user_guide.xml
@@ -112,25 +112,9 @@
     		problems.
     	</li>
     	<li>
-    		Secondary NameNode (deprecated): performs periodic checkpoints of the 
+    		Secondary NameNode: performs periodic checkpoints of the
     		namespace and helps keep the size of file containing log of HDFS 
     		modifications within certain limits at the NameNode.
-    		Replaced by Checkpoint node.
-    	</li>
-    	<li>
-    		Checkpoint node: performs periodic checkpoints of the namespace and
-    		helps minimize the size of the log stored at the NameNode 
-    		containing changes to the HDFS.
-    		Replaces the role previously filled by the Secondary NameNode. 
-    		NameNode allows multiple Checkpoint nodes simultaneously, 
-    		as long as there are no Backup nodes registered with the system.
-    	</li>
-    	<li>
-    		Backup node: An extension to the Checkpoint node.
-    		In addition to checkpointing it also receives a stream of edits 
-    		from the NameNode and maintains its own in-memory copy of the namespace,
-    		which is always in sync with the active NameNode namespace state.
-    		Only one Backup node may be registered with the NameNode at once.
     	</li>
       </ul>
     </li>
@@ -232,12 +216,6 @@
    
    </section> 
 	<section> <title>Secondary NameNode</title>
-   <note>
-   The Secondary NameNode has been deprecated. 
-   Instead, consider using the 
-   <a href="hdfs_user_guide.html#Checkpoint+Node">Checkpoint Node</a> or 
-   <a href="hdfs_user_guide.html#Backup+Node">Backup Node</a>.
-   </note>
    <p>	
      The NameNode stores modifications to the file system as a log
      appended to a native file system file, <code>edits</code>. 
@@ -284,114 +262,6 @@
      For command usage, see  
      <a href="commands_manual.html#secondarynamenode">secondarynamenode</a>.
    </p>
-   
-   </section><section> <title> Checkpoint Node </title>
-   <p>NameNode persists its namespace using two files: <code>fsimage</code>,
-      which is the latest checkpoint of the namespace and <code>edits</code>,
-      a journal (log) of changes to the namespace since the checkpoint.
-      When a NameNode starts up, it merges the <code>fsimage</code> and
-      <code>edits</code> journal to provide an up-to-date view of the
-      file system metadata.
-      The NameNode then overwrites <code>fsimage</code> with the new HDFS state 
-      and begins a new <code>edits</code> journal. 
-   </p>
-   <p>
-     The Checkpoint node periodically creates checkpoints of the namespace. 
-     It downloads <code>fsimage</code> and <code>edits</code> from the active 
-     NameNode, merges them locally, and uploads the new image back to the 
-     active NameNode.
-     The Checkpoint node usually runs on a different machine than the NameNode
-     since its memory requirements are on the same order as the NameNode.
-     The Checkpoint node is started by 
-     <code>bin/hdfs namenode -checkpoint</code> on the node 
-     specified in the configuration file.
-   </p>
-   <p>The location of the Checkpoint (or Backup) node and its accompanying 
-      web interface are configured via the <code>dfs.backup.address</code> 
-      and <code>dfs.backup.http.address</code> configuration variables.
-	 </p>
-   <p>
-     The start of the checkpoint process on the Checkpoint node is 
-     controlled by two configuration parameters.
-   </p>
-   <ul>
-      <li>
-        <code>fs.checkpoint.period</code>, set to 1 hour by default, specifies
-        the maximum delay between two consecutive checkpoints 
-      </li>
-      <li>
-        <code>fs.checkpoint.size</code>, set to 64MB by default, defines the
-        size of the edits log file that forces an urgent checkpoint even if 
-        the maximum checkpoint delay is not reached.
-      </li>
-   </ul>
-   <p>
-     The Checkpoint node stores the latest checkpoint in a  
-     directory that is structured the same as the NameNode's
-     directory. This allows the checkpointed image to be always available for
-     reading by the NameNode if necessary.
-     See <a href="hdfs_user_guide.html#Import+Checkpoint">Import Checkpoint</a>.
-   </p>
-   <p>Multiple checkpoint nodes may be specified in the cluster configuration file.</p>
-   <p>
-     For command usage, see  
-     <a href="commands_manual.html#namenode">namenode</a>.
-   </p>
-   </section>
-
-   <section> <title> Backup Node </title>
-   <p>	
-    The Backup node provides the same checkpointing functionality as the 
-    Checkpoint node, as well as maintaining an in-memory, up-to-date copy of the
-    file system namespace that is always synchronized with the active NameNode state.
-    Along with accepting a journal stream of file system edits from 
-    the NameNode and persisting this to disk, the Backup node also applies 
-    those edits into its own copy of the namespace in memory, thus creating 
-    a backup of the namespace.
-   </p>
-   <p>
-    The Backup node does not need to download 
-    <code>fsimage</code> and <code>edits</code> files from the active NameNode
-    in order to create a checkpoint, as would be required with a 
-    Checkpoint node or Secondary NameNode, since it already has an up-to-date 
-    state of the namespace state in memory.
-    The Backup node checkpoint process is more efficient as it only needs to 
-    save the namespace into the local <code>fsimage</code> file and reset
-    <code>edits</code>.
-   </p> 
-   <p>
-    As the Backup node maintains a copy of the
-    namespace in memory, its RAM requirements are the same as the NameNode.
-   </p> 
-   <p>
-    The NameNode supports one Backup node at a time. No Checkpoint nodes may be
-    registered if a Backup node is in use. Using multiple Backup nodes 
-    concurrently will be supported in the future.
-   </p> 
-   <p>
-    The Backup node is configured in the same manner as the Checkpoint node.
-    It is started with <code>bin/hdfs namenode -checkpoint</code>.
-   </p>
-   <p>The location of the Backup (or Checkpoint) node and its accompanying 
-      web interface are configured via the <code>dfs.backup.address</code> 
-      and <code>dfs.backup.http.address</code> configuration variables.
-	 </p>
-   <p>
-    Use of a Backup node provides the option of running the NameNode with no 
-    persistent storage, delegating all responsibility for persisting the state
-    of the namespace to the Backup node. 
-    To do this, start the NameNode with the 
-    <code>-importCheckpoint</code> option, along with specifying no persistent
-    storage directories of type edits <code>dfs.name.edits.dir</code> 
-    for the NameNode configuration.
-   </p> 
-   <p>
-    For a complete discussion of the motivation behind the creation of the 
-    Backup node and Checkpoint node, see 
-    <a href="https://issues.apache.org/jira/browse/HADOOP-4539">HADOOP-4539</a>.
-    For command usage, see  
-     <a href="commands_manual.html#namenode">namenode</a>.
-   </p>
    </section>
 
    <section> <title> Import Checkpoint </title>