ZOOKEEPER-229. improve documentation regarding user's responsibility to cleanup datadir (snaps/logs)

git-svn-id: https://svn.apache.org/repos/asf/hadoop/zookeeper/trunk@739480 13f79535-47bb-0310-9956-ffa450edef68
Patrick D. Hunt 16 years ago
parent
commit
4685280974

+ 3 - 0
CHANGES.txt

@@ -141,6 +141,9 @@ flavio via mahadev)
 
    ZOOKEEPER-215. expand system test environment (breed via phunt)
 
+   ZOOKEEPER-229. improve documentation regarding user's responsibility to
+   cleanup datadir (snaps/logs) (mahadev via phunt)
+
 NEW FEATURES:
 
    ZOOKEEPER-276. Bookkeeper contribution (Flavio and Luca Telloli via mahadev)

+ 2 - 0
build.xml

@@ -333,6 +333,8 @@
           <include name="org/apache/zookeeper/Watcher.java"/>
           <include name="org/apache/zookeeper/ZooDefs.java"/>
           <include name="org/apache/zookeeper/ZooKeeper.java"/>
+          <include name="org/apache/zookeeper/server/LogFormatter.java"/>
+          <include name="org/apache/zookeeper/server/PurgeTxnLog.java"/>
           <exclude name="org/apache/zookeeper/server/quorum/QuorumPacket"/>
     	</fileset>
     	<packageset dir="${src_generated.dir}">

+ 94 - 29
docs/zookeeperAdmin.html

@@ -231,6 +231,17 @@ document.write("Last Published: " + document.lastModified);
 <a href="#sc_administering">Administering</a>
 </li>
 <li>
+<a href="#sc_maintenance">Maintenance</a>
+<ul class="minitoc">
+<li>
+<a href="#Ongoing+Data+Directory+Cleanup">Ongoing Data Directory Cleanup</a>
+</li>
+<li>
+<a href="#Debug+Log+Cleanup+%28log4j%29">Debug Log Cleanup (log4j)</a>
+</li>
+</ul>
+</li>
+<li>
 <a href="#sc_monitoring">Monitoring</a>
 </li>
 <li>
@@ -269,7 +280,7 @@ document.write("Last Published: " + document.lastModified);
 <a href="#The+Log+Directory">The Log Directory</a>
 </li>
 <li>
-<a href="#File+Management">File Management</a>
+<a href="#sc_filemanagement">File Management</a>
 </li>
 </ul>
 </li>
@@ -472,7 +483,7 @@ server.3=zoo3:2888:3888</span>
           consists of a single line containing only the text of that machine's
           id. So <span class="codefrag filename">myid</span> of server 1 would contain the text
           "1" and nothing else. The id must be unique within the
-          ensemble.</p>
+          ensemble and should have a value between 1 and 255.</p>
         
 </li>
 
@@ -626,6 +637,15 @@ server.3=zoo3:2888:3888</span>
 </li>
 
         
+<li>
+          
+<p>
+<a href="#sc_maintenance">Maintenance</a>
+</p>
+        
+</li>
+
+        
 <li>
           
 <p>
@@ -698,7 +718,7 @@ server.3=zoo3:2888:3888</span>
 </li>
       
 </ul>
-<a name="N101A6"></a><a name="sc_designing"></a>
+<a name="N101AE"></a><a name="sc_designing"></a>
 <h3 class="h4">Designing a ZooKeeper Deployment</h3>
 <p>The reliablity of ZooKeeper rests on two basic assumptions.</p>
 <ol>
@@ -725,7 +745,7 @@ server.3=zoo3:2888:3888</span>
       to hold true. Some of these are cross-machines considerations,
       and others are things you should consider for each and every
       machine in your deployment.</p>
-<a name="N101C2"></a><a name="sc_CrossMachineRequirements"></a>
+<a name="N101CA"></a><a name="sc_CrossMachineRequirements"></a>
 <h4>Cross Machine Requirements</h4>
 <p>For the ZooKeeper service to be active, there must be a
         majority of non-failing machines that can communicate with
@@ -743,7 +763,7 @@ server.3=zoo3:2888:3888</span>
         failure of that switch could cause a correlated failure and
         bring down the service. The same holds true of shared power
         circuits, cooling systems, etc.</p>
-<a name="N101CF"></a><a name="Single+Machine+Requirements"></a>
+<a name="N101D7"></a><a name="Single+Machine+Requirements"></a>
 <h4>Single Machine Requirements</h4>
 <p>If ZooKeeper has to contend with other applications for
         access to resourses like storage media, CPU, network, or
@@ -784,19 +804,61 @@ server.3=zoo3:2888:3888</span>
 </li>
       
 </ul>
-<a name="N101ED"></a><a name="sc_provisioning"></a>
+<a name="N101F5"></a><a name="sc_provisioning"></a>
 <h3 class="h4">Provisioning</h3>
 <p></p>
-<a name="N101F6"></a><a name="sc_strengthsAndLimitations"></a>
+<a name="N101FE"></a><a name="sc_strengthsAndLimitations"></a>
 <h3 class="h4">Things to Consider: ZooKeeper Strengths and Limitations</h3>
 <p></p>
-<a name="N101FF"></a><a name="sc_administering"></a>
+<a name="N10207"></a><a name="sc_administering"></a>
 <h3 class="h4">Administering</h3>
 <p></p>
-<a name="N10208"></a><a name="sc_monitoring"></a>
+<a name="N10210"></a><a name="sc_maintenance"></a>
+<h3 class="h4">Maintenance</h3>
+<p>A ZooKeeper cluster requires little long-term maintenance;
+        however, you must be aware of the following:</p>
+<a name="N10219"></a><a name="Ongoing+Data+Directory+Cleanup"></a>
+<h4>Ongoing Data Directory Cleanup</h4>
+<p>The ZooKeeper <a href="#var_datadir">Data
+          Directory</a> contains files which are a persistent copy
+          of the znodes stored by a particular serving ensemble. These
+          are the snapshot and transaction log files. As changes are
+          made to the znodes, they are appended to a
+          transaction log. Occasionally, when a log grows large, a
+          snapshot of the current state of all znodes is written
+          to the filesystem. This snapshot supersedes all previous
+          logs.
+        </p>
+<p>A ZooKeeper server <strong>will not remove
+          old snapshots and log files</strong>; this is the
+          responsibility of the operator. Every serving environment is
+          different, so the requirements for managing these
+          files may differ from install to install (backup, for example).
+        </p>
+<p>The PurgeTxnLog utility implements a simple retention
+        policy that administrators can use. The <a href="api/index.html">API docs</a> contain details on
+        calling conventions (arguments, etc.).
+        </p>
+<p>In the following example the last &lt;count&gt; snapshots and
+        their corresponding logs are retained and the others are
+        deleted.  The value of &lt;count&gt; should typically be
+        greater than 3 (although not required, this provides 3 backups
+        in the unlikely event a recent log has become corrupted). This
+        can be run as a cron job on the ZooKeeper server machines to
+        clean up the logs daily.</p>
+<pre class="code"> java -cp zookeeper.jar:log4j.jar:conf org.apache.zookeeper.server.PurgeTxnLog &lt;dataDir&gt; &lt;snapDir&gt; -n &lt;count&gt;</pre>
+<a name="N1023A"></a><a name="Debug+Log+Cleanup+%28log4j%29"></a>
+<h4>Debug Log Cleanup (log4j)</h4>
+<p>See the section on <a href="#sc_logging">logging</a> in this document. It is
+        expected that you will set up a rolling file appender using the
+        built-in log4j feature. The sample configuration file in the
+        release tar's conf/log4j.properties provides an example of
+        this.
+        </p>
+<a name="N10249"></a><a name="sc_monitoring"></a>
 <h3 class="h4">Monitoring</h3>
 <p></p>
-<a name="N10211"></a><a name="sc_logging"></a>
+<a name="N10252"></a><a name="sc_logging"></a>
 <h3 class="h4">Logging</h3>
 <p>ZooKeeper uses <strong>log4j</strong> version 1.2 as 
       its logging infrastructure. The  ZooKeeper default <span class="codefrag filename">log4j.properties</span> 
@@ -806,10 +868,10 @@ server.3=zoo3:2888:3888</span>
 <p>For more information, see 
       <a href="http://logging.apache.org/log4j/1.2/manual.html#defaultInit">Log4j Default Initialization Procedure</a> 
       of the log4j manual.</p>
-<a name="N10231"></a><a name="sc_troubleshooting"></a>
+<a name="N10272"></a><a name="sc_troubleshooting"></a>
 <h3 class="h4">Troubleshooting</h3>
 <p></p>
-<a name="N1023A"></a><a name="sc_configuration"></a>
+<a name="N1027B"></a><a name="sc_configuration"></a>
 <h3 class="h4">Configuration Parameters</h3>
 <p>ZooKeeper's behavior is governed by the ZooKeeper configuration
       file. This file is designed so that the exact same file can be used by
@@ -817,7 +879,7 @@ server.3=zoo3:2888:3888</span>
       layouts are the same. If servers use different configuration files, care
       must be taken to ensure that the list of servers in all of the different
       configuration files match.</p>
-<a name="N10243"></a><a name="sc_minimumConfiguration"></a>
+<a name="N10284"></a><a name="sc_minimumConfiguration"></a>
 <h4>Minimum Configuration</h4>
 <p>Here are the minimum configuration keywords that must be defined
         in the configuration file:</p>
@@ -864,7 +926,7 @@ server.3=zoo3:2888:3888</span>
 </dd>
         
 </dl>
-<a name="N1026A"></a><a name="sc_advancedConfiguration"></a>
+<a name="N102AB"></a><a name="sc_advancedConfiguration"></a>
 <h4>Advanced Configuration</h4>
 <p>The configuration settings in the section are optional. You can
         use them to further fine tune the behaviour of your ZooKeeper servers.
@@ -955,7 +1017,7 @@ server.3=zoo3:2888:3888</span>
 </dd>
         
 </dl>
-<a name="N102CA"></a><a name="sc_clusterOptions"></a>
+<a name="N1030B"></a><a name="sc_clusterOptions"></a>
 <h4>Cluster Options</h4>
 <p>The options in this section are designed for use with an ensemble
         of servers -- that is, when deploying clusters of servers.</p>
@@ -1045,7 +1107,7 @@ server.3=zoo3:2888:3888</span>
         
 </dl>
 <p></p>
-<a name="N10327"></a><a name="Unsafe+Options"></a>
+<a name="N10368"></a><a name="Unsafe+Options"></a>
 <h4>Unsafe Options</h4>
 <p>The following options can be useful, but be careful when you use
         them. The risk of each is explained along with the explanation of what
@@ -1090,7 +1152,7 @@ server.3=zoo3:2888:3888</span>
 </dd>
         
 </dl>
-<a name="N10359"></a><a name="sc_zkCommands"></a>
+<a name="N1039A"></a><a name="sc_zkCommands"></a>
 <h3 class="h4">ZooKeeper Commands: The Four Letter Words</h3>
 <p>ZooKeeper responds to a small set of commands. Each command is
       composed of four letters. You issue the commands to ZooKeeper via telnet
@@ -1163,7 +1225,7 @@ server.3=zoo3:2888:3888</span>
 <pre class="code">$ echo ruok | nc 127.0.0.1 5111
 imok
 </pre>
-<a name="N103A0"></a><a name="sc_dataFileManagement"></a>
+<a name="N103E1"></a><a name="sc_dataFileManagement"></a>
 <h3 class="h4">Data File Management</h3>
 <p>ZooKeeper stores its data in a data directory and its transaction
       log in a transaction log directory. By default these two directories are
@@ -1171,7 +1233,7 @@ imok
       transaction log files in a separate directory than the data files.
       Throughput increases and latency decreases when transaction logs reside
       on a dedicated log devices.</p>
-<a name="N103A9"></a><a name="The+Data+Directory"></a>
+<a name="N103EA"></a><a name="The+Data+Directory"></a>
 <h4>The Data Directory</h4>
 <p>This directory has two files in it:</p>
 <ul>
@@ -1217,14 +1279,14 @@ imok
         idempotent nature of its updates. By replaying the transaction log
         against fuzzy snapshots ZooKeeper gets the state of the system at the
         end of the log.</p>
-<a name="N103E5"></a><a name="The+Log+Directory"></a>
+<a name="N10426"></a><a name="The+Log+Directory"></a>
 <h4>The Log Directory</h4>
 <p>The Log Directory contains the ZooKeeper transaction logs.
         Before any update takes place, ZooKeeper ensures that the transaction
         that represents the update is written to non-volatile storage. A new
         log file is started each time a snapshot is begun. The log file's
         suffix is the first zxid written to that log.</p>
-<a name="N103EF"></a><a name="File+Management"></a>
+<a name="N10430"></a><a name="sc_filemanagement"></a>
 <h4>File Management</h4>
 <p>The format of snapshot and log files does not change between
         standalone ZooKeeper servers and different configurations of
@@ -1235,13 +1297,16 @@ imok
         state of ZooKeeper servers and even restore that state. The
         LogFormatter class allows an administrator to look at the transactions
         in a log.</p>
-<p>The ZooKeeper server creates snapshot and log files, but never
-        deletes them. The retention policy of the data and log files is
-        implemented outside of the ZooKeeper server. The server itself only
-        needs the latest complete fuzzy snapshot and the log files from the
-        start of that snapshot. The PurgeTxnLog utility implements a simple
-        retention policy that administrators can use.</p>
-<a name="N10400"></a><a name="sc_commonProblems"></a>
+<p>The ZooKeeper server creates snapshot and log files, but
+        never deletes them. The retention policy of the data and log
+        files is implemented outside of the ZooKeeper server. The
+        server itself only needs the latest complete fuzzy snapshot
+        and the log files from the start of that snapshot. See the
+        <a href="#sc_maintenance">maintenance</a> section in
+        this document for more details on setting a retention policy
+        and maintenance of ZooKeeper storage.
+        </p>
+<a name="N10445"></a><a name="sc_commonProblems"></a>
 <h3 class="h4">Things to Avoid</h3>
 <p>Here are some common problems you can avoid by configuring
       ZooKeeper correctly:</p>
@@ -1295,7 +1360,7 @@ imok
 </dd>
       
 </dl>
-<a name="N10424"></a><a name="sc_bestPractices"></a>
+<a name="N10469"></a><a name="sc_bestPractices"></a>
 <h3 class="h4">Best Practices</h3>
 <p>For best results, take note of the following list of good
       Zookeeper practices. <em>[tbd...]</em>

File diff suppressed because it is too large
+ 21 - 10
docs/zookeeperAdmin.pdf


+ 13 - 4
docs/zookeeperStarted.html

@@ -198,6 +198,9 @@ document.write("Last Published: " + document.lastModified);
 <a href="#sc_InstallingSingleMode">Standalone Operation</a>
 </li>
 <li>
+<a href="#sc_FileManagement">Managing ZooKeeper Storage</a>
+</li>
+<li>
 <a href="#sc_ConnectingToZooKeeper">Connecting to ZooKeeper</a>
 </li>
 <li>
@@ -313,7 +316,13 @@ clientPort=2181
       This is fine for most development situations, but to run ZooKeeper in
       replicated mode, please see <a href="#sc_RunningReplicatedZooKeeper">Running Replicated
       ZooKeeper</a>.</p>
-<a name="N10083"></a><a name="sc_ConnectingToZooKeeper"></a>
+<a name="N10083"></a><a name="sc_FileManagement"></a>
+<h3 class="h4">Managing ZooKeeper Storage</h3>
+<p>For long running production systems ZooKeeper storage must
+      be managed externally (dataDir and logs). See the section on
+      <a href="zookeeperAdmin.html#sc_maintenance">maintenance</a> for
+      more details.</p>
+<a name="N10091"></a><a name="sc_ConnectingToZooKeeper"></a>
 <h3 class="h4">Connecting to ZooKeeper</h3>
 <p>Once ZooKeeper is running, you have several options for connection
       to it:</p>
@@ -363,7 +372,7 @@ clientPort=2181
 </li>
       
 </ul>
-<a name="N100C6"></a><a name="sc_ProgrammingToZooKeeper"></a>
+<a name="N100D4"></a><a name="sc_ProgrammingToZooKeeper"></a>
 <h3 class="h4">Programming to ZooKeeper</h3>
 <p>ZooKeeper has a Java bindings and C bindings. They are
       functionally equivalent. The C bindings exist in two variants: single
@@ -371,7 +380,7 @@ clientPort=2181
       is done. For more information, see the <a href="zookeeperProgrammers.html#ch_programStructureWithExample.html">Programming
       Examples in the ZooKeeper Programmer's Guide</a> for
       sample code using of the different APIs.</p>
-<a name="N100D4"></a><a name="sc_RunningReplicatedZooKeeper"></a>
+<a name="N100E2"></a><a name="sc_RunningReplicatedZooKeeper"></a>
 <h3 class="h4">Running Replicated ZooKeeper</h3>
 <p>Running ZooKeeper in standalone mode is convenient for evaluation,
       some development, and testing. But in production, you should run
@@ -431,7 +440,7 @@ server.3=zoo3:2888:3888
       
 </div>
 </div>
-<a name="N10111"></a><a name="Other+Optimizations"></a>
+<a name="N1011F"></a><a name="Other+Optimizations"></a>
 <h3 class="h4">Other Optimizations</h3>
 <p>There are a couple of other configuration parameters that can
       greatly increase performance:</p>

File diff suppressed because it is too large
+ 17 - 6
docs/zookeeperStarted.pdf


+ 74 - 8
src/docs/src/documentation/content/xdocs/zookeeperAdmin.xml

@@ -294,6 +294,10 @@ server.3=zoo3:2888:3888</computeroutput></para>
           <para><xref linkend="sc_administering" /></para>
         </listitem>
 
+        <listitem>
+          <para><xref linkend="sc_maintenance" /></para>
+        </listitem>
+
         <listitem>
           <para><xref linkend="sc_monitoring" /></para>
         </listitem>
@@ -429,6 +433,65 @@ server.3=zoo3:2888:3888</computeroutput></para>
       <para></para>
     </section>
 
+    <section id="sc_maintenance">
+      <title>Maintenance</title>
+
+      <para>A ZooKeeper cluster requires little long-term maintenance;
+        however, you must be aware of the following:</para>
+
+      <section>
+        <title>Ongoing Data Directory Cleanup</title>
+
+        <para>The ZooKeeper <ulink url="#var_datadir">Data
+          Directory</ulink> contains files which are a persistent copy
+          of the znodes stored by a particular serving ensemble. These
+          are the snapshot and transaction log files. As changes are
+          made to the znodes, they are appended to a
+          transaction log. Occasionally, when a log grows large, a
+          snapshot of the current state of all znodes is written
+          to the filesystem. This snapshot supersedes all previous
+          logs.
+        </para>
+
+        <para>A ZooKeeper server <emphasis role="bold">will not remove
+          old snapshots and log files</emphasis>; this is the
+          responsibility of the operator. Every serving environment is
+          different, so the requirements for managing these
+          files may differ from install to install (backup, for example).
+        </para>
+
+        <para>The PurgeTxnLog utility implements a simple retention
+        policy that administrators can use. The <ulink
+        url="ext:api/index">API docs</ulink> contain details on
+        calling conventions (arguments, etc.).
+        </para>
+
+        <para>In the following example the last &lt;count&gt; snapshots and
+        their corresponding logs are retained and the others are
+        deleted.  The value of &lt;count&gt; should typically be
+        greater than 3 (although not required, this provides 3 backups
+        in the unlikely event a recent log has become corrupted). This
+        can be run as a cron job on the ZooKeeper server machines to
+        clean up the logs daily.</para>
+
+        <programlisting> java -cp zookeeper.jar:log4j.jar:conf org.apache.zookeeper.server.PurgeTxnLog &lt;dataDir&gt; &lt;snapDir&gt; -n &lt;count&gt;</programlisting>
+
+      </section>
+
+      <section>
+        <title>Debug Log Cleanup (log4j)</title>
+
+        <para>See the section on <ulink
+        url="#sc_logging">logging</ulink> in this document. It is
+        expected that you will set up a rolling file appender using the
+        built-in log4j feature. The sample configuration file in the
+        release tar's conf/log4j.properties provides an example of
+        this.
+        </para>
+      </section>
+
+    </section>
+
     <section id="sc_monitoring">
       <title>Monitoring</title>
 
@@ -482,7 +545,7 @@ server.3=zoo3:2888:3888</computeroutput></para>
             </listitem>
           </varlistentry>
 
-          <varlistentry>
+          <varlistentry id="var_datadir">
             <term>dataDir</term>
 
             <listitem>
@@ -914,7 +977,7 @@ imok
         suffix is the first zxid written to that log.</para>
       </section>
 
-      <section>
+      <section id="sc_filemanagement">
         <title>File Management</title>
 
         <para>The format of snapshot and log files does not change between
@@ -928,12 +991,15 @@ imok
         LogFormatter class allows an administrator to look at the transactions
         in a log.</para>
 
-        <para>The ZooKeeper server creates snapshot and log files, but never
-        deletes them. The retention policy of the data and log files is
-        implemented outside of the ZooKeeper server. The server itself only
-        needs the latest complete fuzzy snapshot and the log files from the
-        start of that snapshot. The PurgeTxnLog utility implements a simple
-        retention policy that administrators can use.</para>
+        <para>The ZooKeeper server creates snapshot and log files, but
+        never deletes them. The retention policy of the data and log
+        files is implemented outside of the ZooKeeper server. The
+        server itself only needs the latest complete fuzzy snapshot
+        and the log files from the start of that snapshot. See the
+        <ulink url="#sc_maintenance">maintenance</ulink> section in
+        this document for more details on setting a retention policy
+        and maintenance of ZooKeeper storage.
+        </para>
       </section>
     </section>
 

+ 10 - 1
src/docs/src/documentation/content/xdocs/zookeeperStarted.xml

@@ -73,7 +73,7 @@
           stable</ulink> release from one of the Apache Download
         Mirrors.</para>
     </section>
-
+	
     <section id="sc_InstallingSingleMode">
       <title>Standalone Operation</title>
 
@@ -151,6 +151,15 @@ clientPort=2181
       url="#sc_RunningReplicatedZooKeeper">Running Replicated
       ZooKeeper</ulink>.</para>
     </section>
+	
+    <section id="sc_FileManagement">
+      <title>Managing ZooKeeper Storage</title>
+      <para>For long running production systems ZooKeeper storage must
+      be managed externally (dataDir and logs). See the section on
+      <ulink
+      url="zookeeperAdmin.html#sc_maintenance">maintenance</ulink> for
+      more details.</para>
+    </section>
 
     <section id="sc_ConnectingToZooKeeper">
       <title>Connecting to ZooKeeper</title>

Some files were not shown because too many files have changed in this diff
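
The new Maintenance section in docs/zookeeperAdmin.html suggests running PurgeTxnLog from cron to clean up the data directory daily. As a rough illustration only, a wrapper script might look like the sketch below. The java command line is the one documented in this commit; everything else is an assumption made for the example and is not part of the patch: the script itself, the ZK_* environment variable names, the /var/zookeeper/data default paths, and the retention count of 5. The script prints the command by default so it can be checked before it is allowed to delete anything.

```shell
#!/bin/sh
# Hypothetical cron wrapper for the PurgeTxnLog command documented in this commit.
# The variable names, default paths, and retention count are illustrative assumptions.
ZK_CLASSPATH="${ZK_CLASSPATH:-zookeeper.jar:log4j.jar:conf}"
ZK_DATA_DIR="${ZK_DATA_DIR:-/var/zookeeper/data}"   # <dataDir> argument (assumed path)
ZK_SNAP_DIR="${ZK_SNAP_DIR:-/var/zookeeper/data}"   # <snapDir> argument (assumed path)
ZK_RETAIN_COUNT="${ZK_RETAIN_COUNT:-5}"             # <count>; the docs suggest more than 3

# Print the command line that would be executed, in the documented argument order.
purge_command() {
    printf 'java -cp %s org.apache.zookeeper.server.PurgeTxnLog %s %s -n %s\n' \
        "$ZK_CLASSPATH" "$ZK_DATA_DIR" "$ZK_SNAP_DIR" "$ZK_RETAIN_COUNT"
}

# Dry run unless ZK_PURGE_RUN=1 is set, so nothing is deleted by accident.
if [ "${ZK_PURGE_RUN:-0}" = "1" ]; then
    java -cp "$ZK_CLASSPATH" org.apache.zookeeper.server.PurgeTxnLog \
        "$ZK_DATA_DIR" "$ZK_SNAP_DIR" -n "$ZK_RETAIN_COUNT"
else
    purge_command
fi

# Example crontab entry (the script path is hypothetical), running daily at 03:00:
#   0 3 * * * ZK_PURGE_RUN=1 /path/to/zk-purge.sh
```

Keeping the dry run as the default is a deliberate choice here: the purge removes snapshots and logs permanently, so the operator opts in explicitly once the printed command and any backup arrangements look right.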