
ZOOKEEPER-693: hudson failure in ZKDatabaseCorruptionTest (mahadev via henryr)

git-svn-id: https://svn.apache.org/repos/asf/hadoop/zookeeper/trunk@920605 13f79535-47bb-0310-9956-ffa450edef68
Henry Robinson 15 years ago
Parent
Commit
0051cd2608

+ 2 - 0
CHANGES.txt

@@ -326,6 +326,8 @@ IMPROVEMENTS:
   ZOOKEEPER-688. explain session expiration better in the docs & faq (phunt
   via mahadev)
 
+  ZOOKEEPER-663. hudson failure in ZKDatabaseCorruptionTest (mahadev via henryr)
+
 NEW FEATURES:
   ZOOKEEPER-539. generate eclipse project via ant target. (phunt via mahadev)
 

BIN
docs/skin/images/rc-b-l-15-1body-2menu-3menu.png


BIN
docs/skin/images/rc-b-r-15-1body-2menu-3menu.png


BIN
docs/skin/images/rc-b-r-5-1header-2tab-selected-3tab-selected.png


BIN
docs/skin/images/rc-t-l-5-1header-2searchbox-3searchbox.png


BIN
docs/skin/images/rc-t-l-5-1header-2tab-selected-3tab-selected.png


BIN
docs/skin/images/rc-t-l-5-1header-2tab-unselected-3tab-unselected.png


BIN
docs/skin/images/rc-t-r-15-1body-2menu-3menu.png


BIN
docs/skin/images/rc-t-r-5-1header-2searchbox-3searchbox.png


BIN
docs/skin/images/rc-t-r-5-1header-2tab-selected-3tab-selected.png


BIN
docs/skin/images/rc-t-r-5-1header-2tab-unselected-3tab-unselected.png


+ 31 - 14
docs/zookeeperAdmin.html

@@ -927,8 +927,25 @@ server.3=zoo3:2888:3888</span>
       of the log4j manual.</p>
 <a name="N10298"></a><a name="sc_troubleshooting"></a>
 <h3 class="h4">Troubleshooting</h3>
-<p></p>
-<a name="N102A1"></a><a name="sc_configuration"></a>
+<dl>
+		
+<dt>
+<term> Server not coming up because of file corruption</term>
+</dt>
+<dd>
+<p>A server might not be able to read its database and fail to come up because of 
+		some file corruption in the transaction logs of the ZooKeeper server. You will
+		see some IOException on loading ZooKeeper database. In such a case,
+		make sure all the other servers in your ensemble are up and  working. Use "stat" 
+		command on the command port to see if they are in good health. After you have verified that
+		all the other servers of the ensemble are up, you can go ahead and clean the database
+		of the corrupt server. Delete all the files in datadir/version-2 and datalogdir/version-2/.
+		Restart the server.
+		</p>
+</dd>
+		
+</dl>
+<a name="N102A9"></a><a name="sc_configuration"></a>
 <h3 class="h4">Configuration Parameters</h3>
 <p>ZooKeeper's behavior is governed by the ZooKeeper configuration
       file. This file is designed so that the exact same file can be used by
@@ -936,7 +953,7 @@ server.3=zoo3:2888:3888</span>
       layouts are the same. If servers use different configuration files, care
       must be taken to ensure that the list of servers in all of the different
       configuration files match.</p>
-<a name="N102AA"></a><a name="sc_minimumConfiguration"></a>
+<a name="N102B2"></a><a name="sc_minimumConfiguration"></a>
 <h4>Minimum Configuration</h4>
 <p>Here are the minimum configuration keywords that must be defined
         in the configuration file:</p>
@@ -983,7 +1000,7 @@ server.3=zoo3:2888:3888</span>
 </dd>
         
 </dl>
-<a name="N102D1"></a><a name="sc_advancedConfiguration"></a>
+<a name="N102D9"></a><a name="sc_advancedConfiguration"></a>
 <h4>Advanced Configuration</h4>
 <p>The configuration settings in the section are optional. You can
         use them to further fine tune the behaviour of your ZooKeeper servers.
@@ -1083,7 +1100,7 @@ server.3=zoo3:2888:3888</span>
 </dd>
         
 </dl>
-<a name="N1033A"></a><a name="sc_clusterOptions"></a>
+<a name="N10342"></a><a name="sc_clusterOptions"></a>
 <h4>Cluster Options</h4>
 <p>The options in this section are designed for use with an ensemble
         of servers -- that is, when deploying clusters of servers.</p>
@@ -1221,7 +1238,7 @@ server.3=zoo3:2888:3888</span>
         
 </dl>
 <p></p>
-<a name="N103BA"></a><a name="sc_authOptions"></a>
+<a name="N103C2"></a><a name="sc_authOptions"></a>
 <h4>Authentication &amp; Authorization Options</h4>
 <p>The options in this section allow control over
         authentication/authorization performed by the service.</p>
@@ -1255,7 +1272,7 @@ server.3=zoo3:2888:3888</span>
 </dd>
         
 </dl>
-<a name="N103DD"></a><a name="Unsafe+Options"></a>
+<a name="N103E5"></a><a name="Unsafe+Options"></a>
 <h4>Unsafe Options</h4>
 <p>The following options can be useful, but be careful when you use
         them. The risk of each is explained along with the explanation of what
@@ -1300,7 +1317,7 @@ server.3=zoo3:2888:3888</span>
 </dd>
         
 </dl>
-<a name="N1040F"></a><a name="sc_zkCommands"></a>
+<a name="N10417"></a><a name="sc_zkCommands"></a>
 <h3 class="h4">ZooKeeper Commands: The Four Letter Words</h3>
 <p>ZooKeeper responds to a small set of commands. Each command is
       composed of four letters. You issue the commands to ZooKeeper via telnet
@@ -1421,7 +1438,7 @@ server.3=zoo3:2888:3888</span>
 <pre class="code">$ echo ruok | nc 127.0.0.1 5111
 imok
 </pre>
-<a name="N10477"></a><a name="sc_dataFileManagement"></a>
+<a name="N1047F"></a><a name="sc_dataFileManagement"></a>
 <h3 class="h4">Data File Management</h3>
 <p>ZooKeeper stores its data in a data directory and its transaction
       log in a transaction log directory. By default these two directories are
@@ -1429,7 +1446,7 @@ imok
       transaction log files in a separate directory than the data files.
       Throughput increases and latency decreases when transaction logs reside
       on a dedicated log devices.</p>
-<a name="N10480"></a><a name="The+Data+Directory"></a>
+<a name="N10488"></a><a name="The+Data+Directory"></a>
 <h4>The Data Directory</h4>
 <p>This directory has two files in it:</p>
 <ul>
@@ -1475,14 +1492,14 @@ imok
         idempotent nature of its updates. By replaying the transaction log
         against fuzzy snapshots ZooKeeper gets the state of the system at the
         end of the log.</p>
-<a name="N104BC"></a><a name="The+Log+Directory"></a>
+<a name="N104C4"></a><a name="The+Log+Directory"></a>
 <h4>The Log Directory</h4>
 <p>The Log Directory contains the ZooKeeper transaction logs.
         Before any update takes place, ZooKeeper ensures that the transaction
         that represents the update is written to non-volatile storage. A new
         log file is started each time a snapshot is begun. The log file's
         suffix is the first zxid written to that log.</p>
-<a name="N104C6"></a><a name="sc_filemanagement"></a>
+<a name="N104CE"></a><a name="sc_filemanagement"></a>
 <h4>File Management</h4>
 <p>The format of snapshot and log files does not change between
         standalone ZooKeeper servers and different configurations of
@@ -1502,7 +1519,7 @@ imok
         this document for more details on setting a retention policy
         and maintenance of ZooKeeper storage.
         </p>
-<a name="N104DB"></a><a name="sc_commonProblems"></a>
+<a name="N104E3"></a><a name="sc_commonProblems"></a>
 <h3 class="h4">Things to Avoid</h3>
 <p>Here are some common problems you can avoid by configuring
       ZooKeeper correctly:</p>
@@ -1556,7 +1573,7 @@ imok
 </dd>
       
 </dl>
-<a name="N104FF"></a><a name="sc_bestPractices"></a>
+<a name="N10507"></a><a name="sc_bestPractices"></a>
 <h3 class="h4">Best Practices</h3>
 <p>For best results, take note of the following list of good
       Zookeeper practices:</p>

File diff suppressed because it is too large
+ 3 - 3
docs/zookeeperAdmin.pdf


+ 16 - 2
src/docs/src/documentation/content/xdocs/zookeeperAdmin.xml

@@ -548,8 +548,22 @@ server.3=zoo3:2888:3888</computeroutput></para>
 
 
     <section id="sc_troubleshooting">
       <title>Troubleshooting</title>
-
-      <para></para>
+	<variablelist>
+		<varlistentry>
+		<term> Server not coming up because of file corruption</term>
+		<listitem>
+		<para>A server might not be able to read its database and fail to come up because of 
+		some file corruption in the transaction logs of the ZooKeeper server. You will
+		see some IOException on loading ZooKeeper database. In such a case,
+		make sure all the other servers in your ensemble are up and  working. Use "stat" 
+		command on the command port to see if they are in good health. After you have verified that
+		all the other servers of the ensemble are up, you can go ahead and clean the database
+		of the corrupt server. Delete all the files in datadir/version-2 and datalogdir/version-2/.
+		Restart the server.
+		</para>
+		</listitem>
+		</varlistentry>
+		</variablelist>
     </section>
 
     <section id="sc_configuration">
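
The troubleshooting entry added in this change asks the operator to confirm, via the "stat" command, that the other ensemble members are healthy before the corrupt server's files are deleted. The following is a minimal Java sketch of that check, not code from this commit. The class name EnsembleHealthCheck, the helper fourLetterWord, and the host:port arguments are hypothetical; it assumes each server answers a four-letter command on its client port and then closes the connection, in the same way as the "echo ruok | nc" example in the admin guide.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of ZooKeeper.
class EnsembleHealthCheck {

    // Sends a four-letter command (for example "stat") and returns the raw reply.
    static String fourLetterWord(String host, int port, String cmd, int timeoutMs)
            throws IOException {
        try (Socket sock = new Socket()) {
            sock.connect(new InetSocketAddress(host, port), timeoutMs);
            sock.setSoTimeout(timeoutMs);

            OutputStream out = sock.getOutputStream();
            out.write(cmd.getBytes(StandardCharsets.US_ASCII));
            out.flush();
            // Half-close like nc does, so the peer knows the command is complete.
            sock.shutdownOutput();

            // Read until the server closes its side of the connection.
            InputStream in = sock.getInputStream();
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[1024];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.US_ASCII);
        }
    }

    // Usage: java EnsembleHealthCheck zoo1:2181 zoo2:2181
    public static void main(String[] args) {
        for (String hostPort : args) {
            int colon = hostPort.lastIndexOf(':');
            if (colon < 0) {
                System.out.println(hostPort + ": expected host:port");
                continue;
            }
            String host = hostPort.substring(0, colon);
            int port = Integer.parseInt(hostPort.substring(colon + 1));
            try {
                System.out.println(hostPort + ":");
                System.out.println(fourLetterWord(host, port, "stat", 3000));
            } catch (IOException e) {
                System.out.println(hostPort + ": unreachable (" + e.getMessage() + ")");
            }
        }
    }
}
```

Only once every other server responds would the operator go on to delete the files under datadir/version-2 and datalogdir/version-2 on the corrupt server and restart it. The sketch deliberately leaves that deletion as a manual step.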

+ 3 - 2
src/java/main/org/apache/zookeeper/server/persistence/FileTxnLog.java

@@ -489,7 +489,8 @@ public class FileTxnLog implements TxnLog {
             FileHeader header= new FileHeader();
             header.deserialize(ia, "fileheader");
             if (header.getMagic() != FileTxnLog.TXNLOG_MAGIC) {
-                throw new IOException("Invalid magic number " + header.getMagic()
+                throw new IOException("Transaction log: " + this.logFile + " has invalid magic number " 
+                        + header.getMagic()
                         + " != " + FileTxnLog.TXNLOG_MAGIC);
             }
         }
@@ -506,7 +507,7 @@ public class FileTxnLog implements TxnLog {
                 LOG.debug("Created new input stream " + logFile);
                 ia  = BinaryInputArchive.getArchive(inputStream);
                 inStreamCreated(ia,inputStream);
-                LOG.debug("created new input archive " + logFile);
+                LOG.debug("Created new input archive " + logFile);
             }
             return ia;
         }
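
The FileTxnLog hunk above changes the invalid-magic error so that it names the offending log file. The same kind of check can be sketched outside ZooKeeper as follows. TxnLogMagicCheck and checkMagic are hypothetical names, and the magic value is an assumption (the ASCII bytes "ZKLG" read as a big-endian int) that should be verified against FileTxnLog.TXNLOG_MAGIC in your source tree. Only the first four bytes of the header are examined; the real code deserializes a full FileHeader.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone helper for illustration; not part of ZooKeeper.
class TxnLogMagicCheck {

    // Assumed magic value: the bytes "ZKLG" interpreted as a big-endian int.
    static final int ASSUMED_TXNLOG_MAGIC =
            ByteBuffer.wrap("ZKLG".getBytes(StandardCharsets.US_ASCII)).getInt();

    // Reads the leading int of a log header and fails with a message that
    // names the file, mirroring the wording introduced by this commit.
    static void checkMagic(String fileName, InputStream in) throws IOException {
        int magic = new DataInputStream(in).readInt(); // big-endian read
        if (magic != ASSUMED_TXNLOG_MAGIC) {
            throw new IOException("Transaction log: " + fileName
                    + " has invalid magic number " + magic
                    + " != " + ASSUMED_TXNLOG_MAGIC);
        }
    }

    public static void main(String[] args) throws IOException {
        // A header that starts with the assumed magic passes silently.
        byte[] good = {'Z', 'K', 'L', 'G', 0, 0, 0, 2};
        checkMagic("log.1", new ByteArrayInputStream(good));

        // A zero-filled header is rejected and the message names the file.
        try {
            checkMagic("log.2", new ByteArrayInputStream(new byte[8]));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Including the file name in the message lets an operator see which log under datalogdir/version-2 is corrupt without enabling debug logging.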

+ 1 - 1
src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java

@@ -109,7 +109,7 @@ public class FileTxnSnapLog {
     }
     
     /**
-     * this function restors the server 
+     * this function restores the server 
      * database after reading from the 
      * snapshots and transaction logs
      * @param dt the datatree to be restored

Some files were not shown because too many files have changed in this diff