|
@@ -201,6 +201,14 @@ document.write("Last Published: " + document.lastModified);
|
|
|
<ul class="minitoc">
|
|
|
<li>
|
|
|
<a href="#sc_designing">Designing a ZooKeeper Deployment</a>
|
|
|
+<ul class="minitoc">
|
|
|
+<li>
|
|
|
+<a href="#sc_CrossMachineRequirements">Cross Machine Requirements</a>
|
|
|
+</li>
|
|
|
+<li>
|
|
|
+<a href="#Single+Machine+Requirements">Single Machine Requirements</a>
|
|
|
+</li>
|
|
|
+</ul>
|
|
|
</li>
|
|
|
<li>
|
|
|
<a href="#sc_provisioning">Provisioning</a>
|
|
@@ -621,20 +629,103 @@ server.3=zoo3:2888:3888</span>
|
|
|
</ul>
|
|
|
<a name="N10160"></a><a name="sc_designing"></a>
|
|
|
<h3 class="h4">Designing a ZooKeeper Deployment</h3>
|
|
|
-<p></p>
|
|
|
-<a name="N10169"></a><a name="sc_provisioning"></a>
|
|
|
+<p>The reliability of ZooKeeper rests on two basic assumptions.</p>
|
|
|
+<ol>
|
|
|
+
|
|
|
+<li>
|
|
|
+<p> Only a minority of servers in a deployment
|
|
|
+ will fail. <em>Failure</em> in this context
|
|
|
+ means a machine crash, or some error in the network that
|
|
|
+ partitions a server off from the majority.</p>
|
|
|
+
|
|
|
+</li>
|
|
|
+
|
|
|
+<li>
|
|
|
+<p> Deployed machines operate correctly. To
|
|
|
+ operate correctly means to execute code correctly, to have
|
|
|
+ clocks that work properly, and to have storage and network
|
|
|
+ components that perform consistently.</p>
|
|
|
+
|
|
|
+</li>
|
|
|
+
|
|
|
+</ol>
|
|
|
+<p>The sections below contain considerations for ZooKeeper
|
|
|
+ administrators to maximize the probability for these assumptions
|
|
|
+ to hold true. Some of these are cross-machine considerations,
|
|
|
+ and others are things you should consider for each and every
|
|
|
+ machine in your deployment.</p>
|
|
|
+<a name="N1017C"></a><a name="sc_CrossMachineRequirements"></a>
|
|
|
+<h4>Cross Machine Requirements</h4>
|
|
|
+<p>For the ZooKeeper service to be active, there must be a
|
|
|
+ majority of non-failing machines that can communicate with
|
|
|
+ each other. To create a deployment that can tolerate the
|
|
|
+ failure of F machines, you should count on deploying 2xF+1
|
|
|
+ machines. Thus, a deployment that consists of three machines
|
|
|
+ can handle one failure, and a deployment of five machines can
|
|
|
+ handle two failures. Note that a deployment of six machines
|
|
|
+ can only handle two failures since three machines is not a
|
|
|
+ majority. For this reason, ZooKeeper deployments are usually
|
|
|
+ made up of an odd number of machines.</p>
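<p>The 2xF+1 rule above can be checked with a few lines of arithmetic.
The following is a minimal sketch, not part of ZooKeeper; the function
names <em>tolerated_failures</em> and <em>ensemble_size_for</em> are
hypothetical and chosen only for illustration.</p>

```python
def tolerated_failures(ensemble_size: int) -> int:
    """Largest number of failed servers that still leaves a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    # A strict majority of n servers is n // 2 + 1, so at most
    # n - (n // 2 + 1) == (n - 1) // 2 servers may fail.
    return (ensemble_size - 1) // 2


def ensemble_size_for(failures: int) -> int:
    """Smallest ensemble that tolerates the given number of failures (2xF+1)."""
    if failures < 0:
        raise ValueError("failures must not be negative")
    return 2 * failures + 1


if __name__ == "__main__":
    for size in (3, 5, 6):
        print(size, "machines tolerate", tolerated_failures(size), "failure(s)")
```

<p>Under these assumptions a sixth machine adds no fault tolerance over
five, which is why odd-sized deployments are the usual choice.</p>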
|
|
|
+<p>To achieve the highest probability of tolerating a failure
|
|
|
+ you should try to make machine failures independent. For
|
|
|
+ example, if most of the machines share the same switch,
|
|
|
+ failure of that switch could cause a correlated failure and
|
|
|
+ bring down the service. The same holds true of shared power
|
|
|
+ circuits, cooling systems, etc.</p>
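<p>One way to reason about correlated failures is to list which shared
component, if lost, would leave fewer than a majority of servers running.
The sketch below assumes a hand-written mapping from server name to a
failure domain such as a switch; the helper
<em>quorum_breaking_domains</em> and the switch names are hypothetical and
are not part of ZooKeeper.</p>

```python
from collections import Counter


def quorum_breaking_domains(placement: dict[str, str]) -> list[str]:
    """Return the failure domains whose loss alone would leave no majority.

    `placement` maps a server name to the shared component it depends on
    (a switch, power circuit, cooling zone, and so on).
    """
    total = len(placement)
    servers_per_domain = Counter(placement.values())
    # The survivors must form a strict majority: 2 * survivors > total.
    return sorted(
        domain
        for domain, lost in servers_per_domain.items()
        if 2 * (total - lost) <= total
    )


if __name__ == "__main__":
    # Hypothetical layout: two of three servers share one switch.
    shared = {"zoo1": "switch-a", "zoo2": "switch-a", "zoo3": "switch-b"}
    print(quorum_breaking_domains(shared))
```

<p>An empty result means no single listed component can take down the
service on its own; it says nothing about dependencies that are missing
from the mapping.</p>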
|
|
|
+<a name="N10189"></a><a name="Single+Machine+Requirements"></a>
|
|
|
+<h4>Single Machine Requirements</h4>
|
|
|
+<p>If ZooKeeper has to contend with other applications for
|
|
|
+ access to resources like storage media, CPU, network, or
|
|
|
+ memory, its performance will suffer markedly. ZooKeeper has
|
|
|
+ strong durability guarantees, which means it uses storage
|
|
|
+ media to log changes before the operation responsible for the
|
|
|
+ change is allowed to complete. Be aware of this dependency,
|
|
|
+ and take great care to ensure
|
|
|
+ that ZooKeeper operations aren’t held up by your media. Here
|
|
|
+ are some things you can do to minimize that sort of
|
|
|
+ degradation:
|
|
|
+ </p>
|
|
|
+<ul>
|
|
|
+
|
|
|
+<li>
|
|
|
+
|
|
|
+<p>ZooKeeper's transaction log must be on a dedicated
|
|
|
+ device. (A dedicated partition is not enough.) ZooKeeper
|
|
|
+ writes the log sequentially, without seeking. Sharing your
|
|
|
+ log device with other processes can cause seeks and
|
|
|
+ contention, which in turn can cause multi-second
|
|
|
+ delays.</p>
|
|
|
+
|
|
|
+</li>
|
|
|
+
|
|
|
+
|
|
|
+<li>
|
|
|
+
|
|
|
+<p>Do not put ZooKeeper in a situation that can cause a
|
|
|
+ swap. In order for ZooKeeper to function with any sort of
|
|
|
+ timeliness, it simply cannot be allowed to swap.
|
|
|
+ Therefore, make certain that the maximum heap size given
|
|
|
+ to ZooKeeper is not bigger than the amount of real memory
|
|
|
+ available to ZooKeeper. For more on this, see
|
|
|
+ <a href="#sc_commonProblems">Things to Avoid</a>
|
|
|
+ below. </p>
|
|
|
+
|
|
|
+</li>
|
|
|
+
|
|
|
+</ul>
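<p>The heap-size advice above amounts to a simple comparison. The sketch
below is illustrative only: the function <em>heap_fits_in_memory</em> and
its parameters are hypothetical, and the amount of memory reserved for the
operating system and other processes is an assumption you must supply
yourself.</p>

```python
def heap_fits_in_memory(max_heap_mb: int, physical_mb: int, reserved_mb: int = 0) -> bool:
    """Return True when the JVM maximum heap fits in the memory left for ZooKeeper.

    max_heap_mb  -- the maximum heap given to the JVM (for example via -Xmx)
    physical_mb  -- real memory installed on the machine
    reserved_mb  -- memory set aside for the OS and any other processes
    """
    if min(max_heap_mb, physical_mb, reserved_mb) < 0:
        raise ValueError("sizes must not be negative")
    return max_heap_mb <= physical_mb - reserved_mb


if __name__ == "__main__":
    # Hypothetical machine: 4 GB of RAM with 1 GB kept back for everything else.
    print(heap_fits_in_memory(3072, 4096, reserved_mb=1024))
    print(heap_fits_in_memory(4096, 4096, reserved_mb=1024))
```

<p>A JVM uses memory beyond its heap, so passing this check is necessary
but not sufficient to rule out swapping.</p>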
|
|
|
+<a name="N101A7"></a><a name="sc_provisioning"></a>
|
|
|
<h3 class="h4">Provisioning</h3>
|
|
|
<p></p>
|
|
|
-<a name="N10172"></a><a name="sc_strengthsAndLimitations"></a>
|
|
|
+<a name="N101B0"></a><a name="sc_strengthsAndLimitations"></a>
|
|
|
<h3 class="h4">Things to Consider: ZooKeeper Strengths and Limitations</h3>
|
|
|
<p></p>
|
|
|
-<a name="N1017B"></a><a name="sc_administering"></a>
|
|
|
+<a name="N101B9"></a><a name="sc_administering"></a>
|
|
|
<h3 class="h4">Administering</h3>
|
|
|
<p></p>
|
|
|
-<a name="N10184"></a><a name="sc_monitoring"></a>
|
|
|
+<a name="N101C2"></a><a name="sc_monitoring"></a>
|
|
|
<h3 class="h4">Monitoring</h3>
|
|
|
<p></p>
|
|
|
-<a name="N1018D"></a><a name="sc_logging"></a>
|
|
|
+<a name="N101CB"></a><a name="sc_logging"></a>
|
|
|
<h3 class="h4">Logging</h3>
|
|
|
<p>ZooKeeper uses <strong>log4j</strong> version 1.2 as
|
|
|
its logging infrastructure. The ZooKeeper default <span class="codefrag filename">log4j.properties</span>
|
|
@@ -644,10 +735,10 @@ server.3=zoo3:2888:3888</span>
|
|
|
<p>For more information, see
|
|
|
<a href="http://logging.apache.org/log4j/1.2/manual.html#defaultInit">Log4j Default Initialization Procedure</a>
|
|
|
of the log4j manual.</p>
|
|
|
-<a name="N101AD"></a><a name="sc_troubleshooting"></a>
|
|
|
+<a name="N101EB"></a><a name="sc_troubleshooting"></a>
|
|
|
<h3 class="h4">Troubleshooting</h3>
|
|
|
<p></p>
|
|
|
-<a name="N101B6"></a><a name="sc_configuration"></a>
|
|
|
+<a name="N101F4"></a><a name="sc_configuration"></a>
|
|
|
<h3 class="h4">Configuration Parameters</h3>
|
|
|
<p>ZooKeeper's behavior is governed by the ZooKeeper configuration
|
|
|
file. This file is designed so that the exact same file can be used by
|
|
@@ -655,7 +746,7 @@ server.3=zoo3:2888:3888</span>
|
|
|
layouts are the same. If servers use different configuration files, care
|
|
|
must be taken to ensure that the list of servers in all of the different
|
|
|
configuration files match.</p>
|
|
|
-<a name="N101BF"></a><a name="sc_minimumConfiguration"></a>
|
|
|
+<a name="N101FD"></a><a name="sc_minimumConfiguration"></a>
|
|
|
<h4>Minimum Configuration</h4>
|
|
|
<p>Here are the minimum configuration keywords that must be defined
|
|
|
in the configuration file:</p>
|
|
@@ -702,7 +793,7 @@ server.3=zoo3:2888:3888</span>
|
|
|
</dd>
|
|
|
|
|
|
</dl>
|
|
|
-<a name="N101E6"></a><a name="sc_advancedConfiguration"></a>
|
|
|
+<a name="N10224"></a><a name="sc_advancedConfiguration"></a>
|
|
|
<h4>Advanced Configuration</h4>
|
|
|
<p>The configuration settings in the section are optional. You can
|
|
|
use them to further fine tune the behaviour of your ZooKeeper servers.
|
|
@@ -793,7 +884,7 @@ server.3=zoo3:2888:3888</span>
|
|
|
</dd>
|
|
|
|
|
|
</dl>
|
|
|
-<a name="N10246"></a><a name="sc_clusterOptions"></a>
|
|
|
+<a name="N10284"></a><a name="sc_clusterOptions"></a>
|
|
|
<h4>Cluster Options</h4>
|
|
|
<p>The options in this section are designed for use with an ensemble
|
|
|
of servers -- that is, when deploying clusters of servers.</p>
|
|
@@ -883,7 +974,7 @@ server.3=zoo3:2888:3888</span>
|
|
|
|
|
|
</dl>
|
|
|
<p></p>
|
|
|
-<a name="N102A3"></a><a name="Unsafe+Options"></a>
|
|
|
+<a name="N102E1"></a><a name="Unsafe+Options"></a>
|
|
|
<h4>Unsafe Options</h4>
|
|
|
<p>The following options can be useful, but be careful when you use
|
|
|
them. The risk of each is explained along with the explanation of what
|
|
@@ -928,7 +1019,7 @@ server.3=zoo3:2888:3888</span>
|
|
|
</dd>
|
|
|
|
|
|
</dl>
|
|
|
-<a name="N102D5"></a><a name="sc_zkCommands"></a>
|
|
|
+<a name="N10313"></a><a name="sc_zkCommands"></a>
|
|
|
<h3 class="h4">ZooKeeper Commands: The Four Letter Words</h3>
|
|
|
<p>ZooKeeper responds to a small set of commands. Each command is
|
|
|
composed of four letters. You issue the commands to ZooKeeper via telnet
|
|
@@ -993,7 +1084,7 @@ server.3=zoo3:2888:3888</span>
|
|
|
<pre class="code">$ echo ruok | nc 127.0.0.1 5111
|
|
|
imok
|
|
|
</pre>
|
|
|
-<a name="N10315"></a><a name="sc_dataFileManagement"></a>
|
|
|
+<a name="N10353"></a><a name="sc_dataFileManagement"></a>
|
|
|
<h3 class="h4">Data File Management</h3>
|
|
|
<p>ZooKeeper stores its data in a data directory and its transaction
|
|
|
log in a transaction log directory. By default these two directories are
|
|
@@ -1001,7 +1092,7 @@ imok
|
|
|
transaction log files in a separate directory than the data files.
|
|
|
Throughput increases and latency decreases when transaction logs reside
|
|
|
on a dedicated log device.</p>
|
|
|
-<a name="N1031E"></a><a name="The+Data+Directory"></a>
|
|
|
+<a name="N1035C"></a><a name="The+Data+Directory"></a>
|
|
|
<h4>The Data Directory</h4>
|
|
|
<p>This directory has two files in it:</p>
|
|
|
<ul>
|
|
@@ -1047,14 +1138,14 @@ imok
|
|
|
idempotent nature of its updates. By replaying the transaction log
|
|
|
against fuzzy snapshots ZooKeeper gets the state of the system at the
|
|
|
end of the log.</p>
|
|
|
-<a name="N1035A"></a><a name="The+Log+Directory"></a>
|
|
|
+<a name="N10398"></a><a name="The+Log+Directory"></a>
|
|
|
<h4>The Log Directory</h4>
|
|
|
<p>The Log Directory contains the ZooKeeper transaction logs.
|
|
|
Before any update takes place, ZooKeeper ensures that the transaction
|
|
|
that represents the update is written to non-volatile storage. A new
|
|
|
log file is started each time a snapshot is begun. The log file's
|
|
|
suffix is the first zxid written to that log.</p>
|
|
|
-<a name="N10364"></a><a name="File+Management"></a>
|
|
|
+<a name="N103A2"></a><a name="File+Management"></a>
|
|
|
<h4>File Management</h4>
|
|
|
<p>The format of snapshot and log files does not change between
|
|
|
standalone ZooKeeper servers and different configurations of
|
|
@@ -1071,7 +1162,7 @@ imok
|
|
|
needs the latest complete fuzzy snapshot and the log files from the
|
|
|
start of that snapshot. The PurgeTxnLog utility implements a simple
|
|
|
retention policy that administrators can use.</p>
|
|
|
-<a name="N10375"></a><a name="sc_commonProblems"></a>
|
|
|
+<a name="N103B3"></a><a name="sc_commonProblems"></a>
|
|
|
<h3 class="h4">Things to Avoid</h3>
|
|
|
<p>Here are some common problems you can avoid by configuring
|
|
|
ZooKeeper correctly:</p>
|
|
@@ -1125,7 +1216,7 @@ imok
|
|
|
</dd>
|
|
|
|
|
|
</dl>
|
|
|
-<a name="N10399"></a><a name="sc_bestPractices"></a>
|
|
|
+<a name="N103D7"></a><a name="sc_bestPractices"></a>
|
|
|
<h3 class="h4">Best Practices</h3>
|
|
|
<p>For best results, take note of the following list of good
|
|
|
ZooKeeper practices. <em>[tbd...]</em>
|