瀏覽代碼

HADOOP-13839. Fix outdated tracing documentation. Contributed by Elek, Marton.

Masatake Iwasaki 8 年之前
父節點
當前提交
04ccb1bec5

+ 3 - 0
hadoop-common-project/hadoop-common/CHANGES.txt

@@ -107,6 +107,9 @@ Release 2.7.4 - UNRELEASED
     HADOOP-11859. PseudoAuthenticationHandler fails with httpcomponents v4.4.
     HADOOP-11859. PseudoAuthenticationHandler fails with httpcomponents v4.4.
     (Eugene Koifman via jitendra)
     (Eugene Koifman via jitendra)
 
 
+    HADOOP-13839. Fix outdated tracing documentation.
+    (Elek, Marton via iwasakims)
+
 Release 2.7.3 - 2016-08-25
 Release 2.7.3 - 2016-08-25
 
 
   INCOMPATIBLE CHANGES
   INCOMPATIBLE CHANGES

+ 25 - 19
hadoop-common-project/hadoop-common/src/site/markdown/Tracing.md

@@ -12,6 +12,7 @@
   limitations under the License. See accompanying LICENSE file.
   limitations under the License. See accompanying LICENSE file.
 -->
 -->
 
 
+
 Enabling Dapper-like Tracing in Hadoop
 Enabling Dapper-like Tracing in Hadoop
 ======================================
 ======================================
 
 
@@ -24,7 +25,8 @@ Enabling Dapper-like Tracing in Hadoop
         * [Dynamic update of tracing configuration](#Dynamic_update_of_tracing_configuration)
         * [Dynamic update of tracing configuration](#Dynamic_update_of_tracing_configuration)
         * [Starting tracing spans by HTrace API](#Starting_tracing_spans_by_HTrace_API)
         * [Starting tracing spans by HTrace API](#Starting_tracing_spans_by_HTrace_API)
         * [Sample code for tracing](#Sample_code_for_tracing)
         * [Sample code for tracing](#Sample_code_for_tracing)
-  
+
+
 Dapper-like Tracing in Hadoop
 Dapper-like Tracing in Hadoop
 -----------------------------
 -----------------------------
 
 
@@ -37,14 +39,14 @@ Setting up tracing is quite simple, however it requires some very minor changes
 
 
 ### Samplers
 ### Samplers
 
 
-Configure the samplers in `core-site.xml` property: `hadoop.htrace.sampler`.
+Configure the samplers in `core-site.xml` property: `dfs.htrace.sampler`.
 The value can be NeverSampler, AlwaysSampler or ProbabilitySampler.
 The value can be NeverSampler, AlwaysSampler or ProbabilitySampler.
 NeverSampler: HTrace is OFF for all spans;
 NeverSampler: HTrace is OFF for all spans;
 AlwaysSampler: HTrace is ON for all spans;
 AlwaysSampler: HTrace is ON for all spans;
 ProbabilitySampler: HTrace is ON for some percentage% of top-level spans.
 ProbabilitySampler: HTrace is ON for some percentage% of top-level spans.
 
 
       <property>
       <property>
-        <name>hadoop.htrace.sampler</name>
+        <name>dfs.htrace.sampler</name>
         <value>NeverSampler</value>
         <value>NeverSampler</value>
       </property>
       </property>
 
 
@@ -61,18 +63,18 @@ by putting a comma separated list of the fully-qualified class name of classes i
 in `core-site.xml` property: `hadoop.htrace.spanreceiver.classes`.
 in `core-site.xml` property: `hadoop.htrace.spanreceiver.classes`.
 
 
       <property>
       <property>
-        <name>hadoop.htrace.spanreceiver.classes</name>
+        <name>dfs.htrace.spanreceiver.classes</name>
         <value>org.apache.htrace.impl.LocalFileSpanReceiver</value>
         <value>org.apache.htrace.impl.LocalFileSpanReceiver</value>
       </property>
       </property>
       <property>
       <property>
-        <name>hadoop.htrace.local-file-span-receiver.path</name>
+        <name>dfs.htrace.local-file-span-receiver.path</name>
         <value>/var/log/hadoop/htrace.out</value>
         <value>/var/log/hadoop/htrace.out</value>
       </property>
       </property>
 
 
 You can omit package name prefix if you use span receiver bundled with HTrace.
 You can omit package name prefix if you use span receiver bundled with HTrace.
 
 
       <property>
       <property>
-        <name>hadoop.htrace.spanreceiver.classes</name>
+        <name>dfs.htrace.spanreceiver.classes</name>
         <value>LocalFileSpanReceiver</value>
         <value>LocalFileSpanReceiver</value>
       </property>
       </property>
 
 
@@ -83,30 +85,32 @@ you can use `ZipkinSpanReceiver` which uses
 [Zipkin](https://github.com/twitter/zipkin) for collecting and displaying tracing data.
 [Zipkin](https://github.com/twitter/zipkin) for collecting and displaying tracing data.
 
 
 In order to use `ZipkinSpanReceiver`,
 In order to use `ZipkinSpanReceiver`,
-you need to download and setup [Zipkin](https://github.com/twitter/zipkin) first.
+you need to download and setup [Zipkin](https://github.com/twitter/zipkin) first. With docker you can start an experimental zipkin server with the following command.
+
+```
+docker run -d -p 9411:9411 -p 9410:9410 openzipkin/zipkin
+```
 
 
 you also need to add the jar of `htrace-zipkin` to the classpath of Hadoop on each node.
 you also need to add the jar of `htrace-zipkin` to the classpath of Hadoop on each node.
-Here is example setup procedure.
+The easiest way to achieve this is downloading the binary jars from maven central add put it to the hadoop lib directory.
 
 
-      $ git clone https://github.com/cloudera/htrace
-      $ cd htrace/htrace-zipkin
-      $ mvn compile assembly:single
-      $ cp target/htrace-zipkin-*-jar-with-dependencies.jar $HADOOP_HOME/share/hadoop/common/lib/
+    $ wget http://repo1.maven.org/maven2/org/apache/thrift/libthrift/0.9.0/libthrift-0.9.0.jar -O $HADOOP_HOME/share/hadoop/common/lib/libthrift-0.9.0.jar
+    $ wget http://repo1.maven.org/maven2/org/apache/htrace/htrace-zipkin/3.1.0-incubating/htrace-zipkin-3.1.0-incubating.jar -O $HADOOP_HOME/share/hadoop/common/lib/htrace-zipkin-3.1.0-incubating.jar
 
 
 The sample configuration for `ZipkinSpanReceiver` is shown below.
 The sample configuration for `ZipkinSpanReceiver` is shown below.
 By adding these to `core-site.xml` of NameNode and DataNodes, `ZipkinSpanReceiver` is initialized on the startup.
 By adding these to `core-site.xml` of NameNode and DataNodes, `ZipkinSpanReceiver` is initialized on the startup.
 You also need this configuration on the client node in addition to the servers.
 You also need this configuration on the client node in addition to the servers.
 
 
       <property>
       <property>
-        <name>hadoop.htrace.spanreceiver.classes</name>
+        <name>dfs.htrace.spanreceiver.classes</name>
         <value>ZipkinSpanReceiver</value>
         <value>ZipkinSpanReceiver</value>
       </property>
       </property>
       <property>
       <property>
-        <name>hadoop.htrace.zipkin.collector-hostname</name>
+        <name>dfs.htrace.zipkin.collector-hostname</name>
         <value>192.168.1.2</value>
         <value>192.168.1.2</value>
       </property>
       </property>
       <property>
       <property>
-        <name>hadoop.htrace.zipkin.collector-port</name>
+        <name>dfs.htrace.zipkin.collector-port</name>
         <value>9410</value>
         <value>9410</value>
       </property>
       </property>
 
 
@@ -136,8 +140,8 @@ You need to run the command against all servers if you want to update the config
 You need to specify the class name of span receiver as argument of `-class` option.
 You need to specify the class name of span receiver as argument of `-class` option.
 You can specify the configuration associated with span receiver by `-Ckey=value` options.
 You can specify the configuration associated with span receiver by `-Ckey=value` options.
 
 
-      $ hadoop trace -add -class LocalFileSpanReceiver -Chadoop.htrace.local-file-span-receiver.path=/tmp/htrace.out -host 192.168.56.2:9000
-      Added trace span receiver 2 with configuration hadoop.htrace.local-file-span-receiver.path = /tmp/htrace.out
+      $ hadoop trace -add -class LocalFileSpanReceiver -Clocal-file-span-receiver.path=/tmp/htrace.out -host 192.168.56.2:9000
+      Added trace span receiver 2 with configuration local-file-span-receiver.path = /tmp/htrace.out
 
 
       $ hadoop trace -list -host 192.168.56.2:9000
       $ hadoop trace -list -host 192.168.56.2:9000
       ID  CLASS
       ID  CLASS
@@ -159,7 +163,7 @@ In addition, you need to initialize `SpanReceiver` once per process.
 
 
     ...
     ...
 
 
-        SpanReceiverHost.getInstance(new HdfsConfiguration());
+        SpanReceiverHost.get(new HdfsConfiguration(), "dfs");
 
 
     ...
     ...
 
 
@@ -189,7 +193,7 @@ which start tracing span before invoking HDFS shell command.
         FsShell shell = new FsShell();
         FsShell shell = new FsShell();
         conf.setQuietMode(false);
         conf.setQuietMode(false);
         shell.setConf(conf);
         shell.setConf(conf);
-        SpanReceiverHost.getInstance(conf);
+        SpanReceiverHost.get(conf, "dfs");
         int res = 0;
         int res = 0;
         TraceScope ts = null;
         TraceScope ts = null;
         try {
         try {
@@ -207,3 +211,5 @@ You can compile and execute this code as shown below.
 
 
     $ javac -cp `hadoop classpath` TracingFsShell.java
     $ javac -cp `hadoop classpath` TracingFsShell.java
     $ java -cp .:`hadoop classpath` TracingFsShell -ls /
     $ java -cp .:`hadoop classpath` TracingFsShell -ls /
+
+The configuration prefix for the client-side htrace configuration is defined in the SpanReceiverHost.get call. In the case above you should use the dfs prefix as on the server side. (*dfs*.htrace…. configuration values)