MAPREDUCE-6260. Convert site documentation to markdown (Masatake Iwasaki via aw)

Allen Wittenauer 10 years ago
Parent
Current commit
8b787e2fdb
17 files changed, 6586 insertions and 7902 deletions
  1. 3 0
      hadoop-mapreduce-project/CHANGES.txt
  2. 0 151
      hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/DistributedCacheDeploy.apt.vm
  3. 0 320
      hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/EncryptedShuffle.apt.vm
  4. 0 1605
      hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/MapReduceTutorial.apt.vm
  5. 0 114
      hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/MapReduce_Compatibility_Hadoop1_Hadoop2.apt.vm
  6. 0 2709
      hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/MapredAppMasterRest.apt.vm
  7. 0 233
      hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/MapredCommands.apt.vm
  8. 0 98
      hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/PluggableShuffleAndPluggableSort.apt.vm
  9. 119 0
      hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/DistributedCacheDeploy.md.vm
  10. 255 0
      hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/EncryptedShuffle.md
  11. 1156 0
      hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/MapReduceTutorial.md
  12. 69 0
      hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/MapReduce_Compatibility_Hadoop1_Hadoop2.md
  13. 2397 0
      hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/MapredAppMasterRest.md
  14. 153 0
      hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/MapredCommands.md
  15. 73 0
      hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/PluggableShuffleAndPluggableSort.md
  16. 0 2672
      hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/site/apt/HistoryServerRest.apt.vm
  17. 2361 0
      hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/site/markdown/HistoryServerRest.md

+ 3 - 0
hadoop-mapreduce-project/CHANGES.txt

@@ -96,6 +96,9 @@ Trunk (Unreleased)
 
     MAPREDUCE-6250. deprecate sbin/mr-jobhistory-daemon.sh (aw)
 
+    MAPREDUCE-6260. Convert site documentation to markdown (Masatake Iwasaki
+    via aw)
+
   BUG FIXES
 
     MAPREDUCE-6191. Improve clearing stale state of Java serialization

+ 0 - 151
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/DistributedCacheDeploy.apt.vm

@@ -1,151 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~   http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License. See accompanying LICENSE file.
-
-  ---
-  Hadoop Map Reduce Next Generation-${project.version} - Distributed Cache Deploy
-  ---
-  ---
-  ${maven.build.timestamp}
-
-Hadoop MapReduce Next Generation - Distributed Cache Deploy
-
-* Introduction
-
-  The MapReduce application framework has rudimentary support for deploying a
-  new version of the MapReduce framework via the distributed cache. By setting
-  the appropriate configuration properties, users can run a different version
-  of MapReduce than the one initially deployed to the cluster. For example,
-  cluster administrators can place multiple versions of MapReduce in HDFS and
-  configure <<<mapred-site.xml>>> to specify which version jobs will use by
-  default. This allows the administrators to perform a rolling upgrade of the
-  MapReduce framework under certain conditions.
-
-* Preconditions and Limitations
-
-  The support for deploying the MapReduce framework via the distributed cache
-  currently does not address the job client code used to submit and query
-  jobs. It also does not address the <<<ShuffleHandler>>> code that runs as an
-  auxilliary service within each NodeManager. As a result the following
-  limitations apply to MapReduce versions that can be successfully deployed via
-  the distributed cache in a rolling upgrade fashion:
-
-  * The MapReduce version must be compatible with the job client code used to
-    submit and query jobs. If it is incompatible then the job client must be
-    upgraded separately on any node from which jobs using the new MapReduce
-    version will be submitted or queried.
-
-  * The MapReduce version must be compatible with the configuration files used
-    by the job client submitting the jobs. If it is incompatible with that
-    configuration (e.g.: a new property must be set or an existing property
-    value changed) then the configuration must be updated first.
-
-  * The MapReduce version must be compatible with the <<<ShuffleHandler>>>
-    version running on the nodes in the cluster. If it is incompatible then the
-    new <<<ShuffleHandler>>> code must be deployed to all the nodes in the
-    cluster, and the NodeManagers must be restarted to pick up the new
-    <<<ShuffleHandler>>> code.
-
-* Deploying a New MapReduce Version via the Distributed Cache
-
-  Deploying a new MapReduce version consists of three steps:
-
-  [[1]] Upload the MapReduce archive to a location that can be accessed by the
-  job submission client. Ideally the archive should be on the cluster's default
-  filesystem at a publicly-readable path. See the archive location discussion
-  below for more details.
-
-  [[2]] Configure <<<mapreduce.application.framework.path>>> to point to the
-  location where the archive is located. As when specifying distributed cache
-  files for a job, this is a URL that also supports creating an alias for the
-  archive if a URL fragment is specified. For example,
-  <<<hdfs:/mapred/framework/hadoop-mapreduce-${project.version}.tar.gz#mrframework>>>
-  will be localized as <<<mrframework>>> rather than
-  <<<hadoop-mapreduce-${project.version}.tar.gz>>>.
-
-  [[3]] Configure <<<mapreduce.application.classpath>>> to set the proper
-  classpath to use with the MapReduce archive configured above. NOTE: An error
-  occurs if <<<mapreduce.application.framework.path>>> is configured but
-  <<<mapreduce.application.classpath>>> does not reference the base name of the
-  archive path or the alias if an alias was specified.
-
-** Location of the MapReduce Archive and How It Affects Job Performance
-
-  Note that the location of the MapReduce archive can be critical to job
-  submission and job startup performance. If the archive is not located on the
-  cluster's default filesystem then it will be copied to the job staging
-  directory for each job and localized to each node where the job's tasks
-  run. This will slow down job submission and task startup performance.
-
-  If the archive is located on the default filesystem then the job client will
-  not upload the archive to the job staging directory for each job
-  submission. However if the archive path is not readable by all cluster users
-  then the archive will be localized separately for each user on each node
-  where tasks execute. This can cause unnecessary duplication in the
-  distributed cache.
-
-  When working with a large cluster it can be important to increase the
-  replication factor of the archive to increase its availability. This will
-  spread the load when the nodes in the cluster localize the archive for the
-  first time.
-
-* MapReduce Archives and Classpath Configuration
-
-  Setting a proper classpath for the MapReduce archive depends upon the
-  composition of the archive and whether it has any additional dependencies.
-  For example, the archive can contain not only the MapReduce jars but also the
-  necessary YARN, HDFS, and Hadoop Common jars and all other dependencies. In
-  that case, <<<mapreduce.application.classpath>>> would be configured to
-  something like the following example, where the archive basename is
-  hadoop-mapreduce-${project.version}.tar.gz and the archive is organized
-  internally similar to the standard Hadoop distribution archive:
-
-    <<<$HADOOP_CONF_DIR,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/mapreduce/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/mapreduce/lib/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/common/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/common/lib/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/yarn/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/yarn/lib/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/hdfs/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/hdfs/lib/*>>>
-
-  Another possible approach is to have the archive consist of just the
-  MapReduce jars and have the remaining dependencies picked up from the Hadoop
-  distribution installed on the nodes.  In that case, the above example would
-  change to something like the following:
-
-    <<<$HADOOP_CONF_DIR,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/mapreduce/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/mapreduce/lib/*,$HADOOP_COMMON_HOME/share/hadoop/common/*,$HADOOP_COMMON_HOME/share/hadoop/common/lib/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,$HADOOP_YARN_HOME/share/hadoop/yarn/*,$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*>>>
-
-** NOTE: 
-
-  If shuffle encryption is also enabled in the cluster, then we could meet the problem that MR job get failed with exception like below: 
-  
-+---+
-2014-10-10 02:17:16,600 WARN [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to junpingdu-centos5-3.cs1cloud.internal:13562 with 1 map outputs
-javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
-    at com.sun.net.ssl.internal.ssl.Alerts.getSSLException(Alerts.java:174)
-    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1731)
-    at com.sun.net.ssl.internal.ssl.Handshaker.fatalSE(Handshaker.java:241)
-    at com.sun.net.ssl.internal.ssl.Handshaker.fatalSE(Handshaker.java:235)
-    at com.sun.net.ssl.internal.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1206)
-    at com.sun.net.ssl.internal.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:136)
-    at com.sun.net.ssl.internal.ssl.Handshaker.processLoop(Handshaker.java:593)
-    at com.sun.net.ssl.internal.ssl.Handshaker.process_record(Handshaker.java:529)
-    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:925)
-    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1170)
-    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1197)
-    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1181)
-    at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:434)
-    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.setNewClient(AbstractDelegateHttpsURLConnection.java:81)
-    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.setNewClient(AbstractDelegateHttpsURLConnection.java:61)
-    at sun.net.www.protocol.http.HttpURLConnection.writeRequests(HttpURLConnection.java:584)
-    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1193)
-    at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:379)
-    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:318)
-    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.verifyConnection(Fetcher.java:427)
-....
-
-+---+
-
-  This is because MR client (deployed from HDFS) cannot access ssl-client.xml in local FS under directory of $HADOOP_CONF_DIR. To fix the problem, we can add the directory with ssl-client.xml to the classpath of MR which is specified in "mapreduce.application.classpath" as mentioned above. To avoid MR application being affected by other local configurations, it is better to create a dedicated directory for putting ssl-client.xml, e.g. a sub-directory under $HADOOP_CONF_DIR, like: $HADOOP_CONF_DIR/security.

+ 0 - 320
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/EncryptedShuffle.apt.vm

@@ -1,320 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~   http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License. See accompanying LICENSE file.
-
-  ---
-  Hadoop Map Reduce Next Generation-${project.version} - Encrypted Shuffle
-  ---
-  ---
-  ${maven.build.timestamp}
-
-Hadoop MapReduce Next Generation - Encrypted Shuffle
-
-* {Introduction}
-
-  The Encrypted Shuffle capability allows encryption of the MapReduce shuffle
-  using HTTPS and with optional client authentication (also known as
-  bi-directional HTTPS, or HTTPS with client certificates). It comprises:
-
-  * A Hadoop configuration setting for toggling the shuffle between HTTP and
-    HTTPS.
-
-  * A Hadoop configuration settings for specifying the keystore and truststore
-   properties (location, type, passwords) used by the shuffle service and the
-   reducers tasks fetching shuffle data.
-
-  * A way to re-load truststores across the cluster (when a node is added or
-    removed).
-
-* {Configuration}
-
-**  <<core-site.xml>> Properties
-
-  To enable encrypted shuffle, set the following properties in core-site.xml of
-  all nodes in the cluster:
-
-*--------------------------------------+---------------------+-----------------+
-| <<Property>>                         | <<Default Value>>   | <<Explanation>> |
-*--------------------------------------+---------------------+-----------------+
-| <<<hadoop.ssl.require.client.cert>>> | <<<false>>>         | Whether client certificates are required |
-*--------------------------------------+---------------------+-----------------+
-| <<<hadoop.ssl.hostname.verifier>>>   | <<<DEFAULT>>>       | The hostname verifier to provide for HttpsURLConnections. Valid values are: <<DEFAULT>>, <<STRICT>>, <<STRICT_I6>>, <<DEFAULT_AND_LOCALHOST>> and <<ALLOW_ALL>> |
-*--------------------------------------+---------------------+-----------------+
-| <<<hadoop.ssl.keystores.factory.class>>> | <<<org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory>>> | The KeyStoresFactory implementation to use |
-*--------------------------------------+---------------------+-----------------+
-| <<<hadoop.ssl.server.conf>>>         | <<<ssl-server.xml>>> | Resource file from which ssl server keystore information will be extracted. This file is looked up in the classpath, typically it should be in Hadoop conf/ directory |
-*--------------------------------------+---------------------+-----------------+
-| <<<hadoop.ssl.client.conf>>>         | <<<ssl-client.xml>>> | Resource file from which ssl server keystore information will be extracted. This file is looked up in the classpath, typically it should be in Hadoop conf/ directory |
-*--------------------------------------+---------------------+-----------------+
-| <<<hadoop.ssl.enabled.protocols>>>   | <<<TLSv1>>>         | The supported SSL protocols (JDK6 can use <<TLSv1>>, JDK7+ can use <<TLSv1,TLSv1.1,TLSv1.2>>) |
-*--------------------------------------+---------------------+-----------------+
-
-  <<IMPORTANT:>> Currently requiring client certificates should be set to false.
-  Refer the {{{ClientCertificates}Client Certificates}} section for details.
-
-  <<IMPORTANT:>> All these properties should be marked as final in the cluster
-  configuration files.
-
-*** Example:
-
-------
-    ...
-    <property>
-      <name>hadoop.ssl.require.client.cert</name>
-      <value>false</value>
-      <final>true</final>
-    </property>
-
-    <property>
-      <name>hadoop.ssl.hostname.verifier</name>
-      <value>DEFAULT</value>
-      <final>true</final>
-    </property>
-
-    <property>
-      <name>hadoop.ssl.keystores.factory.class</name>
-      <value>org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory</value>
-      <final>true</final>
-    </property>
-
-    <property>
-      <name>hadoop.ssl.server.conf</name>
-      <value>ssl-server.xml</value>
-      <final>true</final>
-    </property>
-
-    <property>
-      <name>hadoop.ssl.client.conf</name>
-      <value>ssl-client.xml</value>
-      <final>true</final>
-    </property>
-    ...
-------
-
-**  <<<mapred-site.xml>>> Properties
-
-  To enable encrypted shuffle, set the following property in mapred-site.xml
-  of all nodes in the cluster:
-
-*--------------------------------------+---------------------+-----------------+
-| <<Property>>                         | <<Default Value>>   | <<Explanation>> |
-*--------------------------------------+---------------------+-----------------+
-| <<<mapreduce.shuffle.ssl.enabled>>>  | <<<false>>>         | Whether encrypted shuffle is enabled |
-*--------------------------------------+---------------------+-----------------+
-
-  <<IMPORTANT:>> This property should be marked as final in the cluster
-  configuration files.
-
-*** Example:
-
-------
-    ...
-    <property>
-      <name>mapreduce.shuffle.ssl.enabled</name>
-      <value>true</value>
-      <final>true</final>
-    </property>
-    ...
-------
-
-  The Linux container executor should be set to prevent job tasks from
-  reading the server keystore information and gaining access to the shuffle
-  server certificates.
-
-  Refer to Hadoop Kerberos configuration for details on how to do this.
-
-* {Keystore and Truststore Settings}
-
-  Currently <<<FileBasedKeyStoresFactory>>> is the only <<<KeyStoresFactory>>>
-  implementation. The <<<FileBasedKeyStoresFactory>>> implementation uses the
-  following properties, in the <<ssl-server.xml>> and <<ssl-client.xml>> files,
-  to configure the keystores and truststores.
-
-** <<<ssl-server.xml>>> (Shuffle server) Configuration:
-
-  The mapred user should own the <<ssl-server.xml>> file and have exclusive
-  read access to it.
-
-*---------------------------------------------+---------------------+-----------------+
-| <<Property>>                                | <<Default Value>>   | <<Explanation>> |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.server.keystore.type>>>              | <<<jks>>>           | Keystore file type |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.server.keystore.location>>>          | NONE                | Keystore file location. The mapred user should own this file and have exclusive read access to it. |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.server.keystore.password>>>          | NONE                | Keystore file password |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.server.truststore.type>>>            | <<<jks>>>           | Truststore file type |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.server.truststore.location>>>        | NONE                | Truststore file location. The mapred user should own this file and have exclusive read access to it. |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.server.truststore.password>>>        | NONE                | Truststore file password |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.server.truststore.reload.interval>>> | 10000               | Truststore reload interval, in milliseconds |
-*--------------------------------------+----------------------------+-----------------+
-
-*** Example:
-
-------
-<configuration>
-
-  <!-- Server Certificate Store -->
-  <property>
-    <name>ssl.server.keystore.type</name>
-    <value>jks</value>
-  </property>
-  <property>
-    <name>ssl.server.keystore.location</name>
-    <value>${user.home}/keystores/server-keystore.jks</value>
-  </property>
-  <property>
-    <name>ssl.server.keystore.password</name>
-    <value>serverfoo</value>
-  </property>
-
-  <!-- Server Trust Store -->
-  <property>
-    <name>ssl.server.truststore.type</name>
-    <value>jks</value>
-  </property>
-  <property>
-    <name>ssl.server.truststore.location</name>
-    <value>${user.home}/keystores/truststore.jks</value>
-  </property>
-  <property>
-    <name>ssl.server.truststore.password</name>
-    <value>clientserverbar</value>
-  </property>
-  <property>
-    <name>ssl.server.truststore.reload.interval</name>
-    <value>10000</value>
-  </property>
-</configuration>
-------
-
-** <<<ssl-client.xml>>> (Reducer/Fetcher) Configuration:
-
-  The mapred user should own the <<ssl-client.xml>> file and it should have
-  default permissions.
-
-*---------------------------------------------+---------------------+-----------------+
-| <<Property>>                                | <<Default Value>>   | <<Explanation>> |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.client.keystore.type>>>              | <<<jks>>>           | Keystore file type |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.client.keystore.location>>>          | NONE                | Keystore file location. The mapred user should own this file and it should have default permissions. |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.client.keystore.password>>>          | NONE                | Keystore file password |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.client.truststore.type>>>            | <<<jks>>>           | Truststore file type |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.client.truststore.location>>>        | NONE                | Truststore file location. The mapred user should own this file and it should have default permissions. |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.client.truststore.password>>>        | NONE                | Truststore file password |
-*---------------------------------------------+---------------------+-----------------+
-| <<<ssl.client.truststore.reload.interval>>> | 10000                | Truststore reload interval, in milliseconds |
-*--------------------------------------+----------------------------+-----------------+
-
-*** Example:
-
-------
-<configuration>
-
-  <!-- Client certificate Store -->
-  <property>
-    <name>ssl.client.keystore.type</name>
-    <value>jks</value>
-  </property>
-  <property>
-    <name>ssl.client.keystore.location</name>
-    <value>${user.home}/keystores/client-keystore.jks</value>
-  </property>
-  <property>
-    <name>ssl.client.keystore.password</name>
-    <value>clientfoo</value>
-  </property>
-
-  <!-- Client Trust Store -->
-  <property>
-    <name>ssl.client.truststore.type</name>
-    <value>jks</value>
-  </property>
-  <property>
-    <name>ssl.client.truststore.location</name>
-    <value>${user.home}/keystores/truststore.jks</value>
-  </property>
-  <property>
-    <name>ssl.client.truststore.password</name>
-    <value>clientserverbar</value>
-  </property>
-  <property>
-    <name>ssl.client.truststore.reload.interval</name>
-    <value>10000</value>
-  </property>
-</configuration>
-------
-
-* Activating Encrypted Shuffle
-
-  When you have made the above configuration changes, activate Encrypted
-  Shuffle by re-starting all NodeManagers.
-
-  <<IMPORTANT:>> Using encrypted shuffle will incur in a significant
-  performance impact. Users should profile this and potentially reserve
-  1 or more cores for encrypted shuffle.
-
-* {ClientCertificates} Client Certificates
-
-  Using Client Certificates does not fully ensure that the client is a
-  reducer task for the job. Currently, Client Certificates (their private key)
-  keystore files must be readable by all users submitting jobs to the cluster.
-  This means that a rogue job could read such those keystore files and use
-  the client certificates in them to establish a secure connection with a
-  Shuffle server. However, unless the rogue job has a proper JobToken, it won't
-  be able to retrieve shuffle data from the Shuffle server. A job, using its
-  own JobToken, can only retrieve shuffle data that belongs to itself.
-
-* Reloading Truststores
-
-  By default the truststores will reload their configuration every 10 seconds.
-  If a new truststore file is copied over the old one, it will be re-read,
-  and its certificates will replace the old ones. This mechanism is useful for
-  adding or removing nodes from the cluster, or for adding or removing trusted
-  clients. In these cases, the client or NodeManager certificate is added to
-  (or removed from) all the truststore files in the system, and the new
-  configuration will be picked up without you having to restart the NodeManager
-  daemons.
-
-* Debugging
-
-  <<NOTE:>> Enable debugging only for troubleshooting, and then only for jobs
-  running on small amounts of data. It is very verbose and slows down jobs by
-  several orders of magnitude. (You might need to increase mapred.task.timeout
-  to prevent jobs from failing because tasks run so slowly.)
-
-  To enable SSL debugging in the reducers, set <<<-Djavax.net.debug=all>>> in
-  the <<<mapreduce.reduce.child.java.opts>>> property; for example:
-
-------
-  <property>
-    <name>mapred.reduce.child.java.opts</name>
-    <value>-Xmx-200m -Djavax.net.debug=all</value>
-  </property>
-------
-
-  You can do this on a per-job basis, or by means of a cluster-wide setting in
-  the <<<mapred-site.xml>>> file.
-
-  To set this property in NodeManager, set it in the <<<yarn-env.sh>>> file:
-
-------
-  YARN_NODEMANAGER_OPTS="-Djavax.net.debug=all $YARN_NODEMANAGER_OPTS"
-------

+ 0 - 1605
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/MapReduceTutorial.apt.vm

@@ -1,1605 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~   http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License. See accompanying LICENSE file.
-
-  ---
-  MapReduce Tutorial
-  ---
-  ---
-  ${maven.build.timestamp}
-
-MapReduce Tutorial
-
-%{toc|section=1|fromDepth=0|toDepth=4}
-
-* Purpose
-
-  This document comprehensively describes all user-facing facets of
-  the Hadoop MapReduce framework and serves as a tutorial.
-
-* Prerequisites
-
-  Ensure that Hadoop is installed, configured and is running. More details:
-
-  * {{{../../hadoop-project-dist/hadoop-common/SingleCluster.html}
-    Single Node Setup}} for first-time users.
-
-  * {{{../../hadoop-project-dist/hadoop-common/ClusterSetup.html}
-    Cluster Setup}} for large, distributed clusters.
-
-* Overview
-
-  Hadoop MapReduce is a software framework for easily writing applications
-  which process vast amounts of data (multi-terabyte data-sets) in-parallel
-  on large clusters (thousands of nodes) of commodity hardware in a reliable,
-  fault-tolerant manner.
-
-  A MapReduce <job> usually splits the input data-set into independent chunks
-  which are processed by the <map tasks> in a completely parallel manner. The
-  framework sorts the outputs of the maps, which are then input to the <reduce
-  tasks>. Typically both the input and the output of the job are stored in
-  a file-system. The framework takes care of scheduling tasks, monitoring them
-  and re-executes the failed tasks.
-
-  Typically the compute nodes and the storage nodes are the same, that is,
-  the MapReduce framework and the Hadoop Distributed File System
-  (see {{{../../hadoop-project-dist/hadoop-hdfs/HdfsDesign.html}
-  HDFS Architecture Guide}}) are running on the same set of nodes. This
-  configuration allows the framework to effectively schedule tasks on the nodes
-  where data is already present, resulting in very high aggregate bandwidth
-  across the cluster.
-
-  The MapReduce framework consists of a single master <<<ResourceManager>>>,
-  one slave <<<NodeManager>>> per cluster-node, and <<<MRAppMaster>>> per
-  application (see {{{../../hadoop-yarn/hadoop-yarn-site/YARN.html}
-  YARN Architecture Guide}}).
-
-  Minimally, applications specify the input/output locations and supply <map>
-  and <reduce> functions via implementations of appropriate interfaces and/or
-  abstract-classes. These, and other job parameters, comprise the <job
-  configuration>.
-
-  The Hadoop <job client> then submits the job (jar/executable etc.) and
-  configuration to the <<<ResourceManager>>> which then assumes the
-  responsibility of distributing the software/configuration to the slaves,
-  scheduling tasks and monitoring them, providing status and diagnostic
-  information to the job-client.
-
-  Although the Hadoop framework is implemented in Java\u2122, MapReduce
-  applications need not be written in Java.
-
-  * {{{../../api/org/apache/hadoop/streaming/package-summary.html}
-    Hadoop Streaming}} is a utility which allows users to create and run jobs
-    with any executables (e.g. shell utilities) as the mapper and/or the
-    reducer.
-
-  * {{{../../api/org/apache/hadoop/mapred/pipes/package-summary.html}
-    Hadoop Pipes}} is a {{{http://www.swig.org/}SWIG}}-compatible C++ API to
-    implement MapReduce applications (non JNI\u2122 based).
-
-* Inputs and Outputs
-
-  The MapReduce framework operates exclusively on <<<\<key, value\>>>> pairs,
-  that is, the framework views the input to the job as a set of <<<\<key,
-  value\>>>> pairs and produces a set of <<<\<key, value\>>>> pairs as the
-  output of the job, conceivably of different types.
-
-  The <<<key>>> and <<<value>>> classes have to be serializable by the
-  framework and hence need to implement the
-  {{{../../api/org/apache/hadoop/io/Writable.html}Writable}} interface.
-  Additionally, the key classes have to implement the
-  {{{../../api/org/apache/hadoop/io/WritableComparable.html}
-  WritableComparable}} interface to facilitate sorting by the framework.
-
-  Input and Output types of a MapReduce job:
-
-  (input) <<<\<k1, v1\> -\>>>> <<map>> <<<-\> \<k2, v2\> -\>>>> <<combine>>
-  <<<-\> \<k2, v2\> -\>>>> <<reduce>> <<<-\> \<k3, v3\>>>> (output)
-
-* Example: WordCount v1.0
-
-  Before we jump into the details, let's walk through an example MapReduce
-  application to get a flavour for how it works.
-
-  <<<WordCount>>> is a simple application that counts the number of
-  occurrences of each word in a given input set.
-
-  This works with a local-standalone, pseudo-distributed or fully-distributed
-  Hadoop installation
-  ({{{../../hadoop-project-dist/hadoop-common/SingleCluster.html}
-  Single Node Setup}}).
-
-** Source Code
-
-+---+
-import java.io.IOException;
-import java.util.StringTokenizer;
-
-import org.apache.hadoop.conf.Configuration;
-import org.apache.hadoop.fs.Path;
-import org.apache.hadoop.io.IntWritable;
-import org.apache.hadoop.io.Text;
-import org.apache.hadoop.mapreduce.Job;
-import org.apache.hadoop.mapreduce.Mapper;
-import org.apache.hadoop.mapreduce.Reducer;
-import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
-import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
-
-public class WordCount {
-
-  public static class TokenizerMapper
-       extends Mapper<Object, Text, Text, IntWritable>{
-
-    private final static IntWritable one = new IntWritable(1);
-    private Text word = new Text();
-
-    public void map(Object key, Text value, Context context
-                    ) throws IOException, InterruptedException {
-      StringTokenizer itr = new StringTokenizer(value.toString());
-      while (itr.hasMoreTokens()) {
-        word.set(itr.nextToken());
-        context.write(word, one);
-      }
-    }
-  }
-
-  public static class IntSumReducer
-       extends Reducer<Text,IntWritable,Text,IntWritable> {
-    private IntWritable result = new IntWritable();
-
-    public void reduce(Text key, Iterable<IntWritable> values,
-                       Context context
-                       ) throws IOException, InterruptedException {
-      int sum = 0;
-      for (IntWritable val : values) {
-        sum += val.get();
-      }
-      result.set(sum);
-      context.write(key, result);
-    }
-  }
-
-  public static void main(String[] args) throws Exception {
-    Configuration conf = new Configuration();
-    Job job = Job.getInstance(conf, "word count");
-    job.setJarByClass(WordCount.class);
-    job.setMapperClass(TokenizerMapper.class);
-    job.setCombinerClass(IntSumReducer.class);
-    job.setReducerClass(IntSumReducer.class);
-    job.setOutputKeyClass(Text.class);
-    job.setOutputValueClass(IntWritable.class);
-    FileInputFormat.addInputPath(job, new Path(args[0]));
-    FileOutputFormat.setOutputPath(job, new Path(args[1]));
-    System.exit(job.waitForCompletion(true) ? 0 : 1);
-  }
-}
-+---+
-
-** Usage
-
-  Assuming environment variables are set as follows:
-
-+---+
-export JAVA_HOME=/usr/java/default
-export PATH=$JAVA_HOME/bin:$PATH
-export HADOOP_CLASSPATH=$JAVA_HOME/lib/tools.jar
-+---+
-
-  Compile <<<WordCount.java>>> and create a jar:
-
-  <<<$ bin/hadoop com.sun.tools.javac.Main WordCount.java>>> \
-  <<<$ jar cf wc.jar WordCount\*.class>>>
-
-  Assuming that:
-
-   * <<</user/joe/wordcount/input>>> - input directory in HDFS
-
-   * <<</user/joe/wordcount/output>>> - output directory in HDFS
-
-  Sample text-files as input:
-
-  <<<$ bin/hdfs dfs -ls /user/joe/wordcount/input/>>> \
-  <<</user/joe/wordcount/input/file01>>> \
-  <<</user/joe/wordcount/input/file02>>>
-
-  <<<$ bin/hdfs dfs -cat /user/joe/wordcount/input/file01>>> \
-  <<<Hello World Bye World>>>
-
-  <<<$ bin/hdfs dfs -cat /user/joe/wordcount/input/file02>>> \
-  <<<Hello Hadoop Goodbye Hadoop>>>
-
-  Run the application:
-
-  <<<$ bin/hadoop jar wc.jar WordCount /user/joe/wordcount/input
-  /user/joe/wordcount/output>>>
-
-  Output:
-
-  <<<$ bin/hdfs dfs -cat /user/joe/wordcount/output/part-r-00000>>>
-
-  <<<Bye     1>>> \
-  <<<Goodbye 1>>> \
-  <<<Hadoop  2>>> \
-  <<<Hello   2>>> \
-  <<<World   2>>>
-
-  Applications can specify a comma separated list of paths which would be
-  present in the current working directory of the task using the option
-  <<<-files>>>. The <<<-libjars>>> option allows applications to add jars to
-  the classpaths of the maps and reduces. The option <<<-archives>>> allows
-  them to pass a comma separated list of archives as arguments. These archives
-  are unarchived and a link with the name of the archive is created in the current
-  working directory of tasks. More details about the command line options are
-  available at {{{../../hadoop-project-dist/hadoop-common/CommandsManual.html}
-  Commands Guide}}.
-
-  Running <<<wordcount>>> example with <<<-libjars>>>, <<<-files>>> and
-  <<<-archives>>>: \
-  <<<bin/hadoop jar hadoop-mapreduce-examples-<ver>.jar wordcount -files
-  cachefile.txt -libjars mylib.jar -archives myarchive.zip input output>>>
-  Here, myarchive.zip will be placed and unzipped into a directory by the name
-  "myarchive.zip".
-
-  Users can specify a different symbolic name for files and archives passed
-  through the <<<-files>>> and <<<-archives>>> options, using #.
-
-  For example, <<<bin/hadoop jar hadoop-mapreduce-examples-<ver>.jar wordcount
-  -files dir1/dict.txt#dict1,dir2/dict.txt#dict2 -archives mytar.tgz#tgzdir
-  input output>>> Here, the files dir1/dict.txt and dir2/dict.txt can be
-  accessed by tasks using the symbolic names dict1 and dict2 respectively.
-  The archive mytar.tgz will be placed and unarchived into a directory by the
-  name "tgzdir".
-
-** Walk-through
-
-  The <<<WordCount>>> application is quite straight-forward.
-
-+---+
-    public void map(Object key, Text value, Context context
-                    ) throws IOException, InterruptedException {
-      StringTokenizer itr = new StringTokenizer(value.toString());
-      while (itr.hasMoreTokens()) {
-        word.set(itr.nextToken());
-        context.write(word, one);
-      }
-    }
-+---+
-
-  The <<<Mapper>>> implementation, via the <<<map>>> method, processes one
-  line at a time, as provided by the specified <<<TextInputFormat>>>. It then
-  splits the line into tokens separated by whitespaces, via the
-  <<<StringTokenizer>>>, and emits a key-value pair of <<<\< \<word\>, 1\>>>>.
-
-  For the given sample input the first map emits: \
-  <<<\< Hello, 1\>>>> \
-  <<<\< World, 1\>>>> \
-  <<<\< Bye, 1\>>>> \
-  <<<\< World, 1\>>>>
-
-  The second map emits: \
-  <<<\< Hello, 1\>>>> \
-  <<<\< Hadoop, 1\>>>> \
-  <<<\< Goodbye, 1\>>>> \
-  <<<\< Hadoop, 1\>>>>
-
-  We'll learn more about the number of maps spawned for a given job, and how to
-  control them in a fine-grained manner, a bit later in the tutorial.
-
-+---+
-    job.setCombinerClass(IntSumReducer.class);
-+---+
-
-  <<<WordCount>>> also specifies a <<<combiner>>>. Hence, the output of each
-  map is passed through the local combiner (which is the same as the <<<Reducer>>>
-  as per the job configuration) for local aggregation, after being sorted on
-  the <key>s.
-
-  The output of the first map: \
-  <<<\< Bye, 1\>>>> \
-  <<<\< Hello, 1\>>>> \
-  <<<\< World, 2\>>>>
-
-  The output of the second map: \
-  <<<\< Goodbye, 1\>>>> \
-  <<<\< Hadoop, 2\>>>> \
-  <<<\< Hello, 1\>>>>
-
-+---+
-    public void reduce(Text key, Iterable<IntWritable> values,
-                       Context context
-                       ) throws IOException, InterruptedException {
-      int sum = 0;
-      for (IntWritable val : values) {
-        sum += val.get();
-      }
-      result.set(sum);
-      context.write(key, result);
-    }
-+---+
-
-  The <<<Reducer>>> implementation, via the <<<reduce>>> method, just sums up
-  the values, which are the occurrence counts for each key (i.e. words in this
-  example).
-
-  Thus the output of the job is: \
-  <<<\< Bye, 1\>>>> \
-  <<<\< Goodbye, 1\>>>> \
-  <<<\< Hadoop, 2\>>>> \
-  <<<\< Hello, 2\>>>> \
-  <<<\< World, 2\>>>>
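
The map, combine and reduce steps traced above can be reproduced without a cluster. The following is a minimal plain-Java sketch, not Hadoop API code: the class name `WordCountFlowSketch` and its methods are hypothetical, each input line stands in for one map task, and a `TreeMap` stands in for the framework's sort on keys.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

/** Plain-Java simulation of the WordCount data flow (hypothetical helper, not Hadoop code). */
final class WordCountFlowSketch {

  /** Map step: emit a (word, 1) pair for every whitespace-separated token. */
  static List<Map.Entry<String, Integer>> map(String line) {
    List<Map.Entry<String, Integer>> out = new ArrayList<>();
    StringTokenizer itr = new StringTokenizer(line);
    while (itr.hasMoreTokens()) {
      out.add(new AbstractMap.SimpleEntry<>(itr.nextToken(), 1));
    }
    return out;
  }

  /** Combine or reduce step: sum the counts per key; the TreeMap keeps keys sorted. */
  static TreeMap<String, Integer> sumByKey(List<Map.Entry<String, Integer>> pairs) {
    TreeMap<String, Integer> sums = new TreeMap<>();
    for (Map.Entry<String, Integer> pair : pairs) {
      sums.merge(pair.getKey(), pair.getValue(), Integer::sum);
    }
    return sums;
  }

  /** Maps and combines each input line separately, then reduces the combined pairs. */
  static TreeMap<String, Integer> run(List<String> inputs) {
    List<Map.Entry<String, Integer>> combined = new ArrayList<>();
    for (String line : inputs) {
      combined.addAll(sumByKey(map(line)).entrySet());
    }
    return sumByKey(combined);
  }
}
```

Calling `WordCountFlowSketch.run` with the contents of `file01` and `file02` as two list elements yields the same five counts as the job output listed above.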
-
-  The <<<main>>> method specifies various facets of the job, such as the
-  input/output paths (passed via the command line), key/value types,
-  input/output formats etc., in the <<<Job>>>. It then calls the
-  <<<job.waitForCompletion>>> to submit the job and monitor its progress.
-
-  We'll learn more about <<<Job>>>, <<<InputFormat>>>, <<<OutputFormat>>> and
-  other interfaces and classes a bit later in the tutorial.
-
-* MapReduce - User Interfaces
-
-  This section provides a reasonable amount of detail on every user-facing
-  aspect of the MapReduce framework. This should help users implement,
-  configure and tune their jobs in a fine-grained manner. However, please note
-  that the javadoc for each class/interface remains the most comprehensive
-  documentation available; this is only meant to be a tutorial.
-
-  Let us first take the <<<Mapper>>> and <<<Reducer>>> interfaces. Applications
-  typically implement them to provide the <<<map>>> and <<<reduce>>> methods.
-
-  We will then discuss other core interfaces including <<<Job>>>,
-  <<<Partitioner>>>, <<<InputFormat>>>, <<<OutputFormat>>>, and others.
-
-  Finally, we will wrap up by discussing some useful features of the framework
-  such as the <<<DistributedCache>>>, <<<IsolationRunner>>> etc.
-
-** Payload
-
-  Applications typically implement the <<<Mapper>>> and <<<Reducer>>>
-  interfaces to provide the <<<map>>> and <<<reduce>>> methods. These form
-  the core of the job.
-
-*** Mapper
-
-  {{{../../api/org/apache/hadoop/mapreduce/Mapper.html}Mapper}} maps input
-  key/value pairs to a set of intermediate key/value pairs.
-
-  Maps are the individual tasks that transform input records into intermediate
-  records. The transformed intermediate records do not need to be of the same
-  type as the input records. A given input pair may map to zero or many output
-  pairs.
-
-  The Hadoop MapReduce framework spawns one map task for each <<<InputSplit>>>
-  generated by the <<<InputFormat>>> for the job.
-
-  Overall, <<<Mapper>>> implementations are passed to the <<<Job>>>
-  via the {{{../../api/org/apache/hadoop/mapreduce/Job.html}
-  Job.setMapperClass(Class)}} method. The framework then calls
-  {{{../../api/org/apache/hadoop/mapreduce/Mapper.html}
-  map(WritableComparable, Writable, Context)}} for each key/value pair in the
-  <<<InputSplit>>> for that task. Applications can then override the
-  <<<cleanup(Context)>>> method to perform any required cleanup.
-
-  Output pairs do not need to be of the same types as input pairs. A given
-  input pair may map to zero or many output pairs. Output pairs are collected
-  with calls to context.write(WritableComparable, Writable).
-
-  Applications can use the <<<Counter>>> to report their statistics.
-
-  All intermediate values associated with a given output key are subsequently
-  grouped by the framework, and passed to the <<<Reducer>>>(s) to determine the
-  final output. Users can control the grouping by specifying a <<<Comparator>>>
-  via {{{../../api/org/apache/hadoop/mapreduce/Job.html}
-  Job.setGroupingComparatorClass(Class)}}.
-
-  The <<<Mapper>>> outputs are sorted and then partitioned per <<<Reducer>>>.
-  The total number of partitions is the same as the number of reduce tasks for
-  the job. Users can control which keys (and hence records) go to which
-  <<<Reducer>>> by implementing a custom <<<Partitioner>>>.
-
-  Users can optionally specify a <<<combiner>>>, via
-  {{{../../api/org/apache/hadoop/mapreduce/Job.html}
-  Job.setCombinerClass(Class)}}, to perform local aggregation of the
-  intermediate outputs, which helps to cut down the amount of data transferred
-  from the <<<Mapper>>> to the <<<Reducer>>>.
-
-  The intermediate, sorted outputs are always stored in a simple (key-len, key,
-  value-len, value) format. Applications can control if, and how, the
-  intermediate outputs are to be compressed and the
-  {{{../../api/org/apache/hadoop/io/compress/CompressionCodec.html}
-  CompressionCodec}} to be used via the <<<Configuration>>>.
-
-**** How Many Maps?
-
-  The number of maps is usually driven by the total size of the inputs, that
-  is, the total number of blocks of the input files.
-
-  The right level of parallelism for maps seems to be around 10-100 maps
-  per-node, although it has been set up to 300 maps for very cpu-light map
-  tasks. Task setup takes a while, so it is best if the maps take at least a
-  minute to execute.
-
-  Thus, if you expect 10TB of input data and have a blocksize of <<<128MB>>>,
-  you'll end up with 82,000 maps, unless
-  Configuration.set(<<<MRJobConfig.NUM_MAPS>>>, int) (which only provides a
-  hint to the framework) is used to set it even higher.
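
The figure of roughly 82,000 maps follows from dividing the input size by the block size. A minimal sketch of that arithmetic, assuming binary units (1TB = 2^40 bytes) and treating the input as one contiguous byte range; the class and method names are hypothetical:

```java
/** Back-of-the-envelope map count (hypothetical helper, not Hadoop code). */
final class MapCountSketch {

  /** Number of blocks needed to hold the input, counting a partial last block. */
  static long estimateMaps(long inputBytes, long blockBytes) {
    return (inputBytes + blockBytes - 1) / blockBytes;
  }
}
```

With 10TB of input and 128MB blocks this gives 81,920, which the text rounds to 82,000. Real inputs spread over many files can produce more splits, since each file ends in its own partial block.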
-
-*** Reducer
-
-  {{{../../api/org/apache/hadoop/mapreduce/Reducer.html}Reducer}} reduces a
-  set of intermediate values which share a key to a smaller set of values.
-
-  The number of reduces for the job is set by the user via
-  {{{../../api/org/apache/hadoop/mapreduce/Job.html}
-  Job.setNumReduceTasks(int)}}.
-
-  Overall, <<<Reducer>>> implementations are passed to the <<<Job>>> via the
-  {{{../../api/org/apache/hadoop/mapreduce/Job.html}
-  Job.setReducerClass(Class)}} method and can override the
-  <<<setup(Context)>>> method to initialize themselves. The framework then calls
-  {{{../../api/org/apache/hadoop/mapreduce/Reducer.html}
-  reduce(WritableComparable, Iterable\<Writable\>, Context)}} method for each
-  <<<\<key, (list of values)\>>>> pair in the grouped inputs. Applications can
-  then override the <<<cleanup(Context)>>> method to perform any required
-  cleanup.
-
-  <<<Reducer>>> has 3 primary phases: shuffle, sort and reduce.
-
-**** Shuffle
-
-  Input to the <<<Reducer>>> is the sorted output of the mappers. In this phase
-  the framework fetches the relevant partition of the output of all the
-  mappers, via HTTP.
-
-**** Sort
-
-  The framework groups <<<Reducer>>> inputs by keys (since different mappers
-  may have output the same key) in this stage.
-
-  The shuffle and sort phases occur simultaneously; while map-outputs are being
-  fetched they are merged.
-
-**** Secondary Sort
-
-  If the rules for ordering the intermediate keys need to differ from the
-  rules for grouping keys before reduction, then one may specify a sort
-  <<<Comparator>>> via
-  {{{../../api/org/apache/hadoop/mapreduce/Job.html}
-  Job.setSortComparatorClass(Class)}}. Since
-  {{{../../api/org/apache/hadoop/mapreduce/Job.html}
-  Job.setGroupingComparatorClass(Class)}} can be used to control how
-  intermediate keys are grouped, these can be used in conjunction to simulate
-  <secondary sort on values>.
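
The interplay of the two comparators can be illustrated in plain Java. This is a sketch of the idea only, not Hadoop code: the `Pair` composite key and both comparators are hypothetical stand-ins for a custom key class, a sort comparator and a grouping comparator.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Plain-Java illustration of secondary sort (hypothetical names, not Hadoop code). */
final class SecondarySortSketch {

  /** Composite key: the natural key plus the value that should arrive in order. */
  static final class Pair {
    final String natural;
    final int value;

    Pair(String natural, int value) {
      this.natural = natural;
      this.value = value;
    }
  }

  /** Plays the role of the sort comparator: natural key first, then value. */
  static final Comparator<Pair> SORT =
      Comparator.comparing((Pair p) -> p.natural).thenComparingInt(p -> p.value);

  /** Plays the role of the grouping comparator: natural key only. */
  static final Comparator<Pair> GROUP = Comparator.comparing((Pair p) -> p.natural);

  /** Sorts with SORT, then starts a new group whenever GROUP sees a different key. */
  static List<List<Integer>> groupedValues(List<Pair> input) {
    List<Pair> sorted = new ArrayList<>(input);
    sorted.sort(SORT);
    List<List<Integer>> groups = new ArrayList<>();
    Pair previous = null;
    for (Pair current : sorted) {
      if (previous == null || GROUP.compare(previous, current) != 0) {
        groups.add(new ArrayList<>());
      }
      groups.get(groups.size() - 1).add(current.value);
      previous = current;
    }
    return groups;
  }
}
```

Each inner list corresponds to one `reduce` call, and its values arrive already ordered because the sort comparator looked at the value while the grouping comparator ignored it.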
-
-**** Reduce
-
-  In this phase the reduce(WritableComparable, Iterable\<Writable\>, Context)
-  method is called for each <<<\<key, (list of values)\>>>> pair in the grouped
-  inputs.
-
-  The output of the reduce task is typically written to the
-  {{{../../api/org/apache/hadoop/fs/FileSystem.html}FileSystem}} via
-  Context.write(WritableComparable, Writable).
-
-  Applications can use the <<<Counter>>> to report their statistics.
-
-  The output of the <<<Reducer>>> is <not sorted>.
-
-**** How Many Reduces?
-
-  The right number of reduces seems to be <<<0.95>>> or <<<1.75>>> multiplied
-  by (\<<no. of nodes>\> * \<<no. of maximum containers per node>\>).
-
-  With <<<0.95>>> all of the reduces can launch immediately and start
-  transferring map outputs as the maps finish. With <<<1.75>>> the faster nodes
-  will finish their first round of reduces and launch a second wave of reduces
-  doing a much better job of load balancing.
-
-  Increasing the number of reduces increases the framework overhead, but
-  increases load balancing and lowers the cost of failures.
-
-  The scaling factors above are slightly less than whole numbers to reserve a
-  few reduce slots in the framework for speculative-tasks and failed tasks.
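
As a worked example of the rule of thumb above, the sketch below applies a scaling factor to the cluster's container capacity. The class name is hypothetical, the factor is passed as a percentage (95 or 175) to keep the arithmetic in integers, and rounding down is an assumption rather than something the framework prescribes.

```java
/** Rule-of-thumb reduce count (hypothetical helper, not Hadoop code). */
final class ReduceCountSketch {

  /** factorPercent is 95 for a single wave of reduces, 175 for two waves. */
  static int suggestedReduces(int nodes, int maxContainersPerNode, int factorPercent) {
    long capacity = (long) nodes * maxContainersPerNode;
    return (int) (capacity * factorPercent / 100);
  }
}
```

For a 10-node cluster with 8 containers per node this suggests 76 reduces at 0.95 and 140 at 1.75; the result would then be passed to `Job.setNumReduceTasks(int)`.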
-
-**** Reducer NONE
-
-  It is legal to set the number of reduce-tasks to <zero> if no reduction is
-  desired.
-
-  In this case the outputs of the map-tasks go directly to the
-  <<<FileSystem>>>, into the output path set by
-  {{{../../api/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.html}
-  FileOutputFormat.setOutputPath(Job, Path)}}. The framework does not sort the
-  map-outputs before writing them out to the <<<FileSystem>>>.
-
-*** Partitioner
-
-  {{{../../api/org/apache/hadoop/mapreduce/Partitioner.html}Partitioner}}
-  partitions the key space.
-
-  Partitioner controls the partitioning of the keys of the intermediate
-  map-outputs. The key (or a subset of the key) is used to derive the
-  partition, typically by a <hash function>. The total number of partitions is
-  the same as the number of reduce tasks for the job. Hence this controls which
-  of the <<<m>>> reduce tasks the intermediate key (and hence the record) is
-  sent to for reduction.
-
-  {{{../../api/org/apache/hadoop/mapreduce/lib/partition/HashPartitioner.html}
-  HashPartitioner}} is the default <<<Partitioner>>>.
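
Hash partitioning can be pictured with a few lines of plain Java. This is a sketch of the general technique under the assumption that the key's `hashCode()` is well distributed; the class name is hypothetical and the code does not depend on Hadoop.

```java
/** Plain-Java illustration of hash partitioning (hypothetical helper, not Hadoop code). */
final class HashPartitionSketch {

  /** Masks off the sign bit so the result always lies in [0, numReduceTasks). */
  static int partitionFor(Object key, int numReduceTasks) {
    return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
  }
}
```

Equal keys always land in the same partition, which is what guarantees that all values for a key reach the same reduce task.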
-
-*** Counter
-
-  {{{../../api/org/apache/hadoop/mapreduce/Counter.html}Counter}} is a facility
-  for MapReduce applications to report their statistics.
-
-  <<<Mapper>>> and <<<Reducer>>> implementations can use the <<<Counter>>> to
-  report statistics.
-
-  Hadoop MapReduce comes bundled with a
-  {{{../../api/org/apache/hadoop/mapreduce/package-summary.html}library}}
-  of generally useful mappers, reducers, and partitioners.
-
-** Job Configuration
-
-  {{{../../api/org/apache/hadoop/mapreduce/Job.html}Job}} represents a
-  MapReduce job configuration.
-
-  <<<Job>>> is the primary interface for a user to describe a MapReduce job to
-  the Hadoop framework for execution. The framework tries to faithfully execute
-  the job as described by <<<Job>>>, however:
-
-   * Some configuration parameters may have been marked as final by
-     administrators
-     (see {{{../../api/org/apache/hadoop/conf/Configuration.html#FinalParams}
-     Final Parameters}}) and hence cannot be altered.
-
-   * While some job parameters are straight-forward to set (e.g.
-     {{{../../api/org/apache/hadoop/mapreduce/Job.html}
-     Job.setNumReduceTasks(int)}}), other parameters interact subtly with the
-     rest of the framework and/or job configuration and are more complex to set
-     (e.g. {{{../../api/org/apache/hadoop/conf/Configuration.html}
-     Configuration.set(<<<JobContext.NUM_MAPS>>>, int)}}).
-
-  <<<Job>>> is typically used to specify the <<<Mapper>>>, combiner (if any),
-  <<<Partitioner>>>, <<<Reducer>>>, <<<InputFormat>>>, <<<OutputFormat>>>
-  implementations.
-  {{{../../api/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.html}
-  FileInputFormat}} indicates the set of input files
-  ({{{../../api/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.html}
-  FileInputFormat.setInputPaths(Job, Path...)}}/
-  {{{../../api/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.html}
-  FileInputFormat.addInputPath(Job, Path)}}) and
-  ({{{../../api/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.html}
-  FileInputFormat.setInputPaths(Job, String)}}/
-  {{{../../api/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.html}
-  FileInputFormat.addInputPaths(Job, String)}}) and where the output files
-  should be written
-  ({{{../../api/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.html}
-  FileOutputFormat.setOutputPath(Job, Path)}}).
-
-  Optionally, <<<Job>>> is used to specify other advanced facets of the job
-  such as the <<<Comparator>>> to be used, files to be put in the
-  <<<DistributedCache>>>, whether intermediate and/or job outputs are to be
-  compressed (and how), whether job tasks can be executed in a <speculative>
-  manner ({{{../../api/org/apache/hadoop/mapreduce/Job.html}
-  setMapSpeculativeExecution(boolean)}}/
-  {{{../../api/org/apache/hadoop/mapreduce/Job.html}
-  setReduceSpeculativeExecution(boolean)}}),
-  maximum number of attempts per task
-  ({{{../../api/org/apache/hadoop/mapreduce/Job.html}setMaxMapAttempts(int)}}/
-  {{{../../api/org/apache/hadoop/mapreduce/Job.html}
-  setMaxReduceAttempts(int)}}) etc.
-
-  Of course, users can use
-  {{{../../api/org/apache/hadoop/conf/Configuration.html}
-  Configuration.set(String, String)}}/
-  {{{../../api/org/apache/hadoop/conf/Configuration.html}
-  Configuration.get(String)}} to set/get arbitrary parameters needed by
-  applications. However, use the <<<DistributedCache>>> for large amounts of
-  (read-only) data.
-
-** Task Execution & Environment
-
-  The <<<MRAppMaster>>> executes the <<<Mapper>>>/<<<Reducer>>> <task> as a
-  child process in a separate jvm.
-
-  The child-task inherits the environment of the parent <<<MRAppMaster>>>. The
-  user can specify additional options to the child-jvm via the
-  <<<mapreduce.\{map|reduce\}.java.opts>>> configuration parameters in the
-  <<<Job>>>, such as non-standard paths for the run-time linker to search for
-  shared libraries via <<<-Djava.library.path=\<\>>>> etc. If the
-  <<<mapreduce.\{map|reduce\}.java.opts>>> parameters contain the symbol
-  <@taskid@> it is interpolated with the value of the <<<taskid>>> of the
-  MapReduce task.
-
-  Here is an example with multiple arguments and substitutions. It enables JVM
-  GC logging and starts a passwordless JVM JMX agent so that jconsole and the
-  like can connect to it to watch child memory and threads and to get thread
-  dumps. It also sets the maximum heap-size of the map and reduce child JVMs to
-  512MB and 1024MB respectively, and adds an additional path to the
-  <<<java.library.path>>> of the child-jvm.
-
-+---+
-<property>
-  <name>mapreduce.map.java.opts</name>
-  <value>
-    -Xmx512M -Djava.library.path=/home/mycompany/lib -verbose:gc -Xloggc:/tmp/@taskid@.gc
-    -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false
-  </value>
-</property>
-
-<property>
-  <name>mapreduce.reduce.java.opts</name>
-  <value>
-    -Xmx1024M -Djava.library.path=/home/mycompany/lib -verbose:gc -Xloggc:/tmp/@taskid@.gc
-    -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false
-  </value>
-</property>
-+---+
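
The effect of the `@taskid@` substitution on the options above can be pictured as a plain string replacement. This is only an illustration: the class name and the sample task id are hypothetical, and the sketch does not reproduce how the framework performs the interpolation internally.

```java
/** Illustrates the @taskid@ substitution (hypothetical helper, not Hadoop code). */
final class TaskIdInterpolationSketch {

  /** Replaces every occurrence of the @taskid@ placeholder with the given task id. */
  static String interpolate(String javaOpts, String taskId) {
    return javaOpts.replace("@taskid@", taskId);
  }
}
```

Because each task gets its own id, the `-Xloggc:/tmp/@taskid@.gc` option in the example results in one GC log file per task rather than a single shared file.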
-
-*** Memory Management
-
-  Users/admins can also specify the maximum virtual memory of the launched
-  child-task, and any sub-process it launches recursively, using
-  <<<mapreduce.\{map|reduce\}.memory.mb>>>. Note that the value set here is a
-  per process limit. The value for <<<mapreduce.\{map|reduce\}.memory.mb>>>
-  should be specified in megabytes (MB). The value must also be greater
-  than or equal to the -Xmx passed to the JavaVM, else the VM might not start.
-
-  Note: <<<mapreduce.\{map|reduce\}.java.opts>>> are used only for configuring
-  the launched child tasks from MRAppMaster. Configuring the memory options for
-  daemons is documented in
-  {{{../../hadoop-project-dist/hadoop-common/ClusterSetup.html#Configuring_Environment_of_Hadoop_Daemons}
-  Configuring the Environment of the Hadoop Daemons}}.
-
-  The memory available to some parts of the framework is also configurable.
-  In map and reduce tasks, performance may be influenced by adjusting
-  parameters influencing the concurrency of operations and the frequency with
-  which data will hit disk. Monitoring the filesystem counters for a job,
-  particularly relative to byte counts from the map and into the reduce, is
-  invaluable to the tuning of these parameters.
-
-*** Map Parameters
-
-  A record emitted from a map will be serialized into a buffer and metadata
-  will be stored into accounting buffers. As described in the following
-  options, when either the serialization buffer or the metadata exceeds a
-  threshold, the contents of the buffers will be sorted and written to disk in
-  the background while the map continues to output records. If either buffer
-  fills completely while the spill is in progress, the map thread will block.
-  When the map is finished, any remaining records are written to disk and all
-  on-disk segments are merged into a single file. Minimizing the number of
-  spills to disk can decrease map time, but a larger buffer also decreases the
-  memory available to the mapper.
-
-*-------------*-------*-------------------------------------------------------*
-|| Name       || Type || Description                                          |
-*-------------+-------+-------------------------------------------------------+
-| mapreduce.task.io.sort.mb | int | The cumulative size of the serialization
-|             |       | and accounting buffers storing records emitted from the
-|             |       | map, in megabytes.
-*-------------+-------+-------------------------------------------------------+
-| mapreduce.map.sort.spill.percent | float | The soft limit in the
-|             |       | serialization buffer. Once reached, a thread will begin
-|             |       | to spill the contents to disk in the background.
-*-------------+-------+-------------------------------------------------------+
-
-  Other notes
-
-   * If either spill threshold is exceeded while a spill is in progress,
-     collection will continue until the spill is finished. For example, if
-     <<<mapreduce.map.sort.spill.percent>>> is set to 0.33, and the remainder
-     of the buffer is filled while the spill runs, the next spill will include
-     all the collected records, or 0.66 of the buffer, and will not generate
-     additional spills. In other words, the thresholds define triggers, not
-     blocking limits.
-
-   * A record larger than the serialization buffer will first trigger a spill,
-     then be spilled to a separate file. It is undefined whether or not this
-     record will first pass through the combiner.
-
-*** Shuffle/Reduce Parameters
-
-  As described previously, each reduce fetches the output assigned to it by the
-  Partitioner via HTTP into memory and periodically merges these outputs to
-  disk. If intermediate compression of map outputs is turned on, each output is
-  decompressed into memory. The following options affect the frequency of these
-  merges to disk prior to the reduce and the memory allocated to map output
-  during the reduce.
-
-*-------------*-------*-------------------------------------------------------*
-|| Name       || Type || Description                                          |
-*-------------+-------+-------------------------------------------------------+
-| mapreduce.task.io.sort.factor | int | Specifies the number of segments on
-|             |       | disk to be merged at the same time. It limits the
-|             |       | number of open files and compression codecs during
-|             |       | merge. If the number of files exceeds this limit, the
-|             |       | merge will proceed in several passes. Though this limit
-|             |       | also applies to the map, most jobs should be configured
-|             |       | so that hitting this limit is unlikely there.
-*-------------+-------+-------------------------------------------------------+
-| mapreduce.reduce.merge.inmem.threshold | int | The number of sorted map
-|             |       | outputs fetched into memory before being merged to
-|             |       | disk. Like the spill thresholds in the preceding note,
-|             |       | this is not defining a unit of partition, but a
-|             |       | trigger. In practice, this is usually set very high
-|             |       | (1000) or disabled (0), since merging in-memory
-|             |       | segments is often less expensive than merging from disk
-|             |       | (see notes following this table). This threshold
-|             |       | influences only the frequency of in-memory merges
-|             |       | during the shuffle.
-*-------------+-------+-------------------------------------------------------+
-| mapreduce.reduce.shuffle.merge.percent | float | The memory threshold for
-|             |       | fetched map outputs before an in-memory merge is started,
-|             |       | expressed as a percentage of memory allocated to
-|             |       | storing map outputs in memory. Since map outputs that
-|             |       | can't fit in memory can be stalled, setting this high
-|             |       | may decrease parallelism between the fetch and merge.
-|             |       | Conversely, values as high as 1.0 have been effective
-|             |       | for reduces whose input can fit entirely in memory.
-|             |       | This parameter influences only the frequency of
-|             |       | in-memory merges during the shuffle.
-*-------------+-------+-------------------------------------------------------+
-| mapreduce.reduce.shuffle.input.buffer.percent | float | The percentage of
-|             |       | memory, relative to the maximum heapsize as typically
-|             |       | specified in <<<mapreduce.reduce.java.opts>>>, that can
-|             |       | be allocated to storing map outputs during the shuffle.
-|             |       | Though some memory should be set aside for the
-|             |       | framework, in general it is advantageous to set this
-|             |       | high enough to store large and numerous map outputs.
-*-------------+-------+-------------------------------------------------------+
-| mapreduce.reduce.input.buffer.percent | float | The percentage of memory
-|             |       | relative to the maximum heapsize in which map outputs
-|             |       | may be retained during the reduce. When the reduce
-|             |       | begins, map outputs will be merged to disk until those
-|             |       | that remain are under the resource limit this defines.
-|             |       | By default, all map outputs are merged to disk before
-|             |       | the reduce begins to maximize the memory available to
-|             |       | the reduce. For less memory-intensive reduces, this
-|             |       | should be increased to avoid trips to disk.
-*-------------+-------+-------------------------------------------------------+
-
-  Other notes
-
-   * If a map output is larger than 25 percent of the memory allocated to
-     copying map outputs, it will be written directly to disk without first
-     staging through memory.
-
-   * When running with a combiner, the reasoning about high merge thresholds
-     and large buffers may not hold. For merges started before all map outputs
-     have been fetched, the combiner is run while spilling to disk. In some
-     cases, one can obtain better reduce times by spending resources combining
-     map outputs (making disk spills small and parallelizing spilling and
-     fetching) rather than aggressively increasing buffer sizes.
-
-   * When merging in-memory map outputs to disk to begin the reduce, if an
-     intermediate merge is necessary because there are segments to spill and at
-     least <<<mapreduce.task.io.sort.factor>>> segments already on disk, the
-     in-memory map outputs will be part of the intermediate merge.
-
-*** Configured Parameters
-
-  The following properties are localized in the job configuration for each
-  task's execution:
-
-*-------------*-------*-------------------------------------------------------*
-|| Name       || Type || Description                                          |
-*-------------+-------+-------------------------------------------------------+
-| mapreduce.job.id | String | The job id
-*-------------+-------+-------------------------------------------------------+
-| mapreduce.job.jar | String | job.jar location in job directory
-*-------------+-------+-------------------------------------------------------+
-| mapreduce.job.local.dir | String | The job specific shared scratch space
-*-------------+-------+-------------------------------------------------------+
-| mapreduce.task.id | String | The task id
-*-------------+-------+-------------------------------------------------------+
-| mapreduce.task.attempt.id | String | The task attempt id
-*-------------+-------+-------------------------------------------------------+
-| mapreduce.task.is.map | boolean | Is this a map task
-*-------------+-------+-------------------------------------------------------+
-| mapreduce.task.partition | int | The id of the task within the job
-*-------------+-------+-------------------------------------------------------+
-| mapreduce.map.input.file | String | The filename that the map is reading from
-*-------------+-------+-------------------------------------------------------+
-| mapreduce.map.input.start | long | The offset of the start of the map input
-|             |       | split
-*-------------+-------+-------------------------------------------------------+
-| mapreduce.map.input.length | long | The number of bytes in the map input
-|             |       | split
-*-------------+-------+-------------------------------------------------------+
-| mapreduce.task.output.dir | String | The task's temporary output directory
-*-------------+-------+-------------------------------------------------------+
-
-  <<Note:>> During the execution of a streaming job, the names of the
-  "mapreduce" parameters are transformed. The dots ( . ) become underscores
-  ( _ ). For example, mapreduce.job.id becomes mapreduce_job_id and
-  mapreduce.job.jar becomes mapreduce_job_jar. To get the values in a streaming
-  job's mapper/reducer use the parameter names with the underscores.
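-
-  The renaming rule above can be sketched as follows. The class and method
-  names are hypothetical, and the lookup through the process environment is
-  an assumption about how a streaming task would read the value.
-

```java
// Sketch only: StreamingNames and toStreamingName are hypothetical names,
// not part of the Hadoop API.
class StreamingNames {

  // Replace every dot in a job configuration name with an underscore.
  static String toStreamingName(String propertyName) {
    return propertyName.replace('.', '_');
  }

  public static void main(String[] args) {
    String[] names = {"mapreduce.job.id", "mapreduce.job.jar"};
    for (String name : names) {
      String streamingName = toStreamingName(name);
      // Assumption: the value is exposed to the task as an environment
      // variable; outside a streaming job this lookup returns null.
      String value = System.getenv(streamingName);
      System.out.println(name + " -> " + streamingName + " = " + value);
    }
  }
}
```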
-
-*** Task Logs
-
-  The standard output (stdout) and error (stderr) streams and the syslog of the
-  task are read by the NodeManager and logged to
-  <<<$\{HADOOP_LOG_DIR\}/userlogs>>>.
-
-*** Distributing Libraries
-
-  The {{DistributedCache}} can also be used to distribute both jars and native
-  libraries for use in the map and/or reduce tasks. The child-jvm always has
-  its <current working directory> added to the <<<java.library.path>>> and
-  <<<LD_LIBRARY_PATH>>>. And hence the cached libraries can be loaded via
-  {{{http://docs.oracle.com/javase/7/docs/api/java/lang/System.html}
-  System.loadLibrary}} or
-  {{{http://docs.oracle.com/javase/7/docs/api/java/lang/System.html}
-  System.load}}. More details on how to load shared libraries through
-  distributed cache are documented at
-  {{{../../hadoop-project-dist/hadoop-common/NativeLibraries.html#Native_Shared_Libraries}
-  Native Libraries}}.
-
-** Job Submission and Monitoring
-
-  {{{../../api/org/apache/hadoop/mapreduce/Job.html}Job}} is the primary
-  interface by which a user job interacts with the <<<ResourceManager>>>.
-
-  <<<Job>>> provides facilities to submit jobs, track their progress, access
-  component-tasks' reports and logs, get the MapReduce cluster's status
-  information and so on.
-
-  The job submission process involves:
-
-   [[1]] Checking the input and output specifications of the job.
-
-   [[2]] Computing the <<<InputSplit>>> values for the job.
-
-   [[3]] Setting up the requisite accounting information for the
-         <<<DistributedCache>>> of the job, if necessary.
-
-   [[4]] Copying the job's jar and configuration to the MapReduce system
-         directory on the <<<FileSystem>>>.
-
-   [[5]] Submitting the job to the <<<ResourceManager>>> and optionally
-         monitoring its status.
-
-  Job history files are also logged to the user-specified directories
-  <<<mapreduce.jobhistory.intermediate-done-dir>>> and
-  <<<mapreduce.jobhistory.done-dir>>>, which default to the job output
-  directory.
-
-  Users can view a summary of the history logs in the specified directory
-  using the following command \
-  <<<$ mapred job -history output.jhist>>> \
-  This command prints job details as well as failed and killed tip details. \
-  More details about the job, such as successful tasks and the task attempts
-  made for each task, can be viewed using the following command \
-  <<<$ mapred job -history all output.jhist>>>
-
-  Normally the user uses <<<Job>>> to create the application, describe various
-  facets of the job, submit the job, and monitor its progress.
-
-*** Job Control
-
-  Users may need to chain MapReduce jobs to accomplish complex tasks which
-  cannot be done via a single MapReduce job. This is fairly easy since the
-  output of the job typically goes to a distributed file-system, and the
-  output, in turn, can be used as the input for the next job.
-
-  However, this also means that the onus of ensuring jobs are complete
-  (success/failure) lies squarely on the clients. In such cases, the various
-  job-control options are:
-
-   * {{{../../api/org/apache/hadoop/mapreduce/Job.html}Job.submit()}} :
-     Submit the job to the cluster and return immediately.
-
-   * {{{../../api/org/apache/hadoop/mapreduce/Job.html}
-     Job.waitForCompletion(boolean)}} :
-     Submit the job to the cluster and wait for it to finish.
-
-** Job Input
-
-  {{{../../api/org/apache/hadoop/mapreduce/InputFormat.html}InputFormat}}
-  describes the input-specification for a MapReduce job.
-
-  The MapReduce framework relies on the <<<InputFormat>>> of the job to:
-
-   [[1]] Validate the input-specification of the job.
-
-   [[2]] Split-up the input file(s) into logical <<<InputSplit>>> instances,
-         each of which is then assigned to an individual <<<Mapper>>>.
-
-   [[3]] Provide the <<<RecordReader>>> implementation used to glean input
-         records from the logical <<<InputSplit>>> for processing by the
-         <<<Mapper>>>.
-
-  The default behavior of file-based <<<InputFormat>>> implementations,
-  typically sub-classes of
-  {{{../../api/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.html}
-  FileInputFormat}}, is to split the input into <logical> <<<InputSplit>>>
-  instances based on the total size, in bytes, of the input files. However, the
-  <<<FileSystem>>> blocksize of the input files is treated as an upper bound
-  for input splits. A lower bound on the split size can be set via
-  <<<mapreduce.input.fileinputformat.split.minsize>>>.
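-
-  How the two bounds interact can be sketched as below. This is a
-  simplification, assuming no limit other than the block size caps the split;
-  the class and method names are hypothetical.
-

```java
// Sketch only: SplitSizeSketch is a hypothetical name, and the rule shown is
// a simplified reading of the bounds described above.
class SplitSizeSketch {

  // The block size caps the split, unless the configured minimum is larger.
  static long computeSplitSize(long blockSize, long minSplitSize) {
    return Math.max(minSplitSize, blockSize);
  }

  public static void main(String[] args) {
    long mb = 1024L * 1024L;
    long blockSize = 128 * mb; // assumed block size for illustration

    // A small minimum leaves the block size as the effective split size.
    System.out.println(computeSplitSize(blockSize, 1) / mb);        // 128
    // A minimum above the block size forces larger splits.
    System.out.println(computeSplitSize(blockSize, 256 * mb) / mb); // 256
  }
}
```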
-
-  Clearly, logical splits based on input size are insufficient for many
-  applications since record boundaries must be respected. In such cases, the
-  application should implement a <<<RecordReader>>>, which is responsible for
-  respecting record boundaries and presenting a record-oriented view of the
-  logical <<<InputSplit>>> to the individual task.
-
-  {{{../../api/org/apache/hadoop/mapreduce/lib/input/TextInputFormat.html}
-  TextInputFormat}} is the default <<<InputFormat>>>.
-
-  If <<<TextInputFormat>>> is the <<<InputFormat>>> for a given job, the
-  framework detects input-files with the <.gz> extension and automatically
-  decompresses them using the appropriate <<<CompressionCodec>>>. However, it
-  must be noted that compressed files with the above extension cannot be
-  <split> and each compressed file is processed in its entirety by a single
-  mapper.
-
-*** InputSplit
-
-  {{{../../api/org/apache/hadoop/mapreduce/InputSplit.html}InputSplit}}
-  represents the data to be processed by an individual <<<Mapper>>>.
-
-  Typically <<<InputSplit>>> presents a byte-oriented view of the input, and it
-  is the responsibility of <<<RecordReader>>> to process and present a
-  record-oriented view.
-
-  {{{../../api/org/apache/hadoop/mapreduce/lib/input/FileSplit.html}FileSplit}}
-  is the default <<<InputSplit>>>. It sets <<<mapreduce.map.input.file>>> to
-  the path of the input file for the logical split.
-
-*** RecordReader
-
-  {{{../../api/org/apache/hadoop/mapreduce/RecordReader.html}RecordReader}}
-  reads <<<\<key, value\>>>> pairs from an <<<InputSplit>>>.
-
-  Typically the <<<RecordReader>>> converts the byte-oriented view of the
-  input, provided by the <<<InputSplit>>>, and presents a record-oriented view
-  to the <<<Mapper>>> implementations for processing. <<<RecordReader>>> thus
-  assumes the responsibility of processing record boundaries and presents the
-  tasks with keys and values.
-
-** Job Output
-
-  {{{../../api/org/apache/hadoop/mapreduce/OutputFormat.html}OutputFormat}}
-  describes the output-specification for a MapReduce job.
-
-  The MapReduce framework relies on the <<<OutputFormat>>> of the job to:
-
-   [[1]] Validate the output-specification of the job; for example, check that
-         the output directory doesn't already exist.
-
-   [[2]] Provide the <<<RecordWriter>>> implementation used to write the output
-         files of the job. Output files are stored in a <<<FileSystem>>>.
-
-  <<<TextOutputFormat>>> is the default <<<OutputFormat>>>.
-
-*** OutputCommitter
-
-  {{{../../api/org/apache/hadoop/mapreduce/OutputCommitter.html}
-  OutputCommitter}} describes the commit of task output for a MapReduce job.
-
-  The MapReduce framework relies on the <<<OutputCommitter>>> of the job to:
-
-   [[1]] Setup the job during initialization. For example, create the temporary
-         output directory for the job during the initialization of the job. Job
-         setup is done by a separate task when the job is in PREP state and
-         after initializing tasks. Once the setup task completes, the job will
-         be moved to RUNNING state.
-
-   [[2]] Cleanup the job after the job completion. For example, remove the
-         temporary output directory after the job completion. Job cleanup is
-         done by a separate task at the end of the job. The job is declared
-         SUCCEEDED/FAILED/KILLED after the cleanup task completes.
-
-   [[3]] Setup the task temporary output. Task setup is done as part of the
-         same task, during task initialization.
-
-   [[4]] Check whether a task needs a commit. This is to avoid the commit
-         procedure if a task does not need commit.
-
-   [[5]] Commit of the task output. Once the task is done, the task will
-         commit its output if required.
-
-   [[6]] Discard the task commit. If the task has failed or been killed, the
-         output will be cleaned up. If the task could not clean up (in the
-         exception block), a separate task will be launched with the same
-         attempt-id to do the cleanup.
-
-  <<<FileOutputCommitter>>> is the default <<<OutputCommitter>>>. Job
-  setup/cleanup tasks occupy map or reduce containers, whichever is available
-  on the NodeManager. The JobCleanup task, TaskCleanup tasks and JobSetup task
-  have the highest priority, in that order.
-
-*** Task Side-Effect Files
-
-  In some applications, component tasks need to create and/or write to
-  side-files, which differ from the actual job-output files.
-
-  In such cases there could be issues with two instances of the same
-  <<<Mapper>>> or <<<Reducer>>> running simultaneously (for example,
-  speculative tasks) trying to open and/or write to the same file (path) on the
-  <<<FileSystem>>>. Hence the application-writer will have to pick unique names
-  per task-attempt (using the attemptid, say
-  <<<attempt_200709221812_0001_m_000000_0>>>), not just per task.
-
-  To avoid these issues the MapReduce framework, when the <<<OutputCommitter>>>
-  is <<<FileOutputCommitter>>>, maintains a special
-  <<<$\{mapreduce.output.fileoutputformat.outputdir\}/_temporary/_$\{taskid\}>>>
-  sub-directory accessible via <<<$\{mapreduce.task.output.dir\}>>> for each
-  task-attempt on the <<<FileSystem>>> where the output of the task-attempt is
-  stored. On successful completion of the task-attempt, the files in the
-  <<<$\{mapreduce.output.fileoutputformat.outputdir\}/_temporary/_$\{taskid\}>>>
-  (only) are <promoted> to
-  <<<$\{mapreduce.output.fileoutputformat.outputdir\}>>>. Of course, the
-  framework discards the sub-directory of unsuccessful task-attempts. This
-  process is completely transparent to the application.
-
-  The application-writer can take advantage of this feature by creating any
-  side-files required in <<<$\{mapreduce.task.output.dir\}>>> during execution
-  of a task via
-  {{{../../api/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.html}
-  FileOutputFormat.getWorkOutputPath(Context)}}, and the framework will promote
-  them similarly for successful task-attempts, thus eliminating the need to
-  pick unique paths per task-attempt.
-
-  Note: The value of <<<$\{mapreduce.task.output.dir\}>>> during execution of a
-  particular task-attempt is actually
-  <<<$\{mapreduce.output.fileoutputformat.outputdir\}/_temporary/_$\{taskid\}>>>,
-  and this value is set by the MapReduce framework. So, just create any
-  side-files in the path returned by
-  {{{../../api/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.html}
-  FileOutputFormat.getWorkOutputPath(Context)}} from the MapReduce task to take
-  advantage of this feature.
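-
-  The directory layout described above can be sketched with plain string
-  handling. The class, method names and the <<</user/joe/out>>> directory are
-  hypothetical; a real task would obtain the directory from
-  <<<FileOutputFormat.getWorkOutputPath>>> rather than build it by hand.
-

```java
// Sketch only: SideFilePaths is a hypothetical helper that mirrors the
// ${outputdir}/_temporary/_${taskid} layout described in the text.
class SideFilePaths {

  // Per-attempt work directory under the job output directory.
  static String workOutputDir(String outputDir, String taskAttemptId) {
    return outputDir + "/_temporary/_" + taskAttemptId;
  }

  // Location of a side-file created by one particular task-attempt.
  static String sideFile(String outputDir, String taskAttemptId, String name) {
    return workOutputDir(outputDir, taskAttemptId) + "/" + name;
  }

  public static void main(String[] args) {
    String outputDir = "/user/joe/out"; // assumed job output directory
    // Two attempts of the same task (for example, a speculative attempt)
    // write to different directories, so their side-files never collide.
    System.out.println(sideFile(outputDir,
        "attempt_200709221812_0001_m_000000_0", "stats.txt"));
    System.out.println(sideFile(outputDir,
        "attempt_200709221812_0001_m_000000_1", "stats.txt"));
  }
}
```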
-
-  The entire discussion holds true for maps of jobs with reducer=NONE
-  (i.e. 0 reduces) since output of the map, in that case, goes directly to
-  HDFS.
-
-*** RecordWriter
-
-  {{{../../api/org/apache/hadoop/mapreduce/RecordWriter.html}RecordWriter}}
-  writes the output <<<\<key, value\>>>> pairs to an output file.
-
-  RecordWriter implementations write the job outputs to the <<<FileSystem>>>.
-
-** Other Useful Features
-
-*** Submitting Jobs to Queues
-
-  Users submit jobs to Queues. Queues, as collections of jobs, allow the system
-  to provide specific functionality. For example, queues use ACLs to control
-  which users can submit jobs to them. Queues are expected to be primarily
-  used by Hadoop Schedulers.
-
-  Hadoop comes configured with a single mandatory queue, called 'default'.
-  Queue names are defined in the <<<mapreduce.job.queuename>>> property of the
-  Hadoop site configuration. Some job schedulers, such as the
-  {{{../../hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html}
-  Capacity Scheduler}}, support multiple queues.
-
-  A job defines the queue it needs to be submitted to through the
-  <<<mapreduce.job.queuename>>> property, or through the
-  Configuration.set(<<<MRJobConfig.QUEUE_NAME>>>, String) API. Setting the
-  queue name is optional. If a job is submitted without an associated queue
-  name, it is submitted to the 'default' queue.
-
-*** Counters
-
-  <<<Counters>>> represent global counters, defined either by the MapReduce
-  framework or applications. Each <<<Counter>>> can be of any <<<Enum>>> type.
-  Counters of a particular <<<Enum>>> are bunched into groups of type
-  <<<Counters.Group>>>.
-
-  Applications can define arbitrary <<<Counters>>> (of type <<<Enum>>>) and
-  update them via
-  {{{../../api/org/apache/hadoop/mapred/Counters.html}
-  Counters.incrCounter(Enum, long)}} or Counters.incrCounter(String, String,
-  long) in the <<<map>>> and/or <<<reduce>>> methods. These counters are then
-  globally aggregated by the framework.
-
-*** DistributedCache
-
-  <<<DistributedCache>>> distributes application-specific, large, read-only
-  files efficiently.
-
-  <<<DistributedCache>>> is a facility provided by the MapReduce framework to
-  cache files (text, archives, jars and so on) needed by applications.
-
-  Applications specify the files to be cached via urls (hdfs://) in the
-  <<<Job>>>. The <<<DistributedCache>>> assumes that the files specified via
-  hdfs:// urls are already present on the <<<FileSystem>>>.
-
-  The framework will copy the necessary files to the slave node before any
-  tasks for the job are executed on that node. Its efficiency stems from the
-  fact that the files are only copied once per job and the ability to cache
-  archives which are un-archived on the slaves.
-
-  <<<DistributedCache>>> tracks the modification timestamps of the cached
-  files. Clearly the cache files should not be modified by the application or
-  externally while the job is executing.
-
-  <<<DistributedCache>>> can be used to distribute simple, read-only data/text
-  files and more complex types such as archives and jars. Archives (zip, tar,
-  tgz and tar.gz files) are <un-archived> at the slave nodes. Files have
-  <execution permissions> set.
-
-  The files/archives can be distributed by setting the property
-  <<<mapreduce.job.cache.\{files|archives\}>>>. If more than one file/archive
-  has to be distributed, they can be added as comma separated paths. The
-  properties can also be set by APIs
-  {{{../../api/org/apache/hadoop/mapreduce/Job.html}Job.addCacheFile(URI)}}/
-  {{{../../api/org/apache/hadoop/mapreduce/Job.html}Job.addCacheArchive(URI)}}
-  and
-  {{{../../api/org/apache/hadoop/mapreduce/Job.html}
-  Job.setCacheFiles(URI\[\])}}/
-  {{{../../api/org/apache/hadoop/mapreduce/Job.html}
-  Job.setCacheArchives(URI\[\])}} where URI is of the form
-  <<<hdfs://host:port/absolute-path\#link-name>>>. In Streaming, the files can
-  be distributed through command line option <<<-cacheFile/-cacheArchive>>>.
-
-  The <<<DistributedCache>>> can also be used as a rudimentary software
-  distribution mechanism for use in the map and/or reduce tasks. It can be used
-  to distribute both jars and native libraries. The
-  {{{../../api/org/apache/hadoop/mapreduce/Job.html}
-  Job.addArchiveToClassPath(Path)}} or
-  {{{../../api/org/apache/hadoop/mapreduce/Job.html}
-  Job.addFileToClassPath(Path)}} api can be used to cache files/jars and also
-  add them to the <classpath> of child-jvm. The same can be done by setting the
-  configuration properties <<<mapreduce.job.classpath.\{files|archives\}>>>.
-  Similarly the cached files that are symlinked into the working directory of
-  the task can be used to distribute native libraries and load them.
-
-**** Private and Public DistributedCache Files
-
-  DistributedCache files can be private or public, which determines how they
-  can be shared on the slave nodes.
-
-   * "Private" DistributedCache files are cached in a local directory private to
-      the user whose jobs need these files. These files are shared by all tasks
-      and jobs of the specific user only and cannot be accessed by jobs of
-      other users on the slaves. A DistributedCache file becomes private by
-      virtue of its permissions on the file system where the files are
-      uploaded, typically HDFS. If the file has no world readable access, or if
-      the directory path leading to the file has no world executable access for
-      lookup, then the file becomes private.
-
-   * "Public" DistributedCache files are cached in a global directory and the
-     file access is set up such that they are publicly visible to all users.
-     These files can be shared by tasks and jobs of all users on the slaves. A
-     DistributedCache file becomes public by virtue of its permissions on the
-     file system where the files are uploaded, typically HDFS. If the file has
-     world readable access, AND if the directory path leading to the file has
-     world executable access for lookup, then the file becomes public. In other
-     words, if the user intends to make a file publicly available to all users,
-     the file permissions must be set to be world readable, and the directory
-     permissions on the path leading to the file must be world executable.
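-
-  The visibility rule in the two items above can be sketched as a small
-  check over POSIX-style permission strings. The class and method names are
-  hypothetical, and the permission strings are made-up examples.
-

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;

// Sketch only: CacheVisibility is a hypothetical helper, not a Hadoop class.
class CacheVisibility {

  // A cache file is public only if the file is world readable AND every
  // directory on the path leading to it is world executable.
  static boolean isPublic(String filePerms, String... ancestorDirPerms) {
    if (!PosixFilePermissions.fromString(filePerms)
        .contains(PosixFilePermission.OTHERS_READ)) {
      return false;
    }
    for (String dirPerms : ancestorDirPerms) {
      if (!PosixFilePermissions.fromString(dirPerms)
          .contains(PosixFilePermission.OTHERS_EXECUTE)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // World-readable file under world-executable directories: public.
    System.out.println(isPublic("rw-r--r--", "rwxr-xr-x", "rwxr-xr-x")); // true
    // One directory without world execute access makes the file private.
    System.out.println(isPublic("rw-r--r--", "rwxr-xr-x", "rwx------")); // false
    // A file that is not world readable is private.
    System.out.println(isPublic("rw-------", "rwxr-xr-x"));              // false
  }
}
```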
-
-*** Profiling
-
-  Profiling is a utility to get a representative (2 or 3) sample of built-in
-  Java profiler output for a sample of maps and reduces.
-
-  Users can specify whether the system should collect profiler information for
-  some of the tasks in the job by setting the configuration property
-  <<<mapreduce.task.profile>>>. The value can be set using the api
-  Configuration.set(<<<MRJobConfig.TASK_PROFILE>>>, boolean). If the value is
-  set to <<<true>>>, task profiling is enabled. The profiler information is
-  stored in the user log directory. By default, profiling is not enabled for
-  the job.
-
-  Once profiling is enabled, the user can use the configuration property
-  <<<mapreduce.task.profile.\{maps|reduces\}>>> to set the ranges of MapReduce
-  tasks to profile. The value can be set using the api
-  Configuration.set(<<<MRJobConfig.NUM_\{MAP|REDUCE\}_PROFILES>>>, String).
-  By default, the specified range is <<<0-2>>>.
-
-  Users can also specify the profiler configuration arguments by setting the
-  configuration property <<<mapreduce.task.profile.params>>>. The value can be
-  specified using the api
-  Configuration.set(<<<MRJobConfig.TASK_PROFILE_PARAMS>>>, String). If the
-  string contains a <<<%s>>>, it will be replaced with the name of the
-  profiling output file when the task runs. These parameters are passed to the
-  task child JVM on the command line. The default value for the profiling
-  parameters is
-  <<<-agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=%s>>>.
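-
-  The <<<%s>>> substitution can be sketched as below. The class name and the
-  <<<profile.out>>> file name are hypothetical, and the plain string
-  replacement is only an illustration of the effect, not necessarily how the
-  framework performs it.
-

```java
// Sketch only: ProfileParams is a hypothetical helper, not a Hadoop class.
class ProfileParams {

  // Default profiling parameters quoted in the text above.
  static final String DEFAULT_PARAMS =
      "-agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=%s";

  // Substitute the profiling output file name for the %s placeholder.
  static String expand(String params, String outputFile) {
    return params.replace("%s", outputFile);
  }

  public static void main(String[] args) {
    // "profile.out" is an assumed file name used only for illustration.
    System.out.println(expand(DEFAULT_PARAMS, "profile.out"));
  }
}
```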
-
-*** Debugging
-
-  The MapReduce framework provides a facility to run user-provided scripts for
-  debugging. When a MapReduce task fails, a user can run a debug script, to
-  process task logs for example. The script is given access to the task's
-  stdout and stderr outputs, syslog and jobconf. The output from the debug
-  script's stdout and stderr is displayed on the console diagnostics and also
-  as part of the job UI.
-
-  In the following sections we discuss how to submit a debug script with a job.
-  The script file needs to be distributed and submitted to the framework.
-
-**** How to distribute the script file:
-
-  The user needs to use {{DistributedCache}} to <distribute> and <symlink> the
-  script file.
-
-**** How to submit the script:
-
-  A quick way to submit the debug script is to set values for the properties
-  <<<mapreduce.map.debug.script>>> and <<<mapreduce.reduce.debug.script>>>, for
-  debugging map and reduce tasks respectively. These properties can also be set
-  by using APIs
-  {{{../../api/org/apache/hadoop/conf/Configuration.html}
-  Configuration.set(<<<MRJobConfig.MAP_DEBUG_SCRIPT>>>, String)}} and
-  {{{../../api/org/apache/hadoop/conf/Configuration.html}
-  Configuration.set(<<<MRJobConfig.REDUCE_DEBUG_SCRIPT>>>, String)}}. In
-  streaming mode, a debug script can be submitted with the command-line options
-  <<<-mapdebug>>> and <<<-reducedebug>>>, for debugging map and reduce tasks
-  respectively.
-
-  The arguments to the script are the task's stdout, stderr, syslog and jobconf
-  files. The debug command, run on the node where the MapReduce task failed,
-  is: \
-  <<<$script $stdout $stderr $syslog $jobconf>>>
-
-  Pipes programs have the c++ program name as a fifth argument for the command.
-  Thus for the pipes programs the command is \
-  <<<$script $stdout $stderr $syslog $jobconf $program>>>
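-
-  The argument order of the two commands above can be sketched as a small
-  helper that assembles the argument list. The class name and all file names
-  are hypothetical.
-

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: DebugCommand is a hypothetical helper, not a Hadoop class.
class DebugCommand {

  // Build the debug command line; pass null as pipesProgram for
  // non-pipes jobs, which take only the four log/config arguments.
  static List<String> build(String script, String stdout, String stderr,
                            String syslog, String jobconf, String pipesProgram) {
    List<String> command = new ArrayList<>();
    command.add(script);
    command.add(stdout);
    command.add(stderr);
    command.add(syslog);
    command.add(jobconf);
    if (pipesProgram != null) {
      command.add(pipesProgram); // fifth argument, pipes programs only
    }
    return command;
  }

  public static void main(String[] args) {
    // All file names below are assumed values for illustration.
    System.out.println(String.join(" ",
        build("./debug.sh", "stdout", "stderr", "syslog", "job.xml", null)));
    System.out.println(String.join(" ",
        build("./debug.sh", "stdout", "stderr", "syslog", "job.xml", "wordcount")));
  }
}
```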
-
-**** Default Behavior:
-
-  For pipes, a default script is run to process core dumps under gdb. It
-  prints the stack trace and gives information about running threads.
-
-*** Data Compression
-
-  Hadoop MapReduce provides facilities for the application-writer to specify
-  compression for both intermediate map-outputs and the job-outputs, i.e. the
-  output of the reduces. It also comes bundled with a
-  {{{../../api/org/apache/hadoop/io/compress/CompressionCodec.html}
-  CompressionCodec}} implementation for the {{{http://www.zlib.net}zlib}}
-  compression algorithm. The {{{http://www.gzip.org}gzip}},
-  {{{http://www.bzip.org}bzip2}}, {{{http://code.google.com/p/snappy/}snappy}},
-  and {{{http://code.google.com/p/lz4/}lz4}} file formats are also supported.
-
-  Hadoop also provides native implementations of the above compression codecs
-  for reasons of both performance (zlib) and non-availability of Java
-  libraries. More details on their usage and availability are available
-  {{{../../hadoop-project-dist/hadoop-common/NativeLibraries.html}here}}.
-
-**** Intermediate Outputs
-
-  Applications can control compression of intermediate map-outputs via the
-  Configuration.set(<<<MRJobConfig.MAP_OUTPUT_COMPRESS>>>, boolean) api and the
-  <<<CompressionCodec>>> to be used via the
-  Configuration.set(<<<MRJobConfig.MAP_OUTPUT_COMPRESS_CODEC>>>, Class) api.
-
-**** Job Outputs
-
-  Applications can control compression of job-outputs via the
-  {{{../../api/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.html}
-  FileOutputFormat.setCompressOutput(Job, boolean)}} api and the
-  <<<CompressionCodec>>> to be used can be specified via the
-  FileOutputFormat.setOutputCompressorClass(Job, Class) api.
-
-  If the job outputs are to be stored in the
-  {{{../../api/org/apache/hadoop/mapreduce/lib/output/SequenceFileOutputFormat.html}
-  SequenceFileOutputFormat}}, the required <<<SequenceFile.CompressionType>>>
-  (i.e. <<<RECORD>>> / <<<BLOCK>>> - defaults to <<<RECORD>>>) can be specified
-  via the SequenceFileOutputFormat.setOutputCompressionType(Job,
-  SequenceFile.CompressionType) api.
-
-*** Skipping Bad Records
-
-  Hadoop provides an option where a certain set of bad input records can be
-  skipped when processing map inputs. Applications can control this feature
-  through the {{{../../api/org/apache/hadoop/mapred/SkipBadRecords.html}
-  SkipBadRecords}} class.
-
-  This feature can be used when map tasks crash deterministically on certain
-  input. This usually happens due to bugs in the map function. Usually, the
-  user would have to fix these bugs. This is, however, not possible sometimes.
-  The bug may be in third party libraries, for example, for which the source
-  code is not available. In such cases, the task never completes successfully
-  even after multiple attempts, and the job fails. With this feature, only a
-  small portion of data surrounding the bad records is lost, which may be
-  acceptable for some applications (those performing statistical analysis on
-  very large data, for example).
-
-  By default this feature is disabled. For enabling it, refer to
-  {{{../../api/org/apache/hadoop/mapred/SkipBadRecords.html}
-  SkipBadRecords.setMapperMaxSkipRecords(Configuration, long)}} and
-  {{{../../api/org/apache/hadoop/mapred/SkipBadRecords.html}
-  SkipBadRecords.setReducerMaxSkipGroups(Configuration, long)}}.
-
-  With this feature enabled, the framework gets into 'skipping mode' after a
-  certain number of map failures. For more details, see
-  {{{../../api/org/apache/hadoop/mapred/SkipBadRecords.html}
-  SkipBadRecords.setAttemptsToStartSkipping(Configuration, int)}}. In 'skipping
-  mode', map tasks maintain the range of records being processed. To do this,
-  the framework relies on the processed record counter. See
-  {{{../../api/org/apache/hadoop/mapred/SkipBadRecords.html}
-  SkipBadRecords.COUNTER_MAP_PROCESSED_RECORDS}} and
-  {{{../../api/org/apache/hadoop/mapred/SkipBadRecords.html}
-  SkipBadRecords.COUNTER_REDUCE_PROCESSED_GROUPS}}. This counter enables the
-  framework to know how many records have been processed successfully, and
-  hence, what record range caused a task to crash. On further attempts,
-  this range of records is skipped.
-
-  The number of records skipped depends on how frequently the processed record
-  counter is incremented by the application. It is recommended that this
-  counter be incremented after every record is processed. This may not be
-  possible in some applications that typically batch their processing. In such
-  cases, the framework may skip additional records surrounding the bad record.
-  Users can control the number of skipped records through
-  {{{../../api/org/apache/hadoop/mapred/SkipBadRecords.html}
-  SkipBadRecords.setMapperMaxSkipRecords(Configuration, long)}} and
-  {{{../../api/org/apache/hadoop/mapred/SkipBadRecords.html}
-  SkipBadRecords.setReducerMaxSkipGroups(Configuration, long)}}. The framework
-  tries to narrow the range of skipped records using a binary search-like
-  approach. The skipped range is divided into two halves and only one half gets
-  executed. On subsequent failures, the framework figures out which half
-  contains bad records. A task will be re-executed till the acceptable skipped
-  value is met or all task attempts are exhausted. To increase the number of
-  task attempts, use
-  {{{../../api/org/apache/hadoop/mapreduce/Job.html}
-  Job.setMaxMapAttempts(int)}} and
-  {{{../../api/org/apache/hadoop/mapreduce/Job.html}
-  Job.setMaxReduceAttempts(int)}}.
-
-  Skipped records are written to HDFS in the sequence file format, for later
-  analysis. The location can be changed through
-  {{{../../api/org/apache/hadoop/mapred/SkipBadRecords.html}
-  SkipBadRecords.setSkipOutputPath(JobConf, Path)}}.
-
-** Example: WordCount v2.0
-
-  Here is a more complete <<<WordCount>>> which uses many of the features
-  provided by the MapReduce framework we discussed so far.
-
-  This needs the HDFS to be up and running, especially for the
-  <<<DistributedCache>>>-related features. Hence it only works with a
-  {{{../../hadoop-project-dist/hadoop-common/SingleCluster.html}
-  pseudo-distributed}} or
-  {{{../../hadoop-project-dist/hadoop-common/ClusterSetup.html}
-  fully-distributed}} Hadoop installation.
-
-*** Source Code
-
-+---+
-import java.io.BufferedReader;
-import java.io.FileReader;
-import java.io.IOException;
-import java.net.URI;
-import java.util.ArrayList;
-import java.util.HashSet;
-import java.util.List;
-import java.util.Set;
-import java.util.StringTokenizer;
-
-import org.apache.hadoop.conf.Configuration;
-import org.apache.hadoop.fs.Path;
-import org.apache.hadoop.io.IntWritable;
-import org.apache.hadoop.io.Text;
-import org.apache.hadoop.mapreduce.Job;
-import org.apache.hadoop.mapreduce.Mapper;
-import org.apache.hadoop.mapreduce.Reducer;
-import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
-import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
-import org.apache.hadoop.mapreduce.Counter;
-import org.apache.hadoop.util.GenericOptionsParser;
-import org.apache.hadoop.util.StringUtils;
-
-public class WordCount2 {
-
-  public static class TokenizerMapper
-       extends Mapper<Object, Text, Text, IntWritable>{
-
-    static enum CountersEnum { INPUT_WORDS }
-
-    private final static IntWritable one = new IntWritable(1);
-    private Text word = new Text();
-
-    private boolean caseSensitive;
-    private Set<String> patternsToSkip = new HashSet<String>();
-
-    private Configuration conf;
-    private BufferedReader fis;
-
-    @Override
-    public void setup(Context context) throws IOException,
-        InterruptedException {
-      conf = context.getConfiguration();
-      caseSensitive = conf.getBoolean("wordcount.case.sensitive", true);
-      if (conf.getBoolean("wordcount.skip.patterns", false)) {
-        URI[] patternsURIs = Job.getInstance(conf).getCacheFiles();
-        for (URI patternsURI : patternsURIs) {
-          Path patternsPath = new Path(patternsURI.getPath());
-          String patternsFileName = patternsPath.getName().toString();
-          parseSkipFile(patternsFileName);
-        }
-      }
-    }
-
-    private void parseSkipFile(String fileName) {
-      try {
-        fis = new BufferedReader(new FileReader(fileName));
-        String pattern = null;
-        while ((pattern = fis.readLine()) != null) {
-          patternsToSkip.add(pattern);
-        }
-      } catch (IOException ioe) {
-        System.err.println("Caught exception while parsing the cached file: "
-            + StringUtils.stringifyException(ioe));
-      }
-    }
-
-    @Override
-    public void map(Object key, Text value, Context context
-                    ) throws IOException, InterruptedException {
-      String line = (caseSensitive) ?
-          value.toString() : value.toString().toLowerCase();
-      for (String pattern : patternsToSkip) {
-        line = line.replaceAll(pattern, "");
-      }
-      StringTokenizer itr = new StringTokenizer(line);
-      while (itr.hasMoreTokens()) {
-        word.set(itr.nextToken());
-        context.write(word, one);
-        Counter counter = context.getCounter(CountersEnum.class.getName(),
-            CountersEnum.INPUT_WORDS.toString());
-        counter.increment(1);
-      }
-    }
-  }
-
-  public static class IntSumReducer
-       extends Reducer<Text,IntWritable,Text,IntWritable> {
-    private IntWritable result = new IntWritable();
-
-    public void reduce(Text key, Iterable<IntWritable> values,
-                       Context context
-                       ) throws IOException, InterruptedException {
-      int sum = 0;
-      for (IntWritable val : values) {
-        sum += val.get();
-      }
-      result.set(sum);
-      context.write(key, result);
-    }
-  }
-
-  public static void main(String[] args) throws Exception {
-    Configuration conf = new Configuration();
-    GenericOptionsParser optionParser = new GenericOptionsParser(conf, args);
-    String[] remainingArgs = optionParser.getRemainingArgs();
-    if ((remainingArgs.length != 2) && (remainingArgs.length != 4)) {
-      System.err.println("Usage: wordcount <in> <out> [-skip skipPatternFile]");
-      System.exit(2);
-    }
-    Job job = Job.getInstance(conf, "word count");
-    job.setJarByClass(WordCount2.class);
-    job.setMapperClass(TokenizerMapper.class);
-    job.setCombinerClass(IntSumReducer.class);
-    job.setReducerClass(IntSumReducer.class);
-    job.setOutputKeyClass(Text.class);
-    job.setOutputValueClass(IntWritable.class);
-
-    List<String> otherArgs = new ArrayList<String>();
-    for (int i=0; i < remainingArgs.length; ++i) {
-      if ("-skip".equals(remainingArgs[i])) {
-        job.addCacheFile(new Path(remainingArgs[++i]).toUri());
-        job.getConfiguration().setBoolean("wordcount.skip.patterns", true);
-      } else {
-        otherArgs.add(remainingArgs[i]);
-      }
-    }
-    FileInputFormat.addInputPath(job, new Path(otherArgs.get(0)));
-    FileOutputFormat.setOutputPath(job, new Path(otherArgs.get(1)));
-
-    System.exit(job.waitForCompletion(true) ? 0 : 1);
-  }
-}
-+---+
-
-*** Sample Runs
-
-  Sample text-files as input:
-
-  <<<$ bin/hdfs dfs -ls /user/joe/wordcount/input/>>> \
-  <<</user/joe/wordcount/input/file01>>> \
-  <<</user/joe/wordcount/input/file02>>> \
-  \
-  <<<$ bin/hdfs dfs -cat /user/joe/wordcount/input/file01>>> \
-  <<<Hello World, Bye World!>>> \
-  \
-  <<<$ bin/hdfs dfs -cat /user/joe/wordcount/input/file02>>> \
-  <<<Hello Hadoop, Goodbye to hadoop.>>>
-
-  Run the application:
-
-  <<<$ bin/hadoop jar wc.jar WordCount2 /user/joe/wordcount/input
-  /user/joe/wordcount/output>>>
-
-  Output:
-
-  <<<$ bin/hdfs dfs -cat /user/joe/wordcount/output/part-r-00000>>> \
-  <<<Bye     1>>> \
-  <<<Goodbye 1>>> \
-  <<<Hadoop, 1>>> \
-  <<<Hello   2>>> \
-  <<<World!  1>>> \
-  <<<World,  1>>> \
-  <<<hadoop. 1>>> \
-  <<<to      1>>>
-
-  Notice that the inputs differ from the first version we looked at, and how
-  they affect the outputs.
-
-  Now, let's plug in a pattern-file which lists the word-patterns to be ignored,
-  via the <<<DistributedCache>>>.
-
-  <<<$ bin/hdfs dfs -cat /user/joe/wordcount/patterns.txt>>> \
-  <<<\\.>>> \
-  <<<\\,>>> \
-  <<<\\!>>> \
-  <<<to>>>
-
-  Run it again, this time with more options:
-
-  <<<$ bin/hadoop jar wc.jar WordCount2
-     -Dwordcount.case.sensitive=true /user/joe/wordcount/input
-     /user/joe/wordcount/output -skip /user/joe/wordcount/patterns.txt>>>
-
-  As expected, the output:
-
-  <<<$ bin/hdfs dfs -cat /user/joe/wordcount/output/part-r-00000>>> \
-  <<<Bye     1>>> \
-  <<<Goodbye 1>>> \
-  <<<Hadoop  1>>> \
-  <<<Hello   2>>> \
-  <<<World   2>>> \
-  <<<hadoop  1>>>
-
-  Run it once more, this time with case-sensitivity switched off:
-
-  <<<$ bin/hadoop jar wc.jar WordCount2
-     -Dwordcount.case.sensitive=false /user/joe/wordcount/input
-     /user/joe/wordcount/output -skip /user/joe/wordcount/patterns.txt>>>
-
-  Sure enough, the output:
-
-  <<<$ bin/hdfs dfs -cat /user/joe/wordcount/output/part-r-00000>>> \
-  <<<bye     1>>> \
-  <<<goodbye 1>>> \
-  <<<hadoop  2>>> \
-  <<<hello   2>>> \
-  <<<world   2>>>
-
-*** Highlights
-
-  The second version of <<<WordCount>>> improves upon the previous one by using
-  some features offered by the MapReduce framework:
-
-   * Demonstrates how applications can access configuration parameters in the
-     <<<setup>>> method of the <<<Mapper>>> (and <<<Reducer>>>)
-     implementations.
-
-   * Demonstrates how the <<<DistributedCache>>> can be used to distribute
-     read-only data needed by the jobs. Here it allows the user to specify
-     word-patterns to skip while counting.
-
-   * Demonstrates the utility of the <<<GenericOptionsParser>>> to handle
-     generic Hadoop command-line options.
-
-   * Demonstrates how applications can use <<<Counters>>> and how they can set
-     application-specific status information passed to the <<<map>>> (and
-     <<<reduce>>>) method.
-
-  <Java and JNI are trademarks or registered trademarks of Oracle America,
-  Inc. in the United States and other countries.>
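
The skip-pattern handling in the second WordCount version can be tried without a cluster. The following is a minimal stand-alone sketch, not part of the tutorial sources: the class name `SkipPatternSketch` and its helper methods are hypothetical, and a `TreeMap` stands in for the framework's sorted reducer output. It applies the entries of `patterns.txt` as regular expressions to the two sample input lines, the same way the `map` method does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical stand-alone sketch; it does not depend on any Hadoop classes.
class SkipPatternSketch {

  // Mirrors the mapper's per-line logic: optional lower-casing, removal of
  // every skip pattern (each is a regular expression), then tokenization.
  static List<String> tokenize(String line, List<String> patternsToSkip,
                               boolean caseSensitive) {
    String text = caseSensitive ? line : line.toLowerCase();
    for (String pattern : patternsToSkip) {
      text = text.replaceAll(pattern, "");
    }
    List<String> tokens = new ArrayList<String>();
    StringTokenizer itr = new StringTokenizer(text);
    while (itr.hasMoreTokens()) {
      tokens.add(itr.nextToken());
    }
    return tokens;
  }

  // Stands in for the combiner/reducer: adds one to the count of each token.
  // The TreeMap keeps keys sorted, as the job output is.
  static Map<String, Integer> count(List<String> lines,
                                    List<String> patternsToSkip,
                                    boolean caseSensitive) {
    Map<String, Integer> counts = new TreeMap<String, Integer>();
    for (String line : lines) {
      for (String token : tokenize(line, patternsToSkip, caseSensitive)) {
        Integer old = counts.get(token);
        counts.put(token, old == null ? 1 : old + 1);
      }
    }
    return counts;
  }

  public static void main(String[] args) {
    // The contents of file01 and file02 from the sample runs.
    List<String> lines = Arrays.asList(
        "Hello World, Bye World!",
        "Hello Hadoop, Goodbye to hadoop.");
    // The contents of patterns.txt, one regular expression per entry.
    List<String> patterns = Arrays.asList("\\.", "\\,", "\\!", "to");
    System.out.println(count(lines, patterns, true));   // case-sensitive
    System.out.println(count(lines, patterns, false));  // case-insensitive
  }
}
```

With case sensitivity on, the sketch gives the same counts as the second sample run (`World` 2, `Hadoop` 1, `hadoop` 1); with it off, `hadoop` and `world` each count 2. Each pattern is removed wherever it matches, so a pattern such as `to` would also strip those letters from inside longer words.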

+ 0 - 114
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/MapReduce_Compatibility_Hadoop1_Hadoop2.apt.vm

@@ -1,114 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~   http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License. See accompanying LICENSE file.
-
-  ---
-  Hadoop Map Reduce Next Generation-${project.version} - Backward Compatibility
-  ---
-  ---
-  ${maven.build.timestamp}
-
-Apache Hadoop MapReduce - Migrating from Apache Hadoop 1.x to Apache Hadoop 2.x 
-
-* {Introduction}
-
-  This document provides information for users to migrate their Apache Hadoop 
-  MapReduce applications from Apache Hadoop 1.x to Apache Hadoop 2.x.
-
-  In Apache Hadoop 2.x we have spun off resource management capabilities
-  into Apache Hadoop YARN, a general purpose, distributed application management 
-  framework while Apache Hadoop MapReduce (aka MRv2) remains as a pure 
-  distributed computation framework.
-
-  In general, the previous MapReduce runtime (aka MRv1) has been reused and
-  no major surgery has been conducted on it. Therefore, MRv2 is able to ensure
-  satisfactory compatibility with MRv1 applications. However, due to some
-  improvements and code refactorings, a few APIs have been rendered
-  backward-incompatible. 
-  
-  The remainder of this page will discuss the scope and the level of backward 
-  compatibility that we support in Apache Hadoop MapReduce 2.x (MRv2).
-
-* {Binary Compatibility}
-
-  First, we ensure binary compatibility for applications that use the old
-  <<mapred>> APIs. This means that applications which were built against MRv1
-  <<mapred>> APIs can run directly on YARN without recompilation, merely by 
-  pointing them to an Apache Hadoop 2.x cluster via configuration.
-
-* {Source Compatibility}
-
-  We cannot ensure complete binary compatibility with the applications that use
-  <<mapreduce>> APIs, as these APIs have evolved a lot since MRv1. However, we
-  ensure source compatibility for <<mapreduce>> APIs that break binary
-  compatibility. In other words, users should recompile their applications that 
-  use <<mapreduce>> APIs against MRv2 jars. One notable binary-incompatible
-  change affects Counter and CounterGroup.
-
-* {Not Supported}
-
-  MRAdmin has been removed in MRv2 because <<<mradmin>>> commands
-  no longer exist. They have been replaced by the commands in <<<rmadmin>>>. We
-  neither support binary compatibility nor source compatibility for the
-  applications that use this class directly.
-
-* {Tradeoffs between MRv1 Users and Early MRv2 Adopters}
-
-  Unfortunately, maintaining binary compatibility for MRv1 applications may lead
-  to binary incompatibility issues for early MRv2 adopters, in particular Hadoop
-  0.23 users. For <<mapred>> APIs, we have chosen to be compatible with MRv1
-  applications, which have a larger user base. For <<mapreduce>> APIs, if they
-  don't significantly break Hadoop 0.23 applications, we still change them to be
-  compatible with MRv1 applications. Below is the list of MapReduce APIs which
-  are incompatible with Hadoop 0.23.
-
-*-----------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+
-| <<Problematic Function>>                                                          | <<Incompatibility Issue>>                                                                                        |
-*-----------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+
-| <<<org.apache.hadoop.util.ProgramDriver#drive>>>                                  | Return type changes from <<<void>>> to <<<int>>>                                                                 |
-*-----------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+
-| <<<org.apache.hadoop.mapred.jobcontrol.Job#getMapredJobID>>>                      | Return type changes from <<<String>>> to <<<JobID>>>                                                             |
-*-----------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+
-| <<<org.apache.hadoop.mapred.TaskReport#getTaskId>>>                               | Return type changes from <<<String>>> to <<<TaskID>>>                                                            |
-*-----------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+
-| <<<org.apache.hadoop.mapred.ClusterStatus#UNINITIALIZED_MEMORY_VALUE>>>           | Data type changes from <<<long>>> to <<<int>>>                                                                   |
-*-----------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+
-| <<<org.apache.hadoop.mapreduce.filecache.DistributedCache#getArchiveTimestamps>>> | Return type changes from <<<long[]>>> to <<<String[]>>>                                                          |
-*-----------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+
-| <<<org.apache.hadoop.mapreduce.filecache.DistributedCache#getFileTimestamps>>>    | Return type changes from <<<long[]>>> to <<<String[]>>>                                                          |
-*-----------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+
-| <<<org.apache.hadoop.mapreduce.Job#failTask>>>                                    | Return type changes from <<<void>>> to <<<boolean>>>                                                             |
-*-----------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+
-| <<<org.apache.hadoop.mapreduce.Job#killTask>>>                                    | Return type changes from <<<void>>> to <<<boolean>>>                                                             |
-*-----------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+
-| <<<org.apache.hadoop.mapreduce.Job#getTaskCompletionEvents>>>                     | Return type changes from <<<o.a.h.mapred.TaskCompletionEvent[]>>> to <<<o.a.h.mapreduce.TaskCompletionEvent[]>>> |
-*-----------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+
-
-* {Malicious}
-
-  For the users who are going to try <<<hadoop-examples-1.x.x.jar>>> on YARN,
-  please note that <<<hadoop jar hadoop-examples-1.x.x.jar>>> will still use
-  <<<hadoop-mapreduce-examples-2.x.x.jar>>>, which is installed together with
-  other MRv2 jars. By default Hadoop framework jars appear before the users'
-  jars in the classpath, such that the classes from the 2.x.x jar will still be
-  picked. Users should remove <<<hadoop-mapreduce-examples-2.x.x.jar>>>
-  from the classpath of all the nodes in a cluster. Otherwise, users need to
-  set <<<HADOOP_USER_CLASSPATH_FIRST=true>>> and
-  <<<HADOOP_CLASSPATH=...:hadoop-examples-1.x.x.jar>>> to run their target
-  examples jar, and add the following configuration in <<<mapred-site.xml>>> to
-  make the processes in YARN containers pick this jar as well.
-
-+---+
-    <property>
-        <name>mapreduce.job.user.classpath.first</name>
-        <value>true</value>
-    </property>
-+---+

+ 0 - 2709
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/MapredAppMasterRest.apt.vm

@@ -1,2709 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~   http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License. See accompanying LICENSE file.
-
-  ---
-  MapReduce Application Master REST APIs.
-  ---
-  ---
-  ${maven.build.timestamp}
-
-MapReduce Application Master REST APIs.
-
-%{toc|section=1|fromDepth=0|toDepth=2}
-
-* Overview
-
-  The MapReduce Application Master REST APIs allow the user to get the status of a running MapReduce application master. Currently this is equivalent to a running MapReduce job. The information includes the jobs the application master is running and all the job particulars such as tasks, counters, configuration, and attempts. The application master should be accessed via the proxy, which can be configured to run either on the resource manager or on a separate host. The proxy URL usually looks like: http://<proxy http address:port>/proxy/{appid}.
-  
-* Mapreduce Application Master Info API
-
-  The MapReduce application master information resource provides overall information about that mapreduce application master. This includes application id, time it was started, user, name, etc. 
-
-** URI
-
-  Both of the following URIs return the MapReduce application master information for the application identified by the appid value.
-
-------
-  * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce
-  * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/info
-------
-
-** HTTP Operations Supported
-
-------
-  * GET
-------
-
-** Query Parameters Supported
-
-------
-  None
-------
-
-** Elements of the <info> object
-
-  When you make a request for the mapreduce application master information, the information will be returned as an info object.
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| appId         | string       | The application id |
-*---------------+--------------+-------------------------------+
-| startedOn     | long         | The time the application started (in ms since epoch)|
-*---------------+--------------+-------------------------------+
-| name | string | The name of the application |
-*---------------+--------------+-------------------------------+
-| user | string | The user name of the user who started the application |
-*---------------+--------------+-------------------------------+
-| elapsedTime | long | The time since the application was started (in ms)|
-*---------------+--------------+-------------------------------+
-
-** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0003/ws/v1/mapreduce/info
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Transfer-Encoding: chunked
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-{   
-  "info" : {
-      "appId" : "application_1326232085508_0003",
-      "startedOn" : 1326238244047,
-      "user" : "user1",
-      "name" : "Sleep job",
-      "elapsedTime" : 32374
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0003/ws/v1/mapreduce/info
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 223
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<info>
-  <appId>application_1326232085508_0003</appId>
-  <name>Sleep job</name>
-  <user>user1</user>
-  <startedOn>1326238244047</startedOn>
-  <elapsedTime>32407</elapsedTime>
-</info>
-+---+
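
A client can request the info resource with any HTTP library. The following is a minimal sketch in Java that uses only `java.net.HttpURLConnection`; the class name `AppMasterInfoSketch`, its method names, and the proxy address passed on the command line are assumptions for illustration, not part of the API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical client sketch; not shipped with Hadoop.
class AppMasterInfoSketch {

  // Builds the info resource URI from the proxy address and application id.
  static String infoUrl(String proxyHostPort, String appId) {
    return "http://" + proxyHostPort + "/proxy/" + appId
        + "/ws/v1/mapreduce/info";
  }

  // Issues a GET request and returns the response body as text.
  // The accept argument selects the representation, for example
  // "application/json" or "application/xml".
  static String fetch(String url, String accept) throws IOException {
    HttpURLConnection conn =
        (HttpURLConnection) new URL(url).openConnection();
    conn.setRequestMethod("GET");
    conn.setRequestProperty("Accept", accept);
    StringBuilder body = new StringBuilder();
    try (BufferedReader in = new BufferedReader(new InputStreamReader(
        conn.getInputStream(), StandardCharsets.UTF_8))) {
      String line;
      while ((line = in.readLine()) != null) {
        body.append(line).append('\n');
      }
    } finally {
      conn.disconnect();
    }
    return body.toString();
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 2) {
      System.err.println(
          "Usage: AppMasterInfoSketch <proxy host:port> <appid>");
      System.exit(2);
    }
    System.out.println(fetch(infoUrl(args[0], args[1]), "application/json"));
  }
}
```

Passing `application/xml` as the second argument of `fetch` requests the XML form shown above instead of JSON.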
-
-* Jobs API
-
-  The jobs resource provides a list of the jobs running on this application master.  See also {{Job API}} for syntax of the job object.
-
-** URI
-
-------
-  *  http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/jobs
-------
-
-** HTTP Operations Supported 
-
-------
-  * GET
-------
-
-** Query Parameters Supported
-
-------
-  None
-------
-
-** Elements of the <jobs> object
-
-  When you make a request for the list of jobs, the information will be returned as a collection of job objects. See also {{Job API}} for syntax of the job object.
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type  || Description                   |
-*---------------+--------------+-------------------------------+
-| job | array of job objects(JSON)/Zero or more job objects(XML) | The collection of job objects |
-*---------------+--------------+-------------------------------+
-
-** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Transfer-Encoding: chunked
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-{
-  "jobs" : {
-      "job" : [
-         {
-            "runningReduceAttempts" : 1,
-            "reduceProgress" : 100,
-            "failedReduceAttempts" : 0,
-            "newMapAttempts" : 0,
-            "mapsRunning" : 0,
-            "state" : "RUNNING",
-            "successfulReduceAttempts" : 0,
-            "reducesRunning" : 1,
-            "acls" : [
-               {
-                  "value" : " ",
-                  "name" : "mapreduce.job.acl-modify-job"
-               },
-               {
-                  "value" : " ",
-                  "name" : "mapreduce.job.acl-view-job"
-               }
-            ],
-            "reducesPending" : 0,
-            "user" : "user1",
-            "reducesTotal" : 1,
-            "mapsCompleted" : 1,
-            "startTime" : 1326238769379,
-            "id" : "job_1326232085508_4_4",
-            "successfulMapAttempts" : 1,
-            "runningMapAttempts" : 0,
-            "newReduceAttempts" : 0,
-            "name" : "Sleep job",
-            "mapsPending" : 0,
-            "elapsedTime" : 59377,
-            "reducesCompleted" : 0,
-            "mapProgress" : 100,
-            "diagnostics" : "",
-            "failedMapAttempts" : 0,
-            "killedReduceAttempts" : 0,
-            "mapsTotal" : 1,
-            "uberized" : false,
-            "killedMapAttempts" : 0,
-            "finishTime" : 0
-         }
-     ]
-   }
- }
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 1214 
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<jobs>
-  <job>
-    <startTime>1326238769379</startTime>
-    <finishTime>0</finishTime>
-    <elapsedTime>59416</elapsedTime>
-    <id>job_1326232085508_4_4</id>
-    <name>Sleep job</name>
-    <user>user1</user>
-    <state>RUNNING</state>
-    <mapsTotal>1</mapsTotal>
-    <mapsCompleted>1</mapsCompleted>
-    <reducesTotal>1</reducesTotal>
-    <reducesCompleted>0</reducesCompleted>
-    <mapProgress>100.0</mapProgress>
-    <reduceProgress>100.0</reduceProgress>
-    <mapsPending>0</mapsPending>
-    <mapsRunning>0</mapsRunning>
-    <reducesPending>0</reducesPending>
-    <reducesRunning>1</reducesRunning>
-    <uberized>false</uberized>
-    <diagnostics/>
-    <newReduceAttempts>0</newReduceAttempts>
-    <runningReduceAttempts>1</runningReduceAttempts>
-    <failedReduceAttempts>0</failedReduceAttempts>
-    <killedReduceAttempts>0</killedReduceAttempts>
-    <successfulReduceAttempts>0</successfulReduceAttempts>
-    <newMapAttempts>0</newMapAttempts>
-    <runningMapAttempts>0</runningMapAttempts>
-    <failedMapAttempts>0</failedMapAttempts>
-    <killedMapAttempts>0</killedMapAttempts>
-    <successfulMapAttempts>1</successfulMapAttempts>
-    <acls>
-      <name>mapreduce.job.acl-modify-job</name>
-      <value> </value>
-    </acls>
-    <acls>
-      <name>mapreduce.job.acl-view-job</name>
-      <value> </value>
-    </acls>
-  </job>
-</jobs>
-+---+
-
-* {Job API}
-
-  A job resource contains information about a particular job that was started by this application master. Certain fields are only accessible if the user has the required permissions, depending on the ACL settings.
-
-** URI
-
-  Use the following URI to obtain a job object, for a job identified by the jobid value.
-
-------
-  * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/jobs/{jobid}
-------
-
-** HTTP Operations Supported 
-
-------
-  * GET
-------
-
-** Query Parameters Supported
-
-------
-  None
-------
-
-** Elements of the <job> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| id | string | The job id|
-*---------------+--------------+-------------------------------+
-| name | string | The job name |
-*---------------+--------------+-------------------------------+
-| user | string | The user name |
-*---------------+--------------+-------------------------------+
-| state | string | The job state - valid values are:  NEW, INITED, RUNNING, SUCCEEDED, FAILED, KILL_WAIT, KILLED, ERROR|
-*---------------+--------------+-------------------------------+
-| startTime | long | The time the job started (in ms since epoch)|
-*---------------+--------------+-------------------------------+
-| finishTime | long | The time the job finished (in ms since epoch)|
-*---------------+--------------+-------------------------------+
-| elapsedTime | long | The elapsed time since job started (in ms)|
-*---------------+--------------+-------------------------------+
-| mapsTotal | int | The total number of maps |
-*---------------+--------------+-------------------------------+
-| mapsCompleted | int | The number of completed maps |
-*---------------+--------------+-------------------------------+
-| reducesTotal | int | The total number of reduces |
-*---------------+--------------+-------------------------------+
-| reducesCompleted | int | The number of completed reduces|
-*---------------+--------------+-------------------------------+
-| diagnostics | string | A diagnostic message |
-*---------------+--------------+-------------------------------+
-| uberized | boolean | Indicates if the job was an uber job - ran completely in the application master|
-*---------------+--------------+-------------------------------+
-| mapsPending | int | The number of maps still to be run|
-*---------------+--------------+-------------------------------+
-| mapsRunning | int | The number of running maps |
-*---------------+--------------+-------------------------------+
-| reducesPending | int | The number of reduces still to be run |
-*---------------+--------------+-------------------------------+
-| reducesRunning | int | The number of running reduces|
-*---------------+--------------+-------------------------------+
-| newReduceAttempts | int | The number of new reduce attempts |
-*---------------+--------------+-------------------------------+
-| runningReduceAttempts | int | The number of running reduce attempts |
-*---------------+--------------+-------------------------------+
-| failedReduceAttempts | int | The number of failed reduce attempts |
-*---------------+--------------+-------------------------------+
-| killedReduceAttempts | int | The number of killed reduce attempts |
-*---------------+--------------+-------------------------------+
-| successfulReduceAttempts | int | The number of successful reduce attempts |
-*---------------+--------------+-------------------------------+
-| newMapAttempts | int | The number of new map attempts |
-*---------------+--------------+-------------------------------+
-| runningMapAttempts | int | The number of running map attempts |
-*---------------+--------------+-------------------------------+
-| failedMapAttempts | int | The number of failed map attempts |
-*---------------+--------------+-------------------------------+
-| killedMapAttempts | int | The number of killed map attempts |
-*---------------+--------------+-------------------------------+
-| successfulMapAttempts | int | The number of successful map attempts |
-*---------------+--------------+-------------------------------+
-| acls | array of acls(json)/zero or more acls objects(xml)| A collection of acls objects |
-*---------------+--------------+-------------------------------+
-
-** Elements of the <acls> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| value | string | The acl value|
-*---------------+--------------+-------------------------------+
-| name | string | The acl name |
-*---------------+--------------+-------------------------------+
-
-** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Server: Jetty(6.1.26)
-  Content-Length: 720
-+---+
-
-  Response Body:
-
-+---+
-{
-   "job" : {
-      "runningReduceAttempts" : 1,
-      "reduceProgress" : 100,
-      "failedReduceAttempts" : 0,
-      "newMapAttempts" : 0,
-      "mapsRunning" : 0,
-      "state" : "RUNNING",
-      "successfulReduceAttempts" : 0,
-      "reducesRunning" : 1,
-      "acls" : [
-         {  
-            "value" : " ",
-            "name" : "mapreduce.job.acl-modify-job"
-         },
-         {  
-            "value" : " ",
-            "name" : "mapreduce.job.acl-view-job"
-         }
-      ],
-      "reducesPending" : 0,
-      "user" : "user1",
-      "reducesTotal" : 1,
-      "mapsCompleted" : 1,
-      "startTime" : 1326238769379,
-      "id" : "job_1326232085508_4_4",
-      "successfulMapAttempts" : 1,
-      "runningMapAttempts" : 0,
-      "newReduceAttempts" : 0,
-      "name" : "Sleep job",
-      "mapsPending" : 0,
-      "elapsedTime" : 59437,
-      "reducesCompleted" : 0,
-      "mapProgress" : 100,
-      "diagnostics" : "",
-      "failedMapAttempts" : 0,
-      "killedReduceAttempts" : 0,
-      "mapsTotal" : 1,
-      "uberized" : false,
-      "killedMapAttempts" : 0,
-      "finishTime" : 0
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 1201
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<job>
-  <startTime>1326238769379</startTime>
-  <finishTime>0</finishTime>
-  <elapsedTime>59474</elapsedTime>
-  <id>job_1326232085508_4_4</id>
-  <name>Sleep job</name>
-  <user>user1</user>
-  <state>RUNNING</state>
-  <mapsTotal>1</mapsTotal>
-  <mapsCompleted>1</mapsCompleted>
-  <reducesTotal>1</reducesTotal>
-  <reducesCompleted>0</reducesCompleted>
-  <mapProgress>100.0</mapProgress>
-  <reduceProgress>100.0</reduceProgress>
-  <mapsPending>0</mapsPending>
-  <mapsRunning>0</mapsRunning>
-  <reducesPending>0</reducesPending>
-  <reducesRunning>1</reducesRunning>
-  <uberized>false</uberized>
-  <diagnostics/>  
-  <newReduceAttempts>0</newReduceAttempts>
-  <runningReduceAttempts>1</runningReduceAttempts>
-  <failedReduceAttempts>0</failedReduceAttempts>
-  <killedReduceAttempts>0</killedReduceAttempts>
-  <successfulReduceAttempts>0</successfulReduceAttempts>
-  <newMapAttempts>0</newMapAttempts>
-  <runningMapAttempts>0</runningMapAttempts>
-  <failedMapAttempts>0</failedMapAttempts>
-  <killedMapAttempts>0</killedMapAttempts>
-  <successfulMapAttempts>1</successfulMapAttempts>
-  <acls>
-    <name>mapreduce.job.acl-modify-job</name>
-    <value> </value>
-  </acls>
-  <acls>
-    <name>mapreduce.job.acl-view-job</name>
-    <value> </value>
-  </acls>
-</job>
-+---+
-
-* Job Attempts API
-
-  With the job attempts API, you can obtain a collection of resources that represent the job attempts.  When you run a GET operation on this resource, you obtain a collection of Job Attempt Objects.
-
-** URI
-
-------
-  * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/jobs/{jobid}/jobattempts
-------
-
-** HTTP Operations Supported
-
-------
-  * GET
-------
-
-** Query Parameters Supported
-
-------
-  None
-------
-
-** Elements of the <jobAttempts> object
-
-  When you make a request for the list of job attempts, the information will be returned as an array of job attempt objects.
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                  |
-*---------------+--------------+-------------------------------+
-| jobAttempt | array of job attempt objects(JSON)/zero or more job attempt objects(XML) | The collection of job attempt objects |
-*---------------+--------------+--------------------------------+
-
-** Elements of the <jobAttempt> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                  |
-*---------------+--------------+-------------------------------+
-| id | string | The job attempt id |
-*---------------+--------------+--------------------------------+
-| nodeId | string | The node id of the node the attempt ran on|
-*---------------+--------------+--------------------------------+
-| nodeHttpAddress | string | The node http address of the node the attempt ran on|
-*---------------+--------------+--------------------------------+
-| logsLink | string | The http link to the job attempt logs |
-*---------------+--------------+--------------------------------+
-| containerId | string | The id of the container for the job attempt |
-*---------------+--------------+--------------------------------+
-| startTime | long | The start time of the attempt (in ms since epoch)|
-*---------------+--------------+--------------------------------+
-
-** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/jobattempts
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Transfer-Encoding: chunked
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-{
-   "jobAttempts" : {
-      "jobAttempt" : [
-         {    
-            "nodeId" : "host.domain.com:8041",
-            "nodeHttpAddress" : "host.domain.com:8042",
-            "startTime" : 1326238773493,
-            "id" : 1, 
-            "logsLink" : "http://host.domain.com:8042/node/containerlogs/container_1326232085508_0004_01_000001",
-            "containerId" : "container_1326232085508_0004_01_000001"
-         }  
-      ]
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/jobattempts
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 498
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<jobAttempts>
-  <jobAttempt>
-    <nodeHttpAddress>host.domain.com:8042</nodeHttpAddress>
-    <nodeId>host.domain.com:8041</nodeId>
-    <id>1</id>
-    <startTime>1326238773493</startTime>
-    <containerId>container_1326232085508_0004_01_000001</containerId>
-    <logsLink>http://host.domain.com:8042/node/containerlogs/container_1326232085508_0004_01_000001</logsLink>
-  </jobAttempt>
-</jobAttempts>
-+---+
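
  As a rough illustration of consuming the job attempts response, the Python sketch below parses the JSON body from the example above and converts startTime (milliseconds since epoch) into a UTC timestamp. The function name summarize_attempts is hypothetical and not part of Hadoop; fetching the body over HTTP is left out so the sketch stays self-contained.

```python
import json
from datetime import datetime, timezone

# Body copied from the JSON response example above.
SAMPLE = """
{
   "jobAttempts" : {
      "jobAttempt" : [
         {
            "nodeId" : "host.domain.com:8041",
            "nodeHttpAddress" : "host.domain.com:8042",
            "startTime" : 1326238773493,
            "id" : 1,
            "logsLink" : "http://host.domain.com:8042/node/containerlogs/container_1326232085508_0004_01_000001",
            "containerId" : "container_1326232085508_0004_01_000001"
         }
      ]
   }
}
"""


def summarize_attempts(body):
    """Return (id, containerId, start time as a UTC datetime) per attempt."""
    attempts = json.loads(body)["jobAttempts"]["jobAttempt"]
    return [
        (
            a["id"],
            a["containerId"],
            # startTime is reported in milliseconds since the epoch.
            datetime.fromtimestamp(a["startTime"] / 1000, tz=timezone.utc),
        )
        for a in attempts
    ]


for attempt_id, container, started in summarize_attempts(SAMPLE):
    print(attempt_id, container, started.isoformat())
```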
-
-* Job Counters API
-
-  With the job counters API, you can obtain a collection of resources that represent all the counters for that job.
-
-** URI
-
-------
-  * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/jobs/{jobid}/counters
-------
-
-** HTTP Operations Supported 
-
-------
-  * GET
-------
-
-** Query Parameters Supported
-
-------
-  None
-------
-
-** Elements of the <jobCounters> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| id | string | The job id |
-*---------------+--------------+-------------------------------+
-| counterGroup | array of counterGroup objects(JSON)/zero or more counterGroup objects(XML) | A collection of counter group objects |
-*---------------+--------------+-------------------------------+
-
-** Elements of the <counterGroup> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| counterGroupName | string | The name of the counter group |
-*---------------+--------------+-------------------------------+
-| counter | array of counter objects(JSON)/zero or more counter objects(XML) | A collection of counter objects |
-*---------------+--------------+-------------------------------+
-
-** Elements of the <counter> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| name | string | The name of the counter |
-*---------------+--------------+-------------------------------+
-| reduceCounterValue | long | The counter value of reduce tasks |
-*---------------+--------------+-------------------------------+
-| mapCounterValue | long | The counter value of map tasks |
-*---------------+--------------+-------------------------------+
-| totalCounterValue | long | The counter value of all tasks |
-*---------------+--------------+-------------------------------+
-
-** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/counters
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Transfer-Encoding: chunked
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-{
-   "jobCounters" : {
-      "id" : "job_1326232085508_4_4",
-      "counterGroup" : [
-         {
-            "counterGroupName" : "Shuffle Errors",
-            "counter" : [
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "BAD_ID"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "CONNECTION"
-               }, 
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "IO_ERROR"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "WRONG_LENGTH"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "WRONG_MAP"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "WRONG_REDUCE"
-               }
-            ]
-         }, 
-         {  
-            "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
-            "counter" : [
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 2483,
-                  "name" : "FILE_BYTES_READ"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 108763,
-                  "name" : "FILE_BYTES_WRITTEN"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "FILE_READ_OPS"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "FILE_LARGE_READ_OPS"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "FILE_WRITE_OPS"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 48,
-                  "name" : "HDFS_BYTES_READ"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "HDFS_BYTES_WRITTEN"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 1,
-                  "name" : "HDFS_READ_OPS"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "HDFS_LARGE_READ_OPS"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "HDFS_WRITE_OPS"
-               }
-            ]
-         }, 
-         {  
-            "counterGroupName" : "org.apache.hadoop.mapreduce.TaskCounter",
-            "counter" : [
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 1,
-                  "name" : "MAP_INPUT_RECORDS"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 1200,
-                  "name" : "MAP_OUTPUT_RECORDS"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 4800,
-                  "name" : "MAP_OUTPUT_BYTES"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 2235,
-                  "name" : "MAP_OUTPUT_MATERIALIZED_BYTES"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 48,
-                  "name" : "SPLIT_RAW_BYTES"
-               }, 
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "COMBINE_INPUT_RECORDS"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "COMBINE_OUTPUT_RECORDS"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 460,
-                  "name" : "REDUCE_INPUT_GROUPS"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 2235,
-                  "name" : "REDUCE_SHUFFLE_BYTES"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 460,
-                  "name" : "REDUCE_INPUT_RECORDS"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "REDUCE_OUTPUT_RECORDS"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 1200,
-                  "name" : "SPILLED_RECORDS"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 1,
-                  "name" : "SHUFFLED_MAPS"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "FAILED_SHUFFLE"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 1,
-                  "name" : "MERGED_MAP_OUTPUTS"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 58,
-                  "name" : "GC_TIME_MILLIS"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 1580,
-                  "name" : "CPU_MILLISECONDS"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 462643200,
-                  "name" : "PHYSICAL_MEMORY_BYTES"
-               }, 
-               {   
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 2149728256,
-                  "name" : "VIRTUAL_MEMORY_BYTES"
-               }, 
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 357957632,
-                  "name" : "COMMITTED_HEAP_BYTES"
-               }
-            ]
-         },
-         {  
-            "counterGroupName" : "org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter",
-            "counter" : [
-               {  
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "BYTES_READ"
-               }
-            ]
-         },
-         {
-            "counterGroupName" : "org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter",
-            "counter" : [
-               {  
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "BYTES_WRITTEN"
-               }
-            ]
-         }
-      ]
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/counters
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 7027
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<jobCounters>
-  <id>job_1326232085508_4_4</id>
-  <counterGroup>
-    <counterGroupName>Shuffle Errors</counterGroupName>
-    <counter>
-      <name>BAD_ID</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>CONNECTION</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>IO_ERROR</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>WRONG_LENGTH</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>WRONG_MAP</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>WRONG_REDUCE</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-  </counterGroup>
-  <counterGroup>
-    <counterGroupName>org.apache.hadoop.mapreduce.FileSystemCounter</counterGroupName>
-    <counter>
-      <name>FILE_BYTES_READ</name>
-      <totalCounterValue>2483</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>FILE_BYTES_WRITTEN</name>
-      <totalCounterValue>108763</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>FILE_READ_OPS</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>FILE_LARGE_READ_OPS</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>FILE_WRITE_OPS</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>HDFS_BYTES_READ</name>
-      <totalCounterValue>48</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>HDFS_BYTES_WRITTEN</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>HDFS_READ_OPS</name>
-      <totalCounterValue>1</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>HDFS_LARGE_READ_OPS</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>HDFS_WRITE_OPS</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-  </counterGroup>
-  <counterGroup>
-    <counterGroupName>org.apache.hadoop.mapreduce.TaskCounter</counterGroupName> 
-    <counter>
-      <name>MAP_INPUT_RECORDS</name>
-      <totalCounterValue>1</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>MAP_OUTPUT_RECORDS</name>
-      <totalCounterValue>1200</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>MAP_OUTPUT_BYTES</name>
-      <totalCounterValue>4800</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>MAP_OUTPUT_MATERIALIZED_BYTES</name>
-      <totalCounterValue>2235</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>SPLIT_RAW_BYTES</name>
-      <totalCounterValue>48</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>COMBINE_INPUT_RECORDS</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>COMBINE_OUTPUT_RECORDS</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>REDUCE_INPUT_GROUPS</name>
-      <totalCounterValue>460</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>REDUCE_SHUFFLE_BYTES</name>
-      <totalCounterValue>2235</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>REDUCE_INPUT_RECORDS</name>
-      <totalCounterValue>460</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>REDUCE_OUTPUT_RECORDS</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>SPILLED_RECORDS</name>
-      <totalCounterValue>1200</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>SHUFFLED_MAPS</name>
-      <totalCounterValue>1</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>FAILED_SHUFFLE</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>MERGED_MAP_OUTPUTS</name>
-      <totalCounterValue>1</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>GC_TIME_MILLIS</name>
-      <totalCounterValue>58</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>CPU_MILLISECONDS</name>
-      <totalCounterValue>1580</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>PHYSICAL_MEMORY_BYTES</name>
-      <totalCounterValue>462643200</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>VIRTUAL_MEMORY_BYTES</name>
-      <totalCounterValue>2149728256</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>COMMITTED_HEAP_BYTES</name>
-      <totalCounterValue>357957632</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-  </counterGroup>
-  <counterGroup>
-    <counterGroupName>org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter</counterGroupName>
-    <counter>
-      <name>BYTES_READ</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-  </counterGroup>
-  <counterGroup>
-    <counterGroupName>org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter</counterGroupName>
-    <counter>
-      <name>BYTES_WRITTEN</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-  </counterGroup>
-</jobCounters>
-+---+
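
  The nested counterGroup/counter layout can be awkward to query directly. The Python sketch below, offered only as an illustration, flattens a trimmed copy of the JSON response above into a dictionary keyed by (group name, counter name). The function name flatten_counters is hypothetical, and the sample keeps just two counters from one group to stay short.

```python
import json

# Trimmed copy of the JSON response example above (one group, two counters).
SAMPLE = """
{
   "jobCounters" : {
      "id" : "job_1326232085508_4_4",
      "counterGroup" : [
         {
            "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
            "counter" : [
               {
                  "reduceCounterValue" : 0,
                  "mapCounterValue" : 0,
                  "totalCounterValue" : 2483,
                  "name" : "FILE_BYTES_READ"
               },
               {
                  "reduceCounterValue" : 0,
                  "mapCounterValue" : 0,
                  "totalCounterValue" : 108763,
                  "name" : "FILE_BYTES_WRITTEN"
               }
            ]
         }
      ]
   }
}
"""


def flatten_counters(body):
    """Map (counterGroupName, counter name) to totalCounterValue."""
    groups = json.loads(body)["jobCounters"]["counterGroup"]
    return {
        (group["counterGroupName"], counter["name"]): counter["totalCounterValue"]
        for group in groups
        for counter in group["counter"]
    }


totals = flatten_counters(SAMPLE)
fs_group = "org.apache.hadoop.mapreduce.FileSystemCounter"
print(totals[(fs_group, "FILE_BYTES_READ")])
```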
-
-* Job Conf API
-
-  A job configuration resource contains information about the job configuration for this job.
-
-** URI
-
-  Use the following URI to obtain the job configuration information from a job identified by the {jobid} value.
-
-------
-  * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/jobs/{jobid}/conf
-------
-
-** HTTP Operations Supported 
-
-------
-  * GET
-------
-
-** Query Parameters Supported
-
-------
-  None
-------
-
-** Elements of the <conf> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| path | string | The path to the job configuration file|
-*---------------+--------------+-------------------------------+
-| property | array of the configuration properties(JSON)/zero or more property objects(XML) | Collection of property objects |
-*---------------+--------------+-------------------------------+
-
-** Elements of the <property> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| name | string | The name of the configuration property |
-*---------------+--------------+-------------------------------+
-| value | string | The value of the configuration property |
-*---------------+--------------+-------------------------------+
-| source | string | The location this configuration object came from. If there is more than one of these, it shows the history with the latest source at the end of the list. |
-*---------------+--------------+-------------------------------+
-
-** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/conf
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Transfer-Encoding: chunked
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-  This is a small snippet of the output, as the full output is very large. The real output contains every property in your job configuration file.
-
-+---+
-{
-   "conf" : {
-      "path" : "hdfs://host.domain.com:9000/user/user1/.staging/job_1326232085508_0004/job.xml",
-      "property" : [
-         {  
-            "value" : "/home/hadoop/hdfs/data",
-            "name" : "dfs.datanode.data.dir",
-            "source" : ["hdfs-site.xml", "job.xml"]
-         },
-         {
-            "value" : "org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer",
-            "name" : "hadoop.http.filter.initializers",
-            "source" : ["programmatically", "job.xml"]
-         },
-         {
-            "value" : "/home/hadoop/tmp",
-            "name" : "mapreduce.cluster.temp.dir",
-            "source" : ["mapred-site.xml"]
-         },
-         ...
-      ]
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/conf
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 552
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<conf>
-  <path>hdfs://host.domain.com:9000/user/user1/.staging/job_1326232085508_0004/job.xml</path>
-  <property>
-    <name>dfs.datanode.data.dir</name>
-    <value>/home/hadoop/hdfs/data</value>
-    <source>hdfs-site.xml</source>
-    <source>job.xml</source>
-  </property>
-  <property>
-    <name>hadoop.http.filter.initializers</name>
-    <value>org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer</value>
-    <source>programmatically</source>
-    <source>job.xml</source>
-  </property>
-  <property>
-    <name>mapreduce.cluster.temp.dir</name>
-    <value>/home/hadoop/tmp</value>
-    <source>mapred-site.xml</source>
-  </property>
-  ...
-</conf>
-+---+
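
  As a sketch of how a client might use the conf response, the Python below turns the property array into a lookup by property name and, following the description of the source element, treats the last entry of the source list as the latest source. The function name conf_to_dict is hypothetical. The sample reuses the three properties from the JSON example above, written as strict JSON and without the elided remainder.

```python
import json

# The three properties from the JSON example above, as strict JSON.
SAMPLE = """
{
   "conf" : {
      "path" : "hdfs://host.domain.com:9000/user/user1/.staging/job_1326232085508_0004/job.xml",
      "property" : [
         {
            "value" : "/home/hadoop/hdfs/data",
            "name" : "dfs.datanode.data.dir",
            "source" : ["hdfs-site.xml", "job.xml"]
         },
         {
            "value" : "org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer",
            "name" : "hadoop.http.filter.initializers",
            "source" : ["programmatically", "job.xml"]
         },
         {
            "value" : "/home/hadoop/tmp",
            "name" : "mapreduce.cluster.temp.dir",
            "source" : ["mapred-site.xml"]
         }
      ]
   }
}
"""


def conf_to_dict(body):
    """Map each property name to its value and its latest source."""
    properties = json.loads(body)["conf"]["property"]
    result = {}
    for prop in properties:
        sources = prop.get("source") or []
        result[prop["name"]] = {
            "value": prop["value"],
            # The latest source is at the end of the list.
            "latest_source": sources[-1] if sources else None,
        }
    return result


conf = conf_to_dict(SAMPLE)
print(conf["dfs.datanode.data.dir"])
```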
-
-* Tasks API
-
-  With the tasks API, you can obtain a collection of resources that represent all the tasks for a job.  When you run a GET operation on this resource, you obtain a collection of Task Objects. 
-
-** URI
-
-------
-  * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/jobs/{jobid}/tasks
-------
-
-** HTTP Operations Supported 
-
-------
-  * GET
-------
-
-** Query Parameters Supported
-
-------
-  * type - type of task, valid values are m or r.  m for map task or r for reduce task.
-------
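
  To illustrate the type query parameter, the Python sketch below builds a tasks URL and rejects any value other than m or r. The function name tasks_url and the proxy address used in the call are assumptions made for this example only.

```python
from urllib.parse import urlencode


def tasks_url(proxy, app_id, job_id, task_type=None):
    """Build the tasks URI, optionally filtered by type (m = map, r = reduce)."""
    if task_type not in (None, "m", "r"):
        raise ValueError("type must be 'm' or 'r'")
    url = f"http://{proxy}/proxy/{app_id}/ws/v1/mapreduce/jobs/{job_id}/tasks"
    if task_type is not None:
        url += "?" + urlencode({"type": task_type})
    return url


# "proxyhost:8088" is a placeholder for <proxy http address:port>.
print(tasks_url("proxyhost:8088",
                "application_1326232085508_0004",
                "job_1326232085508_4_4",
                task_type="m"))
```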
-
-** Elements of the <tasks> object
-
-  When you make a request for the list of tasks, the information will be returned as an array of task objects.
-  See also {{Task API}} for syntax of the task object.
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                  |
-*---------------+--------------+-------------------------------+
-| task | array of task objects(JSON)/zero or more task objects(XML) | The collection of task objects |
-*---------------+--------------+--------------------------------+
-
-** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Transfer-Encoding: chunked
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-{
-   "tasks" : {
-      "task" : [
-         {
-            "progress" : 100,
-            "elapsedTime" : 2768,
-            "state" : "SUCCEEDED",
-            "startTime" : 1326238773493,
-            "id" : "task_1326232085508_4_4_m_0",
-            "type" : "MAP",
-            "successfulAttempt" : "attempt_1326232085508_4_4_m_0_0",
-            "finishTime" : 1326238776261
-         },
-         {
-            "progress" : 100,
-            "elapsedTime" : 0,
-            "state" : "RUNNING",
-            "startTime" : 1326238777460,
-            "id" : "task_1326232085508_4_4_r_0",
-            "type" : "REDUCE",
-            "successfulAttempt" : "",
-            "finishTime" : 0
-         }
-      ]
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 603
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<tasks>
-  <task>
-    <startTime>1326238773493</startTime>
-    <finishTime>1326238776261</finishTime>
-    <elapsedTime>2768</elapsedTime>
-    <progress>100.0</progress>
-    <id>task_1326232085508_4_4_m_0</id>
-    <state>SUCCEEDED</state>
-    <type>MAP</type>
-    <successfulAttempt>attempt_1326232085508_4_4_m_0_0</successfulAttempt>
-  </task>
-  <task>
-    <startTime>1326238777460</startTime>
-    <finishTime>0</finishTime>
-    <elapsedTime>0</elapsedTime>
-    <progress>100.0</progress>
-    <id>task_1326232085508_4_4_r_0</id>
-    <state>RUNNING</state>
-    <type>REDUCE</type>
-    <successfulAttempt/>
-  </task>
-</tasks>
-+---+
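
  For clients that request XML, the Python sketch below parses the tasks response with the standard library and reads each task into a dictionary. It is only an illustration: the function name parse_tasks is hypothetical, and the XML declaration is omitted from the sample string. In this sample, finishTime minus startTime equals elapsedTime for the finished map task, and the empty successfulAttempt element of the running reduce task is mapped to None.

```python
import xml.etree.ElementTree as ET

# Body copied from the XML response example above (declaration omitted).
SAMPLE = """<tasks>
  <task>
    <startTime>1326238773493</startTime>
    <finishTime>1326238776261</finishTime>
    <elapsedTime>2768</elapsedTime>
    <progress>100.0</progress>
    <id>task_1326232085508_4_4_m_0</id>
    <state>SUCCEEDED</state>
    <type>MAP</type>
    <successfulAttempt>attempt_1326232085508_4_4_m_0_0</successfulAttempt>
  </task>
  <task>
    <startTime>1326238777460</startTime>
    <finishTime>0</finishTime>
    <elapsedTime>0</elapsedTime>
    <progress>100.0</progress>
    <id>task_1326232085508_4_4_r_0</id>
    <state>RUNNING</state>
    <type>REDUCE</type>
    <successfulAttempt/>
  </task>
</tasks>"""


def parse_tasks(xml_text):
    """Return one dict per <task> element, with numeric fields converted."""
    root = ET.fromstring(xml_text)
    tasks = []
    for task in root.findall("task"):
        tasks.append({
            "id": task.findtext("id"),
            "state": task.findtext("state"),
            "type": task.findtext("type"),
            "progress": float(task.findtext("progress")),
            "startTime": int(task.findtext("startTime")),
            "finishTime": int(task.findtext("finishTime")),
            "elapsedTime": int(task.findtext("elapsedTime")),
            # An empty element yields "", which is mapped to None here.
            "successfulAttempt": task.findtext("successfulAttempt") or None,
        })
    return tasks


for task in parse_tasks(SAMPLE):
    print(task["id"], task["state"], task["successfulAttempt"])
```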
-
-* {Task API}
-
-  A Task resource contains information about a particular task within a job. 
-
-** URI
-
-  Use the following URI to obtain a Task Object from a task identified by the {taskid} value.
-
-------
-  * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/jobs/{jobid}/tasks/{taskid}
-------
-
-** HTTP Operations Supported 
-
-------
-  * GET
-------
-
-** Query Parameters Supported
-
-------
-  None
-------
-
-** Elements of the <task> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                  |
-*---------------+--------------+-------------------------------+
-| id | string  | The task id | 
-*---------------+--------------+--------------------------------+
-| state | string | The state of the task - valid values are: NEW, SCHEDULED, RUNNING, SUCCEEDED, FAILED, KILL_WAIT, KILLED |
-*---------------+--------------+--------------------------------+
-| type | string | The task type - MAP or REDUCE|
-*---------------+--------------+--------------------------------+
-| successfulAttempt | string | The id of the last successful attempt |
-*---------------+--------------+--------------------------------+
-| progress | float | The progress of the task as a percent|
-*---------------+--------------+--------------------------------+
-| startTime | long | The time at which the task started (in ms since epoch)|
-*---------------+--------------+--------------------------------+
-| finishTime | long | The time at which the task finished (in ms since epoch)|
-*---------------+--------------+--------------------------------+
-| elapsedTime | long | The elapsed time since the task started (in ms)|
-*---------------+--------------+--------------------------------+
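A minimal sketch of how a client might read the fields of a task object, assuming the JSON body has already been fetched. The sample body is the one shown in the JSON response example for this API. The helper name `summarize_task` and the choice to treat a `finishTime` of 0 as "not finished yet" are assumptions made for illustration, not part of the API.

```python
import json
from datetime import datetime, timezone

# Sample body taken from the JSON response example for the Task API.
SAMPLE_TASK = """
{
   "task" : {
      "progress" : 100,
      "elapsedTime" : 0,
      "state" : "RUNNING",
      "startTime" : 1326238777460,
      "id" : "task_1326232085508_4_4_r_0",
      "type" : "REDUCE",
      "successfulAttempt" : "",
      "finishTime" : 0
   }
}
"""


def ms_to_utc(ms):
    """Convert ms since epoch to an aware datetime; 0 is treated as unset (assumption)."""
    if ms == 0:
        return None
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def summarize_task(body):
    """Hypothetical helper: pick out the documented fields of a task object."""
    task = json.loads(body)["task"]
    return {
        "id": task["id"],
        "type": task["type"],
        "state": task["state"],
        "progress": float(task["progress"]),
        "started": ms_to_utc(task["startTime"]),
        "finished": ms_to_utc(task["finishTime"]),
    }


if __name__ == "__main__":
    print(summarize_task(SAMPLE_TASK))
```

The times are reported in milliseconds, so they are divided by 1000 before conversion.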
-
-** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks/task_1326232085508_4_4_r_0
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Transfer-Encoding: chunked
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-{
-   "task" : {
-      "progress" : 100,
-      "elapsedTime" : 0,
-      "state" : "RUNNING",
-      "startTime" : 1326238777460,
-      "id" : "task_1326232085508_4_4_r_0",
-      "type" : "REDUCE",
-      "successfulAttempt" : "",
-      "finishTime" : 0
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks/task_1326232085508_4_4_r_0
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 299
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<task>
-  <startTime>1326238777460</startTime>
-  <finishTime>0</finishTime>
-  <elapsedTime>0</elapsedTime>
-  <progress>100.0</progress>
-  <id>task_1326232085508_4_4_r_0</id>
-  <state>RUNNING</state>
-  <type>REDUCE</type>
-  <successfulAttempt/>
-</task>
-+---+
-
-* Task Counters API
-
-  With the task counters API, you can obtain a collection of resources that represent all the counters for that task.
-
-** URI
-
-------
-  * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/jobs/{jobid}/tasks/{taskid}/counters
-------
-
-** HTTP Operations Supported 
-
-------
-  * GET
-------
-
-** Query Parameters Supported
-
-------
-  None
-------
-
-** Elements of the <jobTaskCounters> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| id | string | The task id |
-*---------------+--------------+-------------------------------+
-| taskCounterGroup | array of counterGroup objects(JSON)/zero or more counterGroup objects(XML) | A collection of counter group objects |
-*---------------+--------------+-------------------------------+
-
-** Elements of the <taskCounterGroup> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| counterGroupName | string | The name of the counter group |
-*---------------+--------------+-------------------------------+
-| counter | array of counter objects(JSON)/zero or more counter objects(XML) | A collection of counter objects |
-*---------------+--------------+-------------------------------+
-
-** Elements of the <counter> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| name | string | The name of the counter |
-*---------------+--------------+-------------------------------+
-| value | long | The value of the counter |
-*---------------+--------------+-------------------------------+
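The nesting described by the three tables above (counters object, counter groups, counters) can be sketched as a small flattening routine. This is only an illustration under stated assumptions: the sample is a shortened copy of the JSON response example for this API, and `flatten_counters` and its parameters are hypothetical names, not something the REST API provides.

```python
import json

# Shortened sample based on the JSON response example for the Task Counters API.
SAMPLE_COUNTERS = """
{
   "jobTaskCounters" : {
      "id" : "task_1326232085508_4_4_r_0",
      "taskCounterGroup" : [
         {
            "counterGroupName" : "org.apache.hadoop.mapreduce.TaskCounter",
            "counter" : [
               { "value" : 460, "name" : "REDUCE_INPUT_GROUPS" },
               { "value" : 2235, "name" : "REDUCE_SHUFFLE_BYTES" }
            ]
         },
         {
            "counterGroupName" : "Shuffle Errors",
            "counter" : [
               { "value" : 0, "name" : "BAD_ID" }
            ]
         }
      ]
   }
}
"""


def flatten_counters(body, root="jobTaskCounters", group_key="taskCounterGroup"):
    """Hypothetical helper: return (id, {(group name, counter name): value})."""
    doc = json.loads(body)[root]
    flat = {}
    # Each group carries a name and a list of name/value counter objects.
    for group in doc.get(group_key, []):
        for counter in group.get("counter", []):
            flat[(group["counterGroupName"], counter["name"])] = counter["value"]
    return doc["id"], flat


if __name__ == "__main__":
    task_id, counters = flatten_counters(SAMPLE_COUNTERS)
    print(task_id)
    for (group, name), value in sorted(counters.items()):
        print(group, name, value)
```

Passing `root="jobTaskAttemptCounters"` and `group_key="taskAttemptCounterGroup"` would apply the same routine to a task attempt counters response, since the two layouts differ only in those names.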
-
-** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks/task_1326232085508_4_4_r_0/counters
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Transfer-Encoding: chunked
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-{
-   "jobTaskCounters" : {
-      "id" : "task_1326232085508_4_4_r_0",
-      "taskCounterGroup" : [
-         {
-            "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
-            "counter" : [
-               {
-                  "value" : 2363,
-                  "name" : "FILE_BYTES_READ"
-               },
-               {
-                  "value" : 54372,
-                  "name" : "FILE_BYTES_WRITTEN"
-               },
-               {
-                  "value" : 0,
-                  "name" : "FILE_READ_OPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "FILE_LARGE_READ_OPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "FILE_WRITE_OPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "HDFS_BYTES_READ"
-               },
-               {
-                  "value" : 0,
-                  "name" : "HDFS_BYTES_WRITTEN"
-               },
-               {
-                  "value" : 0,
-                  "name" : "HDFS_READ_OPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "HDFS_LARGE_READ_OPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "HDFS_WRITE_OPS"
-               }
-            ]
-         },
-         {
-            "counterGroupName" : "org.apache.hadoop.mapreduce.TaskCounter",
-            "counter" : [
-               {
-                  "value" : 0,
-                  "name" : "COMBINE_INPUT_RECORDS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "COMBINE_OUTPUT_RECORDS"
-               },
-               {
-                  "value" : 460,
-                  "name" : "REDUCE_INPUT_GROUPS"
-               },
-               {
-                  "value" : 2235,
-                  "name" : "REDUCE_SHUFFLE_BYTES"
-               },
-               {
-                  "value" : 460,
-                  "name" : "REDUCE_INPUT_RECORDS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "REDUCE_OUTPUT_RECORDS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "SPILLED_RECORDS"
-               },
-               {
-                  "value" : 1,
-                  "name" : "SHUFFLED_MAPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "FAILED_SHUFFLE"
-               },
-               {
-                  "value" : 1,
-                  "name" : "MERGED_MAP_OUTPUTS"
-               },
-               {
-                  "value" : 26,
-                  "name" : "GC_TIME_MILLIS"
-               },
-               {
-                  "value" : 860,
-                  "name" : "CPU_MILLISECONDS"
-               },
-               {
-                  "value" : 107839488,
-                  "name" : "PHYSICAL_MEMORY_BYTES"
-               },
-               {
-                  "value" : 1123147776,
-                  "name" : "VIRTUAL_MEMORY_BYTES"
-               },
-               {
-                  "value" : 57475072,
-                  "name" : "COMMITTED_HEAP_BYTES"
-               }
-            ]
-         },
-         {
-            "counterGroupName" : "Shuffle Errors",
-            "counter" : [
-               {
-                  "value" : 0,
-                  "name" : "BAD_ID"
-               },
-               {
-                  "value" : 0,
-                  "name" : "CONNECTION"
-               },
-               {
-                  "value" : 0,
-                  "name" : "IO_ERROR"
-               },
-               {
-                  "value" : 0,
-                  "name" : "WRONG_LENGTH"
-               },
-               {
-                  "value" : 0,
-                  "name" : "WRONG_MAP"
-               },
-               {
-                  "value" : 0,
-                  "name" : "WRONG_REDUCE"
-               }
-            ]
-         },
-         {
-            "counterGroupName" : "org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter",
-            "counter" : [
-               {
-                  "value" : 0,
-                  "name" : "BYTES_WRITTEN"
-               }
-            ]
-         }
-      ]
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks/task_1326232085508_4_4_r_0/counters
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 2660
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<jobTaskCounters>
-  <id>task_1326232085508_4_4_r_0</id>
-  <taskCounterGroup>
-    <counterGroupName>org.apache.hadoop.mapreduce.FileSystemCounter</counterGroupName>
-    <counter>
-      <name>FILE_BYTES_READ</name>
-      <value>2363</value>
-    </counter>
-    <counter>
-      <name>FILE_BYTES_WRITTEN</name>
-      <value>54372</value>
-    </counter>
-    <counter>
-      <name>FILE_READ_OPS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>FILE_LARGE_READ_OPS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>FILE_WRITE_OPS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>HDFS_BYTES_READ</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>HDFS_BYTES_WRITTEN</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>HDFS_READ_OPS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>HDFS_LARGE_READ_OPS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>HDFS_WRITE_OPS</name>
-      <value>0</value>
-    </counter>
-  </taskCounterGroup>
-  <taskCounterGroup>
-    <counterGroupName>org.apache.hadoop.mapreduce.TaskCounter</counterGroupName>
-    <counter>
-      <name>COMBINE_INPUT_RECORDS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>COMBINE_OUTPUT_RECORDS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>REDUCE_INPUT_GROUPS</name>
-      <value>460</value>
-    </counter>
-    <counter>
-      <name>REDUCE_SHUFFLE_BYTES</name>
-      <value>2235</value>
-    </counter>
-    <counter>
-      <name>REDUCE_INPUT_RECORDS</name>
-      <value>460</value>
-    </counter>
-    <counter>
-      <name>REDUCE_OUTPUT_RECORDS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>SPILLED_RECORDS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>SHUFFLED_MAPS</name>
-      <value>1</value>
-    </counter>
-    <counter>
-      <name>FAILED_SHUFFLE</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>MERGED_MAP_OUTPUTS</name>
-      <value>1</value>
-    </counter>
-    <counter>
-      <name>GC_TIME_MILLIS</name>
-      <value>26</value>
-    </counter>
-    <counter>
-      <name>CPU_MILLISECONDS</name>
-      <value>860</value>
-    </counter>
-    <counter>
-      <name>PHYSICAL_MEMORY_BYTES</name>
-      <value>107839488</value>
-    </counter>
-    <counter>
-      <name>VIRTUAL_MEMORY_BYTES</name>
-      <value>1123147776</value>
-    </counter>
-    <counter>
-      <name>COMMITTED_HEAP_BYTES</name>
-      <value>57475072</value>
-    </counter>
-  </taskCounterGroup>
-  <taskCounterGroup>
-    <counterGroupName>Shuffle Errors</counterGroupName>
-    <counter>
-      <name>BAD_ID</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>CONNECTION</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>IO_ERROR</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>WRONG_LENGTH</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>WRONG_MAP</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>WRONG_REDUCE</name>
-      <value>0</value>
-    </counter>
-  </taskCounterGroup>
-  <taskCounterGroup>
-    <counterGroupName>org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter</counterGroupName>
-    <counter>
-      <name>BYTES_WRITTEN</name>
-      <value>0</value>
-    </counter>
-  </taskCounterGroup>
-</jobTaskCounters>
-+---+
-
-* Task Attempts API
-
-  With the task attempts API, you can obtain a collection of resources that represent the task attempts of a task within a job. When you run a GET operation on this resource, you obtain a collection of Task Attempt Objects.
-
-** URI
-
-------
-  * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/jobs/{jobid}/tasks/{taskid}/attempts
-------
-
-** HTTP Operations Supported 
-
-------
-  * GET
-------
-
-** Query Parameters Supported
-
-------
-  None
-------
-
-** Elements of the <taskAttempts> object
-
-  When you make a request for the list of task attempts, the information will be returned as an array of task attempt objects. 
-  See also {{Task Attempt API}} for syntax of the task attempt object.
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                  |
-*---------------+--------------+-------------------------------+
-| taskAttempt | array of task attempt objects(JSON)/zero or more task attempt objects(XML) | The collection of task attempt objects |
-*---------------+--------------+--------------------------------+
-
-** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks/task_1326232085508_4_4_r_0/attempts
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Transfer-Encoding: chunked
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-{
-   "taskAttempts" : {
-      "taskAttempt" : [
-         {
-            "elapsedMergeTime" : 47,
-            "shuffleFinishTime" : 1326238780052,
-            "assignedContainerId" : "container_1326232085508_0004_01_000003",
-            "progress" : 100,
-            "elapsedTime" : 0,
-            "state" : "RUNNING",
-            "elapsedShuffleTime" : 2592,
-            "mergeFinishTime" : 1326238780099,
-            "rack" : "/98.139.92.0",
-            "elapsedReduceTime" : 0,
-            "nodeHttpAddress" : "host.domain.com:8042",
-            "type" : "REDUCE",
-            "startTime" : 1326238777460,
-            "id" : "attempt_1326232085508_4_4_r_0_0",
-            "finishTime" : 0
-         }
-      ]
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks/task_1326232085508_4_4_r_0/attempts
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 807
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<taskAttempts>
-  <taskAttempt xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="reduceTaskAttemptInfo">
-    <startTime>1326238777460</startTime>
-    <finishTime>0</finishTime>
-    <elapsedTime>0</elapsedTime>
-    <progress>100.0</progress>
-    <id>attempt_1326232085508_4_4_r_0_0</id>
-    <rack>/98.139.92.0</rack>
-    <state>RUNNING</state>
-    <nodeHttpAddress>host.domain.com:8042</nodeHttpAddress>
-    <type>REDUCE</type>
-    <assignedContainerId>container_1326232085508_0004_01_000003</assignedContainerId>
-    <shuffleFinishTime>1326238780052</shuffleFinishTime>
-    <mergeFinishTime>1326238780099</mergeFinishTime>
-    <elapsedShuffleTime>2592</elapsedShuffleTime>
-    <elapsedMergeTime>47</elapsedMergeTime>
-    <elapsedReduceTime>0</elapsedReduceTime>
-  </taskAttempt>
-</taskAttempts>
-+---+
-
-* {Task Attempt API}
-
-  A Task Attempt resource contains information about a particular task attempt within a job. 
-
-** URI
-
-  Use the following URI to obtain a Task Attempt Object for the task attempt identified by the {attemptid} value.
-
-------
-  * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/jobs/{jobid}/tasks/{taskid}/attempts/{attemptid}
-------
-
-** HTTP Operations Supported 
-
-------
-  * GET
-------
-
-** Query Parameters Supported
-
-------
-  None
-------
-
-** Elements of the <taskAttempt> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                  |
-*---------------+--------------+-------------------------------+
-| id | string  | The task attempt id | 
-*---------------+--------------+--------------------------------+
-| rack | string  | The rack | 
-*---------------+--------------+--------------------------------+
-| state | string | The state of the task attempt - valid values are: NEW, UNASSIGNED, ASSIGNED, RUNNING, COMMIT_PENDING, SUCCESS_CONTAINER_CLEANUP, SUCCEEDED, FAIL_CONTAINER_CLEANUP, FAIL_TASK_CLEANUP, FAILED, KILL_CONTAINER_CLEANUP, KILL_TASK_CLEANUP, KILLED|
-*---------------+--------------+--------------------------------+
-| type | string | The type of task |
-*---------------+--------------+--------------------------------+
-| assignedContainerId | string | The container id this attempt is assigned to|
-*---------------+--------------+--------------------------------+
-| nodeHttpAddress | string | The http address of the node this task attempt ran on |
-*---------------+--------------+--------------------------------+
-| diagnostics| string | The diagnostics message |
-*---------------+--------------+--------------------------------+
-| progress | float | The progress of the task attempt as a percent|
-*---------------+--------------+--------------------------------+
-| startTime | long | The time at which the task attempt started (in ms since epoch)|
-*---------------+--------------+--------------------------------+
-| finishTime | long | The time at which the task attempt finished (in ms since epoch)|
-*---------------+--------------+--------------------------------+
-| elapsedTime | long | The elapsed time since the task attempt started (in ms)|
-*---------------+--------------+--------------------------------+
-
-  For reduce task attempts you also have the following fields:
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                  |
-*---------------+--------------+-------------------------------+
-| shuffleFinishTime | long | The time at which shuffle finished (in ms since epoch)|
-*---------------+--------------+--------------------------------+
-| mergeFinishTime | long | The time at which merge finished (in ms since epoch)|
-*---------------+--------------+--------------------------------+
-| elapsedShuffleTime | long | The time it took for the shuffle phase to complete (time in ms between reduce task start and shuffle finish)|
-*---------------+--------------+--------------------------------+
-| elapsedMergeTime | long | The time it took for the merge phase to complete (time in ms between the shuffle finish and merge finish)|
-*---------------+--------------+--------------------------------+
-| elapsedReduceTime | long | The time it took for the reduce phase to complete (time in ms between merge finish to end of reduce task)|
-*---------------+--------------+--------------------------------+
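The relationship between the timestamps and the elapsed-time fields in the table above can be checked with a little arithmetic. The sketch below uses the values from the response example for this API. The function name is hypothetical, and reporting a reduce time of 0 while `finishTime` is still 0 is an assumption based on that sample, not a documented rule.

```python
def reduce_phase_times(start_time, shuffle_finish_time, merge_finish_time, finish_time):
    """Hypothetical helper: derive the three elapsed fields (all values in ms)."""
    # Shuffle phase: from reduce task start to shuffle finish.
    elapsed_shuffle = shuffle_finish_time - start_time
    # Merge phase: from shuffle finish to merge finish.
    elapsed_merge = merge_finish_time - shuffle_finish_time
    # Reduce phase: from merge finish to end of the reduce task.
    # Assumption: a finish time of 0 means the attempt is still running.
    elapsed_reduce = finish_time - merge_finish_time if finish_time > 0 else 0
    return elapsed_shuffle, elapsed_merge, elapsed_reduce


if __name__ == "__main__":
    # Values from the task attempt response example.
    print(reduce_phase_times(1326238777460, 1326238780052, 1326238780099, 0))
```

With the sample values this gives 2592 ms for the shuffle phase and 47 ms for the merge phase, matching `elapsedShuffleTime` and `elapsedMergeTime` in the example response.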
-
-** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks/task_1326232085508_4_4_r_0/attempts/attempt_1326232085508_4_4_r_0_0 
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Transfer-Encoding: chunked
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-{
-   "taskAttempt" : {
-      "elapsedMergeTime" : 47,
-      "shuffleFinishTime" : 1326238780052,
-      "assignedContainerId" : "container_1326232085508_0004_01_000003",
-      "progress" : 100,
-      "elapsedTime" : 0,
-      "state" : "RUNNING",
-      "elapsedShuffleTime" : 2592,
-      "mergeFinishTime" : 1326238780099,
-      "rack" : "/98.139.92.0",
-      "elapsedReduceTime" : 0,
-      "nodeHttpAddress" : "host.domain.com:8042",
-      "startTime" : 1326238777460,
-      "id" : "attempt_1326232085508_4_4_r_0_0",
-      "type" : "REDUCE",
-      "finishTime" : 0
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks/task_1326232085508_4_4_r_0/attempts/attempt_1326232085508_4_4_r_0_0 
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 691
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<taskAttempt>
-  <startTime>1326238777460</startTime>
-  <finishTime>0</finishTime>
-  <elapsedTime>0</elapsedTime>
-  <progress>100.0</progress>
-  <id>attempt_1326232085508_4_4_r_0_0</id>
-  <rack>/98.139.92.0</rack>
-  <state>RUNNING</state>
-  <nodeHttpAddress>host.domain.com:8042</nodeHttpAddress>
-  <type>REDUCE</type>
-  <assignedContainerId>container_1326232085508_0004_01_000003</assignedContainerId>
-  <shuffleFinishTime>1326238780052</shuffleFinishTime>
-  <mergeFinishTime>1326238780099</mergeFinishTime>
-  <elapsedShuffleTime>2592</elapsedShuffleTime>
-  <elapsedMergeTime>47</elapsedMergeTime>
-  <elapsedReduceTime>0</elapsedReduceTime>
-</taskAttempt>
-+---+
-
-* Task Attempt Counters API
-
-  With the task attempt counters API, you can obtain a collection of resources that represent all the counters for that task attempt.
-
-** URI
-
-------
-  * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/jobs/{jobid}/tasks/{taskid}/attempts/{attemptid}/counters
-------
-
-** HTTP Operations Supported 
-
-------
-  * GET
-------
-
-** Query Parameters Supported
-
-------
-  None
-------
-
-** Elements of the <jobTaskAttemptCounters> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| id | string | The task attempt id |
-*---------------+--------------+-------------------------------+
-| taskAttemptCounterGroup | array of task attempt counterGroup objects(JSON)/zero or more task attempt counterGroup objects(XML) | A collection of task attempt counter group objects |
-*---------------+--------------+-------------------------------+
-
-** Elements of the <taskAttemptCounterGroup> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| counterGroupName | string | The name of the counter group |
-*---------------+--------------+-------------------------------+
-| counter | array of counter objects(JSON)/zero or more counter objects(XML) | A collection of counter objects |
-*---------------+--------------+-------------------------------+
-
-** Elements of the <counter> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| name | string | The name of the counter |
-*---------------+--------------+-------------------------------+
-| value | long | The value of the counter |
-*---------------+--------------+-------------------------------+
-
-** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks/task_1326232085508_4_4_r_0/attempts/attempt_1326232085508_4_4_r_0_0/counters
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Transfer-Encoding: chunked
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-{
-   "jobTaskAttemptCounters" : {
-      "taskAttemptCounterGroup" : [
-         {
-            "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
-            "counter" : [
-               {
-                  "value" : 2363,
-                  "name" : "FILE_BYTES_READ"
-               },
-               {
-                  "value" : 54372,
-                  "name" : "FILE_BYTES_WRITTEN"
-               },
-               {
-                  "value" : 0,
-                  "name" : "FILE_READ_OPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "FILE_LARGE_READ_OPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "FILE_WRITE_OPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "HDFS_BYTES_READ"
-               },
-               {
-                  "value" : 0,
-                  "name" : "HDFS_BYTES_WRITTEN"
-               },
-               {
-                  "value" : 0,
-                  "name" : "HDFS_READ_OPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "HDFS_LARGE_READ_OPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "HDFS_WRITE_OPS"
-               }
-            ]
-         },
-         {
-            "counterGroupName" : "org.apache.hadoop.mapreduce.TaskCounter",
-            "counter" : [
-               {
-                  "value" : 0,
-                  "name" : "COMBINE_INPUT_RECORDS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "COMBINE_OUTPUT_RECORDS"
-               },
-               {
-                  "value" : 460,
-                  "name" : "REDUCE_INPUT_GROUPS"
-               },
-               {
-                  "value" : 2235,
-                  "name" : "REDUCE_SHUFFLE_BYTES"
-               },
-               {
-                  "value" : 460,
-                  "name" : "REDUCE_INPUT_RECORDS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "REDUCE_OUTPUT_RECORDS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "SPILLED_RECORDS"
-               },
-               {
-                  "value" : 1,
-                  "name" : "SHUFFLED_MAPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "FAILED_SHUFFLE"
-               },
-               {
-                  "value" : 1,
-                  "name" : "MERGED_MAP_OUTPUTS"
-               },
-               {
-                  "value" : 26,
-                  "name" : "GC_TIME_MILLIS"
-               },
-               {
-                  "value" : 860,
-                  "name" : "CPU_MILLISECONDS"
-               },
-               {
-                  "value" : 107839488,
-                  "name" : "PHYSICAL_MEMORY_BYTES"
-               },
-               {
-                  "value" : 1123147776,
-                  "name" : "VIRTUAL_MEMORY_BYTES"
-               },
-               {
-                  "value" : 57475072,
-                  "name" : "COMMITTED_HEAP_BYTES"
-               }
-            ]
-         },
-         {
-            "counterGroupName" : "Shuffle Errors",
-            "counter" : [
-               {
-                  "value" : 0,
-                  "name" : "BAD_ID"
-               },
-               {
-                  "value" : 0,
-                  "name" : "CONNECTION"
-               },
-               {
-                  "value" : 0,
-                  "name" : "IO_ERROR"
-               },
-               {
-                  "value" : 0,
-                  "name" : "WRONG_LENGTH"
-               },
-               {
-                  "value" : 0,
-                  "name" : "WRONG_MAP"
-               },
-               {
-                  "value" : 0,
-                  "name" : "WRONG_REDUCE"
-               }
-            ]
-         },
-         {
-            "counterGroupName" : "org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter",
-            "counter" : [
-               {
-                  "value" : 0,
-                  "name" : "BYTES_WRITTEN"
-               }
-            ]
-         }
-      ],
-      "id" : "attempt_1326232085508_4_4_r_0_0"
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks/task_1326232085508_4_4_r_0/attempts/attempt_1326232085508_4_4_r_0_0/counters
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 2735
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<jobTaskAttemptCounters>
-  <id>attempt_1326232085508_4_4_r_0_0</id>
-  <taskAttemptCounterGroup>
-    <counterGroupName>org.apache.hadoop.mapreduce.FileSystemCounter</counterGroupName>
-    <counter>
-      <name>FILE_BYTES_READ</name>
-      <value>2363</value>
-    </counter>
-    <counter>
-      <name>FILE_BYTES_WRITTEN</name>
-      <value>54372</value>
-    </counter>
-    <counter>
-      <name>FILE_READ_OPS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>FILE_LARGE_READ_OPS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>FILE_WRITE_OPS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>HDFS_BYTES_READ</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>HDFS_BYTES_WRITTEN</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>HDFS_READ_OPS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>HDFS_LARGE_READ_OPS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>HDFS_WRITE_OPS</name>
-      <value>0</value>
-    </counter>
-  </taskAttemptCounterGroup>
-  <taskAttemptCounterGroup>
-    <counterGroupName>org.apache.hadoop.mapreduce.TaskCounter</counterGroupName>
-    <counter>
-      <name>COMBINE_INPUT_RECORDS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>COMBINE_OUTPUT_RECORDS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>REDUCE_INPUT_GROUPS</name>
-      <value>460</value>
-    </counter>
-    <counter>
-      <name>REDUCE_SHUFFLE_BYTES</name>
-      <value>2235</value>
-    </counter>
-    <counter>
-      <name>REDUCE_INPUT_RECORDS</name>
-      <value>460</value>
-    </counter>
-    <counter>
-      <name>REDUCE_OUTPUT_RECORDS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>SPILLED_RECORDS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>SHUFFLED_MAPS</name>
-      <value>1</value>
-    </counter>
-    <counter>
-      <name>FAILED_SHUFFLE</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>MERGED_MAP_OUTPUTS</name>
-      <value>1</value>
-    </counter>
-    <counter>
-      <name>GC_TIME_MILLIS</name>
-      <value>26</value>
-    </counter>
-    <counter>
-      <name>CPU_MILLISECONDS</name>
-      <value>860</value>
-    </counter>
-    <counter>
-      <name>PHYSICAL_MEMORY_BYTES</name>
-      <value>107839488</value>
-    </counter>
-    <counter>
-      <name>VIRTUAL_MEMORY_BYTES</name>
-      <value>1123147776</value>
-    </counter>
-    <counter>
-      <name>COMMITTED_HEAP_BYTES</name>
-      <value>57475072</value>
-    </counter>
-  </taskAttemptCounterGroup>
-  <taskAttemptCounterGroup>
-    <counterGroupName>Shuffle Errors</counterGroupName>
-    <counter>
-      <name>BAD_ID</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>CONNECTION</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>IO_ERROR</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>WRONG_LENGTH</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>WRONG_MAP</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>WRONG_REDUCE</name>
-      <value>0</value>
-    </counter>
-  </taskAttemptCounterGroup>
-  <taskAttemptCounterGroup>
-    <counterGroupName>org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter</counterGroupName>
-    <counter>
-      <name>BYTES_WRITTEN</name>
-      <value>0</value>
-    </counter>
-  </taskAttemptCounterGroup>
-</jobTaskAttemptCounters>
-+---+
-

+ 0 - 233
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/MapredCommands.apt.vm

@@ -1,233 +0,0 @@
-~~ Licensed to the Apache Software Foundation (ASF) under one or more
-~~ contributor license agreements.  See the NOTICE file distributed with
-~~ this work for additional information regarding copyright ownership.
-~~ The ASF licenses this file to You under the Apache License, Version 2.0
-~~ (the "License"); you may not use this file except in compliance with
-~~ the License.  You may obtain a copy of the License at
-~~
-~~     http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License.
-
-  ---
-  MapReduce Commands Guide
-  ---
-  ---
-  ${maven.build.timestamp}
-
-MapReduce Commands Guide
-
-%{toc|section=1|fromDepth=2|toDepth=4}
-
-* Overview
-
-  MapReduce commands are invoked by the <<<bin/mapred>>> script. Running the
-  script without any arguments prints the description for all commands.
-
-   Usage: <<<mapred [--config confdir] [--loglevel loglevel] COMMAND>>>
-
-   MapReduce has an option parsing framework that employs parsing generic
-   options as well as running classes.
-
-*-------------------------+---------------------------------------------------+
-|| COMMAND_OPTIONS        || Description                                      |
-*-------------------------+---------------------------------------------------+
-| --config confdir | Overwrites the default Configuration directory. Default
-|                  | is $\{HADOOP_PREFIX\}/conf.
-*-------------------------+---------------------------------------------------+
-| --loglevel loglevel | Overwrites the log level. Valid log levels are FATAL,
-|                     | ERROR, WARN, INFO, DEBUG, and TRACE. Default is INFO.
-*-------------------------+---------------------------------------------------+
-| COMMAND COMMAND_OPTIONS | Various commands with their options are described
-|                         | in the following sections. The commands have been
-|                         | grouped into {{User Commands}} and
-|                         | {{Administration Commands}}.
-*-------------------------+---------------------------------------------------+
-
-* User Commands
-
-   Commands useful for users of a hadoop cluster.
-
-** <<<pipes>>>
-
-   Runs a pipes job.
-
-   Usage: <<<mapred pipes [-conf <path>] [-jobconf <key=value>, <key=value>,
-   ...] [-input <path>] [-output <path>] [-jar <jar file>] [-inputformat
-   <class>] [-map <class>] [-partitioner <class>] [-reduce <class>] [-writer
-   <class>] [-program <executable>] [-reduces <num>]>>>
-
-*----------------------------------------+------------------------------------+
-|| COMMAND_OPTION                        || Description
-*----------------------------------------+------------------------------------+
-| -conf <path>                           | Configuration for job
-*----------------------------------------+------------------------------------+
-| -jobconf <key=value>, <key=value>, ... | Add/override configuration for job
-*----------------------------------------+------------------------------------+
-| -input <path>                          | Input directory
-*----------------------------------------+------------------------------------+
-| -output <path>                         | Output directory
-*----------------------------------------+------------------------------------+
-| -jar <jar file>                        | Jar filename
-*----------------------------------------+------------------------------------+
-| -inputformat <class>                   | InputFormat class
-*----------------------------------------+------------------------------------+
-| -map <class>                           | Java Map class
-*----------------------------------------+------------------------------------+
-| -partitioner <class>                   | Java Partitioner
-*----------------------------------------+------------------------------------+
-| -reduce <class>                        | Java Reduce class
-*----------------------------------------+------------------------------------+
-| -writer <class>                        | Java RecordWriter
-*----------------------------------------+------------------------------------+
-| -program <executable>                  | Executable URI
-*----------------------------------------+------------------------------------+
-| -reduces <num>                         | Number of reduces
-*----------------------------------------+------------------------------------+
-
-** <<<job>>>
-
-   Command to interact with Map Reduce Jobs.
-
-   Usage: <<<mapred job
-          | [{{{../../hadoop-project-dist/hadoop-common/CommandsManual.html#Generic_Options}GENERIC_OPTIONS}}]
-          | [-submit <job-file>]
-          | [-status <job-id>]
-          | [-counter <job-id> <group-name> <counter-name>]
-          | [-kill <job-id>]
-          | [-events <job-id> <from-event-#> <#-of-events>]
-          | [-history [all] <jobOutputDir>] | [-list [all]]
-          | [-kill-task <task-id>] | [-fail-task <task-id>]
-          | [-set-priority <job-id> <priority>]>>>
-
-*------------------------------+---------------------------------------------+
-|| COMMAND_OPTION              || Description
-*------------------------------+---------------------------------------------+
-| -submit <job-file>           | Submits the job.
-*------------------------------+---------------------------------------------+
-| -status <job-id>             | Prints the map and reduce completion
-                               | percentage and all job counters.
-*------------------------------+---------------------------------------------+
-| -counter <job-id> <group-name> <counter-name> | Prints the counter value.
-*------------------------------+---------------------------------------------+
-| -kill <job-id>               | Kills the job.
-*------------------------------+---------------------------------------------+
-| -events <job-id> <from-event-#> <#-of-events> | Prints the events' details
-                               | received by jobtracker for the given range.
-*------------------------------+---------------------------------------------+
-| -history [all]<jobOutputDir> | Prints job details, failed and killed tip
-                               | details.  More details about the job such as
-                               | successful tasks and task attempts made for
-                               | each task can be viewed by specifying the
-                               | [all] option.
-*------------------------------+---------------------------------------------+
-| -list [all]                  | Displays jobs which are yet to complete.
-                               | <<<-list all>>> displays all jobs.
-*------------------------------+---------------------------------------------+
-| -kill-task <task-id>         | Kills the task. Killed tasks are NOT counted
-                               | against failed attempts.
-*------------------------------+---------------------------------------------+
-| -fail-task <task-id>         | Fails the task. Failed tasks are counted
-                               | against failed attempts.
-*------------------------------+---------------------------------------------+
-| -set-priority <job-id> <priority> | Changes the priority of the job. Allowed
-                               | priority values are VERY_HIGH, HIGH, NORMAL,
-                               | LOW, VERY_LOW
-*------------------------------+---------------------------------------------+
-
-** <<<queue>>>
-
-   command to interact and view Job Queue information
-
-   Usage: <<<mapred queue [-list] | [-info <job-queue-name> [-showJobs]]
-          | [-showacls]>>>
-
-*-----------------+-----------------------------------------------------------+
-|| COMMAND_OPTION || Description
-*-----------------+-----------------------------------------------------------+
-| -list           | Gets list of Job Queues configured in the system.
-                  | Along with scheduling information associated with the job
-                  | queues.
-*-----------------+-----------------------------------------------------------+
-| -info <job-queue-name> [-showJobs] | Displays the job queue information and
-                  | associated scheduling information of particular job queue.
-                  | If <<<-showJobs>>> options is present a list of jobs
-                  | submitted to the particular job queue is displayed.
-*-----------------+-----------------------------------------------------------+
-| -showacls       | Displays the queue name and associated queue operations
-                  | allowed for the current user. The list consists of only
-                  | those queues to which the user has access.
-*-----------------+-----------------------------------------------------------+
-
-** <<<classpath>>>
-
-   Prints the class path needed to get the Hadoop jar and the required
-   libraries.
-
-   Usage: <<<mapred classpath>>>
-
-** <<<distcp>>>
-
-   Copy file or directories recursively. More information can be found at
-   {{{./DistCp.html}Hadoop DistCp Guide}}.
-
-** <<<archive>>>
-
-   Creates a hadoop archive. More information can be found at
-   {{{./HadoopArchives.html}Hadoop Archives Guide}}.
-
-** <<<version>>>
-
-   Prints the version.
-
-   Usage: <<<mapred version>>>
-
-* Administration Commands
-
-   Commands useful for administrators of a hadoop cluster.
-
-** <<<historyserver>>>
-
-   Start JobHistoryServer.
-
-   Usage: <<<mapred historyserver>>>
-
-** <<<hsadmin>>>
-
-   Runs a MapReduce hsadmin client to execute JobHistoryServer administrative
-   commands.
-
-   Usage: <<<mapred hsadmin
-          [-refreshUserToGroupsMappings] |
-          [-refreshSuperUserGroupsConfiguration] |
-          [-refreshAdminAcls] |
-          [-refreshLoadedJobCache] |
-          [-refreshLogRetentionSettings] |
-          [-refreshJobRetentionSettings] |
-          [-getGroups [username]] | [-help [cmd]]>>>
-
-*-----------------+-----------------------------------------------------------+
-|| COMMAND_OPTION || Description
-*-----------------+-----------------------------------------------------------+
-| -refreshUserToGroupsMappings | Refresh user-to-groups mappings
-*-----------------+-----------------------------------------------------------+
-| -refreshSuperUserGroupsConfiguration| Refresh superuser proxy groups mappings
-*-----------------+-----------------------------------------------------------+
-| -refreshAdminAcls | Refresh acls for administration of Job history server
-*-----------------+-----------------------------------------------------------+
-| -refreshLoadedJobCache | Refresh loaded job cache of Job history server
-*-----------------+-----------------------------------------------------------+
-| -refreshJobRetentionSettings|Refresh job history period, job cleaner settings
-*-----------------+-----------------------------------------------------------+
-| -refreshLogRetentionSettings | Refresh log retention period and log retention
-|                              | check interval
-*-----------------+-----------------------------------------------------------+
-| -getGroups [username] | Get the groups which given user belongs to
-*-----------------+-----------------------------------------------------------+
-| -help [cmd] | Displays help for the given command or all commands if none is
-|             | specified.
-*-----------------+-----------------------------------------------------------+

+ 0 - 98
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/PluggableShuffleAndPluggableSort.apt.vm

@@ -1,98 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~   http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License. See accompanying LICENSE file.
-
-  ---
-  Hadoop Map Reduce Next Generation-${project.version} - Pluggable Shuffle and Pluggable Sort
-  ---
-  ---
-  ${maven.build.timestamp}
-
-Hadoop MapReduce Next Generation - Pluggable Shuffle and Pluggable Sort
-
-* Introduction
-
-  The pluggable shuffle and pluggable sort capabilities allow replacing the 
-  built in shuffle and sort logic with alternate implementations. Example use 
-  cases for this are: using a different application protocol other than HTTP 
-  such as RDMA for shuffling data from the Map nodes to the Reducer nodes; or
-  replacing the sort logic with custom algorithms that enable Hash aggregation 
-  and Limit-N query.
-
-  <<IMPORTANT:>> The pluggable shuffle and pluggable sort capabilities are 
-  experimental and unstable. This means the provided APIs may change and break 
-  compatibility in future versions of Hadoop.
-
-* Implementing a Custom Shuffle and a Custom Sort 
-
-  A custom shuffle implementation requires a
-  <<<org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.AuxiliaryService>>> 
-  implementation class running in the NodeManagers and a 
-  <<<org.apache.hadoop.mapred.ShuffleConsumerPlugin>>> implementation class
-  running in the Reducer tasks.
-
-  The default implementations provided by Hadoop can be used as references:
-
-    * <<<org.apache.hadoop.mapred.ShuffleHandler>>>
-    
-    * <<<org.apache.hadoop.mapreduce.task.reduce.Shuffle>>>
-
-  A custom sort implementation requires a <<<org.apache.hadoop.mapred.MapOutputCollector>>>
-  implementation class running in the Mapper tasks and (optionally, depending
-  on the sort implementation) a <<<org.apache.hadoop.mapred.ShuffleConsumerPlugin>>> 
-  implementation class running in the Reducer tasks.
-
-  The default implementations provided by Hadoop can be used as references:
-
-  * <<<org.apache.hadoop.mapred.MapTask$MapOutputBuffer>>>
-  
-  * <<<org.apache.hadoop.mapreduce.task.reduce.Shuffle>>>
-
-* Configuration
-
-  Except for the auxiliary service running in the NodeManagers serving the 
-  shuffle (by default the <<<ShuffleHandler>>>), all the pluggable components 
-  run in the job tasks. This means, they can be configured on per job basis. 
-  The auxiliary service servicing the Shuffle must be configured in the 
-  NodeManagers configuration.
-
-** Job Configuration Properties (on per job basis):
-
-*--------------------------------------+---------------------+-----------------+
-| <<Property>>                         | <<Default Value>>   | <<Explanation>> |
-*--------------------------------------+---------------------+-----------------+
-| <<<mapreduce.job.reduce.shuffle.consumer.plugin.class>>> | <<<org.apache.hadoop.mapreduce.task.reduce.Shuffle>>>         | The <<<ShuffleConsumerPlugin>>> implementation to use |
-*--------------------------------------+---------------------+-----------------+
-| <<<mapreduce.job.map.output.collector.class>>>   | <<<org.apache.hadoop.mapred.MapTask$MapOutputBuffer>>> | The <<<MapOutputCollector>>> implementation(s) to use |
-*--------------------------------------+---------------------+-----------------+
-
-  These properties can also be set in the <<<mapred-site.xml>>> to change the default values for all jobs.
-
-  The collector class configuration may specify a comma-separated list of collector implementations.
-  In this case, the map task will attempt to instantiate each in turn until one of the
-  implementations successfully initializes. This can be useful if a given collector
-  implementation is only compatible with certain types of keys or values, for example.
-
-** NodeManager Configuration properties, <<<yarn-site.xml>>> in all nodes:
-
-*--------------------------------------+---------------------+-----------------+
-| <<Property>>                         | <<Default Value>>   | <<Explanation>> |
-*--------------------------------------+---------------------+-----------------+
-| <<<yarn.nodemanager.aux-services>>> | <<<...,mapreduce_shuffle>>>  | The auxiliary service name |
-*--------------------------------------+---------------------+-----------------+
-| <<<yarn.nodemanager.aux-services.mapreduce_shuffle.class>>>   | <<<org.apache.hadoop.mapred.ShuffleHandler>>> | The auxiliary service class to use |
-*--------------------------------------+---------------------+-----------------+
-
-  <<IMPORTANT:>> If setting an auxiliary service in addition to the default
-  <<<mapreduce_shuffle>>> service, then a new service key should be added to the
-  <<<yarn.nodemanager.aux-services>>> property, for example <<<mapreduce_shufflex>>>.
-  Then the property defining the corresponding class must be
-  <<<yarn.nodemanager.aux-services.mapreduce_shufflex.class>>>.

+ 119 - 0
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/DistributedCacheDeploy.md.vm

@@ -0,0 +1,119 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+#set ( $H3 = '###' )
+#set ( $H4 = '####' )
+#set ( $H5 = '#####' )
+
+Hadoop: Distributed Cache Deploy
+================================
+
+Introduction
+------------
+
+The MapReduce application framework has rudimentary support for deploying a new version of the MapReduce framework via the distributed cache. By setting the appropriate configuration properties, users can run a different version of MapReduce than the one initially deployed to the cluster. For example, cluster administrators can place multiple versions of MapReduce in HDFS and configure `mapred-site.xml` to specify which version jobs will use by default. This allows the administrators to perform a rolling upgrade of the MapReduce framework under certain conditions.
+
+Preconditions and Limitations
+-----------------------------
+
+The support for deploying the MapReduce framework via the distributed cache currently does not address the job client code used to submit and query jobs. It also does not address the `ShuffleHandler` code that runs as an auxiliary service within each NodeManager. As a result, the following limitations apply to MapReduce versions that can be successfully deployed via the distributed cache in a rolling upgrade fashion:
+
+* The MapReduce version must be compatible with the job client code used to
+  submit and query jobs. If it is incompatible then the job client must be
+  upgraded separately on any node from which jobs using the new MapReduce
+  version will be submitted or queried.
+
+* The MapReduce version must be compatible with the configuration files used
+  by the job client submitting the jobs. If it is incompatible with that
+  configuration (e.g., a new property must be set or an existing property
+  value changed) then the configuration must be updated first.
+
+* The MapReduce version must be compatible with the `ShuffleHandler`
+  version running on the nodes in the cluster. If it is incompatible then the
+  new `ShuffleHandler` code must be deployed to all the nodes in the
+  cluster, and the NodeManagers must be restarted to pick up the new
+  `ShuffleHandler` code.
+
+Deploying a New MapReduce Version via the Distributed Cache
+-----------------------------------------------------------
+
+Deploying a new MapReduce version consists of three steps:
+
+1.  Upload the MapReduce archive to a location that can be accessed by the
+    job submission client. Ideally the archive should be on the cluster's default
+    filesystem at a publicly-readable path. See the archive location discussion
+    below for more details.
+
+2.  Configure `mapreduce.application.framework.path` to point to the
+    location where the archive is located. As when specifying distributed cache
+    files for a job, this is a URL that also supports creating an alias for the
+    archive if a URL fragment is specified. For example,
+    `hdfs:/mapred/framework/hadoop-mapreduce-${project.version}.tar.gz#mrframework`
+    will be localized as `mrframework` rather than
+    `hadoop-mapreduce-${project.version}.tar.gz`.
+
+3.  Configure `mapreduce.application.classpath` to set the proper
+    classpath to use with the MapReduce archive configured above. NOTE: An error
+    occurs if `mapreduce.application.framework.path` is configured but
+    `mapreduce.application.classpath` does not reference the base name of the
+    archive path or the alias if an alias was specified.
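+
+As a rough illustration of steps 2 and 3, a `mapred-site.xml` fragment might look like the following. This is a sketch only: it assumes the archive was uploaded to the example path used in step 2, that the `mrframework` alias is specified, and that the archive contains just the MapReduce jars laid out like the standard Hadoop distribution, with the other dependencies taken from the Hadoop installation on each node.
+
+```xml
+  <!-- Assumed location: the example URL from step 2; adjust to your cluster. -->
+  <property>
+    <name>mapreduce.application.framework.path</name>
+    <value>hdfs:/mapred/framework/hadoop-mapreduce-${project.version}.tar.gz#mrframework</value>
+  </property>
+
+  <!-- Because an alias is specified, the classpath references mrframework,
+       not the archive base name. -->
+  <property>
+    <name>mapreduce.application.classpath</name>
+    <value>$HADOOP_CONF_DIR,$PWD/mrframework/hadoop-mapreduce-${project.version}/share/hadoop/mapreduce/*,$PWD/mrframework/hadoop-mapreduce-${project.version}/share/hadoop/mapreduce/lib/*,$HADOOP_COMMON_HOME/share/hadoop/common/*,$HADOOP_COMMON_HOME/share/hadoop/common/lib/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,$HADOOP_YARN_HOME/share/hadoop/yarn/*,$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*</value>
+  </property>
+```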
+
+$H3 Location of the MapReduce Archive and How It Affects Job Performance
+
+Note that the location of the MapReduce archive can be critical to job submission and job startup performance. If the archive is not located on the cluster's default filesystem then it will be copied to the job staging directory for each job and localized to each node where the job's tasks run. This will slow down job submission and task startup performance.
+
+If the archive is located on the default filesystem then the job client will not upload the archive to the job staging directory for each job submission. However, if the archive path is not readable by all cluster users then the archive will be localized separately for each user on each node where tasks execute. This can cause unnecessary duplication in the distributed cache.
+
+When working with a large cluster it can be important to increase the replication factor of the archive to increase its availability. This will spread the load when the nodes in the cluster localize the archive for the first time.
+
+MapReduce Archives and Classpath Configuration
+----------------------------------------------
+
+Setting a proper classpath for the MapReduce archive depends upon the composition of the archive and whether it has any additional dependencies. For example, the archive can contain not only the MapReduce jars but also the necessary YARN, HDFS, and Hadoop Common jars and all other dependencies. In that case, `mapreduce.application.classpath` would be configured to something like the following example, where the archive basename is `hadoop-mapreduce-${project.version}.tar.gz` and the archive is organized internally like the standard Hadoop distribution archive:
+
+`$HADOOP_CONF_DIR,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/mapreduce/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/mapreduce/lib/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/common/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/common/lib/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/yarn/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/yarn/lib/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/hdfs/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/hdfs/lib/*`
+
+Another possible approach is to have the archive consist of just the MapReduce jars and have the remaining dependencies picked up from the Hadoop distribution installed on the nodes. In that case, the above example would change to something like the following:
+
+`$HADOOP_CONF_DIR,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/mapreduce/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/mapreduce/lib/*,$HADOOP_COMMON_HOME/share/hadoop/common/*,$HADOOP_COMMON_HOME/share/hadoop/common/lib/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,$HADOOP_YARN_HOME/share/hadoop/yarn/*,$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*`
+
+$H3 NOTE:
+
+If shuffle encryption is also enabled in the cluster, MapReduce jobs may fail with an exception like the following:
+
+    2014-10-10 02:17:16,600 WARN [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to junpingdu-centos5-3.cs1cloud.internal:13562 with 1 map outputs
+    javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
+        at com.sun.net.ssl.internal.ssl.Alerts.getSSLException(Alerts.java:174)
+        at com.sun.net.ssl.internal.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1731)
+        at com.sun.net.ssl.internal.ssl.Handshaker.fatalSE(Handshaker.java:241)
+        at com.sun.net.ssl.internal.ssl.Handshaker.fatalSE(Handshaker.java:235)
+        at com.sun.net.ssl.internal.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1206)
+        at com.sun.net.ssl.internal.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:136)
+        at com.sun.net.ssl.internal.ssl.Handshaker.processLoop(Handshaker.java:593)
+        at com.sun.net.ssl.internal.ssl.Handshaker.process_record(Handshaker.java:529)
+        at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:925)
+        at com.sun.net.ssl.internal.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1170)
+        at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1197)
+        at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1181)
+        at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:434)
+        at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.setNewClient(AbstractDelegateHttpsURLConnection.java:81)
+        at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.setNewClient(AbstractDelegateHttpsURLConnection.java:61)
+        at sun.net.www.protocol.http.HttpURLConnection.writeRequests(HttpURLConnection.java:584)
+        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1193)
+        at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:379)
+        at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:318)
+        at org.apache.hadoop.mapreduce.task.reduce.Fetcher.verifyConnection(Fetcher.java:427)
+    ....
+
+This happens because the MapReduce client (deployed from HDFS) cannot access `ssl-client.xml` on the local filesystem under `$HADOOP_CONF_DIR`. To fix the problem, add the directory containing `ssl-client.xml` to the MapReduce classpath specified in `mapreduce.application.classpath`, as described above. To keep other local configuration files from affecting the MapReduce application, it is better to place `ssl-client.xml` in a dedicated directory, for example a subdirectory of `$HADOOP_CONF_DIR` such as `$HADOOP_CONF_DIR/security`.
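+
+A minimal sketch of such a classpath follows. It assumes `ssl-client.xml` was copied to the `$HADOOP_CONF_DIR/security` directory suggested above and that the archive contains only the MapReduce jars. Here the dedicated directory replaces the plain `$HADOOP_CONF_DIR` entry, which is one reading of the advice; whether to keep `$HADOOP_CONF_DIR` as well depends on the deployment.
+
+```xml
+  <!-- Assumed layout: ssl-client.xml lives in $HADOOP_CONF_DIR/security. -->
+  <property>
+    <name>mapreduce.application.classpath</name>
+    <value>$HADOOP_CONF_DIR/security,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/mapreduce/*,$PWD/hadoop-mapreduce-${project.version}.tar.gz/hadoop-mapreduce-${project.version}/share/hadoop/mapreduce/lib/*,$HADOOP_COMMON_HOME/share/hadoop/common/*,$HADOOP_COMMON_HOME/share/hadoop/common/lib/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,$HADOOP_YARN_HOME/share/hadoop/yarn/*,$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*</value>
+  </property>
+```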

+ 255 - 0
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/EncryptedShuffle.md

@@ -0,0 +1,255 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+Hadoop: Encrypted Shuffle
+=========================
+
+Introduction
+------------
+
+The Encrypted Shuffle capability allows encryption of the MapReduce shuffle using HTTPS, with optional client authentication (also known as bi-directional HTTPS, or HTTPS with client certificates). It comprises:
+
+*   A Hadoop configuration setting for toggling the shuffle between HTTP and
+    HTTPS.
+
+*   Hadoop configuration settings for specifying the keystore and truststore
+    properties (location, type, passwords) used by the shuffle service and the
+    reducer tasks fetching shuffle data.
+
+*   A way to re-load truststores across the cluster (when a node is added or
+    removed).
+
+Configuration
+-------------
+
+### **core-site.xml** Properties
+
+To enable encrypted shuffle, set the following properties in core-site.xml of all nodes in the cluster:
+
+| **Property** | **Default Value** | **Explanation** |
+|:---- |:---- |:---- |
+| `hadoop.ssl.require.client.cert` | `false` | Whether client certificates are required |
+| `hadoop.ssl.hostname.verifier` | `DEFAULT` | The hostname verifier to provide for HttpsURLConnections. Valid values are: **DEFAULT**, **STRICT**, **STRICT\_I6**, **DEFAULT\_AND\_LOCALHOST** and **ALLOW\_ALL** |
+| `hadoop.ssl.keystores.factory.class` | `org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory` | The KeyStoresFactory implementation to use |
+| `hadoop.ssl.server.conf` | `ssl-server.xml` | Resource file from which ssl server keystore information will be extracted. This file is looked up in the classpath; typically it should be in the Hadoop conf/ directory |
+| `hadoop.ssl.client.conf` | `ssl-client.xml` | Resource file from which ssl client keystore information will be extracted. This file is looked up in the classpath; typically it should be in the Hadoop conf/ directory |
+| `hadoop.ssl.enabled.protocols` | `TLSv1` | The supported SSL protocols (JDK6 can use **TLSv1**, JDK7+ can use **TLSv1,TLSv1.1,TLSv1.2**) |
+
+**IMPORTANT:** Currently, requiring client certificates should be set to false. Refer to the [Client Certificates](#Client_Certificates) section for details.
+
+**IMPORTANT:** All these properties should be marked as final in the cluster configuration files.
+
+#### Example:
+
+```xml
+  <property>
+    <name>hadoop.ssl.require.client.cert</name>
+    <value>false</value>
+    <final>true</final>
+  </property>
+
+  <property>
+    <name>hadoop.ssl.hostname.verifier</name>
+    <value>DEFAULT</value>
+    <final>true</final>
+  </property>
+
+  <property>
+    <name>hadoop.ssl.keystores.factory.class</name>
+    <value>org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory</value>
+    <final>true</final>
+  </property>
+
+  <property>
+    <name>hadoop.ssl.server.conf</name>
+    <value>ssl-server.xml</value>
+    <final>true</final>
+  </property>
+
+  <property>
+    <name>hadoop.ssl.client.conf</name>
+    <value>ssl-client.xml</value>
+    <final>true</final>
+  </property>
+```
+
+### `mapred-site.xml` Properties
+
+To enable encrypted shuffle, set the following property in mapred-site.xml of all nodes in the cluster:
+
+| **Property** | **Default Value** | **Explanation** |
+|:---- |:---- |:---- |
+| `mapreduce.shuffle.ssl.enabled` | `false` | Whether encrypted shuffle is enabled |
+
+**IMPORTANT:** This property should be marked as final in the cluster configuration files.
+
+#### Example:
+
+```xml
+  <property>
+    <name>mapreduce.shuffle.ssl.enabled</name>
+    <value>true</value>
+    <final>true</final>
+  </property>
+```
+
+The Linux container executor should be set to prevent job tasks from reading the server keystore information and gaining access to the shuffle server certificates.
+
+Refer to Hadoop Kerberos configuration for details on how to do this.
+
+Keystore and Truststore Settings
+--------------------------------
+
+Currently `FileBasedKeyStoresFactory` is the only `KeyStoresFactory` implementation. The `FileBasedKeyStoresFactory` implementation uses the following properties, in the **ssl-server.xml** and **ssl-client.xml** files, to configure the keystores and truststores.
+
+### `ssl-server.xml` (Shuffle server) Configuration:
+
+The mapred user should own the **ssl-server.xml** file and have exclusive read access to it.
+
+| **Property** | **Default Value** | **Explanation** |
+|:---- |:---- |:---- |
+| `ssl.server.keystore.type` | `jks` | Keystore file type |
+| `ssl.server.keystore.location` | NONE | Keystore file location. The mapred user should own this file and have exclusive read access to it. |
+| `ssl.server.keystore.password` | NONE | Keystore file password |
+| `ssl.server.truststore.type` | `jks` | Truststore file type |
+| `ssl.server.truststore.location` | NONE | Truststore file location. The mapred user should own this file and have exclusive read access to it. |
+| `ssl.server.truststore.password` | NONE | Truststore file password |
+| `ssl.server.truststore.reload.interval` | 10000 | Truststore reload interval, in milliseconds |
+
+#### Example:
+
+```xml
+<configuration>
+
+  <!-- Server Certificate Store -->
+  <property>
+    <name>ssl.server.keystore.type</name>
+    <value>jks</value>
+  </property>
+  <property>
+    <name>ssl.server.keystore.location</name>
+    <value>${user.home}/keystores/server-keystore.jks</value>
+  </property>
+  <property>
+    <name>ssl.server.keystore.password</name>
+    <value>serverfoo</value>
+  </property>
+
+  <!-- Server Trust Store -->
+  <property>
+    <name>ssl.server.truststore.type</name>
+    <value>jks</value>
+  </property>
+  <property>
+    <name>ssl.server.truststore.location</name>
+    <value>${user.home}/keystores/truststore.jks</value>
+  </property>
+  <property>
+    <name>ssl.server.truststore.password</name>
+    <value>clientserverbar</value>
+  </property>
+  <property>
+    <name>ssl.server.truststore.reload.interval</name>
+    <value>10000</value>
+  </property>
+</configuration>
+```
+
+### `ssl-client.xml` (Reducer/Fetcher) Configuration:
+
+The mapred user should own the **ssl-client.xml** file and it should have default permissions.
+
+| **Property** | **Default Value** | **Explanation** |
+|:---- |:---- |:---- |
+| `ssl.client.keystore.type` | `jks` | Keystore file type |
+| `ssl.client.keystore.location` | NONE | Keystore file location. The mapred user should own this file and it should have default permissions. |
+| `ssl.client.keystore.password` | NONE | Keystore file password |
+| `ssl.client.truststore.type` | `jks` | Truststore file type |
+| `ssl.client.truststore.location` | NONE | Truststore file location. The mapred user should own this file and it should have default permissions. |
+| `ssl.client.truststore.password` | NONE | Truststore file password |
+| `ssl.client.truststore.reload.interval` | 10000 | Truststore reload interval, in milliseconds |
+
+#### Example:
+
+```xml
+<configuration>
+
+  <!-- Client certificate Store -->
+  <property>
+    <name>ssl.client.keystore.type</name>
+    <value>jks</value>
+  </property>
+  <property>
+    <name>ssl.client.keystore.location</name>
+    <value>${user.home}/keystores/client-keystore.jks</value>
+  </property>
+  <property>
+    <name>ssl.client.keystore.password</name>
+    <value>clientfoo</value>
+  </property>
+
+  <!-- Client Trust Store -->
+  <property>
+    <name>ssl.client.truststore.type</name>
+    <value>jks</value>
+  </property>
+  <property>
+    <name>ssl.client.truststore.location</name>
+    <value>${user.home}/keystores/truststore.jks</value>
+  </property>
+  <property>
+    <name>ssl.client.truststore.password</name>
+    <value>clientserverbar</value>
+  </property>
+  <property>
+    <name>ssl.client.truststore.reload.interval</name>
+    <value>10000</value>
+  </property>
+</configuration>
+```
+
+Activating Encrypted Shuffle
+----------------------------
+
+When you have made the above configuration changes, activate Encrypted Shuffle by re-starting all NodeManagers.
+
+**IMPORTANT:** Using encrypted shuffle incurs a significant performance impact. Users should profile this and potentially reserve 1 or more cores for encrypted shuffle.
+
+Client Certificates
+-------------------
+
+Using Client Certificates does not fully ensure that the client is a reducer task for the job. Currently, Client Certificates (their private key) keystore files must be readable by all users submitting jobs to the cluster. This means that a rogue job could read those keystore files and use the client certificates in them to establish a secure connection with a Shuffle server. However, unless the rogue job has a proper JobToken, it won't be able to retrieve shuffle data from the Shuffle server. A job, using its own JobToken, can only retrieve shuffle data that belongs to itself.
+
+Reloading Truststores
+---------------------
+
+By default the truststores will reload their configuration every 10 seconds. If a new truststore file is copied over the old one, it will be re-read, and its certificates will replace the old ones. This mechanism is useful for adding or removing nodes from the cluster, or for adding or removing trusted clients. In these cases, the client or NodeManager certificate is added to (or removed from) all the truststore files in the system, and the new configuration will be picked up without you having to restart the NodeManager daemons.
+
+Debugging
+---------
+
+**NOTE:** Enable debugging only for troubleshooting, and then only for jobs running on small amounts of data. It is very verbose and slows down jobs by several orders of magnitude. (You might need to increase mapred.task.timeout to prevent jobs from failing because tasks run so slowly.)
+
+To enable SSL debugging in the reducers, set `-Djavax.net.debug=all` in the `mapreduce.reduce.java.opts` property; for example:
+
+      <property>
+        <name>mapreduce.reduce.java.opts</name>
+        <value>-Xmx200m -Djavax.net.debug=all</value>
+      </property>
+
+You can do this on a per-job basis, or by means of a cluster-wide setting in the `mapred-site.xml` file.
+
+To set this property in NodeManager, set it in the `yarn-env.sh` file:
+
+      YARN_NODEMANAGER_OPTS="-Djavax.net.debug=all"

+ 1156 - 0
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/MapReduceTutorial.md

@@ -0,0 +1,1156 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+MapReduce Tutorial
+==================
+
+* [MapReduce Tutorial](#MapReduce_Tutorial)
+    * [Purpose](#Purpose)
+    * [Prerequisites](#Prerequisites)
+    * [Overview](#Overview)
+    * [Inputs and Outputs](#Inputs_and_Outputs)
+    * [Example: WordCount v1.0](#Example:_WordCount_v1.0)
+        * [Source Code](#Source_Code)
+        * [Usage](#Usage)
+        * [Walk-through](#Walk-through)
+    * [MapReduce - User Interfaces](#MapReduce_-_User_Interfaces)
+        * [Payload](#Payload)
+            * [Mapper](#Mapper)
+            * [Reducer](#Reducer)
+            * [Partitioner](#Partitioner)
+            * [Counter](#Counter)
+        * [Job Configuration](#Job_Configuration)
+        * [Task Execution & Environment](#Task_Execution__Environment)
+            * [Memory Management](#Memory_Management)
+            * [Map Parameters](#Map_Parameters)
+            * [Shuffle/Reduce Parameters](#ShuffleReduce_Parameters)
+            * [Configured Parameters](#Configured_Parameters)
+            * [Task Logs](#Task_Logs)
+            * [Distributing Libraries](#Distributing_Libraries)
+        * [Job Submission and Monitoring](#Job_Submission_and_Monitoring)
+            * [Job Control](#Job_Control)
+        * [Job Input](#Job_Input)
+            * [InputSplit](#InputSplit)
+            * [RecordReader](#RecordReader)
+        * [Job Output](#Job_Output)
+            * [OutputCommitter](#OutputCommitter)
+            * [Task Side-Effect Files](#Task_Side-Effect_Files)
+            * [RecordWriter](#RecordWriter)
+        * [Other Useful Features](#Other_Useful_Features)
+            * [Submitting Jobs to Queues](#Submitting_Jobs_to_Queues)
+            * [Counters](#Counters)
+            * [DistributedCache](#DistributedCache)
+            * [Profiling](#Profiling)
+            * [Debugging](#Debugging)
+            * [Data Compression](#Data_Compression)
+            * [Skipping Bad Records](#Skipping_Bad_Records)
+        * [Example: WordCount v2.0](#Example:_WordCount_v2.0)
+            * [Source Code](#Source_Code)
+            * [Sample Runs](#Sample_Runs)
+            * [Highlights](#Highlights)
+
+Purpose
+-------
+
+This document comprehensively describes all user-facing facets of the Hadoop MapReduce framework and serves as a tutorial.
+
+Prerequisites
+-------------
+
+Ensure that Hadoop is installed, configured and running. More details:
+
+*   [Single Node Setup](../../hadoop-project-dist/hadoop-common/SingleCluster.html)
+    for first-time users.
+
+*   [Cluster Setup](../../hadoop-project-dist/hadoop-common/ClusterSetup.html)
+    for large, distributed clusters.
+
+Overview
+--------
+
+Hadoop MapReduce is a software framework for easily writing applications which process vast amounts of data (multi-terabyte data-sets) in-parallel on large clusters (thousands of nodes) of commodity hardware in a reliable, fault-tolerant manner.
+
+A MapReduce *job* usually splits the input data-set into independent chunks which are processed by the *map tasks* in a completely parallel manner. The framework sorts the outputs of the maps, which are then input to the *reduce tasks*. Typically both the input and the output of the job are stored in a file-system. The framework takes care of scheduling tasks, monitoring them and re-executing the failed tasks.
+
+Typically the compute nodes and the storage nodes are the same, that is, the MapReduce framework and the Hadoop Distributed File System (see [HDFS Architecture Guide](../../hadoop-project-dist/hadoop-hdfs/HdfsDesign.html)) are running on the same set of nodes. This configuration allows the framework to effectively schedule tasks on the nodes where data is already present, resulting in very high aggregate bandwidth across the cluster.
+
+The MapReduce framework consists of a single master `ResourceManager`, one slave `NodeManager` per cluster-node, and `MRAppMaster` per application (see [YARN Architecture Guide](../../hadoop-yarn/hadoop-yarn-site/YARN.html)).
+
+Minimally, applications specify the input/output locations and supply *map* and *reduce* functions via implementations of appropriate interfaces and/or abstract-classes. These, and other job parameters, comprise the *job configuration*.
+
+The Hadoop *job client* then submits the job (jar/executable etc.) and configuration to the `ResourceManager` which then assumes the responsibility of distributing the software/configuration to the slaves, scheduling tasks and monitoring them, providing status and diagnostic information to the job-client.
+
+Although the Hadoop framework is implemented in Java™, MapReduce applications need not be written in Java.
+
+* [Hadoop Streaming](../../api/org/apache/hadoop/streaming/package-summary.html)
+  is a utility which allows users to create and run jobs
+  with any executables (e.g. shell utilities) as the mapper and/or the
+  reducer.
+
+* [Hadoop Pipes](../../api/org/apache/hadoop/mapred/pipes/package-summary.html)
+  is a [SWIG](http://www.swig.org/)-compatible C++ API to
+  implement MapReduce applications (non JNI™ based).
+
+Inputs and Outputs
+------------------
+
+The MapReduce framework operates exclusively on `<key, value>` pairs, that is, the framework views the input to the job as a set of `<key, value>` pairs and produces a set of `<key, value>` pairs as the output of the job, conceivably of different types.
+
+The `key` and `value` classes have to be serializable by the framework and hence need to implement the [Writable](../../api/org/apache/hadoop/io/Writable.html) interface. Additionally, the key classes have to implement the [WritableComparable](../../api/org/apache/hadoop/io/WritableComparable.html) interface to facilitate sorting by the framework.
+
+Input and Output types of a MapReduce job:
+
+(input) `<k1, v1> ->` **map** `-> <k2, v2> ->` **combine** `-> <k2, v2> ->` **reduce** `-> <k3, v3>` (output)
+
+Example: WordCount v1.0
+-----------------------
+
+Before we jump into the details, let's walk through an example MapReduce application to get a flavour for how they work.
+
+`WordCount` is a simple application that counts the number of occurrences of each word in a given input set.
+
+This works with a local-standalone, pseudo-distributed or fully-distributed Hadoop installation ([Single Node Setup](../../hadoop-project-dist/hadoop-common/SingleCluster.html)).
+
+### Source Code
+
+```java
+import java.io.IOException;
+import java.util.StringTokenizer;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.IntWritable;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.mapreduce.Job;
+import org.apache.hadoop.mapreduce.Mapper;
+import org.apache.hadoop.mapreduce.Reducer;
+import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
+import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
+
+public class WordCount {
+
+  public static class TokenizerMapper
+       extends Mapper<Object, Text, Text, IntWritable>{
+
+    private final static IntWritable one = new IntWritable(1);
+    private Text word = new Text();
+
+    public void map(Object key, Text value, Context context
+                    ) throws IOException, InterruptedException {
+      StringTokenizer itr = new StringTokenizer(value.toString());
+      while (itr.hasMoreTokens()) {
+        word.set(itr.nextToken());
+        context.write(word, one);
+      }
+    }
+  }
+
+  public static class IntSumReducer
+       extends Reducer<Text,IntWritable,Text,IntWritable> {
+    private IntWritable result = new IntWritable();
+
+    public void reduce(Text key, Iterable<IntWritable> values,
+                       Context context
+                       ) throws IOException, InterruptedException {
+      int sum = 0;
+      for (IntWritable val : values) {
+        sum += val.get();
+      }
+      result.set(sum);
+      context.write(key, result);
+    }
+  }
+
+  public static void main(String[] args) throws Exception {
+    Configuration conf = new Configuration();
+    Job job = Job.getInstance(conf, "word count");
+    job.setJarByClass(WordCount.class);
+    job.setMapperClass(TokenizerMapper.class);
+    job.setCombinerClass(IntSumReducer.class);
+    job.setReducerClass(IntSumReducer.class);
+    job.setOutputKeyClass(Text.class);
+    job.setOutputValueClass(IntWritable.class);
+    FileInputFormat.addInputPath(job, new Path(args[0]));
+    FileOutputFormat.setOutputPath(job, new Path(args[1]));
+    System.exit(job.waitForCompletion(true) ? 0 : 1);
+  }
+}
+```
+
+### Usage
+
+Assuming environment variables are set as follows:
+
+```bash
+export JAVA_HOME=/usr/java/default
+export PATH=${JAVA_HOME}/bin:${PATH}
+export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar
+```
+
+Compile `WordCount.java` and create a jar:
+
+    $ bin/hadoop com.sun.tools.javac.Main WordCount.java
+    $ jar cf wc.jar WordCount*.class
+
+Assuming that:
+
+* `/user/joe/wordcount/input` - input directory in HDFS
+* `/user/joe/wordcount/output` - output directory in HDFS
+
+Sample text-files as input:
+
+    $ bin/hadoop fs -ls /user/joe/wordcount/input/
+    /user/joe/wordcount/input/file01
+    /user/joe/wordcount/input/file02
+    
+    $ bin/hadoop fs -cat /user/joe/wordcount/input/file01
+    Hello World Bye World
+    
+    $ bin/hadoop fs -cat /user/joe/wordcount/input/file02
+    Hello Hadoop Goodbye Hadoop
+
+Run the application:
+
+    $ bin/hadoop jar wc.jar WordCount /user/joe/wordcount/input /user/joe/wordcount/output
+
+Output:
+
+    $ bin/hadoop fs -cat /user/joe/wordcount/output/part-r-00000
+    Bye 1
+    Goodbye 1
+    Hadoop 2
+    Hello 2
+    World 2
+
+Applications can specify a comma separated list of paths which would be present in the current working directory of the task using the option `-files`. The `-libjars` option allows applications to add jars to the classpaths of the maps and reduces. The option `-archives` allows them to pass a comma separated list of archives as arguments. These archives are unarchived and a link with the name of the archive is created in the current working directory of tasks. More details about the command line options are available at [Commands Guide](../../hadoop-project-dist/hadoop-common/CommandsManual.html).
+
+Running `wordcount` example with `-libjars`, `-files` and `-archives`:
+
+    bin/hadoop jar hadoop-mapreduce-examples-<ver>.jar wordcount -files cachefile.txt -libjars mylib.jar -archives myarchive.zip input output
+
+Here, myarchive.zip will be placed and unzipped into a directory by the name "myarchive.zip".
+
+Users can specify a different symbolic name for files and archives passed through `-files` and `-archives` option, using \#.
+
+For example,
+
+    bin/hadoop jar hadoop-mapreduce-examples-<ver>.jar wordcount -files dir1/dict.txt#dict1,dir2/dict.txt#dict2 -archives mytar.tgz#tgzdir input output
+
+Here, the files dir1/dict.txt and dir2/dict.txt can be accessed by tasks using the symbolic names dict1 and dict2 respectively. The archive mytar.tgz will be placed and unarchived into a directory by the name "tgzdir".
+
+### Walk-through
+
+The `WordCount` application is quite straight-forward.
+
+```java
+public void map(Object key, Text value, Context context
+                ) throws IOException, InterruptedException {
+  StringTokenizer itr = new StringTokenizer(value.toString());
+  while (itr.hasMoreTokens()) {
+    word.set(itr.nextToken());
+    context.write(word, one);
+  }
+}
+```
+
+The `Mapper` implementation, via the `map` method, processes one line at a time, as provided by the specified `TextInputFormat`. It then splits the line into tokens separated by whitespaces, via the `StringTokenizer`, and emits a key-value pair of `< <word>, 1>`.
+
+For the given sample input the first map emits:
+
+    < Hello, 1>
+    < World, 1>
+    < Bye, 1>
+    < World, 1>
+
+The second map emits:
+
+    < Hello, 1>
+    < Hadoop, 1>
+    < Goodbye, 1>
+    < Hadoop, 1>
+
+We'll learn more about the number of maps spawned for a given job, and how to control them in a fine-grained manner, a bit later in the tutorial.
+
+        job.setCombinerClass(IntSumReducer.class);
+
+`WordCount` also specifies a `combiner`. Hence, the output of each map is passed through the local combiner (which is the same as the `Reducer` as per the job configuration) for local aggregation, after being sorted on the *key*s.
+
+The output of the first map:
+
+    < Bye, 1>
+    < Hello, 1>
+    < World, 2>
+
+The output of the second map:
+
+    < Goodbye, 1>
+    < Hadoop, 2>
+    < Hello, 1>
+
+```java
+public void reduce(Text key, Iterable<IntWritable> values,
+                   Context context
+                   ) throws IOException, InterruptedException {
+  int sum = 0;
+  for (IntWritable val : values) {
+    sum += val.get();
+  }
+  result.set(sum);
+  context.write(key, result);
+}
+```
+
+The `Reducer` implementation, via the `reduce` method, just sums up the values, which are the occurrence counts for each key (i.e. words in this example).
+
+Thus the output of the job is:
+
+    < Bye, 1>
+    < Goodbye, 1>
+    < Hadoop, 2>
+    < Hello, 2>
+    < World, 2>
+
+The `main` method specifies various facets of the job, such as the input/output paths (passed via the command line), key/value types, input/output formats etc., in the `Job`. It then calls the `job.waitForCompletion` to submit the job and monitor its progress.
+
+We'll learn more about `Job`, `InputFormat`, `OutputFormat` and other interfaces and classes a bit later in the tutorial.
+
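The walk-through above can be reproduced without a cluster. The following is only a plain-Java sketch of the map, combine and reduce steps on the two sample inputs; it does not use the Hadoop API, and the class and method names (`WordCountSketch`, `mapAndCombine`, `reduce`) are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class WordCountSketch {

  // "Map + combine": tokenize one input and sum the counts locally.
  // A TreeMap keeps the keys sorted, as the framework does before combining.
  public static Map<String, Integer> mapAndCombine(String input) {
    Map<String, Integer> local = new TreeMap<>();
    StringTokenizer itr = new StringTokenizer(input);
    while (itr.hasMoreTokens()) {
      local.merge(itr.nextToken(), 1, Integer::sum);
    }
    return local;
  }

  // "Reduce": sum the combined counts for each key across all map outputs.
  public static Map<String, Integer> reduce(List<Map<String, Integer>> mapOutputs) {
    Map<String, Integer> result = new TreeMap<>();
    for (Map<String, Integer> output : mapOutputs) {
      output.forEach((word, count) -> result.merge(word, count, Integer::sum));
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Integer> first = mapAndCombine("Hello World Bye World");
    Map<String, Integer> second = mapAndCombine("Hello Hadoop Goodbye Hadoop");
    System.out.println(first);   // {Bye=1, Hello=1, World=2}
    System.out.println(second);  // {Goodbye=1, Hadoop=2, Hello=1}
    System.out.println(reduce(Arrays.asList(first, second)));
    // {Bye=1, Goodbye=1, Hadoop=2, Hello=2, World=2}
  }
}
```

The combiner and the reducer share the same summing logic here, which matches the job configuration in `WordCount` where `IntSumReducer` is used for both.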
+MapReduce - User Interfaces
+---------------------------
+
+This section provides a reasonable amount of detail on every user-facing aspect of the MapReduce framework. This should help users implement, configure and tune their jobs in a fine-grained manner. However, please note that the javadoc for each class/interface remains the most comprehensive documentation available; this is only meant to be a tutorial.
+
+Let us first take the `Mapper` and `Reducer` interfaces. Applications typically implement them to provide the `map` and `reduce` methods.
+
+We will then discuss other core interfaces including `Job`, `Partitioner`, `InputFormat`, `OutputFormat`, and others.
+
+Finally, we will wrap up by discussing some useful features of the framework such as the `DistributedCache`, `IsolationRunner` etc.
+
+### Payload
+
+Applications typically implement the `Mapper` and `Reducer` interfaces to provide the `map` and `reduce` methods. These form the core of the job.
+
+#### Mapper
+
+[Mapper](../../api/org/apache/hadoop/mapreduce/Mapper.html) maps input key/value pairs to a set of intermediate key/value pairs.
+
+Maps are the individual tasks that transform input records into intermediate records. The transformed intermediate records do not need to be of the same type as the input records. A given input pair may map to zero or many output pairs.
+
+The Hadoop MapReduce framework spawns one map task for each `InputSplit` generated by the `InputFormat` for the job.
+
+Overall, `Mapper` implementations are passed to the job via the [Job.setMapperClass(Class)](../../api/org/apache/hadoop/mapreduce/Job.html) method. The framework then calls [map(WritableComparable, Writable, Context)](../../api/org/apache/hadoop/mapreduce/Mapper.html) for each key/value pair in the `InputSplit` for that task. Applications can then override the `cleanup(Context)` method to perform any required cleanup.
+
+Output pairs are collected with calls to context.write(WritableComparable, Writable).
+
+Applications can use the `Counter` to report their statistics.
+
+All intermediate values associated with a given output key are subsequently grouped by the framework, and passed to the `Reducer`(s) to determine the final output. Users can control the grouping by specifying a `Comparator` via [Job.setGroupingComparatorClass(Class)](../../api/org/apache/hadoop/mapreduce/Job.html).
+
+The `Mapper` outputs are sorted and then partitioned per `Reducer`. The total number of partitions is the same as the number of reduce tasks for the job. Users can control which keys (and hence records) go to which `Reducer` by implementing a custom `Partitioner`.
+
+Users can optionally specify a `combiner`, via [Job.setCombinerClass(Class)](../../api/org/apache/hadoop/mapreduce/Job.html), to perform local aggregation of the intermediate outputs, which helps to cut down the amount of data transferred from the `Mapper` to the `Reducer`.
+
+The intermediate, sorted outputs are always stored in a simple (key-len, key, value-len, value) format. Applications can control if, and how, the intermediate outputs are to be compressed and the [CompressionCodec](../../api/org/apache/hadoop/io/compress/CompressionCodec.html) to be used via the `Configuration`.
+
+##### How Many Maps?
+
+The number of maps is usually driven by the total size of the inputs, that is, the total number of blocks of the input files.
+
+The right level of parallelism for maps seems to be around 10-100 maps per-node, although it has been set up to 300 maps for very cpu-light map tasks. Task setup takes a while, so it is best if the maps take at least a minute to execute.
+
+Thus, if you expect 10TB of input data and have a blocksize of `128MB`, you'll end up with 82,000 maps, unless Configuration.set(`MRJobConfig.NUM_MAPS`, int) (which only provides a hint to the framework) is used to set it even higher.
+
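The 82,000 figure above is a rounded block count. A minimal sketch of the arithmetic, assuming one map per block, binary units (10 TB taken as 10 * 2^40 bytes, 128 MB as 2^27 bytes) and a single large input; `MapCountEstimate` and `estimateMaps` are illustrative names, not part of Hadoop:

```java
public class MapCountEstimate {

  // Ceiling division: a trailing partial block still needs its own map.
  public static long estimateMaps(long inputBytes, long blockBytes) {
    return (inputBytes + blockBytes - 1) / blockBytes;
  }

  public static void main(String[] args) {
    long input = 10L * 1024 * 1024 * 1024 * 1024; // 10 TB in bytes
    long block = 128L * 1024 * 1024;              // 128 MB in bytes
    System.out.println(estimateMaps(input, block)); // 81920
  }
}
```

With many input files the real count is somewhat higher, since each file ends in its own partial block.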
+#### Reducer
+
+[Reducer](../../api/org/apache/hadoop/mapreduce/Reducer.html) reduces a set of intermediate values which share a key to a smaller set of values.
+
+The number of reduces for the job is set by the user via [Job.setNumReduceTasks(int)](../../api/org/apache/hadoop/mapreduce/Job.html).
+
+Overall, `Reducer` implementations are passed to the job via the [Job.setReducerClass(Class)](../../api/org/apache/hadoop/mapreduce/Job.html) method and can override the `setup(Context)` method to initialize themselves. The framework then calls the [reduce(WritableComparable, Iterable\<Writable\>, Context)](../../api/org/apache/hadoop/mapreduce/Reducer.html) method for each `<key, (list of values)>` pair in the grouped inputs. Applications can then override the `cleanup(Context)` method to perform any required cleanup.
+
+`Reducer` has 3 primary phases: shuffle, sort and reduce.
+
+##### Shuffle
+
+Input to the `Reducer` is the sorted output of the mappers. In this phase the framework fetches the relevant partition of the output of all the mappers, via HTTP.
+
+##### Sort
+
+The framework groups `Reducer` inputs by keys (since different mappers may have output the same key) in this stage.
+
+The shuffle and sort phases occur simultaneously; while map-outputs are being fetched they are merged.
+
+##### Secondary Sort
+
+If equivalence rules for grouping the intermediate keys are required to be different from those for grouping keys before reduction, then one may specify a `Comparator` via [Job.setSortComparatorClass(Class)](../../api/org/apache/hadoop/mapreduce/Job.html). Since [Job.setGroupingComparatorClass(Class)](../../api/org/apache/hadoop/mapreduce/Job.html) can be used to control how intermediate keys are grouped, these can be used in conjunction to simulate *secondary sort on values*.
+
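The interplay of the two comparators can be illustrated outside the framework. This is a plain-Java sketch, not Hadoop code: `CompositeKey`, `SORT` and `GROUP` are hypothetical stand-ins for a composite key class, the sort comparator and the grouping comparator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SecondarySortSketch {

  // Hypothetical composite key: the natural key plus the value to order by.
  public static final class CompositeKey {
    final String naturalKey;
    final int value;

    public CompositeKey(String naturalKey, int value) {
      this.naturalKey = naturalKey;
      this.value = value;
    }
  }

  // Plays the role of the sort comparator: natural key first, then value.
  static final Comparator<CompositeKey> SORT =
      Comparator.comparing((CompositeKey k) -> k.naturalKey)
                .thenComparingInt(k -> k.value);

  // Plays the role of the grouping comparator: natural key only.
  static final Comparator<CompositeKey> GROUP =
      Comparator.comparing((CompositeKey k) -> k.naturalKey);

  public static Map<String, List<Integer>> sortAndGroup(List<CompositeKey> keys) {
    List<CompositeKey> sorted = new ArrayList<>(keys);
    sorted.sort(SORT);
    Map<String, List<Integer>> groups = new LinkedHashMap<>();
    CompositeKey previous = null;
    for (CompositeKey key : sorted) {
      // A new group (one reduce call) starts when the grouping comparator
      // reports a difference from the previous key.
      if (previous == null || GROUP.compare(previous, key) != 0) {
        groups.put(key.naturalKey, new ArrayList<>());
      }
      groups.get(key.naturalKey).add(key.value);
      previous = key;
    }
    return groups;
  }

  public static void main(String[] args) {
    List<CompositeKey> keys = Arrays.asList(
        new CompositeKey("b", 2), new CompositeKey("a", 3), new CompositeKey("a", 1));
    System.out.println(sortAndGroup(keys)); // {a=[1, 3], b=[2]}
  }
}
```

Because sorting uses the full composite key while grouping ignores the value part, each group sees its values already in order.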
+##### Reduce
+
+In this phase the reduce(WritableComparable, Iterable\<Writable\>, Context) method is called for each `<key, (list of values)>` pair in the grouped inputs.
+
+The output of the reduce task is typically written to the [FileSystem](../../api/org/apache/hadoop/fs/FileSystem.html) via Context.write(WritableComparable, Writable).
+
+Applications can use the `Counter` to report their statistics.
+
+The output of the `Reducer` is *not sorted*.
+
+##### How Many Reduces?
+
+The right number of reduces seems to be `0.95` or `1.75` multiplied by (\<*no. of nodes*\> \* \<*no. of maximum containers per node*\>).
+
+With `0.95` all of the reduces can launch immediately and start transferring map outputs as the maps finish. With `1.75` the faster nodes will finish their first round of reduces and launch a second wave of reduces doing a much better job of load balancing.
+
+Increasing the number of reduces increases the framework overhead, but increases load balancing and lowers the cost of failures.
+
+The scaling factors above are slightly less than whole numbers to reserve a few reduce slots in the framework for speculative-tasks and failed tasks.
+
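As a worked example of the rule of thumb above, the sketch below computes both candidate reduce counts for an assumed cluster of 10 nodes with at most 8 containers each. The cluster size and the names `ReduceCountEstimate` and `estimateReduces` are illustrative assumptions; the factors are passed as percentages so the arithmetic stays in integers.

```java
public class ReduceCountEstimate {

  // factorPercent is 95 or 175, i.e. the 0.95 and 1.75 factors times 100.
  public static int estimateReduces(int nodes, int maxContainersPerNode, int factorPercent) {
    return nodes * maxContainersPerNode * factorPercent / 100;
  }

  public static void main(String[] args) {
    System.out.println(estimateReduces(10, 8, 95));  // 76: one wave of reduces
    System.out.println(estimateReduces(10, 8, 175)); // 140: roughly two waves
  }
}
```

The chosen value would then be passed to `Job.setNumReduceTasks(int)`.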
+##### Reducer NONE
+
+It is legal to set the number of reduce-tasks to *zero* if no reduction is desired.
+
+In this case the outputs of the map-tasks go directly to the `FileSystem`, into the output path set by [FileOutputFormat.setOutputPath(Job, Path)](../../api/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.html). The framework does not sort the map-outputs before writing them out to the `FileSystem`.
+
+#### Partitioner
+
+[Partitioner](../../api/org/apache/hadoop/mapreduce/Partitioner.html) partitions the key space.
+
+Partitioner controls the partitioning of the keys of the intermediate map-outputs. The key (or a subset of the key) is used to derive the partition, typically by a *hash function*. The total number of partitions is the same as the number of reduce tasks for the job. Hence this controls which of the `m` reduce tasks the intermediate key (and hence the record) is sent to for reduction.
+
+[HashPartitioner](../../api/org/apache/hadoop/mapreduce/lib/partition/HashPartitioner.html) is the default `Partitioner`.
+
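Hash partitioning can be sketched in a few lines of plain Java. The snippet below is written to resemble what `HashPartitioner` is understood to do, but it is a standalone illustration with a made-up class name, not the Hadoop source.

```java
public class HashPartitionSketch {

  public static int getPartition(Object key, int numReduceTasks) {
    // Mask off the sign bit so a negative hashCode still yields a valid index.
    return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
  }

  public static void main(String[] args) {
    System.out.println(getPartition("a", 4)); // 1, since "a".hashCode() is 97
    System.out.println(getPartition(-1, 4));  // 3, since -1 masks to 2147483647
  }
}
```

Equal keys always land in the same partition, which is what guarantees that all values for a key reach the same reduce task.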
+#### Counter
+
+[Counter](../../api/org/apache/hadoop/mapreduce/Counter.html) is a facility for MapReduce applications to report their statistics.
+
+`Mapper` and `Reducer` implementations can use the `Counter` to report statistics.
+
+Hadoop MapReduce comes bundled with a [library](../../api/org/apache/hadoop/mapreduce/package-summary.html) of generally useful mappers, reducers, and partitioners.
+
+### Job Configuration
+
+[Job](../../api/org/apache/hadoop/mapreduce/Job.html) represents a MapReduce job configuration.
+
+`Job` is the primary interface for a user to describe a MapReduce job to the Hadoop framework for execution. The framework tries to faithfully execute the job as described by `Job`, however:
+
+* Some configuration parameters may have been marked as final by
+  administrators
+  (see [Final Parameters](../../api/org/apache/hadoop/conf/Configuration.html#FinalParams))
+  and hence cannot be altered.
+
+* While some job parameters are straight-forward to set (e.g.
+  [Job.setNumReduceTasks(int)](../../api/org/apache/hadoop/mapreduce/Job.html))
+  , other parameters interact subtly with the
+  rest of the framework and/or job configuration and are more complex to set
+  (e.g. [Configuration.set(`JobContext.NUM_MAPS`, int)](../../api/org/apache/hadoop/conf/Configuration.html)).
+
+`Job` is typically used to specify the `Mapper`, combiner (if any), `Partitioner`, `Reducer`, `InputFormat`, and `OutputFormat` implementations.
+[FileInputFormat](../../api/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.html)
+indicates the set of input files
+([FileInputFormat.setInputPaths(Job, Path...)](../../api/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.html)/
+[FileInputFormat.addInputPath(Job, Path)](../../api/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.html) and
+[FileInputFormat.setInputPaths(Job, String...)](../../api/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.html)/
+[FileInputFormat.addInputPaths(Job, String)](../../api/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.html)), and
+[FileOutputFormat](../../api/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.html) indicates where the output files should be written
+([FileOutputFormat.setOutputPath(Job, Path)](../../api/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.html)).
+
+Optionally, `Job` is used to specify other advanced facets of the job such as the `Comparator` to be used, files to be put in the `DistributedCache`, whether intermediate and/or job outputs are to be compressed (and how), whether job tasks can be executed in a *speculative* manner
+([setMapSpeculativeExecution(boolean)](../../api/org/apache/hadoop/mapreduce/Job.html)/
+[setReduceSpeculativeExecution(boolean)](../../api/org/apache/hadoop/mapreduce/Job.html)),
+the maximum number of attempts per task
+([setMaxMapAttempts(int)](../../api/org/apache/hadoop/mapreduce/Job.html)/
+[setMaxReduceAttempts(int)](../../api/org/apache/hadoop/mapreduce/Job.html)), etc.
+
+Of course, users can use
+[Configuration.set(String, String)](../../api/org/apache/hadoop/conf/Configuration.html)/
+[Configuration.get(String)](../../api/org/apache/hadoop/conf/Configuration.html)
+to set/get arbitrary parameters needed by
+applications. However, use the `DistributedCache` for large amounts of
+(read-only) data.
+
+### Task Execution & Environment
+
+The `MRAppMaster` executes the `Mapper`/`Reducer` *task* as a child process in a separate JVM.
+
+The child task inherits the environment of the parent `MRAppMaster`. The user can specify additional options to the child JVM via the `mapreduce.{map|reduce}.java.opts` configuration parameters in the `Job`, such as non-standard paths for the run-time linker to search for shared libraries via `-Djava.library.path=<>`. If a `mapreduce.{map|reduce}.java.opts` parameter contains the symbol *@taskid@*, it is replaced with the `taskid` of the MapReduce task.
+
+Here is an example with multiple arguments and substitutions. It enables JVM GC logging and starts a passwordless JVM JMX agent so that tools such as jconsole can connect to it to watch child memory and threads and to get thread dumps. It also sets the maximum heap size of the map and reduce child JVMs to 512MB and 1024MB respectively, and adds an additional path to the `java.library.path` of the child JVM.
+
+```xml
+<property>
+  <name>mapreduce.map.java.opts</name>
+  <value>
+  -Xmx512M -Djava.library.path=/home/mycompany/lib -verbose:gc -Xloggc:/tmp/@taskid@.gc
+  -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false
+  </value>
+</property>
+
+<property>
+  <name>mapreduce.reduce.java.opts</name>
+  <value>
+  -Xmx1024M -Djava.library.path=/home/mycompany/lib -verbose:gc -Xloggc:/tmp/@taskid@.gc
+  -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false
+  </value>
+</property>
+```
+
+#### Memory Management
+
+Users/admins can also specify the maximum virtual memory of the launched child task, and any sub-process it launches recursively, using `mapreduce.{map|reduce}.memory.mb`. Note that the value set here is a per-process limit. The value for `mapreduce.{map|reduce}.memory.mb` should be specified in megabytes (MB). The value must also be greater than or equal to the `-Xmx` passed to the Java VM, or else the VM might not start.
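
The constraint between the heap size and the memory limit can be checked before a job is submitted. The following is a rough sketch, not framework code: it assumes the heap is given as a single `-Xmx<number><unit>` token inside the options string, and the class and method names are hypothetical.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: MemoryCheckSketch and its methods are hypothetical names.
final class MemoryCheckSketch {
    private static final Pattern XMX = Pattern.compile("-Xmx(\\d+)([kKmMgG]?)");

    /** Returns the -Xmx value in megabytes, or -1 if the option is absent. */
    static long xmxMegabytes(String javaOpts) {
        Matcher m = XMX.matcher(javaOpts);
        if (!m.find()) {
            return -1;
        }
        long value = Long.parseLong(m.group(1));
        switch (m.group(2).toLowerCase(Locale.ROOT)) {
            case "g": return value * 1024;
            case "m": return value;
            case "k": return value / 1024;
            default:  return value / (1024 * 1024); // no suffix means bytes
        }
    }

    /** True when the requested heap does not exceed the memory.mb value. */
    static boolean heapFitsLimit(String javaOpts, long memoryMb) {
        return xmxMegabytes(javaOpts) <= memoryMb;
    }

    public static void main(String[] args) {
        System.out.println(heapFitsLimit("-Xmx512M -verbose:gc", 1024));
    }
}
```

In practice some headroom above `-Xmx` is usually left for non-heap JVM memory; the check above only enforces the minimum relationship stated in the text.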
+
+Note: `mapreduce.{map|reduce}.java.opts` are used only for configuring the launched child tasks from MRAppMaster. Configuring the memory options for daemons is documented in [Configuring the Environment of the Hadoop Daemons](../../hadoop-project-dist/hadoop-common/ClusterSetup.html#Configuring_Environment_of_Hadoop_Daemons).
+
+The memory available to some parts of the framework is also configurable. In map and reduce tasks, performance may be influenced by adjusting parameters influencing the concurrency of operations and the frequency with which data will hit disk. Monitoring the filesystem counters for a job, particularly relative to byte counts from the map and into the reduce, is invaluable to the tuning of these parameters.
+
+#### Map Parameters
+
+A record emitted from a map will be serialized into a buffer and metadata will be stored into accounting buffers. As described in the following options, when either the serialization buffer or the metadata exceeds a threshold, the contents of the buffers will be sorted and written to disk in the background while the map continues to output records. If either buffer fills completely while the spill is in progress, the map thread will block. When the map is finished, any remaining records are written to disk and all on-disk segments are merged into a single file. Minimizing the number of spills to disk can decrease map time, but a larger buffer also decreases the memory available to the mapper.
+
+| Name | Type | Description |
+|:---- |:---- |:---- |
+| mapreduce.task.io.sort.mb | int | The cumulative size of the serialization and accounting buffers storing records emitted from the map, in megabytes. |
+| mapreduce.map.sort.spill.percent | float | The soft limit in the serialization buffer. Once reached, a thread will begin to spill the contents to disk in the background. |
+
+Other notes
+
+* If either spill threshold is exceeded while a spill is in progress,
+  collection will continue until the spill is finished. For example, if
+  `mapreduce.map.sort.spill.percent` is set to 0.33, and the remainder
+  of the buffer is filled while the spill runs, the next spill will include
+  all the collected records, or 0.66 of the buffer, and will not generate
+  additional spills. In other words, the thresholds define triggers,
+  not blocking limits.
+
+* A record larger than the serialization buffer will first trigger a spill,
+  then be spilled to a separate file. It is undefined whether or not this
+  record will first pass through the combiner.
+
+#### Shuffle/Reduce Parameters
+
+As described previously, each reduce fetches the output assigned to it by the Partitioner via HTTP into memory and periodically merges these outputs to disk. If intermediate compression of map outputs is turned on, each output is decompressed into memory. The following options affect the frequency of these merges to disk prior to the reduce and the memory allocated to map output during the reduce.
+
+| Name | Type | Description |
+|:---- |:---- |:---- |
+| mapreduce.task.io.sort.factor | int | Specifies the number of segments on disk to be merged at the same time. It limits the number of open files and compression codecs during merge. If the number of files exceeds this limit, the merge will proceed in several passes. Though this limit also applies to the map, most jobs should be configured so that hitting this limit is unlikely there. |
+| mapreduce.reduce.merge.inmem.threshold | int | The number of sorted map outputs fetched into memory before being merged to disk. Like the spill thresholds in the preceding note, this does not define a unit of partition, but a trigger. In practice, this is usually set very high (1000) or disabled (0), since merging in-memory segments is often less expensive than merging from disk (see notes following this table). This threshold influences only the frequency of in-memory merges during the shuffle. |
+| mapreduce.reduce.shuffle.merge.percent | float | The memory threshold for fetched map outputs before an in-memory merge is started, expressed as a percentage of memory allocated to storing map outputs in memory. Since map outputs that can't fit in memory can be stalled, setting this high may decrease parallelism between the fetch and merge. Conversely, values as high as 1.0 have been effective for reduces whose input can fit entirely in memory. This parameter influences only the frequency of in-memory merges during the shuffle. |
+| mapreduce.reduce.shuffle.input.buffer.percent | float | The percentage of memory, relative to the maximum heapsize as typically specified in `mapreduce.reduce.java.opts`, that can be allocated to storing map outputs during the shuffle. Though some memory should be set aside for the framework, in general it is advantageous to set this high enough to store large and numerous map outputs. |
+| mapreduce.reduce.input.buffer.percent | float | The percentage of memory relative to the maximum heapsize in which map outputs may be retained during the reduce. When the reduce begins, map outputs will be merged to disk until those that remain are under the resource limit this defines. By default, all map outputs are merged to disk before the reduce begins to maximize the memory available to the reduce. For less memory-intensive reduces, this should be increased to avoid trips to disk. |
+
+Other notes
+
+* If a map output is larger than 25 percent of the memory allocated to
+  copying map outputs, it will be written directly to disk without first
+  staging through memory.
+
+* When running with a combiner, the reasoning about high merge thresholds
+  and large buffers may not hold. For merges started before all map outputs
+  have been fetched, the combiner is run while spilling to disk. In some
+  cases, one can obtain better reduce times by spending resources combining
+  map outputs (making disk spills small and parallelizing spilling and
+  fetching) rather than aggressively increasing buffer sizes.
+
+* When merging in-memory map outputs to disk to begin the reduce, if an
+  intermediate merge is necessary because there are segments to spill and at
+  least `mapreduce.task.io.sort.factor` segments already on disk, the
+  in-memory map outputs will be part of the intermediate merge.
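
The arithmetic behind the shuffle buffer and the 25 percent rule in the notes above can be sketched as follows. This is a back-of-the-envelope illustration under the assumption that the buffer is simply the maximum heap multiplied by `mapreduce.reduce.shuffle.input.buffer.percent`; the class and method names are hypothetical and the framework's own accounting may differ.

```java
// Illustrative sketch: ShuffleMemorySketch and its methods are hypothetical names.
final class ShuffleMemorySketch {
    /** Memory available for holding fetched map outputs during the shuffle. */
    static long shuffleBufferBytes(long maxHeapBytes, double inputBufferPercent) {
        return (long) (maxHeapBytes * inputBufferPercent);
    }

    /** A map output above 25 percent of the shuffle buffer goes straight to disk. */
    static boolean fetchedToMemory(long mapOutputBytes, long shuffleBufferBytes) {
        return mapOutputBytes <= shuffleBufferBytes / 4;
    }

    public static void main(String[] args) {
        long buffer = shuffleBufferBytes(1L << 30, 0.5); // 1 GB heap, half for shuffle
        System.out.println("shuffle buffer bytes: " + buffer);
        System.out.println("100 MB output in memory: " + fetchedToMemory(100L << 20, buffer));
    }
}
```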
+
+#### Configured Parameters
+
+The following properties are localized in the job configuration for each task's execution:
+
+| Name | Type | Description |
+|:---- |:---- |:---- |
+| mapreduce.job.id | String | The job id |
+| mapreduce.job.jar | String | job.jar location in job directory |
+| mapreduce.job.local.dir | String | The job specific shared scratch space |
+| mapreduce.task.id | String | The task id |
+| mapreduce.task.attempt.id | String | The task attempt id |
+| mapreduce.task.is.map | boolean | Is this a map task |
+| mapreduce.task.partition | int | The id of the task within the job |
+| mapreduce.map.input.file | String | The filename that the map is reading from |
+| mapreduce.map.input.start | long | The offset of the start of the map input split |
+| mapreduce.map.input.length | long | The number of bytes in the map input split |
+| mapreduce.task.output.dir | String | The task's temporary output directory |
+
+**Note:** During the execution of a streaming job, the names of the "mapreduce" parameters are transformed. The dots ( . ) become underscores ( \_ ). For example, mapreduce.job.id becomes mapreduce\_job\_id and mapreduce.job.jar becomes mapreduce\_job\_jar. To get the values in a streaming job's mapper/reducer use the parameter names with the underscores.
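
The renaming rule in the note can be sketched in a few lines. The helper below is hypothetical; it only applies the dot-to-underscore transformation described above and, as an assumption about how streaming tasks usually receive these values, looks the result up in the process environment.

```java
// Illustrative sketch: StreamingNameSketch and streamingName are hypothetical names.
final class StreamingNameSketch {
    /** Converts a job configuration name to the form seen by a streaming task. */
    static String streamingName(String parameterName) {
        return parameterName.replace('.', '_');
    }

    public static void main(String[] args) {
        String name = streamingName("mapreduce.job.id");
        // Outside a streaming task this lookup simply yields null.
        System.out.println(name + " = " + System.getenv(name));
    }
}
```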
+
+#### Task Logs
+
+The standard output (stdout) and error (stderr) streams and the syslog of the task are read by the NodeManager and logged to `${HADOOP_LOG_DIR}/userlogs`.
+
+#### Distributing Libraries
+
+The [DistributedCache](#DistributedCache) can also be used to distribute both jars and native
+libraries for use in the map and/or reduce tasks. The child-jvm always has
+its *current working directory* added to the `java.library.path` and
+`LD_LIBRARY_PATH`. Hence the cached libraries can be loaded via
+[System.loadLibrary](http://docs.oracle.com/javase/7/docs/api/java/lang/System.html) or
+[System.load](http://docs.oracle.com/javase/7/docs/api/java/lang/System.html).
+More details on how to load shared libraries through distributed cache are documented at
+[Native Libraries](../../hadoop-project-dist/hadoop-common/NativeLibraries.html#Native_Shared_Libraries).
+
+### Job Submission and Monitoring
+
+[Job](../../api/org/apache/hadoop/mapreduce/Job.html) is the primary interface by which a user job interacts with the `ResourceManager`.
+
+`Job` provides facilities to submit jobs, track their progress, access component-tasks' reports and logs, get the MapReduce cluster's status information and so on.
+
+The job submission process involves:
+
+1.  Checking the input and output specifications of the job.
+
+2.  Computing the `InputSplit` values for the job.
+
+3.  Setting up the requisite accounting information for the
+    `DistributedCache` of the job, if necessary.
+
+4.  Copying the job's jar and configuration to the MapReduce system
+    directory on the `FileSystem`.
+
+5.  Submitting the job to the `ResourceManager` and optionally
+    monitoring its status.
+
+Job history files are also logged to the user-specified directories `mapreduce.jobhistory.intermediate-done-dir` and `mapreduce.jobhistory.done-dir`, which default to the job output directory.
+
+The user can view a summary of the history logs in the specified directory using the command `$ mapred job -history output.jhist`. This command prints job details and details of failed and killed TIPs (tasks in progress). More details about the job, such as successful tasks and the task attempts made for each task, can be viewed using the command `$ mapred job -history all output.jhist`.
+
+Normally the user uses `Job` to create the application, describe various facets of the job, submit the job, and monitor its progress.
+
+#### Job Control
+
+Users may need to chain MapReduce jobs to accomplish complex tasks which cannot be done via a single MapReduce job. This is fairly easy since the output of a job typically goes to the distributed file system, and that output, in turn, can be used as the input for the next job.
+
+However, this also means that the onus of ensuring that jobs are complete (success/failure) lies squarely on the clients. In such cases, the various job-control options are:
+
+* [Job.submit()](../../api/org/apache/hadoop/mapreduce/Job.html) :
+  Submit the job to the cluster and return immediately.
+
+* [Job.waitForCompletion(boolean)](../../api/org/apache/hadoop/mapreduce/Job.html) :
+  Submit the job to the cluster and wait for it to finish.
+
+### Job Input
+
+[InputFormat](../../api/org/apache/hadoop/mapreduce/InputFormat.html) describes the input-specification for a MapReduce job.
+
+The MapReduce framework relies on the `InputFormat` of the job to:
+
+1.  Validate the input-specification of the job.
+
+2.  Split up the input file(s) into logical `InputSplit` instances,
+    each of which is then assigned to an individual `Mapper`.
+
+3.  Provide the `RecordReader` implementation used to glean input
+    records from the logical `InputSplit` for processing by the
+    `Mapper`.
+
+The default behavior of file-based `InputFormat` implementations,
+typically sub-classes of
+[FileInputFormat](../../api/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.html),
+is to split the input into *logical* `InputSplit`
+instances based on the total size, in bytes, of the input files. However, the
+`FileSystem` blocksize of the input files is treated as an upper bound
+for input splits. A lower bound on the split size can be set via
+`mapreduce.input.fileinputformat.split.minsize`.
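
The interaction between the block size and the configured lower bound can be sketched as a clamp. The snippet below is a simplified illustration with hypothetical class and method names; the `maxSize` parameter is an assumption added here to represent an optional configured upper bound and is not discussed in the text above.

```java
// Illustrative sketch: SplitSizeSketch and splitSize are hypothetical names.
final class SplitSizeSketch {
    /**
     * The block size acts as the upper bound unless a smaller maxSize is given,
     * and minSize can raise the result above the block size.
     */
    static long splitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long block = 128L << 20; // 128 MB block size, chosen for illustration
        System.out.println(splitSize(block, 1L, Long.MAX_VALUE));
        System.out.println(splitSize(block, 256L << 20, Long.MAX_VALUE));
    }
}
```

Raising the minimum above the block size produces fewer, larger splits at the cost of data locality, since a split may then span more than one block.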
+
+Clearly, logical splits based on input size are insufficient for many applications since record boundaries must be respected. In such cases, the application should implement a `RecordReader`, which is responsible for respecting record boundaries and presents a record-oriented view of the logical `InputSplit` to the individual task.
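
To make the record-boundary problem concrete, here is a self-contained sketch of one common convention for newline-delimited records. It works on an in-memory byte array rather than a `FileSystem`, and the class and method names are hypothetical. A split that does not start at offset 0 skips ahead to the next line start, and every split keeps reading until it has consumed the line that begins at or before its end offset, so each line is read exactly once.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: LineSplitSketch and readSplit are hypothetical names.
final class LineSplitSketch {
    /** Returns the lines owned by the byte range [start, start + length). */
    static List<String> readSplit(byte[] data, int start, int length) {
        int end = Math.min(start + length, data.length);
        int pos = start;
        if (start != 0) {
            // Skip the first (possibly partial) line; the previous split reads it.
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++; // step past the newline
        }
        List<String> lines = new ArrayList<>();
        // Read every line that begins at or before the end of the split.
        while (pos <= end && pos < data.length) {
            int lineStart = pos;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            lines.add(new String(data, lineStart, pos - lineStart, StandardCharsets.UTF_8));
            pos++; // step past the newline
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbeta\ngamma\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(readSplit(data, 0, 8));
        System.out.println(readSplit(data, 8, 9));
    }
}
```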
+
+[TextInputFormat](../../api/org/apache/hadoop/mapreduce/lib/input/TextInputFormat.html)
+is the default `InputFormat`.
+
+If `TextInputFormat` is the `InputFormat` for a given job, the framework detects input files with the *.gz* extension and automatically decompresses them using the appropriate `CompressionCodec`. However, it must be noted that compressed files with this extension cannot be *split*, and each compressed file is processed in its entirety by a single mapper.
+
+#### InputSplit
+
+[InputSplit](../../api/org/apache/hadoop/mapreduce/InputSplit.html)
+represents the data to be processed by an individual `Mapper`.
+
+Typically `InputSplit` presents a byte-oriented view of the input, and it is the responsibility of `RecordReader` to process and present a record-oriented view.
+
+[FileSplit](../../api/org/apache/hadoop/mapreduce/lib/input/FileSplit.html)
+is the default `InputSplit`. It sets `mapreduce.map.input.file` to
+the path of the input file for the logical split.
+
+#### RecordReader
+
+[RecordReader](../../api/org/apache/hadoop/mapreduce/RecordReader.html) reads `<key, value>` pairs from an `InputSplit`.
+
+Typically the `RecordReader` converts the byte-oriented view of the input, provided by the `InputSplit`, and presents a record-oriented view to the `Mapper` implementations for processing. `RecordReader` thus assumes the responsibility of processing record boundaries and presents the tasks with keys and values.
+
+### Job Output
+
+[OutputFormat](../../api/org/apache/hadoop/mapreduce/OutputFormat.html)
+describes the output-specification for a MapReduce job.
+
+The MapReduce framework relies on the `OutputFormat` of the job to:
+
+1.  Validate the output-specification of the job; for example, check that
+    the output directory doesn't already exist.
+
+2.  Provide the `RecordWriter` implementation used to write the output
+    files of the job. Output files are stored in a `FileSystem`.
+
+`TextOutputFormat` is the default `OutputFormat`.
+
+#### OutputCommitter
+
+[OutputCommitter](../../api/org/apache/hadoop/mapreduce/OutputCommitter.html)
+describes the commit of task output for a MapReduce job.
+
+The MapReduce framework relies on the `OutputCommitter` of the job to:
+
+1.  Set up the job during initialization. For example, create the temporary
+    output directory for the job during the initialization of the job. Job
+    setup is done by a separate task when the job is in PREP state and
+    after initializing tasks. Once the setup task completes, the job will
+    be moved to RUNNING state.
+
+2.  Clean up the job after the job completion. For example, remove the
+    temporary output directory after the job completion. Job cleanup is
+    done by a separate task at the end of the job. The job is declared
+    SUCCEEDED/FAILED/KILLED after the cleanup task completes.
+
+3.  Set up the task temporary output. Task setup is done as part of the
+    same task, during task initialization.
+
+4.  Check whether a task needs a commit. This is to avoid the commit
+    procedure if a task does not need commit.
+
+5.  Commit the task output. Once the task is done, the task will commit
+    its output if required.
+
+6.  Discard the task commit. If the task has failed or been killed, the
+    output will be cleaned up. If the task could not clean up (in its exception
+    block), a separate task will be launched with the same attempt-id to do
+    the cleanup.
+
+`FileOutputCommitter` is the default `OutputCommitter`. Job setup/cleanup tasks occupy map or reduce containers, whichever is available on the NodeManager. The JobCleanup task, TaskCleanup tasks, and JobSetup task have the highest priority, in that order.
+
+#### Task Side-Effect Files
+
+In some applications, component tasks need to create and/or write to side-files, which differ from the actual job-output files.
+
+In such cases there could be issues with two instances of the same `Mapper` or `Reducer` running simultaneously (for example, speculative tasks) trying to open and/or write to the same file (path) on the `FileSystem`. Hence the application-writer will have to pick unique names per task-attempt (using the attemptid, say `attempt_200709221812_0001_m_000000_0`), not just per task.
+
+To avoid these issues the MapReduce framework, when the `OutputCommitter` is `FileOutputCommitter`, maintains a special `${mapreduce.output.fileoutputformat.outputdir}/_temporary/_${taskid}` sub-directory accessible via `${mapreduce.task.output.dir}` for each task-attempt on the `FileSystem` where the output of the task-attempt is stored. On successful completion of the task-attempt, the files in the `${mapreduce.output.fileoutputformat.outputdir}/_temporary/_${taskid}` (only) are *promoted* to `${mapreduce.output.fileoutputformat.outputdir}`. Of course, the framework discards the sub-directory of unsuccessful task-attempts. This process is completely transparent to the application.
+
+The application-writer can take advantage of this feature by creating any side-files required in `${mapreduce.task.output.dir}` during execution of a task via
+[FileOutputFormat.getWorkOutputPath(Context)](../../api/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.html),
+and the framework will promote them similarly for successful task-attempts, thus eliminating the need to pick unique paths per task-attempt.
+
+Note: The value of `${mapreduce.task.output.dir}` during execution of a particular task-attempt is actually `${mapreduce.output.fileoutputformat.outputdir}/_temporary/_${taskid}`, and this value is set by the MapReduce framework. So, just create any side-files in the path returned by
+[FileOutputFormat.getWorkOutputPath(Context)](../../api/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.html)
+from the MapReduce task to take advantage of this feature.
+
+The entire discussion holds true for maps of jobs with reducer=NONE (i.e. 0 reduces) since output of the map, in that case, goes directly to HDFS.
+
+#### RecordWriter
+
+[RecordWriter](../../api/org/apache/hadoop/mapreduce/RecordWriter.html)
+writes the output `<key, value>` pairs to an output file.
+
+RecordWriter implementations write the job outputs to the `FileSystem`.
+
+### Other Useful Features
+
+#### Submitting Jobs to Queues
+
+Users submit jobs to queues. Queues, as collections of jobs, allow the system to provide specific functionality. For example, queues use ACLs to control which users can submit jobs to them. Queues are expected to be primarily used by Hadoop Schedulers.
+
+Hadoop comes configured with a single mandatory queue, called 'default'. Queue names are defined in the `mapreduce.job.queuename` property of the Hadoop site configuration. Some job schedulers, such as the
+[Capacity Scheduler](../../hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html),
+support multiple queues.
+
+A job defines the queue it needs to be submitted to through the `mapreduce.job.queuename` property, or through the Configuration.set(`MRJobConfig.QUEUE_NAME`, String) API. Setting the queue name is optional. If a job is submitted without an associated queue name, it is submitted to the 'default' queue.
+
+#### Counters
+
+`Counters` represent global counters, defined either by the MapReduce framework or applications. Each `Counter` can be of any `Enum` type. Counters of a particular `Enum` are bunched into groups of type `Counters.Group`.
+
+Applications can define arbitrary `Counters` (of type `Enum`) and update them via
+[Counters.incrCounter(Enum, long)](../../api/org/apache/hadoop/mapred/Counters.html)
+or Counters.incrCounter(String, String, long) in the `map` and/or `reduce` methods. These counters are then globally aggregated by the framework.
+
+#### DistributedCache
+
+`DistributedCache` distributes application-specific, large, read-only files efficiently.
+
+`DistributedCache` is a facility provided by the MapReduce framework to cache files (text, archives, jars and so on) needed by applications.
+
+Applications specify the files to be cached via urls (hdfs://) in the `Job`. The `DistributedCache` assumes that the files specified via hdfs:// urls are already present on the `FileSystem`.
+
+The framework will copy the necessary files to the slave node before any tasks for the job are executed on that node. Its efficiency stems from the fact that the files are only copied once per job and the ability to cache archives which are un-archived on the slaves.
+
+`DistributedCache` tracks the modification timestamps of the cached files. Clearly the cache files should not be modified by the application or externally while the job is executing.
+
+`DistributedCache` can be used to distribute simple, read-only data/text files and more complex types such as archives and jars. Archives (zip, tar, tgz and tar.gz files) are *un-archived* at the slave nodes. Files have *execution permissions* set.
+
+The files/archives can be distributed by setting the property `mapreduce.job.cache.{files|archives}`. If more than one file/archive has to be distributed, they can be added as comma-separated paths. The properties can also be set by the APIs
+[Job.addCacheFile(URI)](../../api/org/apache/hadoop/mapreduce/Job.html)/
+[Job.addCacheArchive(URI)](../../api/org/apache/hadoop/mapreduce/Job.html)
+and
+[Job.setCacheFiles(URI[])](../../api/org/apache/hadoop/mapreduce/Job.html)/
+[Job.setCacheArchives(URI[])](../../api/org/apache/hadoop/mapreduce/Job.html)
+where URI is of the form `hdfs://host:port/absolute-path#link-name`. In Streaming, the files can be distributed through the command-line options `-cacheFile/-cacheArchive`.
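
The parts of such a URI can be inspected with the standard `java.net.URI` class. The sketch below uses a made-up host, port, and path purely for illustration; the fragment after `#` is the link name.

```java
import java.net.URI;

// Illustrative sketch: CacheUriSketch is a hypothetical name, and the URI is invented.
final class CacheUriSketch {
    /** Returns the link name (the URI fragment), or null if none was given. */
    static String linkName(URI cacheUri) {
        return cacheUri.getFragment();
    }

    public static void main(String[] args) {
        URI uri = URI.create("hdfs://namenode.example.com:8020/apps/lookup/dict.txt#dict");
        System.out.println("path on the FileSystem: " + uri.getPath());
        System.out.println("link name: " + linkName(uri));
    }
}
```

A URI built this way is the kind of value that would be passed to `Job.addCacheFile(URI)`.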
+
+The `DistributedCache` can also be used as a rudimentary software distribution mechanism for use in the map and/or reduce tasks. It can be used to distribute both jars and native libraries. The
+[Job.addArchiveToClassPath(Path)](../../api/org/apache/hadoop/mapreduce/Job.html) or
+[Job.addFileToClassPath(Path)](../../api/org/apache/hadoop/mapreduce/Job.html)
+API can be used to cache files/jars and also add them to the *classpath* of the child JVM. The same can be done by setting the configuration properties `mapreduce.job.classpath.{files|archives}`. Similarly, the cached files that are symlinked into the working directory of the task can be used to distribute native libraries and load them.
+
+##### Private and Public DistributedCache Files
+
+DistributedCache files can be private or public, which determines how they can be shared on the slave nodes.
+
+* "Private" DistributedCache files are cached in a local directory private to
+  the user whose jobs need these files. These files are shared by all tasks
+  and jobs of the specific user only and cannot be accessed by jobs of
+  other users on the slaves. A DistributedCache file becomes private by
+  virtue of its permissions on the file system where the files are
+  uploaded, typically HDFS. If the file has no world readable access, or if
+  the directory path leading to the file has no world executable access for
+  lookup, then the file becomes private.
+
+* "Public" DistributedCache files are cached in a global directory and the
+  file access is set up such that they are publicly visible to all users.
+  These files can be shared by tasks and jobs of all users on the slaves. A
+  DistributedCache file becomes public by virtue of its permissions on the
+  file system where the files are uploaded, typically HDFS. If the file has
+  world readable access, AND if the directory path leading to the file has
+  world executable access for lookup, then the file becomes public. In other
+  words, if the user intends to make a file publicly available to all users,
+  the file permissions must be set to be world readable, and the directory
+  permissions on the path leading to the file must be world executable.
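
The public/private rule above can be expressed as a small predicate over POSIX-style permission strings. This is only a sketch of the rule as stated: the class and method names are hypothetical, and permissions are passed in as nine-character strings such as `rwxr-xr-x` instead of being read from a real file system.

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch: CacheVisibilitySketch and isPublic are hypothetical names.
final class CacheVisibilitySketch {
    /**
     * A file is public only if it is world readable AND every directory on the
     * path leading to it is world executable; otherwise it is private.
     */
    static boolean isPublic(String filePerms, List<String> ancestorDirPerms) {
        if (!PosixFilePermissions.fromString(filePerms)
                .contains(PosixFilePermission.OTHERS_READ)) {
            return false;
        }
        for (String dirPerms : ancestorDirPerms) {
            if (!PosixFilePermissions.fromString(dirPerms)
                    .contains(PosixFilePermission.OTHERS_EXECUTE)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList("rwxr-xr-x", "rwxr-xr-x");
        System.out.println(isPublic("rw-r--r--", dirs));
        System.out.println(isPublic("rw-r-----", dirs));
    }
}
```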
+
+#### Profiling
+
+Profiling is a utility to get a representative sample (2 or 3 tasks) of built-in Java profiler output for a sample of maps and reduces.
+
+The user can specify whether the system should collect profiler information for some of the tasks in the job by setting the configuration property `mapreduce.task.profile`. The value can be set using the API Configuration.set(`MRJobConfig.TASK_PROFILE`, boolean). If the value is set to `true`, task profiling is enabled. The profiler information is stored in the user log directory. By default, profiling is not enabled for the job.
+
+Once the user has enabled profiling, the configuration property `mapreduce.task.profile.{maps|reduces}` can be used to set the ranges of MapReduce tasks to profile. The value can be set using the API Configuration.set(`MRJobConfig.NUM_{MAP|REDUCE}_PROFILES`, String). By default, the specified range is `0-2`.
+
+The user can also specify the profiler configuration arguments by setting the configuration property `mapreduce.task.profile.params`. The value can be specified using the API Configuration.set(`MRJobConfig.TASK_PROFILE_PARAMS`, String). If the string contains a `%s`, it will be replaced with the name of the profiling output file when the task runs. These parameters are passed to the task child JVM on the command line. The default value for the profiling parameters is `-agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=%s`.
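
The `%s` substitution can be pictured with a plain string replacement. In the sketch below the class name, the method name, and the output file name `profile.out` are all hypothetical; only the default parameter string is taken from the text above.

```java
// Illustrative sketch: ProfileParamsSketch and expand are hypothetical names.
final class ProfileParamsSketch {
    static final String DEFAULT_PARAMS =
        "-agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=%s";

    /** Substitutes the profiling output file name for the %s placeholder. */
    static String expand(String params, String outputFile) {
        return params.replace("%s", outputFile);
    }

    public static void main(String[] args) {
        // "profile.out" is an invented file name used only for this example.
        System.out.println(expand(DEFAULT_PARAMS, "profile.out"));
    }
}
```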
+
+#### Debugging
+
+The MapReduce framework provides a facility to run user-provided scripts for debugging. When a MapReduce task fails, a user can run a debug script, to process task logs for example. The script is given access to the task's stdout and stderr outputs, syslog and jobconf. The output from the debug script's stdout and stderr is displayed on the console diagnostics and also as part of the job UI.
+
+In the following sections we discuss how to submit a debug script with a job. The script file needs to be distributed and submitted to the framework.
+
+##### How to distribute the script file:
+
+The user needs to use [DistributedCache](#DistributedCache) to *distribute* and *symlink* the script file.
+
+##### How to submit the script:
+
+A quick way to submit the debug script is to set values for the properties `mapreduce.map.debug.script` and `mapreduce.reduce.debug.script`, for debugging map and reduce tasks respectively. These properties can also be set by using APIs
+[Configuration.set(`MRJobConfig.MAP_DEBUG_SCRIPT`, String)](../../api/org/apache/hadoop/conf/Configuration.html) and
+[Configuration.set(`MRJobConfig.REDUCE_DEBUG_SCRIPT`, String)](../../api/org/apache/hadoop/conf/Configuration.html).
+In streaming mode, a debug script can be submitted with the command-line options `-mapdebug` and `-reducedebug`, for debugging map and reduce tasks respectively.
+
+The arguments to the script are the task's stdout, stderr, syslog and jobconf files. The debug command, run on the node where the MapReduce task failed, is:<br/>
+`$script $stdout $stderr $syslog $jobconf`
+
+Pipes programs have the c++ program name as a fifth argument for the command. Thus for the pipes programs the command is<br/>
+`$script $stdout $stderr $syslog $jobconf $program`
+
+##### Default Behavior:
+
+For pipes, a default script is run that processes core dumps under gdb, prints the stack trace, and gives information about running threads.
+
+#### Data Compression
+
+Hadoop MapReduce provides facilities for the application-writer to specify compression for both intermediate map-outputs and the job-outputs, i.e. the output of the reduces. It also comes bundled with a
+[CompressionCodec](../../api/org/apache/hadoop/io/compress/CompressionCodec.html)
+implementation for the
+[zlib](http://www.zlib.net) compression algorithm. The
+[gzip](http://www.gzip.org),
+[bzip2](http://www.bzip.org),
+[snappy](http://code.google.com/p/snappy/), and
+[lz4](http://code.google.com/p/lz4/) file formats are also supported.
+
+Hadoop also provides native implementations of the above compression codecs for reasons of both performance (zlib) and non-availability of Java libraries. More details on their usage and availability are available
+[here](../../hadoop-project-dist/hadoop-common/NativeLibraries.html).
+
+##### Intermediate Outputs
+
+Applications can control compression of intermediate map-outputs via the Configuration.set(`MRJobConfig.MAP_OUTPUT_COMPRESS`, boolean) api and the `CompressionCodec` to be used via the Configuration.set(`MRJobConfig.MAP_OUTPUT_COMPRESS_CODEC`, Class) api.
+
+##### Job Outputs
+
+Applications can control compression of job-outputs via the
+[FileOutputFormat.setCompressOutput(Job, boolean)](../../api/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.html)
+api and the `CompressionCodec` to be used can be specified via the FileOutputFormat.setOutputCompressorClass(Job, Class) api.
+
+If the job outputs are to be stored in the
+[SequenceFileOutputFormat](../../api/org/apache/hadoop/mapreduce/lib/output/SequenceFileOutputFormat.html),
+the required `SequenceFile.CompressionType` (i.e. `RECORD` / `BLOCK` - defaults to `RECORD`) can be specified via the SequenceFileOutputFormat.setOutputCompressionType(Job, SequenceFile.CompressionType) api.
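
The settings described in this section could be combined roughly as follows. This is a minimal sketch, assuming the Hadoop MapReduce client libraries are on the classpath; the class name `CompressionSetupSketch`, the job name and the choice of codecs are illustrative assumptions. It uses the typed setters `Configuration.setBoolean` and `Configuration.setClass` for the two map-output properties.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.DefaultCodec;
import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.MRJobConfig;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

// Hypothetical helper class; only the Hadoop calls themselves come from the text above.
final class CompressionSetupSketch {

  static Job configure() throws IOException {
    Configuration conf = new Configuration();

    // Intermediate map outputs: enable compression and pick a codec
    // (DefaultCodec is the zlib-based codec; any CompressionCodec could be used).
    conf.setBoolean(MRJobConfig.MAP_OUTPUT_COMPRESS, true);
    conf.setClass(MRJobConfig.MAP_OUTPUT_COMPRESS_CODEC,
        DefaultCodec.class, CompressionCodec.class);

    Job job = Job.getInstance(conf, "compression sketch");

    // Job outputs: enable compression and pick a codec.
    FileOutputFormat.setCompressOutput(job, true);
    FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);

    // Only relevant when the job writes SequenceFiles: choose BLOCK instead of RECORD.
    job.setOutputFormatClass(SequenceFileOutputFormat.class);
    SequenceFileOutputFormat.setOutputCompressionType(job, SequenceFile.CompressionType.BLOCK);

    return job;
  }
}
```

The map-output settings must be applied to the `Configuration` before the `Job` is created from it, or to `job.getConfiguration()` afterwards, because `Job.getInstance` takes a copy of the configuration.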
+
+#### Skipping Bad Records
+
+Hadoop provides an option where a certain set of bad input records can be skipped when processing map inputs. Applications can control this feature through the
+[SkipBadRecords](../../api/org/apache/hadoop/mapred/SkipBadRecords.html) class.
+
+This feature can be used when map tasks crash deterministically on certain input. This usually happens due to bugs in the map function. Usually, the user would have to fix these bugs. This is, however, not possible sometimes. The bug may be in third party libraries, for example, for which the source code is not available. In such cases, the task never completes successfully even after multiple attempts, and the job fails. With this feature, only a small portion of data surrounding the bad records is lost, which may be acceptable for some applications (those performing statistical analysis on very large data, for example).
+
+By default this feature is disabled. For enabling it, refer to
+[SkipBadRecords.setMapperMaxSkipRecords(Configuration, long)](../../api/org/apache/hadoop/mapred/SkipBadRecords.html) and
+[SkipBadRecords.setReducerMaxSkipGroups(Configuration, long)](../../api/org/apache/hadoop/mapred/SkipBadRecords.html).
+
+With this feature enabled, the framework gets into 'skipping mode' after a certain number of map failures. For more details, see
+[SkipBadRecords.setAttemptsToStartSkipping(Configuration, int)](../../api/org/apache/hadoop/mapred/SkipBadRecords.html).
+In 'skipping mode', map tasks maintain the range of records being processed. To do this, the framework relies on the processed record counter. See
+[SkipBadRecords.COUNTER\_MAP\_PROCESSED\_RECORDS](../../api/org/apache/hadoop/mapred/SkipBadRecords.html) and
+[SkipBadRecords.COUNTER\_REDUCE\_PROCESSED\_GROUPS](../../api/org/apache/hadoop/mapred/SkipBadRecords.html).
+This counter enables the framework to know how many records have been processed successfully, and hence, what record range caused a task to crash. On further attempts, this range of records is skipped.
+
+The number of records skipped depends on how frequently the processed record counter is incremented by the application. It is recommended that this counter be incremented after every record is processed. This may not be possible in some applications that typically batch their processing. In such cases, the framework may skip additional records surrounding the bad record. Users can control the number of skipped records through
+[SkipBadRecords.setMapperMaxSkipRecords(Configuration, long)](../../api/org/apache/hadoop/mapred/SkipBadRecords.html) and
+[SkipBadRecords.setReducerMaxSkipGroups(Configuration, long)](../../api/org/apache/hadoop/mapred/SkipBadRecords.html).
+The framework tries to narrow the range of skipped records using a binary search-like approach. The skipped range is divided into two halves and only one half gets executed. On subsequent failures, the framework figures out which half contains bad records. A task will be re-executed till the acceptable skipped value is met or all task attempts are exhausted. To increase the number of task attempts, use
+[Job.setMaxMapAttempts(int)](../../api/org/apache/hadoop/mapreduce/Job.html) and
+[Job.setMaxReduceAttempts(int)](../../api/org/apache/hadoop/mapreduce/Job.html).
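
The halving strategy described above can be illustrated with a small stand-alone simulation. This is a simplified sketch, not the framework's implementation: it assumes a single bad record, treats each loop iteration as one task attempt, and assumes an attempt fails exactly when the half it executes contains the bad record. The names `SkipRangeSketch` and `narrow` are hypothetical.

```java
import java.util.function.LongPredicate;

// Hypothetical simulation of narrowing a failing record range by repeated halving.
final class SkipRangeSketch {

  // Returns {start, endExclusive} of the range that would end up being skipped.
  static long[] narrow(long start, long end, long maxSkip, int maxAttempts, LongPredicate isBad) {
    int attempts = 0;
    while (end - start > maxSkip && attempts < maxAttempts) {
      long mid = start + (end - start) / 2;
      attempts++;
      // "Execute" only the first half; it fails if it contains a bad record.
      boolean firstHalfFails = false;
      for (long record = start; record < mid; record++) {
        if (isBad.test(record)) {
          firstHalfFails = true;
          break;
        }
      }
      if (firstHalfFails) {
        end = mid;    // the bad record is in the first half
      } else {
        start = mid;  // the first half is clean, so the bad record is in the second half
      }
    }
    return new long[] {start, end};
  }

  public static void main(String[] args) {
    long[] range = narrow(0, 1000, 1, 20, record -> record == 637);
    System.out.println("skipped range: [" + range[0] + ", " + range[1] + ")");
    // prints: skipped range: [637, 638)
  }
}
```

With only 3 attempts allowed, the same call stops at the range `[625, 750)`, which shows why a larger number of task attempts lets the framework lose fewer records around the bad one.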
+
+Skipped records are written to HDFS in the sequence file format, for later analysis. The location can be changed through
+[SkipBadRecords.setSkipOutputPath(JobConf, Path)](../../api/org/apache/hadoop/mapred/SkipBadRecords.html).
+
+### Example: WordCount v2.0
+
+Here is a more complete `WordCount` which uses many of the features provided by the MapReduce framework we discussed so far.
+
+This needs the HDFS to be up and running, especially for the `DistributedCache`-related features. Hence it only works with a
+[pseudo-distributed](../../hadoop-project-dist/hadoop-common/SingleCluster.html) or
+[fully-distributed](../../hadoop-project-dist/hadoop-common/ClusterSetup.html) Hadoop installation.
+
+#### Source Code
+
+```java
+import java.io.BufferedReader;
+import java.io.FileReader;
+import java.io.IOException;
+import java.net.URI;
+import java.util.ArrayList;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+import java.util.StringTokenizer;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.IntWritable;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.mapreduce.Job;
+import org.apache.hadoop.mapreduce.Mapper;
+import org.apache.hadoop.mapreduce.Reducer;
+import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
+import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
+import org.apache.hadoop.mapreduce.Counter;
+import org.apache.hadoop.util.GenericOptionsParser;
+import org.apache.hadoop.util.StringUtils;
+
+public class WordCount2 {
+
+  public static class TokenizerMapper
+       extends Mapper<Object, Text, Text, IntWritable>{
+
+    static enum CountersEnum { INPUT_WORDS }
+
+    private final static IntWritable one = new IntWritable(1);
+    private Text word = new Text();
+
+    private boolean caseSensitive;
+    private Set<String> patternsToSkip = new HashSet<String>();
+
+    private Configuration conf;
+    private BufferedReader fis;
+
+    @Override
+    public void setup(Context context) throws IOException,
+        InterruptedException {
+      conf = context.getConfiguration();
+      caseSensitive = conf.getBoolean("wordcount.case.sensitive", true);
+      if (conf.getBoolean("wordcount.skip.patterns", false)) {
+        URI[] patternsURIs = Job.getInstance(conf).getCacheFiles();
+        for (URI patternsURI : patternsURIs) {
+          Path patternsPath = new Path(patternsURI.getPath());
+          String patternsFileName = patternsPath.getName().toString();
+          parseSkipFile(patternsFileName);
+        }
+      }
+    }
+
+    private void parseSkipFile(String fileName) {
+      try {
+        fis = new BufferedReader(new FileReader(fileName));
+        String pattern = null;
+        while ((pattern = fis.readLine()) != null) {
+          patternsToSkip.add(pattern);
+        }
+      } catch (IOException ioe) {
+        System.err.println("Caught exception while parsing the cached file '"
+            + fileName + "': " + StringUtils.stringifyException(ioe));
+      }
+    }
+
+    @Override
+    public void map(Object key, Text value, Context context
+                    ) throws IOException, InterruptedException {
+      String line = (caseSensitive) ?
+          value.toString() : value.toString().toLowerCase();
+      for (String pattern : patternsToSkip) {
+        line = line.replaceAll(pattern, "");
+      }
+      StringTokenizer itr = new StringTokenizer(line);
+      while (itr.hasMoreTokens()) {
+        word.set(itr.nextToken());
+        context.write(word, one);
+        Counter counter = context.getCounter(CountersEnum.class.getName(),
+            CountersEnum.INPUT_WORDS.toString());
+        counter.increment(1);
+      }
+    }
+  }
+
+  public static class IntSumReducer
+       extends Reducer<Text,IntWritable,Text,IntWritable> {
+    private IntWritable result = new IntWritable();
+
+    public void reduce(Text key, Iterable<IntWritable> values,
+                       Context context
+                       ) throws IOException, InterruptedException {
+      int sum = 0;
+      for (IntWritable val : values) {
+        sum += val.get();
+      }
+      result.set(sum);
+      context.write(key, result);
+    }
+  }
+
+  public static void main(String[] args) throws Exception {
+    Configuration conf = new Configuration();
+    GenericOptionsParser optionParser = new GenericOptionsParser(conf, args);
+    String[] remainingArgs = optionParser.getRemainingArgs();
+    if ((remainingArgs.length != 2) && (remainingArgs.length != 4)) {
+      System.err.println("Usage: wordcount <in> <out> [-skip skipPatternFile]");
+      System.exit(2);
+    }
+    Job job = Job.getInstance(conf, "word count");
+    job.setJarByClass(WordCount2.class);
+    job.setMapperClass(TokenizerMapper.class);
+    job.setCombinerClass(IntSumReducer.class);
+    job.setReducerClass(IntSumReducer.class);
+    job.setOutputKeyClass(Text.class);
+    job.setOutputValueClass(IntWritable.class);
+
+    List<String> otherArgs = new ArrayList<String>();
+    for (int i=0; i < remainingArgs.length; ++i) {
+      if ("-skip".equals(remainingArgs[i])) {
+        job.addCacheFile(new Path(remainingArgs[++i]).toUri());
+        job.getConfiguration().setBoolean("wordcount.skip.patterns", true);
+      } else {
+        otherArgs.add(remainingArgs[i]);
+      }
+    }
+    FileInputFormat.addInputPath(job, new Path(otherArgs.get(0)));
+    FileOutputFormat.setOutputPath(job, new Path(otherArgs.get(1)));
+
+    System.exit(job.waitForCompletion(true) ? 0 : 1);
+  }
+}
+```
+
+#### Sample Runs
+
+Sample text-files as input:
+
+    $ bin/hadoop fs -ls /user/joe/wordcount/input/
+    /user/joe/wordcount/input/file01
+    /user/joe/wordcount/input/file02
+    
+    $ bin/hadoop fs -cat /user/joe/wordcount/input/file01
+    Hello World, Bye World!
+    
+    $ bin/hadoop fs -cat /user/joe/wordcount/input/file02
+    Hello Hadoop, Goodbye to hadoop.
+
+Run the application:
+
+    $ bin/hadoop jar wc.jar WordCount2 /user/joe/wordcount/input /user/joe/wordcount/output
+
+Output:
+
+    $ bin/hadoop fs -cat /user/joe/wordcount/output/part-r-00000
+    Bye 1
+    Goodbye 1
+    Hadoop, 1
+    Hello 2
+    World! 1
+    World, 1
+    hadoop. 1
+    to 1
+
+Notice that the inputs differ from the first version we looked at, and how they affect the outputs.
+
+Now, let's plug in a pattern-file which lists the word-patterns to be ignored, via the `DistributedCache`.
+
+    $ bin/hadoop fs -cat /user/joe/wordcount/patterns.txt
+    \.
+    \,
+    \!
+    to
+
+Run it again, this time with more options:
+
+    $ bin/hadoop jar wc.jar WordCount2 -Dwordcount.case.sensitive=true /user/joe/wordcount/input /user/joe/wordcount/output -skip /user/joe/wordcount/patterns.txt
+
+As expected, the output:
+
+    $ bin/hadoop fs -cat /user/joe/wordcount/output/part-r-00000
+    Bye 1
+    Goodbye 1
+    Hadoop 1
+    Hello 2
+    World 2
+    hadoop 1
+
+Run it once more, this time switching off case-sensitivity:
+
+    $ bin/hadoop jar wc.jar WordCount2 -Dwordcount.case.sensitive=false /user/joe/wordcount/input /user/joe/wordcount/output -skip /user/joe/wordcount/patterns.txt
+
+Sure enough, the output:
+
+    $ bin/hadoop fs -cat /user/joe/wordcount/output/part-r-00000
+    bye 1
+    goodbye 1
+    hadoop 2
+    hello 2
+    world 2
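
The effect of the skip patterns and the case-sensitivity switch in the runs above can be checked locally without a cluster. The following is a minimal sketch that reuses only the string handling from the mapper (`toLowerCase`, `replaceAll`, `StringTokenizer`) and counts words in memory. The class `SkipPatternSketch` and its `count` method are hypothetical, and a `TreeMap` stands in for the sorted reducer output.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for the map and reduce steps of WordCount2.
final class SkipPatternSketch {

  static Map<String, Integer> count(List<String> lines, List<String> patterns,
                                    boolean caseSensitive) {
    Map<String, Integer> counts = new TreeMap<>();
    for (String raw : lines) {
      String line = caseSensitive ? raw : raw.toLowerCase();
      // Each pattern is a regular expression; every match is removed from the line.
      for (String pattern : patterns) {
        line = line.replaceAll(pattern, "");
      }
      StringTokenizer itr = new StringTokenizer(line);
      while (itr.hasMoreTokens()) {
        counts.merge(itr.nextToken(), 1, Integer::sum);
      }
    }
    return counts;
  }

  public static void main(String[] args) {
    List<String> input = Arrays.asList(
        "Hello World, Bye World!",
        "Hello Hadoop, Goodbye to hadoop.");
    // Same four patterns as patterns.txt above.
    List<String> patterns = Arrays.asList("\\.", "\\,", "\\!", "to");

    System.out.println(count(input, patterns, true));
    // prints: {Bye=1, Goodbye=1, Hadoop=1, Hello=2, World=2, hadoop=1}
    System.out.println(count(input, patterns, false));
    // prints: {bye=1, goodbye=1, hadoop=2, hello=2, world=2}
  }
}
```

Note that the patterns are applied to the whole line, not to individual words, so a pattern such as `to` would also remove those two letters from inside any longer word that contains them.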
+
+#### Highlights
+
+The second version of `WordCount` improves upon the previous one by using some features offered by the MapReduce framework:
+
+* Demonstrates how applications can access configuration parameters in the
+  `setup` method of the `Mapper` (and `Reducer`)
+  implementations.
+
+* Demonstrates how the `DistributedCache` can be used to distribute
+  read-only data needed by the jobs. Here it allows the user to specify
+  word-patterns to skip while counting.
+
+* Demonstrates the utility of the `GenericOptionsParser` to handle
+  generic Hadoop command-line options.
+
+* Demonstrates how applications can use `Counters` and how they can set
+  application-specific status information passed to the `map` (and
+  `reduce`) method.
+
+*Java and JNI are trademarks or registered trademarks of Oracle America, Inc. in the United States and other countries.*

+ 69 - 0
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/MapReduce_Compatibility_Hadoop1_Hadoop2.md

@@ -0,0 +1,69 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+Apache Hadoop MapReduce - Migrating from Apache Hadoop 1.x to Apache Hadoop 2.x
+===============================================================================
+
+Introduction
+------------
+
+This document provides information for users to migrate their Apache Hadoop MapReduce applications from Apache Hadoop 1.x to Apache Hadoop 2.x.
+
+In Apache Hadoop 2.x we have spun off resource management capabilities into Apache Hadoop YARN, a general purpose, distributed application management framework while Apache Hadoop MapReduce (aka MRv2) remains as a pure distributed computation framework.
+
+In general, the previous MapReduce runtime (aka MRv1) has been reused and no major surgery has been conducted on it. Therefore, MRv2 is able to ensure satisfactory compatibility with MRv1 applications. However, due to some improvements and code refactorings, a few APIs have been rendered backward-incompatible.
+
+The remainder of this page will discuss the scope and the level of backward compatibility that we support in Apache Hadoop MapReduce 2.x (MRv2).
+
+Binary Compatibility
+--------------------
+
+First, we ensure binary compatibility to the applications that use old **mapred** APIs. This means that applications which were built against MRv1 **mapred** APIs can run directly on YARN without recompilation, merely by pointing them to an Apache Hadoop 2.x cluster via configuration.
+
+Source Compatibility
+--------------------
+
+We cannot ensure complete binary compatibility with the applications that use **mapreduce** APIs, as these APIs have evolved a lot since MRv1. However, we ensure source compatibility for **mapreduce** APIs that break binary compatibility. In other words, users should recompile their applications that use **mapreduce** APIs against MRv2 jars. One notable binary-incompatible change affects `Counter` and `CounterGroup`.
+
+Not Supported
+-------------
+
+MRAdmin has been removed in MRv2 because `mradmin` commands no longer exist. They have been replaced by the commands in `rmadmin`. We support neither binary compatibility nor source compatibility for the applications that use this class directly.
+
+Tradeoffs between MRv1 Users and Early MRv2 Adopters
+----------------------------------------------------
+
+Unfortunately, maintaining binary compatibility for MRv1 applications may lead to binary incompatibility issues for early MRv2 adopters, in particular Hadoop 0.23 users. For **mapred** APIs, we have chosen to be compatible with MRv1 applications, which have a larger user base. For **mapreduce** APIs, if they don't significantly break Hadoop 0.23 applications, we still change them to be compatible with MRv1 applications. Below is the list of MapReduce APIs which are incompatible with Hadoop 0.23.
+
+| **Problematic Function** | **Incompatibility Issue** |
+|:---- |:---- |
+| `org.apache.hadoop.util.ProgramDriver#drive` | Return type changes from `void` to `int` |
+| `org.apache.hadoop.mapred.jobcontrol.Job#getMapredJobID` | Return type changes from `String` to `JobID` |
+| `org.apache.hadoop.mapred.TaskReport#getTaskId` | Return type changes from `String` to `TaskID` |
+| `org.apache.hadoop.mapred.ClusterStatus#UNINITIALIZED_MEMORY_VALUE` | Data type changes from `long` to `int` |
+| `org.apache.hadoop.mapreduce.filecache.DistributedCache#getArchiveTimestamps` | Return type changes from `long[]` to `String[]` |
+| `org.apache.hadoop.mapreduce.filecache.DistributedCache#getFileTimestamps` | Return type changes from `long[]` to `String[]` |
+| `org.apache.hadoop.mapreduce.Job#failTask` | Return type changes from `void` to `boolean` |
+| `org.apache.hadoop.mapreduce.Job#killTask` | Return type changes from `void` to `boolean` |
+| `org.apache.hadoop.mapreduce.Job#getTaskCompletionEvents` | Return type changes from `o.a.h.mapred.TaskCompletionEvent[]` to `o.a.h.mapreduce.TaskCompletionEvent[]` |
+
+Malicious
+---------
+
+For the users who are going to try `hadoop-examples-1.x.x.jar` on YARN, please note that `hadoop jar hadoop-examples-1.x.x.jar` will still use `hadoop-mapreduce-examples-2.x.x.jar`, which is installed together with other MRv2 jars. By default, Hadoop framework jars appear before the users' jars in the classpath, so the classes from the 2.x.x jar will still be picked up. Users should remove `hadoop-mapreduce-examples-2.x.x.jar` from the classpath of all the nodes in a cluster. Otherwise, users need to set `HADOOP_USER_CLASSPATH_FIRST=true` and `HADOOP_CLASSPATH=...:hadoop-examples-1.x.x.jar` to run their target examples jar, and add the following configuration in `mapred-site.xml` to make the processes in YARN containers pick this jar as well.
+
+        <property>
+            <name>mapreduce.job.user.classpath.first</name>
+            <value>true</value>
+        </property>

+ 2397 - 0
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/MapredAppMasterRest.md

@@ -0,0 +1,2397 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+MapReduce Application Master REST API's.
+========================================
+
+* [MapReduce Application Master REST API's.](#MapReduce_Application_Master_REST_APIs.)
+    * [Overview](#Overview)
+    * [Mapreduce Application Master Info API](#Mapreduce_Application_Master_Info_API)
+    * [Jobs API](#Jobs_API)
+    * [Job API](#Job_API)
+    * [Job Attempts API](#Job_Attempts_API)
+    * [Job Counters API](#Job_Counters_API)
+    * [Job Conf API](#Job_Conf_API)
+    * [Tasks API](#Tasks_API)
+    * [Task API](#Task_API)
+    * [Task Counters API](#Task_Counters_API)
+    * [Task Attempts API](#Task_Attempts_API)
+    * [Task Attempt API](#Task_Attempt_API)
+    * [Task Attempt Counters API](#Task_Attempt_Counters_API)
+
+Overview
+--------
+
+The MapReduce Application Master REST API's allow the user to get status on the running MapReduce application master. Currently this is equivalent to a running MapReduce job. The information includes the jobs the app master is running and all the job particulars like tasks, counters, configuration, attempts, etc. The application master should be accessed via the proxy. This proxy is configurable to run either on the resource manager or on a separate host. The proxy URL usually looks like: `http://<proxy http address:port>/proxy/appid`.
+
+Mapreduce Application Master Info API
+-------------------------------------
+
+The MapReduce application master information resource provides overall information about that mapreduce application master. This includes application id, time it was started, user, name, etc.
+
+### URI
+
+Both of the following URI's give you the MapReduce application master information, from an application id identified by the appid value.
+
+      * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce
+      * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/info
+
+### HTTP Operations Supported
+
+      * GET
+
+### Query Parameters Supported
+
+      None
+
+### Elements of the *info* object
+
+When you make a request for the mapreduce application master information, the information will be returned as an info object.
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| appId | string | The application id |
+| startedOn | long | The time the application started (in ms since epoch) |
+| name | string | The name of the application |
+| user | string | The user name of the user who started the application |
+| elapsedTime | long | The time since the application was started (in ms) |
+
+### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0003/ws/v1/mapreduce/info
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    {   
+      "info" : {
+          "appId" : "application_1326232085508_0003",
+          "startedOn" : 1326238244047,
+          "user" : "user1",
+          "name" : "Sleep job",
+          "elapsedTime" : 32374
+       }
+    }
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0003/ws/v1/mapreduce/info
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 223
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <info>
+      <appId>application_1326232085508_0003</appId>
+      <name>Sleep job</name>
+      <user>user1</user>
+      <startedOn>1326238244047</startedOn>
+      <elapsedTime>32407</elapsedTime>
+    </info>
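
A request like the ones above could be issued from Java roughly as follows. This is a minimal sketch using only the JDK's `HttpURLConnection`; the class `AmInfoClientSketch`, its methods, and the host name `proxy.example.com:8088` in the usage note are assumptions made for illustration, and error handling is kept to a minimum.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical client for the application master info resource.
final class AmInfoClientSketch {

  // Builds http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/info
  static String infoUrl(String proxyHostPort, String appId) {
    return "http://" + proxyHostPort + "/proxy/" + appId + "/ws/v1/mapreduce/info";
  }

  // Issues a GET with the given Accept header and returns the response body as text.
  static String fetch(String url, String accept) throws IOException {
    HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
    conn.setRequestMethod("GET");
    conn.setRequestProperty("Accept", accept);
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
      StringBuilder body = new StringBuilder();
      String line;
      while ((line = in.readLine()) != null) {
        body.append(line).append('\n');
      }
      return body.toString();
    } finally {
      conn.disconnect();
    }
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 2) {
      System.err.println("Usage: AmInfoClientSketch <proxy host:port> <appid>");
      return;
    }
    // Pass "application/xml" instead to request the XML representation.
    System.out.println(fetch(infoUrl(args[0], args[1]), "application/json"));
  }
}
```

For example, `java AmInfoClientSketch.java proxy.example.com:8088 application_1326232085508_0003` would request the JSON form of the info object for that application, provided the application master is still running.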
+
+Jobs API
+--------
+
+The jobs resource provides a list of the jobs running on this application master. See also [Job API](#Job_API) for syntax of the job object.
+
+### URI
+
+      *  http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/jobs
+
+### HTTP Operations Supported
+
+      * GET
+
+### Query Parameters Supported
+
+      None
+
+### Elements of the *jobs* object
+
+When you make a request for the list of jobs, the information will be returned as a collection of job objects. See also [Job API](#Job_API) for syntax of the job object.
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| job | array of job objects(JSON)/Zero or more job objects(XML) | The collection of job objects |
+
+### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    {
+      "jobs" : {
+          "job" : [
+             {
+                "runningReduceAttempts" : 1,
+                "reduceProgress" : 100,
+                "failedReduceAttempts" : 0,
+                "newMapAttempts" : 0,
+                "mapsRunning" : 0,
+                "state" : "RUNNING",
+                "successfulReduceAttempts" : 0,
+                "reducesRunning" : 1,
+                "acls" : [
+                   {
+                      "value" : " ",
+                      "name" : "mapreduce.job.acl-modify-job"
+                   },
+                   {
+                      "value" : " ",
+                      "name" : "mapreduce.job.acl-view-job"
+                   }
+                ],
+                "reducesPending" : 0,
+                "user" : "user1",
+                "reducesTotal" : 1,
+                "mapsCompleted" : 1,
+                "startTime" : 1326238769379,
+                "id" : "job_1326232085508_4_4",
+                "successfulMapAttempts" : 1,
+                "runningMapAttempts" : 0,
+                "newReduceAttempts" : 0,
+                "name" : "Sleep job",
+                "mapsPending" : 0,
+                "elapsedTime" : 59377,
+                "reducesCompleted" : 0,
+                "mapProgress" : 100,
+                "diagnostics" : "",
+                "failedMapAttempts" : 0,
+                "killedReduceAttempts" : 0,
+                "mapsTotal" : 1,
+                "uberized" : false,
+                "killedMapAttempts" : 0,
+                "finishTime" : 0
+             }
+         ]
+       }
+     }
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 1214 
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <jobs>
+      <job>
+        <startTime>1326238769379</startTime>
+        <finishTime>0</finishTime>
+        <elapsedTime>59416</elapsedTime>
+        <id>job_1326232085508_4_4</id>
+        <name>Sleep job</name>
+        <user>user1</user>
+        <state>RUNNING</state>
+        <mapsTotal>1</mapsTotal>
+        <mapsCompleted>1</mapsCompleted>
+        <reducesTotal>1</reducesTotal>
+        <reducesCompleted>0</reducesCompleted>
+        <mapProgress>100.0</mapProgress>
+        <reduceProgress>100.0</reduceProgress>
+        <mapsPending>0</mapsPending>
+        <mapsRunning>0</mapsRunning>
+        <reducesPending>0</reducesPending>
+        <reducesRunning>1</reducesRunning>
+        <uberized>false</uberized>
+        <diagnostics/>
+        <newReduceAttempts>0</newReduceAttempts>
+        <runningReduceAttempts>1</runningReduceAttempts>
+        <failedReduceAttempts>0</failedReduceAttempts>
+        <killedReduceAttempts>0</killedReduceAttempts>
+        <successfulReduceAttempts>0</successfulReduceAttempts>
+        <newMapAttempts>0</newMapAttempts>
+        <runningMapAttempts>0</runningMapAttempts>
+        <failedMapAttempts>0</failedMapAttempts>
+        <killedMapAttempts>0</killedMapAttempts>
+        <successfulMapAttempts>1</successfulMapAttempts>
+        <acls>
+          <name>mapreduce.job.acl-modify-job</name>
+          <value> </value>
+        </acls>
+        <acls>
+          <name>mapreduce.job.acl-view-job</name>
+          <value> </value>
+        </acls>
+      </job>
+    </jobs>
+
+Job API
+-------
+
+A job resource contains information about a particular job that was started by this application master. Certain fields are only accessible if the user has permission, depending on the ACL settings.
+
+### URI
+
+Use the following URI to obtain a job object, for a job identified by the jobid value.
+
+      * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/jobs/{jobid}
+
+### HTTP Operations Supported
+
+      * GET
+
+### Query Parameters Supported
+
+      None
+
+### Elements of the *job* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| id | string | The job id |
+| name | string | The job name |
+| user | string | The user name |
+| state | string | The job state - valid values are: NEW, INITED, RUNNING, SUCCEEDED, FAILED, KILL\_WAIT, KILLED, ERROR |
+| startTime | long | The time the job started (in ms since epoch) |
+| finishTime | long | The time the job finished (in ms since epoch) |
+| elapsedTime | long | The elapsed time since job started (in ms) |
+| mapsTotal | int | The total number of maps |
+| mapsCompleted | int | The number of completed maps |
+| reducesTotal | int | The total number of reduces |
+| reducesCompleted | int | The number of completed reduces |
+| diagnostics | string | A diagnostic message |
+| uberized | boolean | Indicates if the job was an uber job - ran completely in the application master |
+| mapsPending | int | The number of maps still to be run |
+| mapsRunning | int | The number of running maps |
+| reducesPending | int | The number of reduces still to be run |
+| reducesRunning | int | The number of running reduces |
+| newReduceAttempts | int | The number of new reduce attempts |
+| runningReduceAttempts | int | The number of running reduce attempts |
+| failedReduceAttempts | int | The number of failed reduce attempts |
+| killedReduceAttempts | int | The number of killed reduce attempts |
+| successfulReduceAttempts | int | The number of successful reduce attempts |
+| newMapAttempts | int | The number of new map attempts |
+| runningMapAttempts | int | The number of running map attempts |
+| failedMapAttempts | int | The number of failed map attempts |
+| killedMapAttempts | int | The number of killed map attempts |
+| successfulMapAttempts | int | The number of successful map attempts |
+| acls | array of acls(json)/zero or more acls objects(xml) | A collection of acls objects |
+
+### Elements of the *acls* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| value | string | The acl value |
+| name | string | The acl name |
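
A client that polls the job resource typically needs to decide when to stop. The sketch below models the `state` values listed in the job table above as a Java enum. Treating `SUCCEEDED`, `FAILED`, `KILLED` and `ERROR` as final states is an assumption made for this illustration, not something the table itself states, and the names `JobStateSketch` and `isTerminal` are hypothetical.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical helper for interpreting the "state" element of a job object.
final class JobStateSketch {

  // The valid values of the "state" element, as listed in the job table.
  enum JobState { NEW, INITED, RUNNING, SUCCEEDED, FAILED, KILL_WAIT, KILLED, ERROR }

  // Assumption: these states are final, so polling can stop once one is reached.
  private static final Set<JobState> TERMINAL =
      EnumSet.of(JobState.SUCCEEDED, JobState.FAILED, JobState.KILLED, JobState.ERROR);

  // Throws IllegalArgumentException for a value that is not in the list above.
  static boolean isTerminal(String state) {
    return TERMINAL.contains(JobState.valueOf(state));
  }

  public static void main(String[] args) {
    for (JobState state : JobState.values()) {
      System.out.println(state + " terminal=" + isTerminal(state.name()));
    }
  }
}
```

Parsing through `JobState.valueOf` makes an unexpected value fail loudly instead of being silently treated as a running job.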
+
+### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Server: Jetty(6.1.26)
+      Content-Length: 720
+
+Response Body:
+
+    {
+       "job" : {
+          "runningReduceAttempts" : 1,
+          "reduceProgress" : 100,
+          "failedReduceAttempts" : 0,
+          "newMapAttempts" : 0,
+          "mapsRunning" : 0,
+          "state" : "RUNNING",
+          "successfulReduceAttempts" : 0,
+          "reducesRunning" : 1,
+          "acls" : [
+             {  
+                "value" : " ",
+                "name" : "mapreduce.job.acl-modify-job"
+             },
+             {  
+                "value" : " ",
+                "name" : "mapreduce.job.acl-view-job"
+             }
+          ],
+          "reducesPending" : 0,
+          "user" : "user1",
+          "reducesTotal" : 1,
+          "mapsCompleted" : 1,
+          "startTime" : 1326238769379,
+          "id" : "job_1326232085508_4_4",
+          "successfulMapAttempts" : 1,
+          "runningMapAttempts" : 0,
+          "newReduceAttempts" : 0,
+          "name" : "Sleep job",
+          "mapsPending" : 0,
+          "elapsedTime" : 59437,
+          "reducesCompleted" : 0,
+          "mapProgress" : 100,
+          "diagnostics" : "",
+          "failedMapAttempts" : 0,
+          "killedReduceAttempts" : 0,
+          "mapsTotal" : 1,
+          "uberized" : false,
+          "killedMapAttempts" : 0,
+          "finishTime" : 0
+       }
+    }
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 1201
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <job>
+      <startTime>1326238769379</startTime>
+      <finishTime>0</finishTime>
+      <elapsedTime>59474</elapsedTime>
+      <id>job_1326232085508_4_4</id>
+      <name>Sleep job</name>
+      <user>user1</user>
+      <state>RUNNING</state>
+      <mapsTotal>1</mapsTotal>
+      <mapsCompleted>1</mapsCompleted>
+      <reducesTotal>1</reducesTotal>
+      <reducesCompleted>0</reducesCompleted>
+      <mapProgress>100.0</mapProgress>
+      <reduceProgress>100.0</reduceProgress>
+      <mapsPending>0</mapsPending>
+      <mapsRunning>0</mapsRunning>
+      <reducesPending>0</reducesPending>
+      <reducesRunning>1</reducesRunning>
+      <uberized>false</uberized>
+      <diagnostics/>  
+      <newReduceAttempts>0</newReduceAttempts>
+      <runningReduceAttempts>1</runningReduceAttempts>
+      <failedReduceAttempts>0</failedReduceAttempts>
+      <killedReduceAttempts>0</killedReduceAttempts>
+      <successfulReduceAttempts>0</successfulReduceAttempts>
+      <newMapAttempts>0</newMapAttempts>
+      <runningMapAttempts>0</runningMapAttempts>
+      <failedMapAttempts>0</failedMapAttempts>
+      <killedMapAttempts>0</killedMapAttempts>
+      <successfulMapAttempts>1</successfulMapAttempts>
+      <acls>
+        <name>mapreduce.job.acl-modify-job</name>
+        <value> </value>
+      </acls>
+      <acls>
+        <name>mapreduce.job.acl-view-job</name>
+        <value> </value>
+      </acls>
+    </job>
+
+Job Attempts API
+----------------
+
+With the job attempts API, you can obtain a collection of resources that represent the job attempts. When you run a GET operation on this resource, you obtain a collection of Job Attempt Objects.
+
+### URI
+
+      * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/jobs/{jobid}/jobattempts
+
+### HTTP Operations Supported
+
+      * GET
+
+### Query Parameters Supported
+
+      None
+
+### Elements of the *jobAttempts* object
+
+When you make a request for the list of job attempts, the information will be returned as an array of job attempt objects.
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| jobAttempt | array of job attempt objects(JSON)/zero or more job attempt objects(XML) | The collection of job attempt objects |
+
+### Elements of the *jobAttempt* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| id | string | The job attempt id |
+| nodeId | string | The node id of the node the attempt ran on |
+| nodeHttpAddress | string | The node http address of the node the attempt ran on |
+| logsLink | string | The http link to the job attempt logs |
+| containerId | string | The id of the container for the job attempt |
+| startTime | long | The start time of the attempt (in ms since epoch) |
+
+### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/jobattempts
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    {
+       "jobAttempts" : {
+          "jobAttempt" : [
+             {    
+                "nodeId" : "host.domain.com:8041",
+                "nodeHttpAddress" : "host.domain.com:8042",
+                "startTime" : 1326238773493,
+                "id" : 1, 
+                "logsLink" : "http://host.domain.com:8042/node/containerlogs/container_1326232085508_0004_01_000001",
+                "containerId" : "container_1326232085508_0004_01_000001"
+             }  
+          ]
+       }
+    }
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/jobattempts
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 498
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <jobAttempts>
+      <jobAttempt>
+        <nodeHttpAddress>host.domain.com:8042</nodeHttpAddress>
+        <nodeId>host.domain.com:8041</nodeId>
+        <id>1</id>
+        <startTime>1326238773493</startTime>
+        <containerId>container_1326232085508_0004_01_000001</containerId>
+        <logsLink>http://host.domain.com:8042/node/containerlogs/container_1326232085508_0004_01_000001</logsLink>
+      </jobAttempt>
+    </jobAttempts>
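As a rough illustration of consuming the job attempts response, the following Python sketch parses a copy of the JSON body shown above and converts `startTime` (milliseconds since the epoch) to a UTC timestamp. The `summarize_attempts` helper and the `SAMPLE` constant are hypothetical names introduced for this example only; they are not part of Hadoop.

```python
import json
from datetime import datetime, timedelta, timezone

# Copy of the JSON response body from the example above.
SAMPLE = """
{
   "jobAttempts" : {
      "jobAttempt" : [
         {
            "nodeId" : "host.domain.com:8041",
            "nodeHttpAddress" : "host.domain.com:8042",
            "startTime" : 1326238773493,
            "id" : 1,
            "logsLink" : "http://host.domain.com:8042/node/containerlogs/container_1326232085508_0004_01_000001",
            "containerId" : "container_1326232085508_0004_01_000001"
         }
      ]
   }
}
"""

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def summarize_attempts(body):
    """Return (id, start time as a UTC datetime, logs link) per attempt."""
    attempts = json.loads(body)["jobAttempts"]["jobAttempt"]
    return [
        # startTime is documented as milliseconds since the epoch.
        (a["id"], EPOCH + timedelta(milliseconds=a["startTime"]), a["logsLink"])
        for a in attempts
    ]


for attempt_id, started, logs in summarize_attempts(SAMPLE):
    print(attempt_id, started.isoformat(), logs)
```

Adding a `timedelta` to a fixed epoch keeps the millisecond arithmetic exact, which a float division by 1000 would not guarantee.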
+
+Job Counters API
+----------------
+
+With the job counters API, you can obtain a collection of resources that represent all the counters for that job.
+
+### URI
+
+      * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/jobs/{jobid}/counters
+
+### HTTP Operations Supported
+
+      * GET
+
+### Query Parameters Supported
+
+      None
+
+### Elements of the *jobCounters* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| id | string | The job id |
+| counterGroup | array of counterGroup objects(JSON)/zero or more counterGroup objects(XML) | A collection of counter group objects |
+
+### Elements of the *counterGroup* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| counterGroupName | string | The name of the counter group |
+| counter | array of counter objects(JSON)/zero or more counter objects(XML) | A collection of counter objects |
+
+### Elements of the *counter* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| name | string | The name of the counter |
+| reduceCounterValue | long | The counter value of reduce tasks |
+| mapCounterValue | long | The counter value of map tasks |
+| totalCounterValue | long | The counter value of all tasks |
+
+### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/counters
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    {
+       "jobCounters" : {
+          "id" : "job_1326232085508_4_4",
+          "counterGroup" : [
+             {
+                "counterGroupName" : "Shuffle Errors",
+                "counter" : [
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "BAD_ID"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "CONNECTION"
+                   }, 
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "IO_ERROR"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "WRONG_LENGTH"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "WRONG_MAP"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "WRONG_REDUCE"
+                   }
+                ]
+             }, 
+             {  
+                "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
+                "counter" : [
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 2483,
+                      "name" : "FILE_BYTES_READ"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 108763,
+                      "name" : "FILE_BYTES_WRITTEN"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "FILE_READ_OPS"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "FILE_LARGE_READ_OPS"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "FILE_WRITE_OPS"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 48,
+                      "name" : "HDFS_BYTES_READ"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "HDFS_BYTES_WRITTEN"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 1,
+                      "name" : "HDFS_READ_OPS"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "HDFS_LARGE_READ_OPS"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "HDFS_WRITE_OPS"
+                   }
+                ]
+             }, 
+             {  
+                "counterGroupName" : "org.apache.hadoop.mapreduce.TaskCounter",
+                "counter" : [
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 1,
+                      "name" : "MAP_INPUT_RECORDS"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 1200,
+                      "name" : "MAP_OUTPUT_RECORDS"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 4800,
+                      "name" : "MAP_OUTPUT_BYTES"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 2235,
+                      "name" : "MAP_OUTPUT_MATERIALIZED_BYTES"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 48,
+                      "name" : "SPLIT_RAW_BYTES"
+                   }, 
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "COMBINE_INPUT_RECORDS"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "COMBINE_OUTPUT_RECORDS"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 460,
+                      "name" : "REDUCE_INPUT_GROUPS"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 2235,
+                      "name" : "REDUCE_SHUFFLE_BYTES"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 460,
+                      "name" : "REDUCE_INPUT_RECORDS"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "REDUCE_OUTPUT_RECORDS"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 1200,
+                      "name" : "SPILLED_RECORDS"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 1,
+                      "name" : "SHUFFLED_MAPS"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "FAILED_SHUFFLE"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 1,
+                      "name" : "MERGED_MAP_OUTPUTS"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 58,
+                      "name" : "GC_TIME_MILLIS"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 1580,
+                      "name" : "CPU_MILLISECONDS"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 462643200,
+                      "name" : "PHYSICAL_MEMORY_BYTES"
+                   }, 
+                   {   
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 2149728256,
+                      "name" : "VIRTUAL_MEMORY_BYTES"
+                   }, 
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 357957632,
+                      "name" : "COMMITTED_HEAP_BYTES"
+                   }
+                ]
+             },
+             {  
+                "counterGroupName" : "org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter",
+                "counter" : [
+                   {  
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "BYTES_READ"
+                   }
+                ]
+             },
+             {
+                "counterGroupName" : "org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter",
+                "counter" : [
+                   {  
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "BYTES_WRITTEN"
+                   }
+                ]
+             }
+          ]
+       }
+    }
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/counters
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 7027
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <jobCounters>
+      <id>job_1326232085508_4_4</id>
+      <counterGroup>
+        <counterGroupName>Shuffle Errors</counterGroupName>
+        <counter>
+          <name>BAD_ID</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>CONNECTION</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>IO_ERROR</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>WRONG_LENGTH</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>WRONG_MAP</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>WRONG_REDUCE</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+      </counterGroup>
+      <counterGroup>
+        <counterGroupName>org.apache.hadoop.mapreduce.FileSystemCounter</counterGroupName>
+        <counter>
+          <name>FILE_BYTES_READ</name>
+          <totalCounterValue>2483</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>FILE_BYTES_WRITTEN</name>
+          <totalCounterValue>108763</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>FILE_READ_OPS</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>FILE_LARGE_READ_OPS</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>FILE_WRITE_OPS</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>HDFS_BYTES_READ</name>
+          <totalCounterValue>48</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>HDFS_BYTES_WRITTEN</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>HDFS_READ_OPS</name>
+          <totalCounterValue>1</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>HDFS_LARGE_READ_OPS</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>HDFS_WRITE_OPS</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+      </counterGroup>
+      <counterGroup>
+        <counterGroupName>org.apache.hadoop.mapreduce.TaskCounter</counterGroupName> 
+        <counter>
+          <name>MAP_INPUT_RECORDS</name>
+          <totalCounterValue>1</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>MAP_OUTPUT_RECORDS</name>
+          <totalCounterValue>1200</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>MAP_OUTPUT_BYTES</name>
+          <totalCounterValue>4800</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>MAP_OUTPUT_MATERIALIZED_BYTES</name>
+          <totalCounterValue>2235</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>SPLIT_RAW_BYTES</name>
+          <totalCounterValue>48</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>COMBINE_INPUT_RECORDS</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>COMBINE_OUTPUT_RECORDS</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>REDUCE_INPUT_GROUPS</name>
+          <totalCounterValue>460</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>REDUCE_SHUFFLE_BYTES</name>
+          <totalCounterValue>2235</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>REDUCE_INPUT_RECORDS</name>
+          <totalCounterValue>460</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>REDUCE_OUTPUT_RECORDS</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>SPILLED_RECORDS</name>
+          <totalCounterValue>1200</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>SHUFFLED_MAPS</name>
+          <totalCounterValue>1</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>FAILED_SHUFFLE</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>MERGED_MAP_OUTPUTS</name>
+          <totalCounterValue>1</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>GC_TIME_MILLIS</name>
+          <totalCounterValue>58</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>CPU_MILLISECONDS</name>
+          <totalCounterValue>1580</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>PHYSICAL_MEMORY_BYTES</name>
+          <totalCounterValue>462643200</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>VIRTUAL_MEMORY_BYTES</name>
+          <totalCounterValue>2149728256</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>COMMITTED_HEAP_BYTES</name>
+          <totalCounterValue>357957632</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+      </counterGroup>
+      <counterGroup>
+        <counterGroupName>org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter</counterGroupName>
+        <counter>
+          <name>BYTES_READ</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+      </counterGroup>
+      <counterGroup>
+        <counterGroupName>org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter</counterGroupName>
+        <counter>
+          <name>BYTES_WRITTEN</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+      </counterGroup>
+    </jobCounters>
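The nested `counterGroup` / `counter` layout can be awkward to query directly. A minimal Python sketch, assuming a response shaped like the JSON example above, flattens it into a dictionary keyed by group and counter name. The `SAMPLE` body is a shortened copy of that example, and `total_counters` is a hypothetical helper name, not a Hadoop API.

```python
import json

# Shortened copy of the JSON response body shown above (three counters only).
SAMPLE = """
{
   "jobCounters" : {
      "id" : "job_1326232085508_4_4",
      "counterGroup" : [
         {
            "counterGroupName" : "Shuffle Errors",
            "counter" : [
               {
                  "reduceCounterValue" : 0,
                  "mapCounterValue" : 0,
                  "totalCounterValue" : 0,
                  "name" : "BAD_ID"
               }
            ]
         },
         {
            "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
            "counter" : [
               {
                  "reduceCounterValue" : 0,
                  "mapCounterValue" : 0,
                  "totalCounterValue" : 2483,
                  "name" : "FILE_BYTES_READ"
               },
               {
                  "reduceCounterValue" : 0,
                  "mapCounterValue" : 0,
                  "totalCounterValue" : 48,
                  "name" : "HDFS_BYTES_READ"
               }
            ]
         }
      ]
   }
}
"""


def total_counters(body):
    """Map (counterGroupName, counter name) to totalCounterValue."""
    job_counters = json.loads(body)["jobCounters"]
    totals = {}
    for group in job_counters.get("counterGroup", []):
        for counter in group.get("counter", []):
            key = (group["counterGroupName"], counter["name"])
            totals[key] = counter["totalCounterValue"]
    return totals


totals = total_counters(SAMPLE)
fs_group = "org.apache.hadoop.mapreduce.FileSystemCounter"
print(totals[(fs_group, "FILE_BYTES_READ")])
```

The same loop could collect `mapCounterValue` or `reduceCounterValue` instead of the total if a per-phase breakdown is needed.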
+
+Job Conf API
+------------
+
+A job configuration resource contains information about the job configuration for this job.
+
+### URI
+
+Use the following URI to obtain the job configuration information from a job identified by the jobid value.
+
+      * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/jobs/{jobid}/conf
+
+### HTTP Operations Supported
+
+      * GET
+
+### Query Parameters Supported
+
+      None
+
+### Elements of the *conf* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| path | string | The path to the job configuration file |
+| property | array of the configuration properties(JSON)/zero or more property objects(XML) | Collection of property objects |
+
+### Elements of the *property* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| name | string | The name of the configuration property |
+| value | string | The value of the configuration property |
+| source | string | The location this configuration object came from. If there is more than one of these, it shows the history with the latest source at the end of the list. |
+
+### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/conf
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+This is a small snippet of the output, as the full output is very large. The real output contains every property in your job configuration file.
+
+    {
+       "conf" : {
+          "path" : "hdfs://host.domain.com:9000/user/user1/.staging/job_1326232085508_0004/job.xml",
+          "property" : [
+             {  
+                "value" : "/home/hadoop/hdfs/data",
+                "name" : "dfs.datanode.data.dir",
+                "source" : ["hdfs-site.xml", "job.xml"]
+             },
+             {
+                "value" : "org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer",
+                "name" : "hadoop.http.filter.initializers",
+                "source" : ["programmatically", "job.xml"]
+             },
+             {
+                "value" : "/home/hadoop/tmp",
+                "name" : "mapreduce.cluster.temp.dir",
+                "source" : ["mapred-site.xml"]
+             },
+             ...
+          ]
+       }
+    }
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/conf
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 552
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <conf>
+      <path>hdfs://host.domain.com:9000/user/user1/.staging/job_1326232085508_0004/job.xml</path>
+      <property>
+        <name>dfs.datanode.data.dir</name>
+        <value>/home/hadoop/hdfs/data</value>
+        <source>hdfs-site.xml</source>
+        <source>job.xml</source>
+      </property>
+      <property>
+        <name>hadoop.http.filter.initializers</name>
+        <value>org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer</value>
+        <source>programmatically</source>
+        <source>job.xml</source>
+      </property>
+      <property>
+        <name>mapreduce.cluster.temp.dir</name>
+        <value>/home/hadoop/tmp</value>
+        <source>mapred-site.xml</source>
+      </property>
+      ...
+    </conf>
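For the XML form, each `property` element may carry several `source` children, with the latest source last. The Python sketch below, which uses only the standard library, reads a shortened copy of the response above into a dictionary. The `load_properties` name is hypothetical, and the sample omits the XML declaration and the elided `...` portion.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the XML response body shown above.
SAMPLE = """
<conf>
  <path>hdfs://host.domain.com:9000/user/user1/.staging/job_1326232085508_0004/job.xml</path>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/hadoop/hdfs/data</value>
    <source>hdfs-site.xml</source>
    <source>job.xml</source>
  </property>
  <property>
    <name>mapreduce.cluster.temp.dir</name>
    <value>/home/hadoop/tmp</value>
    <source>mapred-site.xml</source>
  </property>
</conf>
"""


def load_properties(xml_text):
    """Map each property name to its value and ordered list of sources."""
    root = ET.fromstring(xml_text.strip())
    props = {}
    for prop in root.findall("property"):
        props[prop.findtext("name")] = {
            "value": prop.findtext("value"),
            # Sources are listed oldest first; the last entry is the latest.
            "sources": [s.text for s in prop.findall("source")],
        }
    return props


props = load_properties(SAMPLE)
entry = props["dfs.datanode.data.dir"]
print(entry["value"], "latest source:", entry["sources"][-1])
```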
+
+Tasks API
+---------
+
+With the tasks API, you can obtain a collection of resources that represent all the tasks for a job. When you run a GET operation on this resource, you obtain a collection of Task Objects.
+
+### URI
+
+      * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/jobs/{jobid}/tasks
+
+### HTTP Operations Supported
+
+      * GET
+
+### Query Parameters Supported
+
+      * type - type of task, valid values are m or r.  m for map task or r for reduce task.
+
+### Elements of the *tasks* object
+
+When you make a request for the list of tasks, the information will be returned as an array of task objects. See also [Task API](#Task_API) for syntax of the task object.
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| task | array of task objects(JSON)/zero or more task objects(XML) | The collection of task objects |
+
+### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    {
+       "tasks" : {
+          "task" : [
+             {
+                "progress" : 100,
+                "elapsedTime" : 2768,
+                "state" : "SUCCEEDED",
+                "startTime" : 1326238773493,
+                "id" : "task_1326232085508_4_4_m_0",
+                "type" : "MAP",
+                "successfulAttempt" : "attempt_1326232085508_4_4_m_0_0",
+                "finishTime" : 1326238776261
+             },
+             {
+                "progress" : 100,
+                "elapsedTime" : 0,
+                "state" : "RUNNING",
+                "startTime" : 1326238777460,
+                "id" : "task_1326232085508_4_4_r_0",
+                "type" : "REDUCE",
+                "successfulAttempt" : "",
+                "finishTime" : 0
+             }
+          ]
+       }
+    }
+
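A client might consume the JSON body above roughly as follows. This is a minimal sketch that parses a trimmed copy of the sample response; the function name `tasks_by_type` is hypothetical and no HTTP request is made.

```python
import json
from collections import defaultdict

# Trimmed copy of the sample JSON response body (only some fields kept).
SAMPLE = """
{
   "tasks" : {
      "task" : [
         { "id" : "task_1326232085508_4_4_m_0", "type" : "MAP",
           "state" : "SUCCEEDED", "progress" : 100 },
         { "id" : "task_1326232085508_4_4_r_0", "type" : "REDUCE",
           "state" : "RUNNING", "progress" : 100 }
      ]
   }
}
"""


def tasks_by_type(body):
    """Group (task id, state) pairs by task type from a tasks response body."""
    grouped = defaultdict(list)
    for task in json.loads(body)["tasks"]["task"]:
        grouped[task["type"]].append((task["id"], task["state"]))
    return dict(grouped)


if __name__ == "__main__":
    for task_type, entries in tasks_by_type(SAMPLE).items():
        print(task_type, entries)
```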
+**XML response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 603
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <tasks>
+      <task>
+        <startTime>1326238773493</startTime>
+        <finishTime>1326238776261</finishTime>
+        <elapsedTime>2768</elapsedTime>
+        <progress>100.0</progress>
+        <id>task_1326232085508_4_4_m_0</id>
+        <state>SUCCEEDED</state>
+        <type>MAP</type>
+        <successfulAttempt>attempt_1326232085508_4_4_m_0_0</successfulAttempt>
+      </task>
+      <task>
+        <startTime>1326238777460</startTime>
+        <finishTime>0</finishTime>
+        <elapsedTime>0</elapsedTime>
+        <progress>100.0</progress>
+        <id>task_1326232085508_4_4_r_0</id>
+        <state>RUNNING</state>
+        <type>REDUCE</type>
+        <successfulAttempt/>
+      </task>
+    </tasks>
+
+Task API
+--------
+
+A Task resource contains information about a particular task within a job.
+
+### URI
+
+Use the following URI to obtain a Task Object from a task identified by the taskid value.
+
+      * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/jobs/{jobid}/tasks/{taskid}
+
+### HTTP Operations Supported
+
+      * GET
+
+### Query Parameters Supported
+
+      None
+
+### Elements of the *task* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| id | string | The task id |
+| state | string | The state of the task - valid values are: NEW, SCHEDULED, RUNNING, SUCCEEDED, FAILED, KILL\_WAIT, KILLED |
+| type | string | The task type - MAP or REDUCE |
+| successfulAttempt | string | The id of the last successful attempt |
+| progress | float | The progress of the task as a percent |
+| startTime | long | The time at which the task started (in ms since epoch) |
+| finishTime | long | The time at which the task finished (in ms since epoch) |
+| elapsedTime | long | The elapsed time since the task started (in ms) |
+
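The time fields are plain millisecond counts, so a client has to convert them itself. The sketch below shows one way to do that in Python, using values from the sample map and reduce tasks. Treating `finishTime` of 0 as "not finished yet" is an assumption based on the sample responses, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def ms_to_utc(ms):
    """Convert milliseconds since the epoch to an aware UTC datetime."""
    seconds, millis = divmod(ms, 1000)
    return (datetime.fromtimestamp(seconds, tz=timezone.utc)
            + timedelta(milliseconds=millis))


def task_duration_ms(task):
    """Return finishTime - startTime, or None if finishTime is 0.

    Assumption: a finishTime of 0 means the task has not finished.
    """
    if task["finishTime"] == 0:
        return None
    return task["finishTime"] - task["startTime"]


MAP_TASK = {"startTime": 1326238773493, "finishTime": 1326238776261,
            "elapsedTime": 2768}
REDUCE_TASK = {"startTime": 1326238777460, "finishTime": 0,
               "elapsedTime": 0}

if __name__ == "__main__":
    print(ms_to_utc(MAP_TASK["startTime"]).isoformat())
    print(task_duration_ms(MAP_TASK), task_duration_ms(REDUCE_TASK))
```

For the finished map task, the computed duration equals the reported `elapsedTime` of 2768 ms.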
+### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks/task_1326232085508_4_4_r_0
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    {
+       "task" : {
+          "progress" : 100,
+          "elapsedTime" : 0,
+          "state" : "RUNNING",
+          "startTime" : 1326238777460,
+          "id" : "task_1326232085508_4_4_r_0",
+          "type" : "REDUCE",
+          "successfulAttempt" : "",
+          "finishTime" : 0
+       }
+    }
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks/task_1326232085508_4_4_r_0
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 299
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <task>
+      <startTime>1326238777460</startTime>
+      <finishTime>0</finishTime>
+      <elapsedTime>0</elapsedTime>
+      <progress>100.0</progress>
+      <id>task_1326232085508_4_4_r_0</id>
+      <state>RUNNING</state>
+      <type>REDUCE</type>
+      <successfulAttempt/>
+    </task>
+
+Task Counters API
+-----------------
+
+With the task counters API, you can obtain a collection of resources that represent all the counters for that task.
+
+### URI
+
+      * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/jobs/{jobid}/tasks/{taskid}/counters
+
+### HTTP Operations Supported
+
+      * GET
+
+### Query Parameters Supported
+
+      None
+
+### Elements of the *jobTaskCounters* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| id | string | The task id |
+| taskCounterGroup | array of counterGroup objects(JSON)/zero or more counterGroup objects(XML) | A collection of counter group objects |
+
+### Elements of the *counterGroup* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| counterGroupName | string | The name of the counter group |
+| counter | array of counter objects(JSON)/zero or more counter objects(XML) | A collection of counter objects |
+
+### Elements of the *counter* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| name | string | The name of the counter |
+| value | long | The value of the counter |
+
+### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks/task_1326232085508_4_4_r_0/counters
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    {
+       "jobTaskCounters" : {
+          "id" : "task_1326232085508_4_4_r_0",
+          "taskCounterGroup" : [
+             {
+                "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
+                "counter" : [
+                   {
+                      "value" : 2363,
+                      "name" : "FILE_BYTES_READ"
+                   },
+                   {
+                      "value" : 54372,
+                      "name" : "FILE_BYTES_WRITTEN"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "FILE_READ_OPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "FILE_LARGE_READ_OPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "FILE_WRITE_OPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "HDFS_BYTES_READ"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "HDFS_BYTES_WRITTEN"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "HDFS_READ_OPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "HDFS_LARGE_READ_OPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "HDFS_WRITE_OPS"
+                   }
+                ]
+             },
+             {
+                "counterGroupName" : "org.apache.hadoop.mapreduce.TaskCounter",
+                "counter" : [
+                   {
+                      "value" : 0,
+                      "name" : "COMBINE_INPUT_RECORDS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "COMBINE_OUTPUT_RECORDS"
+                   },
+                   {
+                      "value" : 460,
+                      "name" : "REDUCE_INPUT_GROUPS"
+                   },
+                   {
+                      "value" : 2235,
+                      "name" : "REDUCE_SHUFFLE_BYTES"
+                   },
+                   {
+                      "value" : 460,
+                      "name" : "REDUCE_INPUT_RECORDS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "REDUCE_OUTPUT_RECORDS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "SPILLED_RECORDS"
+                   },
+                   {
+                      "value" : 1,
+                      "name" : "SHUFFLED_MAPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "FAILED_SHUFFLE"
+                   },
+                   {
+                      "value" : 1,
+                      "name" : "MERGED_MAP_OUTPUTS"
+                   },
+                   {
+                      "value" : 26,
+                      "name" : "GC_TIME_MILLIS"
+                   },
+                   {
+                      "value" : 860,
+                      "name" : "CPU_MILLISECONDS"
+                   },
+                   {
+                      "value" : 107839488,
+                      "name" : "PHYSICAL_MEMORY_BYTES"
+                   },
+                   {
+                      "value" : 1123147776,
+                      "name" : "VIRTUAL_MEMORY_BYTES"
+                   },
+                   {
+                      "value" : 57475072,
+                      "name" : "COMMITTED_HEAP_BYTES"
+                   }
+                ]
+             },
+             {
+                "counterGroupName" : "Shuffle Errors",
+                "counter" : [
+                   {
+                      "value" : 0,
+                      "name" : "BAD_ID"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "CONNECTION"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "IO_ERROR"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "WRONG_LENGTH"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "WRONG_MAP"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "WRONG_REDUCE"
+                   }
+                ]
+             },
+             {
+                "counterGroupName" : "org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter",
+                "counter" : [
+                   {
+                      "value" : 0,
+                      "name" : "BYTES_WRITTEN"
+                   }
+                ]
+             }
+          ]
+       }
+    }
+
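Because the counters are nested two levels deep (group, then counter), clients often flatten them before use. The following sketch does this for a trimmed copy of the JSON body above; `flatten_counters` is a hypothetical helper name, and the key names are taken from the sample response.

```python
import json

# Trimmed copy of the sample JSON response body (three counters kept).
SAMPLE = """
{
   "jobTaskCounters" : {
      "id" : "task_1326232085508_4_4_r_0",
      "taskCounterGroup" : [
         {
            "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
            "counter" : [
               { "value" : 2363, "name" : "FILE_BYTES_READ" },
               { "value" : 54372, "name" : "FILE_BYTES_WRITTEN" }
            ]
         },
         {
            "counterGroupName" : "org.apache.hadoop.mapreduce.TaskCounter",
            "counter" : [
               { "value" : 460, "name" : "REDUCE_INPUT_GROUPS" }
            ]
         }
      ]
   }
}
"""


def flatten_counters(body):
    """Return (task id, {(group name, counter name): value})."""
    doc = json.loads(body)["jobTaskCounters"]
    flat = {}
    for group in doc.get("taskCounterGroup", []):
        for counter in group.get("counter", []):
            key = (group["counterGroupName"], counter["name"])
            flat[key] = counter["value"]
    return doc["id"], flat


if __name__ == "__main__":
    task_id, counters = flatten_counters(SAMPLE)
    print(task_id)
    for (group, name), value in sorted(counters.items()):
        print(group, name, value)
```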
+**XML response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks/task_1326232085508_4_4_r_0/counters
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 2660
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <jobTaskCounters>
+      <id>task_1326232085508_4_4_r_0</id>
+      <taskCounterGroup>
+        <counterGroupName>org.apache.hadoop.mapreduce.FileSystemCounter</counterGroupName>
+        <counter>
+          <name>FILE_BYTES_READ</name>
+          <value>2363</value>
+        </counter>
+        <counter>
+          <name>FILE_BYTES_WRITTEN</name>
+          <value>54372</value>
+        </counter>
+        <counter>
+          <name>FILE_READ_OPS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>FILE_LARGE_READ_OPS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>FILE_WRITE_OPS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>HDFS_BYTES_READ</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>HDFS_BYTES_WRITTEN</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>HDFS_READ_OPS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>HDFS_LARGE_READ_OPS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>HDFS_WRITE_OPS</name>
+          <value>0</value>
+        </counter>
+      </taskCounterGroup>
+      <taskCounterGroup>
+        <counterGroupName>org.apache.hadoop.mapreduce.TaskCounter</counterGroupName>
+        <counter>
+          <name>COMBINE_INPUT_RECORDS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>COMBINE_OUTPUT_RECORDS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>REDUCE_INPUT_GROUPS</name>
+          <value>460</value>
+        </counter>
+        <counter>
+          <name>REDUCE_SHUFFLE_BYTES</name>
+          <value>2235</value>
+        </counter>
+        <counter>
+          <name>REDUCE_INPUT_RECORDS</name>
+          <value>460</value>
+        </counter>
+        <counter>
+          <name>REDUCE_OUTPUT_RECORDS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>SPILLED_RECORDS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>SHUFFLED_MAPS</name>
+          <value>1</value>
+        </counter>
+        <counter>
+          <name>FAILED_SHUFFLE</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>MERGED_MAP_OUTPUTS</name>
+          <value>1</value>
+        </counter>
+        <counter>
+          <name>GC_TIME_MILLIS</name>
+          <value>26</value>
+        </counter>
+        <counter>
+          <name>CPU_MILLISECONDS</name>
+          <value>860</value>
+        </counter>
+        <counter>
+          <name>PHYSICAL_MEMORY_BYTES</name>
+          <value>107839488</value>
+        </counter>
+        <counter>
+          <name>VIRTUAL_MEMORY_BYTES</name>
+          <value>1123147776</value>
+        </counter>
+        <counter>
+          <name>COMMITTED_HEAP_BYTES</name>
+          <value>57475072</value>
+        </counter>
+      </taskCounterGroup>
+      <taskCounterGroup>
+        <counterGroupName>Shuffle Errors</counterGroupName>
+        <counter>
+          <name>BAD_ID</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>CONNECTION</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>IO_ERROR</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>WRONG_LENGTH</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>WRONG_MAP</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>WRONG_REDUCE</name>
+          <value>0</value>
+        </counter>
+      </taskCounterGroup>
+      <taskCounterGroup>
+        <counterGroupName>org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter</counterGroupName>
+        <counter>
+          <name>BYTES_WRITTEN</name>
+          <value>0</value>
+        </counter>
+      </taskCounterGroup>
+    </jobTaskCounters>
+
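The XML form of the task counters response can be read with any standard XML parser. As a minimal sketch, the Python code below uses `xml.etree.ElementTree` on a trimmed copy of the XML body above and keeps only the non-zero counters; the function name `nonzero_counters` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample XML response body, kept as bytes so the
# encoding declaration is honoured by the parser.
SAMPLE_XML = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<jobTaskCounters>
  <id>task_1326232085508_4_4_r_0</id>
  <taskCounterGroup>
    <counterGroupName>org.apache.hadoop.mapreduce.TaskCounter</counterGroupName>
    <counter>
      <name>COMBINE_INPUT_RECORDS</name>
      <value>0</value>
    </counter>
    <counter>
      <name>REDUCE_SHUFFLE_BYTES</name>
      <value>2235</value>
    </counter>
  </taskCounterGroup>
  <taskCounterGroup>
    <counterGroupName>Shuffle Errors</counterGroupName>
    <counter>
      <name>BAD_ID</name>
      <value>0</value>
    </counter>
  </taskCounterGroup>
</jobTaskCounters>
"""


def nonzero_counters(xml_bytes):
    """Return (task id, {(group name, counter name): value}) for values != 0."""
    root = ET.fromstring(xml_bytes)
    result = {}
    for group in root.findall("taskCounterGroup"):
        group_name = group.findtext("counterGroupName")
        for counter in group.findall("counter"):
            value = int(counter.findtext("value"))
            if value:
                result[(group_name, counter.findtext("name"))] = value
    return root.findtext("id"), result


if __name__ == "__main__":
    print(nonzero_counters(SAMPLE_XML))
```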
+Task Attempts API
+-----------------
+
+With the task attempts API, you can obtain a collection of resources that represent the task attempts of a task within a job. When you run a GET operation on this resource, you obtain a collection of Task Attempt Objects.
+
+### URI
+
+      * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/jobs/{jobid}/tasks/{taskid}/attempts
+
+### HTTP Operations Supported
+
+      * GET
+
+### Query Parameters Supported
+
+      None
+
+### Elements of the *taskAttempts* object
+
+When you make a request for the list of task attempts, the information will be returned as an array of task attempt objects. See also [Task Attempt API](#Task_Attempt_API) for syntax of the task attempt object.
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| taskAttempt | array of task attempt objects(JSON)/zero or more task attempt objects(XML) | The collection of task attempt objects |
+
+### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks/task_1326232085508_4_4_r_0/attempts
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    {
+       "taskAttempts" : {
+          "taskAttempt" : [
+             {
+                "elapsedMergeTime" : 47,
+                "shuffleFinishTime" : 1326238780052,
+                "assignedContainerId" : "container_1326232085508_0004_01_000003",
+                "progress" : 100,
+                "elapsedTime" : 0,
+                "state" : "RUNNING",
+                "elapsedShuffleTime" : 2592,
+                "mergeFinishTime" : 1326238780099,
+                "rack" : "/98.139.92.0",
+                "elapsedReduceTime" : 0,
+                "nodeHttpAddress" : "host.domain.com:8042",
+                "type" : "REDUCE",
+                "startTime" : 1326238777460,
+                "id" : "attempt_1326232085508_4_4_r_0_0",
+                "finishTime" : 0
+             }
+          ]
+       }
+    }
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks/task_1326232085508_4_4_r_0/attempts
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 807
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <taskAttempts>
+      <taskAttempt xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="reduceTaskAttemptInfo">
+        <startTime>1326238777460</startTime>
+        <finishTime>0</finishTime>
+        <elapsedTime>0</elapsedTime>
+        <progress>100.0</progress>
+        <id>attempt_1326232085508_4_4_r_0_0</id>
+        <rack>/98.139.92.0</rack>
+        <state>RUNNING</state>
+        <nodeHttpAddress>host.domain.com:8042</nodeHttpAddress>
+        <type>REDUCE</type>
+        <assignedContainerId>container_1326232085508_0004_01_000003</assignedContainerId>
+        <shuffleFinishTime>1326238780052</shuffleFinishTime>
+        <mergeFinishTime>1326238780099</mergeFinishTime>
+        <elapsedShuffleTime>2592</elapsedShuffleTime>
+        <elapsedMergeTime>47</elapsedMergeTime>
+        <elapsedReduceTime>0</elapsedReduceTime>
+      </taskAttempt>
+    </taskAttempts>
+
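In the XML sample above, the `taskAttempt` element carries an `xsi:type` attribute (`reduceTaskAttemptInfo`). The sketch below shows how a Python client could read that namespaced attribute with `xml.etree.ElementTree`, which exposes it under the expanded `{namespace}name` form. The function name `attempt_kinds` is hypothetical, and only the attribute value seen in the sample is assumed here.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the xsi prefix to its namespace URI.
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Trimmed copy of the sample XML response body.
SAMPLE_XML = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<taskAttempts>
  <taskAttempt xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="reduceTaskAttemptInfo">
    <id>attempt_1326232085508_4_4_r_0_0</id>
    <state>RUNNING</state>
    <type>REDUCE</type>
    <elapsedShuffleTime>2592</elapsedShuffleTime>
  </taskAttempt>
</taskAttempts>
"""


def attempt_kinds(xml_bytes):
    """Return a list of (attempt id, xsi:type value or None)."""
    root = ET.fromstring(xml_bytes)
    return [(attempt.findtext("id"), attempt.get(XSI_TYPE))
            for attempt in root.findall("taskAttempt")]


if __name__ == "__main__":
    print(attempt_kinds(SAMPLE_XML))
```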
+Task Attempt API
+----------------
+
+A Task Attempt resource contains information about a particular task attempt within a job.
+
+### URI
+
+Use the following URI to obtain a Task Attempt Object from a task attempt identified by the attemptid value.
+
+      * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/jobs/{jobid}/tasks/{taskid}/attempts/{attemptid}
+
+### HTTP Operations Supported
+
+      * GET
+
+### Query Parameters Supported
+
+      None
+
+### Elements of the *taskAttempt* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| id | string | The task attempt id |
+| rack | string | The rack |
+| state | string | The state of the task attempt - valid values are: NEW, UNASSIGNED, ASSIGNED, RUNNING, COMMIT\_PENDING, SUCCESS\_CONTAINER\_CLEANUP, SUCCEEDED, FAIL\_CONTAINER\_CLEANUP, FAIL\_TASK\_CLEANUP, FAILED, KILL\_CONTAINER\_CLEANUP, KILL\_TASK\_CLEANUP, KILLED |
+| type | string | The type of task |
+| assignedContainerId | string | The container id this attempt is assigned to |
+| nodeHttpAddress | string | The http address of the node this task attempt ran on |
+| diagnostics | string | The diagnostics message |
+| progress | float | The progress of the task attempt as a percent |
+| startTime | long | The time at which the task attempt started (in ms since epoch) |
+| finishTime | long | The time at which the task attempt finished (in ms since epoch) |
+| elapsedTime | long | The elapsed time since the task attempt started (in ms) |
+
+For reduce task attempts you also have the following fields:
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| shuffleFinishTime | long | The time at which shuffle finished (in ms since epoch) |
+| mergeFinishTime | long | The time at which merge finished (in ms since epoch) |
+| elapsedShuffleTime | long | The time it took for the shuffle phase to complete (time in ms between reduce task start and shuffle finish) |
+| elapsedMergeTime | long | The time it took for the merge phase to complete (time in ms between the shuffle finish and merge finish) |
+| elapsedReduceTime | long | The time it took for the reduce phase to complete (time in ms between merge finish to end of reduce task) |
+
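The three elapsed fields for reduce attempts are differences between the timestamps, as the descriptions above state. The following Python sketch recomputes them from illustrative timestamp values for a running reduce attempt. Returning 0 for `elapsedReduceTime` while `finishTime` is 0 is an assumption of this sketch, and `reduce_phase_times` is a hypothetical helper name.

```python
def reduce_phase_times(attempt):
    """Recompute the per-phase elapsed times (in ms) of a reduce attempt."""
    start = attempt["startTime"]
    shuffle_done = attempt["shuffleFinishTime"]
    merge_done = attempt["mergeFinishTime"]
    finish = attempt["finishTime"]
    return {
        # reduce task start -> shuffle finish
        "elapsedShuffleTime": shuffle_done - start,
        # shuffle finish -> merge finish
        "elapsedMergeTime": merge_done - shuffle_done,
        # merge finish -> end of reduce task (assumed 0 while unfinished)
        "elapsedReduceTime": (finish - merge_done) if finish else 0,
    }


# Illustrative timestamps (ms since epoch) for a running reduce attempt.
SAMPLE_ATTEMPT = {
    "startTime": 1326238777460,
    "shuffleFinishTime": 1326238780052,
    "mergeFinishTime": 1326238780099,
    "finishTime": 0,
}

if __name__ == "__main__":
    print(reduce_phase_times(SAMPLE_ATTEMPT))
```

With these values the shuffle phase comes out at 2592 ms and the merge phase at 47 ms.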
+### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks/task_1326232085508_4_4_r_0/attempts/attempt_1326232085508_4_4_r_0_0
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    {
+       "taskAttempt" : {
+          "elapsedMergeTime" : 47,
+          "shuffleFinishTime" : 1326238780052,
+          "assignedContainerId" : "container_1326232085508_0004_01_000003",
+          "progress" : 100,
+          "elapsedTime" : 0,
+          "state" : "RUNNING",
+          "elapsedShuffleTime" : 2592,
+          "mergeFinishTime" : 1326238780099,
+          "rack" : "/98.139.92.0",
+          "elapsedReduceTime" : 0,
+          "nodeHttpAddress" : "host.domain.com:8042",
+          "startTime" : 1326238777460,
+          "id" : "attempt_1326232085508_4_4_r_0_0",
+          "type" : "REDUCE",
+          "finishTime" : 0
+       }
+    }
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks/task_1326232085508_4_4_r_0/attempts/attempt_1326232085508_4_4_r_0_0
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 691
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <taskAttempt>
+      <startTime>1326238777460</startTime>
+      <finishTime>0</finishTime>
+      <elapsedTime>0</elapsedTime>
+      <progress>100.0</progress>
+      <id>attempt_1326232085508_4_4_r_0_0</id>
+      <rack>/98.139.92.0</rack>
+      <state>RUNNING</state>
+      <nodeHttpAddress>host.domain.com:8042</nodeHttpAddress>
+      <type>REDUCE</type>
+      <assignedContainerId>container_1326232085508_0004_01_000003</assignedContainerId>
+      <shuffleFinishTime>1326238780052</shuffleFinishTime>
+      <mergeFinishTime>1326238780099</mergeFinishTime>
+      <elapsedShuffleTime>2592</elapsedShuffleTime>
+      <elapsedMergeTime>47</elapsedMergeTime>
+      <elapsedReduceTime>0</elapsedReduceTime>
+    </taskAttempt>
+
+Task Attempt Counters API
+-------------------------
+
+With the task attempt counters API, you can obtain a collection of resources that represent all the counters for that task attempt.
+
+### URI
+
+      * http://<proxy http address:port>/proxy/{appid}/ws/v1/mapreduce/jobs/{jobid}/tasks/{taskid}/attempts/{attemptid}/counters
+
+### HTTP Operations Supported
+
+      * GET
+
+### Query Parameters Supported
+
+      None
+
+### Elements of the *jobTaskAttemptCounters* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| id | string | The task attempt id |
+| taskAttemptCounterGroup | array of task attempt counterGroup objects(JSON)/zero or more task attempt counterGroup objects(XML) | A collection of task attempt counter group objects |
+
+### Elements of the *taskAttemptCounterGroup* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| counterGroupName | string | The name of the counter group |
+| counter | array of counter objects(JSON)/zero or more counter objects(XML) | A collection of counter objects |
+
+### Elements of the *counter* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| name | string | The name of the counter |
+| value | long | The value of the counter |
+
+### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks/task_1326232085508_4_4_r_0/attempts/attempt_1326232085508_4_4_r_0_0/counters
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    {
+       "jobTaskAttemptCounters" : {
+          "taskAttemptCounterGroup" : [
+             {
+                "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
+                "counter" : [
+                   {
+                      "value" : 2363,
+                      "name" : "FILE_BYTES_READ"
+                   },
+                   {
+                      "value" : 54372,
+                      "name" : "FILE_BYTES_WRITTEN"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "FILE_READ_OPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "FILE_LARGE_READ_OPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "FILE_WRITE_OPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "HDFS_BYTES_READ"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "HDFS_BYTES_WRITTEN"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "HDFS_READ_OPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "HDFS_LARGE_READ_OPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "HDFS_WRITE_OPS"
+                   }
+                ]
+             },
+             {
+                "counterGroupName" : "org.apache.hadoop.mapreduce.TaskCounter",
+                "counter" : [
+                   {
+                      "value" : 0,
+                      "name" : "COMBINE_INPUT_RECORDS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "COMBINE_OUTPUT_RECORDS"
+                   },
+                   {
+                      "value" : 460,
+                      "name" : "REDUCE_INPUT_GROUPS"
+                   },
+                   {
+                      "value" : 2235,
+                      "name" : "REDUCE_SHUFFLE_BYTES"
+                   },
+                   {
+                      "value" : 460,
+                      "name" : "REDUCE_INPUT_RECORDS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "REDUCE_OUTPUT_RECORDS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "SPILLED_RECORDS"
+                   },
+                   {
+                      "value" : 1,
+                      "name" : "SHUFFLED_MAPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "FAILED_SHUFFLE"
+                   },
+                   {
+                      "value" : 1,
+                      "name" : "MERGED_MAP_OUTPUTS"
+                   },
+                   {
+                      "value" : 26,
+                      "name" : "GC_TIME_MILLIS"
+                   },
+                   {
+                      "value" : 860,
+                      "name" : "CPU_MILLISECONDS"
+                   },
+                   {
+                      "value" : 107839488,
+                      "name" : "PHYSICAL_MEMORY_BYTES"
+                   },
+                   {
+                      "value" : 1123147776,
+                      "name" : "VIRTUAL_MEMORY_BYTES"
+                   },
+                   {
+                      "value" : 57475072,
+                      "name" : "COMMITTED_HEAP_BYTES"
+                   }
+                ]
+             },
+             {
+                "counterGroupName" : "Shuffle Errors",
+                "counter" : [
+                   {
+                      "value" : 0,
+                      "name" : "BAD_ID"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "CONNECTION"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "IO_ERROR"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "WRONG_LENGTH"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "WRONG_MAP"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "WRONG_REDUCE"
+                   }
+                ]
+             },
+             {
+                "counterGroupName" : "org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter",
+                "counter" : [
+                   {
+                      "value" : 0,
+                      "name" : "BYTES_WRITTEN"
+                   }
+                ]
+             }
+          ],
+          "id" : "attempt_1326232085508_4_4_r_0_0"
+       }
+    }
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<proxy http address:port>/proxy/application_1326232085508_0004/ws/v1/mapreduce/jobs/job_1326232085508_4_4/tasks/task_1326232085508_4_4_r_0/attempts/attempt_1326232085508_4_4_r_0_0/counters
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 2735
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <jobTaskAttemptCounters>
+      <id>attempt_1326232085508_4_4_r_0_0</id>
+      <taskAttemptCounterGroup>
+        <counterGroupName>org.apache.hadoop.mapreduce.FileSystemCounter</counterGroupName>
+        <counter>
+          <name>FILE_BYTES_READ</name>
+          <value>2363</value>
+        </counter>
+        <counter>
+          <name>FILE_BYTES_WRITTEN</name>
+          <value>54372</value>
+        </counter>
+        <counter>
+          <name>FILE_READ_OPS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>FILE_LARGE_READ_OPS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>FILE_WRITE_OPS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>HDFS_BYTES_READ</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>HDFS_BYTES_WRITTEN</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>HDFS_READ_OPS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>HDFS_LARGE_READ_OPS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>HDFS_WRITE_OPS</name>
+          <value>0</value>
+        </counter>
+      </taskAttemptCounterGroup>
+      <taskAttemptCounterGroup>
+        <counterGroupName>org.apache.hadoop.mapreduce.TaskCounter</counterGroupName>
+        <counter>
+          <name>COMBINE_INPUT_RECORDS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>COMBINE_OUTPUT_RECORDS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>REDUCE_INPUT_GROUPS</name>
+          <value>460</value>
+        </counter>
+        <counter>
+          <name>REDUCE_SHUFFLE_BYTES</name>
+          <value>2235</value>
+        </counter>
+        <counter>
+          <name>REDUCE_INPUT_RECORDS</name>
+          <value>460</value>
+        </counter>
+        <counter>
+          <name>REDUCE_OUTPUT_RECORDS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>SPILLED_RECORDS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>SHUFFLED_MAPS</name>
+          <value>1</value>
+        </counter>
+        <counter>
+          <name>FAILED_SHUFFLE</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>MERGED_MAP_OUTPUTS</name>
+          <value>1</value>
+        </counter>
+        <counter>
+          <name>GC_TIME_MILLIS</name>
+          <value>26</value>
+        </counter>
+        <counter>
+          <name>CPU_MILLISECONDS</name>
+          <value>860</value>
+        </counter>
+        <counter>
+          <name>PHYSICAL_MEMORY_BYTES</name>
+          <value>107839488</value>
+        </counter>
+        <counter>
+          <name>VIRTUAL_MEMORY_BYTES</name>
+          <value>1123147776</value>
+        </counter>
+        <counter>
+          <name>COMMITTED_HEAP_BYTES</name>
+          <value>57475072</value>
+        </counter>
+      </taskAttemptCounterGroup>
+      <taskAttemptCounterGroup>
+        <counterGroupName>Shuffle Errors</counterGroupName>
+        <counter>
+          <name>BAD_ID</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>CONNECTION</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>IO_ERROR</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>WRONG_LENGTH</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>WRONG_MAP</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>WRONG_REDUCE</name>
+          <value>0</value>
+        </counter>
+      </taskAttemptCounterGroup>
+      <taskAttemptCounterGroup>
+        <counterGroupName>org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter</counterGroupName>
+        <counter>
+          <name>BYTES_WRITTEN</name>
+          <value>0</value>
+        </counter>
+      </taskAttemptCounterGroup>
+    </jobTaskAttemptCounters>
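
Reviewer note: the XML response above nests counters inside `taskAttemptCounterGroup` elements, so a client typically flattens it into a group-to-counter mapping. The following is a minimal sketch of that, assuming the element names shown in the example body. The function name `parse_attempt_counters` and the trimmed `SAMPLE` document are illustrative and are not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example response body (two groups only).
SAMPLE = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<jobTaskAttemptCounters>
  <id>attempt_1326232085508_4_4_r_0_0</id>
  <taskAttemptCounterGroup>
    <counterGroupName>org.apache.hadoop.mapreduce.FileSystemCounter</counterGroupName>
    <counter><name>FILE_BYTES_READ</name><value>2363</value></counter>
    <counter><name>FILE_BYTES_WRITTEN</name><value>54372</value></counter>
  </taskAttemptCounterGroup>
  <taskAttemptCounterGroup>
    <counterGroupName>Shuffle Errors</counterGroupName>
    <counter><name>BAD_ID</name><value>0</value></counter>
  </taskAttemptCounterGroup>
</jobTaskAttemptCounters>
"""


def parse_attempt_counters(xml_text):
    """Return (attempt_id, {group_name: {counter_name: int_value}})."""
    # Parse from bytes so the encoding declaration in the prolog is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    attempt_id = root.findtext("id")
    groups = {}
    for group in root.findall("taskAttemptCounterGroup"):
        group_name = group.findtext("counterGroupName")
        counters = {}
        for counter in group.findall("counter"):
            counters[counter.findtext("name")] = int(counter.findtext("value"))
        groups[group_name] = counters
    return attempt_id, groups


if __name__ == "__main__":
    attempt, counter_groups = parse_attempt_counters(SAMPLE)
    print(attempt)
    for name, counters in counter_groups.items():
        print(name, counters)
```

The sketch only reads `name` and `value` from each counter, which is all the example response contains.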

+ 153 - 0
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/MapredCommands.md

@@ -0,0 +1,153 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+MapReduce Commands Guide
+========================
+
+* [Overview](#Overview)
+* [User Commands](#User_Commands)
+    * [archive](#archive)
+    * [classpath](#classpath)
+    * [distcp](#distcp)
+    * [job](#job)
+    * [pipes](#pipes)
+    * [queue](#queue)
+    * [version](#version)
+* [Administration Commands](#Administration_Commands)
+    * [historyserver](#historyserver)
+    * [hsadmin](#hsadmin)
+
+Overview
+--------
+
+All mapreduce commands are invoked by the `bin/mapred` script. Running the mapred script without any arguments prints the description for all commands.
+
+Usage: `mapred [SHELL_OPTIONS] COMMAND [GENERIC_OPTIONS] [COMMAND_OPTIONS]`
+
+Hadoop has an option-parsing framework that handles generic options as well as running classes.
+
+| COMMAND\_OPTIONS | Description |
+|:---- |:---- |
+| SHELL\_OPTIONS | The common set of shell options. These are documented on the [Hadoop Commands Reference](../../hadoop-project-dist/hadoop-common/CommandsManual.html#Shell_Options) page. |
+| GENERIC\_OPTIONS | The common set of options supported by multiple commands. See the [Hadoop Commands Reference](../../hadoop-project-dist/hadoop-common/CommandsManual.html#Generic_Options) for more information. |
+| COMMAND COMMAND\_OPTIONS | Various commands with their options are described in the following sections. The commands have been grouped into [User Commands](#User_Commands) and [Administration Commands](#Administration_Commands). |
+
+User Commands
+-------------
+
+Commands useful for users of a hadoop cluster.
+
+### `archive`
+
+Creates a hadoop archive. More information can be found at
+[Hadoop Archives Guide](./HadoopArchives.html).
+
+### `classpath`
+
+Prints the class path needed to get the Hadoop jar and the required libraries.
+
+Usage: `mapred classpath`
+
+### `distcp`
+
+Copies files or directories recursively. More information can be found at
+[Hadoop DistCp Guide](./DistCp.html).
+
+### `job`
+
+Command to interact with MapReduce jobs.
+
+Usage: `mapred job | [GENERIC_OPTIONS] | [-submit <job-file>] | [-status <job-id>] | [-counter <job-id> <group-name> <counter-name>] | [-kill <job-id>] | [-events <job-id> <from-event-#> <#-of-events>] | [-history [all] <jobOutputDir>] | [-list [all]] | [-kill-task <task-id>] | [-fail-task <task-id>] | [-set-priority <job-id> <priority>]`
+
+| COMMAND\_OPTION | Description |
+|:---- |:---- |
+| -submit *job-file* | Submits the job. |
+| -status *job-id* | Prints the map and reduce completion percentage and all job counters. |
+| -counter *job-id* *group-name* *counter-name* | Prints the counter value. |
+| -kill *job-id* | Kills the job. |
+| -events *job-id* *from-event-\#* *\#-of-events* | Prints details of the events received by the jobtracker for the given range. |
+| -history [all] *jobOutputDir* | Prints job details, plus details of failed and killed tips. More details about the job, such as successful tasks and the task attempts made for each task, can be viewed by specifying the [all] option. |
+| -list [all] | Displays jobs which are yet to complete. `-list all` displays all jobs. |
+| -kill-task *task-id* | Kills the task. Killed tasks are NOT counted against failed attempts. |
+| -fail-task *task-id* | Fails the task. Failed tasks are counted against failed attempts. |
+| -set-priority *job-id* *priority* | Changes the priority of the job. Allowed priority values are VERY\_HIGH, HIGH, NORMAL, LOW, and VERY\_LOW. |
+
+### `pipes`
+
+Runs a pipes job.
+
+Usage: `mapred pipes [-conf <path>] [-jobconf <key=value>, <key=value>, ...] [-input <path>] [-output <path>] [-jar <jar file>] [-inputformat <class>] [-map <class>] [-partitioner <class>] [-reduce <class>] [-writer <class>] [-program <executable>] [-reduces <num>]`
+
+| COMMAND\_OPTION | Description |
+|:---- |:---- |
+| -conf *path* | Configuration for job |
+| -jobconf *key=value*, *key=value*, ... | Add/override configuration for job |
+| -input *path* | Input directory |
+| -output *path* | Output directory |
+| -jar *jar file* | Jar filename |
+| -inputformat *class* | InputFormat class |
+| -map *class* | Java Map class |
+| -partitioner *class* | Java Partitioner |
+| -reduce *class* | Java Reduce class |
+| -writer *class* | Java RecordWriter |
+| -program *executable* | Executable URI |
+| -reduces *num* | Number of reduces |
+
+### `queue`
+
+Command to interact with and view Job Queue information.
+
+Usage: `mapred queue [-list] | [-info <job-queue-name> [-showJobs]] | [-showacls]`
+
+| COMMAND\_OPTION | Description |
+|:---- |:---- |
+| -list | Gets the list of Job Queues configured in the system, along with the scheduling information associated with each queue. |
+| -info *job-queue-name* [-showJobs] | Displays the job queue information and associated scheduling information of a particular job queue. If the `-showJobs` option is present, a list of jobs submitted to that job queue is also displayed. |
+| -showacls | Displays the queue name and associated queue operations allowed for the current user. The list consists of only those queues to which the user has access. |
+
+### `version`
+
+Prints the version.
+
+Usage: `mapred version`
+
+Administration Commands
+-----------------------
+
+Commands useful for administrators of a hadoop cluster.
+
+### `historyserver`
+
+Starts the JobHistoryServer.
+
+Usage: `mapred historyserver`
+
+### `hsadmin`
+
+Runs a MapReduce hsadmin client to execute JobHistoryServer administrative commands.
+
+Usage: `mapred hsadmin [-refreshUserToGroupsMappings] | [-refreshSuperUserGroupsConfiguration] | [-refreshAdminAcls] | [-refreshLoadedJobCache] | [-refreshLogRetentionSettings] | [-refreshJobRetentionSettings] | [-getGroups [username]] | [-help [cmd]]`
+
+| COMMAND\_OPTION | Description |
+|:---- |:---- |
+| -refreshUserToGroupsMappings | Refresh user-to-groups mappings |
+| -refreshSuperUserGroupsConfiguration | Refresh superuser proxy groups mappings |
+| -refreshAdminAcls | Refresh acls for administration of Job history server |
+| -refreshLoadedJobCache | Refresh loaded job cache of Job history server |
+| -refreshJobRetentionSettings | Refresh job history period, job cleaner settings |
+| -refreshLogRetentionSettings | Refresh log retention period and log retention check interval |
+| -getGroups [username] | Get the groups that the given user belongs to |
+| -help [cmd] | Displays help for the given command or all commands if none is specified. |
+
+
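
Reviewer note: the `job` option table above is easy to get wrong when scripting, especially the fixed set of priority names. The following is a small sketch of building `mapred job` argument lists and rejecting priorities that the guide does not list. The helper names `status_argv` and `set_priority_argv` are hypothetical, and the job id is the one used in the REST examples elsewhere in this commit.

```python
# Priority names copied from the -set-priority row of the options table.
ALLOWED_PRIORITIES = ("VERY_HIGH", "HIGH", "NORMAL", "LOW", "VERY_LOW")


def status_argv(job_id, mapred="mapred"):
    """Argument list for `mapred job -status <job-id>`."""
    return [mapred, "job", "-status", job_id]


def set_priority_argv(job_id, priority, mapred="mapred"):
    """Argument list for `mapred job -set-priority <job-id> <priority>`."""
    if priority not in ALLOWED_PRIORITIES:
        raise ValueError(
            "priority must be one of %s, got %r"
            % (", ".join(ALLOWED_PRIORITIES), priority)
        )
    return [mapred, "job", "-set-priority", job_id, priority]


if __name__ == "__main__":
    # Only prints the command lines; nothing is executed here.
    print(" ".join(status_argv("job_1326381300833_2_2")))
    print(" ".join(set_priority_argv("job_1326381300833_2_2", "HIGH")))
```

On a host where the `mapred` script is on the `PATH`, the returned list could be handed to `subprocess.run(argv, check=True)`. Building a list rather than a shell string avoids quoting problems with job ids and paths.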

+ 73 - 0
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/PluggableShuffleAndPluggableSort.md

@@ -0,0 +1,73 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+Hadoop: Pluggable Shuffle and Pluggable Sort
+============================================
+
+Introduction
+------------
+
+The pluggable shuffle and pluggable sort capabilities allow replacing the built-in shuffle and sort logic with alternate implementations. Example use cases are: using an application protocol other than HTTP, such as RDMA, for shuffling data from the Map nodes to the Reducer nodes; or replacing the sort logic with custom algorithms that enable Hash aggregation and Limit-N query.
+
+**IMPORTANT:** The pluggable shuffle and pluggable sort capabilities are experimental and unstable. This means the provided APIs may change and break compatibility in future versions of Hadoop.
+
+Implementing a Custom Shuffle and a Custom Sort
+-----------------------------------------------
+
+A custom shuffle implementation requires a
+`org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.AuxiliaryService`
+implementation class running in the NodeManagers and a
+`org.apache.hadoop.mapred.ShuffleConsumerPlugin`
+implementation class running in the Reducer tasks.
+
+The default implementations provided by Hadoop can be used as references:
+
+* `org.apache.hadoop.mapred.ShuffleHandler`
+* `org.apache.hadoop.mapreduce.task.reduce.Shuffle`
+
+A custom sort implementation requires a `org.apache.hadoop.mapred.MapOutputCollector` implementation class running in the Mapper tasks and (optionally, depending on the sort implementation) a `org.apache.hadoop.mapred.ShuffleConsumerPlugin` implementation class running in the Reducer tasks.
+
+The default implementations provided by Hadoop can be used as references:
+
+* `org.apache.hadoop.mapred.MapTask$MapOutputBuffer`
+* `org.apache.hadoop.mapreduce.task.reduce.Shuffle`
+
+Configuration
+-------------
+
+Except for the auxiliary service running in the NodeManagers serving the shuffle (by default the `ShuffleHandler`), all the pluggable components run in the job tasks. This means they can be configured on a per-job basis. The auxiliary service serving the shuffle must be configured in the NodeManagers' configuration.
+
+### Job Configuration Properties (on a per-job basis):
+
+| **Property** | **Default Value** | **Explanation** |
+|:---- |:---- |:---- |
+| `mapreduce.job.reduce.shuffle.consumer.plugin.class` | `org.apache.hadoop.mapreduce.task.reduce.Shuffle` | The `ShuffleConsumerPlugin` implementation to use |
+| `mapreduce.job.map.output.collector.class` | `org.apache.hadoop.mapred.MapTask$MapOutputBuffer` | The `MapOutputCollector` implementation(s) to use |
+
+These properties can also be set in the `mapred-site.xml` to change the default values for all jobs.
+
+The collector class configuration may specify a comma-separated list of collector implementations. In this case, the map task will attempt to instantiate each in turn until one of the implementations successfully initializes. This can be useful if a given collector implementation is only compatible with certain types of keys or values, for example.
+
+### NodeManager Configuration properties, `yarn-site.xml` in all nodes:
+
+| **Property** | **Default Value** | **Explanation** |
+|:---- |:---- |:---- |
+| `yarn.nodemanager.aux-services` | `...,mapreduce_shuffle` | The auxiliary service name |
+| `yarn.nodemanager.aux-services.mapreduce_shuffle.class` | `org.apache.hadoop.mapred.ShuffleHandler` | The auxiliary service class to use |
+
+**IMPORTANT:** If setting an auxiliary service in addition to the default
+`mapreduce_shuffle` service, then a new service key should be added to the
+`yarn.nodemanager.aux-services` property, for example `mapreduce_shufflex`.
+Then the property defining the corresponding class must be
+`yarn.nodemanager.aux-services.mapreduce_shufflex.class`.
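
Reviewer note: the paragraph on `mapreduce.job.map.output.collector.class` says the map task tries each listed collector in turn until one initializes successfully. The following is a language-neutral sketch of that fallback pattern, written in Python rather than taken from Hadoop's Java code. The collector classes and the `first_usable_collector` helper are hypothetical stand-ins.

```python
class TextOnlyCollector:
    """Hypothetical collector that only supports str keys."""

    def __init__(self, key_type):
        if key_type is not str:
            raise TypeError("TextOnlyCollector supports only str keys")
        self.key_type = key_type


class GenericCollector:
    """Hypothetical collector that accepts any key type."""

    def __init__(self, key_type):
        self.key_type = key_type


def first_usable_collector(collector_classes, key_type):
    """Try each class in order; return the first instance that initializes."""
    failures = []
    for cls in collector_classes:
        try:
            return cls(key_type)
        except Exception as exc:  # a failed init just moves on to the next one
            failures.append("%s: %s" % (cls.__name__, exc))
    raise RuntimeError("no collector could be initialized: " + "; ".join(failures))


if __name__ == "__main__":
    configured = [TextOnlyCollector, GenericCollector]  # order as in the list
    print(type(first_usable_collector(configured, str)).__name__)
    print(type(first_usable_collector(configured, int)).__name__)
```

The order of the list matters: a specialized collector placed first is preferred whenever it can handle the job's key type, and a general-purpose one placed last acts as the fallback.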

+ 0 - 2672
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/site/apt/HistoryServerRest.apt.vm

@@ -1,2672 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~   http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License. See accompanying LICENSE file.
-
-  ---
-  MapReduce History Server REST API's.
-  ---
-  ---
-  ${maven.build.timestamp}
-
-MapReduce History Server REST API's.
-
-%{toc|section=1|fromDepth=0|toDepth=3}
-
-* Overview
-
-  The history server REST API's allow the user to get status on finished applications.
-
-* History Server Information API
-
-  The history server information resource provides overall information about the history server. 
-
-** URI
-
-  Both of the following URI's give you the history server information, from an application id identified by the appid value. 
-
-------
-  * http://<history server http address:port>/ws/v1/history
-  * http://<history server http address:port>/ws/v1/history/info
-------
-
-** HTTP Operations Supported
-
-------
-  * GET
-------
-
-** Query Parameters Supported
-
-------
-  None
-------
-
-** Elements of the <historyInfo> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| startedOn | long  | The time the history server was started (in ms since epoch)|
-*---------------+--------------+-------------------------------+
-| hadoopVersion | string  | Version of hadoop common |
-*---------------+--------------+-------------------------------+
-| hadoopBuildVersion | string  | Hadoop common build string with build version, user, and checksum |
-*---------------+--------------+-------------------------------+
-| hadoopVersionBuiltOn | string  | Timestamp when hadoop common was built |
-*---------------+--------------+-------------------------------+
-
-** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/info
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Transfer-Encoding: chunked
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-{   
-   "historyInfo" : {
-      "startedOn":1353512830963,
-      "hadoopVersionBuiltOn" : "Wed Jan 11 21:18:36 UTC 2012",
-      "hadoopBuildVersion" : "0.23.1-SNAPSHOT from 1230253 by user1 source checksum bb6e554c6d50b0397d826081017437a7",
-      "hadoopVersion" : "0.23.1-SNAPSHOT"
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
------
-  GET http://<history server http address:port>/ws/v1/history/info
-  Accept: application/xml
------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 330
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<historyInfo>
-  <startedOn>1353512830963</startedOn>
-  <hadoopVersion>0.23.1-SNAPSHOT</hadoopVersion>
-  <hadoopBuildVersion>0.23.1-SNAPSHOT from 1230253 by user1 source checksum bb6e554c6d50b0397d826081017437a7</hadoopBuildVersion>
-  <hadoopVersionBuiltOn>Wed Jan 11 21:18:36 UTC 2012</hadoopVersionBuiltOn>
-</historyInfo>
-+---+
-
-* MapReduce API's
-
-  The following list of resources apply to MapReduce.
-
-** Jobs API
-
-  The jobs resource provides a list of the MapReduce jobs that have finished.  It does not currently return a full list of parameters
-
-*** URI
-
-------
-  *  http://<history server http address:port>/ws/v1/history/mapreduce/jobs
-------
-
-*** HTTP Operations Supported 
-
-------
-  * GET
-------
-
-*** Query Parameters Supported
-
-  Multiple paramters can be specified.  The started and finished times have a begin and end parameter to allow you to specify ranges.  For example, one could request all jobs that started between 1:00am and 2:00pm on 12/19/2011 with startedTimeBegin=1324256400&startedTimeEnd=1324303200. If the Begin parameter is not specfied, it defaults to 0, and if the End parameter is not specified, it defaults to infinity.
-
-------
-  * user - user name
-  * state - the job state
-  * queue - queue name
-  * limit - total number of app objects to be returned
-  * startedTimeBegin - jobs with start time beginning with this time, specified in ms since epoch
-  * startedTimeEnd - jobs with start time ending with this time, specified in ms since epoch
-  * finishedTimeBegin - jobs with finish time beginning with this time, specified in ms since epoch
-  * finishedTimeEnd - jobs with finish time ending with this time, specified in ms since epoch
-------
-
-*** Elements of the <jobs> object
-
-  When you make a request for the list of jobs, the information will be returned as an array of job objects.
-  See also {{Job API}} for syntax of the job object.  Except this is a subset of a full job.  Only startTime,
-  finishTime, id, name, queue, user, state, mapsTotal, mapsCompleted, reducesTotal, and reducesCompleted are
-  returned.
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type  || Description                   |
-*---------------+--------------+-------------------------------+
-| job | array of job objects(json)/zero or more job objects(XML) | The collection of job objects |
-*---------------+--------------+-------------------------------+
-
-*** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Transfer-Encoding: chunked
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-{
-   "jobs" : {
-      "job" : [
-         {
-            "submitTime" : 1326381344449,
-            "state" : "SUCCEEDED",
-            "user" : "user1",
-            "reducesTotal" : 1,
-            "mapsCompleted" : 1,
-            "startTime" : 1326381344489,
-            "id" : "job_1326381300833_1_1",
-            "name" : "word count",
-            "reducesCompleted" : 1,
-            "mapsTotal" : 1,
-            "queue" : "default",
-            "finishTime" : 1326381356010
-         },
-         {
-            "submitTime" : 1326381446500
-            "state" : "SUCCEEDED",
-            "user" : "user1",
-            "reducesTotal" : 1,
-            "mapsCompleted" : 1,
-            "startTime" : 1326381446529,
-            "id" : "job_1326381300833_2_2",
-            "name" : "Sleep job",
-            "reducesCompleted" : 1,
-            "mapsTotal" : 1,
-            "queue" : "default",
-            "finishTime" : 1326381582106
-         }
-      ]
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 1922
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<jobs>
-  <job>
-    <submitTime>1326381344449</submitTime>
-    <startTime>1326381344489</startTime>
-    <finishTime>1326381356010</finishTime>
-    <id>job_1326381300833_1_1</id>
-    <name>word count</name>
-    <queue>default</queue>
-    <user>user1</user>
-    <state>SUCCEEDED</state>
-    <mapsTotal>1</mapsTotal>
-    <mapsCompleted>1</mapsCompleted>
-    <reducesTotal>1</reducesTotal>
-    <reducesCompleted>1</reducesCompleted>
-  </job>
-  <job>
-    <submitTime>1326381446500</submitTime>
-    <startTime>1326381446529</startTime>
-    <finishTime>1326381582106</finishTime>
-    <id>job_1326381300833_2_2</id>
-    <name>Sleep job</name>
-    <queue>default</queue>
-    <user>user1</user>
-    <state>SUCCEEDED</state>
-    <mapsTotal>1</mapsTotal>
-    <mapsCompleted>1</mapsCompleted>
-    <reducesTotal>1</reducesTotal>
-    <reducesCompleted>1</reducesCompleted>
-  </job>
-</jobs>
-+---+
-
-** {Job API}
-
-  A Job resource contains information about a particular job identified by {jobid}. 
-
-*** URI
-
-------
-  * http://<history server http address:port>/ws/v1/history/mapreduce/jobs/{jobid}
-------
-
-*** HTTP Operations Supported 
-
-------
-  * GET
-------
-
-*** Query Parameters Supported
-
-------
-  None
-------
-
-*** Elements of the <job> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| id | string | The job id|
-*---------------+--------------+-------------------------------+
-| name | string | The job name |
-*---------------+--------------+-------------------------------+
-| queue | string | The queue the job was submitted to|
-*---------------+--------------+-------------------------------+
-| user | string | The user name |
-*---------------+--------------+-------------------------------+
-| state | string | the job state - valid values are:  NEW, INITED, RUNNING, SUCCEEDED, FAILED, KILL_WAIT, KILLED, ERROR|
-*---------------+--------------+-------------------------------+
-| diagnostics | string | A diagnostic message |
-*---------------+--------------+-------------------------------+
-| submitTime | long | The time the job submitted (in ms since epoch)|
-*---------------+--------------+-------------------------------+
-| startTime | long | The time the job started (in ms since epoch)|
-*---------------+--------------+-------------------------------+
-| finishTime | long | The time the job finished (in ms since epoch)|
-*---------------+--------------+-------------------------------+
-| mapsTotal | int | The total number of maps |
-*---------------+--------------+-------------------------------+
-| mapsCompleted | int | The number of completed maps |
-*---------------+--------------+-------------------------------+
-| reducesTotal | int | The total number of reduces |
-*---------------+--------------+-------------------------------+
-| reducesCompleted | int | The number of completed reduces|
-*---------------+--------------+-------------------------------+
-| uberized | boolean | Indicates if the job was an uber job - ran completely in the application master|
-*---------------+--------------+-------------------------------+
-| avgMapTime | long | The average time of a map task (in ms)|
-*---------------+--------------+-------------------------------+
-| avgReduceTime | long | The average time of the reduce (in ms)|
-*---------------+--------------+-------------------------------+
-| avgShuffleTime | long | The average time of the shuffle (in ms)|
-*---------------+--------------+-------------------------------+
-| avgMergeTime | long | The average time of the merge (in ms)|
-*---------------+--------------+-------------------------------+
-| failedReduceAttempts | int | The number of failed reduce attempts |
-*---------------+--------------+-------------------------------+
-| killedReduceAttempts | int | The number of killed reduce attempts |
-*---------------+--------------+-------------------------------+
-| successfulReduceAttempts | int | The number of successful reduce attempts |
-*---------------+--------------+-------------------------------+
-| failedMapAttempts | int | The number of failed map attempts |
-*---------------+--------------+-------------------------------+
-| killedMapAttempts | int | The number of killed map attempts |
-*---------------+--------------+-------------------------------+
-| successfulMapAttempts | int | The number of successful map attempts |
-*---------------+--------------+-------------------------------+
-| acls | array of acls(json)/zero or more acls objects(xml)| A collection of acls objects |
-*---------------+--------------+-------------------------------+
-
-** Elements of the <acls> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| value | string | The acl value|
-*---------------+--------------+-------------------------------+
-| name | string | The acl name |
-*---------------+--------------+-------------------------------+
-
-*** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Server: Jetty(6.1.26)
-  Content-Length: 720
-+---+
-
-  Response Body:
-
-+---+
-{
-   "job" : {
-      "submitTime":  1326381446500,
-      "avgReduceTime" : 124961,
-      "failedReduceAttempts" : 0,
-      "state" : "SUCCEEDED",
-      "successfulReduceAttempts" : 1,
-      "acls" : [
-         {
-            "value" : " ",
-            "name" : "mapreduce.job.acl-modify-job"
-         },
-         {
-            "value" : " ",
-            "name" : "mapreduce.job.acl-view-job"
-         }
-      ],
-      "user" : "user1",
-      "reducesTotal" : 1,
-      "mapsCompleted" : 1,
-      "startTime" : 1326381446529,
-      "id" : "job_1326381300833_2_2",
-      "avgMapTime" : 2638,
-      "successfulMapAttempts" : 1,
-      "name" : "Sleep job",
-      "avgShuffleTime" : 2540,
-      "reducesCompleted" : 1,
-      "diagnostics" : "",
-      "failedMapAttempts" : 0,
-      "avgMergeTime" : 2589,
-      "killedReduceAttempts" : 0,
-      "mapsTotal" : 1,
-      "queue" : "default",
-      "uberized" : false,
-      "killedMapAttempts" : 0,
-      "finishTime" : 1326381582106
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 983
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<job>
-  <submitTime>1326381446500</submitTime>
-  <startTime>1326381446529</startTime>
-  <finishTime>1326381582106</finishTime>
-  <id>job_1326381300833_2_2</id>
-  <name>Sleep job</name>
-  <queue>default</queue>
-  <user>user1</user>
-  <state>SUCCEEDED</state>
-  <mapsTotal>1</mapsTotal>
-  <mapsCompleted>1</mapsCompleted>
-  <reducesTotal>1</reducesTotal>
-  <reducesCompleted>1</reducesCompleted>
-  <uberized>false</uberized>
-  <diagnostics/>
-  <avgMapTime>2638</avgMapTime>
-  <avgReduceTime>124961</avgReduceTime>
-  <avgShuffleTime>2540</avgShuffleTime>
-  <avgMergeTime>2589</avgMergeTime>
-  <failedReduceAttempts>0</failedReduceAttempts>
-  <killedReduceAttempts>0</killedReduceAttempts>
-  <successfulReduceAttempts>1</successfulReduceAttempts>
-  <failedMapAttempts>0</failedMapAttempts>
-  <killedMapAttempts>0</killedMapAttempts>
-  <successfulMapAttempts>1</successfulMapAttempts>
-  <acls>
-    <name>mapreduce.job.acl-modify-job</name>
-    <value> </value>
-  </acls>
-  <acls>
-    <name>mapreduce.job.acl-view-job</name>
-    <value> </value>
-  </acls>
-</job>
-+---+
-
-** Job Attempts API
-
-  With the job attempts API, you can obtain a collection of resources that represent a job attempt.  When you run a GET operation on this resource, you obtain a collection of Job Attempt Objects. 
-
-*** URI
-
-------
-  * http://<history server http address:port>/ws/v1/history/mapreduce/jobs/{jobid}/jobattempts
-------
-
-*** HTTP Operations Supported 
-
-------
-  * GET
-------
-
-*** Query Parameters Supported
-
-------
-  None
-------
-
-*** Elements of the <jobAttempts> object
-
-  When you make a request for the list of job attempts, the information will be returned as an array of job attempt objects. 
-
-  jobAttempts:
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                  |
-*---------------+--------------+-------------------------------+
-| jobAttempt | array of job attempt objects(JSON)/zero or more job attempt objects(XML) | The collection of job attempt objects |
-*---------------+--------------+--------------------------------+
-
-*** Elements of the <jobAttempt> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                  |
-*---------------+--------------+-------------------------------+
-| id | string | The job attempt id |
-*---------------+--------------+--------------------------------+
-| nodeId | string | The node id of the node the attempt ran on|
-*---------------+--------------+--------------------------------+
-| nodeHttpAddress | string | The node http address of the node the attempt ran on|
-*---------------+--------------+--------------------------------+
-| logsLink | string | The http link to the job attempt logs |
-*---------------+--------------+--------------------------------+
-| containerId | string | The id of the container for the job attempt |
-*---------------+--------------+--------------------------------+
-| startTime | long | The start time of the attempt (in ms since epoch)|
-*---------------+--------------+--------------------------------+
-
-*** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/jobattempts
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Transfer-Encoding: chunked
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-{
-   "jobAttempts" : {
-      "jobAttempt" : [
-         {
-            "nodeId" : "host.domain.com:8041",
-            "nodeHttpAddress" : "host.domain.com:8042",
-            "startTime" : 1326381444693,
-            "id" : 1,
-            "logsLink" : "http://host.domain.com:19888/jobhistory/logs/host.domain.com:8041/container_1326381300833_0002_01_000001/job_1326381300833_2_2/user1",
-            "containerId" : "container_1326381300833_0002_01_000001"
-         }
-      ]
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/jobattempts
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 575
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<jobAttempts>
-  <jobAttempt>
-    <nodeHttpAddress>host.domain.com:8042</nodeHttpAddress>
-    <nodeId>host.domain.com:8041</nodeId>
-    <id>1</id>
-    <startTime>1326381444693</startTime>
-    <containerId>container_1326381300833_0002_01_000001</containerId>
-    <logsLink>http://host.domain.com:19888/jobhistory/logs/host.domain.com:8041/container_1326381300833_0002_01_000001/job_1326381300833_2_2/user1</logsLink>
-  </jobAttempt>
-</jobAttempts>
-+---+
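
A minimal sketch of reading the job attempts JSON response shown above, assuming the body has already been fetched. The helper name `list_job_attempts` is hypothetical and not part of Hadoop; the sample body is copied from the response example.

```python
import json

# Body copied from the JSON response example for the jobattempts resource.
SAMPLE_BODY = """
{
   "jobAttempts" : {
      "jobAttempt" : [
         {
            "nodeId" : "host.domain.com:8041",
            "nodeHttpAddress" : "host.domain.com:8042",
            "startTime" : 1326381444693,
            "id" : 1,
            "logsLink" : "http://host.domain.com:19888/jobhistory/logs/host.domain.com:8041/container_1326381300833_0002_01_000001/job_1326381300833_2_2/user1",
            "containerId" : "container_1326381300833_0002_01_000001"
         }
      ]
   }
}
"""


def list_job_attempts(body):
    """Return the job attempt objects from a jobattempts JSON body.

    Hypothetical helper: tolerates a missing or null wrapper by
    returning an empty list.
    """
    wrapper = json.loads(body).get("jobAttempts") or {}
    return wrapper.get("jobAttempt", [])


attempts = list_job_attempts(SAMPLE_BODY)
for attempt in attempts:
    print(attempt["id"], attempt["nodeId"], attempt["containerId"])
```

Returning an empty list for a missing wrapper is a design choice of this sketch, so callers can iterate without a separate check.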
-
-** Job Counters API
-
-  With the job counters API, you can obtain a collection of resources that represent all the counters for that job.
-
-*** URI
-
-------
-  * http://<history server http address:port>/ws/v1/history/mapreduce/jobs/{jobid}/counters
-------
-
-*** HTTP Operations Supported 
-
-------
-  * GET
-------
-
-*** Query Parameters Supported
-
-------
-  None
-------
-
-*** Elements of the <jobCounters> object
-  
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| id | string | The job id |
-*---------------+--------------+-------------------------------+
-| counterGroup | array of counterGroup objects(JSON)/zero or more counterGroup objects(XML) | A collection of counter group objects |
-*---------------+--------------+-------------------------------+
-
-*** Elements of the <counterGroup> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| counterGroupName | string | The name of the counter group |
-*---------------+--------------+-------------------------------+
-| counter | array of counter objects(JSON)/zero or more counter objects(XML) | A collection of counter objects |
-*---------------+--------------+-------------------------------+
-
-*** Elements of the <counter> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| name | string | The name of the counter |
-*---------------+--------------+-------------------------------+
-| reduceCounterValue | long | The counter value of reduce tasks |
-*---------------+--------------+-------------------------------+
-| mapCounterValue | long | The counter value of map tasks |
-*---------------+--------------+-------------------------------+
-| totalCounterValue | long | The counter value of all tasks |
-*---------------+--------------+-------------------------------+
-
-*** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/counters
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Transfer-Encoding: chunked
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-{
-   "jobCounters" : {
-      "id" : "job_1326381300833_2_2",
-      "counterGroup" : [
-         {
-            "counterGroupName" : "Shuffle Errors",
-            "counter" : [
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "BAD_ID"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "CONNECTION"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "IO_ERROR"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "WRONG_LENGTH"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "WRONG_MAP"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "WRONG_REDUCE"
-               }
-            ]
-          },
-         {
-            "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
-            "counter" : [
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 2483,
-                  "name" : "FILE_BYTES_READ"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 108525,
-                  "name" : "FILE_BYTES_WRITTEN"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "FILE_READ_OPS"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "FILE_LARGE_READ_OPS"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "FILE_WRITE_OPS"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 48,
-                  "name" : "HDFS_BYTES_READ"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "HDFS_BYTES_WRITTEN"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 1,
-                  "name" : "HDFS_READ_OPS"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "HDFS_LARGE_READ_OPS"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "HDFS_WRITE_OPS"
-               }
-            ]
-         },
-         {
-            "counterGroupName" : "org.apache.hadoop.mapreduce.TaskCounter",
-            "counter" : [
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 1,
-                  "name" : "MAP_INPUT_RECORDS"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 1200,
-                  "name" : "MAP_OUTPUT_RECORDS"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 4800,
-                  "name" : "MAP_OUTPUT_BYTES"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 2235,
-                  "name" : "MAP_OUTPUT_MATERIALIZED_BYTES"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 48,
-                  "name" : "SPLIT_RAW_BYTES"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "COMBINE_INPUT_RECORDS"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "COMBINE_OUTPUT_RECORDS"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 1200,
-                  "name" : "REDUCE_INPUT_GROUPS"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 2235,
-                  "name" : "REDUCE_SHUFFLE_BYTES"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 1200,
-                  "name" : "REDUCE_INPUT_RECORDS"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "REDUCE_OUTPUT_RECORDS"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 2400,
-                  "name" : "SPILLED_RECORDS"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 1,
-                  "name" : "SHUFFLED_MAPS"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "FAILED_SHUFFLE"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 1,
-                  "name" : "MERGED_MAP_OUTPUTS"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 113,
-                  "name" : "GC_TIME_MILLIS"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 1830,
-                  "name" : "CPU_MILLISECONDS"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 478068736,
-                  "name" : "PHYSICAL_MEMORY_BYTES"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 2159284224,
-                  "name" : "VIRTUAL_MEMORY_BYTES"
-               },
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 378863616,
-                  "name" : "COMMITTED_HEAP_BYTES"
-               }
-            ]
-         },
-         {
-            "counterGroupName" : "org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter",
-            "counter" : [
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "BYTES_READ"
-               }
-            ]
-         },
-         {
-            "counterGroupName" : "org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter",
-            "counter" : [
-               {
-                  "reduceCounterValue" : 0,
-                  "mapCounterValue" : 0,
-                  "totalCounterValue" : 0,
-                  "name" : "BYTES_WRITTEN"
-               }
-            ]
-         }
-      ]
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/counters
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 7030
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<jobCounters>
-  <id>job_1326381300833_2_2</id>
-  <counterGroup>
-    <counterGroupName>Shuffle Errors</counterGroupName>
-    <counter>
-      <name>BAD_ID</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>CONNECTION</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>IO_ERROR</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>WRONG_LENGTH</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>WRONG_MAP</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>WRONG_REDUCE</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-  </counterGroup>
-  <counterGroup>
-    <counterGroupName>org.apache.hadoop.mapreduce.FileSystemCounter</counterGroupName>
-    <counter>
-      <name>FILE_BYTES_READ</name>
-      <totalCounterValue>2483</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>FILE_BYTES_WRITTEN</name>
-      <totalCounterValue>108525</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>FILE_READ_OPS</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>FILE_LARGE_READ_OPS</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>FILE_WRITE_OPS</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>HDFS_BYTES_READ</name>
-      <totalCounterValue>48</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>HDFS_BYTES_WRITTEN</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>HDFS_READ_OPS</name>
-      <totalCounterValue>1</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>HDFS_LARGE_READ_OPS</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>HDFS_WRITE_OPS</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-  </counterGroup>
-  <counterGroup>
-    <counterGroupName>org.apache.hadoop.mapreduce.TaskCounter</counterGroupName>
-    <counter>
-      <name>MAP_INPUT_RECORDS</name>
-      <totalCounterValue>1</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>MAP_OUTPUT_RECORDS</name>
-      <totalCounterValue>1200</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>MAP_OUTPUT_BYTES</name>
-      <totalCounterValue>4800</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>MAP_OUTPUT_MATERIALIZED_BYTES</name>
-      <totalCounterValue>2235</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>SPLIT_RAW_BYTES</name>
-      <totalCounterValue>48</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>COMBINE_INPUT_RECORDS</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>COMBINE_OUTPUT_RECORDS</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>REDUCE_INPUT_GROUPS</name>
-      <totalCounterValue>1200</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>REDUCE_SHUFFLE_BYTES</name>
-      <totalCounterValue>2235</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>REDUCE_INPUT_RECORDS</name>
-      <totalCounterValue>1200</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>REDUCE_OUTPUT_RECORDS</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>SPILLED_RECORDS</name>
-      <totalCounterValue>2400</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>SHUFFLED_MAPS</name>
-      <totalCounterValue>1</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>FAILED_SHUFFLE</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>MERGED_MAP_OUTPUTS</name>
-      <totalCounterValue>1</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>GC_TIME_MILLIS</name>
-      <totalCounterValue>113</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>CPU_MILLISECONDS</name>
-      <totalCounterValue>1830</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>PHYSICAL_MEMORY_BYTES</name>
-      <totalCounterValue>478068736</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>VIRTUAL_MEMORY_BYTES</name>
-      <totalCounterValue>2159284224</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-    <counter>
-      <name>COMMITTED_HEAP_BYTES</name>
-      <totalCounterValue>378863616</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-  </counterGroup>
-  <counterGroup>
-    <counterGroupName>org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter</counterGroupName>
-    <counter>
-      <name>BYTES_READ</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-  </counterGroup>
-  <counterGroup>
-    <counterGroupName>org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter</counterGroupName>
-    <counter>
-      <name>BYTES_WRITTEN</name>
-      <totalCounterValue>0</totalCounterValue>
-      <mapCounterValue>0</mapCounterValue>
-      <reduceCounterValue>0</reduceCounterValue>
-    </counter>
-  </counterGroup>
-</jobCounters>
-+---+
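
The nesting of `jobCounters`, `counterGroup` and `counter` described above can be flattened into a simple lookup table. The following is a sketch only: `total_counters` is a hypothetical helper, and the sample data is a trimmed subset of the JSON response example.

```python
# Trimmed subset of the jobCounters JSON response example, already decoded.
SAMPLE_JOB_COUNTERS = {
    "id": "job_1326381300833_2_2",
    "counterGroup": [
        {
            "counterGroupName": "org.apache.hadoop.mapreduce.FileSystemCounter",
            "counter": [
                {"reduceCounterValue": 0, "mapCounterValue": 0,
                 "totalCounterValue": 2483, "name": "FILE_BYTES_READ"},
                {"reduceCounterValue": 0, "mapCounterValue": 0,
                 "totalCounterValue": 48, "name": "HDFS_BYTES_READ"},
            ],
        },
        {
            "counterGroupName": "org.apache.hadoop.mapreduce.TaskCounter",
            "counter": [
                {"reduceCounterValue": 0, "mapCounterValue": 0,
                 "totalCounterValue": 1200, "name": "MAP_OUTPUT_RECORDS"},
            ],
        },
    ],
}


def total_counters(job_counters):
    """Map (group name, counter name) to totalCounterValue (hypothetical helper)."""
    totals = {}
    for group in job_counters.get("counterGroup", []):
        for counter in group.get("counter", []):
            key = (group["counterGroupName"], counter["name"])
            totals[key] = counter["totalCounterValue"]
    return totals


totals = total_counters(SAMPLE_JOB_COUNTERS)
for (group_name, counter_name), value in sorted(totals.items()):
    print(group_name, counter_name, value)
```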
-
-
-** Job Conf API
-
-  A job configuration resource contains information about the job configuration for this job.
-
-*** URI
-
-  Use the following URI to obtain the job configuration information for a job identified by the {jobid} value.
-
-------
-  * http://<history server http address:port>/ws/v1/history/mapreduce/jobs/{jobid}/conf
-------
-
-*** HTTP Operations Supported 
-
-------
-  * GET
-------
-
-*** Query Parameters Supported
-
-------
-  None
-------
-
-*** Elements of the <conf> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| path | string | The path to the job configuration file|
-*---------------+--------------+-------------------------------+
-| property | array of the configuration properties(JSON)/zero or more configuration properties(XML) | Collection of configuration property objects|
-*---------------+--------------+-------------------------------+
-
-*** Elements of the <property> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| name | string | The name of the configuration property |
-*---------------+--------------+-------------------------------+
-| value | string | The value of the configuration property |
-*---------------+--------------+-------------------------------+
-| source | string | The location this configuration object came from. If there is more than one of these, it shows the history with the latest source at the end of the list. |
-*---------------+--------------+-------------------------------+
-
-
-*** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/conf
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Transfer-Encoding: chunked
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-  This is a small snippet of the output, as the full output is very large. The real output contains every property in your job configuration file.
-
-+---+
-{
-   "conf" : {
-      "path" : "hdfs://host.domain.com:9000/user/user1/.staging/job_1326381300833_0002/job.xml",
-      "property" : [
-         {
-            "value" : "/home/hadoop/hdfs/data",
-            "name" : "dfs.datanode.data.dir",
-            "source" : ["hdfs-site.xml", "job.xml"]
-         },
-         {
-            "value" : "org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer",
-            "name" : "hadoop.http.filter.initializers",
-            "source" : ["programmatically", "job.xml"]
-         },
-         {
-            "value" : "/home/hadoop/tmp",
-            "name" : "mapreduce.cluster.temp.dir",
-            "source" : ["mapred-site.xml"]
-         },
-         ...
-      ]
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/conf
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 552
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<conf>
-  <path>hdfs://host.domain.com:9000/user/user1/.staging/job_1326381300833_0002/job.xml</path>
-  <property>
-    <name>dfs.datanode.data.dir</name>
-    <value>/home/hadoop/hdfs/data</value>
-    <source>hdfs-site.xml</source>
-    <source>job.xml</source>
-  </property>
-  <property>
-    <name>hadoop.http.filter.initializers</name>
-    <value>org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer</value>
-    <source>programmatically</source>
-    <source>job.xml</source>
-  </property>
-  <property>
-    <name>mapreduce.cluster.temp.dir</name>
-    <value>/home/hadoop/tmp</value>
-    <source>mapred-site.xml</source>
-  </property>
-  ...
-</conf>
-+---+
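
Because the `source` list ends with the latest source, the origin of each property's current value can be read from the last entry. The sketch below illustrates this on the three properties from the example; `effective_properties` is a hypothetical helper, not a Hadoop API.

```python
# The three properties from the conf JSON response example, already decoded.
SAMPLE_CONF = {
    "path": "hdfs://host.domain.com:9000/user/user1/.staging/job_1326381300833_0002/job.xml",
    "property": [
        {"value": "/home/hadoop/hdfs/data",
         "name": "dfs.datanode.data.dir",
         "source": ["hdfs-site.xml", "job.xml"]},
        {"value": "org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer",
         "name": "hadoop.http.filter.initializers",
         "source": ["programmatically", "job.xml"]},
        {"value": "/home/hadoop/tmp",
         "name": "mapreduce.cluster.temp.dir",
         "source": ["mapred-site.xml"]},
    ],
}


def effective_properties(conf):
    """Map each property name to (value, latest source).

    Hypothetical helper: the latest source is the last element of the
    'source' list, or None when the list is absent or empty.
    """
    result = {}
    for prop in conf.get("property", []):
        sources = prop.get("source") or []
        latest = sources[-1] if sources else None
        result[prop["name"]] = (prop["value"], latest)
    return result


props = effective_properties(SAMPLE_CONF)
for name, (value, latest) in props.items():
    print(name, "=", value, "(from", latest, ")")
```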
-
-** Tasks API
-
-  With the tasks API, you can obtain a collection of resources that represent the tasks within a job.  When you run a GET operation on this resource, you obtain a collection of Task Objects.
-
-*** URI
-
-------
-  * http://<history server http address:port>/ws/v1/history/mapreduce/jobs/{jobid}/tasks
-------
-
-*** HTTP Operations Supported 
-
-------
-  * GET
-------
-
-*** Query Parameters Supported
-
-------
-  * type - type of task, valid values are m or r.  m for map task or r for reduce task.
-------
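
As an illustration of the `type` query parameter, the sketch below builds the tasks URI for a job and rejects values other than `m` or `r`. The function name `tasks_url` and the example host and port are assumptions made for this sketch only.

```python
from urllib.parse import quote, urlencode

# The only values the 'type' query parameter accepts, per the description above.
VALID_TASK_TYPES = ("m", "r")


def tasks_url(history_server, job_id, task_type=None):
    """Build the tasks URI for a job, optionally filtered by task type.

    Hypothetical helper: 'history_server' is the scheme, host and port
    of the history server, for example "http://host.domain.com:19888".
    """
    url = "{}/ws/v1/history/mapreduce/jobs/{}/tasks".format(
        history_server.rstrip("/"), quote(job_id, safe=""))
    if task_type is not None:
        if task_type not in VALID_TASK_TYPES:
            raise ValueError("type must be 'm' (map) or 'r' (reduce)")
        url += "?" + urlencode({"type": task_type})
    return url


print(tasks_url("http://host.domain.com:19888", "job_1326381300833_2_2", "m"))
```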
-
-*** Elements of the <tasks> object
-
-  When you make a request for the list of tasks, the information will be returned as an array of task objects.
-  See also {{Task API}} for syntax of the task object.
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                  |
-*---------------+--------------+-------------------------------+
-| task | array of task objects(JSON)/zero or more task objects(XML) | The collection of task objects. |
-*---------------+--------------+--------------------------------+
-
-*** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Transfer-Encoding: chunked
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-{
-   "tasks" : {
-      "task" : [
-         {
-            "progress" : 100,
-            "elapsedTime" : 6777,
-            "state" : "SUCCEEDED",
-            "startTime" : 1326381446541,
-            "id" : "task_1326381300833_2_2_m_0",
-            "type" : "MAP",
-            "successfulAttempt" : "attempt_1326381300833_2_2_m_0_0",
-            "finishTime" : 1326381453318
-         },
-         {
-            "progress" : 100,
-            "elapsedTime" : 135559,
-            "state" : "SUCCEEDED",
-            "startTime" : 1326381446544,
-            "id" : "task_1326381300833_2_2_r_0",
-            "type" : "REDUCE",
-            "successfulAttempt" : "attempt_1326381300833_2_2_r_0_0",
-            "finishTime" : 1326381582103
-         }
-      ]
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 653
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<tasks>
-  <task>
-    <startTime>1326381446541</startTime>
-    <finishTime>1326381453318</finishTime>
-    <elapsedTime>6777</elapsedTime>
-    <progress>100.0</progress>
-    <id>task_1326381300833_2_2_m_0</id>
-    <state>SUCCEEDED</state>
-    <type>MAP</type>
-    <successfulAttempt>attempt_1326381300833_2_2_m_0_0</successfulAttempt>
-  </task>
-  <task>
-    <startTime>1326381446544</startTime>
-    <finishTime>1326381582103</finishTime>
-    <elapsedTime>135559</elapsedTime>
-    <progress>100.0</progress>
-    <id>task_1326381300833_2_2_r_0</id>
-    <state>SUCCEEDED</state>
-    <type>REDUCE</type>
-    <successfulAttempt>attempt_1326381300833_2_2_r_0_0</successfulAttempt>
-  </task>
-</tasks>
-+---+
-
-** {Task API}
-
-  A Task resource contains information about a particular task within a job. 
-
-*** URI
-
-  Use the following URI to obtain a Task Object for a task identified by the {taskid} value.
-
-------
-  * http://<history server http address:port>/ws/v1/history/mapreduce/jobs/{jobid}/tasks/{taskid}
-------
-
-*** HTTP Operations Supported 
-
-------
-  * GET
-------
-
-*** Query Parameters Supported
-
-------
-  None
-------
-
-*** Elements of the <task> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                  |
-*---------------+--------------+-------------------------------+
-| id | string  | The task id | 
-*---------------+--------------+--------------------------------+
-| state | string | The state of the task - valid values are:  NEW, SCHEDULED, RUNNING, SUCCEEDED, FAILED, KILL_WAIT, KILLED |
-*---------------+--------------+--------------------------------+
-| type | string | The task type - MAP or REDUCE|
-*---------------+--------------+--------------------------------+
-| successfulAttempt | string | The id of the last successful attempt |
-*---------------+--------------+--------------------------------+
-| progress | float | The progress of the task as a percent|
-*---------------+--------------+--------------------------------+
-| startTime | long | The time at which the task started (in ms since epoch) or -1 if it was never started |
-*---------------+--------------+--------------------------------+
-| finishTime | long | The time at which the task finished (in ms since epoch)|
-*---------------+--------------+--------------------------------+
-| elapsedTime | long | The elapsed time since the application started (in ms)|
-*---------------+--------------+--------------------------------+
-
-
-*** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks/task_1326381300833_2_2_m_0
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Transfer-Encoding: chunked
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-{
-   "task" : {
-      "progress" : 100,
-      "elapsedTime" : 6777,
-      "state" : "SUCCEEDED",
-      "startTime" : 1326381446541,
-      "id" : "task_1326381300833_2_2_m_0",
-      "type" : "MAP",
-      "successfulAttempt" : "attempt_1326381300833_2_2_m_0_0",
-      "finishTime" : 1326381453318
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks/task_1326381300833_2_2_m_0
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 299
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<task>
-  <startTime>1326381446541</startTime>
-  <finishTime>1326381453318</finishTime>
-  <elapsedTime>6777</elapsedTime>
-  <progress>100.0</progress>
-  <id>task_1326381300833_2_2_m_0</id>
-  <state>SUCCEEDED</state>
-  <type>MAP</type>
-  <successfulAttempt>attempt_1326381300833_2_2_m_0_0</successfulAttempt>
-</task>
-+---+
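
As a rough illustration of the Task API described above, the following Python sketch builds the task URI and reads the fields of the `task` object from the JSON example. The host name and port in `BASE` are assumptions made for the example, not values taken from this document, and `task_url` and `parse_task` are hypothetical helper names.

```python
import json

# Assumed base address for illustration only; substitute the real
# <history server http address:port> of your cluster.
BASE = "http://historyserver.example.com:19888/ws/v1/history/mapreduce"


def task_url(job_id: str, task_id: str) -> str:
    """Build the Task API URI for one task of one job (hypothetical helper)."""
    return f"{BASE}/jobs/{job_id}/tasks/{task_id}"


# Response body copied from the JSON response example above.
SAMPLE_BODY = """
{
   "task" : {
      "progress" : 100,
      "elapsedTime" : 6777,
      "state" : "SUCCEEDED",
      "startTime" : 1326381446541,
      "id" : "task_1326381300833_2_2_m_0",
      "type" : "MAP",
      "successfulAttempt" : "attempt_1326381300833_2_2_m_0_0",
      "finishTime" : 1326381453318
   }
}
"""


def parse_task(body: str) -> dict:
    """Return the inner task object from a Task API JSON response."""
    return json.loads(body)["task"]


task = parse_task(SAMPLE_BODY)
# For a finished task, elapsedTime matches finishTime - startTime in the sample.
derived_elapsed = task["finishTime"] - task["startTime"]
print(task_url("job_1326381300833_2_2", task["id"]))
print(task["state"], task["type"], derived_elapsed)
```

In the sample, the difference between `finishTime` and `startTime` equals the reported `elapsedTime`, which is a quick sanity check when consuming this resource.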
-
-** Task Counters API
-
-  With the task counters API, you can obtain a collection of resources that represent all the counters for that task. 
-
-*** URI
-
-------
-  * http://<history server http address:port>/ws/v1/history/mapreduce/jobs/{jobid}/tasks/{taskid}/counters
-------
-
-*** HTTP Operations Supported 
-
-------
-  * GET
-------
-
-*** Query Parameters Supported
-
-------
-  None
-------
-
-*** Elements of the <jobTaskCounters> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| id | string | The task id |
-*---------------+--------------+-------------------------------+
-| taskCounterGroup | array of counterGroup objects(JSON)/zero or more counterGroup objects(XML) | A collection of counter group objects |
-*---------------+--------------+-------------------------------+
-
-*** Elements of the <counterGroup> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| counterGroupName | string | The name of the counter group |
-*---------------+--------------+-------------------------------+
-| counter | array of counter objects(JSON)/zero or more counter objects(XML) | A collection of counter objects |
-*---------------+--------------+-------------------------------+
-
-*** Elements of the <counter> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| name | string | The name of the counter |
-*---------------+--------------+-------------------------------+
-| value | long | The value of the counter |
-*---------------+--------------+-------------------------------+
-
-*** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks/task_1326381300833_2_2_m_0/counters
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Transfer-Encoding: chunked
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-{
-   "jobTaskCounters" : {
-      "id" : "task_1326381300833_2_2_m_0",
-      "taskCounterGroup" : [
-         {
-            "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
-            "counter" : [
-               {
-                  "value" : 2363,
-                  "name" : "FILE_BYTES_READ"
-               },
-               {
-                  "value" : 54372,
-                  "name" : "FILE_BYTES_WRITTEN"
-               },
-               {
-                  "value" : 0,
-                  "name" : "FILE_READ_OPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "FILE_LARGE_READ_OPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "FILE_WRITE_OPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "HDFS_BYTES_READ"
-               },
-               {
-                  "value" : 0,
-                  "name" : "HDFS_BYTES_WRITTEN"
-               },
-               {
-                  "value" : 0,
-                  "name" : "HDFS_READ_OPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "HDFS_LARGE_READ_OPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "HDFS_WRITE_OPS"
-               }
-            ]
-         },
-         {
-            "counterGroupName" : "org.apache.hadoop.mapreduce.TaskCounter",
-            "counter" : [
-               {
-                  "value" : 0,
-                  "name" : "COMBINE_INPUT_RECORDS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "COMBINE_OUTPUT_RECORDS"
-               },
-               {
-                  "value" : 460,
-                  "name" : "REDUCE_INPUT_GROUPS"
-               },
-               {
-                  "value" : 2235,
-                  "name" : "REDUCE_SHUFFLE_BYTES"
-               },
-               {
-                  "value" : 460,
-                  "name" : "REDUCE_INPUT_RECORDS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "REDUCE_OUTPUT_RECORDS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "SPILLED_RECORDS"
-               },
-               {
-                  "value" : 1,
-                  "name" : "SHUFFLED_MAPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "FAILED_SHUFFLE"
-               },
-               {
-                  "value" : 1,
-                  "name" : "MERGED_MAP_OUTPUTS"
-               },
-               {
-                  "value" : 26,
-                  "name" : "GC_TIME_MILLIS"
-               },
-               {
-                  "value" : 860,
-                  "name" : "CPU_MILLISECONDS"
-               },
-               {
-                  "value" : 107839488,
-                  "name" : "PHYSICAL_MEMORY_BYTES"
-               },
-               {
-                  "value" : 1123147776,
-                  "name" : "VIRTUAL_MEMORY_BYTES"
-               },
-               {
-                  "value" : 57475072,
-                  "name" : "COMMITTED_HEAP_BYTES"
-               }
-            ]
-         },
-         {
-            "counterGroupName" : "Shuffle Errors",
-            "counter" : [
-               {
-                  "value" : 0,
-                  "name" : "BAD_ID"
-               },
-               {
-                  "value" : 0,
-                  "name" : "CONNECTION"
-               },
-               {
-                  "value" : 0,
-                  "name" : "IO_ERROR"
-               },
-               {
-                  "value" : 0,
-                  "name" : "WRONG_LENGTH"
-               },
-               {
-                  "value" : 0,
-                  "name" : "WRONG_MAP"
-               },
-               {
-                  "value" : 0,
-                  "name" : "WRONG_REDUCE"
-               }
-            ]
-         },
-         {
-            "counterGroupName" : "org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter",
-            "counter" : [
-               {
-                  "value" : 0,
-                  "name" : "BYTES_WRITTEN"
-               }
-            ]
-         }
-      ]
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks/task_1326381300833_2_2_m_0/counters
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 2660
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<jobTaskCounters>
-  <id>task_1326381300833_2_2_m_0</id>
-  <taskCounterGroup>
-    <counterGroupName>org.apache.hadoop.mapreduce.FileSystemCounter</counterGroupName>
-    <counter>
-      <name>FILE_BYTES_READ</name>
-      <value>2363</value>
-    </counter>
-    <counter>
-      <name>FILE_BYTES_WRITTEN</name>
-      <value>54372</value>
-    </counter>
-    <counter>
-      <name>FILE_READ_OPS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>FILE_LARGE_READ_OPS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>FILE_WRITE_OPS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>HDFS_BYTES_READ</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>HDFS_BYTES_WRITTEN</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>HDFS_READ_OPS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>HDFS_LARGE_READ_OPS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>HDFS_WRITE_OPS</name>
-      <value>0</value>
-    </counter>
-  </taskCounterGroup>
-  <taskCounterGroup>
-    <counterGroupName>org.apache.hadoop.mapreduce.TaskCounter</counterGroupName>
-    <counter>
-      <name>COMBINE_INPUT_RECORDS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>COMBINE_OUTPUT_RECORDS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>REDUCE_INPUT_GROUPS</name>
-      <value>460</value>
-    </counter>
-    <counter>
-      <name>REDUCE_SHUFFLE_BYTES</name>
-      <value>2235</value>
-    </counter>
-    <counter>
-      <name>REDUCE_INPUT_RECORDS</name>
-      <value>460</value>
-    </counter>
-    <counter>
-      <name>REDUCE_OUTPUT_RECORDS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>SPILLED_RECORDS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>SHUFFLED_MAPS</name>
-      <value>1</value>
-    </counter>
-    <counter>
-      <name>FAILED_SHUFFLE</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>MERGED_MAP_OUTPUTS</name>
-      <value>1</value>
-    </counter>
-    <counter>
-      <name>GC_TIME_MILLIS</name>
-      <value>26</value>
-    </counter>
-    <counter>
-      <name>CPU_MILLISECONDS</name>
-      <value>860</value>
-    </counter>
-    <counter>
-      <name>PHYSICAL_MEMORY_BYTES</name>
-      <value>107839488</value>
-    </counter>
-    <counter>
-      <name>VIRTUAL_MEMORY_BYTES</name>
-      <value>1123147776</value>
-    </counter>
-    <counter>
-      <name>COMMITTED_HEAP_BYTES</name>
-      <value>57475072</value>
-    </counter>
-  </taskCounterGroup>
-  <taskCounterGroup>
-    <counterGroupName>Shuffle Errors</counterGroupName>
-    <counter>
-      <name>BAD_ID</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>CONNECTION</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>IO_ERROR</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>WRONG_LENGTH</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>WRONG_MAP</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>WRONG_REDUCE</name>
-      <value>0</value>
-    </counter>
-  </taskCounterGroup>
-  <taskCounterGroup>
-    <counterGroupName>org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter</counterGroupName>
-    <counter>
-      <name>BYTES_WRITTEN</name>
-      <value>0</value>
-    </counter>
-  </taskCounterGroup>
-</jobTaskCounters>
-+---+
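
The nested `jobTaskCounters` structure shown above is often easier to work with once flattened. The sketch below is one possible way to do that in Python, using a shortened copy of the JSON example; `flatten_counters` is a hypothetical helper name, not part of any Hadoop client library.

```python
import json

# Shortened copy of the JSON response example above (two groups only).
SAMPLE_BODY = """
{
   "jobTaskCounters" : {
      "id" : "task_1326381300833_2_2_m_0",
      "taskCounterGroup" : [
         {
            "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
            "counter" : [
               { "value" : 2363, "name" : "FILE_BYTES_READ" },
               { "value" : 54372, "name" : "FILE_BYTES_WRITTEN" }
            ]
         },
         {
            "counterGroupName" : "Shuffle Errors",
            "counter" : [
               { "value" : 0, "name" : "BAD_ID" }
            ]
         }
      ]
   }
}
"""


def flatten_counters(body, root="jobTaskCounters", group_key="taskCounterGroup"):
    """Return (id, {(group name, counter name): value}) from a counters response."""
    doc = json.loads(body)[root]
    flat = {}
    for group in doc.get(group_key, []):
        for counter in group.get("counter", []):
            flat[(group["counterGroupName"], counter["name"])] = counter["value"]
    return doc["id"], flat


task_id, counters = flatten_counters(SAMPLE_BODY)
fs_group = "org.apache.hadoop.mapreduce.FileSystemCounter"
print(task_id, counters[(fs_group, "FILE_BYTES_WRITTEN")])
```

The `root` and `group_key` parameters are there because the task attempt counters response uses the same shape under different names (`jobTaskAttemptCounters` and `taskAttemptCounterGroup`).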
-
-** Task Attempts API
-
-  With the task attempts API, you can obtain a collection of resources that represent the attempts of a task within a job.  When you run a GET operation on this resource, you obtain a collection of Task Attempt Objects. 
-
-*** URI
-
-------
-  * http://<history server http address:port>/ws/v1/history/mapreduce/jobs/{jobid}/tasks/{taskid}/attempts
-------
-
-*** HTTP Operations Supported 
-
-------
-  * GET
-------
-
-*** Query Parameters Supported
-
-------
-  None
-------
-
-*** Elements of the <taskAttempts> object
-
-  When you make a request for the list of task attempts, the information will be returned as an array of task attempt objects. 
-  See also {{Task Attempt API}} for syntax of the task attempt object.
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                  |
-*---------------+--------------+-------------------------------+
-| taskAttempt | array of task attempt objects(JSON)/zero or more task attempt objects(XML) | The collection of task attempt objects |
-*---------------+--------------+--------------------------------+
-
-*** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks/task_1326381300833_2_2_m_0/attempts
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Transfer-Encoding: chunked
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-{
-   "taskAttempts" : {
-      "taskAttempt" : [
-         {
-            "assignedContainerId" : "container_1326381300833_0002_01_000002",
-            "progress" : 100,
-            "elapsedTime" : 2638,
-            "state" : "SUCCEEDED",
-            "diagnostics" : "",
-            "rack" : "/98.139.92.0",
-            "nodeHttpAddress" : "host.domain.com:8042",
-            "startTime" : 1326381450680,
-            "id" : "attempt_1326381300833_2_2_m_0_0",
-            "type" : "MAP",
-            "finishTime" : 1326381453318
-         }
-      ]
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks/task_1326381300833_2_2_m_0/attempts
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 537
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<taskAttempts>
-  <taskAttempt>
-    <startTime>1326381450680</startTime>
-    <finishTime>1326381453318</finishTime>
-    <elapsedTime>2638</elapsedTime>
-    <progress>100.0</progress>
-    <id>attempt_1326381300833_2_2_m_0_0</id>
-    <rack>/98.139.92.0</rack>
-    <state>SUCCEEDED</state>
-    <nodeHttpAddress>host.domain.com:8042</nodeHttpAddress>
-    <diagnostics/>
-    <type>MAP</type>
-    <assignedContainerId>container_1326381300833_0002_01_000002</assignedContainerId>
-  </taskAttempt>
-</taskAttempts>
-+---+
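
For clients that request `application/xml`, the `taskAttempts` collection can be read with a standard XML parser. The following is a minimal sketch using Python's `xml.etree.ElementTree` on the XML example above (the XML declaration is left out of the sample string for brevity); `parse_attempts` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Body copied from the XML response example above, without the XML declaration.
SAMPLE_XML = """<taskAttempts>
  <taskAttempt>
    <startTime>1326381450680</startTime>
    <finishTime>1326381453318</finishTime>
    <elapsedTime>2638</elapsedTime>
    <progress>100.0</progress>
    <id>attempt_1326381300833_2_2_m_0_0</id>
    <rack>/98.139.92.0</rack>
    <state>SUCCEEDED</state>
    <nodeHttpAddress>host.domain.com:8042</nodeHttpAddress>
    <diagnostics/>
    <type>MAP</type>
    <assignedContainerId>container_1326381300833_0002_01_000002</assignedContainerId>
  </taskAttempt>
</taskAttempts>"""


def parse_attempts(xml_text):
    """Return one dict per <taskAttempt> element, with numeric fields converted."""
    root = ET.fromstring(xml_text)
    attempts = []
    for node in root.findall("taskAttempt"):
        attempts.append({
            "id": node.findtext("id"),
            "state": node.findtext("state"),
            "type": node.findtext("type"),
            "diagnostics": node.findtext("diagnostics"),  # empty element gives ""
            "startTime": int(node.findtext("startTime")),
            "finishTime": int(node.findtext("finishTime")),
            "elapsedTime": int(node.findtext("elapsedTime")),
            "progress": float(node.findtext("progress")),
        })
    return attempts


attempts = parse_attempts(SAMPLE_XML)
for attempt in attempts:
    print(attempt["id"], attempt["state"], attempt["elapsedTime"])
```

Because XML returns zero or more `taskAttempt` elements rather than an array, iterating with `findall` handles the empty case without special treatment.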
-
-** {Task Attempt API}
-
-  A Task Attempt resource contains information about a particular task attempt within a job. 
-
-*** URI
-
-  Use the following URI to obtain a Task Attempt Object, from a task attempt identified by the {attemptid} value. 
-
-------
-  * http://<history server http address:port>/ws/v1/history/mapreduce/jobs/{jobid}/tasks/{taskid}/attempts/{attemptid}
-------
-
-*** HTTP Operations Supported 
-
-------
-  * GET
-------
-
-*** Query Parameters Supported
-
-------
-  None
-------
-
-*** Elements of the <taskAttempt> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                  |
-*---------------+--------------+-------------------------------+
-| id | string  | The task attempt id | 
-*---------------+--------------+--------------------------------+
-| rack | string  | The rack | 
-*---------------+--------------+--------------------------------+
-| state | string | The state of the task attempt - valid values are: NEW, UNASSIGNED, ASSIGNED, RUNNING, COMMIT_PENDING, SUCCESS_CONTAINER_CLEANUP, SUCCEEDED, FAIL_CONTAINER_CLEANUP, FAIL_TASK_CLEANUP, FAILED, KILL_CONTAINER_CLEANUP, KILL_TASK_CLEANUP, KILLED |
-*---------------+--------------+--------------------------------+
-| type | string | The type of task |
-*---------------+--------------+--------------------------------+
-| assignedContainerId | string | The container id this attempt is assigned to|
-*---------------+--------------+--------------------------------+
-| nodeHttpAddress | string | The http address of the node this task attempt ran on |
-*---------------+--------------+--------------------------------+
-| diagnostics| string | A diagnostics message |
-*---------------+--------------+--------------------------------+
-| progress | float | The progress of the task attempt as a percent|
-*---------------+--------------+--------------------------------+
-| startTime | long | The time at which the task attempt started (in ms since epoch)|
-*---------------+--------------+--------------------------------+
-| finishTime | long | The time at which the task attempt finished (in ms since epoch)|
-*---------------+--------------+--------------------------------+
-| elapsedTime | long | The elapsed time since the task attempt started (in ms)|
-*---------------+--------------+--------------------------------+
-
-  For reduce task attempts you also have the following fields:
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                  |
-*---------------+--------------+-------------------------------+
-| shuffleFinishTime | long | The time at which shuffle finished (in ms since epoch)| 
-*---------------+--------------+--------------------------------+
-| mergeFinishTime | long | The time at which merge finished (in ms since epoch)| 
-*---------------+--------------+--------------------------------+
-| elapsedShuffleTime | long | The time it took for the shuffle phase to complete (time in ms between reduce task start and shuffle finish)| 
-*---------------+--------------+--------------------------------+
-| elapsedMergeTime | long | The time it took for the merge phase to complete (time in ms between the shuffle finish and merge finish)| 
-*---------------+--------------+--------------------------------+
-| elapsedReduceTime | long | The time it took for the reduce phase to complete (time in ms between merge finish to end of reduce task)| 
-*---------------+--------------+--------------------------------+
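
The three elapsed fields in the table above are differences between consecutive timestamps. The sketch below works through that arithmetic in Python, assuming that "reduce task start" corresponds to the attempt's `startTime`; the timestamps are made up for illustration and `reduce_phase_times` is a hypothetical helper name.

```python
def reduce_phase_times(start_time, shuffle_finish_time, merge_finish_time, finish_time):
    """Derive the per-phase durations (ms) of a reduce attempt from its timestamps."""
    return {
        # reduce task start -> shuffle finish
        "elapsedShuffleTime": shuffle_finish_time - start_time,
        # shuffle finish -> merge finish
        "elapsedMergeTime": merge_finish_time - shuffle_finish_time,
        # merge finish -> end of reduce task
        "elapsedReduceTime": finish_time - merge_finish_time,
    }


# Hypothetical timestamps in ms since epoch, not taken from a real response.
phases = reduce_phase_times(
    start_time=1326381450000,
    shuffle_finish_time=1326381452000,
    merge_finish_time=1326381452500,
    finish_time=1326381455000,
)
print(phases)
```

Under these definitions the three durations add up to `finishTime - startTime`, that is, the attempt's total elapsed time.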
-
-
-*** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks/task_1326381300833_2_2_m_0/attempts/attempt_1326381300833_2_2_m_0_0 
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Transfer-Encoding: chunked
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-{
-   "taskAttempt" : {
-      "assignedContainerId" : "container_1326381300833_0002_01_000002",
-      "progress" : 100,
-      "elapsedTime" : 2638,
-      "state" : "SUCCEEDED",
-      "diagnostics" : "",
-      "rack" : "/98.139.92.0",
-      "nodeHttpAddress" : "host.domain.com:8042",
-      "startTime" : 1326381450680,
-      "id" : "attempt_1326381300833_2_2_m_0_0",
-      "type" : "MAP",
-      "finishTime" : 1326381453318
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks/task_1326381300833_2_2_m_0/attempts/attempt_1326381300833_2_2_m_0_0 
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 691
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<taskAttempt>
-  <startTime>1326381450680</startTime>
-  <finishTime>1326381453318</finishTime>
-  <elapsedTime>2638</elapsedTime>
-  <progress>100.0</progress>
-  <id>attempt_1326381300833_2_2_m_0_0</id>
-  <rack>/98.139.92.0</rack>
-  <state>SUCCEEDED</state>
-  <nodeHttpAddress>host.domain.com:8042</nodeHttpAddress>
-  <diagnostics/>
-  <type>MAP</type>
-  <assignedContainerId>container_1326381300833_0002_01_000002</assignedContainerId>
-</taskAttempt>
-+---+
-
-** Task Attempt Counters API
-
-  With the task attempt counters API, you can obtain a collection of resources that represent all the counters for that task attempt. 
-
-*** URI 
-
-------
-  * http://<history server http address:port>/ws/v1/history/mapreduce/jobs/{jobid}/tasks/{taskid}/attempts/{attemptid}/counters
-------
-
-*** HTTP Operations Supported 
-
-------
-  * GET
-------
-
-*** Query Parameters Supported
-
-------
-  None
-------
-
-*** Elements of the <jobTaskAttemptCounters> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| id | string | The task attempt id |
-*---------------+--------------+-------------------------------+
-| taskAttemptCounterGroup | array of task attempt counterGroup objects(JSON)/zero or more task attempt counterGroup objects(XML) | A collection of task attempt counter group objects |
-*---------------+--------------+-------------------------------+
-
-*** Elements of the <taskAttemptCounterGroup> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| counterGroupName | string | The name of the counter group |
-*---------------+--------------+-------------------------------+
-| counter | array of counter objects(JSON)/zero or more counter objects(XML) | A collection of counter objects |
-*---------------+--------------+-------------------------------+
-
-*** Elements of the <counter> object
-
-*---------------+--------------+-------------------------------+
-|| Item         || Data Type   || Description                   |
-*---------------+--------------+-------------------------------+
-| name | string | The name of the counter |
-*---------------+--------------+-------------------------------+
-| value | long | The value of the counter |
-*---------------+--------------+-------------------------------+
-
-*** Response Examples
-
-  <<JSON response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks/task_1326381300833_2_2_m_0/attempts/attempt_1326381300833_2_2_m_0_0/counters
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/json
-  Transfer-Encoding: chunked
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-{
-   "jobTaskAttemptCounters" : {
-      "taskAttemptCounterGroup" : [
-         {
-            "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
-            "counter" : [
-               {
-                  "value" : 2363,
-                  "name" : "FILE_BYTES_READ"
-               },
-               {
-                  "value" : 54372,
-                  "name" : "FILE_BYTES_WRITTEN"
-               },
-               {
-                  "value" : 0,
-                  "name" : "FILE_READ_OPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "FILE_LARGE_READ_OPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "FILE_WRITE_OPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "HDFS_BYTES_READ"
-               },
-               {
-                  "value" : 0,
-                  "name" : "HDFS_BYTES_WRITTEN"
-               },
-               {
-                  "value" : 0,
-                  "name" : "HDFS_READ_OPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "HDFS_LARGE_READ_OPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "HDFS_WRITE_OPS"
-               }
-            ]
-         },
-         {
-            "counterGroupName" : "org.apache.hadoop.mapreduce.TaskCounter",
-            "counter" : [
-               {
-                  "value" : 0,
-                  "name" : "COMBINE_INPUT_RECORDS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "COMBINE_OUTPUT_RECORDS"
-               },
-               {
-                  "value" : 460,
-                  "name" : "REDUCE_INPUT_GROUPS"
-               },
-               {
-                  "value" : 2235,
-                  "name" : "REDUCE_SHUFFLE_BYTES"
-               },
-               {
-                  "value" : 460,
-                  "name" : "REDUCE_INPUT_RECORDS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "REDUCE_OUTPUT_RECORDS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "SPILLED_RECORDS"
-               },
-               {
-                  "value" : 1,
-                  "name" : "SHUFFLED_MAPS"
-               },
-               {
-                  "value" : 0,
-                  "name" : "FAILED_SHUFFLE"
-               },
-               {
-                  "value" : 1,
-                  "name" : "MERGED_MAP_OUTPUTS"
-               },
-               {
-                  "value" : 26,
-                  "name" : "GC_TIME_MILLIS"
-               },
-               {
-                  "value" : 860,
-                  "name" : "CPU_MILLISECONDS"
-               },
-               {
-                  "value" : 107839488,
-                  "name" : "PHYSICAL_MEMORY_BYTES"
-               },
-               {
-                  "value" : 1123147776,
-                  "name" : "VIRTUAL_MEMORY_BYTES"
-               },
-               {
-                  "value" : 57475072,
-                  "name" : "COMMITTED_HEAP_BYTES"
-               }
-            ]
-         },
-         {
-            "counterGroupName" : "Shuffle Errors",
-            "counter" : [
-               {
-                  "value" : 0,
-                  "name" : "BAD_ID"
-               },
-               {
-                  "value" : 0,
-                  "name" : "CONNECTION"
-               },
-               {
-                  "value" : 0,
-                  "name" : "IO_ERROR"
-               },
-               {
-                  "value" : 0,
-                  "name" : "WRONG_LENGTH"
-               },
-               {
-                  "value" : 0,
-                  "name" : "WRONG_MAP"
-               },
-               {
-                  "value" : 0,
-                  "name" : "WRONG_REDUCE"
-               }
-            ]
-         },
-         {
-            "counterGroupName" : "org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter",
-            "counter" : [
-               {
-                  "value" : 0,
-                  "name" : "BYTES_WRITTEN"
-               }
-            ]
-         }
-      ],
-      "id" : "attempt_1326381300833_2_2_m_0_0"
-   }
-}
-+---+
-
-  <<XML response>>
-
-  HTTP Request:
-
-------
-  GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks/task_1326381300833_2_2_m_0/attempts/attempt_1326381300833_2_2_m_0_0/counters
-  Accept: application/xml
-------
-
-  Response Header:
-
-+---+
-  HTTP/1.1 200 OK
-  Content-Type: application/xml
-  Content-Length: 2735
-  Server: Jetty(6.1.26)
-+---+
-
-  Response Body:
-
-+---+
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<jobTaskAttemptCounters>
-  <id>attempt_1326381300833_2_2_m_0_0</id>
-  <taskAttemptCounterGroup>
-    <counterGroupName>org.apache.hadoop.mapreduce.FileSystemCounter</counterGroupName>
-    <counter>
-      <name>FILE_BYTES_READ</name>
-      <value>2363</value>
-    </counter>
-    <counter>
-      <name>FILE_BYTES_WRITTEN</name>
-      <value>54372</value>
-    </counter>
-    <counter>
-      <name>FILE_READ_OPS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>FILE_LARGE_READ_OPS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>FILE_WRITE_OPS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>HDFS_BYTES_READ</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>HDFS_BYTES_WRITTEN</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>HDFS_READ_OPS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>HDFS_LARGE_READ_OPS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>HDFS_WRITE_OPS</name>
-      <value>0</value>
-    </counter>
-  </taskAttemptCounterGroup>
-  <taskAttemptCounterGroup>
-    <counterGroupName>org.apache.hadoop.mapreduce.TaskCounter</counterGroupName>
-    <counter>
-      <name>COMBINE_INPUT_RECORDS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>COMBINE_OUTPUT_RECORDS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>REDUCE_INPUT_GROUPS</name>
-      <value>460</value>
-    </counter>
-    <counter>
-      <name>REDUCE_SHUFFLE_BYTES</name>
-      <value>2235</value>
-    </counter>
-    <counter>
-      <name>REDUCE_INPUT_RECORDS</name>
-      <value>460</value>
-    </counter>
-    <counter>
-      <name>REDUCE_OUTPUT_RECORDS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>SPILLED_RECORDS</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>SHUFFLED_MAPS</name>
-      <value>1</value>
-    </counter>
-    <counter>
-      <name>FAILED_SHUFFLE</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>MERGED_MAP_OUTPUTS</name>
-      <value>1</value>
-    </counter>
-    <counter>
-      <name>GC_TIME_MILLIS</name>
-      <value>26</value>
-    </counter>
-    <counter>
-      <name>CPU_MILLISECONDS</name>
-      <value>860</value>
-    </counter>
-    <counter>
-      <name>PHYSICAL_MEMORY_BYTES</name>
-      <value>107839488</value>
-    </counter>
-    <counter>
-      <name>VIRTUAL_MEMORY_BYTES</name>
-      <value>1123147776</value>
-    </counter>
-    <counter>
-      <name>COMMITTED_HEAP_BYTES</name>
-      <value>57475072</value>
-    </counter>
-  </taskAttemptCounterGroup>
-  <taskAttemptCounterGroup>
-    <counterGroupName>Shuffle Errors</counterGroupName>
-    <counter>
-      <name>BAD_ID</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>CONNECTION</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>IO_ERROR</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>WRONG_LENGTH</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>WRONG_MAP</name>
-      <value>0</value>
-    </counter>
-    <counter>
-      <name>WRONG_REDUCE</name>
-      <value>0</value>
-    </counter>
-  </taskAttemptCounterGroup>
-  <taskAttemptCounterGroup>
-    <counterGroupName>org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter</counterGroupName>
-    <counter>
-      <name>BYTES_WRITTEN</name>
-      <value>0</value>
-    </counter>
-  </taskAttemptCounterGroup>
-</jobTaskAttemptCounters>
-+---+

+ 2361 - 0
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/site/markdown/HistoryServerRest.md

@@ -0,0 +1,2361 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+MapReduce History Server REST API's.
+====================================
+
+* [MapReduce History Server REST API's.](#MapReduce_History_Server_REST_APIs.)
+    * [Overview](#Overview)
+    * [History Server Information API](#History_Server_Information_API)
+        * [URI](#URI)
+        * [HTTP Operations Supported](#HTTP_Operations_Supported)
+        * [Query Parameters Supported](#Query_Parameters_Supported)
+        * [Elements of the historyInfo object](#Elements_of_the_historyInfo_object)
+        * [Response Examples](#Response_Examples)
+    * [MapReduce API's](#MapReduce_APIs)
+        * [Jobs API](#Jobs_API)
+        * [Job API](#Job_API)
+        * [Elements of the acls object](#Elements_of_the_acls_object)
+        * [Job Attempts API](#Job_Attempts_API)
+        * [Job Counters API](#Job_Counters_API)
+        * [Job Conf API](#Job_Conf_API)
+        * [Tasks API](#Tasks_API)
+        * [Task API](#Task_API)
+        * [Task Counters API](#Task_Counters_API)
+        * [Task Attempts API](#Task_Attempts_API)
+        * [Task Attempt API](#Task_Attempt_API)
+        * [Task Attempt Counters API](#Task_Attempt_Counters_API)
+
+Overview
+--------
+
+The history server REST APIs allow the user to get the status of finished applications.
+
+History Server Information API
+------------------------------
+
+The history server information resource provides overall information about the history server.
+
+### URI
+
+Both of the following URIs return the history server information.
+
+      * http://<history server http address:port>/ws/v1/history
+      * http://<history server http address:port>/ws/v1/history/info
+
+### HTTP Operations Supported
+
+      * GET
+
+### Query Parameters Supported
+
+      None
+
+### Elements of the *historyInfo* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| startedOn | long | The time the history server was started (in ms since epoch) |
+| hadoopVersion | string | Version of hadoop common |
+| hadoopBuildVersion | string | Hadoop common build string with build version, user, and checksum |
+| hadoopVersionBuiltOn | string | Timestamp when hadoop common was built |
+
+### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/info
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    {   
+       "historyInfo" : {
+          "startedOn":1353512830963,
+          "hadoopVersionBuiltOn" : "Wed Jan 11 21:18:36 UTC 2012",
+          "hadoopBuildVersion" : "0.23.1-SNAPSHOT from 1230253 by user1 source checksum bb6e554c6d50b0397d826081017437a7",
+          "hadoopVersion" : "0.23.1-SNAPSHOT"
+       }
+    }
+
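A client usually needs only a few fields from this response. The sketch below, in Python with only the standard library, parses a body shaped like the JSON above and converts `startedOn` from milliseconds since the epoch to a UTC timestamp. The function name `parse_history_info` and the trimmed sample body are illustrative assumptions, not part of the API.

```python
import json
from datetime import datetime, timedelta, timezone

# Trimmed copy of the JSON response body shown above.
SAMPLE_BODY = """
{
  "historyInfo": {
    "startedOn": 1353512830963,
    "hadoopVersion": "0.23.1-SNAPSHOT"
  }
}
"""

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_history_info(body):
    """Return (start time as an aware UTC datetime, hadoopVersion string)."""
    info = json.loads(body)["historyInfo"]
    # startedOn is documented as milliseconds since the epoch.
    started = EPOCH + timedelta(milliseconds=info["startedOn"])
    return started, info["hadoopVersion"]


if __name__ == "__main__":
    started, version = parse_history_info(SAMPLE_BODY)
    print(started.isoformat(), version)
```

Using integer `timedelta` arithmetic rather than dividing by 1000 keeps the millisecond part exact.
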
+**XML response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/info
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 330
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <historyInfo>
+      <startedOn>1353512830963</startedOn>
+      <hadoopVersion>0.23.1-SNAPSHOT</hadoopVersion>
+      <hadoopBuildVersion>0.23.1-SNAPSHOT from 1230253 by user1 source checksum bb6e554c6d50b0397d826081017437a7</hadoopBuildVersion>
+      <hadoopVersionBuiltOn>Wed Jan 11 21:18:36 UTC 2012</hadoopVersionBuiltOn>
+    </historyInfo>
+
+MapReduce API's
+---------------
+
+The following resources apply to MapReduce.
+
+### Jobs API
+
+The jobs resource provides a list of the MapReduce jobs that have finished. It does not currently return a full list of parameters.
+
+#### URI
+
+      *  http://<history server http address:port>/ws/v1/history/mapreduce/jobs
+
+#### HTTP Operations Supported
+
+      * GET
+
+#### Query Parameters Supported
+
+Multiple parameters can be specified. The started and finished times each have a begin and an end parameter so that you can specify ranges. For example, one could request all jobs that started between 1:00am and 2:00pm on 12/19/2011 (UTC) with startedTimeBegin=1324256400000&startedTimeEnd=1324303200000. If the Begin parameter is not specified, it defaults to 0, and if the End parameter is not specified, it defaults to infinity.
+
+      * user - user name
+      * state - the job state
+      * queue - queue name
+      * limit - total number of job objects to be returned
+      * startedTimeBegin - jobs with start time beginning with this time, specified in ms since epoch
+      * startedTimeEnd - jobs with start time ending with this time, specified in ms since epoch
+      * finishedTimeBegin - jobs with finish time beginning with this time, specified in ms since epoch
+      * finishedTimeEnd - jobs with finish time ending with this time, specified in ms since epoch
+
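Because the time-range parameters are given in milliseconds since the epoch, it helps to build the query string programmatically. The following is a minimal Python sketch under stated assumptions: the helper names `to_epoch_ms` and `jobs_url` are hypothetical, and `jhs.example.com:19888` is a placeholder for the history server HTTP address.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(moment):
    """Convert an aware datetime to integer milliseconds since the epoch."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


def jobs_url(base, **params):
    """Append the given query parameters to the jobs resource path."""
    path = base.rstrip("/") + "/ws/v1/history/mapreduce/jobs"
    if not params:
        return path
    return path + "?" + urlencode(params)


if __name__ == "__main__":
    begin = datetime(2011, 12, 19, 1, 0, tzinfo=timezone.utc)
    end = datetime(2011, 12, 19, 14, 0, tzinfo=timezone.utc)
    # jhs.example.com:19888 is a placeholder address, not a real server.
    print(jobs_url("http://jhs.example.com:19888",
                   user="user1",
                   startedTimeBegin=to_epoch_ms(begin),
                   startedTimeEnd=to_epoch_ms(end)))
```

Omitting a Begin or End parameter simply leaves it out of the query string, so the server-side defaults described above apply.
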
+#### Elements of the *jobs* object
+
+When you make a request for the list of jobs, the information will be returned as an array of job objects. See also
+[Job API](#Job_API)
+for the syntax of the job object. Note that each entry is only a subset of a full job: only submitTime, startTime, finishTime, id, name, queue, user, state, mapsTotal, mapsCompleted, reducesTotal, and reducesCompleted are returned.
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| job | array of job objects(JSON)/zero or more job objects(XML) | The collection of job objects |
+
+#### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    {
+       "jobs" : {
+          "job" : [
+             {
+                "submitTime" : 1326381344449,
+                "state" : "SUCCEEDED",
+                "user" : "user1",
+                "reducesTotal" : 1,
+                "mapsCompleted" : 1,
+                "startTime" : 1326381344489,
+                "id" : "job_1326381300833_1_1",
+                "name" : "word count",
+                "reducesCompleted" : 1,
+                "mapsTotal" : 1,
+                "queue" : "default",
+                "finishTime" : 1326381356010
+             },
+             {
+                "submitTime" : 1326381446500,
+                "state" : "SUCCEEDED",
+                "user" : "user1",
+                "reducesTotal" : 1,
+                "mapsCompleted" : 1,
+                "startTime" : 1326381446529,
+                "id" : "job_1326381300833_2_2",
+                "name" : "Sleep job",
+                "reducesCompleted" : 1,
+                "mapsTotal" : 1,
+                "queue" : "default",
+                "finishTime" : 1326381582106
+             }
+          ]
+       }
+    }
+
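As an illustration of consuming this list, the Python sketch below computes how long each job ran from its `startTime` and `finishTime`. The sample body is a trimmed copy of the response above, and `job_durations_ms` is a hypothetical helper name. Treating a missing or null `jobs` element as an empty list is an assumption about the empty-result case, not documented behavior.

```python
import json

# Trimmed copy of the JSON response body shown above.
SAMPLE_BODY = """
{"jobs": {"job": [
  {"id": "job_1326381300833_1_1", "name": "word count", "state": "SUCCEEDED",
   "startTime": 1326381344489, "finishTime": 1326381356010},
  {"id": "job_1326381300833_2_2", "name": "Sleep job", "state": "SUCCEEDED",
   "startTime": 1326381446529, "finishTime": 1326381582106}
]}}
"""


def job_durations_ms(body):
    """Map each job id to finishTime - startTime, in milliseconds."""
    doc = json.loads(body)
    # Assumption: an empty result may omit "jobs" or set it to null.
    jobs = (doc.get("jobs") or {}).get("job", [])
    return {job["id"]: job["finishTime"] - job["startTime"] for job in jobs}


if __name__ == "__main__":
    for job_id, elapsed in job_durations_ms(SAMPLE_BODY).items():
        print(job_id, elapsed, "ms")
```
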
+**XML response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 1922
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <jobs>
+      <job>
+        <submitTime>1326381344449</submitTime>
+        <startTime>1326381344489</startTime>
+        <finishTime>1326381356010</finishTime>
+        <id>job_1326381300833_1_1</id>
+        <name>word count</name>
+        <queue>default</queue>
+        <user>user1</user>
+        <state>SUCCEEDED</state>
+        <mapsTotal>1</mapsTotal>
+        <mapsCompleted>1</mapsCompleted>
+        <reducesTotal>1</reducesTotal>
+        <reducesCompleted>1</reducesCompleted>
+      </job>
+      <job>
+        <submitTime>1326381446500</submitTime>
+        <startTime>1326381446529</startTime>
+        <finishTime>1326381582106</finishTime>
+        <id>job_1326381300833_2_2</id>
+        <name>Sleep job</name>
+        <queue>default</queue>
+        <user>user1</user>
+        <state>SUCCEEDED</state>
+        <mapsTotal>1</mapsTotal>
+        <mapsCompleted>1</mapsCompleted>
+        <reducesTotal>1</reducesTotal>
+        <reducesCompleted>1</reducesCompleted>
+      </job>
+    </jobs>
+
+### Job API
+
+A Job resource contains information about a particular job identified by jobid.
+
+#### URI
+
+      * http://<history server http address:port>/ws/v1/history/mapreduce/jobs/{jobid}
+
+#### HTTP Operations Supported
+
+      * GET
+
+#### Query Parameters Supported
+
+      None
+
+#### Elements of the *job* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| id | string | The job id |
+| name | string | The job name |
+| queue | string | The queue the job was submitted to |
+| user | string | The user name |
+| state | string | The job state - valid values are: NEW, INITED, RUNNING, SUCCEEDED, FAILED, KILL\_WAIT, KILLED, ERROR |
+| diagnostics | string | A diagnostic message |
+| submitTime | long | The time the job was submitted (in ms since epoch) |
+| startTime | long | The time the job started (in ms since epoch) |
+| finishTime | long | The time the job finished (in ms since epoch) |
+| mapsTotal | int | The total number of maps |
+| mapsCompleted | int | The number of completed maps |
+| reducesTotal | int | The total number of reduces |
+| reducesCompleted | int | The number of completed reduces |
+| uberized | boolean | Indicates if the job was an uber job - ran completely in the application master |
+| avgMapTime | long | The average time of a map task (in ms) |
+| avgReduceTime | long | The average time of the reduce (in ms) |
+| avgShuffleTime | long | The average time of the shuffle (in ms) |
+| avgMergeTime | long | The average time of the merge (in ms) |
+| failedReduceAttempts | int | The number of failed reduce attempts |
+| killedReduceAttempts | int | The number of killed reduce attempts |
+| successfulReduceAttempts | int | The number of successful reduce attempts |
+| failedMapAttempts | int | The number of failed map attempts |
+| killedMapAttempts | int | The number of killed map attempts |
+| successfulMapAttempts | int | The number of successful map attempts |
+| acls | array of acls(JSON)/zero or more acls objects(XML) | A collection of acls objects |
+
+### Elements of the *acls* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| value | string | The acl value |
+| name | string | The acl name |
+
+#### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Server: Jetty(6.1.26)
+      Content-Length: 720
+
+Response Body:
+
+    {
+       "job" : {
+          "submitTime":  1326381446500,
+          "avgReduceTime" : 124961,
+          "failedReduceAttempts" : 0,
+          "state" : "SUCCEEDED",
+          "successfulReduceAttempts" : 1,
+          "acls" : [
+             {
+                "value" : " ",
+                "name" : "mapreduce.job.acl-modify-job"
+             },
+             {
+                "value" : " ",
+                "name" : "mapreduce.job.acl-view-job"
+             }
+          ],
+          "user" : "user1",
+          "reducesTotal" : 1,
+          "mapsCompleted" : 1,
+          "startTime" : 1326381446529,
+          "id" : "job_1326381300833_2_2",
+          "avgMapTime" : 2638,
+          "successfulMapAttempts" : 1,
+          "name" : "Sleep job",
+          "avgShuffleTime" : 2540,
+          "reducesCompleted" : 1,
+          "diagnostics" : "",
+          "failedMapAttempts" : 0,
+          "avgMergeTime" : 2589,
+          "killedReduceAttempts" : 0,
+          "mapsTotal" : 1,
+          "queue" : "default",
+          "uberized" : false,
+          "killedMapAttempts" : 0,
+          "finishTime" : 1326381582106
+       }
+    }
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 983
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <job>
+      <submitTime>1326381446500</submitTime>
+      <startTime>1326381446529</startTime>
+      <finishTime>1326381582106</finishTime>
+      <id>job_1326381300833_2_2</id>
+      <name>Sleep job</name>
+      <queue>default</queue>
+      <user>user1</user>
+      <state>SUCCEEDED</state>
+      <mapsTotal>1</mapsTotal>
+      <mapsCompleted>1</mapsCompleted>
+      <reducesTotal>1</reducesTotal>
+      <reducesCompleted>1</reducesCompleted>
+      <uberized>false</uberized>
+      <diagnostics/>
+      <avgMapTime>2638</avgMapTime>
+      <avgReduceTime>124961</avgReduceTime>
+      <avgShuffleTime>2540</avgShuffleTime>
+      <avgMergeTime>2589</avgMergeTime>
+      <failedReduceAttempts>0</failedReduceAttempts>
+      <killedReduceAttempts>0</killedReduceAttempts>
+      <successfulReduceAttempts>1</successfulReduceAttempts>
+      <failedMapAttempts>0</failedMapAttempts>
+      <killedMapAttempts>0</killedMapAttempts>
+      <successfulMapAttempts>1</successfulMapAttempts>
+      <acls>
+        <name>mapreduce.job.acl-modify-job</name>
+        <value> </value>
+      </acls>
+      <acls>
+        <name>mapreduce.job.acl-view-job</name>
+        <value> </value>
+      </acls>
+    </job>
+
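In the XML form, the acls collection appears as repeated `<acls>` elements rather than a single array. A minimal Python sketch of collecting them into a dictionary follows. It uses only the standard library `xml.etree.ElementTree` module; `job_acls` is a hypothetical helper name and the sample body is a trimmed copy of the response above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML response body shown above.
SAMPLE_BODY = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<job>
  <id>job_1326381300833_2_2</id>
  <state>SUCCEEDED</state>
  <acls>
    <name>mapreduce.job.acl-modify-job</name>
    <value> </value>
  </acls>
  <acls>
    <name>mapreduce.job.acl-view-job</name>
    <value> </value>
  </acls>
</job>
"""


def job_acls(body):
    """Return a dict of acl name -> acl value for one job document."""
    # Parse from bytes so the encoding declaration in the prolog is honored.
    root = ET.fromstring(body.encode("utf-8"))
    # Zero or more <acls> children; findall returns an empty list for none.
    return {acl.findtext("name"): acl.findtext("value")
            for acl in root.findall("acls")}


if __name__ == "__main__":
    print(job_acls(SAMPLE_BODY))
```
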
+### Job Attempts API
+
+With the job attempts API, you can obtain a collection of resources that represent a job attempt. When you run a GET operation on this resource, you obtain a collection of Job Attempt Objects.
+
+#### URI
+
+      * http://<history server http address:port>/ws/v1/history/mapreduce/jobs/{jobid}/jobattempts
+
+#### HTTP Operations Supported
+
+      * GET
+
+#### Query Parameters Supported
+
+      None
+
+#### Elements of the *jobAttempts* object
+
+When you make a request for the list of job attempts, the information will be returned as an array of job attempt objects.
+
+jobAttempts:
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| jobAttempt | array of job attempt objects(JSON)/zero or more job attempt objects(XML) | The collection of job attempt objects |
+
+#### Elements of the *jobAttempt* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| id | string | The job attempt id |
+| nodeId | string | The node id of the node the attempt ran on |
+| nodeHttpAddress | string | The node http address of the node the attempt ran on |
+| logsLink | string | The http link to the job attempt logs |
+| containerId | string | The id of the container for the job attempt |
+| startTime | long | The start time of the attempt (in ms since epoch) |
+
+#### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/jobattempts
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    {
+       "jobAttempts" : {
+          "jobAttempt" : [
+             {
+                "nodeId" : "host.domain.com:8041",
+                "nodeHttpAddress" : "host.domain.com:8042",
+                "startTime" : 1326381444693,
+                "id" : 1,
+                "logsLink" : "http://host.domain.com:19888/jobhistory/logs/host.domain.com:8041/container_1326381300833_0002_01_000001/job_1326381300833_2_2/user1",
+                "containerId" : "container_1326381300833_0002_01_000001"
+             }
+          ]
+       }
+    }
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/jobattempts
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 575
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <jobAttempts>
+      <jobAttempt>
+        <nodeHttpAddress>host.domain.com:8042</nodeHttpAddress>
+        <nodeId>host.domain.com:8041</nodeId>
+        <id>1</id>
+        <startTime>1326381444693</startTime>
+        <containerId>container_1326381300833_0002_01_000001</containerId>
+        <logsLink>http://host.domain.com:19888/jobhistory/logs/host.domain.com:8041/container_1326381300833_0002_01_000001/job_1326381300833_2_2/user1</logsLink>
+      </jobAttempt>
+    </jobAttempts>
+
+### Job Counters API
+
+With the job counters API, you can obtain a collection of resources that represent all the counters for that job.
+
+#### URI
+
+      * http://<history server http address:port>/ws/v1/history/mapreduce/jobs/{jobid}/counters
+
+#### HTTP Operations Supported
+
+      * GET
+
+#### Query Parameters Supported
+
+      None
+
+#### Elements of the *jobCounters* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| id | string | The job id |
+| counterGroup | array of counterGroup objects(JSON)/zero or more counterGroup objects(XML) | A collection of counter group objects |
+
+#### Elements of the *counterGroup* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| counterGroupName | string | The name of the counter group |
+| counter | array of counter objects(JSON)/zero or more counter objects(XML) | A collection of counter objects |
+
+#### Elements of the *counter* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| name | string | The name of the counter |
+| reduceCounterValue | long | The counter value of reduce tasks |
+| mapCounterValue | long | The counter value of map tasks |
+| totalCounterValue | long | The counter value of all tasks |
+
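The three tables above describe a nested structure: a jobCounters object holds counter groups, and each group holds counters. The Python sketch below flattens that structure into a dictionary keyed by (group name, counter name). The sample body is a trimmed, hand-shortened document in the JSON shape of this resource, and `total_counters` is a hypothetical helper name.

```python
import json

# Shortened sample in the shape of a jobCounters JSON response.
SAMPLE_BODY = """
{"jobCounters": {
  "id": "job_1326381300833_2_2",
  "counterGroup": [
    {"counterGroupName": "org.apache.hadoop.mapreduce.TaskCounter",
     "counter": [
       {"name": "MAP_OUTPUT_RECORDS", "totalCounterValue": 1200,
        "mapCounterValue": 0, "reduceCounterValue": 0},
       {"name": "SPILLED_RECORDS", "totalCounterValue": 2400,
        "mapCounterValue": 0, "reduceCounterValue": 0}
     ]}
  ]
}}
"""


def total_counters(body):
    """Return {(counterGroupName, counter name): totalCounterValue}."""
    groups = json.loads(body)["jobCounters"].get("counterGroup", [])
    totals = {}
    for group in groups:
        for counter in group.get("counter", []):
            key = (group["counterGroupName"], counter["name"])
            totals[key] = counter["totalCounterValue"]
    return totals


if __name__ == "__main__":
    for (group, name), value in sorted(total_counters(SAMPLE_BODY).items()):
        print(group, name, value)
```

Keying on the pair rather than the counter name alone avoids collisions if two groups define a counter with the same name.
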
+#### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/counters
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    {
+       "jobCounters" : {
+          "id" : "job_1326381300833_2_2",
+          "counterGroup" : [
+             {
+                "counterGroupName" : "Shuffle Errors",
+                "counter" : [
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "BAD_ID"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "CONNECTION"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "IO_ERROR"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "WRONG_LENGTH"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "WRONG_MAP"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "WRONG_REDUCE"
+                   }
+                ]
+              },
+             {
+                "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
+                "counter" : [
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 2483,
+                      "name" : "FILE_BYTES_READ"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 108525,
+                      "name" : "FILE_BYTES_WRITTEN"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "FILE_READ_OPS"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "FILE_LARGE_READ_OPS"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "FILE_WRITE_OPS"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 48,
+                      "name" : "HDFS_BYTES_READ"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "HDFS_BYTES_WRITTEN"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 1,
+                      "name" : "HDFS_READ_OPS"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "HDFS_LARGE_READ_OPS"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "HDFS_WRITE_OPS"
+                   }
+                ]
+             },
+             {
+                "counterGroupName" : "org.apache.hadoop.mapreduce.TaskCounter",
+                "counter" : [
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 1,
+                      "name" : "MAP_INPUT_RECORDS"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 1200,
+                      "name" : "MAP_OUTPUT_RECORDS"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 4800,
+                      "name" : "MAP_OUTPUT_BYTES"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 2235,
+                      "name" : "MAP_OUTPUT_MATERIALIZED_BYTES"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 48,
+                      "name" : "SPLIT_RAW_BYTES"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "COMBINE_INPUT_RECORDS"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "COMBINE_OUTPUT_RECORDS"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 1200,
+                      "name" : "REDUCE_INPUT_GROUPS"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 2235,
+                      "name" : "REDUCE_SHUFFLE_BYTES"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 1200,
+                      "name" : "REDUCE_INPUT_RECORDS"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "REDUCE_OUTPUT_RECORDS"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 2400,
+                      "name" : "SPILLED_RECORDS"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 1,
+                      "name" : "SHUFFLED_MAPS"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "FAILED_SHUFFLE"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 1,
+                      "name" : "MERGED_MAP_OUTPUTS"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 113,
+                      "name" : "GC_TIME_MILLIS"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 1830,
+                      "name" : "CPU_MILLISECONDS"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 478068736,
+                      "name" : "PHYSICAL_MEMORY_BYTES"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 2159284224,
+                      "name" : "VIRTUAL_MEMORY_BYTES"
+                   },
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 378863616,
+                      "name" : "COMMITTED_HEAP_BYTES"
+                   }
+                ]
+             },
+             {
+                "counterGroupName" : "org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter",
+                "counter" : [
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "BYTES_READ"
+                   }
+                ]
+             },
+             {
+                "counterGroupName" : "org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter",
+                "counter" : [
+                   {
+                      "reduceCounterValue" : 0,
+                      "mapCounterValue" : 0,
+                      "totalCounterValue" : 0,
+                      "name" : "BYTES_WRITTEN"
+                   }
+                ]
+             }
+          ]
+       }
+    }
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/counters
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 7030
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <jobCounters>
+      <id>job_1326381300833_2_2</id>
+      <counterGroup>
+        <counterGroupName>Shuffle Errors</counterGroupName>
+        <counter>
+          <name>BAD_ID</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>CONNECTION</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>IO_ERROR</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>WRONG_LENGTH</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>WRONG_MAP</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>WRONG_REDUCE</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+      </counterGroup>
+      <counterGroup>
+        <counterGroupName>org.apache.hadoop.mapreduce.FileSystemCounter</counterGroupName>
+        <counter>
+          <name>FILE_BYTES_READ</name>
+          <totalCounterValue>2483</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>FILE_BYTES_WRITTEN</name>
+          <totalCounterValue>108525</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>FILE_READ_OPS</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>FILE_LARGE_READ_OPS</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>FILE_WRITE_OPS</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>HDFS_BYTES_READ</name>
+          <totalCounterValue>48</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>HDFS_BYTES_WRITTEN</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>HDFS_READ_OPS</name>
+          <totalCounterValue>1</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>HDFS_LARGE_READ_OPS</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>HDFS_WRITE_OPS</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+      </counterGroup>
+      <counterGroup>
+        <counterGroupName>org.apache.hadoop.mapreduce.TaskCounter</counterGroupName>
+        <counter>
+          <name>MAP_INPUT_RECORDS</name>
+          <totalCounterValue>1</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>MAP_OUTPUT_RECORDS</name>
+          <totalCounterValue>1200</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>MAP_OUTPUT_BYTES</name>
+          <totalCounterValue>4800</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>MAP_OUTPUT_MATERIALIZED_BYTES</name>
+          <totalCounterValue>2235</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>SPLIT_RAW_BYTES</name>
+          <totalCounterValue>48</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>COMBINE_INPUT_RECORDS</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>COMBINE_OUTPUT_RECORDS</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>REDUCE_INPUT_GROUPS</name>
+          <totalCounterValue>1200</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>REDUCE_SHUFFLE_BYTES</name>
+          <totalCounterValue>2235</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>REDUCE_INPUT_RECORDS</name>
+          <totalCounterValue>1200</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>REDUCE_OUTPUT_RECORDS</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>SPILLED_RECORDS</name>
+          <totalCounterValue>2400</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>SHUFFLED_MAPS</name>
+          <totalCounterValue>1</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>FAILED_SHUFFLE</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>MERGED_MAP_OUTPUTS</name>
+          <totalCounterValue>1</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>GC_TIME_MILLIS</name>
+          <totalCounterValue>113</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>CPU_MILLISECONDS</name>
+          <totalCounterValue>1830</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>PHYSICAL_MEMORY_BYTES</name>
+          <totalCounterValue>478068736</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>VIRTUAL_MEMORY_BYTES</name>
+          <totalCounterValue>2159284224</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+        <counter>
+          <name>COMMITTED_HEAP_BYTES</name>
+          <totalCounterValue>378863616</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+      </counterGroup>
+      <counterGroup>
+        <counterGroupName>org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter</counterGroupName>
+        <counter>
+          <name>BYTES_READ</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+      </counterGroup>
+      <counterGroup>
+        <counterGroupName>org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter</counterGroupName>
+        <counter>
+          <name>BYTES_WRITTEN</name>
+          <totalCounterValue>0</totalCounterValue>
+          <mapCounterValue>0</mapCounterValue>
+          <reduceCounterValue>0</reduceCounterValue>
+        </counter>
+      </counterGroup>
+    </jobCounters>
+
+### Job Conf API
+
+A job configuration resource contains information about the job configuration for this job.
+
+#### URI
+
+Use the following URI to obtain the job configuration information from a job identified by the jobid value.
+
+      * http://<history server http address:port>/ws/v1/history/mapreduce/jobs/{jobid}/conf
+
+#### HTTP Operations Supported
+
+      * GET
+
+#### Query Parameters Supported
+
+      None
+
+#### Elements of the *conf* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| path | string | The path to the job configuration file |
+| property | array of the configuration properties(JSON)/zero or more configuration properties(XML) | Collection of configuration property objects |
+
+#### Elements of the *property* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| name | string | The name of the configuration property |
+| value | string | The value of the configuration property |
+| source | string | The location this configuration object came from. If there is more than one of these, it shows the history with the latest source at the end of the list. |
+
+#### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/conf
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+This is a small snippet of the output, as the full output is very large. The real output contains every property in your job configuration file.
+
+    {
+       "conf" : {
+          "path" : "hdfs://host.domain.com:9000/user/user1/.staging/job_1326381300833_0002/job.xml",
+          "property" : [
+             {
+                "value" : "/home/hadoop/hdfs/data",
+                "name" : "dfs.datanode.data.dir",
+                "source" : ["hdfs-site.xml", "job.xml"]
+             },
+             {
+                "value" : "org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer",
+                "name" : "hadoop.http.filter.initializers",
+                "source" : ["programmatically", "job.xml"]
+             },
+             {
+                "value" : "/home/hadoop/tmp",
+                "name" : "mapreduce.cluster.temp.dir",
+                "source" : ["mapred-site.xml"]
+             },
+             ...
+          ]
+       }
+    }
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/conf
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 552
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <conf>
+      <path>hdfs://host.domain.com:9000/user/user1/.staging/job_1326381300833_0002/job.xml</path>
+      <property>
+        <name>dfs.datanode.data.dir</name>
+        <value>/home/hadoop/hdfs/data</value>
+        <source>hdfs-site.xml</source>
+        <source>job.xml</source>
+      </property>
+      <property>
+        <name>hadoop.http.filter.initializers</name>
+        <value>org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer</value>
+        <source>programmatically</source>
+        <source>job.xml</source>
+      </property>
+      <property>
+        <name>mapreduce.cluster.temp.dir</name>
+        <value>/home/hadoop/tmp</value>
+        <source>mapred-site.xml</source>
+      </property>
+      ...
+    </conf>
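
As a rough illustration of how the *source* list might be consumed, the Python sketch below takes an already-parsed JSON body shaped like the example in this section and reports each property's value together with the last entry of `source` (the latest source, per the *property* table). The helper name `latest_sources` and the trimmed sample body are assumptions made for illustration; they are not part of the History Server API.

```python
import json

# Trimmed sample body shaped like the JSON response example in this section.
SAMPLE = """
{
  "conf": {
    "path": "hdfs://host.domain.com:9000/user/user1/.staging/job_1326381300833_0002/job.xml",
    "property": [
      {"name": "dfs.datanode.data.dir", "value": "/home/hadoop/hdfs/data",
       "source": ["hdfs-site.xml", "job.xml"]},
      {"name": "mapreduce.cluster.temp.dir", "value": "/home/hadoop/tmp",
       "source": ["mapred-site.xml"]}
    ]
  }
}
"""


def latest_sources(body):
    """Map each property name to (value, latest source). Hypothetical helper."""
    result = {}
    for prop in body["conf"].get("property", []):
        sources = prop.get("source", [])
        # The latest source sits at the end of the list; None if none is listed.
        result[prop["name"]] = (prop["value"], sources[-1] if sources else None)
    return result


if __name__ == "__main__":
    for name, (value, source) in latest_sources(json.loads(SAMPLE)).items():
        print(f"{name} = {value} (from {source})")
```

Keeping only the last `source` entry is a simplification; the full list is the one to inspect when tracing how a value was overridden.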
+
+### Tasks API
+
+With the tasks API, you can obtain a collection of resources that represent the tasks within a job. When you run a GET operation on this resource, you obtain a collection of Task Objects.
+
+#### URI
+
+      * http://<history server http address:port>/ws/v1/history/mapreduce/jobs/{jobid}/tasks
+
+#### HTTP Operations Supported
+
+      * GET
+
+#### Query Parameters Supported
+
+      * type - type of task, valid values are m or r.  m for map task or r for reduce task.
+
+#### Elements of the *tasks* object
+
+When you make a request for the list of tasks, the information will be returned as an array of task objects. See also
+[Task API](#Task_API)
+for syntax of the task object.
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| task | array of task objects(JSON)/zero or more task objects(XML) | The collection of task objects. |
+
+#### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    {
+       "tasks" : {
+          "task" : [
+             {
+                "progress" : 100,
+                "elapsedTime" : 6777,
+                "state" : "SUCCEEDED",
+                "startTime" : 1326381446541,
+                "id" : "task_1326381300833_2_2_m_0",
+                "type" : "MAP",
+                "successfulAttempt" : "attempt_1326381300833_2_2_m_0_0",
+                "finishTime" : 1326381453318
+             },
+             {
+                "progress" : 100,
+                "elapsedTime" : 135559,
+                "state" : "SUCCEEDED",
+                "startTime" : 1326381446544,
+                "id" : "task_1326381300833_2_2_r_0",
+                "type" : "REDUCE",
+                "successfulAttempt" : "attempt_1326381300833_2_2_r_0_0",
+                "finishTime" : 1326381582103
+             }
+          ]
+       }
+    }
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 653
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <tasks>
+      <task>
+        <startTime>1326381446541</startTime>
+        <finishTime>1326381453318</finishTime>
+        <elapsedTime>6777</elapsedTime>
+        <progress>100.0</progress>
+        <id>task_1326381300833_2_2_m_0</id>
+        <state>SUCCEEDED</state>
+        <type>MAP</type>
+        <successfulAttempt>attempt_1326381300833_2_2_m_0_0</successfulAttempt>
+      </task>
+      <task>
+        <startTime>1326381446544</startTime>
+        <finishTime>1326381582103</finishTime>
+        <elapsedTime>135559</elapsedTime>
+        <progress>100.0</progress>
+        <id>task_1326381300833_2_2_r_0</id>
+        <state>SUCCEEDED</state>
+        <type>REDUCE</type>
+        <successfulAttempt>attempt_1326381300833_2_2_r_0_0</successfulAttempt>
+      </task>
+    </tasks>
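
The `type` query parameter and the shape of the *tasks* object can be sketched in Python as follows. This is a minimal sketch under stated assumptions: `tasks_url` and `count_by_type` are hypothetical helper names, and the base address in the usage lines is a placeholder for your history server's HTTP address.

```python
from urllib.parse import urlencode


def tasks_url(base, job_id, task_type=None):
    """Build the tasks URI. `base` is an assumed 'http://host:port' string."""
    url = f"{base}/ws/v1/history/mapreduce/jobs/{job_id}/tasks"
    if task_type is not None:
        # The documented values are m (map task) and r (reduce task).
        if task_type not in ("m", "r"):
            raise ValueError("type must be 'm' (map) or 'r' (reduce)")
        url += "?" + urlencode({"type": task_type})
    return url


def count_by_type(body):
    """Count the task objects in a parsed JSON tasks response by their type."""
    counts = {}
    for task in (body.get("tasks") or {}).get("task", []):
        counts[task["type"]] = counts.get(task["type"], 0) + 1
    return counts


if __name__ == "__main__":
    # Placeholder address; substitute the real history server address.
    print(tasks_url("http://historyserver.example.com:19888",
                    "job_1326381300833_2_2", "r"))
    sample = {"tasks": {"task": [{"type": "MAP"}, {"type": "REDUCE"}]}}
    print(count_by_type(sample))
```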
+
+### Task API
+
+A Task resource contains information about a particular task within a job.
+
+#### URI
+
+Use the following URI to obtain a Task Object, from a task identified by the taskid value.
+
+      * http://<history server http address:port>/ws/v1/history/mapreduce/jobs/{jobid}/tasks/{taskid}
+
+#### HTTP Operations Supported
+
+      * GET
+
+#### Query Parameters Supported
+
+      None
+
+#### Elements of the *task* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| id | string | The task id |
+| state | string | The state of the task - valid values are: NEW, SCHEDULED, RUNNING, SUCCEEDED, FAILED, KILL\_WAIT, KILLED |
+| type | string | The task type - MAP or REDUCE |
+| successfulAttempt | string | The id of the last successful attempt |
+| progress | float | The progress of the task as a percent |
+| startTime | long | The time in which the task started (in ms since epoch) or -1 if it was never started |
+| finishTime | long | The time in which the task finished (in ms since epoch) |
+| elapsedTime | long | The elapsed time since the task started (in ms) |
+
+#### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks/task_1326381300833_2_2_m_0
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    {
+       "task" : {
+          "progress" : 100,
+          "elapsedTime" : 6777,
+          "state" : "SUCCEEDED",
+          "startTime" : 1326381446541,
+          "id" : "task_1326381300833_2_2_m_0",
+          "type" : "MAP",
+          "successfulAttempt" : "attempt_1326381300833_2_2_m_0_0",
+          "finishTime" : 1326381453318
+       }
+    }
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks/task_1326381300833_2_2_m_0
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 299
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <task>
+      <startTime>1326381446541</startTime>
+      <finishTime>1326381453318</finishTime>
+      <elapsedTime>6777</elapsedTime>
+      <progress>100.0</progress>
+      <id>task_1326381300833_2_2_m_0</id>
+      <state>SUCCEEDED</state>
+      <type>MAP</type>
+      <successfulAttempt>attempt_1326381300833_2_2_m_0_0</successfulAttempt>
+    </task>
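
The time fields of the *task* object are plain millisecond timestamps, so they can be checked against each other. The Python sketch below uses the values from the response example in this section to show that `finishTime - startTime` matches the reported `elapsedTime` for this finished task; `run_time_ms` and `ms_to_utc` are hypothetical helper names, not part of the API.

```python
from datetime import datetime, timezone

# Field values copied from the JSON response example in this section.
TASK = {
    "progress": 100,
    "elapsedTime": 6777,
    "state": "SUCCEEDED",
    "startTime": 1326381446541,
    "id": "task_1326381300833_2_2_m_0",
    "type": "MAP",
    "successfulAttempt": "attempt_1326381300833_2_2_m_0_0",
    "finishTime": 1326381453318,
}


def ms_to_utc(ms):
    """Convert milliseconds since the epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def run_time_ms(task):
    """Return finishTime - startTime, or None if the task never started (-1)."""
    if task["startTime"] == -1:
        return None
    return task["finishTime"] - task["startTime"]


if __name__ == "__main__":
    print("started:", ms_to_utc(TASK["startTime"]).isoformat())
    print("run time (ms):", run_time_ms(TASK))
```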
+
+### Task Counters API
+
+With the task counters API, you can obtain a collection of resources that represent all the counters for that task.
+
+#### URI
+
+      * http://<history server http address:port>/ws/v1/history/mapreduce/jobs/{jobid}/tasks/{taskid}/counters
+
+#### HTTP Operations Supported
+
+      * GET
+
+#### Query Parameters Supported
+
+      None
+
+#### Elements of the *jobTaskCounters* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| id | string | The task id |
+| taskCounterGroup | array of counterGroup objects(JSON)/zero or more counterGroup objects(XML) | A collection of counter group objects |
+
+#### Elements of the *counterGroup* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| counterGroupName | string | The name of the counter group |
+| counter | array of counter objects(JSON)/zero or more counter objects(XML) | A collection of counter objects |
+
+#### Elements of the *counter* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| name | string | The name of the counter |
+| value | long | The value of the counter |
+
+#### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks/task_1326381300833_2_2_m_0/counters
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    {
+       "jobTaskCounters" : {
+          "id" : "task_1326381300833_2_2_m_0",
+          "taskCounterGroup" : [
+             {
+                "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
+                "counter" : [
+                   {
+                      "value" : 2363,
+                      "name" : "FILE_BYTES_READ"
+                   },
+                   {
+                      "value" : 54372,
+                      "name" : "FILE_BYTES_WRITTEN"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "FILE_READ_OPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "FILE_LARGE_READ_OPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "FILE_WRITE_OPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "HDFS_BYTES_READ"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "HDFS_BYTES_WRITTEN"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "HDFS_READ_OPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "HDFS_LARGE_READ_OPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "HDFS_WRITE_OPS"
+                   }
+                ]
+             },
+             {
+                "counterGroupName" : "org.apache.hadoop.mapreduce.TaskCounter",
+                "counter" : [
+                   {
+                      "value" : 0,
+                      "name" : "COMBINE_INPUT_RECORDS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "COMBINE_OUTPUT_RECORDS"
+                   },
+                   {
+                      "value" : 460,
+                      "name" : "REDUCE_INPUT_GROUPS"
+                   },
+                   {
+                      "value" : 2235,
+                      "name" : "REDUCE_SHUFFLE_BYTES"
+                   },
+                   {
+                      "value" : 460,
+                      "name" : "REDUCE_INPUT_RECORDS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "REDUCE_OUTPUT_RECORDS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "SPILLED_RECORDS"
+                   },
+                   {
+                      "value" : 1,
+                      "name" : "SHUFFLED_MAPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "FAILED_SHUFFLE"
+                   },
+                   {
+                      "value" : 1,
+                      "name" : "MERGED_MAP_OUTPUTS"
+                   },
+                   {
+                      "value" : 26,
+                      "name" : "GC_TIME_MILLIS"
+                   },
+                   {
+                      "value" : 860,
+                      "name" : "CPU_MILLISECONDS"
+                   },
+                   {
+                      "value" : 107839488,
+                      "name" : "PHYSICAL_MEMORY_BYTES"
+                   },
+                   {
+                      "value" : 1123147776,
+                      "name" : "VIRTUAL_MEMORY_BYTES"
+                   },
+                   {
+                      "value" : 57475072,
+                      "name" : "COMMITTED_HEAP_BYTES"
+                   }
+                ]
+             },
+             {
+                "counterGroupName" : "Shuffle Errors",
+                "counter" : [
+                   {
+                      "value" : 0,
+                      "name" : "BAD_ID"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "CONNECTION"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "IO_ERROR"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "WRONG_LENGTH"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "WRONG_MAP"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "WRONG_REDUCE"
+                   }
+                ]
+             },
+             {
+                "counterGroupName" : "org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter",
+                "counter" : [
+                   {
+                      "value" : 0,
+                      "name" : "BYTES_WRITTEN"
+                   }
+                ]
+             }
+          ]
+       }
+    }
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks/task_1326381300833_2_2_m_0/counters
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 2660
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <jobTaskCounters>
+      <id>task_1326381300833_2_2_m_0</id>
+      <taskCounterGroup>
+        <counterGroupName>org.apache.hadoop.mapreduce.FileSystemCounter</counterGroupName>
+        <counter>
+          <name>FILE_BYTES_READ</name>
+          <value>2363</value>
+        </counter>
+        <counter>
+          <name>FILE_BYTES_WRITTEN</name>
+          <value>54372</value>
+        </counter>
+        <counter>
+          <name>FILE_READ_OPS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>FILE_LARGE_READ_OPS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>FILE_WRITE_OPS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>HDFS_BYTES_READ</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>HDFS_BYTES_WRITTEN</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>HDFS_READ_OPS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>HDFS_LARGE_READ_OPS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>HDFS_WRITE_OPS</name>
+          <value>0</value>
+        </counter>
+      </taskCounterGroup>
+      <taskCounterGroup>
+        <counterGroupName>org.apache.hadoop.mapreduce.TaskCounter</counterGroupName>
+        <counter>
+          <name>COMBINE_INPUT_RECORDS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>COMBINE_OUTPUT_RECORDS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>REDUCE_INPUT_GROUPS</name>
+          <value>460</value>
+        </counter>
+        <counter>
+          <name>REDUCE_SHUFFLE_BYTES</name>
+          <value>2235</value>
+        </counter>
+        <counter>
+          <name>REDUCE_INPUT_RECORDS</name>
+          <value>460</value>
+        </counter>
+        <counter>
+          <name>REDUCE_OUTPUT_RECORDS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>SPILLED_RECORDS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>SHUFFLED_MAPS</name>
+          <value>1</value>
+        </counter>
+        <counter>
+          <name>FAILED_SHUFFLE</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>MERGED_MAP_OUTPUTS</name>
+          <value>1</value>
+        </counter>
+        <counter>
+          <name>GC_TIME_MILLIS</name>
+          <value>26</value>
+        </counter>
+        <counter>
+          <name>CPU_MILLISECONDS</name>
+          <value>860</value>
+        </counter>
+        <counter>
+          <name>PHYSICAL_MEMORY_BYTES</name>
+          <value>107839488</value>
+        </counter>
+        <counter>
+          <name>VIRTUAL_MEMORY_BYTES</name>
+          <value>1123147776</value>
+        </counter>
+        <counter>
+          <name>COMMITTED_HEAP_BYTES</name>
+          <value>57475072</value>
+        </counter>
+      </taskCounterGroup>
+      <taskCounterGroup>
+        <counterGroupName>Shuffle Errors</counterGroupName>
+        <counter>
+          <name>BAD_ID</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>CONNECTION</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>IO_ERROR</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>WRONG_LENGTH</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>WRONG_MAP</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>WRONG_REDUCE</name>
+          <value>0</value>
+        </counter>
+      </taskCounterGroup>
+      <taskCounterGroup>
+        <counterGroupName>org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter</counterGroupName>
+        <counter>
+          <name>BYTES_WRITTEN</name>
+          <value>0</value>
+        </counter>
+      </taskCounterGroup>
+    </jobTaskCounters>
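
One way to consume the XML form is to flatten it into a nested dictionary keyed by counter group name and counter name. The Python sketch below does this with the standard library's `xml.etree.ElementTree`, using a shortened copy of the response body above. The function name `counters_by_group` is an assumption for illustration only.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the XML response body shown in this section.
SAMPLE_XML = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<jobTaskCounters>
  <id>task_1326381300833_2_2_m_0</id>
  <taskCounterGroup>
    <counterGroupName>org.apache.hadoop.mapreduce.FileSystemCounter</counterGroupName>
    <counter><name>FILE_BYTES_READ</name><value>2363</value></counter>
    <counter><name>FILE_BYTES_WRITTEN</name><value>54372</value></counter>
  </taskCounterGroup>
  <taskCounterGroup>
    <counterGroupName>Shuffle Errors</counterGroupName>
    <counter><name>IO_ERROR</name><value>0</value></counter>
  </taskCounterGroup>
</jobTaskCounters>
"""


def counters_by_group(xml_text):
    """Return (task id, {group name: {counter name: value}}). Hypothetical helper."""
    # Parse from bytes so the encoding declaration in the prolog is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    groups = {}
    for group in root.findall("taskCounterGroup"):
        name = group.findtext("counterGroupName")
        groups[name] = {
            counter.findtext("name"): int(counter.findtext("value"))
            for counter in group.findall("counter")
        }
    return root.findtext("id"), groups


if __name__ == "__main__":
    task_id, groups = counters_by_group(SAMPLE_XML)
    print(task_id)
    for group_name, counters in groups.items():
        print(group_name, counters)
```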
+
+### Task Attempts API
+
+With the task attempts API, you can obtain a collection of resources that represent the attempts of a task within a job. When you run a GET operation on this resource, you obtain a collection of Task Attempt Objects.
+
+#### URI
+
+      * http://<history server http address:port>/ws/v1/history/mapreduce/jobs/{jobid}/tasks/{taskid}/attempts
+
+#### HTTP Operations Supported
+
+      * GET
+
+#### Query Parameters Supported
+
+      None
+
+#### Elements of the *taskAttempts* object
+
+When you make a request for the list of task attempts, the information will be returned as an array of task attempt objects. See also
+[Task Attempt API](#Task_Attempt_API)
+for syntax of the task attempt object.
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| taskAttempt | array of task attempt objects(JSON)/zero or more task attempt objects(XML) | The collection of task attempt objects |
+
+#### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks/task_1326381300833_2_2_m_0/attempts
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    {
+       "taskAttempts" : {
+          "taskAttempt" : [
+             {
+                "assignedContainerId" : "container_1326381300833_0002_01_000002",
+                "progress" : 100,
+                "elapsedTime" : 2638,
+                "state" : "SUCCEEDED",
+                "diagnostics" : "",
+                "rack" : "/98.139.92.0",
+                "nodeHttpAddress" : "host.domain.com:8042",
+                "startTime" : 1326381450680,
+                "id" : "attempt_1326381300833_2_2_m_0_0",
+                "type" : "MAP",
+                "finishTime" : 1326381453318
+             }
+          ]
+       }
+    }
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks/task_1326381300833_2_2_m_0/attempts
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 537
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <taskAttempts>
+      <taskAttempt>
+        <startTime>1326381450680</startTime>
+        <finishTime>1326381453318</finishTime>
+        <elapsedTime>2638</elapsedTime>
+        <progress>100.0</progress>
+        <id>attempt_1326381300833_2_2_m_0_0</id>
+        <rack>/98.139.92.0</rack>
+        <state>SUCCEEDED</state>
+        <nodeHttpAddress>host.domain.com:8042</nodeHttpAddress>
+        <diagnostics/>
+        <type>MAP</type>
+        <assignedContainerId>container_1326381300833_0002_01_000002</assignedContainerId>
+      </taskAttempt>
+    </taskAttempts>
+
+### Task Attempt API
+
+A Task Attempt resource contains information about a particular task attempt within a job.
+
+#### URI
+
+Use the following URI to obtain a Task Attempt Object, from a task attempt identified by the attemptid value.
+
+      * http://<history server http address:port>/ws/v1/history/mapreduce/jobs/{jobid}/tasks/{taskid}/attempts/{attemptid}
+
+#### HTTP Operations Supported
+
+      * GET
+
+#### Query Parameters Supported
+
+      None
+
+#### Elements of the *taskAttempt* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| id | string | The task attempt id |
+| rack | string | The rack |
+| state | string | The state of the task attempt - valid values are: NEW, UNASSIGNED, ASSIGNED, RUNNING, COMMIT\_PENDING, SUCCESS\_CONTAINER\_CLEANUP, SUCCEEDED, FAIL\_CONTAINER\_CLEANUP, FAIL\_TASK\_CLEANUP, FAILED, KILL\_CONTAINER\_CLEANUP, KILL\_TASK\_CLEANUP, KILLED |
+| type | string | The type of task |
+| assignedContainerId | string | The container id this attempt is assigned to |
+| nodeHttpAddress | string | The http address of the node this task attempt ran on |
+| diagnostics | string | A diagnostics message |
+| progress | float | The progress of the task attempt as a percent |
+| startTime | long | The time in which the task attempt started (in ms since epoch) |
+| finishTime | long | The time in which the task attempt finished (in ms since epoch) |
+| elapsedTime | long | The elapsed time since the task attempt started (in ms) |
+
+For reduce task attempts you also have the following fields:
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| shuffleFinishTime | long | The time at which shuffle finished (in ms since epoch) |
+| mergeFinishTime | long | The time at which merge finished (in ms since epoch) |
+| elapsedShuffleTime | long | The time it took for the shuffle phase to complete (time in ms between reduce task start and shuffle finish) |
+| elapsedMergeTime | long | The time it took for the merge phase to complete (time in ms between the shuffle finish and merge finish) |
+| elapsedReduceTime | long | The time it took for the reduce phase to complete (time in ms between merge finish and end of reduce task) |
+
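The three elapsed fields for reduce attempts follow directly from the timestamp fields, as the descriptions in the table state. A minimal sketch of that arithmetic is shown here; the function name `reduce_phase_times` and the sample timestamps are made up for illustration and do not come from a real job.

```python
def reduce_phase_times(attempt):
    """Derive the three elapsed-phase fields from the timestamp fields (all in ms)."""
    return {
        # reduce task start -> shuffle finish
        "elapsedShuffleTime": attempt["shuffleFinishTime"] - attempt["startTime"],
        # shuffle finish -> merge finish
        "elapsedMergeTime": attempt["mergeFinishTime"] - attempt["shuffleFinishTime"],
        # merge finish -> end of reduce task
        "elapsedReduceTime": attempt["finishTime"] - attempt["mergeFinishTime"],
    }


# Hypothetical timestamps in ms since epoch, chosen only for this example.
sample = {
    "startTime": 1326381450000,
    "shuffleFinishTime": 1326381451200,
    "mergeFinishTime": 1326381451500,
    "finishTime": 1326381453000,
}
phases = reduce_phase_times(sample)
print(phases)
```

Under these definitions the three phases add up to `finishTime - startTime`.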
+#### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks/task_1326381300833_2_2_m_0/attempts/attempt_1326381300833_2_2_m_0_0 
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    {
+       "taskAttempt" : {
+          "assignedContainerId" : "container_1326381300833_0002_01_000002",
+          "progress" : 100,
+          "elapsedTime" : 2638,
+          "state" : "SUCCEEDED",
+          "diagnostics" : "",
+          "rack" : "/98.139.92.0",
+          "nodeHttpAddress" : "host.domain.com:8042",
+          "startTime" : 1326381450680,
+          "id" : "attempt_1326381300833_2_2_m_0_0",
+          "type" : "MAP",
+          "finishTime" : 1326381453318
+       }
+    }
+
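A client might consume the JSON body above roughly as follows. This is only a sketch: the body is pasted in as a string rather than fetched over HTTP, and the variable names are assumptions of this example.

```python
import json

# The JSON response body from the example above, embedded as a string.
body = """
{
   "taskAttempt" : {
      "assignedContainerId" : "container_1326381300833_0002_01_000002",
      "progress" : 100,
      "elapsedTime" : 2638,
      "state" : "SUCCEEDED",
      "diagnostics" : "",
      "rack" : "/98.139.92.0",
      "nodeHttpAddress" : "host.domain.com:8042",
      "startTime" : 1326381450680,
      "id" : "attempt_1326381300833_2_2_m_0_0",
      "type" : "MAP",
      "finishTime" : 1326381453318
   }
}
"""

# The payload is wrapped in a single top-level "taskAttempt" key.
attempt = json.loads(body)["taskAttempt"]

# For a finished attempt, elapsedTime matches finishTime - startTime (in ms).
duration_ms = attempt["finishTime"] - attempt["startTime"]
print(attempt["id"], attempt["state"], duration_ms)
```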
+**XML response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks/task_1326381300833_2_2_m_0/attempts/attempt_1326381300833_2_2_m_0_0 
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 691
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <taskAttempt>
+      <startTime>1326381450680</startTime>
+      <finishTime>1326381453318</finishTime>
+      <elapsedTime>2638</elapsedTime>
+      <progress>100.0</progress>
+      <id>attempt_1326381300833_2_2_m_0_0</id>
+      <rack>/98.139.92.0</rack>
+      <state>SUCCEEDED</state>
+      <nodeHttpAddress>host.domain.com:8042</nodeHttpAddress>
+      <diagnostics/>
+      <type>MAP</type>
+      <assignedContainerId>container_1326381300833_0002_01_000002</assignedContainerId>
+    </taskAttempt>
+
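The JSON and XML requests above differ only in the `Accept` header. The sketch below prepares such a request with Python's standard library without sending it; the helper name `attempt_request` and the host `historyserver.example.com:19888` are hypothetical.

```python
from urllib.request import Request


def attempt_request(url, xml=False):
    """Prepare (but do not send) a GET request for a task attempt resource."""
    # As in the examples above, the XML variant adds an Accept header and the
    # JSON variant sends none.
    headers = {"Accept": "application/xml"} if xml else {}
    return Request(url, headers=headers, method="GET")


# Hypothetical history server address; the IDs are the ones used in the examples.
url = (
    "http://historyserver.example.com:19888/ws/v1/history/mapreduce/jobs/"
    "job_1326381300833_2_2/tasks/task_1326381300833_2_2_m_0/attempts/"
    "attempt_1326381300833_2_2_m_0_0"
)
xml_req = attempt_request(url, xml=True)
json_req = attempt_request(url)
print(xml_req.get_method(), xml_req.get_header("Accept"))
```

Passing either object to `urllib.request.urlopen` would perform the actual call against a running history server.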
+### Task Attempt Counters API
+
+With the task attempt counters API, you can obtain a collection of resources that represent all the counters for that task attempt.
+
+#### URI
+
+      * http://<history server http address:port>/ws/v1/history/mapreduce/jobs/{jobid}/tasks/{taskid}/attempts/{attemptid}/counters
+
+#### HTTP Operations Supported
+
+      * GET
+
+#### Query Parameters Supported
+
+      None
+
+#### Elements of the *jobTaskAttemptCounters* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| id | string | The task attempt id |
+| taskAttemptCounterGroup | array of task attempt counterGroup objects(JSON)/zero or more task attempt counterGroup objects(XML) | A collection of task attempt counter group objects |
+
+#### Elements of the *taskAttemptCounterGroup* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| counterGroupName | string | The name of the counter group |
+| counter | array of counter objects(JSON)/zero or more counter objects(XML) | A collection of counter objects |
+
+#### Elements of the *counter* object
+
+| Item | Data Type | Description |
+|:---- |:---- |:---- |
+| name | string | The name of the counter |
+| value | long | The value of the counter |
+
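The three nested objects described above (counters document, counter group, counter) can be flattened into a simple lookup table. The following is a minimal sketch under the assumption that the JSON keys are spelled as in the JSON response for this resource; the function name `flatten_counters` and the trimmed-down sample are illustrative only.

```python
import json


def flatten_counters(doc):
    """Return {(group name, counter name): value} for a jobTaskAttemptCounters document."""
    flat = {}
    for group in doc["jobTaskAttemptCounters"].get("taskAttemptCounterGroup", []):
        for counter in group.get("counter", []):
            flat[(group["counterGroupName"], counter["name"])] = counter["value"]
    return flat


# A shortened sample with two groups, in the shape described by the tables above.
sample = json.loads("""
{
   "jobTaskAttemptCounters" : {
      "taskAttemptCounterGroup" : [
         {
            "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
            "counter" : [
               { "value" : 2363, "name" : "FILE_BYTES_READ" },
               { "value" : 54372, "name" : "FILE_BYTES_WRITTEN" }
            ]
         },
         {
            "counterGroupName" : "Shuffle Errors",
            "counter" : [
               { "value" : 0, "name" : "IO_ERROR" }
            ]
         }
      ],
      "id" : "attempt_1326381300833_2_2_m_0_0"
   }
}
""")
counters = flatten_counters(sample)
print(counters[("Shuffle Errors", "IO_ERROR")])
```

Keying on the (group, counter) pair avoids collisions between counters that share a name across groups.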
+#### Response Examples
+
+**JSON response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks/task_1326381300833_2_2_m_0/attempts/attempt_1326381300833_2_2_m_0_0/counters
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/json
+      Transfer-Encoding: chunked
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    {
+       "jobTaskAttemptCounters" : {
+          "taskAttemptCounterGroup" : [
+             {
+                "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
+                "counter" : [
+                   {
+                      "value" : 2363,
+                      "name" : "FILE_BYTES_READ"
+                   },
+                   {
+                      "value" : 54372,
+                      "name" : "FILE_BYTES_WRITTEN"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "FILE_READ_OPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "FILE_LARGE_READ_OPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "FILE_WRITE_OPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "HDFS_BYTES_READ"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "HDFS_BYTES_WRITTEN"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "HDFS_READ_OPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "HDFS_LARGE_READ_OPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "HDFS_WRITE_OPS"
+                   }
+                ]
+             },
+             {
+                "counterGroupName" : "org.apache.hadoop.mapreduce.TaskCounter",
+                "counter" : [
+                   {
+                      "value" : 0,
+                      "name" : "COMBINE_INPUT_RECORDS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "COMBINE_OUTPUT_RECORDS"
+                   },
+                   {
+                      "value" : 460,
+                      "name" : "REDUCE_INPUT_GROUPS"
+                   },
+                   {
+                      "value" : 2235,
+                      "name" : "REDUCE_SHUFFLE_BYTES"
+                   },
+                   {
+                      "value" : 460,
+                      "name" : "REDUCE_INPUT_RECORDS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "REDUCE_OUTPUT_RECORDS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "SPILLED_RECORDS"
+                   },
+                   {
+                      "value" : 1,
+                      "name" : "SHUFFLED_MAPS"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "FAILED_SHUFFLE"
+                   },
+                   {
+                      "value" : 1,
+                      "name" : "MERGED_MAP_OUTPUTS"
+                   },
+                   {
+                      "value" : 26,
+                      "name" : "GC_TIME_MILLIS"
+                   },
+                   {
+                      "value" : 860,
+                      "name" : "CPU_MILLISECONDS"
+                   },
+                   {
+                      "value" : 107839488,
+                      "name" : "PHYSICAL_MEMORY_BYTES"
+                   },
+                   {
+                      "value" : 1123147776,
+                      "name" : "VIRTUAL_MEMORY_BYTES"
+                   },
+                   {
+                      "value" : 57475072,
+                      "name" : "COMMITTED_HEAP_BYTES"
+                   }
+                ]
+             },
+             {
+                "counterGroupName" : "Shuffle Errors",
+                "counter" : [
+                   {
+                      "value" : 0,
+                      "name" : "BAD_ID"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "CONNECTION"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "IO_ERROR"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "WRONG_LENGTH"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "WRONG_MAP"
+                   },
+                   {
+                      "value" : 0,
+                      "name" : "WRONG_REDUCE"
+                   }
+                ]
+             },
+             {
+                "counterGroupName" : "org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter",
+                "counter" : [
+                   {
+                      "value" : 0,
+                      "name" : "BYTES_WRITTEN"
+                   }
+                ]
+             }
+          ],
+          "id" : "attempt_1326381300833_2_2_m_0_0"
+       }
+    }
+
+**XML response**
+
+HTTP Request:
+
+      GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs/job_1326381300833_2_2/tasks/task_1326381300833_2_2_m_0/attempts/attempt_1326381300833_2_2_m_0_0/counters
+      Accept: application/xml
+
+Response Header:
+
+      HTTP/1.1 200 OK
+      Content-Type: application/xml
+      Content-Length: 2735
+      Server: Jetty(6.1.26)
+
+Response Body:
+
+    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+    <jobTaskAttemptCounters>
+      <id>attempt_1326381300833_2_2_m_0_0</id>
+      <taskAttemptCounterGroup>
+        <counterGroupName>org.apache.hadoop.mapreduce.FileSystemCounter</counterGroupName>
+        <counter>
+          <name>FILE_BYTES_READ</name>
+          <value>2363</value>
+        </counter>
+        <counter>
+          <name>FILE_BYTES_WRITTEN</name>
+          <value>54372</value>
+        </counter>
+        <counter>
+          <name>FILE_READ_OPS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>FILE_LARGE_READ_OPS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>FILE_WRITE_OPS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>HDFS_BYTES_READ</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>HDFS_BYTES_WRITTEN</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>HDFS_READ_OPS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>HDFS_LARGE_READ_OPS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>HDFS_WRITE_OPS</name>
+          <value>0</value>
+        </counter>
+      </taskAttemptCounterGroup>
+      <taskAttemptCounterGroup>
+        <counterGroupName>org.apache.hadoop.mapreduce.TaskCounter</counterGroupName>
+        <counter>
+          <name>COMBINE_INPUT_RECORDS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>COMBINE_OUTPUT_RECORDS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>REDUCE_INPUT_GROUPS</name>
+          <value>460</value>
+        </counter>
+        <counter>
+          <name>REDUCE_SHUFFLE_BYTES</name>
+          <value>2235</value>
+        </counter>
+        <counter>
+          <name>REDUCE_INPUT_RECORDS</name>
+          <value>460</value>
+        </counter>
+        <counter>
+          <name>REDUCE_OUTPUT_RECORDS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>SPILLED_RECORDS</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>SHUFFLED_MAPS</name>
+          <value>1</value>
+        </counter>
+        <counter>
+          <name>FAILED_SHUFFLE</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>MERGED_MAP_OUTPUTS</name>
+          <value>1</value>
+        </counter>
+        <counter>
+          <name>GC_TIME_MILLIS</name>
+          <value>26</value>
+        </counter>
+        <counter>
+          <name>CPU_MILLISECONDS</name>
+          <value>860</value>
+        </counter>
+        <counter>
+          <name>PHYSICAL_MEMORY_BYTES</name>
+          <value>107839488</value>
+        </counter>
+        <counter>
+          <name>VIRTUAL_MEMORY_BYTES</name>
+          <value>1123147776</value>
+        </counter>
+        <counter>
+          <name>COMMITTED_HEAP_BYTES</name>
+          <value>57475072</value>
+        </counter>
+      </taskAttemptCounterGroup>
+      <taskAttemptCounterGroup>
+        <counterGroupName>Shuffle Errors</counterGroupName>
+        <counter>
+          <name>BAD_ID</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>CONNECTION</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>IO_ERROR</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>WRONG_LENGTH</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>WRONG_MAP</name>
+          <value>0</value>
+        </counter>
+        <counter>
+          <name>WRONG_REDUCE</name>
+          <value>0</value>
+        </counter>
+      </taskAttemptCounterGroup>
+      <taskAttemptCounterGroup>
+        <counterGroupName>org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter</counterGroupName>
+        <counter>
+          <name>BYTES_WRITTEN</name>
+          <value>0</value>
+        </counter>
+      </taskAttemptCounterGroup>
+    </jobTaskAttemptCounters>
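The XML form of the counters response can be read with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` on a shortened copy of the body above; the function name `counters_by_group` is an assumption of this example, and values are converted to `int` because the *counter* table lists `value` as a long.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the XML response body shown above.
xml_body = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<jobTaskAttemptCounters>
  <id>attempt_1326381300833_2_2_m_0_0</id>
  <taskAttemptCounterGroup>
    <counterGroupName>org.apache.hadoop.mapreduce.FileSystemCounter</counterGroupName>
    <counter>
      <name>FILE_BYTES_READ</name>
      <value>2363</value>
    </counter>
    <counter>
      <name>FILE_BYTES_WRITTEN</name>
      <value>54372</value>
    </counter>
  </taskAttemptCounterGroup>
  <taskAttemptCounterGroup>
    <counterGroupName>Shuffle Errors</counterGroupName>
    <counter>
      <name>BAD_ID</name>
      <value>0</value>
    </counter>
  </taskAttemptCounterGroup>
</jobTaskAttemptCounters>
"""


def counters_by_group(xml_bytes):
    """Return (attempt id, {group name: {counter name: int value}})."""
    root = ET.fromstring(xml_bytes)
    groups = {}
    # In XML, repeated child elements take the place of the JSON arrays.
    for group in root.findall("taskAttemptCounterGroup"):
        groups[group.findtext("counterGroupName")] = {
            c.findtext("name"): int(c.findtext("value"))
            for c in group.findall("counter")
        }
    return root.findtext("id"), groups


# Parse bytes, as they would arrive from an HTTP response.
attempt_id, groups = counters_by_group(xml_body.encode("utf-8"))
print(attempt_id, sorted(groups))
```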