|
@@ -11,12 +11,56 @@
|
|
|
~~ limitations under the License. See accompanying LICENSE file.
|
|
|
|
|
|
---
|
|
|
- Hadoop ${project.version}
|
|
|
+ Apache Hadoop ${project.version}
|
|
|
---
|
|
|
---
|
|
|
${maven.build.timestamp}
|
|
|
|
|
|
-
|
|
|
+Apache Hadoop 0.23
|
|
|
+
|
|
|
+ Apache Hadoop 0.23 includes significant improvements over the previous
|
|
|
+ stable release (hadoop-0.20.205).
|
|
|
+
|
|
|
+ Here is a short overview of the improvements to both HDFS and MapReduce.
|
|
|
+
|
|
|
+ * {HDFS Federation}
|
|
|
+
|
|
|
+ In order to scale the name service horizontally, federation uses multiple
|
|
|
+ independent Namenodes/Namespaces. The Namenodes are federated, that is, the
|
|
|
+ Namenodes are independent and don't require coordination with each other.
|
|
|
+ The datanodes are used as common storage for blocks by all the Namenodes.
|
|
|
+ Each datanode registers with all the Namenodes in the cluster. Datanodes
|
|
|
+ send periodic heartbeats and block reports and handle commands from the
|
|
|
+ Namenodes.
|
|
|
+
|
|
|
+ More details are available in the
|
|
|
+ {{{./hadoop-yarn/hadoop-yarn-site/Federation.html}HDFS Federation}}
|
|
|
+ document.
|
|
|
+
|
|
|
+ * {MapReduce NextGen aka YARN aka MRv2}
|
|
|
+
|
|
|
+ The new architecture introduced in hadoop-0.23 divides the two major
|
|
|
+ functions of the JobTracker: resource management and job life-cycle management
|
|
|
+ into separate components.
|
|
|
+
|
|
|
+ The new ResourceManager manages the global assignment of compute resources to
|
|
|
+ applications, and the per-application ApplicationMaster manages the
|
|
|
+ application’s scheduling and coordination.
|
|
|
+
|
|
|
+ An application is either a single job in the sense of classic MapReduce jobs
|
|
|
+ or a DAG of such jobs.
|
|
|
+
|
|
|
+ The ResourceManager and per-machine NodeManager daemon, which manages the
|
|
|
+ user processes on that machine, form the computation fabric.
|
|
|
+
|
|
|
+ The per-application ApplicationMaster is, in effect, a framework-specific
|
|
|
+ library and is tasked with negotiating resources from the ResourceManager and
|
|
|
+ working with the NodeManager(s) to execute and monitor the tasks.
|
|
|
+
|
|
|
+ More details are available in the
|
|
|
+ {{{./hadoop-yarn/hadoop-yarn-site/YARN.html}YARN}}
|
|
|
+ document.
|
|
|
+
|
|
|
Getting Started
|
|
|
|
|
|
The Hadoop documentation includes the information you need to get started using
|