- ~~ Licensed under the Apache License, Version 2.0 (the "License");
- ~~ you may not use this file except in compliance with the License.
- ~~ You may obtain a copy of the License at
- ~~
- ~~ http://www.apache.org/licenses/LICENSE-2.0
- ~~
- ~~ Unless required by applicable law or agreed to in writing, software
- ~~ distributed under the License is distributed on an "AS IS" BASIS,
- ~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- ~~ See the License for the specific language governing permissions and
- ~~ limitations under the License. See accompanying LICENSE file.
- ---
- Apache Hadoop ${project.version}
- ---
- ---
- ${maven.build.timestamp}
-
- Apache Hadoop ${project.version}
- Apache Hadoop ${project.version} includes significant
- improvements over the previous stable release (hadoop-1.x).
- Here is a short overview of the improvements to both HDFS and MapReduce.
- * {HDFS Federation}
- In order to scale the name service horizontally, federation uses multiple
- independent Namenodes/Namespaces. The Namenodes are federated, that is, the
- Namenodes are independent and don't require coordination with each other.
- The datanodes are used as common storage for blocks by all the Namenodes.
- Each datanode registers with all the Namenodes in the cluster. Datanodes
- send periodic heartbeats and block reports and handle commands from the
- Namenodes.
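- The registration pattern described above can be illustrated with a small
- toy model. This is only a sketch: the class and method names below
- (Namenode, Datanode, join, send_heartbeats) are hypothetical and do not
- correspond to Hadoop's actual Java API. Block reports and commands are omitted.

```python
# Toy model of HDFS federation. Every class and method name here is
# hypothetical and does not mirror Hadoop's real Java API.
from dataclasses import dataclass, field


@dataclass
class Namenode:
    """One independent namespace; it never coordinates with other Namenodes."""
    namespace: str
    datanodes: set = field(default_factory=set)
    heartbeats: dict = field(default_factory=dict)

    def register(self, datanode_id: str) -> None:
        self.datanodes.add(datanode_id)

    def heartbeat(self, datanode_id: str) -> None:
        # Count heartbeats per datanode instead of tracking timestamps.
        self.heartbeats[datanode_id] = self.heartbeats.get(datanode_id, 0) + 1


@dataclass
class Datanode:
    """Common block storage shared by every Namenode in the cluster."""
    datanode_id: str
    namenodes: list = field(default_factory=list)

    def join(self, cluster: list) -> None:
        # A datanode registers with all Namenodes, not just one.
        for nn in cluster:
            nn.register(self.datanode_id)
            self.namenodes.append(nn)

    def send_heartbeats(self) -> None:
        for nn in self.namenodes:
            nn.heartbeat(self.datanode_id)


cluster = [Namenode("ns1"), Namenode("ns2")]
datanodes = [Datanode("dn1"), Datanode("dn2"), Datanode("dn3")]
for dn in datanodes:
    dn.join(cluster)
    dn.send_heartbeats()
```

- In this model each Namenode ends up with the full set of datanodes, while
- no Namenode holds a reference to any other Namenode.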
- More details are available in the
- {{{./hadoop-project-dist/hadoop-hdfs/Federation.html}HDFS Federation}}
- document.
- * {MapReduce NextGen aka YARN aka MRv2}
- The new architecture, introduced in hadoop-0.23, divides the two major
- functions of the JobTracker, resource management and job life-cycle
- management, into separate components.
- The new ResourceManager manages the global assignment of compute resources to
- applications and the per-application ApplicationMaster manages the
- application's scheduling and coordination.
- An application is either a single job in the sense of classic MapReduce jobs
- or a DAG of such jobs.
- The ResourceManager and per-machine NodeManager daemon, which manages the
- user processes on that machine, form the computation fabric.
- The per-application ApplicationMaster is, in effect, a framework-specific
- library and is tasked with negotiating resources from the ResourceManager and
- working with the NodeManager(s) to execute and monitor the tasks.
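- The division of responsibilities described above can be sketched as a toy
- model. All names, the slot-based resource accounting, and the first-fit
- allocation policy below are assumptions made for illustration; they do not
- reflect Hadoop's actual Java API or its real schedulers.

```python
# Toy model of the YARN split. All names, the slot accounting, and the
# first-fit policy are hypothetical; this is not Hadoop's real API.
class NodeManager:
    """Per-machine daemon that runs the tasks it is handed."""
    def __init__(self, host, slots):
        self.host = host
        self.free_slots = slots
        self.running = []

    def launch(self, task):
        self.running.append(task)


class ResourceManager:
    """Global view: decides which node supplies each container."""
    def __init__(self, node_managers):
        self.node_managers = node_managers

    def allocate(self, count):
        # First-fit: fill each node in order until the request is met.
        granted = []
        for nm in self.node_managers:
            while nm.free_slots > 0 and len(granted) < count:
                nm.free_slots -= 1
                granted.append(nm)
        return granted


class ApplicationMaster:
    """Per-application: negotiates containers, then runs tasks in them."""
    def __init__(self, app_id, tasks):
        self.app_id = app_id
        self.tasks = tasks

    def run(self, rm):
        containers = rm.allocate(len(self.tasks))
        for task, nm in zip(self.tasks, containers):
            nm.launch((self.app_id, task))
        return len(containers)


nodes = [NodeManager("node1", 2), NodeManager("node2", 2)]
rm = ResourceManager(nodes)
launched = ApplicationMaster("app-1", ["map-0", "map-1", "reduce-0"]).run(rm)
```

- Here the ResourceManager knows nothing about individual tasks, and the
- ApplicationMaster knows nothing about cluster-wide capacity, which is the
- separation the new architecture is built around.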
- More details are available in the
- {{{./hadoop-yarn/hadoop-yarn-site/YARN.html}YARN}}
- document.
- Getting Started
- The Hadoop documentation includes the information you need to get started using
- Hadoop. Begin with the
- {{{./hadoop-project-dist/hadoop-common/SingleCluster.html}Single Node Setup}}, which
- shows you how to set up a single-node Hadoop installation. Then move on to the
- {{{./hadoop-project-dist/hadoop-common/ClusterSetup.html}Cluster Setup}} to learn how
- to set up a multi-node Hadoop installation.
-