- <?xml version="1.0" encoding="UTF-8"?>
- <!--
- Copyright 2002-2004 The Apache Software Foundation
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
- http://www.apache.org/licenses/LICENSE-2.0
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
- -->
- <document xmlns="http://maven.apache.org/XDOC/2.0"
- xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
- xsi:schemaLocation="http://maven.apache.org/XDOC/2.0 http://maven.apache.org/xsd/xdoc-2.0.xsd">
- <head>
- <title>DistCp</title>
- </head>
- <body>
- <section name="Overview">
- <p>
- DistCp (distributed copy) is a tool used for large inter/intra-cluster
- copying. It uses Map/Reduce to effect its distribution, error
- handling and recovery, and reporting. It expands a list of files and
- directories into input to map tasks, each of which will copy a partition
- of the files specified in the source list.
- </p>
- <p>
- The legacy implementation of DistCp has its share of quirks and
- drawbacks in usage, extensibility, and performance. The DistCp refactor
- was undertaken to fix these shortcomings and to allow DistCp to be used
- and extended programmatically. New paradigms have been introduced to
- improve runtime and setup performance, while retaining the legacy
- behaviour as the default.
- </p>
- <p>
- This document describes the design of the new DistCp, its new
- features, their optimal use, and any deviations from the legacy
- implementation.
- </p>
- </section>
- </body>
- </document>