<?xml version="1.0" encoding="UTF-8"?>
<!--
Copyright 2002-2004 The Apache Software Foundation
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<document xmlns="http://maven.apache.org/XDOC/2.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/XDOC/2.0 http://maven.apache.org/xsd/xdoc-2.0.xsd">
<head>
<title>DistCp</title>
</head>
<body>
<section name="Overview">
<p>
DistCp (distributed copy) is a tool for large inter-cluster and
intra-cluster copying. It uses MapReduce to effect its distribution, error
handling and recovery, and reporting. It expands a list of files and
directories into input to map tasks, each of which copies a partition
of the files specified in the source list.
</p>
<p>
The legacy implementation of DistCp has its share of quirks and
drawbacks, in its usage as well as in its extensibility and
performance. The DistCp refactor was meant to fix these shortcomings and
to allow DistCp to be used and extended programmatically. New paradigms have
been introduced to improve runtime and setup performance while
retaining the legacy behaviour as the default.
</p>
<p>
This document describes the design of the new DistCp, its new
features, their optimal use, and any deviations from the legacy
implementation.
</p>
</section>
</body>
</document>