- <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
- <html><head>
- <meta http-equiv="content-type" content="text/html; charset=UTF-8">
- <title>Hadoop 0.18.3 Release Notes</title></head><body>
- <font face="sans-serif">
- <h1>Hadoop 0.18.3 Release Notes</h1>
- Hadoop 0.18.3 fixes several problems that may lead to data loss
- from the file system. Important changes were made to lease recovery and the management of
- block replicas. The bug fixes are listed below.
- <ul>
- <h2>Changes Since Hadoop 0.18.2</h2>
- <ul>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4823'>HADOOP-4823</a>] - Should not use java.util.NavigableMap in 0.18</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4824'>HADOOP-4824</a>] - Should not use File.setWritable(..) in 0.18</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-1980'>HADOOP-1980</a>] - 'dfsadmin -safemode enter' should prevent the namenode from leaving safemode automatically after startup</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-3121'>HADOOP-3121</a>] - dfs -lsr fail with "Could not get listing "</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-3883'>HADOOP-3883</a>] - TestFileCreation fails once in a while</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4061'>HADOOP-4061</a>] - Large number of decommission freezes the Namenode</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4257'>HADOOP-4257</a>] - TestLeaseRecovery2.testBlockSynchronization failing.</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4499'>HADOOP-4499</a>] - DFSClient should invoke checksumOk only once.</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4542'>HADOOP-4542</a>] - Fault in TestDistributedUpgrade</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4556'>HADOOP-4556</a>] - Block went missing</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4597'>HADOOP-4597</a>] - Under-replicated blocks are not calculated if the name-node is forced out of safe-mode.</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4610'>HADOOP-4610</a>] - Always calculate mis-replicated blocks when safe-mode is turned off.</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4613'>HADOOP-4613</a>] - browseBlock.jsp does not generate "genstamp" property.</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4614'>HADOOP-4614</a>] - "Too many open files" error while processing a large gzip file</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4616'>HADOOP-4616</a>] - assertion makes fuse-dfs exit when reading incomplete data</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4620'>HADOOP-4620</a>] - Streaming mapper never completes if the mapper does not write to stdout</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4635'>HADOOP-4635</a>] - Memory leak ?</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4643'>HADOOP-4643</a>] - NameNode should exclude excessive replicas when counting live replicas for a block</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4647'>HADOOP-4647</a>] - NamenodeFsck creates a new DFSClient but never closes it</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4654'>HADOOP-4654</a>] - remove temporary output directory of failed tasks</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4659'>HADOOP-4659</a>] - Root cause of connection failure is being lost to code that uses it for delaying startup</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4679'>HADOOP-4679</a>] - Datanode prints tons of log messages: Waiting for threadgroup to exit, active threads is XX</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4702'>HADOOP-4702</a>] - Failed block replication leaves an incomplete block in receiver's tmp data directory</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4703'>HADOOP-4703</a>] - DataNode.createInterDataNodeProtocolProxy should not wait for proxy forever while recovering lease</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4713'>HADOOP-4713</a>] - librecordio does not scale to large records</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4714'>HADOOP-4714</a>] - map tasks timing out during merge phase</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4717'>HADOOP-4717</a>] - Removal of default port# in NameNode.getUri() causes a map/reduce job to fail to promote temporary output</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4726'>HADOOP-4726</a>] - documentation typos: "the the"</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4734'>HADOOP-4734</a>] - Some lease recovery codes in 0.19 or trunk should also be committed in 0.18.</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4742'>HADOOP-4742</a>] - Mistakenly deleted replica in Hadoop 0.18.1</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4746'>HADOOP-4746</a>] - Job output directory should be normalized</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4778'>HADOOP-4778</a>] - Check for zero size block meta file when updating a block.</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4795'>HADOOP-4795</a>] - Lease monitor may get into an infinite loop</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4797'>HADOOP-4797</a>] - RPC Server can leave a lot of direct buffers </li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4806'>HADOOP-4806</a>] - HDFS rename does not work correctly if src contains Java regular expression special characters</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4810'>HADOOP-4810</a>] - Data lost at cluster startup time</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4822'>HADOOP-4822</a>] - 0.18 cannot be compiled in Java 5.</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4840'>HADOOP-4840</a>] - TestNodeCount sometimes fails with NullPointerException</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4904'>HADOOP-4904</a>] - Deadlock while leaving safe mode.</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4910'>HADOOP-4910</a>] - NameNode should exclude corrupt replicas when choosing excessive replicas to delete</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4919'>HADOOP-4919</a>] - [HOD] Provide execute access to JT history directory path for group</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4924'>HADOOP-4924</a>] - Race condition in re-init of TaskTracker</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4935'>HADOOP-4935</a>] - Manual leaving of safe mode may lead to data loss</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4951'>HADOOP-4951</a>] - Lease monitor does not own the LeaseManager lock in changing leases.</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4961'>HADOOP-4961</a>] - ConcurrentModificationException in lease recovery of empty files.</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4971'>HADOOP-4971</a>] - Block report times from datanodes could converge to same time. </li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4983'>HADOOP-4983</a>] - Job counters sometimes go down as tasks run without task failures</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4997'>HADOOP-4997</a>] - workaround for tmp file handling on DataNodes in 0.18 (HADOOP-4663)</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-5077'>HADOOP-5077</a>] - JavaDoc errors in 0.18.3</li>
- <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-3780'>HADOOP-3780</a>] - JobTracker should synchronously resolve the tasktracker's network location when the tracker registers</li>
- </ul>
- </ul>
- <h1>Hadoop 0.18.2 Release Notes</h1>
- The bug fixes are listed below.
- <ul>
- <h2>Changes Since Hadoop 0.18.1</h2>
- <ul>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-2421">HADOOP-2421</a>] - Release JDiff report of changes between different versions of Hadoop.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-3217">HADOOP-3217</a>] - [HOD] Be less aggressive when querying job status from resource manager.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-3614">HADOOP-3614</a>] - TestLeaseRecovery fails when run with assertions enabled.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-3786">HADOOP-3786</a>] - Changes in HOD documentation.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-3914">HADOOP-3914</a>] - checksumOk implementation in DFSClient can break applications.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4116">HADOOP-4116</a>] - Balancer should provide better resource management.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4228">HADOOP-4228</a>] - dfs datanode metrics, bytes_read, bytes_written overflows due to incorrect type used.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4271">HADOOP-4271</a>] - Bug in FSInputChecker makes it possible to read from an invalid buffer.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4277">HADOOP-4277</a>] - Checksum verification is disabled for LocalFS.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4292">HADOOP-4292</a>] - append() does not work for LocalFileSystem.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4314">HADOOP-4314</a>] - TestReplication fails quite often.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4326">HADOOP-4326</a>] - ChecksumFileSystem does not override all create(...) methods.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4340">HADOOP-4340</a>] - "hadoop jar" always returns exit code 0 (success) to the shell when jar throws a fatal exception.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4351">HADOOP-4351</a>] - ArrayIndexOutOfBoundsException during fsck.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4369">HADOOP-4369</a>] - Metric Averages are not averages.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4395">HADOOP-4395</a>] - Reloading FSImage and FSEditLog may erase user and group information.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4399">HADOOP-4399</a>] - fuse-dfs per FD context is not thread safe and can cause segfaults and corruptions.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4403">HADOOP-4403</a>] - TestLeaseRecovery.testBlockSynchronization failed on trunk.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4407">HADOOP-4407</a>] - HADOOP-4395 should use a Java 1.5 API for 0.18.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4467">HADOOP-4467</a>] - SerializationFactory should use current context ClassLoader.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4469">HADOOP-4469</a>] - ant jar file not being included in tar distribution.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4483">HADOOP-4483</a>] - getBlockArray in DatanodeDescriptor does not honor passed in maxblocks value.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4517">HADOOP-4517</a>] - unstable dfs when running jobs on 0.18.1.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4526">HADOOP-4526</a>] - fsck failing with NullPointerException (return value 0).</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4533">HADOOP-4533</a>] - HDFS client of hadoop 0.18.1 and HDFS server 0.18.2 (0.18 branch) not compatible.</li>
- </ul>
- </ul>
- <h1>Hadoop 0.18.1 Release Notes</h1>
- The bug fixes are listed below.
- <ul>
- <h2>Changes Since Hadoop 0.18.0</h2>
- <ul>
- <li><a name="changes">[</a><a href="https://issues.apache.org/jira/browse/HADOOP-4040">HADOOP-4040</a>] - Remove the hardcoded ipc.client.connection.maxidletime setting from the TaskTracker.Child.main().</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-3934">HADOOP-3934</a>] - Update log4j from 1.2.13 to 1.2.15.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-3995">HADOOP-3995</a>] - renameTo(src, dst) does not restore src name in case of quota failur.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4046">HADOOP-4046</a>] - WritableComparator's constructor should be protected instead of private.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-3821">HADOOP-3821</a>]
- - SequenceFile's Reader.decompressorPool or Writer.decompressorPool
- gets into an inconsistent state when calling close() more than once.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-3940">HADOOP-3940</a>] - Reduce often attempts in memory merge with no work.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4161">HADOOP-4161</a>] - [HOD] Uncaught exceptions can potentially hang hod-client.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4060">HADOOP-4060</a>] - [HOD] Make HOD to roll log files on the client.</li>
- <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4145">HADOOP-4145</a>] - [HOD] Support an accounting plugin script for HOD.</li>
- </ul>
- </ul>
- <h1>Hadoop 0.18.0 Release Notes</h1>
- These release notes include new developer- and user-facing incompatibilities, features, and major improvements.
- The table below is sorted by Component.
- <ul>
- <h2>Changes Since Hadoop 0.17.2</h2>
- <ul>
- <table 100%="" border="1" cellpadding="4">
- <tbody><tr>
- <td><b>Issue</b></td>
- <td><b>Component</b></td>
- <td><b>Notes</b></td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3355">HADOOP-3355</a></td>
- <td>conf</td>
- <td>Added support for hexadecimal values in
- Configuration (see the sketch in the appendix below).</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-1702">HADOOP-1702</a></td>
- <td>dfs</td>
- <td>Reduced buffer copies as data is written to HDFS.
- The order of sending data bytes and control information has changed, but this
- will not be observed by client applications.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-2065">HADOOP-2065</a></td>
- <td>dfs</td>
- <td>Added "corrupt" flag to LocatedBlock to
- indicate that all replicas of the block thought to be corrupt.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-2585">HADOOP-2585</a></td>
- <td>dfs</td>
- <td>Improved management of replicas of the name space
- image. If all replicas on the Name Node are lost, the latest checkpoint can
- be loaded from the secondary Name Node. Use the parameter
- "-importCheckpoint" and specify the location with "fs.checkpoint.dir".
- The directory structure on the secondary Name Node has changed to match the
- primary Name Node.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-2656">HADOOP-2656</a></td>
- <td>dfs</td>
- <td>Associated a generation stamp with each block. On
- data nodes, the generation stamp is stored as part of the file name of the
- block's meta-data file.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-2703">HADOOP-2703</a></td>
- <td>dfs</td>
- <td>Changed fsck to ignore files opened for writing.
- Introduced new option "-openforwrite" to explicitly show open
- files.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-2797">HADOOP-2797</a></td>
- <td>dfs</td>
- <td>Withdrew the upgrade-to-CRC facility. HDFS will no
- longer support upgrades from versions without CRCs for block data. Users
- upgrading from version 0.13 or earlier must first upgrade to an intermediate
- version (0.14, 0.15, 0.16, or 0.17) before upgrading to version 0.18 or
- later.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-2865">HADOOP-2865</a></td>
- <td>dfs</td>
- <td>Changed the output of the "fs -ls" command
- to more closely match the familiar Linux format. Additional changes were made by
- HADOOP-3459. Applications that parse the command output should be reviewed.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3035">HADOOP-3035</a></td>
- <td>dfs</td>
- <td>Changed the protocol for transferring blocks between
- data nodes so that corrupt blocks detected during transfer are reported
- for re-replication from a good replica.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3113">HADOOP-3113</a></td>
- <td>dfs</td>
- <td>Added a sync() method to FSDataOutputStream to reliably
- persist data in HDFS. Added InterDatanodeProtocol to implement this feature.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3164">HADOOP-3164</a></td>
- <td>dfs</td>
- <td>Changed the data node to use FileChannel.transferTo() to
- transfer block data. <br>
- </td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3177">HADOOP-3177</a></td>
- <td>dfs</td>
- <td>Added a new public interface Syncable which declares
- the sync() operation. FSDataOutputStream implements Syncable. If the
- wrappedStream in FSDataOutputStream is Syncable, calling
- FSDataOutputStream.sync() is equivalent to calling wrappedStream.sync(). Otherwise,
- FSDataOutputStream.sync() is a no-op. Both DistributedFileSystem and
- LocalFileSystem support the sync() operation (see the sketch in the appendix below).</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3187">HADOOP-3187</a></td>
- <td>dfs</td>
- <td>Introduced directory quotas as hard limits on the
- number of names in the tree rooted at that directory. An administrator may
- set quotas on individual directories explicitly. Newly created directories
- have no associated quota. File and directory creation fails if the quota would
- be exceeded. An attempt to set a quota fails if the directory would already be in
- violation of the new quota.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3193">HADOOP-3193</a></td>
- <td>dfs</td>
- <td>Added reporter to FSNamesystem stateChangeLog, and a
- new metric to track the number of corrupted replicas.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3232">HADOOP-3232</a></td>
- <td>dfs</td>
- <td>Changed the 'du' command to run in a separate thread so
- that it does not block the user.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3310">HADOOP-3310</a></td>
- <td>dfs</td>
- <td>Implemented Lease Recovery to sync the last block of
- a file. Added ClientDatanodeProtocol for clients to trigger block recovery.
- Changed DatanodeProtocol to support block synchronization. Changed
- InterDatanodeProtocol to support block update.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3317">HADOOP-3317</a></td>
- <td>dfs</td>
- <td>Changed the default port for "hdfs:" URIs
- to 8020, so that one may simply use URIs of the form
- "hdfs://example.com/dir/file" (see the sketch in the appendix below).</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3329">HADOOP-3329</a></td>
- <td>dfs</td>
- <td>Changed format of file system image to not store
- locations of last block.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3336">HADOOP-3336</a></td>
- <td>dfs</td>
- <td>Added a log4j appender that emits events from
- FSNamesystem for audit logging.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3339">HADOOP-3339</a></td>
- <td>dfs</td>
- <td>Improved failure handling of last Data Node in write
- pipeline. <br>
- </td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3390">HADOOP-3390</a></td>
- <td>dfs</td>
- <td>Removed deprecated
- ClientProtocol.abandonFileInProgress().</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3452">HADOOP-3452</a></td>
- <td>dfs</td>
- <td>Changed the exit status of fsck to report whether the
- file system is healthy or corrupt.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3459">HADOOP-3459</a></td>
- <td>dfs</td>
- <td>Changed the output of the "fs -ls" command
- to more closely match the familiar Linux format. Applications that parse the
- command output should be reviewed.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3486">HADOOP-3486</a></td>
- <td>dfs</td>
- <td>Changed the default value of
- dfs.blockreport.initialDelay to be 0 seconds.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3677">HADOOP-3677</a></td>
- <td>dfs</td>
- <td>Simplified the generation stamp upgrade by making it a
- local upgrade on data nodes. Deleted the distributed upgrade.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-2188">HADOOP-2188</a></td>
- <td>dfs <br>
- ipc</td>
- <td>Replaced timeouts with pings to check that the client
- connection is alive. Removed the property ipc.client.timeout from the default
- Hadoop configuration. Removed the metric RpcOpsDiscardedOPsNum. <br>
- </td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3283">HADOOP-3283</a></td>
- <td>dfs <br>
- ipc</td>
- <td>Added an IPC server in DataNode and a new IPC
- protocol InterDatanodeProtocol. Added conf properties
- dfs.datanode.ipc.address and dfs.datanode.handler.count with defaults
- "0.0.0.0:50020" and 3, respectively. <br>
- Changed the serialization in DatanodeRegistration
- and DatanodeInfo, and therefore, updated the versionID in ClientProtocol,
- DatanodeProtocol, NamenodeProtocol.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3058">HADOOP-3058</a></td>
- <td>dfs <br>
- metrics</td>
- <td>Added FSNamesystem status metrics.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3683">HADOOP-3683</a></td>
- <td>dfs <br>
- metrics</td>
- <td>Changed FileListed to getNumGetListingOps and added
- CreateFileOps, DeleteFileOps and AddBlockOps metrics.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3265">HADOOP-3265</a></td>
- <td>fs</td>
- <td>Removed the deprecated API getFileCacheHints.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3307">HADOOP-3307</a></td>
- <td>fs</td>
- <td>Introduced archive feature to Hadoop. A Map/Reduce
- job can be run to create an archive with indexes. A FileSystem abstraction is
- provided over the archive.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-930">HADOOP-930</a></td>
- <td>fs</td>
- <td>Added support for reading and writing native S3
- files. Native S3 files are referenced using s3n URIs. See
- http://wiki.apache.org/hadoop/AmazonS3 for more details.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3095">HADOOP-3095</a></td>
- <td>fs <br>
- fs/s3</td>
- <td>Added overloaded method
- getFileBlockLocations(FileStatus, long, long). This is an incompatible change
- for FileSystem implementations which override getFileBlockLocations(Path,
- long, long); they should change the signature of this method to
- getFileBlockLocations(FileStatus, long, long) to work correctly (see the
- sketch in the appendix below).</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-4">HADOOP-4</a></td>
- <td>fuse-dfs</td>
- <td>Introduced a FUSE module for HDFS. The module allows HDFS to be
- mounted as a Unix filesystem, and optionally the export of that mount point
- to other machines. Writes are disabled. rmdir, mv, mkdir, and rm are supported,
- but not cp, touch, and the like. Usage information is attached to the Jira
- record.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3184">HADOOP-3184</a></td>
- <td>hod</td>
- <td>Modified HOD to handle master (NameNode or
- JobTracker) failures on bad nodes by trying to bring them up on another node
- in the ring. Introduced new property ringmaster.max-master-failures to
- specify the maximum number of times a master is allowed to fail.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3266">HADOOP-3266</a></td>
- <td>hod</td>
- <td>Moved HOD change items from CHANGES.txt to a new
- file src/contrib/hod/CHANGES.txt.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3376">HADOOP-3376</a></td>
- <td>hod</td>
- <td>Modified HOD client to look for specific messages
- related to resource limit overruns and take appropriate actions - such as
- either failing to allocate the cluster, or issuing a warning to the user. A
- tool is provided, specific to Maui and Torque, that will set these specific
- messages.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3464">HADOOP-3464</a></td>
- <td>hod</td>
- <td>Implemented a mechanism to transfer HOD errors that
- occur on compute nodes to the submit node running the HOD client, so users
- have good feedback on why an allocation failed.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3483">HADOOP-3483</a></td>
- <td>hod</td>
- <td>Modified HOD to create a cluster directory if one
- does not exist and to auto-deallocate a cluster while reallocating it, if it
- is already dead.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3564">HADOOP-3564</a></td>
- <td>hod</td>
- <td>Modified HOD to generate the dfs.datanode.ipc.address
- parameter in the hadoop-site.xml of datanodes that it launches.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3610">HADOOP-3610</a></td>
- <td>hod</td>
- <td>Modified HOD to automatically create a cluster
- directory if the one specified with the script command does not exist.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3703">HADOOP-3703</a></td>
- <td>hod</td>
- <td>Modified logcondense.py to use the new format of
- hadoop dfs -lsr output. This version of logcondense would not work with
- previous versions of Hadoop and hence is incompatible.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3061">HADOOP-3061</a></td>
- <td>io</td>
- <td>Introduced ByteWritable and DoubleWritable
- (implementing WritableComparable) for Byte and Double values (see the
- sketch in the appendix below).</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3299">HADOOP-3299</a></td>
- <td>io <br>
- mapred</td>
- <td>Changed the TextInputFormat and KeyValueTextInput
- classes to initialize the compressionCodecs member variable before
- dereferencing it.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-2909">HADOOP-2909</a></td>
- <td>ipc</td>
- <td>Removed property ipc.client.maxidletime from the
- default configuration. The allowed idle time is twice
- ipc.client.connection.maxidletime. <br>
- </td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3569">HADOOP-3569</a></td>
- <td>KFS</td>
- <td>Fixed KFS to have read() read and return 1 byte
- instead of 4.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-1915">HADOOP-1915</a></td>
- <td>mapred</td>
- <td>Provided a new method to update counters,
- "incrCounter(String group, String counter, long amount)" (see the
- sketch in the appendix below).</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-2019">HADOOP-2019</a></td>
- <td>mapred</td>
- <td>Added support for .tar, .tgz and .tar.gz files in
- DistributedCache. File sizes are limited to 2GB.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-2095">HADOOP-2095</a></td>
- <td>mapred</td>
- <td>Reduced in-memory copies of keys and values as they
- flow through the Map-Reduce framework. Changed the storage of intermediate
- map outputs to use new IFile instead of SequenceFile for better compression.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-2132">HADOOP-2132</a></td>
- <td>mapred</td>
- <td>Changed "job -kill" to only allow a job
- that is in the RUNNING or PREP state to be killed.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-2181">HADOOP-2181</a></td>
- <td>mapred</td>
- <td>Added logging for input splits in job tracker log
- and job history log. Added web UI for viewing input splits in the job UI and
- history UI.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-236">HADOOP-236</a></td>
- <td>mapred</td>
- <td>Changed the connection protocol between the job tracker and task
- tracker so that a task tracker will not connect to a job tracker with a
- different build version.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-2427">HADOOP-2427</a></td>
- <td>mapred</td>
- <td>The current working directory of a task, i.e.
- ${mapred.local.dir}/taskTracker/jobcache/<jobid>/<task_dir>/work
- is cleaned up as soon as the task finishes.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-2867">HADOOP-2867</a></td>
- <td>mapred</td>
- <td>Added task's cwd to its LD_LIBRARY_PATH.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3135">HADOOP-3135</a></td>
- <td>mapred</td>
- <td>Changed job submission protocol to not allow
- submission if the client's value of mapred.system.dir does not match the job
- tracker's. Deprecated JobConf.getSystemDir(); use JobClient.getSystemDir().</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3221">HADOOP-3221</a></td>
- <td>mapred</td>
- <td>Added org.apache.hadoop.mapred.lib.NLineInputFormat,
- which splits N lines of input as one split. N can be specified by the
- configuration property "mapred.line.input.format.linespermap",
- which defaults to 1 (see the sketch in the appendix below).</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3226">HADOOP-3226</a></td>
- <td>mapred</td>
- <td>Changed the policy for running the combiner. The combiner
- may be run multiple times as the map's output is sorted and merged.
- Additionally, it may be run on the reduce side as data is merged. The old
- semantics are available in Hadoop 0.18 if the user calls: <br>
- job.setCombineOnlyOnce(true); <br>
- </td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3326">HADOOP-3326</a></td>
- <td>mapred</td>
- <td>Changed fetchOutputs() so that LocalFSMerger and
- InMemFSMergeThread threads are spawned only once. The thread gets notified
- when something is ready for merge. The merge happens when thresholds are met.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3366">HADOOP-3366</a></td>
- <td>mapred</td>
- <td>Improved the shuffle so that all fetched map outputs are
- kept in memory before being merged: the shuffle stalls while the
- in-memory merge executes and frees up memory for the shuffle.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3405">HADOOP-3405</a></td>
- <td>mapred</td>
- <td>Refactored previously public classes MapTaskStatus,
- ReduceTaskStatus, JobSubmissionProtocol, CompletedJobStatusStore to be
- package local.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3417">HADOOP-3417</a></td>
- <td>mapred</td>
- <td>Removed the public class
- org.apache.hadoop.mapred.JobShell. <br>
- The command line options -libjars, -files and -archives moved to
- GenericCommands; applications must therefore implement
- org.apache.hadoop.util.Tool to use those options (see the sketch in the
- appendix below).</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3427">HADOOP-3427</a></td>
- <td>mapred</td>
- <td>Changed shuffle scheduler policy to wait for
- notifications from shuffle threads before scheduling more.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3460">HADOOP-3460</a></td>
- <td>mapred</td>
- <td>Created SequenceFileAsBinaryOutputFormat to write
- raw bytes as keys and values to a SequenceFile.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3512">HADOOP-3512</a></td>
- <td>mapred</td>
- <td>Separated Distcp, Logalyzer and Archiver into a
- tools jar.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3565">HADOOP-3565</a></td>
- <td>mapred</td>
- <td>Changed the Java serialization framework, which is
- not enabled by default, so that deserialized objects are independent of
- previously deserialized objects.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3598">HADOOP-3598</a></td>
- <td>mapred</td>
- <td>Changed the Map-Reduce framework to no longer create
- the temporary task output directory ${mapred.out.dir}/_temporary/_${taskid}
- for staging outputs when staging is unnecessary.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-544">HADOOP-544</a></td>
- <td>mapred</td>
- <td>Introduced new classes JobID, TaskID and
- TaskAttemptID, which should be used instead of their string counterparts.
- Deprecated functions in JobClient, TaskReport, RunningJob, jobcontrol.Job and
- TaskCompletionEvent that use string arguments. Applications can use
- xxxID.toString() and xxxID.forName() methods to convert/restore objects
- to/from strings (see the sketch in the appendix below). <br>
- </td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3230">HADOOP-3230</a></td>
- <td>scripts</td>
- <td>Added command line tool "job -counter
- <job-id> <group-name> <counter-name>" to access
- counters.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-1328">HADOOP-1328</a></td>
- <td>streaming</td>
- <td>Introduced a way for a streaming process to update
- global counters and status by emitting information on the stderr stream. Use
- "reporter:counter:<group>,<counter>,<amount>" to
- update a counter. Use "reporter:status:<message>" to update
- status (see the sketch in the appendix below).</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3429">HADOOP-3429</a></td>
- <td>streaming</td>
- <td>Increased the size of the buffer used in the
- communication between the Java task and the Streaming process to 128KB.</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3379">HADOOP-3379</a></td>
- <td>streaming <br>
- documentation</td>
- <td>Set default value for configuration property
- "stream.non.zero.exit.status.is.failure" to be "true".</td>
- </tr>
- <tr>
- <td><a href="https://issues.apache.org/jira/browse/HADOOP-3246">HADOOP-3246</a></td>
- <td>util</td>
- <td>Introduced an FTPFileSystem backed by Apache Commons
- FTPClient to directly store data into HDFS.</td>
- </tr>
- </tbody></table>
- </ul>
- </ul>
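- <h2>Appendix: API usage sketches</h2>
- The short sketches below illustrate some of the interfaces described in the
- table above. They are editorial illustrations, not part of the release:
- class names, property keys, paths and sample IDs introduced here are
- hypothetical unless a Jira issue above names them.
- <p>
- A minimal sketch of the hexadecimal Configuration values added by
- HADOOP-3355, assuming getInt() accepts the "0x" prefix as the issue
- describes; the key "demo.bit.mask" is made up.
- <pre>
- import org.apache.hadoop.conf.Configuration;
- public class HexConfDemo {
-   public static void main(String[] args) {
-     Configuration conf = new Configuration();
-     conf.set("demo.bit.mask", "0x1A");           // hypothetical key, hex value
-     int mask = conf.getInt("demo.bit.mask", 0);  // expected to parse as 26
-     System.out.println("mask = " + mask);
-   }
- }
- </pre>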
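- <p>
- A minimal sketch of the sync() support from HADOOP-3113 and HADOOP-3177,
- assuming a configured FileSystem; the path "/tmp/sync-demo.txt" is
- hypothetical.
- <pre>
- import org.apache.hadoop.conf.Configuration;
- import org.apache.hadoop.fs.FSDataOutputStream;
- import org.apache.hadoop.fs.FileSystem;
- import org.apache.hadoop.fs.Path;
- public class SyncDemo {
-   public static void main(String[] args) throws Exception {
-     FileSystem fs = FileSystem.get(new Configuration());
-     FSDataOutputStream out = fs.create(new Path("/tmp/sync-demo.txt"));
-     out.writeBytes("a record that should be persisted\n");
-     out.sync();   // a no-op unless the wrapped stream is Syncable
-     out.close();
-   }
- }
- </pre>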
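- <p>
- A sketch of the default-port behavior from HADOOP-3317: a portless
- "hdfs:" URI now implies port 8020. "example.com" is a placeholder host.
- <pre>
- import java.net.URI;
- import org.apache.hadoop.conf.Configuration;
- import org.apache.hadoop.fs.FileSystem;
- public class DefaultPortDemo {
-   public static void main(String[] args) throws Exception {
-     // No port in the URI: 8020 is assumed.
-     FileSystem fs = FileSystem.get(URI.create("hdfs://example.com/"),
-                                    new Configuration());
-     System.out.println(fs.getUri());
-   }
- }
- </pre>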
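- <p>
- A sketch of the getFileBlockLocations(FileStatus, long, long) overload from
- HADOOP-3095, reusing the hypothetical path from the sync sketch.
- <pre>
- import org.apache.hadoop.conf.Configuration;
- import org.apache.hadoop.fs.BlockLocation;
- import org.apache.hadoop.fs.FileStatus;
- import org.apache.hadoop.fs.FileSystem;
- import org.apache.hadoop.fs.Path;
- public class BlockLocationsDemo {
-   public static void main(String[] args) throws Exception {
-     FileSystem fs = FileSystem.get(new Configuration());
-     FileStatus status = fs.getFileStatus(new Path("/tmp/sync-demo.txt"));
-     // The new overload takes a FileStatus instead of a Path.
-     BlockLocation[] blocks =
-         fs.getFileBlockLocations(status, 0, status.getLen());
-     for (int i = 0; i < blocks.length; i++) {
-       System.out.println("block " + i + " on "
-           + blocks[i].getHosts().length + " host(s)");
-     }
-   }
- }
- </pre>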
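- <p>
- A minimal sketch of the ByteWritable and DoubleWritable types from
- HADOOP-3061.
- <pre>
- import org.apache.hadoop.io.ByteWritable;
- import org.apache.hadoop.io.DoubleWritable;
- public class WritableDemo {
-   public static void main(String[] args) {
-     // Both implement WritableComparable, so they can serve as keys.
-     DoubleWritable d = new DoubleWritable(3.5);
-     ByteWritable b = new ByteWritable((byte) 7);
-     System.out.println(d.get() + " " + b.get());
-   }
- }
- </pre>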
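- <p>
- A sketch of the string-based incrCounter() from HADOOP-1915 inside a
- mapper; the group and counter names "Demo" and "InputLines" are made up.
- <pre>
- import java.io.IOException;
- import org.apache.hadoop.io.LongWritable;
- import org.apache.hadoop.io.Text;
- import org.apache.hadoop.mapred.MapReduceBase;
- import org.apache.hadoop.mapred.Mapper;
- import org.apache.hadoop.mapred.OutputCollector;
- import org.apache.hadoop.mapred.Reporter;
- public class CountingMapper extends MapReduceBase
-     implements Mapper<LongWritable, Text, Text, LongWritable> {
-   public void map(LongWritable key, Text value,
-       OutputCollector<Text, LongWritable> output, Reporter reporter)
-       throws IOException {
-     // String-based counter update introduced by HADOOP-1915.
-     reporter.incrCounter("Demo", "InputLines", 1);
-     output.collect(value, new LongWritable(1));
-   }
- }
- </pre>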
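- <p>
- A sketch of configuring NLineInputFormat from HADOOP-3221 on a JobConf.
- <pre>
- import org.apache.hadoop.mapred.JobConf;
- import org.apache.hadoop.mapred.lib.NLineInputFormat;
- public class NLineSetup {
-   public static void configure(JobConf conf) {
-     conf.setInputFormat(NLineInputFormat.class);
-     // Give each split (and therefore each map task) 10 input lines.
-     conf.setInt("mapred.line.input.format.linespermap", 10);
-   }
- }
- </pre>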
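- <p>
- A skeleton of the Tool pattern required by HADOOP-3417 for applications
- that want -libjars, -files and -archives; "MyJob" is a made-up class.
- <pre>
- import org.apache.hadoop.conf.Configured;
- import org.apache.hadoop.util.Tool;
- import org.apache.hadoop.util.ToolRunner;
- public class MyJob extends Configured implements Tool {
-   public int run(String[] args) throws Exception {
-     // -libjars, -files and -archives are consumed by the generic
-     // options parser before run() is invoked.
-     System.out.println("remaining args: " + args.length);
-     return 0;
-   }
-   public static void main(String[] args) throws Exception {
-     System.exit(ToolRunner.run(new MyJob(), args));
-   }
- }
- </pre>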
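- <p>
- A sketch of the ID classes from HADOOP-544, round-tripped through their
- string forms with forName() and toString(); the IDs are samples, not real
- jobs.
- <pre>
- import org.apache.hadoop.mapred.JobID;
- import org.apache.hadoop.mapred.TaskAttemptID;
- public class IdDemo {
-   public static void main(String[] args) {
-     JobID job = JobID.forName("job_200807180029_0001");
-     System.out.println(job.toString());
-     TaskAttemptID a =
-         TaskAttemptID.forName("attempt_200807180029_0001_m_000000_0");
-     System.out.println(a.toString());
-   }
- }
- </pre>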
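- <p>
- A sketch of the streaming counter and status updates from HADOOP-1328. A
- streaming mapper may be any executable; this identity mapper in Java writes
- the reporter lines to stderr. The counter names are made up.
- <pre>
- import java.io.BufferedReader;
- import java.io.InputStreamReader;
- public class StreamingCounterMapper {
-   public static void main(String[] args) throws Exception {
-     BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
-     String line;
-     while ((line = in.readLine()) != null) {
-       System.out.println(line);   // identity map: pass records through
-       System.err.println("reporter:counter:Demo,InputLines,1");
-     }
-     System.err.println("reporter:status:done reading input");
-   }
- }
- </pre>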
- </font>
- </body></html>