Hadoop Common 0.21.0 Release Notes
These release notes include new developer and user-facing incompatibilities, features, and major improvements.
Changes Since Hadoop 0.20.2
Sub-task
- [HADOOP-4490] - Map and Reduce tasks should run as the user who submitted the job
- [HADOOP-4930] - Implement setuid executable for Linux to assist in launching tasks as job owners
- [HADOOP-4940] - Remove delete(Path f)
- [HADOOP-4941] - Remove getBlockSize(Path f), getLength(Path f) and getReplication(Path src)
- [HADOOP-4942] - Remove getName() and getNamed(String name, Configuration conf)
- [HADOOP-5037] - Deprecate FSNamesystem.getFSNamesystem() and change fsNamesystemObject to private
- [HADOOP-5045] - FileSystem.isDirectory() should not be deprecated.
- [HADOOP-5073] - Hadoop 1.0 Interface Classification - scope (visibility - public/private) and stability
- [HADOOP-5097] - Remove static variable JspHelper.fsn
- [HADOOP-5120] - UpgradeManagerNamenode and UpgradeObjectNamenode should not use FSNamesystem.getFSNamesystem()
- [HADOOP-5217] - Split the AllTestDriver for core, hdfs and mapred
- [HADOOP-5792] - to resolve jsp-2.1 jars through IVY
- [HADOOP-6170] - add Avro-based RPC serialization
- [HADOOP-6223] - New improved FileSystem interface for those implementing new file systems.
- [HADOOP-6230] - Move process tree, and memory calculator classes out of Common into Map/Reduce.
- [HADOOP-6409] - TestHDFSCLI has to check if it's running any testcases at all
- [HADOOP-6410] - Rename TestCLI class to prevent JUnit from trying to run this class as a test
- [HADOOP-6422] - permit RPC protocols to be implemented by Avro
- [HADOOP-6486] - fix common classes to work with Avro 1.3 reflection
- [HADOOP-6538] - Set hadoop.security.authentication to "simple" by default
- [HADOOP-6568] - Authorization for default servlets
- [HADOOP-6658] - Exclude Private elements from generated Javadoc
- [HADOOP-6668] - Apply audience and stability annotations to classes in common
- [HADOOP-6692] - Add FileContext#listStatus that returns an iterator
- [HADOOP-6752] - Remote cluster control functionality needs JavaDocs improvement
- [HADOOP-6771] - Herriot's artifact id for Maven deployment should be set to hadoop-core-instrumented
Bug
- [HADOOP-2337] - Trash never closes FileSystem
- [HADOOP-2366] - Space in the value for dfs.data.dir can cause great problems
- [HADOOP-2413] - Is FSNamesystem.fsNamesystemObject unique?
- [HADOOP-2827] - Remove deprecated NetUtils.getServerAddress
- [HADOOP-3205] - Read multiple chunks directly from FSInputChecker subclass into user buffers
- [HADOOP-3327] - Shuffling fetchers waited too long between map output fetch re-tries
- [HADOOP-3426] - Datanode does not start up if the local machine's DNS isn't working right and dfs.datanode.dns.interface==default
- [HADOOP-4041] - IsolationRunner does not work as documented
- [HADOOP-4045] - Increment checkpoint if we see failures in rollEdits
- [HADOOP-4220] - Job Restart tests take 10 minutes, can time out very easily
- [HADOOP-4584] - Slow generation of blockReport at DataNode causes delay of sending heartbeat to NameNode
- [HADOOP-4648] - Remove ChecksumDistributedFileSystem and InMemoryFileSystem
- [HADOOP-4655] - FileSystem.CACHE should be ref-counted
- [HADOOP-4779] - Remove deprecated FileSystem methods
- [HADOOP-4864] - -libjars with multiple jars broken when client and cluster reside on different OSs
- [HADOOP-4933] - ConcurrentModificationException in JobHistory.java
- [HADOOP-4948] - ant test-patch does not work
- [HADOOP-4959] - System metrics does not output correctly for Redhat 5.1.
- [HADOOP-4960] - Hadoop metrics are showing in irregular intervals
- [HADOOP-4975] - CompositeRecordReader: ClassLoader set in JobConf is not passed onto WrappedRecordReaders
- [HADOOP-4985] - IOException is abused in FSDirectory
- [HADOOP-5017] - NameNode.namesystem should be private
- [HADOOP-5022] - [HOD] logcondense should delete all hod logs for a user, including jobtracker logs
- [HADOOP-5031] - metrics aggregation is incorrect in database
- [HADOOP-5032] - CHUKWA_CONF_DIR environment variable needs to be exported to shell script
- [HADOOP-5039] - Hourly&daily rolling are not using the right path
- [HADOOP-5050] - TestDFSShell fails intermittently
- [HADOOP-5070] - Update the year for the copyright to 2009
- [HADOOP-5072] - testSequenceFileGzipCodec won't pass without native gzip codec
- [HADOOP-5078] - Broken AMI/AKI for ec2 on hadoop
- [HADOOP-5095] - chukwa watchdog does not monitor the system correctly
- [HADOOP-5100] - Chukwa Log4JMetricsContext class should append new log to current log file
- [HADOOP-5103] - Too many logs saying "Adding new node" on JobClient console
- [HADOOP-5113] - logcondense should delete hod logs for a user, whose username has any of the characters in the value passed to "-l" options
- [HADOOP-5138] - Current Chukwa Trunk failed contrib unit tests.
- [HADOOP-5148] - make watchdog disable-able
- [HADOOP-5149] - HistoryViewer throws IndexOutOfBoundsException when there are files or directories not conforming to log file name convention
- [HADOOP-5172] - Chukwa : TestAgentConfig.testInitAdaptors_vs_Checkpoint regularly fails
- [HADOOP-5191] - After creation and startup of the hadoop namenode on AIX or Solaris, you will only be allowed to connect to the namenode via hostname but not IP.
- [HADOOP-5194] - DiskErrorException in TaskTracker when running a job
- [HADOOP-5198] - NPE in Shell.runCommand()
- [HADOOP-5200] - NPE when the namenode comes up but the filesystem is set to file://
- [HADOOP-5203] - TT's version build is too restrictive
- [HADOOP-5204] - hudson trunk build failure due to autoheader failure in create-c++-configure-libhdfs task
- [HADOOP-5206] - All "unprotected*" methods of FSDirectory should synchronize on the root.
- [HADOOP-5209] - Update year to 2009 for javadoc
- [HADOOP-5212] - cygwin path translation not happening correctly after Hadoop-4868
- [HADOOP-5213] - BZip2CompressionOutputStream NullPointerException
- [HADOOP-5218] - libhdfs unit test failed because it was unable to start namenode/datanode
- [HADOOP-5219] - SequenceFile is using mapred property
- [HADOOP-5226] - Add license headers to html and jsp files
- [HADOOP-5229] - duplicate variables in build.xml (hadoop.version vs version) make the build fail at assert-hadoop-jar-exists
- [HADOOP-5251] - TestHdfsProxy and TestProxyUgiManager frequently fail
- [HADOOP-5252] - Streaming overrides -inputformat option
- [HADOOP-5253] - to remove duplicate calls to the cn-docs target.
- [HADOOP-5273] - License header missing in TestJobInProgress.java
- [HADOOP-5276] - Upon a lost tracker, the task's start time is reset to 0
- [HADOOP-5278] - Finish time of a TIP is incorrectly logged to the jobhistory upon jobtracker restart
- [HADOOP-5300] - "ant javadoc-dev" does not work
- [HADOOP-5314] - needToSave incorrectly calculated in loadFSImage()
- [HADOOP-5322] - comments in JobInProgress related to TaskCommitThread are not valid
- [HADOOP-5341] - hadoop-daemon isn't compatible after HADOOP-4868
- [HADOOP-5347] - bbp example cannot be run.
- [HADOOP-5386] - To Probe free ports dynamically for Unit test to replace fixed ports
- [HADOOP-5406] - Misnamed function in ZlibCompressor.c
- [HADOOP-5420] - Support killing of process groups in LinuxTaskController binary
- [HADOOP-5442] - The job history display needs to be paged
- [HADOOP-5456] - javadoc warning: can't find restoreFailedStorage() in ClientProtocol
- [HADOOP-5458] - Remove Chukwa from .gitignore
- [HADOOP-5462] - Glibc double free exception thrown when chown syscall fails.
- [HADOOP-5464] - DFSClient does not treat write timeout of 0 properly
- [HADOOP-5472] - Distcp does not support globbing of input paths
- [HADOOP-5476] - calling new SequenceFile.Reader(...) leaves an InputStream open, if the given sequence file is broken
- [HADOOP-5477] - TestCLI fails
- [HADOOP-5486] - ReliabilityTest does not test lostTrackers, sometimes.
- [HADOOP-5488] - HADOOP-2721 doesn't clean up descendant processes of a jvm that exits cleanly after running a task successfully
- [HADOOP-5489] - hadoop-env.sh still refers to java1.5
- [HADOOP-5491] - Better control memory usage in contrib/index
- [HADOOP-5507] - javadoc warning in JMXGet
- [HADOOP-5511] - Add Apache License to EditLogBackupOutputStream
- [HADOOP-5556] - A few improvements to DataNodeCluster
- [HADOOP-5561] - Javadoc-dev ant target runs out of heap space
- [HADOOP-5581] - libhdfs does not get FileNotFoundException
- [HADOOP-5582] - Hadoop Vaidya throws number format exception due to changes in the job history counters string format (escaped compact representation).
- [HADOOP-5592] - Hadoop Streaming - GzipCodec
- [HADOOP-5604] - TestBinaryPartitioner javac warnings.
- [HADOOP-5635] - distributed cache doesn't work with other distributed file systems
- [HADOOP-5650] - Namenode log that indicates why it is not leaving safemode may be confusing
- [HADOOP-5652] - Reduce does not respect in-memory segment memory limit when number of on disk segments == io.sort.factor
- [HADOOP-5656] - Counter for S3N Read Bytes does not work
- [HADOOP-5658] - Eclipse templates fail out of the box; need updating
- [HADOOP-5661] - Resolve findbugs warnings in mapred
- [HADOOP-5679] - Resolve findbugs warnings in core/streaming/pipes/examples
- [HADOOP-5704] - Scheduler test code does not compile
- [HADOOP-5709] - Remove the additional synchronization in MapTask.MapOutputBuffer.Buffer.write
- [HADOOP-5710] - Counter MAP_INPUT_BYTES missing from new mapreduce api.
- [HADOOP-5715] - Should conf/mapred-queue-acls.xml be added to the ignore list?
- [HADOOP-5734] - HDFS architecture documentation describes outdated placement policy
- [HADOOP-5737] - UGI checks in testcases are broken
- [HADOOP-5738] - Split waiting tasks field in JobTracker metrics to individual tasks
- [HADOOP-5762] - distcp does not copy empty directories
- [HADOOP-5764] - Hadoop Vaidya test rule (ReadingHDFSFilesAsSideEffect) fails w/ exception if number of map input bytes for a job is zero.
- [HADOOP-5775] - HdfsProxy Unit Test should not depend on HDFSPROXY_CONF_DIR environment
- [HADOOP-5780] - Fix slightly confusing log from "-metaSave" on NameNode
- [HADOOP-5782] - Make formatting of BlockManager.java similar to FSNamesystem.java to simplify porting patch
- [HADOOP-5801] - JobTracker should refresh the hosts list upon recovery
- [HADOOP-5804] - neither s3.block.size nor fs.s3.block.size are honoured
- [HADOOP-5805] - problem using top level s3 buckets as input/output directories
- [HADOOP-5808] - Fix hdfs un-used import warnings
- [HADOOP-5809] - Job submission fails if hadoop.tmp.dir exists
- [HADOOP-5818] - Revert the renaming from checkSuperuserPrivilege to checkAccess by HADOOP-5643
- [HADOOP-5820] - Fix findbugs warnings for http related codes in hdfs
- [HADOOP-5823] - Handling javac "deprecated" warning for using UTF8
- [HADOOP-5824] - remove OP_READ_METADATA functionality from Datanode
- [HADOOP-5827] - Remove unwanted file that got checked in by accident
- [HADOOP-5829] - Fix javac warnings
- [HADOOP-5835] - Fix findbugs warnings
- [HADOOP-5836] - Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail
- [HADOOP-5841] - Resolve findbugs warnings in DistributedFileSystem.java, DatanodeInfo.java, BlocksMap.java, DataNodeDescriptor.java
- [HADOOP-5842] - Fix a few javac warnings under packages fs and util
- [HADOOP-5845] - Build successful despite test failure on test-core target
- [HADOOP-5847] - Streaming unit tests failing for a while on trunk
- [HADOOP-5853] - Undeprecate HttpServer.addInternalServlet method to fix javac warnings
- [HADOOP-5855] - Fix javac warnings for DisallowedDatanodeException and UnsupportedActionException
- [HADOOP-5856] - FindBugs : fix "unsafe multithreaded use of DateFormat" warning in hdfs
- [HADOOP-5859] - FindBugs : fix "wait() or sleep() with locks held" warnings in hdfs
- [HADOOP-5861] - s3n files are not getting split by default
- [HADOOP-5864] - Fix DMI and OBL findbugs in packages hdfs and metrics
- [HADOOP-5866] - Move DeprecatedUTF8 to o.a.h.hdfs
- [HADOOP-5877] - Fix javac warnings in TestHDFSServerPorts, TestCheckpoint, TestNameEditsConfig, TestStartup and TestStorageRestore
- [HADOOP-5878] - Fix hdfs jsp import and Serializable javac warnings
- [HADOOP-5891] - If dfs.http.address is default, SecondaryNameNode can't find NameNode
- [HADOOP-5895] - Log message shows -ve number of bytes to be merged in the final merge pass when there are no intermediate merges and merge factor is > number of segments
- [HADOOP-5899] - Minor - move info log to the right place to avoid printing unnecessary log
- [HADOOP-5900] - Minor correction in HDFS Documentation
- [HADOOP-5902] - 4 contrib test cases are failing for the svn committed code
- [HADOOP-5935] - Hudson's release audit warnings link is broken
- [HADOOP-5940] - trunk eclipse-plugin build fails while trying to copy commons-cli jar from the lib dir
- [HADOOP-5944] - BlockManager needs Apache license header.
- [HADOOP-5947] - org.apache.hadoop.mapred.lib.TestCombineFileInputFormat fails trunk builds
- [HADOOP-5951] - StorageInfo needs Apache license header.
- [HADOOP-5953] - KosmosFileSystem.isDirectory() should not be deprecated.
- [HADOOP-5954] - Fix javac warnings in HDFS tests
- [HADOOP-5956] - org.apache.hadoop.hdfsproxy.TestHdfsProxy.testHdfsProxyInterface test fails on trunk
- [HADOOP-5958] - Use JDK 1.6 File APIs in DF.java wherever possible
- [HADOOP-5963] - unnecessary exception catch in NNBench
- [HADOOP-5980] - LD_LIBRARY_PATH not passed to tasks spawned off by LinuxTaskController
- [HADOOP-5981] - HADOOP-2838 doesn't work as expected
- [HADOOP-5989] - streaming tests fails trunk builds
- [HADOOP-6004] - BlockLocation deserialization is incorrect
- [HADOOP-6009] - S3N listStatus incorrectly returns null instead of empty array when called on empty root
- [HADOOP-6017] - NameNode and SecondaryNameNode fail to restart because of abnormal filenames.
- [HADOOP-6031] - Remove @author tags from Java source files
- [HADOOP-6074] - TestDFSIO does not use configuration properly.
- [HADOOP-6076] - Forrest documentation compilation is broken because of HADOOP-5913
- [HADOOP-6079] - In DataTransferProtocol, the serialization of proxySource is not consistent
- [HADOOP-6090] - GridMix is broken after upgrading random(text)writer to newer mapreduce apis
- [HADOOP-6096] - Fix Eclipse project and classpath files following project split
- [HADOOP-6103] - Configuration clone constructor does not clone all the members.
- [HADOOP-6112] - to fix hudsonPatchQueueAdmin for different projects
- [HADOOP-6114] - bug in documentation: org.apache.hadoop.fs.FileStatus.getLen()
- [HADOOP-6122] - 64 javac compiler warnings
- [HADOOP-6123] - hdfs script does not work after project split.
- [HADOOP-6124] - patchJavacWarnings and trunkJavacWarnings are not consistent.
- [HADOOP-6131] - A sysproperty should not be set unless the property is set on the ant command line in build.xml.
- [HADOOP-6132] - RPC client opens an extra connection for VersionedProtocol
- [HADOOP-6137] - to fix project specific test-patch requirements
- [HADOOP-6138] - eliminate the deprecation warnings introduced by H-5438
- [HADOOP-6142] - archives relative path changes in common.
- [HADOOP-6151] - The servlets should quote html characters
- [HADOOP-6152] - Hadoop scripts do not correctly put jars on the classpath
- [HADOOP-6169] - Removing deprecated method calls in TFile
- [HADOOP-6172] - bin/hadoop version not working
- [HADOOP-6175] - Incorrect version compilation with es_ES.ISO8859-15 locale on Solaris 10
- [HADOOP-6177] - FSInputChecker.getPos() would return position greater than the file size
- [HADOOP-6180] - Namenode slowed down when many files with same filename were moved to Trash
- [HADOOP-6181] - Fixes for Eclipse template
- [HADOOP-6184] - Provide a configuration dump in json format.
- [HADOOP-6188] - TestHDFSTrash fails because of TestTrash in common
- [HADOOP-6192] - Shell.getUlimitMemoryCommand is tied to Map-Reduce
- [HADOOP-6196] - sync(0); next() breaks SequenceFile
- [HADOOP-6199] - Add the documentation for io.map.index.skip in core-default
- [HADOOP-6227] - Configuration does not lock parameters marked final if they have no value.
- [HADOOP-6229] - Attempt to make a directory under an existing file on LocalFileSystem should throw an Exception.
- [HADOOP-6234] - Permission configuration files should use octal and symbolic
- [HADOOP-6240] - Rename operation is not consistent between different implementations of FileSystem
- [HADOOP-6243] - NPE in handling deprecated configuration keys.
- [HADOOP-6250] - test-patch.sh doesn't clean up conf/*.xml files after the trunk run.
- [HADOOP-6254] - s3n fails with SocketTimeoutException
- [HADOOP-6257] - Two TestFileSystem classes are confusing hadoop-hdfs-hdfwithmr
- [HADOOP-6274] - TestLocalFSFileContextMainOperations tests wrongly expect a certain order to be returned.
- [HADOOP-6281] - HtmlQuoting throws NullPointerException
- [HADOOP-6283] - The exception message in FileUtil$HardLink.getLinkCount(..) is not clear
- [HADOOP-6285] - HttpServer.QuotingInputFilter has the wrong signature for getParameterMap
- [HADOOP-6286] - The Glob methods in FileContext do not deal with URIs correctly
- [HADOOP-6293] - FsShell -text should work on filesystems other than the default
- [HADOOP-6303] - Eclipse .classpath template has outdated jar files and is missing some new ones.
- [HADOOP-6314] - "bin/hadoop fs -help count" fails to show help about only "count" command.
- [HADOOP-6327] - Fix build error for one of the FileContext Tests
- [HADOOP-6334] - GenericOptionsParser does not understand uri for -files -libjars and -archives option
- [HADOOP-6341] - Hudson giving a +1 though no tests are included.
- [HADOOP-6347] - run-test-core-fault-inject runs a test case twice if -Dtestcase is set
- [HADOOP-6374] - JUnit tests should never depend on anything in conf
- [HADOOP-6375] - Update documentation for FsShell du command
- [HADOOP-6386] - NameNode's HttpServer can't instantiate InetSocketAddress: IllegalArgumentException is thrown
- [HADOOP-6390] - Block slf4j-simple from avro's pom
- [HADOOP-6391] - Classpath should not be part of command line arguments
- [HADOOP-6395] - Inconsistent versions of libraries are being included
- [HADOOP-6396] - Provide a description in the exception when an error is encountered parsing umask
- [HADOOP-6398] - Build is broken after HADOOP-6395 patch has been applied
- [HADOOP-6402] - testConf.xsl is not well-formed XML
- [HADOOP-6404] - Rename the generated artifacts to common instead of core
- [HADOOP-6405] - Update Eclipse configuration to match changes to Ivy configuration
- [HADOOP-6411] - Remove deprecated file src/test/hadoop-site.xml
- [HADOOP-6414] - Add command line help for -expunge command.
- [HADOOP-6439] - Shuffle deadlocks on wrong number of maps
- [HADOOP-6441] - Prevent remote CSS attacks in Hostname and UTF-7.
- [HADOOP-6451] - Contrib tests are not being run
- [HADOOP-6452] - Hadoop JSP pages don't work under a security manager
- [HADOOP-6461] - webapps aren't located correctly post-split
- [HADOOP-6462] - contrib/cloud failing, target "compile" does not exist
- [HADOOP-6478] - 0.21 - .eclipse-templates/.classpath out of sync with file system
- [HADOOP-6489] - Findbug report: LI_LAZY_INIT_STATIC, OBL_UNSATISFIED_OBLIGATION
- [HADOOP-6504] - Invalid example in the documentation of org.apache.hadoop.util.Tool
- [HADOOP-6505] - sed in build.xml fails
- [HADOOP-6520] - UGI should load tokens from the environment
- [HADOOP-6521] - FsPermission:SetUMask not updated to use new-style umask setting.
- [HADOOP-6522] - TestUTF8 fails
- [HADOOP-6540] - Contrib unit tests have invalid XML for core-site, etc.
- [HADOOP-6545] - Cached FileSystem objects can lead to wrong token being used in setting up connections
- [HADOOP-6546] - BloomMapFile can return false negatives
- [HADOOP-6548] - Replace org.mortbay.log.Log imports with commons logging
- [HADOOP-6549] - TestDoAsEffectiveUser should use ip address of the host for superuser ip check
- [HADOOP-6551] - Delegation tokens when renewed or cancelled should throw an exception that explains what went wrong
- [HADOOP-6552] - KEYTAB_KERBEROS_OPTIONS in UserGroupInformation should have options for automatic renewal of keytab based tickets
- [HADOOP-6558] - archive does not work with distcp -update
- [HADOOP-6560] - HarFileSystem throws NPE for har://hdfs-/foo
- [HADOOP-6570] - RPC#stopProxy throws NullPointerException if getProxyEngine(proxy) returns null
- [HADOOP-6572] - RPC responses may be out-of-order with respect to SASL
- [HADOOP-6577] - IPC server response buffer reset threshold should be configurable
- [HADOOP-6591] - HarFileSystem cannot handle paths with the space character
- [HADOOP-6593] - TextRecordInputStream doesn't close SequenceFile.Reader
- [HADOOP-6609] - Deadlock in DFSClient#getBlockLocations even with the security disabled
- [HADOOP-6630] - hadoop-config.sh fails to get executed if hadoop wrapper scripts are in path
- [HADOOP-6631] - FileUtil.fullyDelete() should continue to delete other files despite failure at any level.
- [HADOOP-6634] - AccessControlList uses full-principal names to verify acls causing queue-acls to fail
- [HADOOP-6640] - FileSystem.get() does RPC retries within a static synchronized block
- [HADOOP-6645] - Bugs on listStatus for HarFileSystem
- [HADOOP-6646] - Move HarFileSystem out of Hadoop Common.
- [HADOOP-6654] - Example in WritableComparable javadoc doesn't compile
- [HADOOP-6665] - DFSadmin commands setQuota and setSpaceQuota allowed when NameNode is in safemode.
- [HADOOP-6677] - InterfaceAudience.LimitedPrivate should take a string not an enum
- [HADOOP-6690] - FilterFileSystem doesn't overwrite setTimes
- [HADOOP-6691] - TestFileSystemCaching sometimes hangs
- [HADOOP-6698] - Revert the io.serialization package to 0.20.2's api
- [HADOOP-6701] - Incorrect exit codes for "dfs -chown", "dfs -chgrp"
- [HADOOP-6702] - Incorrect exit codes for "dfs -chown", "dfs -chgrp" when input is given in wildcard format.
- [HADOOP-6703] - Prevent renaming a file, symlink or directory to itself
- [HADOOP-6719] - Missing methods on FilterFs
- [HADOOP-6722] - NetUtils.connect should check that it hasn't connected a socket to itself
- [HADOOP-6723] - unchecked exceptions thrown in IPC Connection orphan clients
- [HADOOP-6724] - IPC doesn't properly handle IOEs thrown by socket factory
- [HADOOP-6727] - Remove UnresolvedLinkException from public FileContext APIs
- [HADOOP-6740] - Move commands_manual.xml from mapreduce into common
- [HADOOP-6742] - Add methods from HADOOP-6709 to TestFilterFileSystem
- [HADOOP-6748] - Remove hadoop.cluster.administrators
- [HADOOP-6750] - UserGroupInformation incompatibility: getCurrentUGI() and setCurrentUser() missing
- [HADOOP-6782] - TestAvroRpc fails with avro-1.3.1 and avro-1.3.2
- [HADOOP-6785] - Fix references to 0.22 in 0.21 branch
- [HADOOP-6788] - [Herriot] Exception exclusion functionality is not working correctly.
- [HADOOP-6790] - Instrumented (Herriot) build uses too wide mask to include aspect files.
- [HADOOP-6800] - Harmonize JAR library versions
- [HADOOP-6819] - [Herriot] Shell command for getting the new exceptions in the logs returning exitcode 1 after executing successfully.
- [HADOOP-6821] - Document changes to memory monitoring
- [HADOOP-6826] - Revert FileSystem create method that takes CreateFlags
- [HADOOP-6828] - Herriot uses old way of accessing logs directories
- [HADOOP-6847] - Problem staging 0.21.0 artifacts to Apache Nexus Maven Repository
- [HADOOP-6851] - Fix '$bin' path duplication in setup scripts
- [HADOOP-6854] - Cannot Configure 'progress' with CreateOpts API
- [HADOOP-6860] - 'compile-fault-inject' should never be called directly.
- [HADOOP-6875] - [Herriot] Cleanup of temp. configurations is needed upon restart of a cluster
- [HADOOP-6881] - The efficient comparators aren't always used except for BytesWritable and Text
- [HADOOP-6895] - Native Libraries do not load if a different platform signature is returned from org.apache.hadoop.util.PlatformName
Improvement
- [HADOOP-1722] - Make streaming to handle non-utf8 byte array
- [HADOOP-2141] - speculative execution start up condition based on completion time
- [HADOOP-2721] - Use job control for tasks (and therefore for pipes and streaming)
- [HADOOP-2838] - Add HADOOP_LIBRARY_PATH config setting so Hadoop will include external directories for jni
- [HADOOP-2898] - HOD should allow setting MapReduce UI ports within a port range
- [HADOOP-3659] - Patch to allow hadoop native to compile on Mac OS X
- [HADOOP-3953] - Sticky bit for directories
- [HADOOP-4191] - Add a testcase for jobhistory
- [HADOOP-4365] - Configuration.getProps() should be made protected for ease of overriding
- [HADOOP-4372] - Improve the way the job history files are managed during job recovery
- [HADOOP-4546] - Minor fix in dfs to make hadoop work in AIX
- [HADOOP-4656] - Add a user to groups mapping service
- [HADOOP-4788] - Set mapred.fairscheduler.assignmultiple to true by default
- [HADOOP-4794] - separate branch for HadoopVersionAnnotation
- [HADOOP-4842] - Streaming combiner should allow command, not just JavaClass
- [HADOOP-4859] - Make the M/R Job output dir unique for Daily rolling
- [HADOOP-4868] - Split the hadoop script into 3 parts
- [HADOOP-4885] - Try to restore failed replicas of Name Node storage (at checkpoint time)
- [HADOOP-4895] - Remove deprecated methods in DFSClient
- [HADOOP-4936] - Improvements to TestSafeMode
- [HADOOP-5015] - Separate block/replica management code from FSNamesystem
- [HADOOP-5023] - Add Tomcat support to hdfsproxy
- [HADOOP-5033] - chukwa writer API is confusing
- [HADOOP-5038] - remove System.out.println statement
- [HADOOP-5088] - include releaseaudit as part of test-patch.sh script
- [HADOOP-5094] - Show dead nodes information in dfsadmin -report
- [HADOOP-5101] - optimizing build.xml target dependencies
- [HADOOP-5107] - split the core, hdfs, and mapred jars from each other and publish them independently to the Maven repository
- [HADOOP-5124] - A few optimizations to FsNamesystem#RecentInvalidateSets
- [HADOOP-5126] - Empty file BlocksWithLocations.java should be removed
- [HADOOP-5135] - Separate the core, hdfs and mapred junit tests
- [HADOOP-5144] - manual way of turning on restore of failed storage replicas for namenode
- [HADOOP-5147] - remove refs to slaves file
- [HADOOP-5163] - FSNamesystem#getRandomDatanode() should not use Replicator to choose a random datanode
- [HADOOP-5176] - TestDFSIO reports itself as TestFDSIO
- [HADOOP-5196] - avoiding unnecessary byte[] allocation in SequenceFile.CompressedBytes and SequenceFile.UncompressedBytes
- [HADOOP-5205] - Change CHUKWA_IDENT_STRING from "demo" to "TODO-AGENTS-INSTANCE-NAME"
- [HADOOP-5222] - Add offset in client trace
- [HADOOP-5240] - 'ant javadoc' does not check whether outputs are up to date and always rebuilds
- [HADOOP-5264] - TaskTracker should have single conf reference
- [HADOOP-5266] - Values Iterator should support "mark" and "reset"
- [HADOOP-5279] - test-patch.sh script should just call the test-core target as part of runtestcore function.
- [HADOOP-5317] - Provide documentation for LazyOutput Feature
- [HADOOP-5331] - KFS: Add support for append
- [HADOOP-5364] - Adding SSL certificate expiration warning to hdfsproxy
- [HADOOP-5365] - hdfsproxy should log every access
- [HADOOP-5369] - Small tweaks to reduce MapFile index size
- [HADOOP-5396] - Queue ACLs should be refreshed without requiring a restart of the job tracker
- [HADOOP-5419] - Provide a way for users to find out what operations they can do on which M/R queues
- [HADOOP-5423] - It should be possible to specify metadata for the output file produced by SequenceFile.Sorter.sort
- [HADOOP-5438] - Merge FileSystem.create and FileSystem.append
- [HADOOP-5450] - Add support for application-specific typecodes to typed bytes
- [HADOOP-5455] - default "hadoop-metrics.properties" doesn't mention "rpc" context
- [HADOOP-5485] - Authorisation mechanism required for accessing jobtracker url :- jobtracker.com:port/scheduler
- [HADOOP-5494] - IFile.Reader should have a nextRawKey/nextRawValue
- [HADOOP-5500] - Allow number of fields to be supplied when field names are not known in DBOutputFormat#setOutput()
- [HADOOP-5502] - Backup and checkpoint nodes should be documented
- [HADOOP-5509] - PendingReplicationBlocks should not start monitor in constructor.
- [HADOOP-5572] - The map progress value should have a separate phase for doing the final sort.
- [HADOOP-5589] - TupleWritable: Lift implicit limit on the number of values that can be stored
- [HADOOP-5595] - NameNode does not need to run a replicator to choose a random DataNode
- [HADOOP-5596] - Make ObjectWritable support EnumSet
- [HADOOP-5603] - Improve block placement performance
- [HADOOP-5613] - change S3Exception to checked exception
- [HADOOP-5618] - Convert Storage.storageDirs into a map.
- [HADOOP-5620] - distcp can preserve modification times of files
- [HADOOP-5625] - Add I/O duration time in client trace
- [HADOOP-5638] - More improvement on block placement performance
- [HADOOP-5657] - Validate data passed through TestReduceFetch
- [HADOOP-5664] - Use of ReentrantLock.lock() in MapOutputBuffer takes up too much cpu time
- [HADOOP-5675] - DistCp should not launch a job if it is not necessary
- [HADOOP-5687] - Hadoop NameNode throws NPE if fs.default.name is the default value
- [HADOOP-5705] - Improved tries in TotalOrderPartitioner to eliminate large leaf nodes.
- [HADOOP-5717] - Create public enum class for the Framework counters in org.apache.hadoop.mapreduce
- [HADOOP-5721] - Provide EditLogFileInputStream and EditLogFileOutputStream as independent classes
- [HADOOP-5727] - Faster, simpler id.hashCode() which does not allocate memory
- [HADOOP-5733] - Add map/reduce slot capacity and lost map/reduce slot capacity to JobTracker metrics
- [HADOOP-5771] - Create unit test for LinuxTaskController
- [HADOOP-5784] - The length of the heartbeat cycle should be configurable.
- [HADOOP-5790] - Allow shuffle read and connection timeouts to be configurable
- [HADOOP-5822] - Fix javac warnings in several dfs tests related to unnecessary casts
- [HADOOP-5838] - Remove a few javac warnings under hdfs
- [HADOOP-5839] - fixes to ec2 scripts to allow remote job submission
- [HADOOP-5854] - findbugs : fix "Inconsistent Synchronization" warnings in hdfs
- [HADOOP-5857] - Refactor hdfs jsp codes
- [HADOOP-5858] - Eliminate UTF8 and fix warnings in test/hdfs-with-mr package
- [HADOOP-5867] - Cleaning NNBench* off javac warnings
- [HADOOP-5873] - Remove deprecated methods randomDataNode() and getDatanodeByIndex(..) in FSNamesystem
- [HADOOP-5879] - GzipCodec should read compression level etc from configuration
- [HADOOP-5890] - Use exponential backoff on Thread.sleep during DN shutdown
- [HADOOP-5896] - Remove the dependency of GenericOptionsParser on Option.withArgPattern
- [HADOOP-5897] - Add more Metrics to Namenode to capture heap usage
- [HADOOP-5925] - EC2 scripts should exit on error
- [HADOOP-5961] - DataNode should understand generic hadoop options
- [HADOOP-5967] - Sqoop should only use a single map task
- [HADOOP-5968] - Sqoop should only print a warning about mysql import speed once
- [HADOOP-5976] - create script to provide classpath for external tools
- [HADOOP-6099] - Allow configuring the IPC module to send pings
- [HADOOP-6105] - Provide a way to automatically handle backward compatibility of deprecated keys
- [HADOOP-6106] - Provide an option in ShellCommandExecutor to timeout commands that do not complete within a certain amount of time.
- [HADOOP-6109] - Handle large (several MB) text input lines in a reasonable amount of time
- [HADOOP-6133] - ReflectionUtils performance regression
- [HADOOP-6146] - Upgrade to JetS3t version 0.7.1
- [HADOOP-6148] - Implement a pure Java CRC32 calculator
- [HADOOP-6150] - Need to be able to instantiate a comparator instance from a comparator string without creating a TFile.Reader object
- [HADOOP-6160] - releaseaudit (rats) should not be run against the entire release binary
- [HADOOP-6161] - Add get/setEnum to Configuration
- [HADOOP-6163] - Progress class should provide an api if phases exist
- [HADOOP-6166] - Improve PureJavaCrc32
- [HADOOP-6182] - Adding Apache License Headers and reduce releaseaudit warnings to zero
- [HADOOP-6201] - FileSystem::ListStatus should throw FileNotFoundException
- [HADOOP-6203] - Improve error message when moving to trash fails due to quota issue
- [HADOOP-6204] - Implementing aspects development and fault injection framework for Hadoop
- [HADOOP-6216] - HDFS Web UI displays comments from dfs.exclude file and counts them as dead nodes
- [HADOOP-6224] - Add a method to WritableUtils performing a bounded read of a String
- [HADOOP-6233] - Changes in common to rename the config keys as detailed in HDFS-531.
- [HADOOP-6246] - Update umask code to use key deprecation facilities from HADOOP-6105
- [HADOOP-6252] - Provide method to determine if a deprecated key was set in the config file
- [HADOOP-6267] - build-contrib.xml unnecessarily enforces that contrib projects be located in contrib/ dir
- [HADOOP-6268] - Add ivy jar to .gitignore
- [HADOOP-6271] - Fix FileContext to allow both recursive and non recursive create and mkdir
- [HADOOP-6279] - Add JVM memory usage to JvmMetrics
- [HADOOP-6289] - Add interface classification stable & scope to common
- [HADOOP-6299] - Use JAAS LoginContext for our login
- [HADOOP-6301] - Need to post Injection HowTo to Apache Hadoop's Wiki
- [HADOOP-6305] - Unify build property names to facilitate cross-projects modifications
- [HADOOP-6307] - Support reading on un-closed SequenceFile
- [HADOOP-6318] - Upgrade to Avro 1.2.0
- [HADOOP-6326] - Hudson runs should check for AspectJ warnings and report failure if any are present
- [HADOOP-6343] - Stack trace of any runtime exceptions should be recorded in the server logs.
- [HADOOP-6366] - Reduce ivy console output to observable level
- [HADOOP-6367] - Move Access Token implementation from Common to HDFS
- [HADOOP-6394] - Helper class for FileContext tests
- [HADOOP-6400] - Log errors getting Unix UGI
- [HADOOP-6403] - Deprecate EC2 bash scripts
- [HADOOP-6407] - Have a way to automatically update Eclipse .classpath file when new libs are added to the classpath through Ivy
- [HADOOP-6413] - Move TestReflectionUtils to Common
- [HADOOP-6420] - String-to-String Maps should be embeddable in Configuration
- [HADOOP-6434] - Make HttpServer slightly easier to manage/diagnose faults with
- [HADOOP-6435] - Make RPC.waitForProxy with timeout public
- [HADOOP-6443] - Serialization classes accept invalid metadata
- [HADOOP-6467] - Performance improvement for liststatus on directories in hadoop archives.
- [HADOOP-6471] - StringBuffer -> StringBuilder - conversion of references as necessary
- [HADOOP-6479] - TestUTF8 assertions could fail with better text
- [HADOOP-6492] - Make avro serialization APIs public
- [HADOOP-6515] - Make maximum number of http threads configurable
- [HADOOP-6518] - Kerberos login in UGI should honor KRB5CCNAME
- [HADOOP-6531] - add FileUtil.fullyDeleteContents(dir) api to delete contents of a directory
- [HADOOP-6534] - LocalDirAllocator should use whitespace trimming configuration getters
- [HADOOP-6537] - Proposal for exceptions thrown by FileContext and Abstract File System
- [HADOOP-6543] - Allow authentication-enabled RPC clients to connect to authentication-disabled RPC servers
- [HADOOP-6559] - The RPC client should try to re-login when it detects that the TGT expired
- [HADOOP-6569] - FsShell#cat should avoid calling unnecessary getFileStatus before opening a file to read
- [HADOOP-6579] - A utility for reading and writing tokens into a URL safe string.
- [HADOOP-6582] - Token class should have a toString, equals and hashcode method
- [HADOOP-6583] - Capture metrics for authentication/authorization at the RPC layer
- [HADOOP-6585] - Add FileStatus#isDirectory and isFile
- [HADOOP-6589] - Better error messages for RPC clients when authentication fails
- [HADOOP-6635] - Install or deploy source jars to maven repo
- [HADOOP-6657] - Common portion of MAPREDUCE-1545
- [HADOOP-6678] - Remove FileContext#isFile, isDirectory and exists
- [HADOOP-6686] - Remove redundant exception class name in unwrapped exceptions thrown at the RPC client
- [HADOOP-6709] - Re-instate deprecated FileSystem methods that were removed after 0.20
- [HADOOP-6713] - The RPC server Listener thread is a scalability bottleneck
- [HADOOP-6717] - Log levels in o.a.h.security.Groups too high
- [HADOOP-6769] - Add an API in FileSystem to get FileSystem instances based on users
- [HADOOP-6777] - Implement functionality to suspend and resume a process.
- [HADOOP-6794] - Move configuration and script files post split
- [HADOOP-6798] - Align Ivy version for all Hadoop subprojects.
- [HADOOP-6813] - Add a new newInstance method in FileSystem that takes a "user" as argument
New Feature
- [HADOOP-3741] - SecondaryNameNode has http server on dfs.secondary.http.address but without any contents
- [HADOOP-4012] - Providing splitting support for bzip2 compressed files
- [HADOOP-4268] - Permission checking in fsck
- [HADOOP-4359] - Access Token: Support for data access authorization checking on DataNodes
- [HADOOP-4368] - Superuser privileges required to do "df"
- [HADOOP-4539] - Streaming Edits to a Backup Node.
- [HADOOP-4756] - Create a command line tool to access JMX exported properties from a NameNode server
- [HADOOP-4768] - Dynamic Priority Scheduler that allows queue shares to be controlled dynamically by a currency
- [HADOOP-4829] - Allow FileSystem shutdown hook to be disabled
- [HADOOP-4861] - Add disk usage with human-readable size (-duh)
- [HADOOP-4927] - Part files on the output filesystem are created irrespective of whether the corresponding task has anything to write there
- [HADOOP-4952] - Improved file system interface for the application writer.
- [HADOOP-5018] - Chukwa should support pipelined writers
- [HADOOP-5042] - Add expiration handling to the chukwa log4j appender
- [HADOOP-5052] - Add an example for computing exact digits of Pi
- [HADOOP-5170] - Set max map/reduce tasks on a per-job basis, either per-node or cluster-wide
- [HADOOP-5175] - Option to prohibit jars unpacking
- [HADOOP-5232] - Prepare HadoopPatchQueueAdmin.sh and test-patch.sh scripts to run builds on Hudson slaves.
- [HADOOP-5257] - Export namenode/datanode functionality through a pluggable RPC layer
- [HADOOP-5258] - Provide dfsadmin functionality to report on namenode's view of network topology
- [HADOOP-5363] - Proxying for multiple HDFS clusters of different versions
- [HADOOP-5366] - Support for retrieving files using standard HTTP clients like curl
- [HADOOP-5467] - Create an offline fsimage image viewer
- [HADOOP-5469] - Exposing Hadoop metrics via HTTP
- [HADOOP-5518] - MRUnit unit test library
- [HADOOP-5528] - Binary partitioner
- [HADOOP-5643] - Ability to blacklist tasktracker
- [HADOOP-5745] - Allow setting the default value of maxRunningJobs for all pools
- [HADOOP-5752] - Provide examples of using offline image viewer (oiv) to analyze hadoop file systems
- [HADOOP-5815] - Sqoop: A database import tool for Hadoop
- [HADOOP-5844] - Use mysqldump when connecting to local mysql instance in Sqoop
- [HADOOP-5887] - Sqoop should create tables in Hive metastore after importing to HDFS
- [HADOOP-5913] - Allow administrators to be able to start and stop queues
- [HADOOP-6120] - Add support for Avro types in hadoop
- [HADOOP-6165] - Add metadata to Serializations
- [HADOOP-6173] - src/native/packageNativeHadoop.sh only packages files with "hadoop" in the name
- [HADOOP-6185] - Replace FSDataOutputStream#sync() by hflush()
- [HADOOP-6218] - Split TFile by Record Sequence Number
- [HADOOP-6226] - Create a LimitedByteArrayOutputStream that does not expand its buffer on write
- [HADOOP-6235] - Adding a new method for getting server default values from a FileSystem
- [HADOOP-6270] - FileContext needs to provide deleteOnExit functionality
- [HADOOP-6313] - Expose flush APIs to application users
- [HADOOP-6323] - Serialization should provide comparators
- [HADOOP-6332] - Large-scale Automated Test Framework
- [HADOOP-6337] - Update FilterInitializer class to be more visible and take a conf for further development
- [HADOOP-6408] - Add a /conf servlet to dump running configuration
- [HADOOP-6415] - Adding a common token interface for both job token and delegation token
- [HADOOP-6419] - Change RPC layer to support SASL based mutual authentication
- [HADOOP-6433] - Add AsyncDiskService that is used in both hdfs and mapreduce
- [HADOOP-6497] - Introduce wrapper around FSDataInputStream providing Avro SeekableInput interface
- [HADOOP-6510] - doAs for proxy user
- [HADOOP-6517] - Ability to add/get tokens from UserGroupInformation
- [HADOOP-6547] - Move the Delegation Token feature to common since both HDFS and MapReduce need it
- [HADOOP-6566] - Hadoop daemons should not start up if the ownership/permissions on the directories used at runtime are misconfigured
- [HADOOP-6573] - Delegation Tokens should be persisted.
- [HADOOP-6594] - Update hdfs script to provide fetchdt tool
- [HADOOP-6869] - Functionality to create file or folder on a remote daemon side
Task
- [HADOOP-6155] - deprecate Record IO
- [HADOOP-6217] - Hadoop Doc Split: Common Docs
- [HADOOP-6292] - Native Libraries Guide - Update
- [HADOOP-6321] - Hadoop Common - Site logo
- [HADOOP-6329] - Add build-fi directory to the ignore list
- [HADOOP-6346] - Add support for specifying unpack pattern regex to RunJar.unJar
- [HADOOP-6353] - Create Apache Wiki page for JSure and FlashLight tools
- [HADOOP-6477] - 0.21.0 - upload of the latest snapshot to apache snapshot repository
- [HADOOP-6507] - Hadoop Common Docs - delete 3 doc files that do not belong under Common
- [HADOOP-6772] - Utilities specific to system tests.
- [HADOOP-6839] - [Herriot] Implement functionality for getting the user list for creating proxy users.
Test
- [HADOOP-5080] - Update TestCLI with additional test cases.
- [HADOOP-5081] - Split TestCLI into HDFS, Mapred and Core tests
- [HADOOP-5457] - Failing contrib tests should not stop the build
- [HADOOP-5948] - Modify TestJavaSerialization to use LocalJobRunner instead of MiniMR/DFS cluster
- [HADOOP-5952] - Hudson -1 wording change
- [HADOOP-5955] - TestFileOutputFormat can use LOCAL_MR instead of CLUSTER_MR
- [HADOOP-6176] - Adding a couple private methods to AccessTokenHandler for testing purposes
- [HADOOP-6222] - Core doesn't have TestCommonCLI facility
- [HADOOP-6260] - Unit tests for FileSystemContextUtil.
- [HADOOP-6261] - Junit tests for FileContextURI
- [HADOOP-6309] - Enable asserts for tests by default
- [HADOOP-6563] - Add more tests to FileContextSymlinkBaseTest that cover intermediate symlinks in paths
- [HADOOP-6689] - Add directory renaming test to FileContextMainOperationsBaseTest
- [HADOOP-6705] - jiracli fails to upload test-patch comments to jira
- [HADOOP-6738] - Move cluster_setup.xml from MapReduce to Common
- [HADOOP-6836] - [Herriot]: Generic method for adding/modifying the attributes for new configuration.
Wish