Hadoop 0.23.8 Release Notes
These release notes include new developer and user-facing incompatibilities, features, and major improvements.
Changes since Hadoop 0.23.7
- YARN-690.
Blocker bug reported by Daryn Sharp and fixed by Daryn Sharp (resourcemanager)
RM exits on token cancel/renew problems
The DelegationTokenRenewer thread is critical to the RM. When a non-IOException occurs, the thread calls System.exit to prevent the RM from running without the thread. It should be exiting only on non-RuntimeExceptions.
The problem is especially bad in 0.23 because the yarn protobuf layer converts IOExceptions into UndeclaredThrowableExceptions (a RuntimeException), which causes the renewer to abort the process. An UnknownHostException takes down the RM...
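The exit rule described above can be sketched as two predicates. This is a minimal illustration, not the actual DelegationTokenRenewer code: the class name, method names, and host name are hypothetical, and the "after" rule is one reading of "exiting only on non-RuntimeExceptions" (exit only when the throwable is neither an IOException nor a RuntimeException).

```java
import java.io.IOException;
import java.lang.reflect.UndeclaredThrowableException;
import java.net.UnknownHostException;

public class RenewerExitPolicy {
    // Rule described as the problem: anything that is not an IOException is fatal.
    static boolean exitsBeforeFix(Throwable t) {
        return !(t instanceof IOException);
    }

    // Assumed reading of the requested rule: RuntimeExceptions are survivable too,
    // so only other throwables (for example Errors) stop the process.
    static boolean exitsAfterFix(Throwable t) {
        return !(t instanceof IOException) && !(t instanceof RuntimeException);
    }

    public static void main(String[] args) {
        // An IOException subclass wrapped the way the description says the
        // protobuf layer wraps it: inside a RuntimeException.
        Throwable wrapped = new UndeclaredThrowableException(
                new UnknownHostException("host.example.invalid"));
        System.out.println("before fix, exit? " + exitsBeforeFix(wrapped));
        System.out.println("after fix, exit? " + exitsAfterFix(wrapped));
    }
}
```

Under the first rule the wrapped UnknownHostException is treated as fatal because the wrapper is not an IOException; under the second rule it is not.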
- YARN-548.
Major sub-task reported by Vadim Bondarev and fixed by Vadim Bondarev
Add tests for YarnUncaughtExceptionHandler
- YARN-476.
Minor bug reported by Jason Lowe and fixed by Sandy Ryza
ProcfsBasedProcessTree info message confuses users
ProcfsBasedProcessTree has a habit of emitting not-so-helpful messages such as the following:
{noformat}
2013-03-13 12:41:51,957 INFO [communication thread] org.apache.hadoop.yarn.util.ProcfsBasedProcessTree: The process 28747 may have finished in the interim.
2013-03-13 12:41:51,958 INFO [communication thread] org.apache.hadoop.yarn.util.ProcfsBasedProcessTree: The process 28978 may have finished in the interim.
2013-03-13 12:41:51,958 INFO [communication thread] org.apache.hadoop.yarn.util.ProcfsBasedProcessTree: The process 28979 may have finished in the interim.
{noformat}
As described in MAPREDUCE-4570, this is something that naturally occurs in the process of monitoring processes via procfs. It's uninteresting at best and can confuse users who think it's a reason their job isn't running as expected when it appears in their logs.
We should either make this DEBUG or remove it entirely.
- YARN-363.
Major bug reported by Jason Lowe and fixed by Kenji Kikushima
yarn proxyserver fails to find webapps/proxy directory on startup
Starting up the proxy server fails with this error:
{noformat}
2013-01-29 17:37:41,357 FATAL webproxy.WebAppProxy (WebAppProxy.java:start(99)) - Could not start proxy web server
java.io.FileNotFoundException: webapps/proxy not found in CLASSPATH
at org.apache.hadoop.http.HttpServer.getWebAppsPath(HttpServer.java:533)
at org.apache.hadoop.http.HttpServer.<init>(HttpServer.java:225)
at org.apache.hadoop.http.HttpServer.<init>(HttpServer.java:164)
at org.apache.hadoop.yarn.server.webproxy.WebAppProxy.start(WebAppProxy.java:90)
at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
at org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer.main(WebAppProxyServer.java:94)
{noformat}
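The failure above is a classpath resource lookup that returns nothing. The following sketch reproduces that kind of lookup under stated assumptions: the class and method names are hypothetical, and it is not the HttpServer source, only an illustration of resolving "webapps/<name>" through the class loader and failing with the same message when the directory is absent.

```java
import java.io.FileNotFoundException;
import java.net.URL;

public class WebAppPathLookup {
    // Resolve "webapps/<appName>" through the class loader; a null result means
    // no classpath entry contains that directory.
    static String findWebAppsPath(String appName) throws FileNotFoundException {
        String resource = "webapps/" + appName;
        URL url = WebAppPathLookup.class.getClassLoader().getResource(resource);
        if (url == null) {
            throw new FileNotFoundException(resource + " not found in CLASSPATH");
        }
        return url.toString();
    }

    public static void main(String[] args) {
        try {
            System.out.println(findWebAppsPath("proxy"));
        } catch (FileNotFoundException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Running a check like this with the same classpath as the proxy server is one way to confirm whether a webapps/proxy directory is visible to the class loader at all.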
- YARN-71.
Critical bug reported by Vinod Kumar Vavilapalli and fixed by Xuan Gong (nodemanager)
Ensure/confirm that the NodeManager cleans up local-dirs on restart
We have to make sure that NodeManagers clean up their local files on restart.
It may already be working like that, in which case we should have tests validating this.
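A test of the kind requested here could populate a local directory, run the cleanup, and check that nothing is left. The sketch below shows only that idea with plain java.nio: the class name, method name, and directory layout are illustrative assumptions, and it does not use the NodeManager's own deletion code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LocalDirCleanup {
    // Delete everything under localDir but keep localDir itself.
    static void cleanLocalDir(Path localDir) throws IOException {
        List<Path> toDelete;
        try (Stream<Path> walk = Files.walk(localDir)) {
            // Reverse order puts children before their parent directories.
            toDelete = walk.filter(p -> !p.equals(localDir))
                           .sorted(Comparator.reverseOrder())
                           .collect(Collectors.toList());
        }
        for (Path p : toDelete) {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate files left behind by a previous run.
        Path dir = Files.createTempDirectory("nm-local-dir");
        Files.createDirectories(dir.resolve("usercache/alice"));
        Files.write(dir.resolve("usercache/alice/leftover.tmp"), new byte[] {1});

        cleanLocalDir(dir);

        try (Stream<Path> remaining = Files.list(dir)) {
            System.out.println("entries left: " + remaining.count());
        }
        Files.delete(dir);
    }
}
```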
- MAPREDUCE-5211.
Blocker bug reported by Jason Lowe and fixed by Jason Lowe (mrv2)
Reducer intermediate files can collide during merge
- MAPREDUCE-5168.
Critical bug reported by Jason Lowe and fixed by Jason Lowe (mrv2)
Reducer can OOM during shuffle because on-disk output stream not released
- MAPREDUCE-5147.
Major bug reported by Robert Parker and fixed by Robert Parker (mrv2)
Maven build should create hadoop-mapreduce-client-app-VERSION.jar directly
- MAPREDUCE-5065.
Major bug reported by Mithun Radhakrishnan and fixed by Mithun Radhakrishnan (distcp)
DistCp should skip checksum comparisons if block-sizes are different on source/target.
- MAPREDUCE-5059.
Major bug reported by Jason Lowe and fixed by Omkar Vinit Joshi (jobhistoryserver, webapps)
Job overview shows average merge time larger than for any reduce attempt
- MAPREDUCE-5015.
Major test reported by Aleksey Gorshkov and fixed by Aleksey Gorshkov
Coverage fix for org.apache.hadoop.mapreduce.tools.CLI
- MAPREDUCE-4927.
Major bug reported by Jason Lowe and fixed by Ashwin Shankar (jobhistoryserver)
Historyserver 500 error due to NPE when accessing specific counters page for failed job
- MAPREDUCE-4383.
Minor bug reported by Andy Isaacson and fixed by Andy Isaacson (pipes)
HadoopPipes.cc needs to include unistd.h
- HDFS-4835.
Critical bug reported by Robert Parker and fixed by Robert Parker (webhdfs)
Port trunk WebHDFS changes to branch-0.23
- HDFS-4807.
Major bug reported by Kihwal Lee and fixed by Cristina L. Abad
DFSOutputStream.createSocketForPipeline() should not include timeout extension on connect
- HDFS-4714.
Major bug reported by Kihwal Lee and fixed by (namenode)
Log short messages in Namenode RPC server for exceptions meant for clients
- HDFS-4699.
Major bug reported by Chris Nauroth and fixed by Chris Nauroth (test)
TestPipelinesFailover#testPipelineRecoveryStress fails sporadically
- HDFS-4690.
Critical bug reported by Daryn Sharp and fixed by Daryn Sharp (namenode, security)
Namenode exits if entering safemode while secret manager is edit logging
- HDFS-4477.
Critical bug reported by Kihwal Lee and fixed by Daryn Sharp (security)
Secondary namenode may retain old tokens
- HDFS-3875.
Critical bug reported by Todd Lipcon and fixed by Kihwal Lee (datanode, hdfs-client)
Issue handling checksum errors in write pipeline
- HADOOP-9469.
Major bug reported by Thomas Graves and fixed by Robert Parker
mapreduce/yarn source jars not included in dist tarball
- HADOOP-9233.
Major test reported by Vadim Bondarev and fixed by Vadim Bondarev
Cover package org.apache.hadoop.io.compress.zlib with unit tests
- HADOOP-9222.
Major test reported by Vadim Bondarev and fixed by Vadim Bondarev
Cover package org.apache.hadoop.io.lz4 with unit tests