release notes for Hadoop-1.1.0-rc1

git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1@1365470 13f79535-47bb-0310-9956-ffa450edef68
Matthew Foley · 12 years ago
commit 76934a07f3
1 changed file with 787 additions and 2 deletions

src/docs/releasenotes.html (+787 −2)

@@ -2,7 +2,7 @@
 <html>
 <head>
 <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
-<title>Hadoop 1.0.3 Release Notes</title>
+<title>Hadoop 1.1.0 Release Notes</title>
 <STYLE type="text/css">
 		H1 {font-family: sans-serif}
 		H2 {font-family: sans-serif; margin-left: 7mm}
@@ -10,11 +10,796 @@
 	</STYLE>
 </head>
 <body>
-<h1>Hadoop 1.0.3 Release Notes</h1>
+<h1>Hadoop 1.1.0 Release Notes</h1>
 		These release notes include new developer and user-facing incompatibilities, features, and major improvements. 
 
 <a name="changes"/>
 
+<h2>Changes since Hadoop 1.0.3</h2>
+
+<h3>Jiras with Release Notes (describe major or incompatible changes)</h3>
+<ul>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-5464">HADOOP-5464</a>.
+     Major bug reported by rangadi and fixed by rangadi <br>
+     <b>DFSClient does not treat write timeout of 0 properly</b><br>
+     <blockquote>                                          Zero values for dfs.socket.timeout and dfs.datanode.socket.write.timeout are now respected. Previously zero values for these parameters resulted in a 5 second timeout.
+
+      
+</blockquote></li>
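+<p>A minimal sketch of opting out of these timeouts now that zero is honored; the key names come from the note above, while the class name and setup are illustrative:</p>
+<pre>
+import org.apache.hadoop.conf.Configuration;
+
+public class DisableDfsTimeouts {
+  public static void main(String[] args) {
+    Configuration conf = new Configuration();
+    // Zero now means "no timeout"; previously it silently became 5 seconds.
+    conf.setInt("dfs.socket.timeout", 0);
+    conf.setInt("dfs.datanode.socket.write.timeout", 0);
+  }
+}
+</pre>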
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-6995">HADOOP-6995</a>.
+     Minor improvement reported by tlipcon and fixed by tlipcon (security)<br>
+     <b>Allow wildcards to be used in ProxyUsers configurations</b><br>
+     <blockquote>                                          When configuring proxy users and hosts, the special wildcard value &quot;*&quot; may be specified to match any host or any user.
+
+      
+</blockquote></li>
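+<p>A hedged example of the wildcard in a proxy-user configuration; the hadoop.proxyuser.*.hosts/groups key pattern is the standard one, and &quot;oozie&quot; is just an example proxy user:</p>
+<pre>
+import org.apache.hadoop.conf.Configuration;
+
+public class ProxyUserWildcards {
+  public static void main(String[] args) {
+    Configuration conf = new Configuration();
+    // "*" now matches any host and any group for the proxy user.
+    conf.set("hadoop.proxyuser.oozie.hosts", "*");
+    conf.set("hadoop.proxyuser.oozie.groups", "*");
+  }
+}
+</pre>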
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7509">HADOOP-7509</a>.
+     Trivial improvement reported by raviprak and fixed by raviprak <br>
+     <b>Improve message when Authentication is required</b><br>
+     <blockquote>                    The message shown when authentication is required is now clearer.
+</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8230">HADOOP-8230</a>.
+     Major improvement reported by eli2 and fixed by eli <br>
+     <b>Enable sync by default and disable append</b><br>
+     <blockquote>                    Append is not supported in Hadoop 1.x. Please upgrade to 2.x if you need append. If you enabled dfs.support.append for HBase, you&#39;re OK, as durable sync (the reason HBase required dfs.support.append) is now enabled by default. If you really need the previous append behavior, set the flag &quot;dfs.support.broken.append&quot; to true.
+</blockquote></li>
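+<p>For completeness, a minimal sketch of re-enabling the old append path (not recommended); only the flag name is taken from the note above:</p>
+<pre>
+import org.apache.hadoop.conf.Configuration;
+
+public class EnableBrokenAppend {
+  public static void main(String[] args) {
+    Configuration conf = new Configuration();
+    // Restores the pre-1.1 append behavior; the name signals it is unsupported.
+    conf.setBoolean("dfs.support.broken.append", true);
+  }
+}
+</pre>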
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8352">HADOOP-8352</a>.
+     Major improvement reported by owen.omalley and fixed by owen.omalley <br>
+     <b>We should always generate a new configure script for the c++ code</b><br>
+     <blockquote>                    If you are compiling c++, the configure script will now be automatically regenerated as it should be.
+<br/>
+
+This requires autoconf version 2.61 or greater.
+</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8365">HADOOP-8365</a>.
+     Blocker improvement reported by eli2 and fixed by eli <br>
+     <b>Add flag to disable durable sync</b><br>
+     <blockquote>                    This patch enables durable sync by default. Installations that do not use HBase, and that previously ran without setting &quot;dfs.support.append&quot; (or set it to false explicitly in the configuration), must add the new flag &quot;dfs.durable.sync&quot; and set it to false to preserve the previous semantics.
+</blockquote></li>
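+<p>A sketch of opting back into the old semantics; the flag name comes from the note above, the rest is illustrative:</p>
+<pre>
+import org.apache.hadoop.conf.Configuration;
+
+public class DisableDurableSync {
+  public static void main(String[] args) {
+    Configuration conf = new Configuration();
+    // Durable sync is now on by default; set to false to keep the old behavior.
+    conf.setBoolean("dfs.durable.sync", false);
+  }
+}
+</pre>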
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2318">HDFS-2318</a>.
+     Major sub-task reported by szetszwo and fixed by szetszwo (webhdfs)<br>
+     <b>Provide authentication to webhdfs using SPNEGO</b><br>
+     <blockquote>                                          Added two new conf properties dfs.web.authentication.kerberos.principal and dfs.web.authentication.kerberos.keytab for the SPNEGO servlet filter.
+
+      
+</blockquote></li>
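+<p>A hedged sketch of the two new properties; the principal and keytab values below are placeholders, not defaults:</p>
+<pre>
+import org.apache.hadoop.conf.Configuration;
+
+public class WebHdfsSpnegoConf {
+  public static void main(String[] args) {
+    Configuration conf = new Configuration();
+    // Placeholder principal and keytab path for the SPNEGO servlet filter.
+    conf.set("dfs.web.authentication.kerberos.principal",
+        "HTTP/host.example.com@EXAMPLE.COM");
+    conf.set("dfs.web.authentication.kerberos.keytab",
+        "/etc/security/spnego.keytab");
+  }
+}
+</pre>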
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2338">HDFS-2338</a>.
+     Major sub-task reported by jnp and fixed by jnp (webhdfs)<br>
+     <b>Configuration option to enable/disable webhdfs.</b><br>
+     <blockquote>                                          Added a conf property dfs.webhdfs.enabled for enabling/disabling webhdfs.
+
+      
+</blockquote></li>
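+<p>A one-line sketch of the new switch; only the key name is from the note above:</p>
+<pre>
+import org.apache.hadoop.conf.Configuration;
+
+public class EnableWebHdfs {
+  public static void main(String[] args) {
+    Configuration conf = new Configuration();
+    conf.setBoolean("dfs.webhdfs.enabled", true); // set to false to disable WebHDFS
+  }
+}
+</pre>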
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2465">HDFS-2465</a>.
+     Major improvement reported by tlipcon and fixed by tlipcon (data-node, performance)<br>
+     <b>Add HDFS support for fadvise readahead and drop-behind</b><br>
+     <blockquote>                    HDFS now has the ability to use posix_fadvise and sync_data_range syscalls to manage the OS buffer cache. This support is currently considered experimental, and may be enabled by configuring the following keys:
+<br/>
+
+dfs.datanode.drop.cache.behind.writes - set to true to drop data out of the buffer cache after writing
+<br/>
+
+dfs.datanode.drop.cache.behind.reads - set to true to drop data out of the buffer cache when performing sequential reads
+<br/>
+
+dfs.datanode.sync.behind.writes - set to true to trigger dirty page writeback immediately after writing data
+<br/>
+
+dfs.datanode.readahead.bytes - set to a non-zero value to trigger readahead for sequential reads
+</blockquote></li>
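+<p>A sketch that turns on all four experimental keys together; the key names and meanings are from the note above, while the readahead size is an arbitrary example value:</p>
+<pre>
+import org.apache.hadoop.conf.Configuration;
+
+public class FadviseTuning {
+  public static void main(String[] args) {
+    Configuration conf = new Configuration();
+    conf.setBoolean("dfs.datanode.drop.cache.behind.writes", true);
+    conf.setBoolean("dfs.datanode.drop.cache.behind.reads", true);
+    conf.setBoolean("dfs.datanode.sync.behind.writes", true);
+    // Non-zero enables readahead for sequential reads; 4 MB is only an example.
+    conf.setLong("dfs.datanode.readahead.bytes", 4L * 1024 * 1024);
+  }
+}
+</pre>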
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2741">HDFS-2741</a>.
+     Minor bug reported by markus17 and fixed by  <br>
+     <b>dfs.datanode.max.xcievers missing in 0.20.205.0</b><br>
+     <blockquote>                                          Document and raise the maximum allowed transfer threads on a DataNode to 4096. This helps Apache HBase in particular.
+
+      
+</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-3044">HDFS-3044</a>.
+     Major improvement reported by eli2 and fixed by cmccabe (name-node)<br>
+     <b>fsck move should be non-destructive by default</b><br>
+     <blockquote>                    The fsck &quot;move&quot; option is no longer destructive. It copies the accessible blocks of corrupt files to lost and found as before, but no longer deletes the corrupt files after copying the blocks. The original, destructive behavior can be enabled by specifying both the &quot;move&quot; and &quot;delete&quot; options. 
+</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-3055">HDFS-3055</a>.
+     Minor new feature reported by cmccabe and fixed by cmccabe <br>
+     <b>Implement recovery mode for branch-1</b><br>
+     <blockquote>                                          This is a new feature.  It is documented in hdfs_user_guide.xml.
+
+      
+</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-3094">HDFS-3094</a>.
+     Major improvement reported by arpitgupta and fixed by arpitgupta <br>
+     <b>add -nonInteractive and -force option to namenode -format command</b><br>
+     <blockquote>                                          The &#39;namenode -format&#39; command now supports the flags &#39;-nonInteractive&#39; and &#39;-force&#39; so that it can be used without interactive user input.
+
+      
+</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-3518">HDFS-3518</a>.
+     Major bug reported by bikassaha and fixed by szetszwo (hdfs client)<br>
+     <b>Provide API to check HDFS operational state</b><br>
+     <blockquote>                                          Add a utility method HdfsUtils.isHealthy(uri) for checking if the given HDFS is healthy.
+
+      
+</blockquote></li>
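+<p>A minimal sketch of the new utility; the org.apache.hadoop.hdfs.client package location is assumed from branch-1, and the URI is a placeholder:</p>
+<pre>
+import java.net.URI;
+import org.apache.hadoop.hdfs.client.HdfsUtils;
+
+public class HdfsHealthCheck {
+  public static void main(String[] args) throws Exception {
+    // Returns true only if the given HDFS is up and usable.
+    boolean healthy = HdfsUtils.isHealthy(URI.create("hdfs://namenode:8020"));
+    System.out.println("HDFS healthy: " + healthy);
+  }
+}
+</pre>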
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-3522">HDFS-3522</a>.
+     Major bug reported by brandonli and fixed by brandonli (name-node)<br>
+     <b>If NN is in safemode, it should throw SafeModeException when getBlockLocations has zero locations</b><br>
+     <blockquote>                                          getBlockLocations(), and hence open() for read, will now throw SafeModeException if the NameNode is still in safe mode and there are no replicas reported yet for one of the blocks in the file.
+
+      
+</blockquote></li>
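+<p>A hedged sketch of client code reacting to the new behavior; the path and handling are illustrative, and the SafeModeException typically arrives wrapped in an IPC RemoteException:</p>
+<pre>
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.ipc.RemoteException;
+
+public class SafeModeAwareReader {
+  public static void main(String[] args) throws Exception {
+    FileSystem fs = FileSystem.get(new Configuration());
+    try {
+      fs.open(new Path("/data/example.txt")).close();
+    } catch (RemoteException e) {
+      // open() can now fail while the NameNode is in safe mode and a block
+      // of the file has no reported replicas yet; retry later instead of
+      // treating a zero-location result as success.
+      System.err.println("NameNode may still be in safe mode: " + e.getMessage());
+    }
+  }
+}
+</pre>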
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-2517">MAPREDUCE-2517</a>.
+     Major task reported by vinaythota and fixed by vinaythota (contrib/gridmix)<br>
+     <b>Porting Gridmix v3 system tests into trunk branch.</b><br>
+     <blockquote>                                          Adds system tests to Gridmix. These system tests cover various features like job types (load and sleep), user resolvers (round-robin, submitter-user, echo) and  submission modes (stress, replay and serial).
+
+      
+</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-3008">MAPREDUCE-3008</a>.
+     Major sub-task reported by amar_kamat and fixed by amar_kamat (contrib/gridmix)<br>
+     <b>[Gridmix] Improve cumulative CPU usage emulation for short running tasks</b><br>
+     <blockquote>                                          Improves cumulative CPU emulation for short running tasks.
+
+      
+</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-3118">MAPREDUCE-3118</a>.
+     Major new feature reported by ravidotg and fixed by ravidotg (contrib/gridmix, tools/rumen)<br>
+     <b>Backport Gridmix and Rumen features from trunk to Hadoop 0.20 security branch</b><br>
+     <blockquote>                                          Backports latest features from trunk to 0.20.206 branch.
+
+      
+</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-3597">MAPREDUCE-3597</a>.
+     Major improvement reported by ravidotg and fixed by ravidotg (tools/rumen)<br>
+     <b>Provide a way to access other info of history file from Rumentool</b><br>
+     <blockquote>                                          Rumen now provides {{Parsed*}} objects. These objects provide extra information that is not provided by {{Logged*}} objects.
+
+      
+</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-4087">MAPREDUCE-4087</a>.
+     Major bug reported by ravidotg and fixed by ravidotg <br>
+     <b>[Gridmix] GenerateDistCacheData job of Gridmix can become slow in some cases</b><br>
+     <blockquote>                                          Fixes the slowness of the GenerateDistCacheData job in some cases.
+
+      
+</blockquote></li>
+
+</ul>
+
+
+<h3>Other Jiras (describe bug fixes and minor changes)</h3>
+<ul>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-5836">HADOOP-5836</a>.
+     Major bug reported by nowland and fixed by nowland (fs/s3)<br>
+     <b>Bug in S3N handling of directory markers using an object with a trailing &quot;/&quot; causes jobs to fail</b><br>
+     <blockquote>Some tools which upload to S3 use an object terminated with a &quot;/&quot; as a directory marker, for instance &quot;s3n://mybucket/mydir/&quot;. If asked to iterate over that &quot;directory&quot; via listStatus(), the current code will return an empty file &quot;&quot;, which the InputFormatter happily assigns to a split, and which later causes a task to fail, and probably the job to fail. </blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-6527">HADOOP-6527</a>.
+     Major bug reported by jghoman and fixed by ivanmi (security)<br>
+     <b>UserGroupInformation::createUserForTesting clobbers already defined group mappings</b><br>
+     <blockquote>In UserGroupInformation::createUserForTesting the following code creates a new groups instance, obliterating any groups that have been previously defined in the static groups field.<br>{code}    if (!(groups instanceof TestingGroups)) {<br>      groups = new TestingGroups();<br>    }<br>{code}<br>This becomes a problem in tests that start a Mini{DFS,MR}Cluster and then create a testing user.  The user that started the cluster (generally the real user running the test) immediately has their groups wiped out and is...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-6546">HADOOP-6546</a>.
+     Major bug reported by cjjefcoat and fixed by cjjefcoat (io)<br>
+     <b>BloomMapFile can return false negatives</b><br>
+     <blockquote>BloomMapFile can return false negatives when using keys of varying sizes.  If the amount of data written by the write() method of your key class differs between instance of your key, your BloomMapFile may return false negatives.<br></blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-6947">HADOOP-6947</a>.
+     Major bug reported by tlipcon and fixed by tlipcon (security)<br>
+     <b>Kerberos relogin should set refreshKrb5Config to true</b><br>
+     <blockquote>In working on securing a daemon that uses two different principals from different threads, I found that I wasn&apos;t able to login from a second keytab after I&apos;d logged in from the first. This is because we don&apos;t set the refreshKrb5Config in the Configuration for the Krb5LoginModule - hence it won&apos;t switch over to the correct keytab file if it&apos;s different than the first.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7154">HADOOP-7154</a>.
+     Minor improvement reported by tlipcon and fixed by tlipcon (scripts)<br>
+     <b>Should set MALLOC_ARENA_MAX in hadoop-config.sh</b><br>
+     <blockquote>New versions of glibc present in RHEL6 include a new arena allocator design. In several clusters we&apos;ve seen this new allocator cause huge amounts of virtual memory to be used, since when multiple threads perform allocations, they each get their own memory arena. On a 64-bit system, these arenas are 64M mappings, and the maximum number of arenas is 8 times the number of cores. We&apos;ve observed a DN process using 14GB of vmem for only 300M of resident set. This causes all kinds of nasty issues fo...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7297">HADOOP-7297</a>.
+     Trivial bug reported by nonop92 and fixed by qwertymaniac (documentation)<br>
+     <b>Error in the documentation regarding Checkpoint/Backup Node</b><br>
+     <blockquote>On http://hadoop.apache.org/common/docs/r0.20.203.0/hdfs_user_guide.html#Checkpoint+Node: the command bin/hdfs namenode -checkpoint required to launch the backup/checkpoint node does not exist.<br>I have removed this from the docs.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7461">HADOOP-7461</a>.
+     Major bug reported by rbodkin and fixed by gkesavan (build)<br>
+     <b>Jackson Dependency Not Declared in Hadoop POM</b><br>
+     <blockquote>(COMMENT: This bug still affects 0.20.205.0, four months after the bug was filed.  This causes total failure, and the fix is trivial for whoever manages the POM -- just add the missing dependency! --ben)<br><br>This issue was identified and the fix &amp; workaround was documented at <br><br>https://issues.cloudera.org/browse/DISTRO-44<br><br>The issue affects use of Hadoop 0.20.203.0 from the Maven central repo. I built a job using that maven repo and ran it, resulting in this failure:<br><br>Exception in thread &quot;main&quot; ...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7621">HADOOP-7621</a>.
+     Critical bug reported by tucu00 and fixed by atm (security)<br>
+     <b>alfredo config should be in a file not readable by users</b><br>
+     <blockquote>[thanks ATM for pointing this one out]<br><br>Alfredo configuration currently is stored in the core-site.xml file; this file is readable by users (it must be, as Configuration defaults must be loaded).<br><br>One of Alfredo&apos;s config values is a secret which is used by all nodes to sign/verify the authentication cookie.<br><br>A user could get hold of this secret and forge authentication cookies for other users.<br><br>Because of this, the Alfredo configuration should be moved to a file that is not readable by users.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7629">HADOOP-7629</a>.
+     Major bug reported by phunt and fixed by tlipcon <br>
+     <b>regression with MAPREDUCE-2289 - setPermission passed immutable FsPermission (rpc failure)</b><br>
+     <blockquote>MAPREDUCE-2289 introduced the following change:<br><br>{noformat}<br>+        fs.setPermission(stagingArea, JOB_DIR_PERMISSION);<br>{noformat}<br><br>JOB_DIR_PERMISSION is an immutable FsPermission which cannot be used in RPC calls, it results in the following exception:<br><br>{noformat}<br>2011-09-08 16:31:45,187 WARN org.apache.hadoop.ipc.Server: Unable to read call parameters for client 127.0.0.1<br>java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.hadoop.fs.permission.FsPermission$2.&lt;init&gt;()<br>   ...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7634">HADOOP-7634</a>.
+     Minor bug reported by eli and fixed by eli (documentation, security)<br>
+     <b>Cluster setup docs specify wrong owner for task-controller.cfg </b><br>
+     <blockquote>The cluster setup docs indicate task-controller.cfg must be owned by the user running TaskTracker but the code checks for root. We should update the docs to reflect the real requirement.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7645">HADOOP-7645</a>.
+     Blocker bug reported by atm and fixed by jnp (security)<br>
+     <b>HTTP auth tests requiring Kerberos infrastructure are not disabled on branch-0.20-security</b><br>
+     <blockquote>The back-port of HADOOP-7119 to branch-0.20-security included tests which require Kerberos infrastructure in order to run. In trunk and 0.23, these are disabled unless one enables the {{testKerberos}} maven profile. In branch-0.20-security, these tests are always run regardless, and so fail most of the time.<br><br>See this Jenkins build for an example: https://builds.apache.org/view/G-L/view/Hadoop/job/Hadoop-0.20-security/26/</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7653">HADOOP-7653</a>.
+     Minor bug reported by natty and fixed by natty (build)<br>
+     <b>tarball doesn&apos;t include .eclipse.templates</b><br>
+     <blockquote>The hadoop tarball doesn&apos;t include .eclipse.templates. This results in a failure to successfully run ant eclipse-files:<br><br>eclipse-files:<br><br>BUILD FAILED<br>/home/natty/Downloads/hadoop-0.20.2/build.xml:1606: /home/natty/Downloads/hadoop-0.20.2/.eclipse.templates not found.<br><br></blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7664">HADOOP-7664</a>.
+     Minor improvement reported by raviprak and fixed by raviprak (conf)<br>
+     <b>o.a.h.conf.Configuration complains of overriding a final parameter even if the value with which it&apos;s attempting to override is the same. </b><br>
+     <blockquote>o.a.h.conf.Configuration complains of overriding a final parameter even if the value with which it&apos;s attempting to override is the same. </blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7665">HADOOP-7665</a>.
+     Major bug reported by atm and fixed by atm (security)<br>
+     <b>branch-0.20-security doesn&apos;t include SPNEGO settings in core-default.xml</b><br>
+     <blockquote>Looks like back-port of HADOOP-7119 to branch-0.20-security missed the changes to {{core-default.xml}}.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7666">HADOOP-7666</a>.
+     Major bug reported by atm and fixed by atm (security)<br>
+     <b>branch-0.20-security doesn&apos;t include o.a.h.security.TestAuthenticationFilter</b><br>
+     <blockquote>Looks like the back-port of HADOOP-7119 to branch-0.20-security missed {{o.a.h.security.TestAuthenticationFilter}}.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7674">HADOOP-7674</a>.
+     Major bug reported by jnp and fixed by jnp <br>
+     <b>TestKerberosName fails in 20 branch.</b><br>
+     <blockquote>TestKerberosName fails in the 20 branch. In fact this test has been duplicated in 20, with a little change to the rules.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7745">HADOOP-7745</a>.
+     Major bug reported by raviprak and fixed by raviprak <br>
+     <b>I switched variable names in HADOOP-7509</b><br>
+     <blockquote>As Aaron pointed out on https://issues.apache.org/jira/browse/HADOOP-7509?focusedCommentId=13126725&amp;page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13126725 I stupidly swapped CommonConfigurationKeys.HADOOP_SECURITY_AUTHENTICATION with CommonConfigurationKeys.HADOOP_SECURITY_AUTHORIZATION.<br><br></blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7753">HADOOP-7753</a>.
+     Major sub-task reported by tlipcon and fixed by tlipcon (io, native, performance)<br>
+     <b>Support fadvise and sync_data_range in NativeIO, add ReadaheadPool class</b><br>
+     <blockquote>This JIRA adds JNI wrappers for sync_data_range and posix_fadvise. It also implements a ReadaheadPool class for future use from HDFS and MapReduce.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7806">HADOOP-7806</a>.
+     Major new feature reported by qwertymaniac and fixed by qwertymaniac (util)<br>
+     <b>Support binding to sub-interfaces</b><br>
+     <blockquote>Right now, with the {{DNS}} class, we can look up IPs of provided interface names ({{eth0}}, {{vm1}}, etc.). However, it would be useful if the I/F -&gt; IP lookup also took a look at subinterfaces ({{eth0:1}}, etc.) and allowed binding to only a specified subinterface / virtual interface.<br><br>This should be fairly easy to add, by matching against all available interfaces&apos; subinterfaces via Java.</blockquote></li>
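+<p>A small sketch of looking up a subinterface with the existing DNS helper; the subinterface name below is illustrative, and whether a given lookup succeeds depends on the host's interface setup:</p>
+<pre>
+import org.apache.hadoop.net.DNS;
+
+public class SubInterfaceLookup {
+  public static void main(String[] args) throws Exception {
+    // With this change, subinterface names such as "eth0:1" resolve too.
+    for (String ip : DNS.getIPs("eth0:1")) {
+      System.out.println(ip);
+    }
+  }
+}
+</pre>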
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7823">HADOOP-7823</a>.
+     Major new feature reported by tbroberg and fixed by apurtell <br>
+     <b>port HADOOP-4012 to branch-1 (splitting support for bzip2)</b><br>
+     <blockquote>Please see HADOOP-4012 - Providing splitting support for bzip2 compressed files.<br></blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7870">HADOOP-7870</a>.
+     Major bug reported by jmhsieh and fixed by jmhsieh <br>
+     <b>fix SequenceFile#createWriter with boolean createParent arg to respect createParent.</b><br>
+     <blockquote>After HBASE-6840, one set of calls to createNonRecursive(...) seems fishy - the new boolean createParent variable from the signature isn&apos;t used at all.  <br><br>{code}<br>+  public static Writer<br>+    createWriter(FileSystem fs, Configuration conf, Path name,<br>+                 Class keyClass, Class valClass, int bufferSize,<br>+                 short replication, long blockSize, boolean createParent,<br>+                 CompressionType compressionType, CompressionCodec codec,<br>+                 Metadata meta...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7879">HADOOP-7879</a>.
+     Trivial bug reported by jmhsieh and fixed by jmhsieh <br>
+     <b>DistributedFileSystem#createNonRecursive should also incrementWriteOps statistics.</b><br>
+     <blockquote>This method:<br><br>{code}<br> public FSDataOutputStream createNonRecursive(Path f, FsPermission permission,<br>      boolean overwrite,<br>      int bufferSize, short replication, long blockSize, <br>      Progressable progress) throws IOException {<br>    return new FSDataOutputStream<br>        (dfs.create(getPathName(f), permission, <br>                    overwrite, false, replication, blockSize, progress, bufferSize), <br>         statistics);<br>  }<br>{code}<br><br>Needs a statistics.incrementWriteOps(1);</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7898">HADOOP-7898</a>.
+     Minor bug reported by sureshms and fixed by sureshms (security)<br>
+     <b>Fix javadoc warnings in AuthenticationToken.java</b><br>
+     <blockquote>Fix the following javadoc warning:<br>[WARNING] /home/jenkins/jenkins-slave/workspace/PreCommit-HADOOP-Build/trunk/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/server/AuthenticationToken.java:33: warning - Tag @link: reference not found: HttpServletRequest<br>[WARNING] /home/jenkins/jenkins-slave/workspace/PreCommit-HADOOP-Build/trunk/hadoop-common-project/hadoop-auth/src/main/java/org/apache/hadoop/security/authentication/server/AuthenticationToken.java...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7908">HADOOP-7908</a>.
+     Trivial bug reported by eli and fixed by eli (documentation)<br>
+     <b>Fix three javadoc warnings on branch-1</b><br>
+     <blockquote>Fix 3 javadoc warnings on branch-1:<br><br>  [javadoc] /home/eli/src/hadoop-branch-1/src/core/org/apache/hadoop/io/Sequence<br>File.java:428: warning - @param argument &quot;progress&quot; is not a parameter name.<br><br>  [javadoc] /home/eli/src/hadoop-branch-1/src/core/org/apache/hadoop/util/ChecksumUtil.java:32: warning - @param argument &quot;chunkOff&quot; is not a parameter name.<br><br>  [javadoc] /home/eli/src/hadoop-branch-1/src/mapred/org/apache/hadoop/mapred/QueueAclsInfo.java:52: warning - @param argument &quot;queue&quot; is not ...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7942">HADOOP-7942</a>.
+     Major test reported by gkesavan and fixed by jnp <br>
+     <b>enabling clover coverage reports fails hadoop unit test compilation</b><br>
+     <blockquote>enabling clover reports fails compiling the following junit tests.<br>link to the console output of jerkins :<br>https://builds.apache.org/view/G-L/view/Hadoop/job/Hadoop-1-Code-Coverage/13/console<br><br><br><br>{noformat}<br>[javac] /tmp/clover50695626838999169.tmp/org/apache/hadoop/security/TestUserGroupInformation.java:224: cannot find symbol<br>......<br>    [javac] /tmp/clover50695626838999169.tmp/org/apache/hadoop/security/TestUserGroupInformation.java:225: cannot find symbol<br>......<br><br> [javac] /tmp/clover50695626...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7982">HADOOP-7982</a>.
+     Major bug reported by tlipcon and fixed by tlipcon (security)<br>
+     <b>UserGroupInformation fails to login if thread&apos;s context classloader can&apos;t load HadoopLoginModule</b><br>
+     <blockquote>In a few hard-to-reproduce situations, we&apos;ve seen a problem where the UGI login call causes a failure to login exception with the following cause:<br><br>Caused by: javax.security.auth.login.LoginException: unable to find <br>LoginModule class: org.apache.hadoop.security.UserGroupInformation <br>$HadoopLoginModule<br><br>After a bunch of debugging, I determined that this happens when the login occurs in a thread whose Context ClassLoader has been set to null.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7987">HADOOP-7987</a>.
+     Major improvement reported by devaraj and fixed by jnp (security)<br>
+     <b>Support setting the run-as user in unsecure mode</b><br>
+     <blockquote>Some applications need to be able to perform actions (such as launch MR jobs) from map or reduce tasks. In earlier unsecure versions of hadoop (20.x), it was possible to do this by setting user.name in the configuration. But in 20.205 and 1.0, when running in unsecure mode, this does not work. (In secure mode, you can do this using the kerberos credentials).</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-7988">HADOOP-7988</a>.
+     Major bug reported by jnp and fixed by jnp <br>
+     <b>Upper case in hostname part of the principals doesn&apos;t work with kerberos.</b><br>
+     <blockquote>Kerberos doesn&apos;t like upper case in the hostname part of the principals.<br>This issue has been seen in 23 as well as 1.0.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8154">HADOOP-8154</a>.
+     Major bug reported by eli2 and fixed by eli (conf)<br>
+     <b>DNS#getIPs shouldn&apos;t silently return the local host IP for bogus interface names</b><br>
+     <blockquote>DNS#getIPs silently returns the local host IP for bogus interface names. In this case let&apos;s throw an UnknownHostException. This is technically an incompatible change. I suspect the current behavior was originally introduced so the interface name &quot;default&quot; works w/o explicitly checking for it. It may also be used in cases where someone is using a shared config file and an option like &quot;dfs.datanode.dns.interface&quot; or &quot;hbase.master.dns.interface&quot; and eg interface &quot;eth3&quot; that some hosts don&apos;t ha...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8159">HADOOP-8159</a>.
+     Major bug reported by cmccabe and fixed by cmccabe <br>
+     <b>NetworkTopology: getLeaf should check for invalid topologies</b><br>
+     <blockquote>Currently, in NetworkTopology, getLeaf doesn&apos;t do too much validation on the InnerNode object itself. This results in us getting ClassCastException sometimes when the network topology is invalid. We should have a less confusing exception message for this case.<br></blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8209">HADOOP-8209</a>.
+     Major improvement reported by eli2 and fixed by eli <br>
+     <b>Add option to relax build-version check for branch-1</b><br>
+     <blockquote>In 1.x DNs currently refuse to connect to NNs if their build *revision* (ie svn revision) does not match. TTs refuse to connect to JTs if their build *version* (version, revision, user, and source checksum) does not match.<br><br>This prevents rolling upgrades, which is intentional, see the discussion in HADOOP-5203. The primary motivation in that jira was (1) it&apos;s difficult to guarantee every build on a large cluster got deployed correctly, builds don&apos;t get rolled back to old versions by accident etc,...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8251">HADOOP-8251</a>.
+     Blocker bug reported by tlipcon and fixed by tlipcon (security)<br>
+     <b>SecurityUtil.fetchServiceTicket broken after HADOOP-6941</b><br>
+     <blockquote>HADOOP-6941 replaced direct references to some classes with reflective access so as to support other JDKs. Unfortunately there was a mistake in the name of the Krb5Util class, which broke fetchServiceTicket. This manifests itself as the inability to run checkpoints or other krb5-SSL HTTP-based transfers:<br><br>java.lang.ClassNotFoundException: sun.security.jgss.krb5</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8269">HADOOP-8269</a>.
+     Trivial bug reported by eli2 and fixed by eli (documentation)<br>
+     <b>Fix some javadoc warnings on branch-1</b><br>
+     <blockquote>There are some javadoc warnings on branch-1, let&apos;s fix them.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8293">HADOOP-8293</a>.
+     Major bug reported by owen.omalley and fixed by owen.omalley (build)<br>
+     <b>The native library&apos;s Makefile.am doesn&apos;t include JNI path</b><br>
+     <blockquote>When compiling on centos 6, I get the following error when compiling the native library:<br><br>{code}<br> [exec] /usr/bin/ld: cannot find -ljvm<br>{code}<br><br>The problem is simply that the Makefile.am libhadoop_la_LDFLAGS doesn&apos;t include AM_LDFLAGS.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8314">HADOOP-8314</a>.
+     Major bug reported by tucu00 and fixed by tucu00 (security)<br>
+     <b>HttpServer#hasAdminAccess should return false if authorization is enabled but user is not authenticated</b><br>
+     <blockquote>If the user is not authenticated (request.getRemoteUser() returns NULL) or there is no authentication filter configured (thus also returning NULL), hasAdminAccess should return false. Note that a filter could allow anonymous access, hence the first case.<br></blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8329">HADOOP-8329</a>.
+     Major bug reported by kumarr and fixed by eli (build)<br>
+     <b>Build fails with Java 7</b><br>
+     <blockquote>I am seeing the following message running IBM Java 7 running branch-1.0 code.<br>compile:<br>[echo] contrib: gridmix<br>[javac] Compiling 31 source files to /home/hadoop/branch-1.0_0427/build/contrib/gridmix/classes<br>[javac] /home/hadoop/branch-1.0_0427/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/Gridmix.java:396: error: type argument ? extends T is not within bounds of type-variable E<br>[javac] private &lt;T&gt; String getEnumValues(Enum&lt;? extends T&gt;[] e) {<br>[javac] ^<br>[javac] where T,E are ty...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8338">HADOOP-8338</a>.
+     Major bug reported by owen.omalley and fixed by owen.omalley (security)<br>
+     <b>Can&apos;t renew or cancel HDFS delegation tokens over secure RPC</b><br>
+     <blockquote>The fetchdt tool is failing for secure deployments when given --renew or --cancel on tokens fetched using RPC. (The tokens fetched over HTTP can be renewed and canceled fine.)</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8399">HADOOP-8399</a>.
+     Major bug reported by cos and fixed by cos (build)<br>
+     <b>Remove JDK5 dependency from Hadoop 1.0+ line</b><br>
+     <blockquote>This issue has been fixed in Hadoop starting from 0.21 (see HDFS-1552).<br>I propose to make the same fix for the 1.0 line and get rid of the JDK5 dependency altogether.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8417">HADOOP-8417</a>.
+     Major bug reported by zhihyu@ebaysf.com and fixed by zhihyu@ebaysf.com <br>
+     <b>HADOOP-6963 didn&apos;t update hadoop-core-pom-template.xml</b><br>
+     <blockquote>HADOOP-6963 introduced commons-io 2.1 in ivy.xml but forgot to update the hadoop-core-pom-template.xml.<br><br>This has caused map reduce jobs in downstream projects to fail with:<br>{code}<br>Caused by: java.lang.ClassNotFoundException: org.apache.commons.io.FileUtils<br>	at java.net.URLClassLoader$1.run(URLClassLoader.java:202)<br>	at java.security.AccessController.doPrivileged(Native Method)<br>	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)<br>	at java.lang.ClassLoader.loadClass(ClassLoader.java:3...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8430">HADOOP-8430</a>.
+     Major improvement reported by eli2 and fixed by eli <br>
+     <b>Backport new FileSystem methods introduced by HADOOP-8014 to branch-1 </b><br>
+     <blockquote>Per HADOOP-8422 let&apos;s backport the new FileSystem methods from HADOOP-8014 to branch-1 so users can transition over in Hadoop 1.x releases, which helps upstream projects like HBase work against federation (see HBASE-6067). </blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8445">HADOOP-8445</a>.
+     Major bug reported by raviprak and fixed by raviprak (security)<br>
+     <b>Token should not print the password in toString</b><br>
+     <blockquote>This JIRA is for porting HADOOP-6622 to branch-1 since 6622 is already closed.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HADOOP-8552">HADOOP-8552</a>.
+     Major bug reported by kkambatl and fixed by kkambatl (conf, security)<br>
+     <b>Conflict: Same security.log.file for multiple users. </b><br>
+     <blockquote>In log4j.properties, hadoop.security.log.file is set to SecurityAuth.audit. In the presence of multiple users, this can lead to a potential conflict.<br><br>Adding username to the log file would avoid this scenario.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-1108">HDFS-1108</a>.
+     Major sub-task reported by dhruba and fixed by tlipcon (ha, name-node)<br>
+     <b>Log newly allocated blocks</b><br>
+     <blockquote>The current HDFS design says that newly allocated blocks for a file are not persisted in the NN transaction log when the block is allocated. Instead, a hflush() or a close() on the file persists the blocks into the transaction log. It would be nice if we can immediately persist newly allocated blocks (as soon as they are allocated) for specific files.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-1378">HDFS-1378</a>.
+     Major improvement reported by tlipcon and fixed by cmccabe (name-node)<br>
+     <b>Edit log replay should track and report file offsets in case of errors</b><br>
+     <blockquote>Occasionally there are bugs or operational mistakes that result in corrupt edit logs which I end up having to repair by hand. In these cases it would be very handy to have the error message also print out the file offsets of the last several edit log opcodes so it&apos;s easier to find the right place to edit in the OP_INVALID marker. We could also use this facility to provide a rough estimate of how far along edit log replay the NN is during startup (handy when a 2NN has died and replay takes a w...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-1910">HDFS-1910</a>.
+     Minor bug reported by slukog and fixed by  (name-node)<br>
+     <b>when dfs.name.dir and dfs.name.edits.dir are same fsimage will be saved twice every time</b><br>
+     <blockquote>When the image and edits dirs are configured to be the same, the fsimage is flushed from memory to disk twice whenever saveNamespace is done. This may impact the performance of the backupnode/snn, which does a saveNamespace during every checkpoint.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2065">HDFS-2065</a>.
+     Major bug reported by bharathm and fixed by umamaheswararao <br>
+     <b>Fix NPE in DFSClient.getFileChecksum</b><br>
+     <blockquote>The following code can throw NPE if callGetBlockLocations returns null.<br><br>If the server returns null:<br><br>{code}<br>    List&lt;LocatedBlock&gt; locatedblocks<br>        = callGetBlockLocations(namenode, src, 0, Long.MAX_VALUE).getLocatedBlocks();<br>{code}<br><br>The right fix is for the server to throw the right exception.<br><br></blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2305">HDFS-2305</a>.
+     Major bug reported by atm and fixed by atm (name-node)<br>
+     <b>Running multiple 2NNs can result in corrupt file system</b><br>
+     <blockquote>Here&apos;s the scenario:<br><br>* You run the NN and 2NN (2NN A) on the same machine.<br>* You don&apos;t have the address of the 2NN configured, so it&apos;s defaulting to 127.0.0.1.<br>* There&apos;s another 2NN (2NN B) running on a second machine.<br>* When a 2NN is done checkpointing, it says &quot;hey NN, I have an updated fsimage for you. You can download it from this URL, which includes my IP address, which is x&quot;<br><br>And here&apos;s the steps that occur to cause this issue:<br><br># Some edits happen.<br># 2NN A (on the NN machine) does a c...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2317">HDFS-2317</a>.
+     Major sub-task reported by szetszwo and fixed by szetszwo <br>
+     <b>Read access to HDFS using HTTP REST</b><br>
+     <blockquote></blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2331">HDFS-2331</a>.
+     Major bug reported by abhijit.shingate and fixed by abhijit.shingate (hdfs client)<br>
+     <b>Hdfs compilation fails</b><br>
+     <blockquote>I am trying to perform complete build from trunk folder but the compilation fails.<br><br>*Commandline:*<br>mvn clean install  <br><br>*Error Message:*<br><br>[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.<br>3.2:compile (default-compile) on project hadoop-hdfs: Compilation failure<br>[ERROR] \Hadoop\SVN\trunk\hadoop-hdfs-project\hadoop-hdfs\src\main\java\org<br>\apache\hadoop\hdfs\web\WebHdfsFileSystem.java:[209,21] type parameters of &lt;T&gt;T<br>cannot be determined; no unique maximal instance...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2332">HDFS-2332</a>.
+     Major test reported by tlipcon and fixed by tlipcon (test)<br>
+     <b>Add test for HADOOP-7629: using an immutable FsPermission as an IPC parameter</b><br>
+     <blockquote>HADOOP-7629 fixes a bug where an immutable FsPermission would throw an error if used as the argument to fs.setPermission(). This JIRA is to add a test case for the common bugfix.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2333">HDFS-2333</a>.
+     Major bug reported by ikelly and fixed by szetszwo <br>
+     <b>HDFS-2284 introduced 2 findbugs warnings on trunk</b><br>
+     <blockquote>When HDFS-2284 was submitted it made DFSOutputStream public which triggered two SC_START_IN_CTOR findbug warnings.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2340">HDFS-2340</a>.
+     Major sub-task reported by szetszwo and fixed by szetszwo (webhdfs)<br>
+     <b>Support getFileBlockLocations and getDelegationToken in webhdfs</b><br>
+     <blockquote></blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2348">HDFS-2348</a>.
+     Major sub-task reported by szetszwo and fixed by szetszwo (webhdfs)<br>
+     <b>Support getContentSummary and getFileChecksum in webhdfs</b><br>
+     <blockquote></blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2356">HDFS-2356</a>.
+     Major sub-task reported by szetszwo and fixed by szetszwo (webhdfs)<br>
+     <b>webhdfs: support case insensitive query parameter names</b><br>
+     <blockquote></blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2361">HDFS-2361</a>.
+     Critical bug reported by rajsaha and fixed by jnp (name-node)<br>
+     <b>hftp is broken</b><br>
+     <blockquote>Distcp with hftp is failing.<br><br>{noformat}<br>$hadoop   distcp hftp://&lt;NNhostname&gt;:50070/user/hadoopqa/1316814737/newtemp 1316814737/as<br>11/09/23 21:52:33 INFO tools.DistCp: srcPaths=[hftp://&lt;NNhostname&gt;:50070/user/hadoopqa/1316814737/newtemp]<br>11/09/23 21:52:33 INFO tools.DistCp: destPath=1316814737/as<br>Retrieving token from: https://&lt;NN IP&gt;:50470/getDelegationToken<br>Retrieving token from: https://&lt;NN IP&gt;:50470/getDelegationToken?renewer=mapred<br>11/09/23 21:52:34 INFO security.TokenCache: Got dt for h...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2366">HDFS-2366</a>.
+     Major sub-task reported by arpitgupta and fixed by szetszwo (webhdfs)<br>
+     <b>webhdfs throws a npe when ugi is null from getDelegationToken</b><br>
+     <blockquote></blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2368">HDFS-2368</a>.
+     Major bug reported by arpitgupta and fixed by szetszwo <br>
+     <b>defaults created for web keytab and principal, these properties should not have defaults</b><br>
+     <blockquote>the following defaults are set in hdfs-defaults.xml<br><br>&lt;property&gt;<br>  &lt;name&gt;dfs.web.authentication.kerberos.principal&lt;/name&gt;<br>  &lt;value&gt;HTTP/${dfs.web.hostname}@${kerberos.realm}&lt;/value&gt;<br>  &lt;description&gt;<br>    The HTTP Kerberos principal used by Hadoop-Auth in the HTTP endpoint.<br><br>    The HTTP Kerberos principal MUST start with &apos;HTTP/&apos; per Kerberos<br>    HTTP SPENGO specification.<br>  &lt;/description&gt;<br>&lt;/property&gt;<br><br>&lt;property&gt;<br>  &lt;name&gt;dfs.web.authentication.kerberos.keytab&lt;/name&gt;<br>  &lt;value&gt;${user.home}/dfs.web....</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2427">HDFS-2427</a>.
+     Major sub-task reported by arpitgupta and fixed by szetszwo (webhdfs)<br>
+     <b>webhdfs mkdirs api call creates path with 777 permission, we should default it to 755</b><br>
+     <blockquote></blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2432">HDFS-2432</a>.
+     Major sub-task reported by arpitgupta and fixed by szetszwo (webhdfs)<br>
+     <b>webhdfs setreplication api should return a 403 when called on a directory</b><br>
+     <blockquote>Currently the set replication api on a directory leads to a 200.<br><br>Request URI http://NN:50070/webhdfs/tmp/webhdfs_data/dir_replication_tests?op=SETREPLICATION&amp;replication=5<br>Request Method: PUT<br>Status Line: HTTP/1.1 200 OK<br>Response Content: {&quot;boolean&quot;:false}<br><br>Since we can determine that this call did not succeed (boolean=false) we should rather just return a 403</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2453">HDFS-2453</a>.
+     Major sub-task reported by arpitgupta and fixed by szetszwo (webhdfs)<br>
+     <b>tail using a webhdfs uri throws an error</b><br>
+     <blockquote>/usr//bin/hadoop --config /etc/hadoop dfs -tail webhdfs://NN:50070/file <br>tail: HTTP_PARTIAL expected, received 200<br></blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2494">HDFS-2494</a>.
+     Major sub-task reported by umamaheswararao and fixed by umamaheswararao (webhdfs)<br>
+     <b>[webhdfs] When Getting the file using OP=OPEN with DN http address, ESTABLISHED sockets are growing.</b><br>
+     <blockquote>As part of the reliability test,<br>Scenario:<br>Initially check the socket count --- there are around 42 sockets.<br>Open the file with the DataNode http address using the op=OPEN request parameter about 500 times in a loop.<br>Wait for some time and check the socket count --- thousands of ESTABLISHED sockets have accumulated, ~2052.<br><br>Here is the netstat result:<br><br>C:\Users\uma&gt;netstat | grep 127.0.0.1 | grep ESTABLISHED |wc -l<br>2042<br>C:\Users\uma&gt;netstat | grep 127.0.0.1 | grep ESTABLISHED |wc -l<br>2042<br>C:\...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2501">HDFS-2501</a>.
+     Major sub-task reported by szetszwo and fixed by szetszwo (webhdfs)<br>
+     <b>add version prefix and root methods to webhdfs</b><br>
+     <blockquote></blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2541">HDFS-2541</a>.
+     Major bug reported by qwertymaniac and fixed by qwertymaniac (data-node)<br>
+     <b>For a sufficiently large value of blocks, the DN Scanner may request a random number with a negative seed value.</b><br>
+     <blockquote>Running off 0.20-security, I noticed that one could get the following exception when scanners are used:<br><br>{code}<br>DataXceiver <br>java.lang.IllegalArgumentException: n must be positive <br>at java.util.Random.nextInt(Random.java:250) <br>at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.getNewBlockScanTime(DataBlockScanner.java:251) <br>at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.addBlock(DataBlockScanner.java:268) <br>at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(Da...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2547">HDFS-2547</a>.
+     Trivial bug reported by qwertymaniac and fixed by qwertymaniac (name-node)<br>
+     <b>ReplicationTargetChooser has incorrect block placement comments</b><br>
+     <blockquote>{code}<br>/** The class is responsible for choosing the desired number of targets<br> * for placing block replicas.<br> * The replica placement strategy is that if the writer is on a datanode,<br> * the 1st replica is placed on the local machine, <br> * otherwise a random datanode. The 2nd replica is placed on a datanode<br> * that is on a different rack. The 3rd replica is placed on a datanode<br> * which is on the same rack as the **first replca**.<br> */<br>{code}<br><br>That should read &quot;second replica&quot;. The test cases c...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2552">HDFS-2552</a>.
+     Major task reported by szetszwo and fixed by szetszwo (webhdfs)<br>
+     <b>Add WebHdfs Forrest doc</b><br>
+     <blockquote></blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2590">HDFS-2590</a>.
+     Major bug reported by szetszwo and fixed by szetszwo (webhdfs)<br>
+     <b>Some links in WebHDFS forrest doc do not work</b><br>
+     <blockquote>Some links are pointing to DistributedFileSystem javadoc but the javadoc of DistributedFileSystem is not generated by default.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2604">HDFS-2604</a>.
+     Minor improvement reported by szetszwo and fixed by szetszwo (webhdfs)<br>
+     <b>Add a log message to show if WebHDFS is enabled</b><br>
+     <blockquote>WebHDFS can be enabled/disabled by the conf key {{dfs.webhdfs.enabled}}.  Let&apos;s add a log message to show if it is enabled.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2637">HDFS-2637</a>.
+     Major bug reported by eli and fixed by eli (hdfs client)<br>
+     <b>The rpc timeout for block recovery is too low </b><br>
+     <blockquote>The RPC timeout for block recovery does not take into account that it issues multiple RPCs itself. This can cause recovery to fail if the network is congested or DNs are busy.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2638">HDFS-2638</a>.
+     Minor improvement reported by eli and fixed by eli (name-node)<br>
+     <b>Improve a block recovery log</b><br>
+     <blockquote>It would be useful to know whether an attempt to recover a block is failing because the block was already recovered (has a new GS) or the block is missing.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2653">HDFS-2653</a>.
+     Major improvement reported by eli and fixed by eli (data-node)<br>
+     <b>DFSClient should cache whether addrs are non-local when short-circuiting is enabled</b><br>
+     <blockquote>Something Todd mentioned to me off-line.. currently DFSClient doesn&apos;t cache the fact that non-local reads are non-local, so if short-circuiting is enabled every time we create a block reader we&apos;ll go through the isLocalAddress code path. We should cache the fact that an addr is non-local as well.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2654">HDFS-2654</a>.
+     Major improvement reported by eli and fixed by eli (data-node)<br>
+     <b>Make BlockReaderLocal not extend RemoteBlockReader2</b><br>
+     <blockquote>The BlockReaderLocal code paths are easier to understand (especially true on branch-1 where BlockReaderLocal inherits code from BlockReader and FSInputChecker) if the local and remote block reader implementations are independent, and they&apos;re not really sharing much code anyway. If for some reason they start to share significant code we can make the BlockReader interface an abstract class.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2728">HDFS-2728</a>.
+     Minor bug reported by qwertymaniac and fixed by qwertymaniac (name-node)<br>
+     <b>Remove dfsadmin -printTopology from branch-1 docs since it does not exist</b><br>
+     <blockquote>It is documented we have -printTopology but we do not really have it in this branch. Possible docs mixup from somewhere in security branch pre-merge?<br><br>{code}<br>?  branch-1  grep printTopology -R .<br>./src/docs/src/documentation/content/xdocs/.svn/text-base/hdfs_user_guide.xml.svn-base:      &lt;code&gt;-printTopology&lt;/code&gt;<br>./src/docs/src/documentation/content/xdocs/hdfs_user_guide.xml:      &lt;code&gt;-printTopology&lt;/code&gt;<br>{code}<br><br>Lets remove the reference.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2790">HDFS-2790</a>.
+     Minor bug reported by arpitgupta and fixed by arpitgupta <br>
+     <b>FSNamesystem.setTimes throws exception with wrong configuration name in the message</b><br>
+     <blockquote>the api throws this message when hdfs is not configured for accessTime<br><br>&quot;Access time for hdfs is not configured.  Please set dfs.support.accessTime configuration parameter.&quot;<br><br><br>The property name should be dfs.access.time.precision</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2869">HDFS-2869</a>.
+     Minor bug reported by qwertymaniac and fixed by qwertymaniac (webhdfs)<br>
+     <b>Error in Webhdfs documentation for mkdir</b><br>
+     <blockquote>Reported over the lists by user Stuti Awasthi:<br><br>{quote}<br><br>I have tried the webhdfs functionality of Hadoop-1.0.0 and it is working fine.<br>Just a small change is required in the documentation :<br><br>Make a Directory declaration in documentation:<br>curl -i -X PUT &quot;http://&lt;HOST&gt;:&lt;PORT&gt;/&lt;PATH&gt;?op=MKDIRS[&amp;permission=&lt;OCTAL&gt;]&quot;<br><br>Gives following error :<br>HTTP/1.1 405 HTTP method PUT is not supported by this URL<br>Content-Length: 0<br>Server: Jetty(6.1.26)<br><br>Correction Required : This works for me<br>curl -i -X PUT &quot;ht...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2872">HDFS-2872</a>.
+     Major improvement reported by tlipcon and fixed by cmccabe (name-node)<br>
+     <b>Add sanity checks during edits loading that generation stamps are non-decreasing</b><br>
+     <blockquote>In 0.23 and later versions, we have a txid per edit, and the loading process verifies that there are no gaps. Lacking this in 1.0, we can use generation stamps as a proxy - the OP_SET_GENERATION_STAMP opcode should never result in a decreased genstamp. If it does, that would indicate that the edits are corrupt, or older edits are being applied to a newer checkpoint, for example.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-2877">HDFS-2877</a>.
+     Major bug reported by tlipcon and fixed by tlipcon (name-node)<br>
+     <b>If locking of a storage dir fails, it will remove the other NN&apos;s lock file on exit</b><br>
+     <blockquote>In {{Storage.tryLock()}}, we call {{lockF.deleteOnExit()}} regardless of whether we successfully lock the directory. So, if another NN has the directory locked, then we&apos;ll fail to lock it the first time we start another NN. But our failed start attempt will still remove the other NN&apos;s lockfile, and a second attempt will erroneously start.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-3008">HDFS-3008</a>.
+     Major bug reported by eli2 and fixed by eli (hdfs client)<br>
+     <b>Negative caching of local addrs doesn&apos;t work</b><br>
+     <blockquote>HDFS-2653 added negative caching of local addrs, however it still goes through the fall through path every time if the address is non-local. </blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-3078">HDFS-3078</a>.
+     Major bug reported by eli2 and fixed by eli <br>
+     <b>2NN https port setting is broken</b><br>
+     <blockquote>The code in SecondaryNameNode.java to set the https port is broken, if the port is set it sets the bind addr to &quot;addr:addr:port&quot; which is bogus. Even if it did work it uses port 0 instead of port 50490 (default listed in ./src/packages/templates/conf/hdfs-site.xml).<br><br></blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-3129">HDFS-3129</a>.
+     Minor test reported by cmccabe and fixed by cmccabe <br>
+     <b>NetworkTopology: add test that getLeaf should check for invalid topologies</b><br>
+     <blockquote></blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-3131">HDFS-3131</a>.
+     Minor improvement reported by szetszwo and fixed by brandonli <br>
+     <b>Improve TestStorageRestore</b><br>
+     <blockquote>Aaron has the following comments on TestStorageRestore in HDFS-3127.<br><br># removeStorageAccess, restoreAccess, and numStorageDirs can all be made private<br># numStorageDirs can be made static<br># Rather than do set(Readable/Executable/Writable), use FileUtil.chmod(...).<br># Please put the contents of the test in a try/finally, with the calls to shutdown the cluster and the 2NN in the finally block.<br># Some lines are over 80 chars.<br># No need for the numDatanodes variable - it&apos;s only used in one place.<br>#...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-3148">HDFS-3148</a>.
+     Major new feature reported by eli2 and fixed by eli (hdfs client, performance)<br>
+     <b>The client should be able to use multiple local interfaces for data transfer</b><br>
+     <blockquote>HDFS-3147 covers using multiple interfaces on the server (Datanode) side. Clients should also be able to utilize multiple *local* interfaces for outbound connections instead of always using the interface for the local hostname. This can be accomplished with a new configuration parameter ({{dfs.client.local.interfaces}}) that accepts a list of interfaces the client should use. Acceptable configuration values are the same as the {{dfs.datanode.available.interfaces}} parameter. The client binds ...</blockquote></li>
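+
+<p>For example, in client code (the interface names below are placeholders):</p>
+<pre>
+import org.apache.hadoop.conf.Configuration;
+
+class LocalInterfacesExample {
+  static Configuration clientConf() {
+    Configuration conf = new Configuration();
+    // Bind outbound DFSClient connections to one of these local interfaces.
+    conf.set("dfs.client.local.interfaces", "eth0,eth2");
+    return conf;
+  }
+}
+</pre>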
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-3150">HDFS-3150</a>.
+     Major new feature reported by eli2 and fixed by eli (data-node, hdfs client)<br>
+     <b>Add option for clients to contact DNs via hostname</b><br>
+     <blockquote>The DN listens on multiple IP addresses (the default {{dfs.datanode.address}} is the wildcard); however, per HADOOP-6867, only the source address (IP) of the registration is given to clients. HADOOP-985 made clients access datanodes by IP primarily to avoid the latency of a DNS lookup; this had the side effect of breaking DN multihoming (the client can not route the IP exposed by the NN if the DN registers with an interface that has a cluster-private IP). To fix this let&apos;s add back the option fo...</blockquote></li>
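+
+<p>A client would opt in via configuration, e.g. (a sketch; treat the property name as an assumption about what this jira adds):</p>
+<pre>
+import org.apache.hadoop.conf.Configuration;
+
+class DatanodeHostnameExample {
+  static Configuration clientConf() {
+    Configuration conf = new Configuration();
+    // Connect to datanodes by the hostname they registered with, rather
+    // than the (possibly unroutable) IP recorded by the NN.
+    conf.setBoolean("dfs.client.use.datanode.hostname", true);
+    return conf;
+  }
+}
+</pre>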
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-3176">HDFS-3176</a>.
+     Major bug reported by kihwal and fixed by kihwal (hdfs client)<br>
+     <b>JsonUtil should not parse the MD5MD5CRC32FileChecksum bytes on its own.</b><br>
+     <blockquote>Currently the JsonUtil used by webhdfs parses the MD5MD5CRC32FileChecksum binary bytes on its own and constructs an MD5MD5CRC32FileChecksum. It should instead call MD5MD5CRC32FileChecksum.readFields().</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-3330">HDFS-3330</a>.
+     Critical bug reported by tlipcon and fixed by tlipcon (name-node)<br>
+     <b>If GetImageServlet throws an Error or RTE, response has HTTP &quot;OK&quot; status</b><br>
+     <blockquote>Currently in GetImageServlet, we catch Exception but not Errors or RTEs. So, if the code ends up throwing one of these, the &quot;response.sendError()&quot; code doesn&apos;t run, but the finally clause does run. This results in the servlet returning HTTP 200 OK and an empty response, which causes the client to think it got a successful image transfer.</blockquote></li>
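+
+<p>The fix pattern, sketched with illustrative names: catching Throwable rather than only Exception ensures Errors and RTEs also yield an error status instead of falling through to the finally clause with an implicit 200 OK.</p>
+<pre>
+import java.io.IOException;
+import javax.servlet.http.HttpServletResponse;
+
+class ImageServletSketch {
+  void serve(HttpServletResponse response) throws IOException {
+    try {
+      transferImage(response);
+    } catch (Throwable t) {
+      // Covers Exception, RuntimeException and Error alike.
+      response.sendError(HttpServletResponse.SC_INTERNAL_SERVER_ERROR,
+          t.getMessage());
+    } finally {
+      cleanup();  // runs either way; no longer implies success
+    }
+  }
+
+  void transferImage(HttpServletResponse response) throws IOException { }
+  void cleanup() { }
+}
+</pre>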
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-3453">HDFS-3453</a>.
+     Major bug reported by kihwal and fixed by kihwal (hdfs client)<br>
+     <b>HDFS does not use ClientProtocol in a backward-compatible way</b><br>
+     <blockquote>HDFS-617 was brought into branch-0.20-security/branch-1 to support non-recursive create, along with HADOOP-6840 and HADOOP-6886. However, the changes in HDFS were done in an incompatible way, making the client unusable against older clusters even when plain old create() is called. This is because DFS now internally calls create() through the newly introduced method. By simply changing how the methods are wired internally, we can remove this limitation. We may eventually switch back to the app...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-3504">HDFS-3504</a>.
+     Major improvement reported by sseth and fixed by szetszwo <br>
+     <b>Configurable retry in DFSClient</b><br>
+     <blockquote>When NN maintenance is performed on a large cluster, jobs end up failing. This is particularly bad for long running jobs. The client retry policy could be made configurable so that jobs don&apos;t need to be restarted.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-3516">HDFS-3516</a>.
+     Major improvement reported by szetszwo and fixed by szetszwo (hdfs client)<br>
+     <b>Check content-type in WebHdfsFileSystem</b><br>
+     <blockquote>WebHdfsFileSystem currently tries to parse the response as JSON. It may be a good idea to check the content-type before parsing it.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-3551">HDFS-3551</a>.
+     Major bug reported by szetszwo and fixed by szetszwo (webhdfs)<br>
+     <b>WebHDFS CREATE does not use client location for redirection</b><br>
+     <blockquote>CREATE currently redirects the client to a random datanode instead of using the client location information.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/HDFS-3652">HDFS-3652</a>.
+     Blocker bug reported by tlipcon and fixed by tlipcon (name-node)<br>
+     <b>1.x: FSEditLog failure removes the wrong edit stream when storage dirs have same name</b><br>
+     <blockquote>In {{FSEditLog.removeEditsForStorageDir}}, we iterate over the edits streams trying to find the stream corresponding to a given dir. To check equality, we currently use the following condition:<br>{code}<br>      File parentDir = getStorageDirForStream(idx);<br>      if (parentDir.getName().equals(sd.getRoot().getName())) {<br>{code}<br>... which is horribly incorrect. If two or more storage dirs happen to have the same terminal path component (e.g. /data/1/nn and /data/2/nn) then it will pick the wrong strea...</blockquote></li>
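+
+<p>A self-contained illustration of why comparing terminal path components is wrong (not the actual patch; the fix is to compare full storage directory paths):</p>
+<pre>
+import java.io.File;
+
+class StorageDirNameDemo {
+  public static void main(String[] args) {
+    File a = new File("/data/1/nn");
+    File b = new File("/data/2/nn");
+    // getName() compares only the last path component, so two distinct
+    // storage dirs look identical -- the bug described above:
+    System.out.println(a.getName().equals(b.getName()));  // true
+    // Comparing full paths distinguishes them:
+    System.out.println(a.getPath().equals(b.getPath()));  // false
+  }
+}
+</pre>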
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-1740">MAPREDUCE-1740</a>.
+     Major bug reported by tlipcon and fixed by ahmed.radwan (jobtracker)<br>
+     <b>NPE in getMatchingLevelForNodes when node locations are variable depth</b><br>
+     <blockquote>In getMatchingLevelForNodes, we assume that both nodes have the same &quot;depth&quot; (i.e. the same number of path components). If the user provides a topology script that assigns one node a path like /foo/bar/baz and another node a path like /foo/blah, this function will throw an NPE.<br><br>I&apos;m not sure if there are other places where we assume that all node locations have a constant number of paths. If so we should check the output of the topology script aggressively to be sure this is the case. Otherwise I think ...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-2073">MAPREDUCE-2073</a>.
+     Trivial test reported by tlipcon and fixed by tlipcon (distributed-cache, test)<br>
+     <b>TestTrackerDistributedCacheManager should be up-front about requirements on build environment</b><br>
+     <blockquote>TestTrackerDistributedCacheManager will fail on a system where the build directory is in any path where an ancestor doesn&apos;t have a+x permissions. On one of our hudson boxes, for example, hudson&apos;s workspace had 700 permissions and caused this test to fail reliably, but not in an obvious manner. It would be helpful if the test failed with a more obvious error message during setUp() when the build environment is misconfigured.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-2103">MAPREDUCE-2103</a>.
+     Trivial improvement reported by tlipcon and fixed by tlipcon (task-controller)<br>
+     <b>task-controller shouldn&apos;t require o-r permissions</b><br>
+     <blockquote>The task-controller currently checks that &quot;other&quot; users don&apos;t have read permissions. This is unnecessary - we just need to make sure it&apos;s not executable. The debian policy manual explains it well:<br><br>{quote}<br>Setuid and setgid executables should be mode 4755 or 2755 respectively, and owned by the appropriate user or group. They should not be made unreadable (modes like 4711 or 2711 or even 4111); doing so achieves no extra security, because anyone can find the binary in the freely available Debian pa...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-2129">MAPREDUCE-2129</a>.
+     Major bug reported by xiaokang and fixed by subrotosanyal (jobtracker)<br>
+     <b>Job may hang if mapreduce.job.committer.setup.cleanup.needed=false and mapreduce.map/reduce.failures.maxpercent&gt;0</b><br>
+     <blockquote>Job may hang at RUNNING state if mapreduce.job.committer.setup.cleanup.needed=false and mapreduce.map/reduce.failures.maxpercent&gt;0. It happens when some tasks fail but haven&apos;t reached failures.maxpercent.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-2289">MAPREDUCE-2289</a>.
+     Major bug reported by tlipcon and fixed by ahmed.radwan (job submission)<br>
+     <b>Permissions race can make getStagingDir fail on local filesystem</b><br>
+     <blockquote>I&apos;ve observed the following race condition in TestFairSchedulerSystem which uses a MiniMRCluster on top of RawLocalFileSystem:<br>- two threads call getStagingDir at the same time<br>- Thread A checks fs.exists(stagingArea) and sees false<br>-- Calls mkdirs(stagingArea, JOB_DIR_PERMISSIONS)<br>--- mkdirs calls the Java mkdir API which makes the file with umask-based permissions<br>- Thread B runs, checks fs.exists(stagingArea) and sees true<br>-- checks permissions, sees the default permissions, and throws IOE...</blockquote></li>
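+
+<p>A minimal sketch of a race-tolerant approach (simplified names, not the actual patch): force the permissions after mkdirs() rather than trusting the umask-influenced mode, so a concurrent creator cannot leave the directory with default permissions.</p>
+<pre>
+import java.io.IOException;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.permission.FsPermission;
+
+class StagingDirSketch {
+  static final FsPermission JOB_DIR_PERMISSION = new FsPermission((short) 0700);
+
+  static Path ensureStagingDir(FileSystem fs, Path stagingArea)
+      throws IOException {
+    fs.mkdirs(stagingArea, JOB_DIR_PERMISSION);         // no-op if it exists
+    fs.setPermission(stagingArea, JOB_DIR_PERMISSION);  // fix umask effects
+    return stagingArea;
+  }
+}
+</pre>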
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-2376">MAPREDUCE-2376</a>.
+     Major bug reported by tlipcon and fixed by tlipcon (task-controller, test)<br>
+     <b>test-task-controller fails if run as a userid &lt; 1000</b><br>
+     <blockquote>test-task-controller tries to verify that the task-controller won&apos;t run on behalf of users with uid &lt; 1000. This makes the test fail when running in some test environments - e.g. our hudson jobs internally run as a system user with uid 101.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-2377">MAPREDUCE-2377</a>.
+     Major bug reported by tlipcon and fixed by benoyantony (task-controller)<br>
+     <b>task-controller fails to parse configuration if it doesn&apos;t end in \n</b><br>
+     <blockquote>If the task-controller.cfg file doesn&apos;t end in a newline, it fails to parse properly.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-2835">MAPREDUCE-2835</a>.
+     Major improvement reported by tomwhite and fixed by tomwhite <br>
+     <b>Make per-job counter limits configurable</b><br>
+     <blockquote>The per-job counter limits introduced in MAPREDUCE-1943 are hard-coded, except for the total number allowed per job (mapreduce.job.counters.limit). It would be useful to make them all configurable.</blockquote></li>
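+
+<p>For example, raising the one limit named above (illustrative value):</p>
+<pre>
+import org.apache.hadoop.conf.Configuration;
+
+class CounterLimitExample {
+  static Configuration jobConf() {
+    Configuration conf = new Configuration();
+    // Allow up to 240 user counters per job instead of the default cap.
+    conf.setInt("mapreduce.job.counters.limit", 240);
+    return conf;
+  }
+}
+</pre>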
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-2836">MAPREDUCE-2836</a>.
+     Minor improvement reported by jwfbean and fixed by ahmed.radwan (contrib/fair-share)<br>
+     <b>Provide option to fail jobs when submitted to non-existent pools.</b><br>
+     <blockquote>In some environments, it might be desirable to explicitly specify the fair scheduler pools and to explicitly fail jobs that are not submitted to any of the pools. <br><br>Current behavior of the fair scheduler is to submit jobs to a default pool if a pool name isn&apos;t specified or to create a pool with the new name if the pool name doesn&apos;t already exist. There should be a configuration option for the fair scheduler that causes it to noisily fail the job if it&apos;s submitted to a pool that isn&apos;t pre-spec...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-2850">MAPREDUCE-2850</a>.
+     Major sub-task reported by eli and fixed by ravidotg (tasktracker)<br>
+     <b>Add test for TaskTracker disk failure handling (MR-2413)</b><br>
+     <blockquote>MR-2413 doesn&apos;t have any test coverage, e.g. a test that the TT can survive disk failure.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-2903">MAPREDUCE-2903</a>.
+     Major bug reported by devaraj.k and fixed by devaraj.k (jobtracker)<br>
+     <b>Map Tasks graph is throwing XML Parse error when Job is executed with 0 maps</b><br>
+     <blockquote>{code:xml}<br>XML Parsing Error: no element found<br>Location: http://10.18.52.170:50030/taskgraph?type=map&amp;jobid=job_201108291536_0001<br>Line Number 1, Column 1:<br>^<br>{code}<br></blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-2905">MAPREDUCE-2905</a>.
+     Major bug reported by jwfbean and fixed by jwfbean (contrib/fair-share)<br>
+     <b>CapBasedLoadManager incorrectly allows assignment when assignMultiple is true (was: assignmultiple per job)</b><br>
+     <blockquote>We encountered a situation where in the same cluster, large jobs benefit from mapred.fairscheduler.assignmultiple, but small jobs with small numbers of mappers do not: the mappers all clump to fully occupy just a few nodes, which causes those nodes to saturate and bottleneck. The desired behavior is to spread the job across more nodes so that a relatively small job doesn&apos;t saturate any node in the cluster.<br><br>Testing has shown that setting mapred.fairscheduler.assignmultiple to false gives the ...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-2919">MAPREDUCE-2919</a>.
+     Minor improvement reported by eli and fixed by qwertymaniac (jobtracker)<br>
+     <b>The JT web UI should show job start times </b><br>
+     <blockquote>It would be helpful if the list of jobs in the main JT web UI (running, completed, failed..) had a column with the start time. Clicking into each job detail can get tedious.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-2932">MAPREDUCE-2932</a>.
+     Trivial bug reported by qwertymaniac and fixed by qwertymaniac (tasktracker)<br>
+     <b>Missing instrumentation plugin class shouldn&apos;t crash the TT startup per design</b><br>
+     <blockquote>Per the implementation of the TaskTracker instrumentation plugin (from 2008), a ClassNotFoundException while loading a configured TaskTracker instrumentation class shouldn&apos;t hamper TT startup at all.<br><br>But there is one class-fetching call outside the try/catch, which makes the TT fall down with a RuntimeException if the class is not found. It would be good to move this line into the try/catch itself.<br><br>The stack trace appears as:<br><br>{code}<br>2011-08-25 11:45:38,470 ERROR org....</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-2957">MAPREDUCE-2957</a>.
+     Major sub-task reported by eli and fixed by eli (tasktracker)<br>
+     <b>The TT should not re-init if it has no good local dirs</b><br>
+     <blockquote>The TT will currently try to re-init itself on disk failure even if it has no good local dirs. It should shutdown instead.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-3015">MAPREDUCE-3015</a>.
+     Major sub-task reported by eli and fixed by eli (tasktracker)<br>
+     <b>Add local dir failure info to metrics and the web UI</b><br>
+     <blockquote>Like HDFS-811/HDFS-1850 but for the TT.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-3076">MAPREDUCE-3076</a>.
+     Blocker bug reported by acmurthy and fixed by acmurthy (test)<br>
+     <b>TestSleepJob fails </b><br>
+     <blockquote>TestSleepJob fails; it was intended to be used in other tests for MAPREDUCE-2981.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-3278">MAPREDUCE-3278</a>.
+     Major improvement reported by tlipcon and fixed by tlipcon (mrv1, performance, task)<br>
+     <b>0.20: avoid a busy-loop in ReduceTask scheduling</b><br>
+     <blockquote>Looking at profiling results, it became clear that the ReduceTask has the following busy-loop which was causing it to suck up 100% of CPU in the fetch phase in some configurations:<br>- the number of reduce fetcher threads is configured to more than the number of hosts<br>- therefore &quot;busyEnough()&quot; never returns true<br>- the &quot;scheduling&quot; portion of the code can&apos;t schedule any new fetches, since all of the pending fetches in the mapLocations buffer correspond to hosts that are already being fetched (t...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-3365">MAPREDUCE-3365</a>.
+     Trivial improvement reported by sho.shimauchi and fixed by sho.shimauchi (contrib/fair-share)<br>
+     <b>Uncomment eventlog settings from the documentation</b><br>
+     <blockquote>Two fair scheduler debug options, &quot;mapred.fairscheduler.eventlog.enabled&quot; and &quot;mapred.fairscheduler.dump.interval&quot;, are commented out in the fair scheduler doc file.<br>They are useful for debugging.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-3394">MAPREDUCE-3394</a>.
+     Trivial improvement reported by tlipcon and fixed by tlipcon (task)<br>
+     <b>Add log guard for a debug message in ReduceTask</b><br>
+     <blockquote>There&apos;s a LOG.debug message in ReduceTask that stringifies a task ID and uses a non-negligible amount of CPU in some cases. We should guard it with {{isDebugEnabled}}.</blockquote></li>
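+
+<p>The guard pattern in question, as a generic illustration (not the patched line):</p>
+<pre>
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+
+class GuardedDebug {
+  private static final Log LOG = LogFactory.getLog(GuardedDebug.class);
+
+  void report(Object taskId) {
+    if (LOG.isDebugEnabled()) {
+      // The task ID is only stringified when debug logging is enabled.
+      LOG.debug("Processing " + taskId);
+    }
+  }
+}
+</pre>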
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-3395">MAPREDUCE-3395</a>.
+     Trivial improvement reported by eli and fixed by eli (documentation)<br>
+     <b>Add mapred.disk.healthChecker.interval to mapred-default.xml</b><br>
+     <blockquote>Let&apos;s add mapred.disk.healthChecker.interval to mapred-default.xml.</blockquote></li>
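+
+<p>For example, in Java form (illustrative value; the millisecond unit is an assumption):</p>
+<pre>
+import org.apache.hadoop.conf.Configuration;
+
+class DiskHealthCheckExample {
+  static Configuration ttConf() {
+    Configuration conf = new Configuration();
+    // Run the TT local-dir health check every 60 seconds (assumed unit: ms).
+    conf.setLong("mapred.disk.healthChecker.interval", 60000L);
+    return conf;
+  }
+}
+</pre>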
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-3405">MAPREDUCE-3405</a>.
+     Critical bug reported by tlipcon and fixed by tlipcon (capacity-sched, contrib/fair-share)<br>
+     <b>MAPREDUCE-3015 broke compilation of contrib scheduler tests</b><br>
+     <blockquote>MAPREDUCE-3015 added a new argument to the TaskTrackerStatus constructor, which is used by a few of the scheduler tests, but didn&apos;t update those tests. So, the contrib test build is now failing on 0.20-security</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-3419">MAPREDUCE-3419</a>.
+     Major bug reported by eli and fixed by eli (tasktracker, test)<br>
+     <b>Don&apos;t mark exited TT threads as dead in MiniMRCluster  </b><br>
+     <blockquote>MAPREDUCE-2850 flagged all TT threads that exited in the MiniMRCluster as dead; this breaks a number of the other tests that use MiniMRCluster across restart.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-3424">MAPREDUCE-3424</a>.
+     Minor sub-task reported by eli and fixed by eli (tasktracker)<br>
+     <b>Some LinuxTaskController cleanup</b><br>
+     <blockquote>MR-2415 introduced some tabs and inconsistent indenting and spacing. It would also be clearer if LTC explicitly overrode createLogDir. Let&apos;s clean this up.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-3674">MAPREDUCE-3674</a>.
+     Critical bug reported by qwertymaniac and fixed by qwertymaniac (jobtracker)<br>
+     <b>If invoked with no queueName request param, jobqueue_details.jsp injects a null queue name into schedulers.</b><br>
+     <blockquote>When you access /jobqueue_details.jsp manually, instead of via a link, queueName is null internally, and this null is used for a lookup into the scheduling info maps as well.<br><br>As a result, if using FairScheduler, a Pool with String name = null gets created and this brings the scheduler down. I have not tested what happens to the CapacityScheduler, but ideally if no queueName is set in that jsp, it should fall back to &apos;default&apos;. Otherwise, this brings down the JobTracker completely.<br><br>FairSch...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-3789">MAPREDUCE-3789</a>.
+     Critical bug reported by qwertymaniac and fixed by qwertymaniac (capacity-sched, scheduler)<br>
+     <b>CapacityTaskScheduler may perform unnecessary reservations in heterogenous tracker environments</b><br>
+     <blockquote>Briefly, to reproduce:<br><br>* Run JT with CapacityTaskScheduler [Say, Cluster max map = 8G, Cluster map = 2G]<br>* Run two TTs but with varied capacity, say, one with 4 map slots, another with 3 map slots.<br>* Run a job with two tasks, each demanding mem worth 4 slots at least (Map mem = 7G or so).<br>* Job will begin running on TT #1, but will also end up reserving the 3 slots on TT #2 because it does not check for the maximum limit of slots when reserving (as it goes greedy, and hopes to gain more slots i...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-3992">MAPREDUCE-3992</a>.
+     Major bug reported by tlipcon and fixed by tlipcon (mrv1)<br>
+     <b>Reduce fetcher doesn&apos;t verify HTTP status code of response</b><br>
+     <blockquote>Currently, the reduce fetch code doesn&apos;t check the HTTP status code of the response. This can lead to the following situation:<br>- the map output servlet gets an IOException after setting the headers but before the first call to flush()<br>- this causes it to send a response with a non-OK result code, including the exception text as the response body (response.sendError() does this if the response isn&apos;t committed)<br>- it will still include the response headers indicating it&apos;s a valid response<br><br>In th...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-4001">MAPREDUCE-4001</a>.
+     Minor improvement reported by qwertymaniac and fixed by qwertymaniac (capacity-sched)<br>
+     <b>Improve MAPREDUCE-3789&apos;s fix logic by looking at job&apos;s slot demands instead</b><br>
+     <blockquote>In MAPREDUCE-3789, the fix unfortunately only covered the first-time assignment scenario, and the test had not caught the mistake of conditioning on available TT slots (instead of on how many slots a job&apos;s task demands).<br><br>We should change the condition of reservation in such a manner:<br><br>{code}<br>          if ((getPendingTasks(j) != 0 &amp;&amp;<br>               !hasSufficientReservedTaskTrackers(j)) &amp;&amp;<br>-                (taskTracker.getAvailableSlots(type) !=<br>+        ...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-4088">MAPREDUCE-4088</a>.
+     Critical bug reported by raviprak and fixed by raviprak (mrv1)<br>
+     <b>Task stuck in JobLocalizer prevented other tasks on the same node from committing</b><br>
+     <blockquote>We saw that as a result of HADOOP-6963, one task was stuck in this<br><br>Thread 23668: (state = IN_NATIVE)<br> - java.io.UnixFileSystem.getBooleanAttributes0(java.io.File) @bci=0 (Compiled frame; information may be imprecise)<br> - java.io.UnixFileSystem.getBooleanAttributes(java.io.File) @bci=2, line=228 (Compiled frame)<br> - java.io.File.exists() @bci=20, line=733 (Compiled frame)<br> - org.apache.hadoop.fs.FileUtil.getDU(java.io.File) @bci=3, line=446 (Compiled frame)<br> - org.apache.hadoop.fs.FileUtil.getD...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-4095">MAPREDUCE-4095</a>.
+     Major bug reported by eli2 and fixed by cmccabe <br>
+     <b>TestJobInProgress#testLocality uses a bogus topology</b><br>
+     <blockquote>The following in TestJobInProgress#testLocality:<br><br>{code}<br>    Node r2n4 = new NodeBase(&quot;/default/rack2/s1/node4&quot;);<br>    nt.add(r2n4);<br>{code}<br><br>violates the check introduced by HADOOP-8159:<br><br>{noformat}<br>Testcase: testLocality took 0.005 sec<br>        Caused an ERROR<br>Invalid network topology. You cannot have a rack and a non-rack node at the same level of the network topology.<br>org.apache.hadoop.net.NetworkTopology$InvalidTopologyException: Invalid network topology. You cannot have a rack and a non-ra...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-4195">MAPREDUCE-4195</a>.
+     Critical bug reported by jira.shegalov and fixed by  (jobtracker)<br>
+     <b>With invalid queueName request param, jobqueue_details.jsp shows NPE</b><br>
+     <blockquote>When you access /jobqueue_details.jsp manually, instead of via a link, queueName is null internally, and this null is used for a lookup into the scheduling info maps as well.<br><br>As a result, if using FairScheduler, a Pool with String name = null gets created and this brings the scheduler down. I have not tested what happens to the CapacityScheduler, but ideally if no queueName is set in that jsp, it should fall back to &apos;default&apos;. Otherwise, this brings down the JobTracker completely.<br><br>FairSch...</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-4241">MAPREDUCE-4241</a>.
+     Major bug reported by abayer and fixed by abayer (build, examples)<br>
+     <b>Pipes examples do not compile on Ubuntu 12.04</b><br>
+     <blockquote>-lssl alone won&apos;t work for compiling the pipes examples on 12.04. -lcrypto needs to be added explicitly.</blockquote></li>
+
+<li> <a href="https://issues.apache.org/jira/browse/MAPREDUCE-4399">MAPREDUCE-4399</a>.
+     Major bug reported by vicaya and fixed by vicaya (performance, tasktracker)<br>
+     <b>Fix performance regression in shuffle </b><br>
+     <blockquote>There is a significant (up to 3x) performance regression in shuffle (vs 0.20.2) in the Hadoop 1.x series; it is most noticeable with high-end switches.</blockquote></li>
+
+
+</ul>
+
+
 <h2>Changes since Hadoop 1.0.2</h2>
 
 <h3>Jiras with Release Notes (describe major or incompatible changes)</h3>