  1. <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
  2. <html><head>
  3. <title>Hadoop 0.17.0 Release Notes</title></head>
  4. <body>
  5. <font face="sans-serif">
  6. <h1>Hadoop 0.17.0 Release Notes</h1>
  7. These release notes include new developer- and user-facing incompatibilities, features, and major improvements. The table below is sorted by Component.
  8. <ul><a name="changes">
  9. <h2>Changes Since Hadoop 0.16.4</h2>
  10. <table border="1" width="100%" cellpadding="4">
  11. <tbody><tr>
  12. <td><b>Issue</b></td>
  13. <td><b>Component</b></td>
  14. <td><b>Notes</b></td>
  15. </tr>
  16. <tr>
  17. <td>
  18. <a href="https://issues.apache.org/jira/browse/HADOOP-2828">HADOOP-2828</a>
  19. </td>
  20. <td>
  21. conf
  22. </td>
  23. <td>
  24. Removed these deprecated methods in
  25. <tt>org.apache.hadoop.conf.Configuration</tt>:<br><tt><ul><li>
  26. public Object getObject(String name) </li><li>
  27. public void setObject(String name, Object value) </li><li>
  28. public Object get(String name, Object defaultValue) </li><li>
  29. public void set(String name, Object value)</li><li>public Iterator entries()
  30. </li></ul></tt></td>
  31. </tr>
  32. <tr>
  33. <td nowrap>
  34. <a href="https://issues.apache.org/jira/browse/HADOOP-2410">HADOOP-2410</a>
  35. </td>
  36. <td>
  37. contrib/ec2
  38. </td>
  39. <td>
  40. The command <tt>hadoop-ec2
  41. run</tt> has been replaced by <tt>hadoop-ec2 launch-cluster
  42. &lt;group&gt; &lt;number of instances&gt;</tt>, and <tt>hadoop-ec2
  43. start-hadoop</tt> has been removed since Hadoop is started on instance
  44. start up. See <a href="http://wiki.apache.org/hadoop/AmazonEC2">http://wiki.apache.org/hadoop/AmazonEC2</a>
  45. for details.
  46. </td>
  47. </tr>
  48. <tr>
  49. <td>
  50. <a href="https://issues.apache.org/jira/browse/HADOOP-2796">HADOOP-2796</a>
  51. </td>
  52. <td>
  53. contrib/hod
  54. </td>
  55. <td>
  56. Added a provision to reliably detect a
  57. failing script's exit code. When the HOD script option
  58. returns a non-zero exit code, look for a <tt>script.exitcode</tt>
  59. file written to the HOD cluster directory. If this file is present, it
  60. means the script failed with the exit code given in the file.
  61. </td>
  62. </tr>
  63. <tr>
  64. <td>
  65. <a href="https://issues.apache.org/jira/browse/HADOOP-2775">HADOOP-2775</a>
  66. </td>
  67. <td>
  68. contrib/hod
  69. </td>
  70. <td>
  71. Added a unit-testing framework based on
  72. pyunit to HOD. Developers contributing patches to HOD should now
  73. contribute unit tests along with the patches when possible.
  74. </td>
  75. </tr>
  76. <tr>
  77. <td>
  78. <a href="https://issues.apache.org/jira/browse/HADOOP-3137">HADOOP-3137</a>
  79. </td>
  80. <td>
  81. contrib/hod
  82. </td>
  83. <td>
  84. The HOD version is now the same as the Hadoop version.
  85. </td>
  86. </tr>
  87. <tr>
  88. <td>
  89. <a href="https://issues.apache.org/jira/browse/HADOOP-2855">HADOOP-2855</a>
  90. </td>
  91. <td>
  92. contrib/hod
  93. </td>
  94. <td>
  95. HOD now handles relative
  96. paths correctly for important HOD options such as the cluster directory,
  97. tarball option, and script file.
  98. </td>
  99. </tr>
  100. <tr>
  101. <td>
  102. <a href="https://issues.apache.org/jira/browse/HADOOP-2899">HADOOP-2899</a>
  103. </td>
  104. <td>
  105. contrib/hod
  106. </td>
  107. <td>
  108. HOD now cleans up the HOD-generated mapred system directory
  109. at cluster deallocation time.
  110. </td>
  111. </tr>
  112. <tr>
  113. <td>
  114. <a href="https://issues.apache.org/jira/browse/HADOOP-2982">HADOOP-2982</a>
  115. </td>
  116. <td>
  117. contrib/hod
  118. </td>
  119. <td>
  120. The number of free nodes in the cluster
  121. is computed using a better algorithm that filters out inconsistencies in
  122. node status as reported by Torque.
  123. </td>
  124. </tr>
  125. <tr>
  126. <td>
  127. <a href="https://issues.apache.org/jira/browse/HADOOP-2947">HADOOP-2947</a>
  128. </td>
  129. <td>
  130. contrib/hod
  131. </td>
  132. <td>
  133. The stdout and stderr streams of
  134. daemons are redirected to files that are created under the hadoop log
  135. directory. Users can now send a <tt>kill -3</tt> signal to the daemons to get stack traces
  136. and thread dumps for debugging.
  137. </td>
  138. </tr>
  139. <tr>
  140. <td>
  141. <a href="https://issues.apache.org/jira/browse/HADOOP-3168">HADOOP-3168</a>
  142. </td>
  143. <td>
  144. contrib/streaming
  145. </td>
  146. <td>
  147. Decreased the frequency of logging
  148. in Hadoop streaming (from every 100 records to every 10,000 records).
  149. </td>
  150. </tr>
  151. <tr>
  152. <td>
  153. <a href="https://issues.apache.org/jira/browse/HADOOP-3040">HADOOP-3040</a>
  154. </td>
  155. <td>
  156. contrib/streaming
  157. </td>
  158. <td>
  159. Fixed a critical bug to restore important functionality in Hadoop streaming. If the first character on a line is
  160. the separator, then an empty key is assumed and the whole line is the value.
  161. </td>
  162. </tr>
  163. <tr>
  164. <td>
  165. <a href="https://issues.apache.org/jira/browse/HADOOP-2820">HADOOP-2820</a>
  166. </td>
  167. <td>
  168. contrib/streaming
  169. </td>
  170. <td>
  171. Removed these deprecated classes: <br><tt><ul><li>org.apache.hadoop.streaming.StreamLineRecordReader</li><li>org.apache.hadoop.streaming.StreamOutputFormat</li><li>org.apache.hadoop.streaming.StreamSequenceRecordReader</li></ul></tt></td>
  172. </tr>
  173. <tr>
  174. <td>
  175. <a href="https://issues.apache.org/jira/browse/HADOOP-3280">HADOOP-3280</a>
  176. </td>
  177. <td>
  178. contrib/streaming
  179. </td>
  180. <td>
  181. Added the
  182. <tt>mapred.child.ulimit</tt> configuration variable to limit the maximum virtual memory allocated to processes launched by the
  183. Map-Reduce framework. This can be used to control both the Mapper/Reducer
  184. tasks and applications using Hadoop pipes, Hadoop streaming etc.
  185. </td>
  186. </tr>
  187. <tr>
  188. <td>
  189. <a href="https://issues.apache.org/jira/browse/HADOOP-2657">HADOOP-2657</a>
  190. </td>
  191. <td>
  192. dfs
  193. </td>
  194. <td>Added the new API <tt>DFSOutputStream.flush()</tt> to
  195. flush all outstanding data to DataNodes.
  196. </td>
  197. </tr>
  198. <tr>
  199. <td>
  200. <a href="https://issues.apache.org/jira/browse/HADOOP-2219">HADOOP-2219</a>
  201. </td>
  202. <td>
  203. dfs
  204. </td>
  205. <td>
  206. Added a new <tt>fs -count</tt> command for
  207. counting the number of bytes, files, and directories under a given path. <br>
  208. <br>
  209. Added a new RPC <tt>getContentSummary(String path)</tt> to ClientProtocol.
  210. </td>
  211. </tr>
  212. <tr>
  213. <td>
  214. <a href="https://issues.apache.org/jira/browse/HADOOP-2559">HADOOP-2559</a>
  215. </td>
  216. <td>
  217. dfs
  218. </td>
  219. <td>
  220. Changed DFS block placement to
  221. allocate the first replica locally, the second off-rack, and the third
  222. on the same rack as the second.
  223. </td>
  224. </tr>
  225. <tr>
  226. <td>
  227. <a href="https://issues.apache.org/jira/browse/HADOOP-2758">HADOOP-2758</a>
  228. </td>
  229. <td>
  230. dfs
  231. </td>
  232. <td>
  233. Reduced DataNode CPU usage by about 50% while serving data to clients.
  234. </td>
  235. </tr>
  236. <tr>
  237. <td>
  238. <a href="https://issues.apache.org/jira/browse/HADOOP-2634">HADOOP-2634</a>
  239. </td>
  240. <td>
  241. dfs
  242. </td>
  243. <td>
  244. Deprecated ClientProtocol's <tt>exists()</tt> method. Use <tt>getFileInfo(String)</tt> instead.
  245. </td>
  246. </tr>
  247. <tr>
  248. <td>
  249. <a href="https://issues.apache.org/jira/browse/HADOOP-2423">HADOOP-2423</a>
  250. </td>
  251. <td>
  252. dfs
  253. </td>
  254. <td>
  255. Improved <tt>FSDirectory.mkdirs(...)</tt> performance by about 50% as measured by the NNThroughputBenchmark.
  256. </td>
  257. </tr>
  258. <tr>
  259. <td>
  260. <a href="https://issues.apache.org/jira/browse/HADOOP-3124">HADOOP-3124</a>
  261. </td>
  262. <td>
  263. dfs
  264. </td>
  265. <td>
  266. Made the DataNode socket write timeout configurable; however, the configuration variable is undocumented.
  267. </td>
  268. </tr>
  269. <tr>
  270. <td>
  271. <a href="https://issues.apache.org/jira/browse/HADOOP-2470">HADOOP-2470</a>
  272. </td>
  273. <td>
  274. dfs
  275. </td>
  276. <td>
  277. Removed the <tt>open()</tt> and <tt>isDir()</tt> methods from ClientProtocol without first deprecating them. <br>
  278. <br>
  279. Removed the deprecated <tt>getContentLength()</tt> from ClientProtocol.<br>
  280. <br>
  281. Deprecated <tt>isDirectory</tt> in DFSClient. Use <tt>getFileStatus()</tt> instead.
  282. </td>
  283. </tr>
  284. <tr>
  285. <td>
  286. <a href="https://issues.apache.org/jira/browse/HADOOP-2854">HADOOP-2854</a>
  287. </td>
  288. <td>
  289. dfs
  290. </td>
  291. <td>
  292. Removed deprecated method <tt>org.apache.hadoop.ipc.Server.getUserInfo()</tt>.
  293. </td>
  294. </tr>
  295. <tr>
  296. <td>
  297. <a href="https://issues.apache.org/jira/browse/HADOOP-2239">HADOOP-2239</a>
  298. </td>
  299. <td>
  300. dfs
  301. </td>
  302. <td>
  303. Added a new FileSystem, HftpsFileSystem, that allows access to HDFS data over HTTPS.
  304. </td>
  305. </tr>
  306. <tr>
  307. <td>
  308. <a href="https://issues.apache.org/jira/browse/HADOOP-771">HADOOP-771</a>
  309. </td>
  310. <td>
  311. dfs
  312. </td>
  313. <td>
  314. Added a new method to <tt>FileSystem</tt> API, <tt>delete(path, boolean)</tt>,
  315. and deprecated the previous <tt>delete(path)</tt> method.
  316. The new method deletes files and directories recursively only if the boolean argument is set to true.
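A minimal sketch of the new flag's semantics, using plain <tt>java.io.File</tt> rather than the actual Hadoop FileSystem API (the class and helper below are hypothetical stand-ins): a non-recursive delete of a non-empty directory fails, while passing <tt>true</tt> removes the whole tree.

```java
import java.io.File;
import java.io.IOException;

public class DeleteSketch {
    // Hypothetical stand-in for FileSystem.delete(path, recursive):
    // refuses a non-empty directory unless recursive is true.
    static boolean delete(File path, boolean recursive) {
        if (path.isDirectory()) {
            File[] children = path.listFiles();
            if (children != null && children.length > 0) {
                if (!recursive) {
                    return false; // non-empty directory, recursive == false
                }
                for (File child : children) {
                    delete(child, true); // remove the subtree first
                }
            }
        }
        return path.delete();
    }

    public static void main(String[] args) throws IOException {
        File dir = new File(System.getProperty("java.io.tmpdir"), "delete-sketch");
        new File(dir, "sub").mkdirs();
        new File(dir, "sub/file.txt").createNewFile();

        System.out.println(delete(dir, false)); // false: directory is not empty
        System.out.println(delete(dir, true));  // true: whole tree removed
        System.out.println(dir.exists());       // false
    }
}
```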
  317. </td>
  318. </tr>
  319. <tr>
  320. <td>
  321. <a href="https://issues.apache.org/jira/browse/HADOOP-3239">HADOOP-3239</a>
  322. </td>
  323. <td>
  324. dfs
  325. </td>
  326. <td>
  327. Modified <tt>org.apache.hadoop.dfs.FSDirectory.getFileInfo(String)</tt> to return null when a file is not
  328. found instead of throwing FileNotFoundException.
  329. </td>
  330. </tr>
  331. <tr>
  332. <td>
  333. <a href="https://issues.apache.org/jira/browse/HADOOP-3091">HADOOP-3091</a>
  334. </td>
  335. <td>
  336. dfs
  337. </td>
  338. <td>
  339. Enhanced <tt>hadoop dfs -put</tt> command to accept multiple
  340. sources when destination is a directory.
  341. </td>
  342. </tr>
  343. <tr>
  344. <td>
  345. <a href="https://issues.apache.org/jira/browse/HADOOP-2192">HADOOP-2192</a>
  346. </td>
  347. <td>
  348. dfs
  349. </td>
  350. <td>
  351. Modified <tt>hadoop dfs -mv</tt> to be closer in functionality to
  352. the Linux <tt>mv</tt> command by removing unnecessary output and returning
  353. an error message when moving non-existent files/directories.
  354. </td>
  355. </tr>
  356. <tr>
  357. <td>
  358. <u1:p></u1:p><a href="https://issues.apache.org/jira/browse/HADOOP-1985">HADOOP-1985</a>
  359. </td>
  360. <td>
  361. dfs <br>
  362. mapred
  363. </td>
  364. <td>
  365. Added rack awareness for map tasks and moved the rack resolution logic to the
  366. NameNode and JobTracker. <p> The administrator can specify, via the
  367. configuration property topology.node.switch.mapping.impl, a loadable
  368. class that implements the rack resolution logic. The class must implement
  369. a method resolve(List&lt;String&gt; names), where names is the list of
  370. DNS names/IP addresses to be resolved. The return value is a list of
  371. resolved network paths of the form /foo/rack, where rack is the ID of the
  372. rack the node belongs to and foo is the switch that connects multiple
  373. racks, and so on. The default implementation,
  374. org.apache.hadoop.net.ScriptBasedMapping, is packaged with Hadoop;
  375. it loads a script that can be used for rack resolution. The
  376. script location is configurable: it is specified by
  377. topology.script.file.name and defaults to an empty script. When
  378. the script name is empty, /default-rack is returned for all
  379. DNS names/IP addresses. The loadable topology.node.switch.mapping.impl gives
  380. administrators the flexibility to define how their site's node resolution
  381. should happen. <br>
  382. For mapred, one can also specify the cache level with respect to the number
  383. of levels in the resolved network path (the default is two). This means the
  384. JobTracker will cache tasks at the host level and at the rack level. <br>
  385. Known issue: the task caching will not work with levels greater than 2
  386. (beyond racks). This bug is tracked in <a href="https://issues.apache.org/jira/browse/HADOOP-3296">HADOOP-3296</a>.
  387. </td>
  388. </tr>
  389. <tr>
  390. <td>
  391. <a href="https://issues.apache.org/jira/browse/HADOOP-2063">HADOOP-2063</a>
  392. </td>
  393. <td>
  394. fs
  395. </td>
  396. <td>
  397. Added a new option <tt>-ignoreCrc</tt> to <tt>fs -get</tt> and <tt>fs -copyToLocal</tt>. The option causes CRC checksums to be
  398. ignored for this command so that corrupt files may be downloaded.
  399. </td>
  400. </tr>
  401. <tr>
  402. <td>
  403. <a href="https://issues.apache.org/jira/browse/HADOOP-3001">HADOOP-3001</a>
  404. </td>
  405. <td>
  406. fs
  407. </td>
  408. <td>
  409. Added new Map/Reduce framework
  410. counters that track the number of bytes read from and written to HDFS, local,
  411. KFS, and S3 file systems.
  412. </td>
  413. </tr>
  414. <tr>
  415. <td>
  416. <a href="https://issues.apache.org/jira/browse/HADOOP-2027">HADOOP-2027</a>
  417. </td>
  418. <td>
  419. fs
  420. </td>
  421. <td>
  422. Added a new FileSystem method <tt>getFileBlockLocations</tt> to return the number of bytes in each block in a file
  423. via a single RPC to the NameNode. Deprecated <tt>getFileCacheHints</tt>.
  424. </td>
  425. </tr>
  426. <tr>
  427. <td>
  428. <a href="https://issues.apache.org/jira/browse/HADOOP-2839">HADOOP-2839</a>
  429. </td>
  430. <td>
  431. fs
  432. </td>
  433. <td>
  434. Removed deprecated method <tt>org.apache.hadoop.fs.FileSystem.globPaths()</tt>.
  435. </td>
  436. </tr>
  437. <tr>
  438. <td>
  439. <a href="https://issues.apache.org/jira/browse/HADOOP-2563">HADOOP-2563</a>
  440. </td>
  441. <td>
  442. fs
  443. </td>
  444. <td>
  445. Removed deprecated method <tt>org.apache.hadoop.fs.FileSystem.listPaths()</tt>.
  446. </td>
  447. </tr>
  448. <tr>
  449. <td>
  450. <a href="https://issues.apache.org/jira/browse/HADOOP-1593">HADOOP-1593</a>
  451. </td>
  452. <td>
  453. fs
  454. </td>
  455. <td>
  456. Modified FSShell commands to accept non-default paths. Now you can run commands like <tt>hadoop dfs -ls hdfs://remotehost1:port/path</tt>
  457. and <tt>hadoop dfs -ls hdfs://remotehost2:port/path</tt> without changing your Hadoop config.
  458. </td>
  459. </tr>
  460. <tr>
  461. <td>
  462. <a href="https://issues.apache.org/jira/browse/HADOOP-3048">HADOOP-3048</a>
  463. </td>
  464. <td>
  465. io
  466. </td>
  467. <td>
  468. Added a new API and a default
  469. implementation for converting serialized objects to strings and restoring them.
  470. </td>
  471. </tr>
  472. <tr>
  473. <td>
  474. <a href="https://issues.apache.org/jira/browse/HADOOP-3152">HADOOP-3152</a>
  475. </td>
  476. <td>
  477. io
  478. </td>
  479. <td>
  480. Added a static method
  481. <tt>MapFile.setIndexInterval(Configuration, int interval)</tt> so that Map/Reduce
  482. jobs using <tt>MapFileOutputFormat</tt> can set the index interval.
  483. </td>
  484. </tr>
  485. <tr>
  486. <td>
  487. <a href="https://issues.apache.org/jira/browse/HADOOP-3073">HADOOP-3073</a>
  488. </td>
  489. <td>
  490. ipc
  491. </td>
  492. <td>
  493. <tt>SocketOutputStream.close()</tt> now closes the
  494. underlying channel. This increases compatibility with
  495. <tt>java.net.Socket.getOutputStream</tt>.
  496. </td>
  497. </tr>
  498. <tr>
  499. <td>
  500. <a href="https://issues.apache.org/jira/browse/HADOOP-3041">HADOOP-3041</a>
  501. </td>
  502. <td>
  503. mapred
  504. </td>
  505. <td>
  506. Deprecated <tt>JobConf.setOutputPath</tt> and <tt>JobConf.getOutputPath</tt>.<p>
  507. Deprecated <tt>OutputFormatBase</tt>. Added <tt>FileOutputFormat</tt>. Existing output
  508. formats extending <tt>OutputFormatBase</tt> now extend <tt>FileOutputFormat</tt>. <p>
  509. Added the following methods to <tt>FileOutputFormat</tt>:
  510. <tt><ul>
  511. <li>public static void setOutputPath(JobConf conf, Path outputDir)
  512. <li>public static Path getOutputPath(JobConf conf)
  513. <li>public static Path getWorkOutputPath(JobConf conf)
  514. <li>static void setWorkOutputPath(JobConf conf, Path outputDir)
  515. </ul></tt>
  516. </td>
  517. </tr>
  518. <tr>
  519. <td>
  520. <a href="https://issues.apache.org/jira/browse/HADOOP-3204">HADOOP-3204</a>
  521. </td>
  522. <td>
  523. mapred
  524. </td>
  525. <td>
  526. Fixed <tt>ReduceTask.LocalFSMerger</tt> to handle errors and exceptions better. Previously, all
  527. exceptions except IOException were silently ignored.
  528. </td>
  529. </tr>
  530. <tr>
  531. <td>
  532. <a href="https://issues.apache.org/jira/browse/HADOOP-1986">HADOOP-1986</a>
  533. </td>
  534. <td>
  535. mapred
  536. </td>
  537. <td>
  538. Programs that implement the raw
  539. <tt>Mapper</tt> or <tt>Reducer</tt> interfaces will need modification to compile with this
  540. release. For example, <p>
  541. <pre>
  542. class MyMapper implements Mapper {
  543. public void map(WritableComparable key, Writable val,
  544. OutputCollector out, Reporter reporter) throws IOException {
  545. // ...
  546. }
  547. // ...
  548. }
  549. </pre>
  550. will need to be changed to refer to the parameterized type. For example: <p>
  551. <pre>
  552. class MyMapper implements Mapper&lt;WritableComparable, Writable, WritableComparable, Writable&gt; {
  553. public void map(WritableComparable key, Writable val,
  554. OutputCollector&lt;WritableComparable, Writable&gt;
  555. out, Reporter reporter) throws IOException {
  556. // ...
  557. }
  558. // ...
  559. }
  560. </pre>
  561. Similarly implementations of the following raw interfaces will need
  562. modification:
  563. <tt><ul>
  564. <li>InputFormat
  565. <li>OutputCollector
  566. <li>OutputFormat
  567. <li>Partitioner
  568. <li>RecordReader
  569. <li>RecordWriter
  570. </ul></tt>
  571. </td>
  572. </tr>
  573. <tr>
  574. <td>
  575. <a href="https://issues.apache.org/jira/browse/HADOOP-910">HADOOP-910</a>
  576. </td>
  577. <td>
  578. mapred
  579. </td>
  580. <td>
  581. Reducers now perform merges of
  582. shuffle data (both in-memory and on disk) while fetching map outputs.
  583. Previously, during the shuffle they merged only the in-memory outputs.
  584. </td>
  585. </tr>
  586. <tr>
  587. <td>
  588. <a href="https://issues.apache.org/jira/browse/HADOOP-2822">HADOOP-2822</a>
  589. </td>
  590. <td>
  591. mapred
  592. </td>
  593. <td>
  594. Removed the deprecated classes <tt>org.apache.hadoop.mapred.InputFormatBase</tt>
  595. and <tt>org.apache.hadoop.mapred.PhasedFileSystem</tt>.
  596. </td>
  597. </tr>
  598. <tr>
  599. <td>
  600. <a href="https://issues.apache.org/jira/browse/HADOOP-2817">HADOOP-2817</a>
  601. </td>
  602. <td>
  603. mapred
  604. </td>
  605. <td>
  606. Removed the deprecated method
  607. <tt>org.apache.hadoop.mapred.ClusterStatus.getMaxTasks()</tt>
  608. and the deprecated configuration property <tt>mapred.tasktracker.tasks.maximum</tt>.
  609. </td>
  610. </tr>
  611. <tr>
  612. <td>
  613. <a href="https://issues.apache.org/jira/browse/HADOOP-2825">HADOOP-2825</a>
  614. </td>
  615. <td>
  616. mapred
  617. </td>
  618. <td>
  619. Removed the deprecated method
  620. <tt>org.apache.hadoop.mapred.MapOutputLocation.getFile(FileSystem fileSys, Path
  621. localFilename, int reduce, Progressable pingee, int timeout)</tt>.
  622. </td>
  623. </tr>
  624. <tr>
  625. <td>
  626. <a href="https://issues.apache.org/jira/browse/HADOOP-2818">HADOOP-2818</a>
  627. </td>
  628. <td>
  629. mapred
  630. </td>
  631. <td>
  632. Removed the deprecated methods
  633. <tt>org.apache.hadoop.mapred.Counters.getDisplayName(String counter)</tt> and
  634. <tt>org.apache.hadoop.mapred.Counters.getCounterNames()</tt>.
  635. Undeprecated the method
  636. <tt>org.apache.hadoop.mapred.Counters.getCounter(String counterName)</tt>.
  637. </td>
  638. </tr>
  639. <tr>
  640. <td>
  641. <a href="https://issues.apache.org/jira/browse/HADOOP-2826">HADOOP-2826</a>
  642. </td>
  643. <td>
  644. mapred
  645. </td>
  646. <td>
  647. Changed the signature of the method
  648. <tt>public org.apache.hadoop.streaming.UTF8ByteArrayUtils.readLine(InputStream)</tt> to
  649. <tt>UTF8ByteArrayUtils.readLine(LineReader, Text)</tt>. Since the old
  650. signature was not deprecated first, any code using the old method must be changed
  651. to use the new method.
  652. <p>
  653. Removed the deprecated methods <tt>org.apache.hadoop.mapred.FileSplit.getFile()</tt>
  654. and <tt>org.apache.hadoop.mapred.LineRecordReader.readLine(InputStream in,
  655. OutputStream out)</tt>.
  656. <p>
  657. Made the constructor <tt>org.apache.hadoop.mapred.LineRecordReader.LineReader(InputStream in, Configuration
  658. conf)</tt> public.
  659. </td>
  660. </tr>
  661. <tr>
  662. <td>
  663. <a href="https://issues.apache.org/jira/browse/HADOOP-2819">HADOOP-2819</a>
  664. </td>
  665. <td>
  666. mapred
  667. </td>
  668. <td>
  669. Removed these deprecated methods from <tt>org.apache.hadoop.mapred.JobConf</tt>:
  670. <tt><ul>
  671. <li>public Class getInputKeyClass()
  672. <li>public void setInputKeyClass(Class theClass)
  673. <li>public Class getInputValueClass()
  674. <li>public void setInputValueClass(Class theClass)
  675. </ul></tt>
  676. and undeprecated these methods:
  677. <tt><ul>
  678. <li>public boolean getSpeculativeExecution()
  679. <li>public void setSpeculativeExecution(boolean speculativeExecution)
  680. </ul></tt>
  681. </td>
  682. </tr>
  683. <tr>
  684. <td>
  685. <a href="https://issues.apache.org/jira/browse/HADOOP-3093">HADOOP-3093</a>
  686. </td>
  687. <td>
  688. mapred
  689. </td>
  690. <td>
  691. Added the following public methods to <tt>org.apache.hadoop.conf.Configuration</tt>:
  692. <tt><ul>
  693. <li>String[] Configuration.getStrings(String name, String... defaultValue)
  694. <li>void Configuration.setStrings(String name, String... values)
  695. </ul></tt>
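The varargs accessors store and retrieve a comma-separated list. A minimal stand-in (a hypothetical <tt>MiniConf</tt> class, not the real <tt>org.apache.hadoop.conf.Configuration</tt>) sketches the assumed semantics:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for org.apache.hadoop.conf.Configuration, kept only
// to illustrate the comma-separated semantics of setStrings/getStrings.
public class MiniConf {
    private final Map<String, String> props = new HashMap<>();

    // Joins the values with commas into a single property value.
    public void setStrings(String name, String... values) {
        props.put(name, String.join(",", values));
    }

    // Splits the stored value on commas; falls back to defaultValue if unset.
    public String[] getStrings(String name, String... defaultValue) {
        String raw = props.get(name);
        return raw == null ? defaultValue : raw.split(",");
    }

    public static void main(String[] args) {
        MiniConf conf = new MiniConf();
        conf.setStrings("io.serializations", "writable", "avro");
        for (String s : conf.getStrings("io.serializations")) {
            System.out.println(s);
        }
        // Unset keys fall back to the supplied varargs default.
        System.out.println(conf.getStrings("missing", "fallback")[0]);
    }
}
```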
  696. </td>
  697. </tr>
  698. <tr>
  699. <td>
  700. <a href="https://issues.apache.org/jira/browse/HADOOP-2399">HADOOP-2399</a>
  701. </td>
  702. <td>
  703. mapred
  704. </td>
  705. <td>
  706. The key and value objects that are given
  707. to the Combiner and Reducer are now reused between calls. This is much more
  708. efficient, but the user cannot assume the objects remain unchanged across calls.
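The practical consequence can be sketched in plain Java (no Hadoop types; <tt>Holder</tt> is a hypothetical stand-in for a reused Writable): a reducer that stores references to the reused object sees only the last value, while copying the data out is safe.

```java
import java.util.ArrayList;
import java.util.List;

public class ReuseSketch {
    static class Holder { int value; } // stand-in for a reused Writable

    // Simulates a reducer that stores the framework's reused object directly.
    static List<Integer> storeReferences(int[] values) {
        Holder reused = new Holder();            // one instance, reused per call
        List<Holder> stored = new ArrayList<>();
        for (int v : values) {
            reused.value = v;                    // framework overwrites in place
            stored.add(reused);                  // WRONG: every entry aliases it
        }
        List<Integer> seen = new ArrayList<>();
        for (Holder h : stored) seen.add(h.value);
        return seen;
    }

    // Simulates a reducer that copies the data out before storing it.
    static List<Integer> storeCopies(int[] values) {
        Holder reused = new Holder();
        List<Integer> copied = new ArrayList<>();
        for (int v : values) {
            reused.value = v;
            copied.add(reused.value);            // RIGHT: copy the value out
        }
        return copied;
    }

    public static void main(String[] args) {
        System.out.println(storeReferences(new int[] {1, 2, 3})); // [3, 3, 3]
        System.out.println(storeCopies(new int[] {1, 2, 3}));     // [1, 2, 3]
    }
}
```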
  709. </td>
  710. </tr>
  711. <tr>
  712. <td>
  713. <a href="https://issues.apache.org/jira/browse/HADOOP-3162">HADOOP-3162</a>
  714. </td>
  715. <td>
  716. mapred
  717. </td>
  718. <td>
  719. Deprecated the public methods <tt>org.apache.hadoop.mapred.JobConf.setInputPath(Path)</tt> and
  720. <tt>org.apache.hadoop.mapred.JobConf.addInputPath(Path)</tt>.
  721. <p>
  722. Added the following public methods to <tt>org.apache.hadoop.mapred.FileInputFormat</tt>:
  723. <tt><ul>
  724. <li>public static void setInputPaths(JobConf job, Path... paths); <br>
  725. <li>public static void setInputPaths(JobConf job, String commaSeparatedPaths); <br>
  726. <li>public static void addInputPath(JobConf job, Path path); <br>
  727. <li>public static void addInputPaths(JobConf job, String commaSeparatedPaths); <br>
  728. </ul></tt>
  729. Earlier code calling <tt>JobConf.setInputPath(Path)</tt> and <tt>JobConf.addInputPath(Path)</tt>
  730. should now call <tt>FileInputFormat.setInputPaths(JobConf, Path...)</tt> and
  731. <tt>FileInputFormat.addInputPath(JobConf, Path)</tt> respectively.
  732. </td>
  733. </tr>
  734. <tr>
  735. <td>
  736. <a href="https://issues.apache.org/jira/browse/HADOOP-2178">HADOOP-2178</a>
  737. </td>
  738. <td>
  739. mapred
  740. </td>
  741. <td>
  742. Provided a new facility to
  743. store job history on DFS. The cluster administrator can now provide either a local
  744. filesystem location or a DFS location using the configuration property
  745. <tt>mapred.job.history.location</tt> to store job history. History will also
  746. be logged in a user-specified location if the configuration property
  747. <tt>mapred.job.history.user.location</tt> is specified.
  748. <p>
  749. Removed these classes and this method:
  750. <tt><ul>
  751. <li>org.apache.hadoop.mapred.DefaultJobHistoryParser.MasterIndex
  752. <li>org.apache.hadoop.mapred.DefaultJobHistoryParser.MasterIndexParseListener
  753. <li>org.apache.hadoop.mapred.DefaultJobHistoryParser.parseMasterIndex
  754. </ul></tt>
  755. <p>
  756. Changed the signature of the public method
  757. <tt>org.apache.hadoop.mapred.DefaultJobHistoryParser.parseJobTasks(File
  758. jobHistoryFile, JobHistory.JobInfo job)</tt> to
  759. <tt>DefaultJobHistoryParser.parseJobTasks(String jobHistoryFile,
  760. JobHistory.JobInfo job, FileSystem fs)</tt>. <p>
  761. Changed the signature of the public method
  762. <tt>org.apache.hadoop.mapred.JobHistory.parseHistory(File path, Listener l)</tt>
  763. to <tt>JobHistory.parseHistoryFromFS(String path, Listener l, FileSystem fs)</tt>.
  764. </td>
  765. </tr>
  766. <tr>
  767. <td>
  768. <a href="https://issues.apache.org/jira/browse/HADOOP-2055">HADOOP-2055</a>
  769. </td>
  770. <td>
  771. mapred
  772. </td>
  773. <td>
  774. Users can now specify which paths to ignore when processing the job input directory
  775. (in addition to the filenames that start with "_" and ".").
  776. To do this, two new methods were defined:
  777. <tt><ul>
  778. <li>FileInputFormat.setInputPathFilter(JobConf, PathFilter)
  779. <li>FileInputFormat.getInputPathFilter(JobConf)
  780. </ul></tt>
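A PathFilter is essentially a boolean predicate over paths. The sketch below (plain Java, with a hypothetical <tt>acceptLogsOnly</tt> filter; not the actual PathFilter interface) illustrates how a user filter composes with the skipping of names that start with "_" or ".":

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class FilterSketch {
    // Hypothetical user filter: skip hidden names, then accept only .log inputs.
    static boolean acceptLogsOnly(String name) {
        boolean hidden = name.startsWith("_") || name.startsWith(".");
        return !hidden && name.endsWith(".log");
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList("_SUCCESS", ".part.crc", "a.log", "b.txt", "c.log");
        List<String> accepted = inputs.stream()
                .filter(FilterSketch::acceptLogsOnly)
                .collect(Collectors.toList());
        System.out.println(accepted); // [a.log, c.log]
    }
}
```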
  781. </td>
  782. </tr>
  783. <tr>
  784. <td>
  785. <a href="https://issues.apache.org/jira/browse/HADOOP-2116">HADOOP-2116</a>
  786. </td>
  787. <td>
  788. mapred
  789. </td>
  790. <td>
  791. Restructured the local job directory on the tasktracker. Users are provided with a job-specific shared directory
  792. (<tt>mapred-local/taskTracker/jobcache/$jobid/work</tt>) for use as scratch
  793. space, exposed through the configuration property and system property
  794. <tt>job.local.dir</tt>. The directory <tt>../work</tt> is no longer available from the task's current working directory.
  795. </td>
  796. </tr>
  797. <tr>
  798. <td>
  799. <a href="https://issues.apache.org/jira/browse/HADOOP-1622">HADOOP-1622</a>
  800. </td>
  801. <td>
  802. mapred
  803. </td>
  804. <td>
  805. Added new command-line options to the <tt>hadoop jar</tt> command:
  806. <p>
  807. <tt>hadoop jar -files &lt;comma separated list of files&gt; -libjars &lt;comma
  808. separated list of jars&gt; -archives &lt;comma separated list of
  809. archives&gt; </tt>
  810. <p>
  811. where the options have these meanings:
  812. <p>
  813. <ul>
  814. <li>The <tt>-files</tt> option allows you to specify a comma-separated list of paths that
  815. will be present in the current working directory of your task. <br>
  816. <li>The <tt>-libjars</tt> option allows you to add jars to the classpaths of the maps and
  817. reduces. <br>
  818. <li>The <tt>-archives</tt> option allows you to pass archives as arguments; they are
  819. unzipped/unjarred, and a link with the name of the jar/zip is created in the
  820. current working directory of tasks.
  821. </ul>
  822. </td>
  823. </tr>
  824. <tr>
  825. <td>
  826. <a href="https://issues.apache.org/jira/browse/HADOOP-2823">HADOOP-2823</a>
  827. </td>
  828. <td>
  829. record
  830. </td>
  831. <td>
  832. Removed the deprecated methods in
  833. <tt>org.apache.hadoop.record.compiler.generated.SimpleCharStream</tt>:
  834. <tt><ul>
  835. <li>public int getColumn()
  836. <li>public int getLine()
  837. </ul></tt>
  838. </td>
  839. </tr>
  840. <tr>
  841. <td>
  842. <a href="https://issues.apache.org/jira/browse/HADOOP-2551">HADOOP-2551</a>
  843. </td>
  844. <td>
  845. scripts
  846. </td>
  847. <td>
  848. Introduced new environment variables to allow finer grained control of Java options passed to server and
  849. client JVMs. See the new <tt>*_OPTS</tt> variables in <tt>conf/hadoop-env.sh</tt>.
  850. </td>
  851. </tr>
  852. <tr>
  853. <td>
  854. <a href="https://issues.apache.org/jira/browse/HADOOP-3099">HADOOP-3099</a>
  855. </td>
  856. <td>
  857. util
  858. </td>
  859. <td>
  860. Added a new <tt>-p</tt> option to <tt>distcp</tt> for preserving file and directory status:
  861. <pre>
  862. -p[rbugp] Preserve status
  863. r: replication number
  864. b: block size
  865. u: user
  866. g: group
  867. p: permission
  868. </pre>
  869. The <tt>-p</tt> option alone is equivalent to <tt>-prbugp</tt>.
  870. </td>
  871. </tr>
  872. <tr>
  873. <td>
  874. <a href="https://issues.apache.org/jira/browse/HADOOP-2821">HADOOP-2821</a>
  875. </td>
  876. <td>
  877. util
  878. </td>
  879. <td>
  880. Removed the deprecated classes <tt>org.apache.hadoop.util.ShellUtil</tt> and <tt>org.apache.hadoop.util.ToolBase</tt>.
  881. </td>
  882. </tr>
  883. </tbody></table>
  884. </ul>
  885. </body></html>