Browse Source

HADOOP-1660. Add the cwd of the map/reduce task to the java.library.path of the child-jvm to support loading of native libraries distributed via the DistributedCache.

git-svn-id: https://svn.apache.org/repos/asf/lucene/hadoop/trunk@610135 13f79535-47bb-0310-9956-ffa450edef68
Arun Murthy 17 years ago
commit
54234bdf57

+ 4 - 0
CHANGES.txt

@@ -163,6 +163,10 @@ Trunk (unreleased changes)
     HADOOP-2390. Added documentation for user-controls for intermediate
     map-outputs & final job-outputs and native-hadoop libraries. (acmurthy) 
  
+    HADOOP-1660. Add the cwd of the map/reduce task to the java.library.path
+    of the child-jvm to support loading of native libraries distributed via
+    the DistributedCache. (acmurthy)
+ 
   OPTIMIZATIONS
 
     HADOOP-1898.  Release the lock protecting the last time of the last stack

+ 95 - 31
docs/mapred_tutorial.html

@@ -216,6 +216,9 @@ document.write("Last Published: " + document.lastModified);
 <a href="#Job+Configuration">Job Configuration</a>
 </li>
 <li>
+<a href="#Task+Execution+%26+Environment">Task Execution &amp; Environment</a>
+</li>
+<li>
 <a href="#Job+Submission+and+Monitoring">Job Submission and Monitoring</a>
 <ul class="minitoc">
 <li>
@@ -274,7 +277,7 @@ document.write("Last Published: " + document.lastModified);
 <a href="#Example%3A+WordCount+v2.0">Example: WordCount v2.0</a>
 <ul class="minitoc">
 <li>
-<a href="#Source+Code-N10B1F">Source Code</a>
+<a href="#Source+Code-N10B98">Source Code</a>
 </li>
 <li>
 <a href="#Sample+Runs">Sample Runs</a>
@@ -1460,7 +1463,67 @@ document.write("Last Published: " + document.lastModified);
         <a href="api/org/apache/hadoop/conf/Configuration.html#set(java.lang.String, java.lang.String)">set(String, String)</a>/<a href="api/org/apache/hadoop/conf/Configuration.html#get(java.lang.String, java.lang.String)">get(String, String)</a>
         to set/get arbitrary parameters needed by applications. However, use the 
         <span class="codefrag">DistributedCache</span> for large amounts of (read-only) data.</p>
-<a name="N1082C"></a><a name="Job+Submission+and+Monitoring"></a>
+<a name="N1082C"></a><a name="Task+Execution+%26+Environment"></a>
+<h3 class="h4">Task Execution &amp; Environment</h3>
+<p>The <span class="codefrag">TaskTracker</span> executes the <span class="codefrag">Mapper</span>/ 
+        <span class="codefrag">Reducer</span>  <em>task</em> as a child process in a separate jvm.
+        </p>
+<p>The child-task inherits the environment of the parent 
+        <span class="codefrag">TaskTracker</span>. The user can specify additional options to the
+        child-jvm via the <span class="codefrag">mapred.child.java.opts</span> configuration
+        parameter in the <span class="codefrag">JobConf</span> such as non-standard paths for the 
+        run-time linker to search shared libraries via 
+        <span class="codefrag">-Djava.library.path=&lt;&gt;</span> etc. If the 
+        <span class="codefrag">mapred.child.java.opts</span> contains the symbol <em>@taskid@</em> 
+        it is interpolated with value of <span class="codefrag">taskid</span> of the map/reduce
+        task.</p>
+<p>Here is an example with multiple arguments and substitutions, 
+        showing jvm GC logging, and start of a passwordless JVM JMX agent so that
+        it can connect with jconsole and the likes to watch child memory, 
+        threads and get thread dumps. It also sets the maximum heap-size of the 
+        child jvm to 512MB and adds an additional path to the 
+        <span class="codefrag">java.library.path</span> of the child-jvm.</p>
+<p>
+          
+<span class="codefrag">&lt;property&gt;</span>
+<br>
+          &nbsp;&nbsp;<span class="codefrag">&lt;name&gt;mapred.child.java.opts&lt;/name&gt;</span>
+<br>
+          &nbsp;&nbsp;<span class="codefrag">&lt;value&gt;</span>
+<br>
+          &nbsp;&nbsp;&nbsp;&nbsp;<span class="codefrag">
+                    -Xmx512M -Djava.library.path=/home/mycompany/lib
+                    -verbose:gc -Xloggc:/tmp/@taskid@.gc</span>
+<br>
+          &nbsp;&nbsp;&nbsp;&nbsp;<span class="codefrag">
+                    -Dcom.sun.management.jmxremote.authenticate=false 
+                    -Dcom.sun.management.jmxremote.ssl=false</span>
+<br>
+          &nbsp;&nbsp;<span class="codefrag">&lt;/value&gt;</span>
+<br>
+          
+<span class="codefrag">&lt;/property&gt;</span>
+        
+</p>
+<p>The <a href="#DistributedCache">DistributedCache</a> can also be used
+        as a rudimentary software distribution mechanism for use in the map 
+        and/or reduce tasks. It can be used to distribute both jars and 
+        native libraries. The 
+        <a href="api/org/apache/hadoop/filecache/DistributedCache.html#addArchiveToClassPath(org.apache.hadoop.fs.Path,%20org.apache.hadoop.conf.Configuration)">
+        DistributedCache.addArchiveToClassPath(Path, Configuration)</a> or 
+        <a href="api/org/apache/hadoop/filecache/DistributedCache.html#addFileToClassPath(org.apache.hadoop.fs.Path,%20org.apache.hadoop.conf.Configuration)">
+        DistributedCache.addFileToClassPath(Path, Configuration)</a> api can 
+        be used to cache files/jars and also add them to the <em>classpath</em> 
+        of child-jvm. Similarly the facility provided by the 
+        <span class="codefrag">DistributedCache</span> where-in it symlinks the cached files into
+        the working directory of the task can be used to distribute native 
+        libraries and load them. The underlying detail is that child-jvm always 
+        has its <em>current working directory</em> added to the
+        <span class="codefrag">java.library.path</span> and hence the cached libraries can be 
+        loaded via <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/lang/System.html#loadLibrary(java.lang.String)">
+        System.loadLibrary</a> or <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/lang/System.html#load(java.lang.String)">
+        System.load</a>.</p>
+<a name="N108A1"></a><a name="Job+Submission+and+Monitoring"></a>
 <h3 class="h4">Job Submission and Monitoring</h3>
 <p>
 <a href="api/org/apache/hadoop/mapred/JobClient.html">
@@ -1496,7 +1559,7 @@ document.write("Last Published: " + document.lastModified);
 <p>Normally the user creates the application, describes various facets 
         of the job via <span class="codefrag">JobConf</span>, and then uses the 
         <span class="codefrag">JobClient</span> to submit the job and monitor its progress.</p>
-<a name="N1086A"></a><a name="Job+Control"></a>
+<a name="N108DF"></a><a name="Job+Control"></a>
 <h4>Job Control</h4>
 <p>Users may need to chain map-reduce jobs to accomplish complex
           tasks which cannot be done via a single map-reduce job. This is fairly
@@ -1532,7 +1595,7 @@ document.write("Last Published: " + document.lastModified);
             </li>
           
 </ul>
-<a name="N10894"></a><a name="Job+Input"></a>
+<a name="N10909"></a><a name="Job+Input"></a>
 <h3 class="h4">Job Input</h3>
 <p>
 <a href="api/org/apache/hadoop/mapred/InputFormat.html">
@@ -1580,7 +1643,7 @@ document.write("Last Published: " + document.lastModified);
         appropriate <span class="codefrag">CompressionCodec</span>. However, it must be noted that
         compressed files with the above extensions cannot be <em>split</em> and 
         each compressed file is processed in its entirety by a single mapper.</p>
-<a name="N108FE"></a><a name="InputSplit"></a>
+<a name="N10973"></a><a name="InputSplit"></a>
 <h4>InputSplit</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/InputSplit.html">
@@ -1594,7 +1657,7 @@ document.write("Last Published: " + document.lastModified);
           FileSplit</a> is the default <span class="codefrag">InputSplit</span>. It sets 
           <span class="codefrag">map.input.file</span> to the path of the input file for the
           logical split.</p>
-<a name="N10923"></a><a name="RecordReader"></a>
+<a name="N10998"></a><a name="RecordReader"></a>
 <h4>RecordReader</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/RecordReader.html">
@@ -1606,7 +1669,7 @@ document.write("Last Published: " + document.lastModified);
           for processing. <span class="codefrag">RecordReader</span> thus assumes the 
           responsibility of processing record boundaries and presents the tasks 
           with keys and values.</p>
-<a name="N10946"></a><a name="Job+Output"></a>
+<a name="N109BB"></a><a name="Job+Output"></a>
 <h3 class="h4">Job Output</h3>
 <p>
 <a href="api/org/apache/hadoop/mapred/OutputFormat.html">
@@ -1631,7 +1694,7 @@ document.write("Last Published: " + document.lastModified);
 <p>
 <span class="codefrag">TextOutputFormat</span> is the default 
         <span class="codefrag">OutputFormat</span>.</p>
-<a name="N1096F"></a><a name="Task+Side-Effect+Files"></a>
+<a name="N109E4"></a><a name="Task+Side-Effect+Files"></a>
 <h4>Task Side-Effect Files</h4>
 <p>In some applications, component tasks need to create and/or write to
           side-files, which differ from the actual job-output files.</p>
@@ -1657,7 +1720,7 @@ document.write("Last Published: " + document.lastModified);
           JobConf.getOutputPath()</a>, and the framework will promote them 
           similarly for succesful task-attempts, thus eliminating the need to 
           pick unique paths per task-attempt.</p>
-<a name="N109A4"></a><a name="RecordWriter"></a>
+<a name="N10A19"></a><a name="RecordWriter"></a>
 <h4>RecordWriter</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/RecordWriter.html">
@@ -1665,9 +1728,9 @@ document.write("Last Published: " + document.lastModified);
           pairs to an output file.</p>
 <p>RecordWriter implementations write the job outputs to the 
           <span class="codefrag">FileSystem</span>.</p>
-<a name="N109BB"></a><a name="Other+Useful+Features"></a>
+<a name="N10A30"></a><a name="Other+Useful+Features"></a>
 <h3 class="h4">Other Useful Features</h3>
-<a name="N109C1"></a><a name="Counters"></a>
+<a name="N10A36"></a><a name="Counters"></a>
 <h4>Counters</h4>
 <p>
 <span class="codefrag">Counters</span> represent global counters, defined either by 
@@ -1681,7 +1744,7 @@ document.write("Last Published: " + document.lastModified);
           Reporter.incrCounter(Enum, long)</a> in the <span class="codefrag">map</span> and/or 
           <span class="codefrag">reduce</span> methods. These counters are then globally 
           aggregated by the framework.</p>
-<a name="N109EC"></a><a name="DistributedCache"></a>
+<a name="N10A61"></a><a name="DistributedCache"></a>
 <h4>DistributedCache</h4>
 <p>
 <a href="api/org/apache/hadoop/filecache/DistributedCache.html">
@@ -1701,19 +1764,20 @@ document.write("Last Published: " + document.lastModified);
           per job and the ability to cache archives which are un-archived on 
           the slaves.</p>
 <p>
-<span class="codefrag">DistributedCache</span> can be used to distribute simple, 
-          read-only data/text files and more complex types such as archives and
-          jars. Archives (zip files) are <em>un-archived</em> at the slave nodes.
-          Jars maybe be optionally added to the classpath of the tasks, a
-          rudimentary <em>software distribution</em> mechanism.  Files have 
-          <em>execution permissions</em> set. Optionally users can also direct the
-          <span class="codefrag">DistributedCache</span> to <em>symlink</em> the cached file(s) 
-          into the working directory of the task.</p>
-<p>
 <span class="codefrag">DistributedCache</span> tracks the modification timestamps of 
           the cached files. Clearly the cache files should not be modified by 
           the application or externally while the job is executing.</p>
-<a name="N10A26"></a><a name="Tool"></a>
+<p>
+<span class="codefrag">DistributedCache</span> can be used to distribute simple, 
+          read-only data/text files and more complex types such as archives and
+          jars. Archives (zip files) are <em>un-archived</em> at the slave nodes.
+          Optionally users can also direct the <span class="codefrag">DistributedCache</span> to 
+          <em>symlink</em> the cached file(s) into the <span class="codefrag">current working 
+          directory</span> of the task via the 
+          <a href="api/org/apache/hadoop/filecache/DistributedCache.html#createSymlink(org.apache.hadoop.conf.Configuration)">
+          DistributedCache.createSymlink(Path, Configuration)</a> api. Files 
+          have <em>execution permissions</em> set.</p>
+<a name="N10A9F"></a><a name="Tool"></a>
 <h4>Tool</h4>
 <p>The <a href="api/org/apache/hadoop/util/Tool.html">Tool</a> 
           interface supports the handling of generic Hadoop command-line options.
@@ -1753,7 +1817,7 @@ document.write("Last Published: " + document.lastModified);
             </span>
           
 </p>
-<a name="N10A58"></a><a name="IsolationRunner"></a>
+<a name="N10AD1"></a><a name="IsolationRunner"></a>
 <h4>IsolationRunner</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/IsolationRunner.html">
@@ -1777,13 +1841,13 @@ document.write("Last Published: " + document.lastModified);
 <p>
 <span class="codefrag">IsolationRunner</span> will run the failed task in a single 
           jvm, which can be in the debugger, over precisely the same input.</p>
-<a name="N10A8B"></a><a name="JobControl"></a>
+<a name="N10B04"></a><a name="JobControl"></a>
 <h4>JobControl</h4>
 <p>
 <a href="api/org/apache/hadoop/mapred/jobcontrol/package-summary.html">
           JobControl</a> is a utility which encapsulates a set of Map-Reduce jobs
           and their dependencies.</p>
-<a name="N10A98"></a><a name="Data+Compression"></a>
+<a name="N10B11"></a><a name="Data+Compression"></a>
 <h4>Data Compression</h4>
 <p>Hadoop Map-Reduce provides facilities for the application-writer to
           specify compression for both intermediate map-outputs and the
@@ -1797,7 +1861,7 @@ document.write("Last Published: " + document.lastModified);
           codecs for reasons of both performance (zlib) and non-availability of
           Java libraries (lzo). More details on their usage and availability are
           available <a href="native_libraries.html">here</a>.</p>
-<a name="N10AB8"></a><a name="Intermediate+Outputs"></a>
+<a name="N10B31"></a><a name="Intermediate+Outputs"></a>
 <h5>Intermediate Outputs</h5>
 <p>Applications can control compression of intermediate map-outputs
             via the 
@@ -1818,7 +1882,7 @@ document.write("Last Published: " + document.lastModified);
             <a href="api/org/apache/hadoop/mapred/JobConf.html#setMapOutputCompressionType(org.apache.hadoop.io.SequenceFile.CompressionType)">
             JobConf.setMapOutputCompressionType(SequenceFile.CompressionType)</a> 
             api.</p>
-<a name="N10AE4"></a><a name="Job+Outputs"></a>
+<a name="N10B5D"></a><a name="Job+Outputs"></a>
 <h5>Job Outputs</h5>
 <p>Applications can control compression of job-outputs via the
             <a href="api/org/apache/hadoop/mapred/OutputFormatBase.html#setCompressOutput(org.apache.hadoop.mapred.JobConf,%20boolean)">
@@ -1838,12 +1902,12 @@ document.write("Last Published: " + document.lastModified);
 </div>
 
     
-<a name="N10B13"></a><a name="Example%3A+WordCount+v2.0"></a>
+<a name="N10B8C"></a><a name="Example%3A+WordCount+v2.0"></a>
 <h2 class="h3">Example: WordCount v2.0</h2>
 <div class="section">
 <p>Here is a more complete <span class="codefrag">WordCount</span> which uses many of the
       features provided by the Map-Reduce framework we discussed so far:</p>
-<a name="N10B1F"></a><a name="Source+Code-N10B1F"></a>
+<a name="N10B98"></a><a name="Source+Code-N10B98"></a>
 <h3 class="h4">Source Code</h3>
 <table class="ForrestTable" cellspacing="1" cellpadding="4">
           
@@ -3021,7 +3085,7 @@ document.write("Last Published: " + document.lastModified);
 </tr>
         
 </table>
-<a name="N11251"></a><a name="Sample+Runs"></a>
+<a name="N112CA"></a><a name="Sample+Runs"></a>
 <h3 class="h4">Sample Runs</h3>
 <p>Sample text-files as input:</p>
 <p>
@@ -3186,7 +3250,7 @@ document.write("Last Published: " + document.lastModified);
 <br>
         
 </p>
-<a name="N11321"></a><a name="Salient+Points"></a>
+<a name="N1139A"></a><a name="Salient+Points"></a>
 <h3 class="h4">Salient Points</h3>
 <p>The second version of <span class="codefrag">WordCount</span> improves upon the 
         previous one by using some features offered by the Map-Reduce framework:

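The new "Task Execution & Environment" section above says that any `@taskid@` token in `mapred.child.java.opts` is interpolated with the task's id before the child jvm is launched. A minimal, self-contained sketch of that substitution (this is not the actual `TaskRunner` code, and the task id shown is hypothetical):

```java
public class TaskOptsInterpolation {
    // Sketch of the @taskid@ substitution the tutorial describes: every
    // occurrence of "@taskid@" in mapred.child.java.opts is replaced with
    // the attempt's task id before the child jvm command line is built.
    static String interpolate(String javaOpts, String taskid) {
        return javaOpts.replace("@taskid@", taskid);
    }

    public static void main(String[] args) {
        String opts = "-Xmx512M -verbose:gc -Xloggc:/tmp/@taskid@.gc";
        String taskid = "task_200712310000_0001_m_000000_0"; // hypothetical id
        System.out.println(interpolate(opts, taskid));
        // → -Xmx512M -verbose:gc -Xloggc:/tmp/task_200712310000_0001_m_000000_0.gc
    }
}
```

With the example configuration shown in the diff, the GC log of each task attempt thus lands in a file named after its task id.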
File diff suppressed because it is too large
+ 21 - 10
docs/mapred_tutorial.pdf


+ 68 - 9
src/docs/src/documentation/content/xdocs/mapred_tutorial.xml

@@ -1002,6 +1002,64 @@
         <code>DistributedCache</code> for large amounts of (read-only) data.</p>
       </section>
 
+      <section>
+        <title>Task Execution &amp; Environment</title>
+
+        <p>The <code>TaskTracker</code> executes the <code>Mapper</code>/ 
+        <code>Reducer</code>  <em>task</em> as a child process in a separate jvm.
+        </p>
+        
+        <p>The child-task inherits the environment of the parent 
+        <code>TaskTracker</code>. The user can specify additional options to the
+        child-jvm via the <code>mapred.child.java.opts</code> configuration
+        parameter in the <code>JobConf</code> such as non-standard paths for the 
+        run-time linker to search shared libraries via 
+        <code>-Djava.library.path=&lt;&gt;</code> etc. If the 
+        <code>mapred.child.java.opts</code> contains the symbol <em>@taskid@</em> 
+        it is interpolated with value of <code>taskid</code> of the map/reduce
+        task.</p>
+        
+        <p>Here is an example with multiple arguments and substitutions, 
+        showing jvm GC logging, and start of a passwordless JVM JMX agent so that
+        it can connect with jconsole and the likes to watch child memory, 
+        threads and get thread dumps. It also sets the maximum heap-size of the 
+        child jvm to 512MB and adds an additional path to the 
+        <code>java.library.path</code> of the child-jvm.</p>
+
+        <p>
+          <code>&lt;property&gt;</code><br/>
+          &nbsp;&nbsp;<code>&lt;name&gt;mapred.child.java.opts&lt;/name&gt;</code><br/>
+          &nbsp;&nbsp;<code>&lt;value&gt;</code><br/>
+          &nbsp;&nbsp;&nbsp;&nbsp;<code>
+                    -Xmx512M -Djava.library.path=/home/mycompany/lib
+                    -verbose:gc -Xloggc:/tmp/@taskid@.gc</code><br/>
+          &nbsp;&nbsp;&nbsp;&nbsp;<code>
+                    -Dcom.sun.management.jmxremote.authenticate=false 
+                    -Dcom.sun.management.jmxremote.ssl=false</code><br/>
+          &nbsp;&nbsp;<code>&lt;/value&gt;</code><br/>
+          <code>&lt;/property&gt;</code>
+        </p>
+        
+        <p>The <a href="#DistributedCache">DistributedCache</a> can also be used
+        as a rudimentary software distribution mechanism for use in the map 
+        and/or reduce tasks. It can be used to distribute both jars and 
+        native libraries. The 
+        <a href="ext:api/org/apache/hadoop/filecache/distributedcache/addarchivetoclasspath">
+        DistributedCache.addArchiveToClassPath(Path, Configuration)</a> or 
+        <a href="ext:api/org/apache/hadoop/filecache/distributedcache/addfiletoclasspath">
+        DistributedCache.addFileToClassPath(Path, Configuration)</a> api can 
+        be used to cache files/jars and also add them to the <em>classpath</em> 
+        of child-jvm. Similarly the facility provided by the 
+        <code>DistributedCache</code> where-in it symlinks the cached files into
+        the working directory of the task can be used to distribute native 
+        libraries and load them. The underlying detail is that child-jvm always 
+        has its <em>current working directory</em> added to the
+        <code>java.library.path</code> and hence the cached libraries can be 
+        loaded via <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/lang/System.html#loadLibrary(java.lang.String)">
+        System.loadLibrary</a> or <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/lang/System.html#load(java.lang.String)">
+        System.load</a>.</p>
+      </section>
+      
       <section>
         <title>Job Submission and Monitoring</title>
         
@@ -1260,19 +1318,20 @@
           efficiency stems from the fact that the files are only copied once 
           per job and the ability to cache archives which are un-archived on 
           the slaves.</p> 
+          
+          <p><code>DistributedCache</code> tracks the modification timestamps of 
+          the cached files. Clearly the cache files should not be modified by 
+          the application or externally while the job is executing.</p>
 
           <p><code>DistributedCache</code> can be used to distribute simple, 
           read-only data/text files and more complex types such as archives and
           jars. Archives (zip files) are <em>un-archived</em> at the slave nodes.
-          Jars maybe be optionally added to the classpath of the tasks, a
-          rudimentary <em>software distribution</em> mechanism.  Files have 
-          <em>execution permissions</em> set. Optionally users can also direct the
-          <code>DistributedCache</code> to <em>symlink</em> the cached file(s) 
-          into the working directory of the task.</p>
- 
-          <p><code>DistributedCache</code> tracks the modification timestamps of 
-          the cached files. Clearly the cache files should not be modified by 
-          the application or externally while the job is executing.</p>
+          Optionally users can also direct the <code>DistributedCache</code> to 
+          <em>symlink</em> the cached file(s) into the <code>current working 
+          directory</code> of the task via the 
+          <a href="ext:api/org/apache/hadoop/filecache/distributedcache/createsymlink">
+          DistributedCache.createSymlink(Path, Configuration)</a> api. Files 
+          have <em>execution permissions</em> set.</p>
         </section>
         
         <section>

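The documentation added above relies on the `DistributedCache` symlink facility: a cached file is linked into the task's current working directory, which this patch also adds to `java.library.path`, so the child jvm can find it by name. A minimal sketch of the symlink step using `java.nio.file` (an illustration only, not the `DistributedCache` implementation; the file name is hypothetical):

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class SymlinkSketch {
    // Sketch of the facility the docs describe: link a cached file into the
    // task's working directory so the child jvm can load it by name, e.g.
    // via System.loadLibrary once the cwd is on java.library.path.
    static Path linkIntoWorkDir(Path cachedFile, Path workDir) throws Exception {
        Path link = workDir.resolve(cachedFile.getFileName());
        Files.createSymbolicLink(link, cachedFile);
        return link;
    }

    public static void main(String[] args) throws Exception {
        Path cacheDir = Files.createTempDirectory("cache"); // stand-in for the local cache
        Path workDir = Files.createTempDirectory("work");   // stand-in for the task cwd
        Path lib = Files.createFile(cacheDir.resolve("libfoo.so")); // hypothetical library
        Path link = linkIntoWorkDir(lib, workDir);
        System.out.println(Files.isSymbolicLink(link));
    }
}
```

Note that `Files.createSymbolicLink` requires a platform and filesystem that permit symlinks.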
+ 5 - 1
src/docs/src/documentation/content/xdocs/site.xml

@@ -61,7 +61,11 @@ See http://forrest.apache.org/docs/linking.html for more info.
               </configuration>
             </conf>
             <filecache href="filecache/">
-              <distributedcache href="DistributedCache.html" />
+              <distributedcache href="DistributedCache.html">
+                <addarchivetoclasspath href="#addArchiveToClassPath(org.apache.hadoop.fs.Path,%20org.apache.hadoop.conf.Configuration)" />
+                <addfiletoclasspath href="#addFileToClassPath(org.apache.hadoop.fs.Path,%20org.apache.hadoop.conf.Configuration)" />
+                <createsymlink href="#createSymlink(org.apache.hadoop.conf.Configuration)" />
+              </distributedcache>  
             </filecache>
             <fs href="fs/">
               <filesystem href="FileSystem.html" />

+ 23 - 11
src/java/org/apache/hadoop/mapred/TaskRunner.java

@@ -293,19 +293,31 @@ abstract class TaskRunner extends Thread {
       javaOpts = replaceAll(javaOpts, "@taskid@", taskid);
       String [] javaOptsSplit = javaOpts.split(" ");
       
-      //Add java.library.path; necessary for native-hadoop libraries
+      // Add java.library.path; necessary for loading native libraries.
+      //
+      // 1. To support native-hadoop library i.e. libhadoop.so, we add the 
+      //    parent processes' java.library.path to the child. 
+      // 2. We also add the 'cwd' of the task to it's java.library.path to help 
+      //    users distribute native libraries via the DistributedCache.
+      // 3. The user can also specify extra paths to be added to the 
+      //    java.library.path via mapred.child.java.opts.
+      //
       String libraryPath = System.getProperty("java.library.path");
-      if (libraryPath != null) {
-        boolean hasLibrary = false;
-        for(int i=0; i<javaOptsSplit.length ;i++) { 
-          if(javaOptsSplit[i].startsWith("-Djava.library.path=")) {
-            javaOptsSplit[i] += sep + libraryPath;
-            hasLibrary = true;
-            break;
-          }
+      if (libraryPath == null) {
+        libraryPath = workDir.getAbsolutePath();
+      } else {
+        libraryPath += sep + workDir;
+      }
+      boolean hasUserLDPath = false;
+      for(int i=0; i<javaOptsSplit.length ;i++) { 
+        if(javaOptsSplit[i].startsWith("-Djava.library.path=")) {
+          javaOptsSplit[i] += sep + libraryPath;
+          hasUserLDPath = true;
+          break;
         }
-        if(!hasLibrary)
-          vargs.add("-Djava.library.path=" + libraryPath);
+      }
+      if(!hasUserLDPath) {
+        vargs.add("-Djava.library.path=" + libraryPath);
       }
       for (int i = 0; i < javaOptsSplit.length; i++) {
         vargs.add(javaOptsSplit[i]);

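The TaskRunner.java hunk above merges the parent's `java.library.path` with the task's cwd, then either extends a user-supplied `-Djava.library.path=` option or appends a fresh one. A self-contained sketch of that merged logic (names mirror the patch, but this is a standalone illustration, not the actual `TaskRunner` class):

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class LibraryPathSketch {
    // Sketch of the logic this patch adds to TaskRunner:
    // 1. start from the parent process's java.library.path (may be null);
    // 2. append the task's working directory, so libraries symlinked there
    //    by the DistributedCache can be loaded;
    // 3. if the user already passed -Djava.library.path= in the child opts,
    //    extend it; otherwise add a fresh -Djava.library.path= argument.
    static List<String> buildOpts(String[] javaOptsSplit, String parentLibPath,
                                  File workDir) {
        String sep = File.pathSeparator;
        String libraryPath = (parentLibPath == null)
            ? workDir.getAbsolutePath()
            : parentLibPath + sep + workDir;
        List<String> vargs = new ArrayList<>();
        boolean hasUserLDPath = false;
        for (int i = 0; i < javaOptsSplit.length; i++) {
            if (javaOptsSplit[i].startsWith("-Djava.library.path=")) {
                javaOptsSplit[i] += sep + libraryPath;
                hasUserLDPath = true;
                break;
            }
        }
        if (!hasUserLDPath) {
            vargs.add("-Djava.library.path=" + libraryPath);
        }
        for (String opt : javaOptsSplit) {
            vargs.add(opt);
        }
        return vargs;
    }
}
```

Either way, the child jvm ends up with exactly one `-Djava.library.path=` argument that contains the task's cwd, which is what lets `System.loadLibrary` find cached native libraries.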
Some files were not shown because too many files changed in this diff