@@ -1290,26 +1290,35 @@
  semi-random local directory. When the job starts, task tracker
  creates a localized job directory relative to the local directory
  specified in the configuration. Thus the task tracker directory
- structure looks the following: </p>
+ structure looks as follows: </p>
  <ul>
- <li><code>${mapred.local.dir}/taskTracker/archive/</code> :
- The distributed cache. This directory holds the localized distributed
- cache. Thus localized distributed cache is shared among all
- the tasks and jobs </li>
- <li><code>${mapred.local.dir}/taskTracker/jobcache/$jobid/</code> :
- The localized job directory
+ <li><code>${mapred.local.dir}/taskTracker/distcache/</code> :
+ The public distributed cache for the jobs of all users. This directory
+ holds the localized public distributed cache. Thus localized public
+ distributed cache is shared among all the tasks and jobs of all users.
+ </li>
+ <li><code>${mapred.local.dir}/taskTracker/$user/distcache/</code> :
+ The private distributed cache for the jobs of the specific user. This
+ directory holds the localized private distributed cache. Thus localized
+ private distributed cache is shared among all the tasks and jobs of the
+ specific user only. It is not accessible to jobs of other users.
+ </li>
+ <li><code>${mapred.local.dir}/taskTracker/$user/jobcache/$jobid/
+ </code> : The localized job directory
  <ul>
- <li><code>${mapred.local.dir}/taskTracker/jobcache/$jobid/work/</code>
+ <li><code>${mapred.local.dir}/taskTracker/$user/jobcache/$jobid/work/
+ </code>
  : The job-specific shared directory. The tasks can use this space as
  scratch space and share files among them. This directory is exposed
  to the users through the configuration property
  <code>job.local.dir</code>. The directory can accessed through
- api <a href="ext:api/org/apache/hadoop/mapred/jobconf/getjoblocaldir">
+ the API <a href="ext:api/org/apache/hadoop/mapred/jobconf/getjoblocaldir">
  JobConf.getJobLocalDir()</a>. It is available as System property also.
  So, users (streaming etc.) can call
  <code>System.getProperty("job.local.dir")</code> to access the
  directory.</li>
- <li><code>${mapred.local.dir}/taskTracker/jobcache/$jobid/jars/</code>
+ <li><code>${mapred.local.dir}/taskTracker/$user/jobcache/$jobid/jars/
+ </code>
  : The jars directory, which has the job jar file and expanded jar.
  The <code>job.jar</code> is the application's jar file that is
  automatically distributed to each machine. It is expanded in jars
@@ -1318,27 +1327,37 @@
  <a href="ext:api/org/apache/hadoop/mapred/jobconf/getjar">
  JobConf.getJar() </a>. To access the unjarred directory,
  JobConf.getJar().getParent() can be called.</li>
- <li><code>${mapred.local.dir}/taskTracker/jobcache/$jobid/job.xml</code>
+ <li><code>${mapred.local.dir}/taskTracker/$user/jobcache/$jobid/job.xml
+ </code>
  : The job.xml file, the generic job configuration, localized for
  the job. </li>
- <li><code>${mapred.local.dir}/taskTracker/jobcache/$jobid/$taskid</code>
+ <li><code>${mapred.local.dir}/taskTracker/$user/jobcache/$jobid/$taskid
+ </code>
  : The task directory for each task attempt. Each task directory
  again has the following structure :
  <ul>
- <li><code>${mapred.local.dir}/taskTracker/jobcache/$jobid/$taskid/job.xml</code>
+ <li><code>
+ ${mapred.local.dir}/taskTracker/$user/jobcache/$jobid/$taskid/job.xml
+ </code>
  : A job.xml file, task localized job configuration, Task localization
  means that properties have been set that are specific to
  this particular task within the job. The properties localized for
  each task are described below.</li>
- <li><code>${mapred.local.dir}/taskTracker/jobcache/$jobid/$taskid/output</code>
+ <li><code>
+ ${mapred.local.dir}/taskTracker/$user/jobcache/$jobid/$taskid/output
+ </code>
  : A directory for intermediate output files. This contains the
  temporary map reduce data generated by the framework
  such as map output files etc. </li>
- <li><code>${mapred.local.dir}/taskTracker/jobcache/$jobid/$taskid/work</code>
- : The curernt working directory of the task.
+ <li><code>
+ ${mapred.local.dir}/taskTracker/$user/jobcache/$jobid/$taskid/work
+ </code>
+ : The current working directory of the task.
  With <a href="#Task+JVM+Reuse">jvm reuse</a> enabled for tasks, this
  directory will be the directory on which the jvm has started</li>
- <li><code>${mapred.local.dir}/taskTracker/jobcache/$jobid/$taskid/work/tmp</code>
+ <li><code>
+ ${mapred.local.dir}/taskTracker/$user/jobcache/$jobid/$taskid/work/tmp
+ </code>
  : The temporary directory for the task.
  (User can specify the property <code>mapred.child.tmp</code> to set
  the value of temporary directory for map and reduce tasks. This
@@ -1347,7 +1366,7 @@
  directly assigned. The directory will be created if it doesn't exist.
  Then, the child java tasks are executed with option
  <code>-Djava.io.tmpdir='the absolute path of the tmp dir'</code>.
- Anp pipes and streaming are set with environment variable,
+ Pipes and streaming are set with environment variable,
  <code>TMPDIR='the absolute path of the tmp dir'</code>). This
  directory is created, if <code>mapred.child.tmp</code> has the value
  <code>./tmp</code> </li>
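
For reviewers who want to check the new per-user layout against concrete paths, here is a minimal sketch in plain Java that assembles the directories named in the added lines. It is not part of the patch and does not call any Hadoop API. The class name `LocalDirLayout`, its method names, and the sample values (`/data/mapred/local`, `alice`, the job and task attempt IDs) are hypothetical and chosen only for illustration. The only names taken from the documentation are the path components and the `job.local.dir` system property.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper that mirrors the directory layout described in the patch. */
public class LocalDirLayout {
    // ${mapred.local.dir}/taskTracker
    private final Path taskTrackerRoot;

    LocalDirLayout(String mapredLocalDir) {
        this.taskTrackerRoot = Paths.get(mapredLocalDir, "taskTracker");
    }

    /** Public distributed cache, shared by the jobs of all users. */
    Path publicDistCache() {
        return taskTrackerRoot.resolve("distcache");
    }

    /** Private distributed cache, shared only by the jobs of one user. */
    Path privateDistCache(String user) {
        return taskTrackerRoot.resolve(user).resolve("distcache");
    }

    /** Localized job directory. */
    Path jobDir(String user, String jobId) {
        return taskTrackerRoot.resolve(user).resolve("jobcache").resolve(jobId);
    }

    /** Job-specific shared scratch directory (what job.local.dir points at). */
    Path jobWorkDir(String user, String jobId) {
        return jobDir(user, jobId).resolve("work");
    }

    /** Directory of a single task attempt. */
    Path taskDir(String user, String jobId, String taskId) {
        return jobDir(user, jobId).resolve(taskId);
    }

    /** Temporary directory of a task when mapred.child.tmp is ./tmp. */
    Path taskTmpDir(String user, String jobId, String taskId) {
        return taskDir(user, jobId, taskId).resolve("work").resolve("tmp");
    }

    public static void main(String[] args) {
        // All values below are made-up examples, not defaults.
        LocalDirLayout layout = new LocalDirLayout("/data/mapred/local");
        String user = "alice";
        String jobId = "job_200901010000_0001";
        String taskId = "attempt_200901010000_0001_m_000000_0";

        System.out.println(layout.publicDistCache());
        System.out.println(layout.privateDistCache(user));
        System.out.println(layout.jobWorkDir(user, jobId));
        System.out.println(layout.taskTmpDir(user, jobId, taskId));

        // Inside a running task the documentation says this property is set;
        // outside a task it is simply null.
        String shared = System.getProperty("job.local.dir");
        System.out.println(shared != null ? shared : "(job.local.dir is not set outside a task)");
    }
}
```

The sketch moves `$user` in front of `jobcache` and `distcache`, which is the substance of the change: everything except the public distributed cache now lives under a per-user directory.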