
~~ Licensed to the Apache Software Foundation (ASF) under one or more
~~ contributor license agreements. See the NOTICE file distributed with
~~ this work for additional information regarding copyright ownership.
~~ The ASF licenses this file to You under the Apache License, Version 2.0
~~ (the "License"); you may not use this file except in compliance with
~~ the License. You may obtain a copy of the License at
~~
~~     http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing, software
~~ distributed under the License is distributed on an "AS IS" BASIS,
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~~ See the License for the specific language governing permissions and
~~ limitations under the License.
  ---
  Hadoop Commands Guide
  ---
  ---
  ${maven.build.timestamp}

%{toc}

Hadoop Commands Guide

* Overview

  All of the Hadoop commands and subprojects follow the same basic structure:

  Usage: <<<shellcommand [SHELL_OPTIONS] [COMMAND] [GENERIC_OPTIONS] [COMMAND_OPTIONS]>>>

*-----------------------+---------------+
|| FIELD                || Description
*-----------------------+---------------+
| shellcommand | The command of the project being invoked. For example,
| Hadoop common uses <<<hadoop>>>, HDFS uses <<<hdfs>>>,
| and YARN uses <<<yarn>>>.
*-----------------------+---------------+
| SHELL_OPTIONS | Options that the shell processes prior to executing Java.
*-----------------------+---------------+
| COMMAND | Action to perform.
*-----------------------+---------------+
| GENERIC_OPTIONS | The common set of options supported by
| multiple commands.
*-----------------------+---------------+
| COMMAND_OPTIONS | Various commands with their options are
| described in this documentation for the
| Hadoop common sub-project. HDFS and YARN are
| covered in other documents.
*-----------------------+---------------+
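
  For example, in the invocation below (the configuration directory, host
  name, and paths are illustrative, not defaults), <<<hadoop>>> is the
  shellcommand, <<<--config>>> is a shell option, <<<fs>>> is the command,
  <<<-D>>> is a generic option, and <<<-ls />>> is the command option:

+---+
$ hadoop --config /etc/hadoop/conf fs -D fs.defaultFS=hdfs://nn.example.com:8020 -ls /
+---+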

** {Shell Options}

  All of the shell commands will accept a common set of options. For some
  commands, these options are ignored. For example, passing
  <<<--hostnames>>> to a command that only executes on a single host is
  ignored.

*-----------------------+---------------+
|| SHELL_OPTION         || Description
*-----------------------+---------------+
| <<<--buildpaths>>> | Enables developer versions of jars.
*-----------------------+---------------+
| <<<--config confdir>>> | Overrides the default configuration
| directory. Default is <<<${HADOOP_PREFIX}/conf>>>.
*-----------------------+---------------+
| <<<--daemon mode>>> | If the command supports daemonization (e.g.,
| <<<hdfs namenode>>>), execute in the appropriate
| mode. Supported modes are <<<start>>> to start the
| process in daemon mode, <<<stop>>> to stop the
| process, and <<<status>>> to determine the active
| status of the process. <<<status>>> will return
| an {{{http://refspecs.linuxbase.org/LSB_3.0.0/LSB-generic/LSB-generic/iniscrptact.html}LSB-compliant}} result code.
| If no option is provided, commands that support
| daemonization will run in the foreground. See the example
| after this table.
*-----------------------+---------------+
| <<<--debug>>> | Enables shell-level configuration debugging information.
*-----------------------+---------------+
| <<<--help>>> | Shell script usage information.
*-----------------------+---------------+
| <<<--hostnames>>> | A space-delimited list of hostnames on which to
| execute a multi-host subcommand. By default, the content of
| the <<<slaves>>> file is used.
*-----------------------+---------------+
| <<<--hosts>>> | A file that contains a list of hostnames on which to
| execute a multi-host subcommand. By default, the content of the
| <<<slaves>>> file is used.
*-----------------------+---------------+
| <<<--loglevel loglevel>>> | Overrides the log level. Valid log levels are
| FATAL, ERROR, WARN, INFO, DEBUG, and TRACE.
| Default is INFO.
*-----------------------+---------------+
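
  As an illustration of <<<--daemon>>>, the following hypothetical session
  starts a NameNode in daemon mode, checks its status (the LSB result code
  makes the check scriptable), and stops it:

+---+
$ hdfs --daemon start namenode
$ hdfs --daemon status namenode && echo "namenode is running"
$ hdfs --daemon stop namenode
+---+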

** {Generic Options}

  Many subcommands honor a common set of configuration options to alter their
  behavior:

*------------------------------------------------+-----------------------------+
|| GENERIC_OPTION                                 || Description
*------------------------------------------------+-----------------------------+
|<<<-archives \<comma separated list of archives\> >>> | Specify comma-separated
| archives to be unarchived on
| the compute machines. Applies
| only to jobs.
*------------------------------------------------+-----------------------------+
|<<<-conf \<configuration file\> >>> | Specify an application
| configuration file.
*------------------------------------------------+-----------------------------+
|<<<-D \<property\>=\<value\> >>> | Use value for given property.
*------------------------------------------------+-----------------------------+
|<<<-files \<comma separated list of files\> >>> | Specify comma-separated files
| to be copied to the MapReduce
| cluster. Applies only
| to jobs.
*------------------------------------------------+-----------------------------+
|<<<-jt \<local\> or \<resourcemanager:port\>>>> | Specify a ResourceManager.
| Applies only to jobs.
*------------------------------------------------+-----------------------------+
|<<<-libjars \<comma separated list of jars\> >>>| Specify comma-separated jar
| files to include in the
| classpath. Applies only to
| jobs.
*------------------------------------------------+-----------------------------+
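
  For example, the following job submission (the jar, class, and file names
  are illustrative, and the driver is assumed to parse generic options via
  <<<ToolRunner>>>) combines several generic options. Note that generic
  options must precede the command-specific arguments:

+---+
$ hadoop jar wordcount.jar com.example.WordCount \
    -D mapreduce.job.queuename=default \
    -files stopwords.txt \
    -libjars parser.jar \
    input output
+---+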

Hadoop Common Commands

  All of these commands are executed from the <<<hadoop>>> shell command. They
  have been broken up into {{User Commands}} and
  {{Administration Commands}}.

* User Commands

  Commands useful for users of a Hadoop cluster.

** <<<archive>>>

  Creates a Hadoop archive. More information can be found at the
  {{{../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/HadoopArchives.html}
  Hadoop Archives Guide}}.

** <<<checknative>>>

  Usage: <<<hadoop checknative [-a] [-h] >>>

*-----------------+-----------------------------------------------------------+
|| COMMAND_OPTION || Description
*-----------------+-----------------------------------------------------------+
| -a | Check that all libraries are available.
*-----------------+-----------------------------------------------------------+
| -h | Print help.
*-----------------+-----------------------------------------------------------+

  This command checks the availability of the Hadoop native code. See the
  {{{./NativeLibraries.html}Native Libraries Guide}} for more information. By
  default, this command only checks the availability of libhadoop.
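
  For example, to verify every native library rather than just libhadoop,
  pass <<<-a>>>; the command then reports each library it knows about
  (compression codecs, openssl, and so on) as available or not:

+---+
$ hadoop checknative -a
+---+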

** <<<classpath>>>

  Usage: <<<hadoop classpath [--glob|--jar <path>|-h|--help]>>>

*-----------------+-----------------------------------------------------------+
|| COMMAND_OPTION || Description
*-----------------+-----------------------------------------------------------+
| --glob | Expand wildcards.
*-----------------+-----------------------------------------------------------+
| --jar <path> | Write the classpath as a manifest in the jar named <path>.
*-----------------+-----------------------------------------------------------+
| -h, --help | Print help.
*-----------------+-----------------------------------------------------------+

  Prints the class path needed to get the Hadoop jar and the required
  libraries. If called without arguments, then prints the classpath set up by
  the command scripts, which is likely to contain wildcards in the classpath
  entries. Additional options print the classpath after wildcard expansion or
  write the classpath into the manifest of a jar file. The latter is useful in
  environments where wildcards cannot be used and the expanded classpath exceeds
  the maximum supported command line length.
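
  For instance (the jar path is illustrative), to print the expanded
  classpath or capture it for an environment that cannot process wildcards:

+---+
$ hadoop classpath --glob
$ hadoop classpath --jar /tmp/hadoop-classpath.jar
+---+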

** <<<credential>>>

  Usage: <<<hadoop credential <subcommand> [options]>>>

*-------------------+-------------------------------------------------------+
|| COMMAND_OPTION   || Description
*-------------------+-------------------------------------------------------+
| create <alias> [-v <value>][-provider <provider-path>]| Prompts the user for
| a credential to be stored as the given alias when a value
| is not provided via <<<-v>>>. The
| <hadoop.security.credential.provider.path> within the
| core-site.xml file will be used unless a <<<-provider>>> is
| indicated.
*-------------------+-------------------------------------------------------+
| delete <alias> [-i][-provider <provider-path>] | Deletes the credential with
| the provided alias and optionally warns the user when
| <<<-i>>> (interactive mode) is used.
| The <hadoop.security.credential.provider.path> within the
| core-site.xml file will be used unless a <<<-provider>>> is
| indicated.
*-------------------+-------------------------------------------------------+
| list [-provider <provider-path>] | Lists all of the credential aliases.
| The <hadoop.security.credential.provider.path> within the
| core-site.xml file will be used unless a <<<-provider>>> is
| indicated.
*-------------------+-------------------------------------------------------+

  Command to manage credentials, passwords, and secrets within credential
  providers.

  The CredentialProvider API in Hadoop allows for the separation of applications
  from how they store their required passwords/secrets. In order to indicate
  a particular provider type and location, the user must provide the
  <hadoop.security.credential.provider.path> configuration element in core-site.xml
  or use the command line option <<<-provider>>> on each of the following commands.
  This provider path is a comma-separated list of URLs that indicates the type and
  location of a list of providers that should be consulted. For example, the
  following path:

  <<<user:///,jceks://file/tmp/test.jceks,jceks://hdfs@nn1.example.com/my/path/test.jceks>>>

  indicates that the current user's credentials file should be consulted through
  the User Provider, that the local file located at <<</tmp/test.jceks>>> is a Java Keystore
  Provider, and that the file located within HDFS at <<<nn1.example.com/my/path/test.jceks>>>
  is also a store for a Java Keystore Provider.

  The credential command is most often used to provision a password or secret
  to a particular credential store provider. To explicitly indicate which
  provider store to use, pass the <<<-provider>>> option. Otherwise, given a
  path of multiple providers, the first non-transient provider will be used.
  This may or may not be the one that you intended.

  Example: <<<-provider jceks://file/tmp/test.jceks>>>
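
  Putting this together, a hypothetical session that stores a database
  password under the alias <<<mydb.password>>> in a local Java Keystore
  provider and then verifies that it was recorded might look like:

+---+
$ hadoop credential create mydb.password -provider jceks://file/tmp/test.jceks
$ hadoop credential list -provider jceks://file/tmp/test.jceks
+---+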

** <<<distch>>>

  Usage: <<<hadoop distch [-f urilist_url] [-i] [-log logdir] path:owner:group:permissions>>>

*-------------------+-------------------------------------------------------+
||COMMAND_OPTION    || Description
*-------------------+-------------------------------------------------------+
| -f | List of objects to change.
*-------------------+-------------------------------------------------------+
| -i | Ignore failures.
*-------------------+-------------------------------------------------------+
| -log | Directory to log output.
*-------------------+-------------------------------------------------------+

  Change the ownership and permissions on many files at once.
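
  As a sketch (the path, owner, group, and log directory are illustrative),
  the following would hand <<</user/alice>>> to user <<<alice>>> and group
  <<<hadoop>>>, leaving the permissions field empty so that the existing
  permissions are kept:

+---+
$ hadoop distch -log /tmp/distch-logs /user/alice:alice:hadoop:
+---+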

** <<<distcp>>>

  Copies files or directories recursively. More information can be found at the
  {{{../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/DistCp.html}
  Hadoop DistCp Guide}}.

** <<<fs>>>

  This command is documented in the {{{./FileSystemShell.html}File System Shell Guide}}. It is a synonym for <<<hdfs dfs>>> when HDFS is in use.

** <<<jar>>>

  Usage: <<<hadoop jar <jar> [mainClass] args...>>>

  Runs a jar file.

  Use {{{../../hadoop-yarn/hadoop-yarn-site/YarnCommands.html#jar}<<<yarn jar>>>}}
  to launch YARN applications instead.
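
  For example (the jar, class, and paths are hypothetical), to run a driver
  class from a local jar; the <<<mainClass>>> argument may be omitted when
  the jar's manifest names a Main-Class:

+---+
$ hadoop jar myapp.jar com.example.MyDriver /in /out
+---+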

** <<<jnipath>>>

  Usage: <<<hadoop jnipath>>>

  Print the computed java.library.path.

** <<<key>>>

  Manage keys via the KeyProvider.

** <<<trace>>>

  View and modify Hadoop tracing settings. See the {{{./Tracing.html}Tracing Guide}}.

** <<<version>>>

  Usage: <<<hadoop version>>>

  Prints the version.

** <<<CLASSNAME>>>

  Usage: <<<hadoop CLASSNAME>>>

  Runs the class named <<<CLASSNAME>>>. The class must be part of a package.

* {Administration Commands}

  Commands useful for administrators of a Hadoop cluster.

** <<<daemonlog>>>

  Usage: <<<hadoop daemonlog -getlevel <host:port> <name> >>>

  Usage: <<<hadoop daemonlog -setlevel <host:port> <name> <level> >>>

*------------------------------+-----------------------------------------------------------+
|| COMMAND_OPTION              || Description
*------------------------------+-----------------------------------------------------------+
| -getlevel <host:port> <name> | Prints the log level of the daemon running at
| <host:port>. This command internally connects
| to http://<host:port>/logLevel?log=<name>
*------------------------------+-----------------------------------------------------------+
| -setlevel <host:port> <name> <level> | Sets the log level of the daemon
| running at <host:port>. This command internally
| connects to http://<host:port>/logLevel?log=<name>
*------------------------------+-----------------------------------------------------------+

  Get/Set the log level for each daemon.
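
  For example (the host, port, and logger name are illustrative; the port is
  the daemon's HTTP port, not its RPC port):

+---+
$ hadoop daemonlog -getlevel nn.example.com:50070 org.apache.hadoop.hdfs.server.namenode.NameNode
$ hadoop daemonlog -setlevel nn.example.com:50070 org.apache.hadoop.hdfs.server.namenode.NameNode DEBUG
+---+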

* Files

** <<etc/hadoop/hadoop-env.sh>>

  This file stores the global settings used by all Hadoop shell commands.

** <<etc/hadoop/hadoop-user-functions.sh>>

  This file allows advanced users to override some shell functionality.

** <<~/.hadooprc>>

  This stores the personal environment for an individual user. It is
  processed after the hadoop-env.sh and hadoop-user-functions.sh files
  and can contain the same settings.
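
  As a small illustration (the value is an example, not a recommendation), a
  <<<~/.hadooprc>>> might raise the client JVM heap for a single user:

+---+
# Sourced after hadoop-env.sh and hadoop-user-functions.sh
export HADOOP_CLIENT_OPTS="-Xmx2g ${HADOOP_CLIENT_OPTS}"
+---+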