~~ Licensed to the Apache Software Foundation (ASF) under one or more
~~ contributor license agreements. See the NOTICE file distributed with
~~ this work for additional information regarding copyright ownership.
~~ The ASF licenses this file to You under the Apache License, Version 2.0
~~ (the "License"); you may not use this file except in compliance with
~~ the License. You may obtain a copy of the License at
~~
~~ http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing, software
~~ distributed under the License is distributed on an "AS IS" BASIS,
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~~ See the License for the specific language governing permissions and
~~ limitations under the License.

  ---
  Hadoop Commands Guide
  ---
  ---
  ${maven.build.timestamp}

%{toc}

Hadoop Commands Guide

* Overview

   All of the Hadoop commands and subprojects follow the same basic structure:

   Usage: <<<shellcommand [SHELL_OPTIONS] [COMMAND] [GENERIC_OPTIONS] [COMMAND_OPTIONS]>>>

*-------------------+----------------------------------------------------------+
|| FIELD            || Description
*-------------------+----------------------------------------------------------+
| shellcommand      | The command of the project being invoked. For example, Hadoop common uses <<<hadoop>>>, HDFS uses <<<hdfs>>>, and YARN uses <<<yarn>>>.
*-------------------+----------------------------------------------------------+
| SHELL_OPTIONS     | Options that the shell processes prior to executing Java.
*-------------------+----------------------------------------------------------+
| COMMAND           | Action to perform.
*-------------------+----------------------------------------------------------+
| GENERIC_OPTIONS   | The common set of options supported by multiple commands.
*-------------------+----------------------------------------------------------+
| COMMAND_OPTIONS   | Various commands with their options are described in this document for the Hadoop common sub-project. HDFS and YARN are covered in other documents.
*-------------------+----------------------------------------------------------+

** {Shell Options}

   All of the shell commands will accept a common set of options. For some
   commands, these options are ignored. For example, <<<--hostnames>>> is
   ignored when passed to a command that only executes on a single host.

*----------------------------+------------------------------------------------+
|| SHELL_OPTION               || Description
*----------------------------+------------------------------------------------+
| <<<--buildpaths>>>          | Enables developer versions of jars.
*----------------------------+------------------------------------------------+
| <<<--config confdir>>>      | Overrides the default configuration directory. Default is <<<${HADOOP_PREFIX}/conf>>>.
*----------------------------+------------------------------------------------+
| <<<--daemon mode>>>         | If the command supports daemonization (e.g., <<<hdfs namenode>>>), execute in the appropriate mode. Supported modes are <<<start>>> to start the process in daemon mode, <<<stop>>> to stop the process, and <<<status>>> to determine the active status of the process. <<<status>>> will return an {{{http://refspecs.linuxbase.org/LSB_3.0.0/LSB-generic/LSB-generic/iniscrptact.html}LSB-compliant}} result code. If no option is provided, commands that support daemonization will run in the foreground.
*----------------------------+------------------------------------------------+
| <<<--debug>>>               | Enables shell-level configuration debugging information.
*----------------------------+------------------------------------------------+
| <<<--help>>>                | Prints shell script usage information.
*----------------------------+------------------------------------------------+
| <<<--hostnames>>>           | A space-delimited list of hostnames on which to execute a multi-host subcommand. By default, the content of the <<<slaves>>> file is used.
*----------------------------+------------------------------------------------+
| <<<--hosts>>>               | A file that contains a list of hostnames on which to execute a multi-host subcommand. By default, the content of the <<<slaves>>> file is used.
*----------------------------+------------------------------------------------+
| <<<--loglevel loglevel>>>   | Overrides the log level. Valid log levels are FATAL, ERROR, WARN, INFO, DEBUG, and TRACE. Default is INFO.
*----------------------------+------------------------------------------------+

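   For example, the following invocation (the configuration directory shown
   is illustrative) overrides the configuration directory and raises the
   shell log level while listing the filesystem root:

+---+
$ hadoop --config /etc/hadoop-custom --loglevel DEBUG fs -ls /
+---+
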
** {Generic Options}

   Many subcommands honor a common set of configuration options to alter their
   behavior:

*------------------------------------------------------+-----------------------------+
|| GENERIC_OPTION                                       || Description
*------------------------------------------------------+-----------------------------+
|<<<-archives \<comma separated list of archives\> >>>  | Specify comma-separated archives to be unarchived on the compute machines. Applies only to jobs.
*------------------------------------------------------+-----------------------------+
|<<<-conf \<configuration file\> >>>                    | Specify an application configuration file.
*------------------------------------------------------+-----------------------------+
|<<<-D \<property\>=\<value\> >>>                       | Use the given value for the given property.
*------------------------------------------------------+-----------------------------+
|<<<-files \<comma separated list of files\> >>>        | Specify comma-separated files to be copied to the map reduce cluster. Applies only to jobs.
*------------------------------------------------------+-----------------------------+
|<<<-jt \<local\> or \<resourcemanager:port\>>>>        | Specify a ResourceManager. Applies only to jobs.
*------------------------------------------------------+-----------------------------+
|<<<-libjars \<comma separated list of jars\> >>>       | Specify comma-separated jar files to include in the classpath. Applies only to jobs.
*------------------------------------------------------+-----------------------------+

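   For example, the following invocation (jar, class, and property values are
   illustrative, and assume the application parses generic options via
   <<<ToolRunner>>>/<<<GenericOptionsParser>>>) sets a configuration property
   and ships a local file to the cluster:

+---+
$ hadoop jar myapp.jar org.example.MyTool -D mapreduce.job.queuename=default \
    -files lookup.txt /input /output
+---+
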
Hadoop Common Commands

   All of these commands are executed from the <<<hadoop>>> shell command.
   They have been broken up into {{User Commands}} and
   {{Administration Commands}}.

* {User Commands}

   Commands useful for users of a Hadoop cluster.

** <<<archive>>>

   Creates a Hadoop archive. More information can be found at the
   {{{../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/HadoopArchives.html}
   Hadoop Archives Guide}}.

** <<<checknative>>>

   Usage: <<<hadoop checknative [-a] [-h] >>>

*-----------------+-----------------------------------------------------------+
|| COMMAND_OPTION || Description
*-----------------+-----------------------------------------------------------+
| -a              | Check that all libraries are available.
*-----------------+-----------------------------------------------------------+
| -h              | Print help information.
*-----------------+-----------------------------------------------------------+

   This command checks the availability of the Hadoop native code. See the
   {{{./NativeLibraries.html}Native Libraries Guide}} for more information. By
   default, this command only checks the availability of libhadoop.

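   For example, to report on the availability of all native libraries rather
   than libhadoop alone:

+---+
$ hadoop checknative -a
+---+
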
** <<<classpath>>>

   Usage: <<<hadoop classpath [--glob|--jar <path>|-h|--help]>>>

*-----------------+-----------------------------------------------------------+
|| COMMAND_OPTION || Description
*-----------------+-----------------------------------------------------------+
| --glob          | Expand wildcards.
*-----------------+-----------------------------------------------------------+
| --jar <path>    | Write the classpath as a manifest in the jar named <path>.
*-----------------+-----------------------------------------------------------+
| -h, --help      | Print help information.
*-----------------+-----------------------------------------------------------+

   Prints the class path needed to get the Hadoop jar and the required
   libraries. If called without arguments, it prints the classpath set up by
   the command scripts, which is likely to contain wildcards in the classpath
   entries. Additional options print the classpath after wildcard expansion or
   write the classpath into the manifest of a jar file. The latter is useful
   in environments where wildcards cannot be used and the expanded classpath
   exceeds the maximum supported command line length.

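   For example, the following invocations (the jar path is illustrative)
   print the fully expanded classpath and write it into a jar manifest,
   respectively:

+---+
$ hadoop classpath --glob
$ hadoop classpath --jar /tmp/hadoop-classpath.jar
+---+
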
** <<<credential>>>

   Usage: <<<hadoop credential <subcommand> [options]>>>

*-------------------------------------------------------+------------------------------------------+
|| COMMAND_OPTION                                        || Description
*-------------------------------------------------------+------------------------------------------+
| create <alias> [-v <value>][-provider <provider-path>] | Prompts the user for a credential to be stored as the given alias when a value is not provided via <<<-v>>>. The <hadoop.security.credential.provider.path> within the core-site.xml file will be used unless a <<<-provider>>> is indicated.
*-------------------------------------------------------+------------------------------------------+
| delete <alias> [-i][-provider <provider-path>]         | Deletes the credential with the provided alias and optionally warns the user when <<<--interactive>>> is used. The <hadoop.security.credential.provider.path> within the core-site.xml file will be used unless a <<<-provider>>> is indicated.
*-------------------------------------------------------+------------------------------------------+
| list [-provider <provider-path>]                       | Lists all of the credential aliases. The <hadoop.security.credential.provider.path> within the core-site.xml file will be used unless a <<<-provider>>> is indicated.
*-------------------------------------------------------+------------------------------------------+

   Command to manage credentials, passwords, and secrets within credential
   providers.

   The CredentialProvider API in Hadoop allows for the separation of
   applications and how they store their required passwords/secrets. In order
   to indicate a particular provider type and location, the user must provide
   the <hadoop.security.credential.provider.path> configuration element in
   core-site.xml or use the command line option <<<-provider>>> on each of
   the following commands. This provider path is a comma-separated list of
   URLs that indicates the type and location of a list of providers that
   should be consulted. For example, the following path:

   <<<user:///,jceks://file/tmp/test.jceks,jceks://hdfs@nn1.example.com/my/path/test.jceks>>>

   indicates that the current user's credentials file should be consulted
   through the User Provider, that the local file located at
   <<</tmp/test.jceks>>> is a Java Keystore Provider, and that the file
   located within HDFS at <<<nn1.example.com/my/path/test.jceks>>> is also a
   store for a Java Keystore Provider.

   The credential command is often used to provision a password or secret to
   a particular credential store provider. In order to explicitly indicate
   which provider store to use, the <<<-provider>>> option should be used.
   Otherwise, given a path of multiple providers, the first non-transient
   provider will be used. This may or may not be the one that you intended.

   Example: <<<-provider jceks://file/tmp/test.jceks>>>

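   Building on that provider path, the following session (the alias is
   illustrative) stores a credential in the local Java Keystore Provider and
   then lists the aliases it holds:

+---+
$ hadoop credential create ssl.server.keystore.password -provider jceks://file/tmp/test.jceks
$ hadoop credential list -provider jceks://file/tmp/test.jceks
+---+
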
** <<<distch>>>

   Usage: <<<hadoop distch [-f urilist_url] [-i] [-log logdir] path:owner:group:permissions>>>

*-------------------+-------------------------------------------------------+
|| COMMAND_OPTION   || Description
*-------------------+-------------------------------------------------------+
| -f                | List of objects to change.
*-------------------+-------------------------------------------------------+
| -i                | Ignore failures.
*-------------------+-------------------------------------------------------+
| -log              | Directory to log output.
*-------------------+-------------------------------------------------------+

   Change the ownership and permissions on many files at once.

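   For example, the following invocation (the path, owner, group, and octal
   permissions are illustrative) changes ownership and permissions under a
   single directory tree:

+---+
$ hadoop distch /user/alice:alice:hadoop:755
+---+
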
** <<<distcp>>>

   Copies files or directories recursively. More information can be found at
   the
   {{{../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/DistCp.html}
   Hadoop DistCp Guide}}.

** <<<fs>>>

   This command is documented in the
   {{{./FileSystemShell.html}File System Shell Guide}}. It is a synonym for
   <<<hdfs dfs>>> when HDFS is in use.

** <<<jar>>>

   Usage: <<<hadoop jar <jar> [mainClass] args...>>>

   Runs a jar file.

   Use {{{../../hadoop-yarn/hadoop-yarn-site/YarnCommands.html#jar}<<<yarn jar>>>}}
   to launch YARN applications instead.

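   For example, the following invocation (the jar, class, and arguments are
   illustrative) runs the class <<<org.example.MyMain>>> from
   <<<myapp.jar>>>:

+---+
$ hadoop jar myapp.jar org.example.MyMain arg1 arg2
+---+
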
** <<<jnipath>>>

   Usage: <<<hadoop jnipath>>>

   Prints the computed java.library.path.

** <<<key>>>

   Manages keys via the KeyProvider.

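   For example, a minimal sketch of typical usage, assuming the
   <<<create>>> and <<<list>>> subcommands provided by the key shell in this
   release (the key name is illustrative):

+---+
$ hadoop key create mykey
$ hadoop key list
+---+
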
** <<<trace>>>

   View and modify Hadoop tracing settings. See the
   {{{./Tracing.html}Tracing Guide}}.

** <<<version>>>

   Usage: <<<hadoop version>>>

   Prints the version.

** <<<CLASSNAME>>>

   Usage: <<<hadoop CLASSNAME>>>

   Runs the class named <<<CLASSNAME>>>. The class must be part of a package.

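   For example, the following invocation (the class name is illustrative and
   assumed to be on the classpath) runs a user-supplied tool:

+---+
$ hadoop org.example.tools.MyTool arg1
+---+
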
* {Administration Commands}

   Commands useful for administrators of a Hadoop cluster.

** <<<daemonlog>>>

   Usage: <<<hadoop daemonlog -getlevel <host:port> <name> >>>

   Usage: <<<hadoop daemonlog -setlevel <host:port> <name> <level> >>>

*---------------------------------------+------------------------------------+
|| COMMAND_OPTION                        || Description
*---------------------------------------+------------------------------------+
| -getlevel <host:port> <name>           | Prints the log level of the daemon running at <host:port>. This command internally connects to http://<host:port>/logLevel?log=<name>.
*---------------------------------------+------------------------------------+
| -setlevel <host:port> <name> <level>   | Sets the log level of the daemon running at <host:port>. This command internally connects to http://<host:port>/logLevel?log=<name>&level=<level>.
*---------------------------------------+------------------------------------+

   Get/set the log level for each daemon.

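   For example, the following invocations (the host, port, and logger name
   are illustrative) read and then raise a daemon's log level:

+---+
$ hadoop daemonlog -getlevel nn1.example.com:50070 org.apache.hadoop.hdfs.server.namenode.NameNode
$ hadoop daemonlog -setlevel nn1.example.com:50070 org.apache.hadoop.hdfs.server.namenode.NameNode DEBUG
+---+
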
* Files

** <<etc/hadoop/hadoop-env.sh>>

   This file stores the global settings used by all Hadoop shell commands.

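   For example, a minimal sketch (the path and value are illustrative; both
   variables appear in the shipped <<<hadoop-env.sh>>> template):

+---+
# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk

# Extra Java runtime options for client-side commands.
export HADOOP_CLIENT_OPTS="-Xmx512m"
+---+
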
** <<etc/hadoop/hadoop-user-functions.sh>>

   This file allows advanced users to override some shell functionality.

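   For example, a minimal sketch of an override, assuming the
   <<<hadoop_error>>> function name from the bundled shell function library
   (function names may differ between releases):

+---+
# Replace the stock error routine so messages also go to syslog.
function hadoop_error
{
  echo "$*" 1>&2
  logger -t hadoop "$*"
}
+---+
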
** <<~/.hadooprc>>

   This stores the personal environment for an individual user. It is
   processed after the hadoop-env.sh and hadoop-user-functions.sh files
   and can contain the same settings.

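   For example, a minimal personal <<<~/.hadooprc>>> (the directory is
   illustrative):

+---+
# Always use my own configuration directory.
export HADOOP_CONF_DIR=${HOME}/conf
+---+
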