- ~~ Licensed under the Apache License, Version 2.0 (the "License");
- ~~ you may not use this file except in compliance with the License.
- ~~ You may obtain a copy of the License at
- ~~
- ~~ http://www.apache.org/licenses/LICENSE-2.0
- ~~
- ~~ Unless required by applicable law or agreed to in writing, software
- ~~ distributed under the License is distributed on an "AS IS" BASIS,
- ~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- ~~ See the License for the specific language governing permissions and
- ~~ limitations under the License. See accompanying LICENSE file.
- ---
- Service Level Authorization Guide
- ---
- ---
- ${maven.build.timestamp}
- Service Level Authorization Guide
- %{toc|section=1|fromDepth=0}
- * Purpose
- This document describes how to configure and manage Service Level
- Authorization for Hadoop.
- * Prerequisites
- Make sure Hadoop is installed, configured, and set up correctly. For more
- information see:
- * {{{./SingleCluster.html}Single Node Setup}} for first-time users.
- * {{{./ClusterSetup.html}Cluster Setup}} for large, distributed clusters.
- * Overview
- Service Level Authorization is the initial authorization mechanism to
- ensure clients connecting to a particular Hadoop service have the
- necessary pre-configured permissions and are authorized to access the
- given service. For example, a MapReduce cluster can use this mechanism
- to allow a configured list of users/groups to submit jobs.
- The <<<${HADOOP_CONF_DIR}/hadoop-policy.xml>>> configuration file is used to
- define the access control lists for various Hadoop services.
- Service Level Authorization is performed before other access
- control checks, such as file-permission checks and access control on job
- queues.
- * Configuration
- This section describes how to configure service-level authorization via
- the configuration file <<<${HADOOP_CONF_DIR}/hadoop-policy.xml>>>.
- ** Enable Service Level Authorization
- By default, service-level authorization is disabled for Hadoop. To
- enable it, set the configuration property <<<hadoop.security.authorization>>>
- to <<<true>>> in <<<${HADOOP_CONF_DIR}/core-site.xml>>>.
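- As a minimal sketch, the corresponding entry in
- <<<${HADOOP_CONF_DIR}/core-site.xml>>> could look like the following. The
- property name and value are the ones named above; the rest of the file is
- assumed to exist already.
- ----
- <property>
-   <name>hadoop.security.authorization</name>
-   <value>true</value>
- </property>
- ----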
- ** Hadoop Services and Configuration Properties
- This section lists the Hadoop services and the configuration property
- that holds the access control list for each:
- *-------------------------------------+--------------------------------------+
- || Property || Service
- *-------------------------------------+--------------------------------------+
- security.client.protocol.acl | ACL for ClientProtocol, which is used by user code via the DistributedFileSystem.
- *-------------------------------------+--------------------------------------+
- security.client.datanode.protocol.acl | ACL for ClientDatanodeProtocol, the client-to-datanode protocol for block recovery.
- *-------------------------------------+--------------------------------------+
- security.datanode.protocol.acl | ACL for DatanodeProtocol, which is used by datanodes to communicate with the namenode.
- *-------------------------------------+--------------------------------------+
- security.inter.datanode.protocol.acl | ACL for InterDatanodeProtocol, the inter-datanode protocol for updating the generation timestamp.
- *-------------------------------------+--------------------------------------+
- security.namenode.protocol.acl | ACL for NamenodeProtocol, the protocol used by the secondary namenode to communicate with the namenode.
- *-------------------------------------+--------------------------------------+
- security.inter.tracker.protocol.acl | ACL for InterTrackerProtocol, used by the tasktrackers to communicate with the jobtracker.
- *-------------------------------------+--------------------------------------+
- security.job.submission.protocol.acl | ACL for JobSubmissionProtocol, used by job clients to communicate with the jobtracker for job submission, querying job status, etc.
- *-------------------------------------+--------------------------------------+
- security.task.umbilical.protocol.acl | ACL for TaskUmbilicalProtocol, used by the map and reduce tasks to communicate with the parent tasktracker.
- *-------------------------------------+--------------------------------------+
- security.refresh.policy.protocol.acl | ACL for RefreshAuthorizationPolicyProtocol, used by the dfsadmin and mradmin commands to refresh the security policy in effect.
- *-------------------------------------+--------------------------------------+
- security.ha.service.protocol.acl | ACL for the HAService protocol, used by HAAdmin to manage the active and standby states of the namenode.
- *-------------------------------------+--------------------------------------+
- ** Access Control Lists
- <<<${HADOOP_CONF_DIR}/hadoop-policy.xml>>> defines an access control list for
- each Hadoop service. Every access control list has a simple format:
- a comma-separated list of user names, followed by a space, followed by a
- comma-separated list of group names.
- Example: <<<user1,user2 group1,group2>>>.
- To specify only groups, begin the value with a space. To specify only
- users, give the comma-separated list of users, optionally followed by
- a space.
- A special value of <<<*>>> implies that all users are allowed to access the
- service.
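- The two one-sided forms can be sketched as follows. This is an illustration
- only: the property names are taken from the table above, while the group
- <<<hdfsusers>>> and the user <<<mapred>>> are hypothetical names chosen for
- the example.
- ----
- <!-- Groups only: the value begins with a space (hypothetical group) -->
- <property>
-   <name>security.client.protocol.acl</name>
-   <value> hdfsusers</value>
- </property>
-
- <!-- Users only: no space and no group list (hypothetical user) -->
- <property>
-   <name>security.inter.tracker.protocol.acl</name>
-   <value>mapred</value>
- </property>
- ----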
-
- If an access control list is not defined for a service, the value of
- <<<security.service.authorization.default.acl>>> is applied. If
- <<<security.service.authorization.default.acl>>> is not defined, <<<*>>> is applied.
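- For illustration, a cluster-wide fallback might be declared as shown here.
- This is only a sketch: the property name comes from the text above, and the
- group <<<hadoopusers>>> is a hypothetical name. The value begins with a space
- because it lists only a group.
- ----
- <!-- Applies to every service without its own ACL (hypothetical group) -->
- <property>
-   <name>security.service.authorization.default.acl</name>
-   <value> hadoopusers</value>
- </property>
- ----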
- ** Blocked Access Control Lists
- In some cases, it is necessary to specify a blocked access control list for a service. This list names
- the users and groups who are not authorized to access the service. The format of
- the blocked access control list is the same as that of the access control list. The blocked access
- control list can be specified via <<<${HADOOP_CONF_DIR}/hadoop-policy.xml>>>. The property name
- is derived by appending <<<.blocked>>> to the name of the corresponding access control list property.
- Example: The property name of the blocked access control list for <<<security.client.protocol.acl>>>
- is <<<security.client.protocol.acl.blocked>>>.
- For a service, it is possible to specify both an access control list and a blocked access control
- list. A user is authorized to access the service if the user is in the access control list and not in
- the blocked access control list.
- If a blocked access control list is not defined for a service, the value of
- <<<security.service.authorization.default.acl.blocked>>> is applied. If
- <<<security.service.authorization.default.acl.blocked>>> is not defined,
- an empty blocked access control list is applied.
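- A sketch of the two lists used together is shown here, assuming a
- hypothetical user named <<<mallory>>> who should be denied access. Under the
- rule described above, every user except <<<mallory>>> would be authorized
- for ClientProtocol.
- ----
- <!-- Allow everyone ... -->
- <property>
-   <name>security.client.protocol.acl</name>
-   <value>*</value>
- </property>
-
- <!-- ... except the users listed here (hypothetical user name) -->
- <property>
-   <name>security.client.protocol.acl.blocked</name>
-   <value>mallory</value>
- </property>
- ----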
- ** Refreshing Service Level Authorization Configuration
- The service-level authorization configuration for the NameNode and
- JobTracker can be changed without restarting either of the Hadoop
- master daemons. The cluster administrator can change
- <<<${HADOOP_CONF_DIR}/hadoop-policy.xml>>> on the master nodes and instruct
- the NameNode and JobTracker to reload their respective configurations
- via the <<<-refreshServiceAcl>>> switch to the <<<dfsadmin>>> and <<<mradmin>>> commands,
- respectively.
- Refresh the service-level authorization configuration for the NameNode:
- ----
- $ bin/hadoop dfsadmin -refreshServiceAcl
- ----
- Refresh the service-level authorization configuration for the
- JobTracker:
- ----
- $ bin/hadoop mradmin -refreshServiceAcl
- ----
- The <<<security.refresh.policy.protocol.acl>>>
- property in <<<${HADOOP_CONF_DIR}/hadoop-policy.xml>>> can be used to restrict
- the ability to refresh the service-level authorization configuration to
- certain users/groups.
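- As a sketch, limiting refresh rights to a single administrative group might
- look like the following. The group <<<hadoopadmins>>> is a hypothetical name
- used only for this example; the leading space marks the value as a group
- list.
- ----
- <!-- Only members of the (hypothetical) hadoopadmins group may refresh -->
- <property>
-   <name>security.refresh.policy.protocol.acl</name>
-   <value> hadoopadmins</value>
- </property>
- ----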
- ** Examples
- Allow only users <<<alice>>>, <<<bob>>> and users in the <<<mapreduce>>> group to submit
- jobs to the MapReduce cluster:
- ----
- <property>
-   <name>security.job.submission.protocol.acl</name>
-   <value>alice,bob mapreduce</value>
- </property>
- ----
- Allow only DataNodes running as users who belong to the group
- <<<datanodes>>> to communicate with the NameNode (the value begins with a
- space because only a group is listed):
- ----
- <property>
-   <name>security.datanode.protocol.acl</name>
-   <value> datanodes</value>
- </property>
- ----
- Allow any user to talk to the HDFS cluster as a DFSClient:
- ----
- <property>
-   <name>security.client.protocol.acl</name>
-   <value>*</value>
- </property>
- ----