  1. ~~ Licensed under the Apache License, Version 2.0 (the "License");
  2. ~~ you may not use this file except in compliance with the License.
  3. ~~ You may obtain a copy of the License at
  4. ~~
  5. ~~ http://www.apache.org/licenses/LICENSE-2.0
  6. ~~
  7. ~~ Unless required by applicable law or agreed to in writing, software
  8. ~~ distributed under the License is distributed on an "AS IS" BASIS,
  9. ~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  10. ~~ See the License for the specific language governing permissions and
  11. ~~ limitations under the License. See accompanying LICENSE file.
  12. ---
  13. Service Level Authorization Guide
  14. ---
  15. ---
  16. ${maven.build.timestamp}
  17. Service Level Authorization Guide
  18. %{toc|section=1|fromDepth=0}
  19. * Purpose
  20. This document describes how to configure and manage Service Level
  21. Authorization for Hadoop.
  22. * Prerequisites
  23. Make sure Hadoop is installed, configured, and set up correctly. For more
  24. information, see:
  25. * {{{./SingleCluster.html}Single Node Setup}} for first-time users.
  26. * {{{./ClusterSetup.html}Cluster Setup}} for large, distributed clusters.
  27. * Overview
  28. Service Level Authorization is the initial authorization mechanism to
  29. ensure clients connecting to a particular Hadoop service have the
  30. necessary, pre-configured permissions and are authorized to access the
  31. given service. For example, a MapReduce cluster can use this mechanism
  32. to allow a configured list of users/groups to submit jobs.
  33. The <<<${HADOOP_CONF_DIR}/hadoop-policy.xml>>> configuration file is used to
  34. define the access control lists for various Hadoop services.
  35. Service Level Authorization is performed well before other access
  36. control checks such as file-permission checks, access control on job
  37. queues, etc.
  38. * Configuration
  39. This section describes how to configure service-level authorization via
  40. the configuration file <<<${HADOOP_CONF_DIR}/hadoop-policy.xml>>>.
  41. ** Enable Service Level Authorization
  42. By default, service-level authorization is disabled for Hadoop. To
  43. enable it, set the configuration property
  44. <<<hadoop.security.authorization>>> to true in <<<${HADOOP_CONF_DIR}/core-site.xml>>>.
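As a minimal sketch, the corresponding entry in <<<${HADOOP_CONF_DIR}/core-site.xml>>> could look like the following. Only the property name and value are taken from the text above; the surrounding layout is illustrative.
----
<property>
<name>hadoop.security.authorization</name>
<value>true</value>
</property>
----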
  45. ** Hadoop Services and Configuration Properties
  46. This section lists the various Hadoop services and their configuration
  47. knobs:
  48. *-------------------------------------+--------------------------------------+
  49. || Property || Service
  50. *-------------------------------------+--------------------------------------+
  51. security.client.protocol.acl | ACL for ClientProtocol, which is used by user code via the DistributedFileSystem.
  52. *-------------------------------------+--------------------------------------+
  53. security.client.datanode.protocol.acl | ACL for ClientDatanodeProtocol, the client-to-datanode protocol for block recovery.
  54. *-------------------------------------+--------------------------------------+
  55. security.datanode.protocol.acl | ACL for DatanodeProtocol, which is used by datanodes to communicate with the namenode.
  56. *-------------------------------------+--------------------------------------+
  57. security.inter.datanode.protocol.acl | ACL for InterDatanodeProtocol, the inter-datanode protocol for updating generation timestamp.
  58. *-------------------------------------+--------------------------------------+
  59. security.namenode.protocol.acl | ACL for NamenodeProtocol, the protocol used by the secondary namenode to communicate with the namenode.
  60. *-------------------------------------+--------------------------------------+
  61. security.inter.tracker.protocol.acl | ACL for InterTrackerProtocol, used by the tasktrackers to communicate with the jobtracker.
  62. *-------------------------------------+--------------------------------------+
  63. security.job.submission.protocol.acl | ACL for JobSubmissionProtocol, used by job clients to communicate with the jobtracker for job submission, querying job status, etc.
  64. *-------------------------------------+--------------------------------------+
  65. security.task.umbilical.protocol.acl | ACL for TaskUmbilicalProtocol, used by the map and reduce tasks to communicate with the parent tasktracker.
  66. *-------------------------------------+--------------------------------------+
  67. security.refresh.policy.protocol.acl | ACL for RefreshAuthorizationPolicyProtocol, used by the dfsadmin and mradmin commands to refresh the security policy in effect.
  68. *-------------------------------------+--------------------------------------+
  69. security.ha.service.protocol.acl | ACL for the HAService protocol, used by HAAdmin to manage the active and standby states of the namenode.
  70. *-------------------------------------+--------------------------------------+
  71. ** Access Control Lists
  72. <<<${HADOOP_CONF_DIR}/hadoop-policy.xml>>> defines an access control list for
  73. each Hadoop service. Every access control list has a simple format:
  74. The list of users and the list of groups are both comma-separated lists of names.
  75. The two lists are separated by a space.
  76. Example: <<<user1,user2 group1,group2>>>.
  77. Add a blank at the beginning of the line if only a list of groups is to
  78. be provided; equivalently, a comma-separated list of users followed by
  79. a space or by nothing specifies only the given users.
  80. A special value of <<<*>>> implies that all users are allowed to access the
  81. service.
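The forms described above can be sketched as follows, using properties from the table of services. The user and group names (<<<alice>>>, <<<bob>>>, <<<group1>>>, <<<group2>>>) are placeholders, and the groups-only entry assumes that the leading blank inside the value element is preserved when the file is read.
----
<!-- Users and groups: alice and bob, plus members of group1 -->
<property>
<name>security.client.protocol.acl</name>
<value>alice,bob group1</value>
</property>
<!-- Groups only: note the leading blank before the group list -->
<property>
<name>security.datanode.protocol.acl</name>
<value> group1,group2</value>
</property>
<!-- Users only: no group list follows the user list -->
<property>
<name>security.namenode.protocol.acl</name>
<value>alice,bob</value>
</property>
----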
  82. If an access control list is not defined for a service, the value of
  83. <<<security.service.authorization.default.acl>>> is applied. If
  84. <<<security.service.authorization.default.acl>>> is not defined, <<<*>>> is applied.
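For illustration, a default more restrictive than <<<*>>> might be declared as shown here. This sketch assumes the default is set alongside the per-service ACLs in <<<${HADOOP_CONF_DIR}/hadoop-policy.xml>>>; the user <<<hadoopadmin>>> and group <<<operators>>> are hypothetical names.
----
<!-- Applied to any service that has no ACL of its own -->
<property>
<name>security.service.authorization.default.acl</name>
<value>hadoopadmin operators</value>
</property>
----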
  85. ** Blocked Access Control Lists
  86. In some cases, it is necessary to specify a blocked access control list for a service. This specifies
  87. the list of users and groups who are not authorized to access the service. The format of
  88. the blocked access control list is the same as that of the access control list. The blocked access
  89. control list can be specified via <<<${HADOOP_CONF_DIR}/hadoop-policy.xml>>>. The property name
  90. is derived by appending ".blocked" to the name of the access control list property.
  91. Example: The property name of the blocked access control list for <<<security.client.protocol.acl>>>
  92. is <<<security.client.protocol.acl.blocked>>>.
  93. For a service, it is possible to specify both an access control list and a blocked access
  94. control list. A user is authorized to access the service if the user is in the access control
  95. list and not in the blocked access control list.
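A sketch of how the two lists might be combined: the pair below would admit every user except one. The user name <<<mallory>>> is a placeholder, not a name from any real deployment.
----
<!-- Allow everyone to act as a DFS client ... -->
<property>
<name>security.client.protocol.acl</name>
<value>*</value>
</property>
<!-- ... except the user listed in the blocked ACL -->
<property>
<name>security.client.protocol.acl.blocked</name>
<value>mallory</value>
</property>
----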
  96. If a blocked access control list is not defined for a service, the value of
  97. <<<security.service.authorization.default.acl.blocked>>> is applied. If
  98. <<<security.service.authorization.default.acl.blocked>>> is not defined,
  99. an empty blocked access control list is applied.
  100. ** Refreshing Service Level Authorization Configuration
  101. The service-level authorization configuration for the NameNode and
  102. JobTracker can be changed without restarting either of the Hadoop
  103. master daemons. The cluster administrator can change
  104. <<<${HADOOP_CONF_DIR}/hadoop-policy.xml>>> on the master nodes and instruct
  105. the NameNode and JobTracker to reload their respective configurations
  106. via the <<<-refreshServiceAcl>>> switch to the <<<dfsadmin>>> and <<<mradmin>>> commands,
  107. respectively.
  108. Refresh the service-level authorization configuration for the NameNode:
  109. ----
  110. $ bin/hadoop dfsadmin -refreshServiceAcl
  111. ----
  112. Refresh the service-level authorization configuration for the
  113. JobTracker:
  114. ----
  115. $ bin/hadoop mradmin -refreshServiceAcl
  116. ----
  117. Use the <<<security.refresh.policy.protocol.acl>>>
  118. property in <<<${HADOOP_CONF_DIR}/hadoop-policy.xml>>> to restrict the
  119. ability to refresh the service-level authorization configuration to
  120. certain users/groups.
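For instance, a restriction of this kind might be written as follows. The user <<<hadoopadmin>>> and the group <<<operators>>> are hypothetical; substitute the administrative accounts used on your cluster.
----
<!-- Only hadoopadmin and members of operators may run -refreshServiceAcl -->
<property>
<name>security.refresh.policy.protocol.acl</name>
<value>hadoopadmin operators</value>
</property>
----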
  121. ** Examples
  122. Allow only users <<<alice>>>, <<<bob>>> and users in the <<<mapreduce>>> group to submit
  123. jobs to the MapReduce cluster:
  124. ----
  125. <property>
  126. <name>security.job.submission.protocol.acl</name>
  127. <value>alice,bob mapreduce</value>
  128. </property>
  129. ----
  130. Allow only DataNodes running as users who belong to the group
  131. <<<datanodes>>> to communicate with the NameNode:
  132. ----
  133. <property>
  134. <name>security.datanode.protocol.acl</name>
  135. <value> datanodes</value>
  136. </property>
  137. ----
  138. Allow any user to talk to the HDFS cluster as a DFSClient:
  139. ----
  140. <property>
  141. <name>security.client.protocol.acl</name>
  142. <value>*</value>
  143. </property>
  144. ----