YARN-11524. Improve the Policy Description in Federation.md. (#5797)

slfan1989 1 year ago
parent
commit
33b1677e9e

+ 88 - 7
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/Federation.md

@@ -157,7 +157,7 @@ Configuration
 
   To configure `YARN` to use `Federation`, set the following properties in **conf/yarn-site.xml**:
 
-###EVERYWHERE:
+### EVERYWHERE:
 
 These are common configurations that should appear in the **conf/yarn-site.xml** at each machine in the federation.
 
@@ -167,7 +167,7 @@ These are common configurations that should appear in the **conf/yarn-site.xml**
 |`yarn.federation.enabled` | `true` | Whether federation is enabled or not |
 |`yarn.resourcemanager.cluster-id` | `<unique-subcluster-id>` | The unique subcluster identifier for this RM (same as the one used for HA). |
 
-####State-Store:
+#### State-Store:
 
 Currently, we support ZooKeeper and SQL based implementations of the state-store.
 
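As an illustration, the two common properties listed in the table above could appear in **conf/yarn-site.xml** roughly as follows. This is a minimal sketch: the id `subcluster1` is a placeholder chosen for the example, and each sub-cluster would use its own unique value.

```xml
<!-- Sketch only: "subcluster1" is a placeholder id, not a value taken from the source. -->
<property>
  <name>yarn.federation.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>subcluster1</value>
</property>
```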
@@ -192,7 +192,7 @@ SQL: one must setup the following parameters:
 
 We provide scripts for **MySQL** and **Microsoft SQL Server**.
 
-> MySQL
+- MySQL
 
 For MySQL, one must download the latest jar version 5.x from [MVN Repository](https://mvnrepository.com/artifact/mysql/mysql-connector-java) and add it to the CLASSPATH.
 Then the DB schema is created by executing the following SQL scripts in the database:
@@ -211,7 +211,7 @@ In the same directory we provide scripts to drop the Stored Procedures, the Tabl
 1. MySQL 5.7
 2. MySQL 8.0
 
-> Microsoft SQL Server
+- Microsoft SQL Server
 
 For SQL-Server, the process is similar, but the jdbc driver is already included.
 SQL-Server scripts are located in **sbin/FederationStateStore/SQLServer/**.
@@ -221,10 +221,10 @@ SQL-Server scripts are located in **sbin/FederationStateStore/SQLServer/**.
 1. SQL Server 2008 R2 Enterprise
 2. SQL Server 2012 Enterprise
 3. SQL Server 2016 Enterprise
-4. SQL Server 2017 Enterprise
+4. SQL Server 2017 Enterprise
 5. SQL Server 2019 Enterprise
 
-####Optional:
+#### Optional:
 
 | Property | Example | Description |
 |:---- |:---- |:---- |
@@ -235,7 +235,88 @@ SQL-Server scripts are located in **sbin/FederationStateStore/SQLServer/**.
 |`yarn.federation.subcluster-resolver.class` | `org.apache.hadoop.yarn.server.federation.resolver.DefaultSubClusterResolverImpl` | The class used to resolve which subcluster a node belongs to, and which subcluster(s) a rack belongs to. |
 |`yarn.federation.machine-list` | `<path of machine-list file>` | Path of machine-list file used by `SubClusterResolver`. Each line of the file is a node with sub-cluster and rack information. Below is the example: <br/> <br/> node1, subcluster1, rack1 <br/> node2, subcluster2, rack1 <br/> node3, subcluster3, rack2 <br/> node4, subcluster3, rack2 |
 
-###ON RMs:
+**How to configure the policy-manager?**
+
+- Router Policy
+
+  The Router policy defines how an application submission is routed, and thereby determines the HomeSubCluster for the application.
+
+  - HashBasedRouterPolicy
+    - This policy selects a sub-cluster based on the hash of the job's queue name. It is useful as a default behavior when a system has a large number of queues. It also ensures that all jobs belonging to the same queue are consistently mapped to the same sub-cluster, which can improve locality and performance.
+  - LoadBasedRouterPolicy
+    - This is a simplified load-balancing policy implementation. The policy uses binary weights (0/1 values) to enable or disable each sub-cluster, and forwards each application to the enabled sub-cluster with the least load.
+  - LocalityRouterPolicy
+    - This policy selects the sub-cluster based on the node that the client specifies for running its application. It behaves as follows:
+      - It succeeds if
+        - There are three AMContainerResourceRequests, in the order NODE, RACK, ANY.
+      - It falls back to WeightedRandomRouterPolicy if
+        - The AMContainerResourceRequests are null or empty;
+        - There is a single AMContainerResourceRequest and its ResourceName is ANY;
+        - The node is in a blacklisted sub-cluster.
+      - It fails if
+        - The node does not exist and RelaxLocality is false;
+        - The number of resource requests is invalid (not 0, 1, or 3).
+  - RejectRouterPolicy
+    - This policy simply rejects all incoming requests.
+  - UniformRandomRouterPolicy
+    - This simple policy picks uniformly at random among the currently active sub-clusters. It is easy to use and good for testing.
+  - WeightedRandomRouterPolicy
+    - This policy implements a weighted random sample among currently active sub-clusters.
+
+- AMRM Policy
+
+  The AMRM policy defines the logic for splitting the list of resource requests received from the AM among the RMs.
+
+  - BroadcastAMRMProxyPolicy
+    - This policy simply broadcasts each ResourceRequest to all the available sub-clusters.
+  - HomeAMRMProxyPolicy
+    - This policy simply sends the ResourceRequest to the home sub-cluster.
+  - LocalityMulticastAMRMProxyPolicy
+    - Host-localized ResourceRequests are always forwarded to the RM that owns the corresponding node, based on the feedback of a SubClusterResolver.
+      If the SubClusterResolver cannot resolve this node, we default to forwarding the ResourceRequest to the home sub-cluster.
+    - Rack-localized ResourceRequests are forwarded to the RMs that own the corresponding rack. Note that in some deployments each rack could be
+      striped across multiple RMs. This policy respects that. If the SubClusterResolver cannot resolve this rack, we default to forwarding
+      the ResourceRequest to the home sub-cluster.
+    - ANY requests corresponding to node/rack-local requests are forwarded only to the set of RMs that own the corresponding localized requests. The number of
+      containers listed in each ANY is proportional to the number of localized container requests (associated to this ANY via the same allocateRequestId).
+  - RejectAMRMProxyPolicy
+    - This policy simply rejects all requests. Useful to prevent apps from accessing any sub-cluster.
+
+- Policy Manager
+
+  The PolicyManager provides a combination of a RouterPolicy and an AMRMPolicy.
+
+  The policy manager can be set like this:
+  ```xml
+      <!--
+       We provide 6 PolicyManagers. They share a common prefix: org.apache.hadoop.yarn.server.federation.policies.manager
+       1. HashBroadcastPolicyManager
+       2. HomePolicyManager
+       3. PriorityBroadcastPolicyManager
+       4. RejectAllPolicyManager
+       5. UniformBroadcastPolicyManager
+       6. WeightedLocalityPolicyManager
+      -->
+      <property>
+         <name>yarn.federation.policy-manager</name>
+         <value>org.apache.hadoop.yarn.server.federation.policies.manager.HashBroadcastPolicyManager</value>
+      </property>
+  ```
+
+  - HashBroadcastPolicyManager
+    - Policy that routes applications by hashing their queue name and broadcasts resource requests. This picks a HashBasedRouterPolicy for the router and a BroadcastAMRMProxyPolicy for the amrmproxy, as they are designed to work together.
+  - HomePolicyManager
+    - Policy manager which uses the UniformRandomRouterPolicy for the Router and HomeAMRMProxyPolicy as the AMRMProxy policy to find the RM.
+  - PriorityBroadcastPolicyManager
+    - Policy that allows the operator to configure "weights" for routing. This picks a PriorityRouterPolicy for the router and a BroadcastAMRMProxyPolicy for the amrmproxy, as they are designed to work together.
+  - RejectAllPolicyManager
+    - This policy rejects all requests for both router and amrmproxy routing. This picks a RejectRouterPolicy for the router and a RejectAMRMProxyPolicy for the amrmproxy as they are designed to work together.
+  - UniformBroadcastPolicyManager
+    - It combines the basic policies UniformRandomRouterPolicy and BroadcastAMRMProxyPolicy, which are designed to work together and "spread" the load among sub-clusters uniformly. This simple policy might impose heavy load on the RMs and return more containers than a job requested, as all requests are replicated and broadcast.
+  - WeightedLocalityPolicyManager
+    - Policy that allows the operator to configure "weights" for routing. This picks a LocalityRouterPolicy for the router and a LocalityMulticastAMRMProxyPolicy for the amrmproxy, as they are designed to work together.
+
+### ON RMs:
 
 These are extra configurations that should appear in the **conf/yarn-site.xml** at each ResourceManager.