@@ -157,7 +157,7 @@ Configuration
To configure `YARN` to use `Federation`, set the following properties in the **conf/yarn-site.xml**:
-###EVERYWHERE:
+### EVERYWHERE:
These are common configurations that should appear in the **conf/yarn-site.xml** on each machine in the federation.
@@ -167,7 +167,7 @@ These are common configurations that should appear in the **conf/yarn-site.xml**
|`yarn.federation.enabled` | `true` | Whether federation is enabled or not |
|`yarn.resourcemanager.cluster-id` | `<unique-subcluster-id>` | The unique subcluster identifier for this RM (same as the one used for HA). |
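A minimal sketch of these two properties in each machine's **conf/yarn-site.xml** (the subcluster ID value is a placeholder):
```xml
<property>
  <name>yarn.federation.enabled</name>
  <value>true</value>
</property>
<property>
  <!-- Placeholder; use this subcluster's own unique identifier. -->
  <name>yarn.resourcemanager.cluster-id</name>
  <value>subcluster1</value>
</property>
```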
-####State-Store:
+#### State-Store:
Currently, we support ZooKeeper-based and SQL-based implementations of the state-store.
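For ZooKeeper, a hedged sketch (the implementation class and the `hadoop.zk.address` property are assumptions not shown in this section; verify them against your release):
```xml
<property>
  <!-- Assumed ZooKeeper-based state-store implementation class. -->
  <name>yarn.federation.state-store.class</name>
  <value>org.apache.hadoop.yarn.server.federation.store.impl.ZookeeperFederationStateStore</value>
</property>
<property>
  <!-- ZooKeeper quorum; hosts and ports are placeholders. -->
  <name>hadoop.zk.address</name>
  <value>zk1:2181,zk2:2181,zk3:2181</value>
</property>
```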
@@ -192,7 +192,7 @@ SQL: one must setup the following parameters:
We provide scripts for **MySQL** and **Microsoft SQL Server**.
-> MySQL
+- MySQL
For MySQL, one must download the latest 5.x version of the JAR from [MVN Repository](https://mvnrepository.com/artifact/mysql/mysql-connector-java) and add it to the CLASSPATH.
Then the DB schema is created by executing the following SQL scripts in the database:
@@ -211,7 +211,7 @@ In the same directory we provide scripts to drop the Stored Procedures, the Tabl
1. MySQL 5.7
2. MySQL 8.0
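A hedged sketch of the SQL state-store parameters referenced above, filled in for MySQL (the class names and the `yarn.federation.state-store.sql.*` property names are assumptions to verify against your release; host, database, and credentials are placeholders):
```xml
<property>
  <!-- Assumed SQL-based state-store implementation class. -->
  <name>yarn.federation.state-store.class</name>
  <value>org.apache.hadoop.yarn.server.federation.store.impl.SQLFederationStateStore</value>
</property>
<property>
  <!-- DataSource class from the MySQL connector JAR on the CLASSPATH. -->
  <name>yarn.federation.state-store.sql.jdbc-class</name>
  <value>com.mysql.jdbc.jdbc2.optional.MysqlDataSource</value>
</property>
<property>
  <name>yarn.federation.state-store.sql.url</name>
  <value>jdbc:mysql://dbhost:3306/FederationStateStore</value>
</property>
<property>
  <name>yarn.federation.state-store.sql.username</name>
  <value>fedadmin</value>
</property>
<property>
  <name>yarn.federation.state-store.sql.password</name>
  <value>changeme</value>
</property>
```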
-> Microsoft SQL Server
+- Microsoft SQL Server
For SQL Server, the process is similar, but the JDBC driver is already included; a configuration sketch follows the version list below.
SQL-Server scripts are located in **sbin/FederationStateStore/SQLServer/**.
@@ -221,10 +221,10 @@ SQL-Server scripts are located in **sbin/FederationStateStore/SQLServer/**.
1. SQL Server 2008 R2 Enterprise
2. SQL Server 2012 Enterprise
3. SQL Server 2016 Enterprise
-4. SQL Server 2017 Enterprise
+4. SQL Server 2017 Enterprise
5. SQL Server 2019 Enterprise
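For SQL Server, the MySQL sketch above applies with a different JDBC class and URL; both values below are assumptions (a DataSource class from the bundled Microsoft JDBC driver and its URL format) and should be verified:
```xml
<property>
  <!-- Assumed DataSource class from the bundled Microsoft JDBC driver. -->
  <name>yarn.federation.state-store.sql.jdbc-class</name>
  <value>com.microsoft.sqlserver.jdbc.SQLServerDataSource</value>
</property>
<property>
  <!-- Host, port, and database name are placeholders. -->
  <name>yarn.federation.state-store.sql.url</name>
  <value>jdbc:sqlserver://dbhost:1433;databaseName=FederationStateStore</value>
</property>
```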
-####Optional:
+#### Optional:
| Property | Example | Description |
|:---- |:---- |:---- |
@@ -235,7 +235,88 @@ SQL-Server scripts are located in **sbin/FederationStateStore/SQLServer/**.
|`yarn.federation.subcluster-resolver.class` | `org.apache.hadoop.yarn.server.federation.resolver.DefaultSubClusterResolverImpl` | The class used to resolve which subcluster a node belongs to, and which subcluster(s) a rack belongs to. |
|`yarn.federation.machine-list` | `<path of machine-list file>` | Path of machine-list file used by `SubClusterResolver`. Each line of the file is a node with sub-cluster and rack information. Below is the example: <br/> <br/> node1, subcluster1, rack1 <br/> node2, subcluster2, rack1 <br/> node3, subcluster3, rack2 <br/> node4, subcluster3, rack2 |
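A minimal sketch wiring up the two optional resolver properties above (the path is a placeholder; the file follows the `node, subcluster, rack` format shown in the table):
```xml
<property>
  <name>yarn.federation.subcluster-resolver.class</name>
  <value>org.apache.hadoop.yarn.server.federation.resolver.DefaultSubClusterResolverImpl</value>
</property>
<property>
  <!-- Placeholder path to the machine-list file. -->
  <name>yarn.federation.machine-list</name>
  <value>/etc/hadoop/conf/machine-list</value>
</property>
```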
-###ON RMs:
+**How to configure the policy-manager?**
+
+- Router Policy
+
+  Router Policy defines the logic for routing an application submission and determines the HomeSubCluster for the application.
+
+ - HashBasedRouterPolicy
+ - This policy selects a sub-cluster based on the hash of the job's queue name. It is particularly useful when dealing with a large number of queues in a system, providing a default behavior. Furthermore, it ensures that all jobs belonging to the same queue are consistently mapped to the same sub-cluster, which can improve locality and performance.
+ - LoadBasedRouterPolicy
+    - This is a simplified load-balancing policy. It uses binary weights (0/1 values) to enable or disable each sub-cluster and forwards the application to the enabled sub-cluster with the least load.
+ - LocalityRouterPolicy
+    - This policy selects the sub-cluster based on the node specified by the client for running its application. It behaves as follows:
+      - It succeeds if:
+        - There are three AMContainerResourceRequests, in the order NODE, RACK, ANY.
+      - It falls back to WeightedRandomRouterPolicy if:
+        - The AMContainerResourceRequests are null or empty;
+        - There is a single AMContainerResourceRequest with ANY as its ResourceName;
+        - The node is in a blacklisted SubCluster.
+      - It fails if:
+        - The node does not exist and RelaxLocality is False;
+        - The number of resource requests is invalid (not 0, 1, or 3).
+ - RejectRouterPolicy
+ - This policy simply rejects all incoming requests.
+ - UniformRandomRouterPolicy
+    - This simple policy picks uniformly at random among the currently active sub-clusters. It is easy to use and good for testing.
+ - WeightedRandomRouterPolicy
+ - This policy implements a weighted random sample among currently active sub-clusters.
+
+- AMRM Policy
+
+  AMRM Policy defines the logic for splitting the resource request list received from the AM among the RMs.
+
+ - BroadcastAMRMProxyPolicy
+ - This policy simply broadcasts each ResourceRequest to all the available sub-clusters.
+ - HomeAMRMProxyPolicy
+ - This policy simply sends the ResourceRequest to the home sub-cluster.
+ - LocalityMulticastAMRMProxyPolicy
+    - Host-localized ResourceRequests are always forwarded to the RM that owns the corresponding node, based on the feedback of a SubClusterResolver. If the SubClusterResolver cannot resolve this node, we default to forwarding the ResourceRequest to the home sub-cluster.
+    - Rack-localized ResourceRequests are forwarded to the RMs that own the corresponding rack. Note that in some deployments each rack could be striped across multiple RMs; this policy respects that. If the SubClusterResolver cannot resolve this rack, we default to forwarding the ResourceRequest to the home sub-cluster.
+    - ANY requests corresponding to node/rack-local requests are forwarded only to the set of RMs that own the corresponding localized requests. The number of containers listed in each ANY is proportional to the number of localized container requests (associated with this ANY via the same allocateRequestId).
+ - RejectAMRMProxyPolicy
+ - This policy simply rejects all requests. Useful to prevent apps from accessing any sub-cluster.
+
+- Policy Manager
+
+  The PolicyManager provides a combination of a RouterPolicy and an AMRMPolicy.
+
+  We can set the policy-manager like this:
+ ```xml
+ <!--
+  We provide 6 PolicyManagers. They share a common prefix: org.apache.hadoop.yarn.server.federation.policies.manager
+ 1. HashBroadcastPolicyManager
+ 2. HomePolicyManager
+ 3. PriorityBroadcastPolicyManager
+ 4. RejectAllPolicyManager
+ 5. UniformBroadcastPolicyManager
+ 6. WeightedLocalityPolicyManager
+ -->
+ <property>
+ <name>yarn.federation.policy-manager</name>
+ <value>org.apache.hadoop.yarn.server.federation.policies.manager.HashBroadcastPolicyManager</value>
+ </property>
+ ```
+
+ - HashBroadcastPolicyManager
+    - Policy that routes applications by hashing their queue name and broadcasts resource requests. This picks a HashBasedRouterPolicy for the router and a BroadcastAMRMProxyPolicy for the amrmproxy, as they are designed to work together.
+ - HomePolicyManager
+ - Policy manager which uses the UniformRandomRouterPolicy for the Router and HomeAMRMProxyPolicy as the AMRMProxy policy to find the RM.
+ - PriorityBroadcastPolicyManager
+    - Policy that allows the operator to configure "weights" for routing. This picks a PriorityRouterPolicy for the router and a BroadcastAMRMProxyPolicy for the amrmproxy, as they are designed to work together.
+ - RejectAllPolicyManager
+ - This policy rejects all requests for both router and amrmproxy routing. This picks a RejectRouterPolicy for the router and a RejectAMRMProxyPolicy for the amrmproxy as they are designed to work together.
+ - UniformBroadcastPolicyManager
+    - It combines the basic policies UniformRandomRouterPolicy and BroadcastAMRMProxyPolicy, which are designed to work together and "spread" the load among sub-clusters uniformly. This simple policy might impose a heavy load on the RMs and return more containers than a job requested, as all requests are (replicated and) broadcast.
+ - WeightedLocalityPolicyManager
+    - Policy that allows the operator to configure "weights" for routing. This picks a LocalityRouterPolicy for the router and a LocalityMulticastAMRMProxyPolicy for the amrmproxy, as they are designed to work together.
+
+### ON RMs:
These are extra configurations that should appear in the **conf/yarn-site.xml** on each ResourceManager.