Explorar o código

HDFS-14558. RBF: Isolation/Fairness documentation. Contributed by Fengnan Li.

Yiqun Lin %!s(int64=4) %!d(string=hai) anos
pai
achega
0d7ac54510

+ 23 - 0
hadoop-hdfs-project/hadoop-hdfs-rbf/src/site/markdown/HDFSRouterFederation.md

@@ -184,6 +184,18 @@ Router relies on a state store to distribute tokens across all routers. Apart fr
 See the Apache JIRA ticket [HDFS-13532](https://issues.apache.org/jira/browse/HDFS-13532) for more information on this feature.
 
 
+### Isolation
+Router supports assignment of the dedicated number of RPC handlers to achieve isolation for all downstream nameservices it is configured to proxy. Since large or busy clusters may have relatively higher RPC traffic to the namenode compared to other clusters namenodes, this feature if enabled allows admins to configure higher number of RPC handlers for busy clusters. If dedicated handlers are not assigned for specific nameservices, equal distribution of RPC handlers is done for all configured nameservices. **Note** Fanout calls are treated as targeting a special nameservice, thus can be configured with handlers as well.
+
+If a downstream namenode is slow/busy enough that permits are unavailable, routers would throw StandByException exception to the client. This would in turn trigger a failover behavior at the client side and clients would connect to a different router in the cluster. This offers a positive effect of automatically load balancing RPCs across all routers nodes. This is important to ensure that a single router does not become a bottleneck in case of unhealthy namenodes and all handlers available in the entire router cluster are utilized.
+
+Users can configure handlers based on steady state load that individual downstream namenodes expect and can introduce more routers to the cluster to handle more RPCs overall. Because of the bouncing behavior that clients automatically get in this feature in an event where certain namenodes are overloaded, good clients connecting to good namenodes will always continue to use Rpc lanes dedicated to them. For bad behaving namenodes or backfill jobs that put spiky loads on namenodes, more routers could potentially be added with a higher than usual handler count to deal with the surge in traffic for specific nameservices if needed.
+
+Overall the isolation feature is exposed via a configuration dfs.federation.router.handler.isolation.enable. The default value of this feature will be “false”. Users can also introduce their own fairness policy controller for custom allocation of handlers to various nameservices.
+
+See the Apache JIRA ticket [HDFS-14090](https://issues.apache.org/jira/browse/HDFS-14090) for more information on this feature.
+
+
 Deployment
 ----------
 
@@ -482,6 +494,17 @@ Kerberos and Delegation token supported in federation.
 | dfs.federation.router.kerberos.internal.spnego.principal | `${dfs.web.authentication.kerberos.principal}` | The server principal used by the Router for web UI SPNEGO authentication when Kerberos security is enabled. This is typically set to HTTP/_HOST@REALM.TLD The SPNEGO server principal begins with the prefix HTTP/ by convention. If the value is '*', the web server will attempt to login with every principal specified in the keytab file 'dfs.web.authentication.kerberos.keytab'. |
 | dfs.federation.router.secret.manager.class | `org.apache.hadoop.hdfs.server.federation.router.security.token.ZKDelegationTokenSecretManagerImpl` |  Class to implement state store to delegation tokens. Default implementation uses zookeeper as the backend to store delegation tokens. |
 
+### Isolation
+
+Isolation and dedicated assignment of RPC handlers across all configured downstream nameservices. The sum of these numbers must be strictly smaller than the total number of router handlers (configed by dfs.federation.router.handler.count).
+
+| Property | Default | Description|
+|:---- |:---- |:---- |
+| dfs.federation.router.fairness.enable | `false` | If `true`, dedicated RPC handlers will be assigned to each nameservice based on the fairness assignment policy configured. |
+| dfs.federation.router.fairness.policy.controller.class | `org.apache.hadoop.hdfs.server.federation.fairness.NoRouterRpcFairnessPolicyController` | Default handler allocation model to be used if isolation feature is enabled. Recommend to use `org.apache.hadoop.hdfs.server.federation.fairness.StaticRouterRpcFairnessPolicyController` to fully use the feature. |
+| dfs.federation.router.fairness.handler.count.*EXAMPLENAMESERVICE* | | Dedicated handler assigned to a specific nameservice. If none is specified equal allocation is done across all nameservices. |
+| dfs.federation.router.fairness.handler.count.concurrent | | Dedicated handler assigned to fan out calls such as `renewLease`. |
+
 Metrics
 -------