
HADOOP-15076. Enhance S3A troubleshooting documents and add a performance document.
Contributed by Steve Loughran.

Steve Loughran, 7 years ago
Parent
Commit
c761e658f6

+ 18 - 3
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/encryption.md

@@ -37,6 +37,8 @@ and keys with which the file was encrypted.
 * You can use AWS bucket policies to mandate encryption rules for a bucket.
 * You can use S3A per-bucket configuration to ensure that S3A clients use encryption
 policies consistent with the mandated rules.
+* You can use S3 Default Encryption to encrypt data without needing to
+set anything in the client.
 * Changing the encryption options on the client does not change how existing
 files were encrypted, except when the files are renamed.
 * For all mechanisms other than SSE-C, clients do not need any configuration
@@ -58,9 +60,10 @@ The server-side "SSE" encryption is performed with symmetric AES256 encryption;
 S3 offers different mechanisms for actually defining the key to use.


-There are thrre key management mechanisms, which in order of simplicity of use,
+There are four key management mechanisms, which in order of simplicity of use,
 are:

+* S3 Default Encryption
 * SSE-S3: an AES256 key is generated in S3, and saved alongside the data.
 * SSE-KMS: an AES256 key is generated in S3, and encrypted with a secret key provided
 by Amazon's Key Management Service, a key referenced by name in the uploading client.
@@ -68,6 +71,19 @@ by Amazon's Key Management Service, a key referenced by name in the uploading cl
 to encrypt and decrypt the data.


+## <a name="s3-default-encryption"></a> S3 Default Encryption
+
+This feature allows the administrators of the AWS account to set the "default"
+encryption policy on a bucket: the encryption to use if the client does
+not explicitly declare an encryption algorithm.
+
+[S3 Default Encryption for S3 Buckets](https://docs.aws.amazon.com/AmazonS3/latest/dev/bucket-encryption.html)
+
+This supports SSE-S3 and SSE-KMS.
+
+There is no need to set anything up in the client: do it in the AWS console.
+
+
 ## <a name="sse-s3"></a> SSE-S3 Amazon S3-Managed Encryption Keys

 In SSE-S3, all keys and secrets are managed inside S3. This is the simplest encryption mechanism.
@@ -413,7 +429,6 @@ How can you do that from Hadoop? With `rename()`.
 
 
 The S3A client mimics a real filesystem's' rename operation by copying all the
 source files to the destination paths, then deleting the old ones.
-If you do a rename()

 Note: this does not work for SSE-C, because you cannot set a different key
 for reading as for writing, and you must supply that key for reading. There
@@ -421,7 +436,7 @@ you need to copy one bucket to a different bucket, one with a different key.
 Use `distCp`for this, with per-bucket encryption policies.


-## <a name="Troubleshooting"></a> Troubleshooting Encryption
+## <a name="troubleshooting"></a> Troubleshooting Encryption

 The [troubleshooting](./troubleshooting_s3a.html) document covers
 stack traces which may surface when working with encrypted data.

+ 3 - 74
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md

@@ -25,6 +25,7 @@ Please use `s3a:` as the connector to data hosted in S3 with Apache Hadoop.**
 See also:

 * [Encryption](./encryption.html)
+* [Performance](./performance.html)
 * [S3Guard](./s3guard.html)
 * [Troubleshooting](./troubleshooting_s3a.html)
 * [Committing work to S3 with the "S3A Committers"](./committers.html)
@@ -1580,80 +1581,8 @@ The S3A Filesystem client supports the notion of input policies, similar
 to that of the Posix `fadvise()` API call. This tunes the behavior of the S3A
 client to optimise HTTP GET requests for the different use cases.
 
 
-*"sequential"*
-
-Read through the file, possibly with some short forward seeks.
-
-The whole document is requested in a single HTTP request; forward seeks
-within the readahead range are supported by skipping over the intermediate
-data.
-
-This is leads to maximum read throughput —but with very expensive
-backward seeks.
-
-
-*"normal" (default)*
-
-The "Normal" policy starts off reading a file  in "sequential" mode,
-but if the caller seeks backwards in the stream, it switches from
-sequential to "random".
-
-This policy effectively recognizes the initial read pattern of columnar
-storage formats (e.g. Apache ORC and Apache Parquet), which seek to the end
-of a file, read in index data and then seek backwards to selectively read
-columns. The first seeks may be be expensive compared to the random policy,
-however the overall process is much less expensive than either sequentially
-reading through a file with the "random" policy, or reading columnar data
-with the "sequential" policy. When the exact format/recommended
-seek policy of data are known in advance, this policy
-
-*"random"*
-
-Optimised for random IO, specifically the Hadoop `PositionedReadable`
-operations —though `seek(offset); read(byte_buffer)` also benefits.
-
-Rather than ask for the whole file, the range of the HTTP request is
-set to that that of the length of data desired in the `read` operation
-(Rounded up to the readahead value set in `setReadahead()` if necessary).
-
-By reducing the cost of closing existing HTTP requests, this is
-highly efficient for file IO accessing a binary file
-through a series of `PositionedReadable.read()` and `PositionedReadable.readFully()`
-calls. Sequential reading of a file is expensive, as now many HTTP requests must
-be made to read through the file.
-
-For operations simply reading through a file: copying, distCp, reading
-Gzipped or other compressed formats, parsing .csv files, etc, the `sequential`
-policy is appropriate. This is the default: S3A does not need to be configured.
-
-For the specific case of high-performance random access IO, the `random` policy
-may be considered. The requirements are:
-
-* Data is read using the `PositionedReadable` API.
-* Long distance (many MB) forward seeks
-* Backward seeks as likely as forward seeks.
-* Little or no use of single character `read()` calls or small `read(buffer)`
-calls.
-* Applications running close to the S3 data store. That is: in EC2 VMs in
-the same datacenter as the S3 instance.
-
-The desired fadvise policy must be set in the configuration option
-`fs.s3a.experimental.input.fadvise` when the filesystem instance is created.
-That is: it can only be set on a per-filesystem basis, not on a per-file-read
-basis.
-
-    <property>
-      <name>fs.s3a.experimental.input.fadvise</name>
-      <value>random</value>
-      <description>Policy for reading files.
-       Values: 'random', 'sequential' or 'normal'
-       </description>
-    </property>
-
-[HDFS-2744](https://issues.apache.org/jira/browse/HDFS-2744),
-*Extend FSDataInputStream to allow fadvise* proposes adding a public API
-to set fadvise policies on input streams. Once implemented,
-this will become the supported mechanism used for configuring the input IO policy.
+See [Improving data input performance through fadvise](./performance.html#fadvise)
+for the details.
 
 
 ##<a name="metrics"></a>Metrics
 
 

+ 518 - 0
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/performance.md

@@ -0,0 +1,518 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+# Maximizing Performance when working with the S3A Connector
+
+<!-- MACRO{toc|fromDepth=0|toDepth=3} -->
+
+
+## <a name="introduction"></a> Introduction
+
+S3 is slower to work with than HDFS, even on virtual clusters running on
+Amazon EC2.
+
+That's because it's a very different system, as you can see:
+
+
+| Feature | HDFS | S3 through the S3A connector |
+|---------|------|------------------------------|
+| communication | RPC | HTTP GET/PUT/HEAD/LIST/COPY requests |
+| data locality | local storage | remote S3 servers |
+| replication | multiple datanodes | asynchronous after upload |
+| consistency | consistent data and listings | eventually consistent for listings, deletes and updates |
+| bandwidth | best: local IO, worst: datacenter network | bandwidth between servers and S3 |
+| latency | low | high, especially for "low cost" directory operations |
+| rename | fast, atomic | slow, faked rename through COPY & DELETE |
+| delete | fast, atomic | fast for a file, slow & non-atomic for directories |
+| writing | incremental | in blocks; not visible until the writer is closed |
+| reading | seek() is fast | seek() is slow and expensive |
+| IOPs | limited only by hardware | callers are throttled to shards in an S3 bucket |
+| Security | Posix user+group; ACLs | AWS Roles and policies |
+
+From a performance perspective, key points to remember are:
+
+* S3 throttles bucket access across all callers: adding workers can make things worse.
+* EC2 VMs have network IO throttled based on the VM type.
+* Directory rename and copy operations take *much* longer as the number of objects and the amount of data increases.
+The slow performance of `rename()` surfaces during the commit phase of jobs,
+applications like `DistCP`, and elsewhere.
+* seek() calls when reading a file can force new HTTP requests.
+This can make reading columnar Parquet/ORC data expensive.
+
+Overall, although the S3A connector makes S3 look like a file system,
+it isn't, and some attempts to preserve the metaphor are "aggressively suboptimal".
+
+To make most efficient use of S3, care is needed.
+
+## <a name="s3guard"></a> Speeding up directory listing operations through S3Guard
+
+[S3Guard](s3guard.html) provides significant speedups for operations which
+list files a lot. This includes the setup of all queries against data:
+MapReduce, Hive and Spark, as well as DistCP.
+
+
+Experiment with using it to see what speedup it delivers.
+
+
+## <a name="fadvise"></a> Improving data input performance through fadvise
+
+The S3A Filesystem client supports the notion of input policies, similar
+to that of the Posix `fadvise()` API call. This tunes the behavior of the S3A
+client to optimise HTTP GET requests for the different use cases.
+
+### fadvise `sequential`
+
+Read through the file, possibly with some short forward seeks.
+
+The whole document is requested in a single HTTP request; forward seeks
+within the readahead range are supported by skipping over the intermediate
+data.
+
+This delivers maximum sequential throughput —but with very expensive
+backward seeks.
+
+Applications reading a file in bulk (DistCP, any copy operations) should use
+sequential access, as should those reading data from gzipped `.gz` files.
+Because the "normal" fadvise policy starts off in sequential IO mode,
+there is rarely any need to explicitly request this policy.
+
+### fadvise `random`
+
+Optimised for random IO, specifically the Hadoop `PositionedReadable`
+operations —though `seek(offset); read(byte_buffer)` also benefits.
+
+Rather than ask for the whole file, the range of the HTTP request is
+set to the length of data desired in the `read` operation
+(rounded up to the readahead value set in `setReadahead()` if necessary).
+
+By reducing the cost of closing existing HTTP requests, this is
+highly efficient for file IO accessing a binary file
+through a series of `PositionedReadable.read()` and `PositionedReadable.readFully()`
+calls. Sequential reading of a file is expensive, as now many HTTP requests must
+be made to read through the file: there's a delay between each GET operation.
+
+
+Random IO is best for IO with seek-heavy characteristics:
+
+* Data is read using the `PositionedReadable` API.
+* Long distance (many MB) forward seeks
+* Backward seeks as likely as forward seeks.
+* Little or no use of single character `read()` calls or small `read(buffer)`
+calls.
+* Applications running close to the S3 data store. That is: in EC2 VMs in
+the same datacenter as the S3 instance.
+
+The desired fadvise policy must be set in the configuration option
+`fs.s3a.experimental.input.fadvise` when the filesystem instance is created.
+That is: it can only be set on a per-filesystem basis, not on a per-file-read
+basis.
+
+```xml
+<property>
+  <name>fs.s3a.experimental.input.fadvise</name>
+  <value>random</value>
+  <description>
+  Policy for reading files.
+  Values: 'random', 'sequential' or 'normal'
+   </description>
+</property>
+```
+
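+The same option can also be set for a single bucket through the per-bucket
+configuration mechanism, or programmatically when the `Configuration` used to
+create the filesystem is assembled. A minimal sketch; `mybucket` and the class
+name are placeholders:
+
+```java
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+
+public class FadviseSetup {
+  public static FileSystem randomIoFilesystem() throws Exception {
+    Configuration conf = new Configuration();
+    // base policy for every S3A filesystem created from this configuration
+    conf.set("fs.s3a.experimental.input.fadvise", "random");
+    // per-bucket override: only applies to s3a://mybucket/
+    conf.set("fs.s3a.bucket.mybucket.experimental.input.fadvise", "sequential");
+    return new Path("s3a://mybucket/").getFileSystem(conf);
+  }
+}
+```
+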
+[HDFS-2744](https://issues.apache.org/jira/browse/HDFS-2744),
+*Extend FSDataInputStream to allow fadvise* proposes adding a public API
+to set fadvise policies on input streams. Once implemented,
+this will become the supported mechanism used for configuring the input IO policy.
+
+### fadvise `normal` (default)
+
+The `normal` policy starts off reading a file in `sequential` mode,
+but if the caller seeks backwards in the stream, it switches from
+sequential to `random`.
+
+This policy essentially recognizes the initial read pattern of columnar
+storage formats (e.g. Apache ORC and Apache Parquet), which seek to the end
+of a file, read in index data and then seek backwards to selectively read
+columns. The first seeks may be expensive compared to the random policy,
+however the overall process is much less expensive than either sequentially
+reading through a file with the `random` policy, or reading columnar data
+with the `sequential` policy.
+
+
+## <a name="commit"></a> Committing Work in MapReduce and Spark
+
+Hadoop MapReduce, Apache Hive and Apache Spark all write their work
+to HDFS and similar filesystems.
+When using S3 as a destination, this is slow because of the way `rename()`
+is mimicked with copy and delete.
+
+If committing output takes a long time, it is because you are using the standard
+`FileOutputCommitter`. If you are doing this on any S3 endpoint which lacks
+list consistency (Amazon S3 without [S3Guard](s3guard.html)), this committer
+is at risk of losing data!
+
+*Your problem may appear to be performance, but that is a symptom
+of the underlying problem: the way S3A fakes rename operations means that
+the rename cannot safely be used in output-commit algorithms.*
+
+Fix: Use one of the dedicated [S3A Committers](committers.html).
+
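+As a rough sketch of what the switch can look like when a job's configuration
+is built programmatically: the option names below are those used by the S3A
+committers, and the factory binding is only needed if it is not already the
+default in your Hadoop release. See the committer documentation for the
+authoritative setup.
+
+```java
+import org.apache.hadoop.conf.Configuration;
+
+public class CommitterSetup {
+  public static Configuration useDirectoryCommitter(Configuration conf) {
+    // select the "directory" staging committer for s3a:// destinations
+    conf.set("fs.s3a.committer.name", "directory");
+    // bind the s3a scheme to the S3A committer factory, if not already set
+    conf.set("mapreduce.outputcommitter.factory.scheme.s3a",
+        "org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory");
+    return conf;
+  }
+}
+```
+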
+## <a name="tuning"></a> Options to Tune
+
+### <a name="pooling"></a> Thread and connection pool sizes
+
+Each S3A client interacting with a single bucket, as a single user, has its
+own dedicated pool of open HTTP 1.1 connections alongside a pool of threads used
+for upload and copy operations.
+The default pool sizes are intended to strike a balance between performance
+and memory/thread use.
+
+You can have a larger pool of (reused) HTTP connections and threads
+for parallel IO (especially uploads) by setting the properties
+
+
+| property | meaning | default |
+|----------|---------|---------|
+| `fs.s3a.threads.max`| Threads in the AWS transfer manager| 10 |
+| `fs.s3a.connection.maximum`| Maximum number of HTTP connections | 10|
+
+We recommend using larger values for processes which perform
+a lot of IO: `DistCp`, Spark Workers and similar.
+
+```xml
+<property>
+  <name>fs.s3a.threads.max</name>
+  <value>20</value>
+</property>
+<property>
+  <name>fs.s3a.connection.maximum</name>
+  <value>20</value>
+</property>
+```
+
+Be aware, however, that processes which perform many parallel queries
+may consume large amounts of resources if each query is working with
+a different set of S3 buckets, or is acting on behalf of different users.
+
+### For large data uploads, tune the block size: `fs.s3a.block.size`
+
+When uploading data, it is uploaded in blocks set by the option
+`fs.s3a.block.size`; the default value is "32M" for 32 megabytes.
+
+If a larger value is used, then more data is buffered before the upload
+begins:
+
+```xml
+<property>
+  <name>fs.s3a.block.size</name>
+  <value>128M</value>
+</property>
+```
+
+This means that fewer PUT/POST requests are made of S3 to upload data,
+which reduces the likelihood that S3 will throttle the client(s).
+
+### Maybe: Buffer Write Data in Memory
+
+When large files are being uploaded, blocks are saved to disk and then
+queued for uploading, with multiple threads uploading different blocks
+in parallel.
+
+The blocks can be buffered in memory by setting the option
+`fs.s3a.fast.upload.buffer` to `bytebuffer`, or, for on-heap storage,
+`array`.
+
+Switching to in-memory IO reduces disk IO, and can be faster if the bandwidth
+to the S3 store is so high that the disk IO becomes the bottleneck.
+This can have a tangible benefit when working with on-premise S3-compatible
+object stores with very high bandwidth to servers.
+
+It is very easy to run out of memory when buffering this way; the option
+`fs.s3a.fast.upload.active.blocks` exists to tune how many active blocks
+a single output stream writing to S3 may have queued at a time.
+
+As the size of each buffered block is determined by the value of `fs.s3a.block.size`,
+the larger the block size, the more likely you will run out of memory.
+
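+A sketch of switching to in-memory buffering while capping the number of
+queued blocks per stream; the class name and the values are illustrative,
+not recommendations:
+
+```java
+import org.apache.hadoop.conf.Configuration;
+
+public class UploadBufferSetup {
+  public static Configuration bufferInMemory(Configuration conf) {
+    // "bytebuffer" = off-heap buffers; "array" = on-heap; "disk" = the default
+    conf.set("fs.s3a.fast.upload.buffer", "bytebuffer");
+    // cap the number of blocks a single output stream may queue for upload
+    conf.setInt("fs.s3a.fast.upload.active.blocks", 4);
+    return conf;
+  }
+}
+```
+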
+## <a name="distcp"></a> DistCP
+
+DistCP can be slow, especially if the parameters and options for the operation
+are not tuned for working with S3.
+
+To exacerbate the issue, DistCP invariably puts heavy load against the
+bucket being worked with, which will cause S3 to throttle requests.
+It will throttle: directory operations, uploads of new data, and delete operations,
+amongst other things.
+
+### DistCP: Options to Tune
+
+* `-numListstatusThreads <threads>` : set to something higher than the default (1).
+* `-bandwidth <mb>` : use to limit the upload bandwidth per worker
+* `-m <maps>` : limit the number of mappers, hence the load on the S3 bucket.
+
+Adding more maps with the `-m` option does not guarantee better performance;
+it may just increase the amount of throttling which takes place.
+A smaller number of maps with a higher bandwidth per map can be more efficient.
+
+### DistCP: Options to Avoid
+
+DistCp's `-atomic` option copies the data into a temporary directory, then renames
+it into place, which is where the copy takes place. This is a performance
+killer.
+
+* Do not use the `-atomic` option.
+* The `-append` operation is not supported on S3; avoid.
+* `-p` S3 does not have a POSIX-style permission model; this will fail.
+
+
+### DistCP: Parameters to Tune
+
+1. As discussed [earlier](#pooling), use large values for
+`fs.s3a.threads.max` and `fs.s3a.connection.maximum`.
+
+1. Make sure that the bucket is using `sequential` or `normal` fadvise seek policies,
+that is, `fs.s3a.experimental.input.fadvise` is not set to `random`.
+
+1. Perform listings in parallel by setting `-numListstatusThreads`
+to a higher number. Make sure that `fs.s3a.connection.maximum`
+is equal to or greater than the value used.
+
+1. If using `-delete`, set `fs.trash.interval` to 0 to stop the deleted
+objects from being copied to a trash directory.
+
+*DO NOT* switch `fs.s3a.fast.upload.buffer` to buffer in memory.
+If one distcp mapper runs out of memory it will fail,
+and that runs the risk of failing the entire job.
+It is safer to keep the default value, `disk`.
+
+What is potentially useful is uploading in bigger blocks; this is more
+efficient in terms of HTTP connection use, and reduces the IOP rate against
+the S3 bucket/shard.
+
+```xml
+<property>
+  <name>fs.s3a.threads.max</name>
+  <value>20</value>
+</property>
+
+<property>
+  <name>fs.s3a.connection.maximum</name>
+  <value>30</value>
+  <description>
+   Make greater than both fs.s3a.threads.max and -numListstatusThreads
+   </description>
+</property>
+
+<property>
+  <name>fs.s3a.experimental.input.fadvise</name>
+  <value>normal</value>
+</property>
+
+<property>
+  <name>fs.s3a.block.size</name>
+  <value>128M</value>
+</property>
+
+<property>
+  <name>fs.s3a.fast.upload.buffer</name>
+  <value>disk</value>
+</property>
+
+<property>
+  <name>fs.trash.interval</name>
+  <value>0</value>
+</property>
+```
+
+## <a name="rm"></a> hadoop shell commands `fs -rm`
+
+The `hadoop fs -rm` command can rename the file to a path under `.Trash` rather than
+deleting it. Use `-skipTrash` to eliminate that step.
+
+
+The trash retention interval is set in the property `fs.trash.interval`; while the default is 0,
+most HDFS deployments have it set to a non-zero value to reduce the risk of
+data loss.
+
+```xml
+<property>
+  <name>fs.trash.interval</name>
+  <value>0</value>
+</property>
+```
+
+
+## <a name="load_balancing"></a> Improving S3 load-balancing behavior
+
+Amazon S3 uses a set of front-end servers to provide access to the underlying data.
+The choice of which front-end server to use is handled via load-balancing DNS
+service: when the IP address of an S3 bucket is looked up, the choice of which
+IP address to return to the client is made based on the current load
+of the front-end servers.
+
+Over time, the load across the front-end changes, so those servers considered
+"lightly loaded" will change. If the DNS value is cached for any length of time,
+your application may end up talking to an overloaded server, or, in the case
+of failures, trying to talk to a server that is no longer there.
+
+And by default, for historical security reasons in the era of applets,
+the DNS TTL of a JVM is "infinity".
+
+To work with AWS better, set the DNS time-to-live of an application which
+works with S3 to something lower.
+See [AWS documentation](http://docs.aws.amazon.com/AWSSdkDocsJava/latest/DeveloperGuide/java-dg-jvm-ttl.html).
+
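+A sketch of doing this programmatically, before any S3 connection is made;
+60 seconds is an example value, not a recommendation:
+
+```java
+import java.security.Security;
+
+public class DnsTtlSetup {
+  public static void main(String[] args) {
+    // must run before the first DNS lookup; the property can also be set
+    // in the JVM's java.security file
+    Security.setProperty("networkaddress.cache.ttl", "60");
+  }
+}
+```
+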
+## <a name="network_performance"></a> Troubleshooting network performance
+
+An example of this is covered in [HADOOP-13871](https://issues.apache.org/jira/browse/HADOOP-13871).
+
+1. For public data, use `curl`:
+
+        curl -O https://landsat-pds.s3.amazonaws.com/scene_list.gz
+1. Use `nettop` to monitor a process's connections.
+
+
+## <a name="throttling"></a> Throttling
+
+When many requests are made of a specific S3 bucket (or shard inside it),
+S3 will respond with a 503 "throttled" response.
+Throttling can be recovered from, provided overall load decreases.
+Furthermore, because the rejection is sent before any changes are made to the object store,
+the request is inherently idempotent. For this reason, the client will always attempt to
+retry throttled requests.
+
+The limit of the number of times a throttled request can be retried,
+and the exponential interval increase between attempts, can be configured
+independently of the other retry limits.
+
+```xml
+<property>
+  <name>fs.s3a.retry.throttle.limit</name>
+  <value>20</value>
+  <description>
+    Number of times to retry any throttled request.
+  </description>
+</property>
+
+<property>
+  <name>fs.s3a.retry.throttle.interval</name>
+  <value>500ms</value>
+  <description>
+    Interval between retry attempts on throttled requests.
+  </description>
+</property>
+```
+
+If a client is failing due to `AWSServiceThrottledException` failures,
+increasing the interval and limit *may* address this. However, it
+is a sign of AWS services being overloaded by the sheer number of clients
+and rate of requests. Spreading data across different buckets, and/or using
+a more balanced directory structure may be beneficial.
+Consult [the AWS documentation](http://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html).
+
+Reading or writing data encrypted with SSE-KMS forces S3 to make calls to
+the AWS Key Management Service (KMS), which comes with its own
+[Request Rate Limits](http://docs.aws.amazon.com/kms/latest/developerguide/limits.html).
+These default to 1200/second for an account, across all keys and all uses of
+them, which, for S3, means across all buckets with data encrypted with SSE-KMS.
+
+### <a name="minimizing_throttling"></a> Tips to Keep Throttling down
+
+If you are seeing a lot of throttling responses on a large scale
+operation like a `distcp` copy, *reduce* the number of processes trying
+to work with the bucket (for distcp: reduce the number of mappers with the
+`-m` option).
+
+If you are reading or writing lists of files and you can randomize
+the list so that they are not processed in a simple sorted order, you may
+reduce load on a specific shard of S3 data, and so potentially increase throughput.
+
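+A minimal sketch of randomizing a list of paths before they are queued for
+processing; the class name is a placeholder:
+
+```java
+import java.util.Collections;
+import java.util.List;
+import org.apache.hadoop.fs.Path;
+
+public class RandomizeInputs {
+  public static void randomize(List<Path> paths) {
+    // in-place shuffle: avoids hitting one shard in lexicographic order
+    Collections.shuffle(paths);
+  }
+}
+```
+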
+An S3 Bucket is throttled by requests coming from all
+simultaneous clients. Different applications and jobs may interfere with
+each other: consider that when troubleshooting.
+Partitioning data into different buckets may help isolate load here.
+
+If you are using data encrypted with SSE-KMS, then the KMS rate limits
+will also apply: these are stricter than the S3 numbers.
+If you believe that you are reaching these limits, you may be able to
+get them increased.
+Consult [the KMS Rate Limit documentation](http://docs.aws.amazon.com/kms/latest/developerguide/limits.html).
+
+### <a name="s3guard_throttling"></a> S3Guard and Throttling
+
+
+S3Guard uses DynamoDB for directory and file lookups;
+it is rate limited to the amount of (guaranteed) IO purchased for a
+table.
+
+The `hadoop s3guard bucket-info s3a://bucket` command will print out the
+capacity currently allocated to a bucket's table.
+
+
+If a significant rate of throttling events is observed here, the pre-allocated
+IOPs can be increased with the `hadoop s3guard set-capacity` command, or
+through the AWS Console. Throttling events in S3Guard are noted in logs, and
+also in the S3A metrics `s3guard_metadatastore_throttle_rate` and
+`s3guard_metadatastore_throttled`.
+
+If you are using DistCP for a large backup to/from an S3Guarded bucket, it is
+actually possible to increase the capacity for the duration of the operation.
+
+
+## <a name="coding"></a> Best Practices for Code
+
+Here are some best practices if you are writing applications to work with
+S3 or any other object store through the Hadoop APIs.
+
+Use `listFiles(path, recursive)` over `listStatus(path)`.
+The recursive `listFiles()` call can enumerate all files under a path
+in a single LIST call, irrespective of how deep the path is.
+In contrast, any directory tree-walk implemented in the client is issuing
+multiple HTTP requests to scan each directory, all the way down.
+
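+A sketch of a recursive enumeration which lets the store do the deep listing,
+rather than walking the tree in the client:
+
+```java
+import java.io.IOException;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.LocatedFileStatus;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.RemoteIterator;
+
+public class ListAllFiles {
+  public static long totalBytes(FileSystem fs, Path dir) throws IOException {
+    long bytes = 0;
+    // one deep listing of every file under dir, however deep the tree is
+    RemoteIterator<LocatedFileStatus> files = fs.listFiles(dir, true);
+    while (files.hasNext()) {
+      bytes += files.next().getLen();
+    }
+    return bytes;
+  }
+}
+```
+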
+Cache the outcome of `getFileStatus()`, rather than repeatedly asking for it.
+That includes using `isFile()`, `isDirectory()`, which are simply wrappers
+around `getFileStatus()`.
+
+Don't look for a file with a `getFileStatus()` or listing call immediately
+after creating it, and don't try to read it immediately either.
+This is where eventual consistency problems surface: the data may not yet be visible.
+
+Rely on `FileNotFoundException` being raised if the source of an operation is
+missing, rather than implementing your own probe for the file before
+conditionally calling the operation.
+
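+A sketch of relying on the exception rather than a prior existence probe,
+which would add an extra HTTP request and still be racy:
+
+```java
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+
+public class OpenIfPresent {
+  public static FSDataInputStream openOrNull(FileSystem fs, Path path)
+      throws IOException {
+    try {
+      return fs.open(path);
+    } catch (FileNotFoundException e) {
+      // a missing file is part of normal control flow here
+      return null;
+    }
+  }
+}
+```
+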
+### `rename()`
+
+Avoid any algorithm which uploads data into a temporary file and then uses
+`rename()` to commit it into place with a final path.
+On HDFS this offers a fast commit operation.
+With S3, Wasb and other object stores, you can write straight to the destination,
+knowing that the file isn't visible until you close the write: the write itself
+is atomic.
+
+The `rename()` operation may return `false` if the source is missing; this
+is a weakness in the API. Consider a check before calling rename, and if/when
+a new rename() call is made public, switch to it.
+
+
+### `delete(path, recursive)`
+
+Keep in mind that `delete(path, recursive)` is a no-op if the path does not exist, so
+there's no need to have a check for the path existing before you call it.
+
+`delete()` is often used as a cleanup operation.
+With an object store this is slow, and may cause problems if the caller
+expects an immediate response. For example, a thread may block so long
+that other liveness checks start to fail.
+Consider spawning off an executor thread to do these background cleanup operations.
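+A sketch of handing the cleanup off to an executor; error handling and
+executor shutdown are left to the application:
+
+```java
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+
+public class AsyncCleanup {
+  private static final ExecutorService CLEANUP =
+      Executors.newSingleThreadExecutor();
+
+  public static Future<Boolean> deleteInBackground(FileSystem fs, Path dir) {
+    // recursive delete runs in the background; check the Future for errors
+    return CLEANUP.submit(() -> fs.delete(dir, true));
+  }
+}
+```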

+ 490 - 263
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/troubleshooting_s3a.md

@@ -14,9 +14,9 @@
 
 
 # Troubleshooting

-<!-- MACRO{toc|fromDepth=0|toDepth=5} -->
+<!-- MACRO{toc|fromDepth=0|toDepth=3} -->

-##<a name="introduction"></a> Introduction
+## <a name="introduction"></a> Introduction

 Common problems working with S3 are
 
 
@@ -24,28 +24,42 @@ Common problems working with S3 are
 1. Authentication
 1. S3 Inconsistency side-effects
 
 
-Classpath is usually the first problem. For the S3x filesystem clients,
-you need the Hadoop-specific filesystem clients, third party S3 client libraries
-compatible with the Hadoop code, and any dependent libraries compatible with
+
+Troubleshooting IAM Assumed Roles is covered in its
+[specific documentation](assumed_roles.html#troubleshooting).
+
+## <a name="classpath"></a> Classpath Setup
+
+Classpath is usually the first problem. For the S3A filesystem client,
+you need the Hadoop-specific filesystem clients, the very same AWS SDK library
+which Hadoop was built against, and any dependent libraries compatible with
 Hadoop and the specific JVM.

 The classpath must be set up for the process talking to S3: if this is code
 running in the Hadoop cluster, the JARs must be on that classpath. That
 includes `distcp` and the `hadoop fs` command.
 
 
-<!-- MACRO{toc|fromDepth=0|toDepth=2} -->
+<b>Critical:</b> *Do not attempt to "drop in" a newer version of the AWS
+SDK than that which the Hadoop version was built with.*
+Whatever problem you have, changing the AWS SDK version will not fix things,
+only change the stack traces you see.
 
 
-Troubleshooting IAM Assumed Roles is covered in its
-[specific documentation](assumed_roles.html#troubeshooting).
+Similarly, don't try to mix a `hadoop-aws` JAR from one Hadoop release
+with that of any other. The JAR must be in sync with `hadoop-common` and
+some other Hadoop JARs.
 
 
-## <a name="classpath"></a> Classpath Setup
+<i>Randomly changing hadoop- and aws- JARs in the hope of making a problem
+"go away" or to gain access to a feature you want,
+will not lead to the outcome you desire.</i>
+
+Tip: you can use [mvnrepository](http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws)
+to determine the dependency version requirements of a specific `hadoop-aws`
+JAR published by the ASF.
 
 
-Note that for security reasons, the S3A client does not provide much detail
-on the authentication process (i.e. the secrets used to authenticate).
 
 
 ### `ClassNotFoundException: org.apache.hadoop.fs.s3a.S3AFileSystem`
 
 
-These is Hadoop filesytem client classes, found in the `hadoop-aws` JAR.
+These are Hadoop filesystem client classes, found in the `hadoop-aws` JAR.
 An exception reporting this class as missing means that this JAR is not on
 the classpath.
 
 
@@ -56,7 +70,7 @@ the classpath.
 This means that the `aws-java-sdk-bundle.jar` JAR is not on the classpath:
 add it.
 
 
-### Missing method in `com.amazonaws` class
+### `java.lang.NoSuchMethodError` referencing a `com.amazonaws` class
 
 
 This can be triggered by incompatibilities between the AWS SDK on the classpath
 and the version which Hadoop was compiled with.
@@ -68,6 +82,15 @@ version.
 The sole fix is to use the same version of the AWS SDK with which Hadoop
 was built.
 
 
+This can also be caused by having more than one version of an AWS SDK
+JAR on the classpath. If the full `aws-java-sdk-bundle` JAR is on the
+classpath, do not add any of the `aws-sdk-` JARs.
+
+
+### `java.lang.NoSuchMethodError` referencing an `org.apache.hadoop` class
+
+This happens if the `hadoop-aws` and `hadoop-common` JARs are out of sync.
+You can't mix them around: they have to have exactly matching version numbers.
 
 
 ## <a name="authentication"></a> Authentication Failure
 
 
@@ -115,7 +138,7 @@ mechanism.
 1. If using session authentication, the session may have expired.
 Generate a new session token and secret.
 
 
-1. If using environement variable-based authentication, make sure that the
+1. If using environment variable-based authentication, make sure that the
 relevant variables are set in the environment in which the process is running.

 The standard first step is: try to use the AWS command line tools with the same
@@ -126,7 +149,6 @@ credentials, through a command such as:
 Note the trailing "/" here; without that the shell thinks you are trying to list
 your home directory under the bucket, which will only exist if explicitly created.
 
 
-
 Attempting to list a bucket using inline credentials is a
 means of verifying that the key and secret can access a bucket;
 
 
@@ -186,7 +208,9 @@ Requests using the V2 API will be rejected with 400 `Bad Request`
 $ bin/hadoop fs -ls s3a://frankfurt/
 WARN s3a.S3AFileSystem: Client: Amazon S3 error 400: 400 Bad Request; Bad Request (retryable)
 
 
-com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: 923C5D9E75E44C06), S3 Extended Request ID: HDwje6k+ANEeDsM6aJ8+D5gUmNAMguOk2BvZ8PH3g9z0gpH+IuwT7N19oQOnIr5CIx7Vqb/uThE=
+com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3;
+ Status Code: 400; Error Code: 400 Bad Request; Request ID: 923C5D9E75E44C06),
+  S3 Extended Request ID: HDwje6k+ANEeDsM6aJ8+D5gUmNAMguOk2BvZ8PH3g9z0gpH+IuwT7N19oQOnIr5CIx7Vqb/uThE=
     at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1182)
     at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:770)
     at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
@@ -231,13 +255,129 @@ As an example, the endpoint for S3 Frankfurt is `s3.eu-central-1.amazonaws.com`:
 </property>
 ```
 
 
+## <a name="access_denied"></a> `AccessDeniedException` "Access Denied"
+
+### <a name="access_denied_unknown-ID"></a> AccessDeniedException "The AWS Access Key Id you provided does not exist in our records."
+
+The value of `fs.s3a.access.key` does not match a known access key ID.
+It may be mistyped, or the access key may have been deleted by one of the account managers.
+
+```
+java.nio.file.AccessDeniedException: bucket: doesBucketExist on bucket:
+    com.amazonaws.services.s3.model.AmazonS3Exception:
+    The AWS Access Key Id you provided does not exist in our records.
+     (Service: Amazon S3; Status Code: 403; Error Code: InvalidAccessKeyId;
+  at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:214)
+  at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:111)
+  at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:260)
+  at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:314)
+  at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:256)
+  at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:231)
+  at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:366)
+  at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:302)
+  at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3354)
+  at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
+  at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3403)
+  at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3371)
+  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:477)
+  at org.apache.hadoop.fs.contract.AbstractBondedFSContract.init(AbstractBondedFSContract.java:72)
+  at org.apache.hadoop.fs.contract.AbstractFSContractTestBase.setup(AbstractFSContractTestBase.java:177)
+  at org.apache.hadoop.fs.s3a.commit.AbstractCommitITest.setup(AbstractCommitITest.java:163)
+  at org.apache.hadoop.fs.s3a.commit.AbstractITCommitMRJob.setup(AbstractITCommitMRJob.java:129)
+  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+  at java.lang.reflect.Method.invoke(Method.java:498)
+  at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
+  at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
+  at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
+  at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
+  at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
+  at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
+  at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
+  at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
+Caused by: com.amazonaws.services.s3.model.AmazonS3Exception:
+               The AWS Access Key Id you provided does not exist in our records.
+                (Service: Amazon S3; Status Code: 403; Error Code: InvalidAccessKeyId;
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1638)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1303)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1055)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
+  at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
+  at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4229)
+  at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4176)
+  at com.amazonaws.services.s3.AmazonS3Client.getAcl(AmazonS3Client.java:3381)
+  at com.amazonaws.services.s3.AmazonS3Client.getBucketAcl(AmazonS3Client.java:1160)
+  at com.amazonaws.services.s3.AmazonS3Client.getBucketAcl(AmazonS3Client.java:1150)
+  at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:1266)
+  at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$verifyBucketExists$1(S3AFileSystem.java:367)
+  at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)
+  ... 27 more
+
+```
+
+###  <a name="access_denied_disabled"></a> `AccessDeniedException` All access to this object has been disabled
 
 
-### "403 Access denied" when trying to write data
+Caller has no permission to access the bucket at all.
+
+```
+doesBucketExist on fdsd: java.nio.file.AccessDeniedException: fdsd: doesBucketExist on fdsd:
+ com.amazonaws.services.s3.model.AmazonS3Exception: All access to this object has been disabled
+ (Service: Amazon S3; Status Code: 403; Error Code: AllAccessDisabled; Request ID: E6229D7F8134E64F;
+  S3 Extended Request ID: 6SzVz2t4qa8J2Wxo/oc8yBuB13Mgrn9uMKnxVY0hsBd2kU/YdHzW1IaujpJdDXRDCQRX3f1RYn0=),
+  S3 Extended Request ID: 6SzVz2t4qa8J2Wxo/oc8yBuB13Mgrn9uMKnxVY0hsBd2kU/YdHzW1IaujpJdDXRDCQRX3f1RYn0=:AllAccessDisabled
+ All access to this object has been disabled (Service: Amazon S3; Status Code: 403;
+  at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:205)
+  at org.apache.hadoop.fs.s3a.S3ALambda.once(S3ALambda.java:122)
+  at org.apache.hadoop.fs.s3a.S3ALambda.lambda$retry$2(S3ALambda.java:233)
+  at org.apache.hadoop.fs.s3a.S3ALambda.retryUntranslated(S3ALambda.java:288)
+  at org.apache.hadoop.fs.s3a.S3ALambda.retry(S3ALambda.java:228)
+  at org.apache.hadoop.fs.s3a.S3ALambda.retry(S3ALambda.java:203)
+  at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:357)
+  at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:293)
+  at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3288)
+  at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:123)
+  at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3337)
+  at org.apache.hadoop.fs.FileSystem$Cache.getUnique(FileSystem.java:3311)
+  at org.apache.hadoop.fs.FileSystem.newInstance(FileSystem.java:529)
+  at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:997)
+  at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:309)
+  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
+  at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:1218)
+  at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.main(S3GuardTool.java:1227)
+Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: All access to this object has been disabled
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1638)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1303)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1055)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
+  at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
+  at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4229)
+  at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4176)
+  at com.amazonaws.services.s3.AmazonS3Client.getAcl(AmazonS3Client.java:3381)
+  at com.amazonaws.services.s3.AmazonS3Client.getBucketAcl(AmazonS3Client.java:1160)
+  at com.amazonaws.services.s3.AmazonS3Client.getBucketAcl(AmazonS3Client.java:1150)
+  at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:1266)
+  at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$verifyBucketExists$1(S3AFileSystem.java:360)
+  at org.apache.hadoop.fs.s3a.S3ALambda.once(S3ALambda.java:120)
+```
+
+Check the name of the bucket is correct, and validate permissions for the active user/role.
+
+### <a name="access_denied_writing"></a> `AccessDeniedException` "Access denied" when trying to manipulate data
 
 
 Data can be read, but attempts to write data or manipulate the store fail with
 403/Access denied.

 The bucket may have an access policy which the request does not comply with.
+Alternatively, the caller may not have the right to access the data.
 
 
 ```
 java.nio.file.AccessDeniedException: test/: PUT 0-byte object  on test/:
@@ -257,14 +397,31 @@ java.nio.file.AccessDeniedException: test/: PUT 0-byte object  on test/:
 ```

 In the AWS S3 management console, select the "permissions" tab for the bucket, then "bucket policy".
-If there is no bucket policy, then the error cannot be caused by one.
 
 
 If there is a bucket access policy, e.g. required encryption headers,
 then the settings of the s3a client must guarantee the relevant headers are set
 (e.g. the encryption options match).
 Note: S3 Default Encryption options are not considered here:
 if the bucket policy requires AES256 as the encryption policy on PUT requests,
-then the encryption option must be set in the s3a client so that the header is set.
+then the encryption option must be set in the hadoop client so that the header is set.
+
+
+Otherwise, the problem will likely be that the user does not have full access to the
+operation. Check what they were trying to do (read vs write) and then look
+at the permissions of the user/role.
+
+If the client is using [assumed roles](assumed_roles.html), and a policy
+is set in `fs.s3a.assumed.role.policy`, then that policy declares
+_all_ the rights which the caller has.
+
+
+### <a name="kms_access_denied"></a>  `AccessDeniedException` when using SSE-KMS
+
+When trying to write or read SSE-KMS-encrypted data, the client gets a
+`java.nio.file.AccessDeniedException` with the error 403/Forbidden.
+
+The caller does not have the permissions to access
+the key with which the data was encrypted.
 
 
 ## <a name="connectivity"></a> Connectivity Problems
 
 
@@ -283,14 +440,14 @@ org.apache.hadoop.fs.s3a.AWSS3IOException: Received permanent redirect response
   addressed using the specified endpoint. Please send all future requests to
   this endpoint. (Service: Amazon S3; Status Code: 301;
   Error Code: PermanentRedirect; Request ID: 7D39EC1021C61B11)
-        at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:132)
-        at org.apache.hadoop.fs.s3a.S3AFileSystem.initMultipartUploads(S3AFileSystem.java:287)
-        at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:203)
-        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2895)
-        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:102)
-        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2932)
-        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2914)
-        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:390)
+      at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:132)
+      at org.apache.hadoop.fs.s3a.S3AFileSystem.initMultipartUploads(S3AFileSystem.java:287)
+      at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:203)
+      at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2895)
+      at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:102)
+      at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2932)
+      at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2914)
+      at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:390)
 ```

 1. Use the [Specific endpoint of the bucket's S3 service](http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region)
@@ -308,12 +465,15 @@ Using the explicit endpoint for the region is recommended for speed and
 to use the V4 signing API.
 
 
 
 
-### <a name="timeout"></a> "Timeout waiting for connection from pool" when writing data
+### <a name="timeout_from_pool"></a> "Timeout waiting for connection from pool" when writing data
 
 
 This happens when using the output stream thread pool runs out of capacity.

 ```
-[s3a-transfer-shared-pool1-t20] INFO  http.AmazonHttpClient (AmazonHttpClient.java:executeHelper(496)) - Unable to execute HTTP request: Timeout waiting for connection from poolorg.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
+[s3a-transfer-shared-pool1-t20] INFO  http.AmazonHttpClient (AmazonHttpClient.java:executeHelper(496))
+ - Unable to execute HTTP request:
+  Timeout waiting for connection from poolorg.apache.http.conn.ConnectionPoolTimeoutException:
+   Timeout waiting for connection from pool
   at org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(PoolingClientConnectionManager.java:230)
   at org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(PoolingClientConnectionManager.java:199)
   at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
@@ -364,6 +524,46 @@ the maximum number of allocated HTTP connections.
 Set `fs.s3a.connection.maximum` to a larger value (and at least as large as
 `fs.s3a.threads.max`)
 
 
+
+### `NoHttpResponseException`
+
+The HTTP Server did not respond.
+
+```
+2017-02-07 10:01:07,950 INFO [s3a-transfer-shared-pool1-t7] com.amazonaws.http.AmazonHttpClient:
+  Unable to execute HTTP request: bucket.s3.amazonaws.com:443 failed to respond
+org.apache.http.NoHttpResponseException: bucket.s3.amazonaws.com:443 failed to respond
+  at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143)
+  at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
+  at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
+  at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283)
+  at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:259)
+  at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:209)
+  at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
+  at com.amazonaws.http.protocol.SdkHttpRequestExecutor.doReceiveResponse(SdkHttpRequestExecutor.java:66)
+  at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
+  at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:686)
+  at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:488)
+  at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:884)
+  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
+  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
+  at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:728)
+  at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
+  at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:310)
+  at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3785)
+  at com.amazonaws.services.s3.AmazonS3Client.copyPart(AmazonS3Client.java:1731)
+  at com.amazonaws.services.s3.transfer.internal.CopyPartCallable.call(CopyPartCallable.java:41)
+  at com.amazonaws.services.s3.transfer.internal.CopyPartCallable.call(CopyPartCallable.java:28)
+  at org.apache.hadoop.fs.s3a.SemaphoredDelegatingExecutor$CallableWithPermitRelease.call(SemaphoredDelegatingExecutor.java:222)
+  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
+  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
+  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
+  at java.lang.Thread.run(Thread.java:745)
+```
+
+This is probably a network problem, unless it really is an outage of S3.
+
+
 ### Out of heap memory when writing with via Fast Upload

 This can happen when using the upload buffering mechanism
@@ -418,7 +618,8 @@ for up to date advice.
 org.apache.hadoop.fs.s3a.AWSClientIOException: getFileStatus on test/testname/streaming/:
   com.amazonaws.AmazonClientException: Failed to sanitize XML document
   destined for handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler:
-  Failed to sanitize XML document destined for handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler
+  Failed to sanitize XML document destined for handler class
+   com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler
     at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:105)
     at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:105)
     at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1462)
     at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1462)
     at org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:1227)
     at org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:1227)
@@ -444,19 +645,136 @@ Again, we believe this is caused by the connection to S3 being broken.
 It may go away if the operation is retried.
 It may go away if the operation is retried.
 
 
 
 
+## <a name="other"></a> Other Errors
+
+### <a name="integrity"></a> `SdkClientException` Unable to verify integrity of data upload
 
 
-## Miscellaneous Errors
+Something has happened to the data as it was uploaded.
+
+```
+Caused by: org.apache.hadoop.fs.s3a.AWSClientIOException: saving output on dest/_task_tmp.-ext-10000/_tmp.000000_0:
+    com.amazonaws.AmazonClientException: Unable to verify integrity of data upload.
+    Client calculated content hash (contentMD5: L75PalQk0CIhTp04MStVOA== in base 64)
+    didn't match hash (etag: 37ace01f2c383d6b9b3490933c83bb0f in hex) calculated by Amazon S3.
+    You may need to delete the data stored in Amazon S3.
+    (metadata.contentMD5: L75PalQk0CIhTp04MStVOA==, md5DigestStream: null,
+    bucketName: ext2, key: dest/_task_tmp.-ext-10000/_tmp.000000_0):
+  at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:144)
+  at org.apache.hadoop.fs.s3a.S3AOutputStream.close(S3AOutputStream.java:121)
+  at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
+  at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
+  at org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat$1.close(HiveIgnoreKeyTextOutputFormat.java:99)
+  at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:190)
+  ... 22 more
+Caused by: com.amazonaws.AmazonClientException: Unable to verify integrity of data upload.
+  Client calculated content hash (contentMD5: L75PalQk0CIhTp04MStVOA== in base 64)
+  didn't match hash (etag: 37ace01f2c383d6b9b3490933c83bb0f in hex) calculated by Amazon S3.
+  You may need to delete the data stored in Amazon S3.
+  (metadata.contentMD5: L75PalQk0CIhTp04MStVOA==, md5DigestStream: null,
+  bucketName: ext2, key: dest/_task_tmp.-ext-10000/_tmp.000000_0)
+  at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1492)
+  at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInOneChunk(UploadCallable.java:131)
+  at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:123)
+  at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:139)
+  at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:47)
+  ... 4 more
+```
+
+As it uploads data to S3, the AWS SDK builds up an MD5 checksum of what was
+PUT/POSTed. When S3 returns the checksum of the uploaded data, that is compared
+with the local checksum. If there is a mismatch, this error is reported.
+
+The uploaded data is already on S3 and will stay there, though if this happens
+during a multipart upload, it may not be visible (but still billed: clean up your
+multipart uploads via the `hadoop s3guard uploads` command).
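+
+As an alternative to manual cleanup, the S3A client can be configured to purge
+old incomplete multipart uploads whenever a filesystem instance is created.
+A minimal sketch; the age value below is illustrative:
+
+```xml
+<property>
+  <name>fs.s3a.multipart.purge</name>
+  <value>true</value>
+  <description>Purge incomplete multipart uploads when the filesystem is initialized.</description>
+</property>
+
+<property>
+  <name>fs.s3a.multipart.purge.age</name>
+  <!-- illustrative value: minimum age in seconds before an upload is purged -->
+  <value>86400</value>
+</property>
+```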
+
+Possible causes for this:
+
+1. A (possibly transient) network problem, including hardware faults.
+1. A proxy server is modifying or corrupting the data in transit.
+1. Some signing problem, especially with third-party S3-compatible object stores.
+
+This is a very, very rare occurrence.
+
+If the problem is a signing one, try changing the signature algorithm.
+
+```xml
+<property>
+  <name>fs.s3a.signing-algorithm</name>
+  <value>S3SignerType</value>
+</property>
+```
+
+We cannot make any promises that it will work,
+only that it has been known to make the problem go away "once".
+
+### `AWSS3IOException` The Content-MD5 you specified did not match what we received
+
+Reads work, but writes, even `mkdir`, fail:
+
+```
+org.apache.hadoop.fs.s3a.AWSS3IOException: copyFromLocalFile(file:/tmp/hello.txt, s3a://bucket/hello.txt)
+    on file:/tmp/hello.txt:
+    The Content-MD5 you specified did not match what we received.
+    (Service: Amazon S3; Status Code: 400; Error Code: BadDigest; Request ID: 4018131225),
+    S3 Extended Request ID: null
+  at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:127)
+	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:69)
+	at org.apache.hadoop.fs.s3a.S3AFileSystem.copyFromLocalFile(S3AFileSystem.java:1494)
+	at org.apache.hadoop.tools.cloudup.Cloudup.uploadOneFile(Cloudup.java:466)
+	at org.apache.hadoop.tools.cloudup.Cloudup.access$000(Cloudup.java:63)
+	at org.apache.hadoop.tools.cloudup.Cloudup$1.call(Cloudup.java:353)
+	at org.apache.hadoop.tools.cloudup.Cloudup$1.call(Cloudup.java:350)
+	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
+	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
+	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
+	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
+	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
+	at java.lang.Thread.run(Thread.java:748)
+Caused by: com.amazonaws.services.s3.model.AmazonS3Exception:
+    The Content-MD5 you specified did not match what we received.
+    (Service: Amazon S3; Status Code: 400; Error Code: BadDigest; Request ID: 4018131225),
+    S3 Extended Request ID: null
+  at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1307)
+	at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:894)
+	at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:597)
+	at com.amazonaws.http.AmazonHttpClient.doExecute(AmazonHttpClient.java:363)
+	at com.amazonaws.http.AmazonHttpClient.executeWithTimer(AmazonHttpClient.java:329)
+	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:308)
+	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3659)
+	at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1422)
+	at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInOneChunk(UploadCallable.java:131)
+	at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:123)
+	at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:139)
+	at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:47)
+	at org.apache.hadoop.fs.s3a.BlockingThreadPoolExecutorService$CallableWithPermitRelease.call(BlockingThreadPoolExecutorService.java:239)
+	... 4 more
+```
+
+This stack trace was seen when interacting with a third-party S3 store whose
+expectations of the headers used by the AWS V4 signing mechanism were not
+compatible with those of the specific AWS SDK version Hadoop was using.
+
+Workaround: revert to V2 signing.
+
+```xml
+<property>
+  <name>fs.s3a.signing-algorithm</name>
+  <value>S3SignerType</value>
+</property>
+```
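+
+If only a single third-party store needs this, it may be cleaner to set the
+option for that bucket alone through S3A's per-bucket configuration.
+A sketch, assuming a bucket named `thirdparty`:
+
+```xml
+<property>
+  <!-- "thirdparty" is a placeholder bucket name -->
+  <name>fs.s3a.bucket.thirdparty.signing-algorithm</name>
+  <value>S3SignerType</value>
+</property>
+```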
 
 
 ### When writing data: "java.io.FileNotFoundException: Completing multi-part upload"
 ### When writing data: "java.io.FileNotFoundException: Completing multi-part upload"
 
 
 
 
 A multipart upload was trying to complete, but failed as there was no upload
 A multipart upload was trying to complete, but failed as there was no upload
 with that ID.
 with that ID.
+
 ```
 ```
 java.io.FileNotFoundException: Completing multi-part upload on fork-5/test/multipart/1c397ca6-9dfb-4ac1-9cf7-db666673246b:
 java.io.FileNotFoundException: Completing multi-part upload on fork-5/test/multipart/1c397ca6-9dfb-4ac1-9cf7-db666673246b:
  com.amazonaws.services.s3.model.AmazonS3Exception: The specified upload does not exist.
  com.amazonaws.services.s3.model.AmazonS3Exception: The specified upload does not exist.
-  The upload ID may be invalid, or the upload may have been aborted or completed. (Service: Amazon S3; Status Code: 404;
-   Error Code: NoSuchUpload;
+  The upload ID may be invalid, or the upload may have been aborted or completed.
+   (Service: Amazon S3; Status Code: 404; Error Code: NoSuchUpload;
   at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1182)
   at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1182)
   at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:770)
   at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:770)
   at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
   at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
@@ -482,14 +800,11 @@ for all open writes to complete the write,
 ### Application hangs after reading a number of files
 ### Application hangs after reading a number of files
 
 
 
 
-
-
-The pool of https client connectons and/or IO threads have been used up,
+The pools of https client connections and/or IO threads have been used up,
 and none are being freed.
 and none are being freed.
 
 
 
 
-1. The pools aren't big enough. Increas `fs.s3a.connection.maximum` for
-the http connections, and `fs.s3a.threads.max` for the thread pool.
+1. The pools aren't big enough. See ["Timeout waiting for connection from pool"](#timeout_from_pool)
+and the configuration sketch after this list.
 2. Likely root cause: whatever code is reading files isn't calling `close()`
 2. Likely root cause: whatever code is reading files isn't calling `close()`
 on the input streams. Make sure your code does this!
 on the input streams. Make sure your code does this!
 And if it's someone else's: make sure you have a recent version; search their
 And if it's someone else's: make sure you have a recent version; search their
@@ -497,81 +812,13 @@ issue trackers to see if its a known/fixed problem.
 If not, it's time to work with the developers, or come up with a workaround
 If not, it's time to work with the developers, or come up with a workaround
 (i.e closing the input stream yourself).
 (i.e closing the input stream yourself).
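+
+If the pools do turn out to be too small for the workload, the options to tune
+are `fs.s3a.connection.maximum` and `fs.s3a.threads.max`. A sketch with
+illustrative values only; tune them for your own deployment:
+
+```xml
+<property>
+  <name>fs.s3a.connection.maximum</name>
+  <value>100</value>
+</property>
+
+<property>
+  <name>fs.s3a.threads.max</name>
+  <value>64</value>
+</property>
+```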
 
 
-### "Timeout waiting for connection from pool"
 
 
-This the same problem as above, exhibiting itself as the http connection
-pool determining that it has run out of capacity.
-
-```
-
-java.io.InterruptedIOException: getFileStatus on s3a://example/fork-0007/test:
- com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
-  at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:145)
-  at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:119)
-  at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2040)
-  at org.apache.hadoop.fs.s3a.S3AFileSystem.checkPathForDirectory(S3AFileSystem.java:1857)
-  at org.apache.hadoop.fs.s3a.S3AFileSystem.innerMkdirs(S3AFileSystem.java:1890)
-  at org.apache.hadoop.fs.s3a.S3AFileSystem.mkdirs(S3AFileSystem.java:1826)
-  at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:2230)
-  ...
-Caused by: com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
-  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1069)
-  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1035)
-  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:742)
-  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:716)
-  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
-  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
-  at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
-  at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
-  at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4221)
-  at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4168)
-  at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1249)
-  at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:1162)
-  at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2022)
-  at org.apache.hadoop.fs.s3a.S3AFileSystem.checkPathForDirectory(S3AFileSystem.java:1857)
-  at org.apache.hadoop.fs.s3a.S3AFileSystem.innerMkdirs(S3AFileSystem.java:1890)
-  at org.apache.hadoop.fs.s3a.S3AFileSystem.mkdirs(S3AFileSystem.java:1826)
-  at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:2230)
-...
-Caused by: com.amazonaws.thirdparty.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
-  at com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:286)
-  at com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:263)
-  at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
-  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
-  at java.lang.reflect.Method.invoke(Method.java:498)
-  at com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70)
-  at com.amazonaws.http.conn.$Proxy15.get(Unknown Source)
-  at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:190)
-  at com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
-  at com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
-  at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
-  at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
-  at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
-  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1190)
-  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1030)
-  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:742)
-  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:716)
-  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
-  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
-  at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
-  at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
-  at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4221)
-  at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4168)
-  at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1249)
-  at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:1162)
-  at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2022)
-  at org.apache.hadoop.fs.s3a.S3AFileSystem.checkPathForDirectory(S3AFileSystem.java:1857)
-  at org.apache.hadoop.fs.s3a.S3AFileSystem.innerMkdirs(S3AFileSystem.java:1890)
-  at org.apache.hadoop.fs.s3a.S3AFileSystem.mkdirs(S3AFileSystem.java:1826)
-  at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:2230)
-```
-
-This is the same problem as the previous one, exhibited differently.
 
 
 ### Issue: when writing data, HTTP Exceptions logged at info from `AmazonHttpClient`
 ### Issue: when writing data, HTTP Exceptions logged at info from `AmazonHttpClient`
 
 
 ```
 ```
-[s3a-transfer-shared-pool4-t6] INFO  http.AmazonHttpClient (AmazonHttpClient.java:executeHelper(496)) - Unable to execute HTTP request: hwdev-steve-ireland-new.s3.amazonaws.com:443 failed to respond
+[s3a-transfer-shared-pool4-t6] INFO  http.AmazonHttpClient (AmazonHttpClient.java:executeHelper(496))
+ - Unable to execute HTTP request: hwdev-steve-ireland-new.s3.amazonaws.com:443 failed to respond
 org.apache.http.NoHttpResponseException: bucket.s3.amazonaws.com:443 failed to respond
 org.apache.http.NoHttpResponseException: bucket.s3.amazonaws.com:443 failed to respond
   at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143)
   at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143)
   at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
   at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
@@ -606,6 +853,45 @@ will attempt to retry the operation; it may just be a transient event. If there
 are many such exceptions in logs, it may be a symptom of connectivity or network
 are many such exceptions in logs, it may be a symptom of connectivity or network
 problems.
 problems.
 
 
+### `AWSBadRequestException` IllegalLocationConstraintException/The unspecified location constraint is incompatible
+
+```
+ Cause: org.apache.hadoop.fs.s3a.AWSBadRequestException: put on :
+  com.amazonaws.services.s3.model.AmazonS3Exception:
+   The unspecified location constraint is incompatible for the region specific
+    endpoint this request was sent to.
+    (Service: Amazon S3; Status Code: 400; Error Code: IllegalLocationConstraintException;
+
+  at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:178)
+  at org.apache.hadoop.fs.s3a.S3ALambda.execute(S3ALambda.java:64)
+  at org.apache.hadoop.fs.s3a.WriteOperationHelper.uploadObject(WriteOperationHelper.java:451)
+  at org.apache.hadoop.fs.s3a.commit.magic.MagicCommitTracker.aboutToComplete(MagicCommitTracker.java:128)
+  at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.close(S3ABlockOutputStream.java:373)
+  at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
+  at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
+  at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2429)
+  at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:106)
+  at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:91)
+  ...
+  Cause: com.amazonaws.services.s3.model.AmazonS3Exception:
+   The unspecified location constraint is incompatible for the region specific endpoint
+   this request was sent to. (Service: Amazon S3; Status Code: 400; Error Code: IllegalLocationConstraintException;
+   Request ID: EEBC5A08BCB3A645)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1588)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1258)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1030)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:742)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:716)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
+  at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
+  at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4221)
+  ...
+```
+
+Something has been trying to write data to "/".
+
 ## File System Semantics
 ## File System Semantics
 
 
 These are the issues where S3 does not appear to behave the way a filesystem
 These are the issues where S3 does not appear to behave the way a filesystem
@@ -664,7 +950,7 @@ that it is not there)
 This is a visible sign of updates to the metadata server lagging
 This is a visible sign of updates to the metadata server lagging
 behind the state of the underlying filesystem.
 behind the state of the underlying filesystem.
 
 
-Fix: Use S3Guard
+Fix: Use [S3Guard](s3guard.html).
 
 
 
 
 ### File not visible/saved
 ### File not visible/saved
@@ -686,26 +972,74 @@ and the like. The standard strategy here is to save to HDFS and then copy to S3.
 
 
 ## <a name="encryption"></a> S3 Server Side Encryption
 ## <a name="encryption"></a> S3 Server Side Encryption
 
 
-### Using SSE-KMS "Invalid arn"
+### `AWSS3IOException` `KMS.NotFoundException` "Invalid arn" when using SSE-KMS
 
 
 When performing file operations, the user may run into an issue where the KMS
 When performing file operations, the user may run into an issue where the KMS
 key arn is invalid.
 key arn is invalid.
+
 ```
 ```
-com.amazonaws.services.s3.model.AmazonS3Exception:
-Invalid arn (Service: Amazon S3; Status Code: 400; Error Code: KMS.NotFoundException; Request ID: 708284CF60EE233F),
-S3 Extended Request ID: iHUUtXUSiNz4kv3Bdk/hf9F+wjPt8GIVvBHx/HEfCBYkn7W6zmpvbA3XT7Y5nTzcZtfuhcqDunw=:
-Invalid arn (Service: Amazon S3; Status Code: 400; Error Code: KMS.NotFoundException; Request ID: 708284CF60EE233F)
+org.apache.hadoop.fs.s3a.AWSS3IOException: innerMkdirs on /test:
+ com.amazonaws.services.s3.model.AmazonS3Exception:
+  Invalid arn (Service: Amazon S3; Status Code: 400; Error Code: KMS.NotFoundException;
+   Request ID: CA89F276B3394565),
+   S3 Extended Request ID: ncz0LWn8zor1cUO2fQ7gc5eyqOk3YfyQLDn2OQNoe5Zj/GqDLggUYz9QY7JhdZHdBaDTh+TL5ZQ=:
+   Invalid arn (Service: Amazon S3; Status Code: 400; Error Code: KMS.NotFoundException; Request ID: CA89F276B3394565)
+  at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:194)
+  at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:117)
+  at org.apache.hadoop.fs.s3a.S3AFileSystem.mkdirs(S3AFileSystem.java:1541)
+  at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:2230)
+  at org.apache.hadoop.fs.contract.AbstractFSContractTestBase.mkdirs(AbstractFSContractTestBase.java:338)
+  at org.apache.hadoop.fs.contract.AbstractFSContractTestBase.setup(AbstractFSContractTestBase.java:193)
+  at org.apache.hadoop.fs.s3a.scale.S3AScaleTestBase.setup(S3AScaleTestBase.java:90)
+  at org.apache.hadoop.fs.s3a.scale.AbstractSTestS3AHugeFiles.setup(AbstractSTestS3AHugeFiles.java:77)
+  at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
+  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+  at java.lang.reflect.Method.invoke(Method.java:498)
+  at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
+  at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
+  at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
+  at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
+  at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
+  at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
+  at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
+Caused by: com.amazonaws.services.s3.model.AmazonS3Exception:
+ Invalid arn (Service: Amazon S3; Status Code: 400; Error Code: KMS.NotFoundException; Request ID: CA89F276B3394565)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1588)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1258)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1030)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:742)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:716)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
+  at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
+  at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
+  at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4221)
+  at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4168)
+  at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1718)
+  at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInOneChunk(UploadCallable.java:133)
+  at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:125)
+  at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:143)
+  at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:48)
+  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
+  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
+  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
+  at java.lang.Thread.run(Thread.java:745)
 ```
 ```
 
 
-This is due to either, the KMS key id is entered incorrectly, or the KMS key id
-is in a different region than the S3 bucket being used.
+Possible causes:
+
+* the KMS key ARN is entered incorrectly, or
+* the KMS key referenced by the ARN is in a different region than the S3 bucket
+being used.
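+
+To rule out the first cause, compare the configured key with the ARN shown in
+the AWS console. The relevant settings look like the following; the ARN value
+here is a made-up example, not a real key:
+
+```xml
+<property>
+  <name>fs.s3a.server-side-encryption-algorithm</name>
+  <value>SSE-KMS</value>
+</property>
+
+<property>
+  <name>fs.s3a.server-side-encryption.key</name>
+  <!-- example only: use the ARN of a key in the same region as the bucket -->
+  <value>arn:aws:kms:us-west-2:123456789012:key/1a2b3c4d-5e6f-7a8b-9c0d-ef1234567890</value>
+</property>
+```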
+
 
 
 ### Using SSE-C "Bad Request"
 ### Using SSE-C "Bad Request"
 
 
 When performing file operations the user may run into an unexpected 400/403
 When performing file operations the user may run into an unexpected 400/403
 error such as
 error such as
 ```
 ```
-org.apache.hadoop.fs.s3a.AWSS3IOException: getFileStatus on fork-4/: com.amazonaws.services.s3.model.AmazonS3Exception:
+org.apache.hadoop.fs.s3a.AWSS3IOException: getFileStatus on fork-4/:
+ com.amazonaws.services.s3.model.AmazonS3Exception:
 Bad Request (Service: Amazon S3; Status Code: 400;
 Bad Request (Service: Amazon S3; Status Code: 400;
 Error Code: 400 Bad Request; Request ID: 42F9A1987CB49A99),
 Error Code: 400 Bad Request; Request ID: 42F9A1987CB49A99),
 S3 Extended Request ID: jU2kcwaXnWj5APB14Cgb1IKkc449gu2+dhIsW/+7x9J4D+VUkKvu78mBo03oh9jnOT2eoTLdECU=:
 S3 Extended Request ID: jU2kcwaXnWj5APB14Cgb1IKkc449gu2+dhIsW/+7x9J4D+VUkKvu78mBo03oh9jnOT2eoTLdECU=:
@@ -719,83 +1053,49 @@ is used, no encryption is specified, or the SSE-C specified is incorrect.
 2. A directory is encrypted with a SSE-C keyA and the user is trying to move a
 2. A directory is encrypted with a SSE-C keyA and the user is trying to move a
 file using configured SSE-C keyB into that structure.
 file using configured SSE-C keyB into that structure.
 
 
-## <a name="performance"></a> Performance
-
-S3 is slower to read data than HDFS, even on virtual clusters running on
-Amazon EC2.
-
-* HDFS replicates data for faster query performance.
-* HDFS stores the data on the local hard disks, avoiding network traffic
- if the code can be executed on that host. As EC2 hosts often have their
- network bandwidth throttled, this can make a tangible difference.
-* HDFS is significantly faster for many "metadata" operations: listing
-the contents of a directory, calling `getFileStatus()` on path,
-creating or deleting directories. (S3Guard reduces but does not eliminate
-the speed gap).
-* On HDFS, Directory renames and deletes are `O(1)` operations. On
-S3 renaming is a very expensive `O(data)` operation which may fail partway through
-in which case the final state depends on where the copy+ delete sequence was when it failed.
-All the objects are copied, then the original set of objects are deleted, so
-a failure should not lose data —it may result in duplicate datasets.
-* Unless fast upload enabled, the write only begins on a `close()` operation.
-This can take so long that some applications can actually time out.
-* File IO involving many seek calls/positioned read calls will encounter
-performance problems due to the size of the HTTP requests made. Enable the
-"random" fadvise policy to alleviate this at the
-expense of sequential read performance and bandwidth.
-
-The slow performance of `rename()` surfaces during the commit phase of work,
-including
-
-* The MapReduce `FileOutputCommitter`. This also used by Apache Spark.
-* DistCp's rename-after-copy operation.
-* The `hdfs fs -rm` command renaming the file under `.Trash` rather than
-deleting it. Use `-skipTrash` to eliminate that step.
-
-These operations can be significantly slower when S3 is the destination
-compared to HDFS or other "real" filesystem.
+## <a name="not_all_bytes_were_read"></a> Message appears in logs "Not all bytes were read from the S3ObjectInputStream"
 
 
-*Improving S3 load-balancing behavior*
 
 
-Amazon S3 uses a set of front-end servers to provide access to the underlying data.
-The choice of which front-end server to use is handled via load-balancing DNS
-service: when the IP address of an S3 bucket is looked up, the choice of which
-IP address to return to the client is made based on the the current load
-of the front-end servers.
+This is a message which can be generated by the Amazon SDK when the client application
+calls `abort()` on the HTTP input stream, rather than reading to the end of
+the file/stream and then calling `close()`. The S3A client does call `abort()` when
+seeking around large files, [hence the message](https://github.com/aws/aws-sdk-java/issues/1211).
 
 
-Over time, the load across the front-end changes, so those servers considered
-"lightly loaded" will change. If the DNS value is cached for any length of time,
-your application may end up talking to an overloaded server. Or, in the case
-of failures, trying to talk to a server that is no longer there.
+No ASF Hadoop releases have shipped with an SDK which prints this message
+when used by the S3A client. However, third-party and private builds of Hadoop
+may cause the message to be logged.
 
 
-And by default, for historical security reasons in the era of applets,
-the DNS TTL of a JVM is "infinity".
+Ignore it. The S3A client does call `abort()`, but that's because our benchmarking
+shows that it is generally more efficient to abort the TCP connection and initiate
+a new one than read to the end of a large file.
 
 
-To work with AWS better, set the DNS time-to-live of an application which
-works with S3 to something lower. See [AWS documentation](http://docs.aws.amazon.com/AWSSdkDocsJava/latest/DeveloperGuide/java-dg-jvm-ttl.html).
+Note: the threshold at which the remaining data is read to the end of the stream,
+rather than the stream aborted, can be tuned via `fs.s3a.readahead.range`;
+the seek policy is set in `fs.s3a.experimental.input.fadvise`.
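+
+For example, a job doing mostly random IO over large files might use something
+like the following; the values are illustrative, not recommendations:
+
+```xml
+<property>
+  <name>fs.s3a.experimental.input.fadvise</name>
+  <value>random</value>
+</property>
+
+<property>
+  <name>fs.s3a.readahead.range</name>
+  <value>256K</value>
+</property>
+```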
 
 
-## <a name="network_performance"></a>Troubleshooting network performance
+### <a name="no_such_bucket"></a> `FileNotFoundException` Bucket does not exist.
 
 
-An example of this is covered in [HADOOP-13871](https://issues.apache.org/jira/browse/HADOOP-13871).
+The bucket does not exist.
 
 
-1. For public data, use `curl`:
-
-        curl -O https://landsat-pds.s3.amazonaws.com/scene_list.gz
-1. Use `nettop` to monitor a processes connections.
-
-Consider reducing the connection timeout of the s3a connection.
-
-```xml
-<property>
-  <name>fs.s3a.connection.timeout</name>
-  <value>15000</value>
-</property>
 ```
 ```
-This *may* cause the client to react faster to network pauses, so display
-stack traces fast. At the same time, it may be less resilient to
-connectivity problems.
+java.io.FileNotFoundException: Bucket stevel45r56666 does not exist
+  at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:361)
+  at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:293)
+  at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3288)
+  at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:123)
+  at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3337)
+  at org.apache.hadoop.fs.FileSystem$Cache.getUnique(FileSystem.java:3311)
+  at org.apache.hadoop.fs.FileSystem.newInstance(FileSystem.java:529)
+  at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:997)
+  at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:309)
+  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
+  at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:1218)
+  at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.main(S3GuardTool.java:1227)
+```
 
 
 
 
+Check the URI. If using a third-party store, verify that you've configured
+the client to talk to the specific server in `fs.s3a.endpoint`.
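+
+For example, to point the client at a specific region's endpoint, or at a
+third-party store; the hostname below is illustrative:
+
+```xml
+<property>
+  <name>fs.s3a.endpoint</name>
+  <value>s3.eu-west-2.amazonaws.com</value>
+</property>
+```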
+
 ## Other Issues
 ## Other Issues
 
 
 ### <a name="logging"></a> Enabling low-level logging
 ### <a name="logging"></a> Enabling low-level logging
@@ -816,7 +1116,7 @@ log4j.logger.org.apache.http=DEBUG
 ```
 ```
 
 
 
 
-This produces a log such as this, wich is for a V4-authenticated PUT of a 0-byte file used
+This produces a log such as this, which is for a V4-authenticated PUT of a 0-byte file used
 as an empty directory marker
 as an empty directory marker
 
 
 ```
 ```
@@ -866,9 +1166,9 @@ execchain.MainClientExec (MainClientExec.java:execute(284)) - Connection can be
 
 
 ## <a name="retries"></a>  Reducing failures by configuring retry policy
 ## <a name="retries"></a>  Reducing failures by configuring retry policy
 
 
-The S3A client can ba configured to rety those operations which are considered
-retriable. That can be because they are idempotent, or
-because there failure happened before the request was processed by S3.
+The S3A client can be configured to retry those operations which are considered
+retryable. That can be because they are idempotent, or
+because the failure happened before the request was processed by S3.
 
 
 The number of retries and interval between each retry can be configured:
 The number of retries and interval between each retry can be configured:
 
 
@@ -893,8 +1193,8 @@ Not all failures are retried. Specifically excluded are those considered
 unrecoverable:
 unrecoverable:
 
 
 * Low-level networking: `UnknownHostException`, `NoRouteToHostException`.
 * Low-level networking: `UnknownHostException`, `NoRouteToHostException`.
-* 302 redirects
-* Missing resources, 404/`FileNotFoundException`
+* 302 redirects.
+* Missing resources, 404/`FileNotFoundException`.
 * HTTP 416 response/`EOFException`. This can surface if the length of a file changes
 * HTTP 416 response/`EOFException`. This can surface if the length of a file changes
   while another client is reading it.
   while another client is reading it.
 * Failures during execution or result processing of non-idempotent operations where
 * Failures during execution or result processing of non-idempotent operations where
@@ -910,79 +1210,6 @@ be idempotent, and will retry them on failure. These are only really idempotent
 if no other client is attempting to manipulate the same objects, such as:
 if no other client is attempting to manipulate the same objects, such as:
 renaming() the directory tree or uploading files to the same location.
 renaming() the directory tree or uploading files to the same location.
 Please don't do that. Given that the emulated directory rename and delete operations
 Please don't do that. Given that the emulated directory rename and delete operations
-aren't atomic, even without retries, multiple S3 clients working with the same
+are not atomic, even without retries, multiple S3 clients working with the same
 paths can interfere with each other
 paths can interfere with each other
 
 
-#### <a name="retries"></a> Throttling
-
-When many requests are made of a specific S3 bucket (or shard inside it),
-S3 will respond with a 503 "throttled" response.
-Throttling can be recovered from, provided overall load decreases.
-Furthermore, because it is sent before any changes are made to the object store,
-is inherently idempotent. For this reason, the client will always attempt to
-retry throttled requests.
-
-The limit of the number of times a throttled request can be retried,
-and the exponential interval increase between attempts, can be configured
-independently of the other retry limits.
-
-```xml
-<property>
-  <name>fs.s3a.retry.throttle.limit</name>
-  <value>20</value>
-  <description>
-    Number of times to retry any throttled request.
-  </description>
-</property>
-
-<property>
-  <name>fs.s3a.retry.throttle.interval</name>
-  <value>500ms</value>
-  <description>
-    Interval between retry attempts on throttled requests.
-  </description>
-</property>
-```
-
-If a client is failing due to `AWSServiceThrottledException` failures,
-increasing the interval and limit *may* address this. However, it
-it is a sign of AWS services being overloaded by the sheer number of clients
-and rate of requests. Spreading data across different buckets, and/or using
-a more balanced directory structure may be beneficial.
-Consult [the AWS documentation](http://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html).
-
-Reading or writing data encrypted with SSE-KMS forces S3 to make calls of
-the AWS KMS Key Management Service, which comes with its own
-[Request Rate Limits](http://docs.aws.amazon.com/kms/latest/developerguide/limits.html).
-These default to 1200/second for an account, across all keys and all uses of
-them, which, for S3 means: across all buckets with data encrypted with SSE-KMS.
-
-###### Tips to Keep Throttling down
-
-* If you are seeing a lot of throttling responses on a large scale
-operation like a `distcp` copy, *reduce* the number of processes trying
-to work with the bucket (for distcp: reduce the number of mappers with the
-`-m` option).
-
-* If you are reading or writing lists of files, if you can randomize
-the list so they are not processed in a simple sorted order, you may
-reduce load on a specific shard of S3 data, so potentially increase throughput.
-
-* An S3 Bucket is throttled by requests coming from all
-simultaneous clients. Different applications and jobs may interfere with
-each other: consider that when troubleshooting.
-Partitioning data into different buckets may help isolate load here.
-
-* If you are using data encrypted with SSE-KMS, then the
-will also apply: these are stricter than the S3 numbers.
-If you believe that you are reaching these limits, you may be able to
-get them increased.
-Consult [the KMS Rate Limit documentation](http://docs.aws.amazon.com/kms/latest/developerguide/limits.html).
-
-* S3Guard uses DynamoDB for directory and file lookups;
-it is rate limited to the amount of (guaranteed) IO purchased for a
-table. If significant throttling events/rate is observed here, the preallocated
-IOPs can be increased with the `s3guard set-capacity` command, or
-through the AWS Console. Throttling events in S3Guard are noted in logs, and
-also in the S3A metrics `s3guard_metadatastore_throttle_rate` and
-`s3guard_metadatastore_throttled`.