HDFS Snapshots are read-only point-in-time copies of the file system. Snapshots can be taken on a subtree of the file system or the entire file system. Some common use cases of snapshots are data backup, protection against user errors and disaster recovery.
The implementation of HDFS Snapshots is efficient.
Snapshots can be taken on any directory once the directory has been set as snapshottable. A snapshottable directory is able to accommodate 65,536 simultaneous snapshots. There is no limit on the number of snapshottable directories. Administrators may set any directory to be snapshottable. If there are snapshots in a snapshottable directory, the directory can be neither deleted nor renamed before all the snapshots are deleted.
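As a rough illustration of these rules, the following toy Java model tracks one directory in memory. It is only a sketch: the class SnapshottableDirModel and all of its methods are hypothetical names made up for this example, and it does not reflect how the NameNode actually stores snapshots.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Toy in-memory model of the rules in the text; NOT the HDFS implementation.
// All names here are hypothetical.
final class SnapshottableDirModel {
    // Maximum number of simultaneous snapshots per directory, as stated in the text.
    static final int MAX_SNAPSHOTS = 65_536;

    private boolean snapshottable = false;
    private final Set<String> snapshots = new LinkedHashSet<>();

    /** Marks the directory as snapshottable (an administrator action). */
    void allowSnapshot() {
        snapshottable = true;
    }

    /** Records a snapshot; fails if the directory is not snapshottable or is full. */
    void createSnapshot(String name) {
        if (!snapshottable) {
            throw new IllegalStateException("directory is not snapshottable");
        }
        if (snapshots.size() >= MAX_SNAPSHOTS) {
            throw new IllegalStateException("snapshot limit reached");
        }
        // Assumption of this sketch: snapshot names are unique within a directory.
        if (!snapshots.add(name)) {
            throw new IllegalArgumentException("snapshot already exists: " + name);
        }
    }

    /** Removes a previously recorded snapshot. */
    void deleteSnapshot(String name) {
        if (!snapshots.remove(name)) {
            throw new IllegalArgumentException("no such snapshot: " + name);
        }
    }

    /** The directory may be deleted or renamed only when it holds no snapshots. */
    boolean canDeleteOrRename() {
        return snapshots.isEmpty();
    }

    public static void main(String[] args) {
        SnapshottableDirModel dir = new SnapshottableDirModel();
        dir.allowSnapshot();
        dir.createSnapshot("s0");
        System.out.println(dir.canDeleteOrRename()); // prints false
        dir.deleteSnapshot("s0");
        System.out.println(dir.canDeleteOrRename()); // prints true
    }
}
```

The model makes the ordering constraint explicit: a snapshot can only be taken after the directory is made snapshottable, and the directory stays pinned until its last snapshot is deleted.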
For a snapshottable directory, the path component ".snapshot" is used for accessing its snapshots. Suppose /foo is a snapshottable directory, /foo/bar is a file/directory in /foo, and /foo has a snapshot s0. Then, the path /foo/.snapshot/s0/bar refers to the snapshot copy of /foo/bar.
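The usual API and CLI can work with the ".snapshot" paths. For example, listing /foo/.snapshot shows all the snapshots under the snapshottable directory /foo, listing /foo/.snapshot/s0 shows the files in snapshot s0, and a file can be copied out of snapshot s0 by using a path under /foo/.snapshot/s0 as the source.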
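The ".snapshot" path convention can be sketched with a small string helper, assuming the layout <snapshottable directory>/.snapshot/<snapshot name>/<relative path>. The class SnapshotPaths and its methods are hypothetical names for this example and are not part of the HDFS API; the code only builds path strings and does not contact a cluster.

```java
// Sketch only: SnapshotPaths, snapshotRoot and snapshotPath are hypothetical helpers.
final class SnapshotPaths {
    // Reserved path component through which snapshots are accessed.
    static final String DOT_SNAPSHOT = ".snapshot";

    /** Path under which all snapshots of a directory are listed, e.g. /foo/.snapshot. */
    static String snapshotRoot(String snapshottableDir) {
        return stripTrailingSlashes(snapshottableDir) + "/" + DOT_SNAPSHOT;
    }

    /** Path of relativePath as captured in the named snapshot. */
    static String snapshotPath(String snapshottableDir, String snapshotName, String relativePath) {
        String base = snapshotRoot(snapshottableDir) + "/" + snapshotName;
        if (relativePath == null || relativePath.isEmpty()) {
            return base;
        }
        // Accept both "bar" and "/bar" as the relative part.
        String rel = relativePath.startsWith("/") ? relativePath.substring(1) : relativePath;
        return base + "/" + rel;
    }

    // Removes trailing slashes so that "/foo/" and "/" do not produce "//".
    private static String stripTrailingSlashes(String path) {
        String p = path;
        while (p.endsWith("/")) {
            p = p.substring(0, p.length() - 1);
        }
        return p;
    }

    public static void main(String[] args) {
        System.out.println(snapshotRoot("/foo"));              // prints /foo/.snapshot
        System.out.println(snapshotPath("/foo", "s0", "bar")); // prints /foo/.snapshot/s0/bar
    }
}
```

A string built this way can be handed to the same API or CLI calls that accept ordinary HDFS paths, since snapshot paths are read through the regular interfaces.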
The name ".snapshot" is now a reserved file name in HDFS, so users cannot create a file/directory with ".snapshot" as the name. If a file or directory named ".snapshot" exists in a previous version of HDFS, it must be renamed before the upgrade; otherwise, the upgrade will fail.
The operations described in this section require superuser privilege.
Allowing snapshots of a directory to be created. If the operation completes successfully, the directory becomes snapshottable.
path | The path of the snapshottable directory. |
See also the corresponding Java API void allowSnapshot(Path path) in HdfsAdmin.
Disallowing snapshots of a directory to be created. All snapshots of the directory must be deleted before disallowing snapshots.
path | The path of the snapshottable directory. |
See also the corresponding Java API void disallowSnapshot(Path path) in HdfsAdmin.
This section describes user operations. Note that the HDFS superuser can perform all the operations without satisfying the permission requirements of the individual operations.
Create a snapshot of a snapshottable directory. This operation requires owner privilege of the snapshottable directory.
path | The path of the snapshottable directory. |
snapshotName | The snapshot name, which is an optional argument. When it is omitted, a default name is generated using a timestamp with the format "'s'yyyyMMdd-HHmmss.SSS", e.g. "s20130412-151029.033". |
See also the corresponding Java API Path createSnapshot(Path path) and Path createSnapshot(Path path, String snapshotName) in FileSystem. The snapshot path is returned in these methods.
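The default-name format can be tried out locally with the pattern string quoted in the argument description. This is a sketch under assumptions: it uses java.time rather than whatever date-formatting class HDFS uses internally, and the class DefaultSnapshotName and its method nameFor are hypothetical names for this example.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Sketch only: reproduces the documented name format, not the HDFS code path.
final class DefaultSnapshotName {
    // Pattern copied from the text: a literal 's', then date, '-', time and milliseconds.
    static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("'s'yyyyMMdd-HHmmss.SSS", Locale.ROOT);

    /** Formats the given timestamp in the default snapshot name format. */
    static String nameFor(LocalDateTime time) {
        return FORMAT.format(time);
    }

    public static void main(String[] args) {
        // 2013-04-12 15:10:29.033, the timestamp behind the example name in the text.
        LocalDateTime t = LocalDateTime.of(2013, 4, 12, 15, 10, 29, 33_000_000);
        System.out.println(nameFor(t)); // prints s20130412-151029.033
    }
}
```

Because the name sorts lexicographically in time order, default-named snapshots of one directory list chronologically.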
Delete a snapshot from a snapshottable directory. This operation requires owner privilege of the snapshottable directory.
path | The path of the snapshottable directory. |
snapshotName | The snapshot name. |
See also the corresponding Java API void deleteSnapshot(Path path, String snapshotName) in FileSystem.
Rename a snapshot. This operation requires owner privilege of the snapshottable directory.
path | The path of the snapshottable directory. |
oldName | The old snapshot name. |
newName | The new snapshot name. |
See also the corresponding Java API void renameSnapshot(Path path, String oldName, String newName) in FileSystem.
Get all the snapshottable directories where the current user has permission to take snapshots.
See also the corresponding Java API SnapshottableDirectoryStatus[] getSnapshottableDirectoryListing() in DistributedFileSystem.
Get the differences between two snapshots. This operation requires read access privilege for all files/directories in both snapshots.
path | The path of the snapshottable directory. |
fromSnapshot | The name of the starting snapshot. |
toSnapshot | The name of the ending snapshot. |
See also the corresponding Java API SnapshotDiffReport getSnapshotDiffReport(Path path, String fromSnapshot, String toSnapshot) in DistributedFileSystem.
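The idea of a difference between two snapshots can be illustrated with a toy comparison of two path listings. This is a deliberately simplified sketch: it only detects created and deleted paths, it is not how getSnapshotDiffReport is implemented, and the class ToySnapshotDiff and its methods are hypothetical names for this example.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Toy comparison of two snapshot listings; NOT the HDFS diff algorithm.
final class ToySnapshotDiff {
    /** Paths present in the ending snapshot but not in the starting one. */
    static Set<String> created(Set<String> fromSnapshot, Set<String> toSnapshot) {
        Set<String> result = new TreeSet<>(toSnapshot);
        result.removeAll(fromSnapshot);
        return result;
    }

    /** Paths present in the starting snapshot but not in the ending one. */
    static Set<String> deleted(Set<String> fromSnapshot, Set<String> toSnapshot) {
        Set<String> result = new TreeSet<>(fromSnapshot);
        result.removeAll(toSnapshot);
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical listings of a directory at two points in time.
        Set<String> s0 = new TreeSet<>(Arrays.asList("bar", "baz"));
        Set<String> s1 = new TreeSet<>(Arrays.asList("bar", "qux"));
        System.out.println("created: " + created(s0, s1)); // prints created: [qux]
        System.out.println("deleted: " + deleted(s0, s1)); // prints deleted: [baz]
    }
}
```

The argument order mirrors fromSnapshot and toSnapshot in the API signature: swapping the two listings swaps what counts as created and what counts as deleted.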