- <div id="breadcrumbs">
-
- <div class="xright"> <a href="http://wiki.apache.org/hadoop" class="externalLink">Wiki</a>
- |
- <a href="https://gitbox.apache.org/repos/asf/hadoop.git" class="externalLink">git</a>
-
- | Last Published: 2025-05-06
- | Version: 3.5.0-SNAPSHOT
- </div>
- <div class="clear">
- <hr/>
- </div>
- </div>
- <div id="leftColumn">
- <div id="navcolumn">
-
- <h5>General</h5>
- <ul>
- <li class="none">
- <a href="../../index.html">Overview</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/SingleCluster.html">Single Node Setup</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/ClusterSetup.html">Cluster Setup</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/CommandsManual.html">Commands Reference</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/FileSystemShell.html">FileSystem Shell</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/Compatibility.html">Compatibility Specification</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/DownstreamDev.html">Downstream Developer's Guide</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/AdminCompatibilityGuide.html">Admin Compatibility Guide</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/InterfaceClassification.html">Interface Classification</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/filesystem/index.html">FileSystem Specification</a>
- </li>
- </ul>
- <h5>Common</h5>
- <ul>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/CLIMiniCluster.html">CLI Mini Cluster</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/FairCallQueue.html">Fair Call Queue</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/NativeLibraries.html">Native Libraries</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/Superusers.html">Proxy User</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/RackAwareness.html">Rack Awareness</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/SecureMode.html">Secure Mode</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/ServiceLevelAuth.html">Service Level Authorization</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/HttpAuthentication.html">HTTP Authentication</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/CredentialProviderAPI.html">Credential Provider API</a>
- </li>
- <li class="none">
- <a href="../../hadoop-kms/index.html">Hadoop KMS</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/Tracing.html">Tracing</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/UnixShellGuide.html">Unix Shell Guide</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/registry/index.html">Registry</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/AsyncProfilerServlet.html">Async Profiler</a>
- </li>
- </ul>
- <h5>HDFS</h5>
- <ul>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/HdfsDesign.html">Architecture</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html">User Guide</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/HDFSCommands.html">Commands Reference</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html">NameNode HA With QJM</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html">NameNode HA With NFS</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/ObserverNameNode.html">Observer NameNode</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/Federation.html">Federation</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/ViewFs.html">ViewFs</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/ViewFsOverloadScheme.html">ViewFsOverloadScheme</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html">Snapshots</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/HdfsEditsViewer.html">Edits Viewer</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/HdfsImageViewer.html">Image Viewer</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html">Permissions and HDFS</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/HdfsQuotaAdminGuide.html">Quotas and HDFS</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/LibHdfs.html">libhdfs (C API)</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/WebHDFS.html">WebHDFS (REST API)</a>
- </li>
- <li class="none">
- <a href="../../hadoop-hdfs-httpfs/index.html">HttpFS</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html">Short Circuit Local Reads</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html">Centralized Cache Management</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html">NFS Gateway</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html">Rolling Upgrade</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/ExtendedAttributes.html">Extended Attributes</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html">Transparent Encryption</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html">Multihoming</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html">Storage Policies</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/MemoryStorage.html">Memory Storage Support</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/SLGUserGuide.html">Synthetic Load Generator</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/HDFSErasureCoding.html">Erasure Coding</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/HDFSDiskbalancer.html">Disk Balancer</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/HdfsUpgradeDomain.html">Upgrade Domain</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/HdfsDataNodeAdminGuide.html">DataNode Admin</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs-rbf/HDFSRouterFederation.html">Router Federation</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/HdfsProvidedStorage.html">Provided Storage</a>
- </li>
- </ul>
- <h5>MapReduce</h5>
- <ul>
- <li class="none">
- <a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html">Tutorial</a>
- </li>
- <li class="none">
- <a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapredCommands.html">Commands Reference</a>
- </li>
- <li class="none">
- <a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html">Compatibility with 1.x</a>
- </li>
- <li class="none">
- <a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/EncryptedShuffle.html">Encrypted Shuffle</a>
- </li>
- <li class="none">
- <a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/PluggableShuffleAndPluggableSort.html">Pluggable Shuffle/Sort</a>
- </li>
- <li class="none">
- <a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/DistributedCacheDeploy.html">Distributed Cache Deploy</a>
- </li>
- <li class="none">
- <a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/SharedCacheSupport.html">Support for YARN Shared Cache</a>
- </li>
- </ul>
- <h5>MapReduce REST APIs</h5>
- <ul>
- <li class="none">
- <a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapredAppMasterRest.html">MR Application Master</a>
- </li>
- <li class="none">
- <a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-hs/HistoryServerRest.html">MR History Server</a>
- </li>
- </ul>
- <h5>YARN</h5>
- <ul>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/YARN.html">Architecture</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/YarnCommands.html">Commands Reference</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html">Capacity Scheduler</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/FairScheduler.html">Fair Scheduler</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/ResourceManagerRestart.html">ResourceManager Restart</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html">ResourceManager HA</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/ResourceModel.html">Resource Model</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/NodeLabel.html">Node Labels</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/NodeAttributes.html">Node Attributes</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/WebApplicationProxy.html">Web Application Proxy</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/TimelineServer.html">Timeline Server</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/TimelineServiceV2.html">Timeline Service V.2</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html">Writing YARN Applications</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/YarnApplicationSecurity.html">YARN Application Security</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/NodeManager.html">NodeManager</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/DockerContainers.html">Running Applications in Docker Containers</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/RuncContainers.html">Running Applications in runC Containers</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/NodeManagerCgroups.html">Using CGroups</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/SecureContainer.html">Secure Containers</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/ReservationSystem.html">Reservation System</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/GracefulDecommission.html">Graceful Decommission</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/OpportunisticContainers.html">Opportunistic Containers</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/Federation.html">YARN Federation</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/SharedCache.html">Shared Cache</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/UsingGpus.html">Using GPU</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/UsingFPGA.html">Using FPGA</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/PlacementConstraints.html">Placement Constraints</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/YarnUI2.html">YARN UI2</a>
- </li>
- </ul>
- <h5>YARN REST APIs</h5>
- <ul>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/WebServicesIntro.html">Introduction</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html">Resource Manager</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/NodeManagerRest.html">Node Manager</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/TimelineServer.html#Timeline_Server_REST_API_v1">Timeline Server</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/TimelineServiceV2.html#Timeline_Service_v.2_REST_API">Timeline Service V.2</a>
- </li>
- </ul>
- <h5>YARN Service</h5>
- <ul>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/yarn-service/Overview.html">Overview</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/yarn-service/QuickStart.html">QuickStart</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/yarn-service/Concepts.html">Concepts</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/yarn-service/YarnServiceAPI.html">Yarn Service API</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/yarn-service/ServiceDiscovery.html">Service Discovery</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-site/yarn-service/SystemServices.html">System Services</a>
- </li>
- </ul>
- <h5>Hadoop Compatible File Systems</h5>
- <ul>
- <li class="none">
- <a href="../../hadoop-aliyun/tools/hadoop-aliyun/index.html">Aliyun OSS</a>
- </li>
- <li class="none">
- <a href="../../hadoop-aws/tools/hadoop-aws/index.html">Amazon S3</a>
- </li>
- <li class="none">
- <a href="../../hadoop-azure/index.html">Azure Blob Storage</a>
- </li>
- <li class="none">
- <a href="../../hadoop-azure-datalake/index.html">Azure Data Lake Storage</a>
- </li>
- <li class="none">
- <a href="../../hadoop-cos/cloud-storage/index.html">Tencent COS</a>
- </li>
- <li class="none">
- <a href="../../hadoop-huaweicloud/index.html">Huaweicloud OBS</a>
- </li>
- <li class="none">
- <a href="../../hadoop-tos/cloud-storage/index.html">VolcanoEngine TOS</a>
- </li>
- </ul>
- <h5>Auth</h5>
- <ul>
- <li class="none">
- <a href="../../hadoop-auth/index.html">Overview</a>
- </li>
- <li class="none">
- <a href="../../hadoop-auth/Examples.html">Examples</a>
- </li>
- <li class="none">
- <a href="../../hadoop-auth/Configuration.html">Configuration</a>
- </li>
- <li class="none">
- <a href="../../hadoop-auth/BuildingIt.html">Building</a>
- </li>
- </ul>
- <h5>Tools</h5>
- <ul>
- <li class="none">
- <a href="../../hadoop-streaming/HadoopStreaming.html">Hadoop Streaming</a>
- </li>
- <li class="none">
- <a href="../../hadoop-archives/HadoopArchives.html">Hadoop Archives</a>
- </li>
- <li class="none">
- <a href="../../hadoop-archive-logs/HadoopArchiveLogs.html">Hadoop Archive Logs</a>
- </li>
- <li class="none">
- <a href="../../hadoop-distcp/DistCp.html">DistCp</a>
- </li>
- <li class="none">
- <a href="../../hadoop-federation-balance/HDFSFederationBalance.html">HDFS Federation Balance</a>
- </li>
- <li class="none">
- <a href="../../hadoop-gridmix/GridMix.html">GridMix</a>
- </li>
- <li class="none">
- <a href="../../hadoop-rumen/Rumen.html">Rumen</a>
- </li>
- <li class="none">
- <a href="../../hadoop-resourceestimator/ResourceEstimator.html">Resource Estimator Service</a>
- </li>
- <li class="none">
- <a href="../../hadoop-sls/SchedulerLoadSimulator.html">Scheduler Load Simulator</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/Benchmarking.html">Hadoop Benchmarking</a>
- </li>
- <li class="none">
- <a href="../../hadoop-dynamometer/Dynamometer.html">Dynamometer</a>
- </li>
- </ul>
- <h5>Reference</h5>
- <ul>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/release/">Changelog and Release Notes</a>
- </li>
- <li class="none">
- <a href="../../api/index.html">Java API docs</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/UnixShellAPI.html">Unix Shell API</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/Metrics.html">Metrics</a>
- </li>
- </ul>
- <h5>Configuration</h5>
- <ul>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/core-default.xml">core-default.xml</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs/hdfs-default.xml">hdfs-default.xml</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-hdfs-rbf/hdfs-rbf-default.xml">hdfs-rbf-default.xml</a>
- </li>
- <li class="none">
- <a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml">mapred-default.xml</a>
- </li>
- <li class="none">
- <a href="../../hadoop-yarn/hadoop-yarn-common/yarn-default.xml">yarn-default.xml</a>
- </li>
- <li class="none">
- <a href="../../hadoop-kms/kms-default.html">kms-default.xml</a>
- </li>
- <li class="none">
- <a href="../../hadoop-hdfs-httpfs/httpfs-default.html">httpfs-default.xml</a>
- </li>
- <li class="none">
- <a href="../../hadoop-project-dist/hadoop-common/DeprecatedProperties.html">Deprecated Properties</a>
- </li>
- </ul>
- <a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy">
- <img alt="Built by Maven" src="../images/logos/maven-feather.png"/>
- </a>
-
- </div>
- </div>
- <div id="bodyColumn">
- <div id="contentBox">
- <!---
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
- http://www.apache.org/licenses/LICENSE-2.0
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. See accompanying LICENSE file.
- -->
- <h1>Integration of VolcanoEngine TOS in Hadoop</h1><section>
- <h2><a name="Overview"></a>Overview</h2>
- <p>TOS is the object storage service of Volcano Engine, a cloud vendor launched by ByteDance. Hadoop-tos is a connector between computing systems and the underlying storage. For systems such as Hadoop MapReduce, Hive (on both MR and Tez) and Spark, hadoop-tos lets them use TOS as the underlying storage system instead of HDFS.</p></section><section>
- <h2><a name="Quick_Start"></a>Quick Start</h2>
- <p>In this quick start, we will use Hadoop shell commands to access a TOS bucket.</p><section>
- <h3><a name="Requirements"></a>Requirements</h3>
- <ol style="list-style-type: decimal">
- <li>A Volcano Engine account. Use the account to create a TOS bucket.</li>
- <li>A dev environment that can access TOS, e.g. a local server or a Volcano Engine cloud server.</li>
- <li>Hadoop installed in the dev environment; the installation directory is referred to as <code>$HADOOP_HOME</code> below.</li>
- </ol></section><section>
- <h3><a name="Usage"></a>Usage</h3>
- <ul>
- <li>Compile the hadoop-tos bundle jar. The hadoop-tos bundle is not packaged in the final Hadoop tar file, so we have to build it manually. Download the Hadoop source and build it with the command below.</li>
- </ul>
- <div class="source">
- <div class="source">
- <pre>mvn package -DskipTests -pl org.apache.hadoop:hadoop-tos
- </pre></div></div>
- <ul>
- <li>
- <p>The bundle jar file is produced under the Hadoop source tree at <code>hadoop-cloud-storage-project/hadoop-tos/target/hadoop-tos-{VERSION}.jar</code>.</p>
- </li>
- <li>
- <p>Copy the bundle jar to the HDFS lib path, <code>$HADOOP_HOME/share/hadoop/hdfs/lib</code>. Remember to copy it to all Hadoop nodes.</p>
- </li>
- </ul>
- <div class="source">
- <div class="source">
- <pre>cp hadoop-tos-{VERSION}.jar $HADOOP_HOME/share/hadoop/hdfs/lib/
- </pre></div></div>
- <ul>
- <li>Configure the properties below in the Hadoop configuration (e.g. core-site.xml).</li>
- </ul>
- <div class="source">
- <div class="source">
- <pre><configuration>
- <property>
- <name>fs.defaultFS</name>
- <value>tos://{your_bucket_name}/</value>
- <description>
- The name of the default file system. Make it your tos bucket.
- </description>
- </property>
- <property>
- <name>fs.tos.endpoint</name>
- <value></value>
- <description>
- Object storage endpoint to connect to, which should include both region and object domain name.
- e.g. 'fs.tos.endpoint'='tos-cn-beijing.volces.com'.
- </description>
- </property>
- <property>
- <name>fs.tos.impl</name>
- <value>org.apache.hadoop.fs.tosfs.TosFileSystem</value>
- <description>
- The implementation class of the tos FileSystem.
- </description>
- </property>
- <property>
- <name>fs.AbstractFileSystem.tos.impl</name>
- <value>org.apache.hadoop.fs.tosfs.TosFS</value>
- <description>
- The implementation class of the tos AbstractFileSystem.
- </description>
- </property>
- <property>
- <name>fs.tos.access-key-id</name>
- <value></value>
- <description>
- The access key of volcano engine's user or role.
- </description>
- </property>
- <property>
- <name>fs.tos.secret-access-key</name>
- <value></value>
- <description>
- The secret key of the access key specified by 'fs.tos.access-key-id'.
- </description>
- </property>
- </configuration>
- </pre></div></div>
- <ul>
- <li>Use Hadoop shell commands to access TOS.</li>
- </ul>
- <div class="source">
- <div class="source">
- <pre># 1. List root dir.
- hadoop fs -ls /
- # 2. Make directory.
- hadoop fs -mkdir /hadoop-tos
- # 3. Write and read.
- echo "hello tos." > hello.txt
- hadoop fs -put hello.txt /hadoop-tos/
- hadoop fs -cat /hadoop-tos/hello.txt
- # 4. Delete file and directory.
- hadoop fs -rm -r /hadoop-tos/
- </pre></div></div>
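- <p>The same operations can also be performed programmatically through the Hadoop FileSystem API. The snippet below is a minimal sketch, assuming the configuration above is on the classpath; the bucket name is a placeholder.</p>
- <div class="source">
- <div class="source">
- <pre>import java.net.URI;
- import java.nio.charset.StandardCharsets;
- import org.apache.hadoop.conf.Configuration;
- import org.apache.hadoop.fs.FSDataOutputStream;
- import org.apache.hadoop.fs.FileStatus;
- import org.apache.hadoop.fs.FileSystem;
- import org.apache.hadoop.fs.Path;
- 
- public class TosQuickStart {
-   public static void main(String[] args) throws Exception {
-     Configuration conf = new Configuration();   // picks up core-site.xml from the classpath
-     FileSystem fs = FileSystem.get(URI.create("tos://your_bucket_name/"), conf);
- 
-     fs.mkdirs(new Path("/hadoop-tos"));         // hadoop fs -mkdir /hadoop-tos
- 
-     // hadoop fs -put: the object becomes visible once the stream is closed.
-     try (FSDataOutputStream out = fs.create(new Path("/hadoop-tos/hello.txt"))) {
-       out.write("hello tos.\n".getBytes(StandardCharsets.UTF_8));
-     }
- 
-     for (FileStatus status : fs.listStatus(new Path("/"))) {   // hadoop fs -ls /
-       System.out.println(status.getPath());
-     }
- 
-     fs.delete(new Path("/hadoop-tos"), true);   // hadoop fs -rm -r /hadoop-tos
-   }
- }
- </pre></div></div>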
- </section></section><section>
- <h2><a name="Introduction"></a>Introduction</h2>
- <p>This is a brief introduction to the hadoop-tos design and its basic functions. The following content assumes flat mode by default; the differences in hierarchy mode are explained at the end of each section.</p><section>
- <h3><a name="TOS"></a>TOS</h3>
- <p>TOS is the object storage service of Volcano Engine. It is similar to AWS S3, Azure Blob Storage and Aliyun OSS, and has some unique features such as fast object copy, fast object rename, and CRC32C checksums. Learn more about TOS at <a class="externalLink" href="https://www.volcengine.com/product/TOS">https://www.volcengine.com/product/TOS</a>.</p>
- <p>TOS has two modes: flat mode and hierarchy mode. In flat mode there are no directories; all objects are files indexed by their object names. Users can use slashes in object names to logically divide objects into different “directories”, though these “directories divided by slashes” are not real. Cleaning up a logical directory means deleting all objects whose names have the “directory path” as a prefix.</p>
- <p>In hierarchy mode there are directories and files. A directory object is an object whose name ends with a slash. All objects whose names start with the directory object’s name are its descendants, and together they form a directory tree. A directory object can’t contain any data. Deleting or renaming a directory object will delete or rename all objects under the directory tree atomically.</p>
- <p>TOS has some distinctive features that are very useful in big data scenarios.</p>
- <ol style="list-style-type: decimal">
- <li>The fast copy feature enables users to duplicate objects without copying data; even huge objects can be copied within tens of milliseconds.</li>
- <li>The fast rename feature enables users to rename an object without copying data.</li>
- <li>TOS supports CRC32C checksums, which enables users to compare file checksums between HDFS and TOS.</li>
- </ol></section><section>
- <h3><a name="Directory_and_file"></a>Directory and file</h3>
- <p>This section illustrates how hadoop-tos maps TOS onto a Hadoop FileSystem. TOS requires that an object’s name must not start with a slash, must not contain consecutive slashes, and must not be empty. Here are the transformation rules.</p>
- <ul>
- <li>An object name is divided by slashes to form a hierarchy.</li>
- <li>An object whose name ends with a slash is a directory.</li>
- <li>An object whose name doesn’t end with a slash is a file.</li>
- <li>A file’s parent paths are directories, whether or not the corresponding directory objects exist.</li>
- </ul>
- <p>For example, suppose we have two objects, “user/table/” and “user/table/part-0”. The first object is mapped to “/user/table” in Hadoop and is a directory. The second object is mapped to “/user/table/part-0” as a file. The non-existent object “user/” is mapped to “/user” as a directory because it is an ancestor of the file “/user/table/part-0”.</p>
- <table border="0" class="bodyTable">
- <thead>
- <tr class="a">
- <th> Object name </th>
- <th> Object existence </th>
- <th> FileSystem path </th>
- <th> FileSystem Type </th></tr>
- </thead><tbody>
- <tr class="b">
- <td> user/table/ </td>
- <td> yes </td>
- <td> /user/table </td>
- <td> Directory </td></tr>
- <tr class="a">
- <td> user/table/part-0 </td>
- <td> yes </td>
- <td> /user/table/part-0 </td>
- <td> File </td></tr>
- <tr class="b">
- <td> user/ </td>
- <td> no </td>
- <td> /user </td>
- <td> Directory </td></tr>
- </tbody>
- </table>
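- <p>For illustration, below is a minimal sketch of how this mapping surfaces through the FileSystem API, assuming the bucket contains only the two objects from the example above.</p>
- <div class="source">
- <div class="source">
- <pre>import org.apache.hadoop.conf.Configuration;
- import org.apache.hadoop.fs.FileSystem;
- import org.apache.hadoop.fs.Path;
- 
- public class MappingExample {
-   public static void main(String[] args) throws Exception {
-     // fs.defaultFS points at the tos bucket.
-     FileSystem fs = FileSystem.get(new Configuration());
- 
-     // Backed by the directory object "user/table/".
-     System.out.println(fs.getFileStatus(new Path("/user/table")).isDirectory());    // true
-     // Backed by the file object "user/table/part-0".
-     System.out.println(fs.getFileStatus(new Path("/user/table/part-0")).isFile());  // true
-     // The object "user/" does not exist, but "/user" is still reported as a directory.
-     System.out.println(fs.getFileStatus(new Path("/user")).isDirectory());          // true
-   }
- }
- </pre></div></div>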
- <p>The FileSystem rules above are not enforced in flat mode; users can construct cases that violate them, for example creating a file whose parent is itself a file. In hierarchy mode the rules are enforced by the TOS service, so such semantic violations cannot occur.</p></section><section>
- <h3><a name="List.2C_Rename_and_Delete"></a>List, Rename and Delete</h3>
- <p>List, rename and delete are costly operations in flat mode. Since the namespace is flat, to list a directory the client needs to scan all objects with the directory as the prefix and filter them with a delimiter. To rename or delete a directory, the client first lists the directory to get all objects and then renames or deletes the objects one by one. These operations are therefore not atomic and cost a lot compared to HDFS.</p>
- <p>The distinguishing feature of hierarchy mode is native directory support, so it can list very fast and supports atomic directory rename and delete. A rename or delete failure in flat mode may leave the bucket in an inconsistent state; hierarchy mode does not have this problem.</p></section><section>
- <h3><a name="Read_and_write_file"></a>Read and write file</h3>
- <p>The read behaviour in hadoop-tos is very similar to reading an HDFS file. The challenge is keeping the input stream consistent with the object: if the object is changed after we open the file, the input stream should fail. This is implemented by saving the file checksum when the file is opened. If the file is changed while reading, the input stream compares the checksums and throws an exception.</p>
- <p>The write behaviour in hadoop-tos is slightly different from HDFS. Firstly, the append interface is not supported. Secondly, a file is not visible until it is successfully closed. Finally, when two clients write to the same file, the last client to close the file overrides the other.</p>
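- <p>Below is a minimal sketch of these read and write semantics through the FileSystem API; the path is a placeholder.</p>
- <div class="source">
- <div class="source">
- <pre>import java.nio.charset.StandardCharsets;
- import org.apache.hadoop.conf.Configuration;
- import org.apache.hadoop.fs.FSDataInputStream;
- import org.apache.hadoop.fs.FSDataOutputStream;
- import org.apache.hadoop.fs.FileSystem;
- import org.apache.hadoop.fs.Path;
- 
- public class ReadWriteExample {
-   public static void main(String[] args) throws Exception {
-     FileSystem fs = FileSystem.get(new Configuration());
-     Path file = new Path("/hadoop-tos/data.txt");
- 
-     // Write: the object only becomes visible after close() succeeds.
-     try (FSDataOutputStream out = fs.create(file, true)) {
-       out.write("hello tos.\n".getBytes(StandardCharsets.UTF_8));
-     }
- 
-     // Read: the checksum saved at open time pins the stream to this version of
-     // the object; if the object is replaced while reading, the stream fails.
-     byte[] buffer = new byte[32];
-     try (FSDataInputStream in = fs.open(file)) {
-       int n = in.read(buffer);
-       System.out.println(new String(buffer, 0, n, StandardCharsets.UTF_8));
-     }
- 
-     // Append is not supported by hadoop-tos, so fs.append(file) is expected to fail.
-   }
- }
- </pre></div></div>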
- <p>Both read and write paths include many performance optimizations, e.g. range reads, connection reuse, local write buffers, simple put for small files, and multipart upload for big files.</p></section><section>
- <h3><a name="Permissions"></a>Permissions</h3>
- <p>The TOS permission model is different from the Hadoop filesystem permission model. TOS supports permissions based on IAM, bucket policy, and bucket and object ACLs, while the Hadoop filesystem permission model uses mode and ACLs. There is no way to map TOS permissions to Hadoop filesystem permissions, so TosFileSystem and TosFS use fake permissions: users can read and change them, but they are only visible, not effective. Permission control ultimately depends on the TOS permission model.</p></section><section>
- <h3><a name="Times"></a>Times</h3>
- <p>Hadoop-tos supports last modified time and doesn’t support access time. For files, the last modified time is the object’s modified time. For directories, if the directory object doesn’t exist, the last modified time is the current system time. If the directory object exists, the last modified time is the object’s modified time for <code>getFileStatus</code> and the current system time for <code>listStatus</code>.</p></section><section>
- <h3><a name="File_checksum"></a>File checksum</h3>
- <p>TOS supports CRC64ECMA checksums by default, and they are mapped to the Hadoop FileChecksum. We can retrieve it by calling <code>FileSystem#getFileChecksum</code>. To be compatible with HDFS, TOS provides an optional CRC32C checksum. When we distcp between HDFS and TOS, we can rely on the distcp checksum mechanism to keep data consistent. To use CRC32C, configure the keys below.</p>
- <div class="source">
- <div class="source">
- <pre><configuration>
- <property>
- <name>fs.tos.checksum.enabled</name>
- <value>true</value>
- </property>
- <property>
- <name>fs.tos.checksum-algorithm</name>
- <value>COMPOSITE-CRC32C</value>
- </property>
- <property>
- <name>fs.tos.checksum-type</name>
- <value>CRC32C</value>
- </property>
- </configuration>
- </pre></div></div>
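- <p>With CRC32C enabled as above, checksums from HDFS and TOS can also be compared programmatically. Below is a minimal sketch; the NameNode address, bucket name and paths are placeholders, and it assumes the HDFS cluster uses the default CRC32C block checksums.</p>
- <div class="source">
- <div class="source">
- <pre>import java.net.URI;
- import org.apache.hadoop.conf.Configuration;
- import org.apache.hadoop.fs.FileChecksum;
- import org.apache.hadoop.fs.FileSystem;
- import org.apache.hadoop.fs.Path;
- 
- public class ChecksumCompare {
-   public static void main(String[] args) throws Exception {
-     Configuration conf = new Configuration();
-     // Ask HDFS for a block-size-independent COMPOSITE-CRC checksum.
-     conf.set("dfs.checksum.combine.mode", "COMPOSITE_CRC");
- 
-     FileSystem hdfs = FileSystem.get(URI.create("hdfs://namenode:8020/"), conf);
-     FileSystem tos = FileSystem.get(URI.create("tos://your_bucket_name/"), conf);
- 
-     FileChecksum source = hdfs.getFileChecksum(new Path("/data/part-0"));
-     FileChecksum target = tos.getFileChecksum(new Path("/data/part-0"));
- 
-     // The checksums are comparable only when both sides report the same
-     // algorithm, i.e. COMPOSITE-CRC32C.
-     System.out.println("match = " + source.equals(target));
-   }
- }
- </pre></div></div>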
- </section><section>
- <h3><a name="Credential"></a>Credential</h3>
- <p>The TOS client uses an access key id and a secret access key to authenticate with the TOS service. There are two ways to configure them. The first is adding them to the Hadoop configuration, e.g. in core-site.xml or through the <code>-D</code> parameter. The second is setting environment variables; hadoop-tos searches for them automatically.</p>
- <p>To configure the AK and SK in the Hadoop configuration, use the keys below.</p>
- <div class="source">
- <div class="source">
- <pre><configuration>
- <!--Set global ak, sk for all buckets.-->
- <property>
- <name>fs.tos.access-key-id</name>
- <value></value>
- <description>
- The access key id to access the TOS object storage.
- </description>
- </property>
- <property>
- <name>fs.tos.secret-access-key</name>
- <value></value>
- <description>
- The secret access key to access the object storage.
- </description>
- </property>
- <property>
- <name>fs.tos.session-token</name>
- <value></value>
- <description>
- The session token to access the object storage.
- </description>
- </property>
- <!--Set ak, sk for a specified bucket. They have higher priority than the global keys.-->
- <property>
- <name>fs.tos.bucket.{bucket_name}.access-key-id</name>
- <value></value>
- <description>
- The access key to access the object storage for the configured bucket.
- </description>
- </property>
- <property>
- <name>fs.tos.bucket.{bucket_name}.secret-access-key</name>
- <value></value>
- <description>
- The secret access key to access the object storage for the configured bucket.
- </description>
- </property>
- <property>
- <name>fs.tos.bucket.{bucket_name}.session-token</name>
- <value></value>
- <description>
- The session token to access the object storage for the configured bucket.
- </description>
- </property>
- </configuration>
- </pre></div></div>
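- <p>The same keys can also be set programmatically, for example for a single job. Below is a minimal sketch; the bucket name and key values are placeholders.</p>
- <div class="source">
- <div class="source">
- <pre>import java.net.URI;
- import org.apache.hadoop.conf.Configuration;
- import org.apache.hadoop.fs.FileSystem;
- 
- public class CredentialExample {
-   public static void main(String[] args) throws Exception {
-     Configuration conf = new Configuration();
-     // Global credentials for all buckets.
-     conf.set("fs.tos.access-key-id", "YOUR_ACCESS_KEY_ID");
-     conf.set("fs.tos.secret-access-key", "YOUR_SECRET_ACCESS_KEY");
-     // Bucket-scoped credentials override the global ones for that bucket.
-     conf.set("fs.tos.bucket.your_bucket_name.access-key-id", "BUCKET_ACCESS_KEY_ID");
-     conf.set("fs.tos.bucket.your_bucket_name.secret-access-key", "BUCKET_SECRET_ACCESS_KEY");
- 
-     FileSystem fs = FileSystem.get(URI.create("tos://your_bucket_name/"), conf);
-     System.out.println(fs.getUri());
-   }
- }
- </pre></div></div>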
- <p>The AK and SK from environment variables have the highest priority; hadoop-tos automatically falls back to the Hadoop configuration if they are not found. The priority can be changed via <code>fs.tos.credential.provider.custom.classes</code>.</p></section><section>
- <h3><a name="Committer"></a>Committer</h3>
- <p>Hadoop-tos provides a MapReduce job committer for better performance. By default, Hadoop uses FileOutputCommitter, which renames files multiple times: first, when a task commits, its files are renamed to an output path; then, when the job commits, the files are renamed from the output path to the final path. On HDFS, rename is not a problem because it only changes metadata, but on TOS the rename is by default implemented as copy plus delete, which is expensive.</p>
- <p>The TOS committer is an implementation optimized for object storage. When a task commits, it does not complete the multipart uploads for its files; instead it writes pending-set files containing all the information needed to complete them. When the job commits, it reads all the pending files and completes all the multipart uploads.</p>
- <p>An alternative is to turn on the TOS object-rename switch and keep using FileOutputCommitter. Objects are then renamed with only a metadata change. The performance is slightly lower than with the hadoop-tos committer because, with TOS object rename, each file is committed first and then renamed to the final path, which takes one commit request and two rename requests to TOS. With the hadoop-tos committer, the commit is postponed until job commit and there is no rename overhead.</p>
- <p>To enable the hadoop-tos committer, configure the keys below.</p>
- <div class="source">
- <div class="source">
- <pre><configuration>
- <!-- mapreduce v1 -->
- <property>
- <name>mapred.output.committer.class</name>
- <value>org.apache.hadoop.fs.tosfs.commit.mapred.Committer</value>
- </property>
- <!-- mapreduce v2 -->
- <property>
- <name>mapreduce.outputcommitter.factory.scheme.tos</name>
- <value>org.apache.hadoop.fs.tosfs.commit.CommitterFactory</value>
- </property>
- </configuration>
- </pre></div></div>
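- <p>The same settings can also be applied per job when building it in code. Below is a minimal sketch for MapReduce v2; the job name and output path are placeholders.</p>
- <div class="source">
- <div class="source">
- <pre>import org.apache.hadoop.conf.Configuration;
- import org.apache.hadoop.fs.Path;
- import org.apache.hadoop.mapreduce.Job;
- import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
- 
- public class CommitterJobExample {
-   public static void main(String[] args) throws Exception {
-     Configuration conf = new Configuration();
-     // Route committer creation for the tos:// scheme to the hadoop-tos factory.
-     conf.set("mapreduce.outputcommitter.factory.scheme.tos",
-         "org.apache.hadoop.fs.tosfs.commit.CommitterFactory");
- 
-     Job job = Job.getInstance(conf, "tos-committer-example");
-     // Because the output path uses the tos:// scheme, the factory above
-     // supplies the hadoop-tos committer for this job.
-     FileOutputFormat.setOutputPath(job, new Path("tos://your_bucket_name/output"));
-     // ... configure mapper, reducer and input as usual, then submit:
-     // System.exit(job.waitForCompletion(true) ? 0 : 1);
-   }
- }
- </pre></div></div>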
- <p>To enable TOS object rename, first turn on the object rename switch on the TOS bucket, then configure the key below.</p>
- <div class="source">
- <div class="source">
- <pre><configuration>
- <property>
- <name>fs.tos.rename.enabled</name>
- <value>true</value>
- </property>
- </configuration>
- </pre></div></div>
- </section></section><section>
- <h2><a name="Properties_Summary"></a>Properties Summary</h2>
- <table border="0" class="bodyTable">
- <thead>
- <tr class="a">
- <th> properties </th>
- <th> description </th>
- <th> default value </th>
- <th> required </th></tr>
- </thead><tbody>
- <tr class="b">
- <td> fs.tos.access-key-id </td>
- <td> The access key id to access the TOS object storage </td>
- <td> NONE </td>
- <td> YES </td></tr>
- <tr class="a">
- <td> fs.tos.secret-access-key </td>
- <td> The secret access key to access the object storage </td>
- <td> NONE </td>
- <td> YES </td></tr>
- <tr class="b">
- <td> fs.tos.session-token </td>
- <td> The session token to access the object storage </td>
- <td> NONE </td>
- <td> NO </td></tr>
- <tr class="a">
- <td> fs.%s.endpoint </td>
- <td> Object storage endpoint to connect to, which should include both region and object domain name. </td>
- <td> NONE </td>
- <td> NO </td></tr>
- <tr class="b">
- <td> fs.%s.region </td>
- <td> The region of the object storage, e.g. fs.tos.region. If it is not set, the region will be parsed from the endpoint configured via “fs.%s.endpoint”. </td>
- <td> NONE </td>
- <td> NO </td></tr>
- <tr class="a">
- <td> fs.tos.bucket.%s.access-key-id </td>
- <td> The access key to access the object storage for the configured bucket, where %s is the bucket name. </td>
- <td> NONE </td>
- <td> NO </td></tr>
- <tr class="b">
- <td> fs.tos.bucket.%s.secret-access-key </td>
- <td> The secret access key to access the object storage for the configured bucket, where %s is the bucket name </td>
- <td> NONE </td>
- <td> NO </td></tr>
- <tr class="a">
- <td> fs.tos.bucket.%s.session-token </td>
- <td> The session token to access the object storage for the configured bucket, where %s is the bucket name </td>
- <td> NONE </td>
- <td> NO </td></tr>
- <tr class="b">
- <td> fs.tos.credentials.provider </td>
- <td> Default credentials provider chain that looks for credentials in this order: SimpleCredentialsProvider,EnvironmentCredentialsProvider </td>
- <td> org.apache.hadoop.fs.tosfs.object.tos.auth.DefaultCredentialsProviderChain </td>
- <td> NO </td></tr>
- <tr class="a">
- <td> fs.tos.credential.provider.custom.classes </td>
- <td> User customized credential provider classes, separate provider class name with comma if there are multiple providers. </td>
- <td> org.apache.hadoop.fs.tosfs.object.tos.auth.EnvironmentCredentialsProvider,org.apache.hadoop.fs.tosfs.object.tos.auth.SimpleCredentialsProvider </td>
- <td> NO </td></tr>
- <tr class="b">
- <td> fs.tos.credentials.provider </td>
- <td> Default credentials provider chain that looks for credentials in this order: SimpleCredentialsProvider,EnvironmentCredentialsProvider. </td>
- <td> org.apache.hadoop.fs.tosfs.object.tos.auth.DefaultCredentialsProviderChain </td>
- <td> NONE </td></tr>
- <tr class="a">
- <td> fs.tos.credential.provider.custom.classes </td>
- <td> User customized credential provider classes, separate provider class name with comma if there are multiple providers. </td>
- <td> org.apache.hadoop.fs.tosfs.object.tos.auth.EnvironmentCredentialsProvider,org.apache.hadoop.fs.tosfs.object.tos.auth.SimpleCredentialsProvider </td>
- <td> NONE </td></tr>
- <tr class="b">
- <td> fs.tos.http.maxConnections </td>
- <td> The maximum number of connections to the TOS service that a client can create. </td>
- <td> 1024 </td>
- <td> NONE </td></tr>
- <tr class="a">
- <td> fs.tos.http.idleConnectionTimeMills </td>
- <td> The time (in milliseconds) that a connection can stay in the idle state; connections idle for longer than this will be terminated. </td>
- <td> 60000 </td>
- <td> NONE </td></tr>
- <tr class="b">
- <td> fs.tos.http.connectTimeoutMills </td>
- <td> The connect timeout (in milliseconds) when the TOS client connects to the TOS service. </td>
- <td> 10000 </td>
- <td> NONE </td></tr>
- <tr class="a">
- <td> fs.tos.http.readTimeoutMills </td>
- <td> The reading timeout when reading data from tos. Note that it is configured for the tos client sdk, not hadoop-tos. </td>
- <td> 30000 </td>
- <td> NONE </td></tr>
- <tr class="b">
- <td> fs.tos.http.writeTimeoutMills </td>
- <td> The writing timeout when uploading data to tos. Note that it is configured for the tos client sdk, not hadoop-tos. </td>
- <td> 30000 </td>
- <td> NONE </td></tr>
- <tr class="a">
- <td> fs.tos.http.enableVerifySSL </td>
- <td> Whether to enable SSL connections to TOS. </td>
- <td> true </td>
- <td> NONE </td></tr>
- <tr class="b">
- <td> fs.tos.http.dnsCacheTimeMinutes </td>
- <td> The timeout (in minutes) of the dns cache used in tos client. </td>
- <td> 0 </td>
- <td> NONE </td></tr>
- <tr class="a">
- <td> fs.tos.rmr.server.enabled </td>
- <td> Used for directory buckets: whether to enable the recursive delete capability of the TOS server, which atomically deletes all objects under the given dir (inclusive). Otherwise the client will list all sub-objects and then send batch delete requests to TOS to delete the dir. </td>
- <td> false </td>
- <td> NONE </td></tr>
- <tr class="b">
- <td> fs.tos.rmr.client.enabled </td>
- <td> Whether to use the recursive delete capability of the TOS SDK when deleting a dir on the client side: if true, the client will list all objects under the given dir and delete them in batches; otherwise it will delete the objects one by one via a preorder tree walk. </td>
- <td> true </td>
- <td> NONE </td></tr>
- <tr class="a">
- <td> fs.tos.user.agent.prefix </td>
- <td> The prefix will be used as the product name in TOS SDK. The final user agent pattern is ‘{prefix}/TOS_FS/{hadoop tos version}’. </td>
- <td> HADOOP-TOS </td>
- <td> NONE </td></tr>
- <tr class="b">
- <td> fs.tos.max-drain-bytes </td>
- <td> The threshold that decides whether to reuse the socket connection when closing the input stream of a get-object request. If the remaining bytes are fewer than this value when the stream is closed, they are skipped (drained) instead of closing the socket connection. </td>
- <td> 1024 * 1024L </td>
- <td> NONE </td></tr>
- <tr class="a">
- <td> fs.tos.client.disable.cache </td>
- <td> Whether disable the tos http client cache in the current JVM. </td>
- <td> false </td>
- <td> NONE </td></tr>
- <tr class="b">
- <td> fs.tos.batch.delete.objects-count </td>
- <td> The batch size when deleting the objects in batches. </td>
- <td> 1000 </td>
- <td> NONE </td></tr>
- <tr class="a">
- <td> fs.tos.batch.delete.max-retries </td>
- <td> The maximum retry times when deleting objects in batches failed. </td>
- <td> 20 </td>
- <td> NONE </td></tr>
- <tr class="b">
- <td> fs.tos.batch.delete.retry-codes </td>
- <td> The error codes from a TOS deleteMultiObjects response. The client will resend the batch delete request for the failed keys if the response contains only these codes; otherwise it won’t send the request any more. </td>
- <td> ExceedAccountQPSLimit,ExceedAccountRateLimit,ExceedBucketQPSLimit,ExceedBucketRateLimit,InternalError,ServiceUnavailable,SlowDown,TooManyRequests </td>
- <td> NONE </td></tr>
- <tr class="a">
- <td> fs.tos.batch.delete.retry.interval </td>
- <td> The retry interval (in milliseconds) when deleting objects in batches failed. </td>
- <td> 1000 </td>
- <td> NONE </td></tr>
- <tr class="b">
- <td> fs.tos.list.objects-count </td>
- <td> The batch size of listing objects per request for the given object storage, e.g. when listing a directory, which searches for all objects whose paths start with the directory path and returns them as a list. </td>
- <td> 1000 </td>
- <td> NONE </td></tr>
- <tr class="a">
- <td> fs.tos.request.max.retry.times </td>
- <td> The maximum number of retries when sending a request via the TOS client. The client will resend the request if it gets a retryable exception, e.g. SocketException, UnknownHostException, SSLException, InterruptedException, SocketTimeoutException, or receives the TOO_MANY_REQUESTS or INTERNAL_SERVER_ERROR HTTP codes. </td>
- <td> 20 </td>
- <td> NONE </td></tr>
- <tr class="b">
- <td> fs.tos.fast-fail-409-error-codes </td>
- <td> The fast-fail error codes are errors that cannot be resolved by retrying the request. The TOS client won’t retry a request if it receives a 409 HTTP status code and the error code is in this non-retryable error code list. </td>
- <td> 0026-00000013,0026-00000020,0026-00000021,0026-00000025,0026-00000026,0026-00000027 ,0017-00000208,0017-00000209 </td>
- <td> NONE </td></tr>
- <tr class="a">
- <td> fs.tos.inputstream.max.retry.times </td>
- <td> The maximum number of retries when reading object content via the TOS client. The client will resend the request to create a new input stream if it gets an unexpected end-of-stream error while reading. </td>
- <td> 5 </td>
- <td> NONE </td></tr>
- <tr class="b">
- <td> fs.tos.crc.check.enable </td>
- <td> Whether to enable the CRC check when uploading files to TOS. </td>
- <td> true </td>
- <td> NONE </td></tr>
- <tr class="a">
- <td> fs.tos.get-file-status.enabled </td>
- <td> Whether to enable the TOS getFileStatus API, which returns the object info directly in one RPC request; otherwise up to three RPC requests may be needed to get the object info. For example, if the key ‘a/b/c’ exists in TOS and we want the object status of ‘a/b’, GetFileStatus(‘a/b’) will return the prefix ‘a/b/’ as a directory object directly. If this property is disabled, we need to call head(‘a/b’) first, then head(‘a/b/’), and finally list(‘a/b/’, limit=1) to get the object info. Using the GetFileStatus API reduces the number of RPC calls. </td>
- <td> true </td>
- <td> NONE </td></tr>
- <tr class="b">
- <td> fs.tos.checksum-algorithm </td>
- <td> The name of the TOS checksum algorithm. Specify the algorithm name to compare checksums between different storage systems; for example, to compare checksums between HDFS and TOS, configure the algorithm name to COMPOSITE-CRC32C. </td>
- <td> TOS-CHECKSUM </td>
- <td> NONE </td></tr>
- <tr class="a">
- <td> fs.tos.checksum-type </td>
- <td> How to retrieve the file checksum from TOS; an error will be thrown if the configured checksum type is not supported by TOS. The supported checksum types are CRC32C and CRC64ECMA. </td>
- <td> CRC64ECMA </td>
- <td> NONE </td></tr>
- <tr class="b">
- <td> fs.objectstorage.%s.impl </td>
- <td> The object storage implementation for the defined scheme. For example, we can delegate the scheme ‘abc’ to TOS (or another object storage) and access the TOS object storage as ‘abc://bucket/path/to/key’. </td>
- <td> NONE </td>
- <td> NO </td></tr>
- <tr class="a">
- <td> fs.%s.delete.batch-size </td>
- <td> The batch size of deleting multiple objects per request for the given object storage. e.g. fs.tos.delete.batch-size </td>
- <td> 250 </td>
- <td> NONE </td></tr>
- <tr class="b">
- <td> fs.%s.multipart.size </td>
- <td> The multipart upload part size of the given object storage, e.g. fs.tos.multipart.size. </td>
- <td> 8388608 </td>
- <td> NONE </td></tr>
- <tr class="a">
- <td> fs.%s.multipart.copy-threshold </td>
- <td> The threshold above which multipart upload is used when copying objects in the given object storage. If the copied data size is less than the threshold, the data will be copied via copyObject instead of uploadPartCopy. E.g. fs.tos.multipart.copy-threshold. </td>
- <td> 5242880 </td>
- <td> NONE </td></tr>
- <tr class="b">
- <td> fs.%s.multipart.threshold </td>
- <td> The threshold that controls whether multipart upload is used when writing data to the given object storage. If the written data size is less than the threshold, the data will be written via a simple put instead of multipart upload. E.g. fs.tos.multipart.threshold. </td>
- <td> 10485760 </td>
- <td> NONE </td></tr>
- <tr class="a">
- <td> fs.%s.multipart.staging-buffer-size </td>
- <td> The maximum number of bytes of staging data buffered in memory before flushing to the staging file. It dramatically reduces random writes on the local staging disk when writing plenty of small files. </td>
- <td> 4096 </td>
- <td> NONE </td></tr>
- <tr class="b">
- <td> fs.%s.multipart.staging-dir </td>
- <td> The multipart upload staging dir(s) of the given object storage, e.g. fs.tos.multipart.staging-dir. Separate multiple staging dirs with commas. </td>
- <td> ${java.io.tmpdir}/multipart-staging-dir </td>
- <td> NONE </td></tr>
- <tr class="a">
- <td> fs.%s.missed.parent.dir.async-create </td>
- <td> Whether to create the missing parent dir asynchronously when deleting or renaming a file or dir. </td>
- <td> true </td>
- <td> NONE </td></tr>
- <tr class="b">
- <td> fs.%s.rename.enabled </td>
- <td> Whether to use the rename semantic of the object storage when renaming files; otherwise copy + delete is used. Please ensure that the object storage supports and has enabled the rename semantic before enabling this, and also ensure that the requester has been granted rename permission. If you are using TOS, you have to send a putBucketRename request before sending rename requests, otherwise a MethodNotAllowed exception will be thrown. </td>
- <td> false </td>
- <td> NONE </td></tr>
- <tr class="a">
- <td> fs.%s.task.thread-pool-size </td>
- <td> The size of the thread pool used for running tasks in parallel for the given object storage, e.g. deleting objects, copying files. Example key: fs.tos.task.thread-pool-size. </td>
- <td> Long.MAX_VALUE </td>
- <td> NONE </td></tr>
- <tr class="b">
- <td> fs.%s.multipart.thread-pool-size </td>
- <td> The size of thread pool used for uploading multipart in parallel for the given object storage, e.g. fs.tos.multipart.thread-pool-size </td>
- <td> Max value of 2 and available processors. </td>
- <td> NONE </td></tr>
- <tr class="a">
- <td> fs.%s.checksum.enabled </td>
- <td> Whether to enable checksum when getting file status for the given object storage. E.g. fs.tos.checksum.enabled. </td>
- <td> Max value of 2 and available processors. </td>
- <td> NONE </td></tr>
- <tr class="b">
- <td> fs.filestore.checksum-algorithm </td>
- <td> The name of the filestore checksum algorithm. Specify the algorithm name to match different storage systems; for example, the HDFS-style names are COMPOSITE-CRC32 and COMPOSITE-CRC32C. </td>
- <td> TOS-CHECKSUM </td>
- <td> NO </td></tr>
- <tr class="a">
- <td> fs.filestore.checksum-type </td>
- <td> How to retrieve the file checksum from the filestore; an error will be thrown if the configured checksum type is not supported. The only supported checksum type is MD5. </td>
- <td> MD5 </td>
- <td> NO </td></tr>
- </tbody>
- </table></section><section>
- <h2><a name="Running_unit_tests_in_hadoop-tos_module"></a>Running unit tests in hadoop-tos module</h2>
- <p>The unit tests need to connect to the TOS service. Set the six environment variables below to run them.</p>
- <div class="source">
- <div class="source">
- <pre>export TOS_ACCESS_KEY_ID={YOUR_ACCESS_KEY}
- export TOS_SECRET_ACCESS_KEY={YOUR_SECRET_ACCESS_KEY}
- export TOS_ENDPOINT={TOS_SERVICE_ENDPOINT}
- export FILE_STORAGE_ROOT=/tmp/local_dev/
- export TOS_BUCKET={YOUR_BUCKET_NAME}
- export TOS_UNIT_TEST_ENABLED=true
- </pre></div></div>
- <p>Then cd to the Hadoop project root directory and run the test command below.</p>
- <div class="source">
- <div class="source">
- <pre>mvn -Dtest=org.apache.hadoop.fs.tosfs.** test -pl org.apache.hadoop:hadoop-tos
- </pre></div></div></section>
- <div id="footer">
- <div class="xright">
- © 2008-2025
- Apache Software Foundation
-
- - <a href="http://maven.apache.org/privacy-policy.html">Privacy Policy</a>.
- Apache Maven, Maven, Apache, the Apache feather logo, and the Apache Maven project logos are trademarks of The Apache Software Foundation.
- </div>
- <div class="clear">
- <hr/>
- </div>
- </div>