  1. ~~ Licensed to the Apache Software Foundation (ASF) under one or more
  2. ~~ contributor license agreements. See the NOTICE file distributed with
  3. ~~ this work for additional information regarding copyright ownership.
  4. ~~ The ASF licenses this file to You under the Apache License, Version 2.0
  5. ~~ (the "License"); you may not use this file except in compliance with
  6. ~~ the License. You may obtain a copy of the License at
  7. ~~
  8. ~~ http://www.apache.org/licenses/LICENSE-2.0
  9. ~~
  10. ~~ Unless required by applicable law or agreed to in writing, software
  11. ~~ distributed under the License is distributed on an "AS IS" BASIS,
  12. ~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  13. ~~ See the License for the specific language governing permissions and
  14. ~~ limitations under the License.
  15. ---
  16. Metrics Guide
  17. ---
  18. ---
  19. ${maven.build.timestamp}
  20. %{toc}
  21. Overview
  22. Metrics are statistical information exposed by Hadoop daemons,
  23. used for monitoring, performance tuning, and debugging.
  24. There are many metrics available by default
  25. and they are very useful for troubleshooting.
  26. This page shows the details of the available metrics.
  27. Each section describes one context into which the metrics are grouped.
  28. The documentation of the Metrics 2.0 framework is
  29. {{{../../api/org/apache/hadoop/metrics2/package-summary.html}here}}.
  30. jvm context
  31. * JvmMetrics
  32. Each metrics record contains tags such as ProcessName, SessionId
  33. and Hostname as additional information along with metrics.
  34. *-------------------------------------+--------------------------------------+
  35. || Name || Description
  36. *-------------------------------------+--------------------------------------+
  37. |<<<MemNonHeapUsedM>>> | Current non-heap memory used in MB
  38. *-------------------------------------+--------------------------------------+
  39. |<<<MemNonHeapCommittedM>>> | Current non-heap memory committed in MB
  40. *-------------------------------------+--------------------------------------+
  41. |<<<MemNonHeapMaxM>>> | Max non-heap memory size in MB
  42. *-------------------------------------+--------------------------------------+
  43. |<<<MemHeapUsedM>>> | Current heap memory used in MB
  44. *-------------------------------------+--------------------------------------+
  45. |<<<MemHeapCommittedM>>> | Current heap memory committed in MB
  46. *-------------------------------------+--------------------------------------+
  47. |<<<MemHeapMaxM>>> | Max heap memory size in MB
  48. *-------------------------------------+--------------------------------------+
  49. |<<<MemMaxM>>> | Max memory size in MB
  50. *-------------------------------------+--------------------------------------+
  51. |<<<ThreadsNew>>> | Current number of NEW threads
  52. *-------------------------------------+--------------------------------------+
  53. |<<<ThreadsRunnable>>> | Current number of RUNNABLE threads
  54. *-------------------------------------+--------------------------------------+
  55. |<<<ThreadsBlocked>>> | Current number of BLOCKED threads
  56. *-------------------------------------+--------------------------------------+
  57. |<<<ThreadsWaiting>>> | Current number of WAITING threads
  58. *-------------------------------------+--------------------------------------+
  59. |<<<ThreadsTimedWaiting>>> | Current number of TIMED_WAITING threads
  60. *-------------------------------------+--------------------------------------+
  61. |<<<ThreadsTerminated>>> | Current number of TERMINATED threads
  62. *-------------------------------------+--------------------------------------+
  63. |<<<GcInfo>>> | Total GC count and GC time in msec, grouped by the kind of GC. \
  64. | e.g. GcCountPS Scavenge=6, GcTimeMillisPS Scavenge=40,
  65. | GcCountPS MarkSweep=0, GcTimeMillisPS MarkSweep=0
  66. *-------------------------------------+--------------------------------------+
  67. |<<<GcCount>>> | Total GC count
  68. *-------------------------------------+--------------------------------------+
  69. |<<<GcTimeMillis>>> | Total GC time in msec
  70. *-------------------------------------+--------------------------------------+
  71. |<<<LogFatal>>> | Total number of FATAL logs
  72. *-------------------------------------+--------------------------------------+
  73. |<<<LogError>>> | Total number of ERROR logs
  74. *-------------------------------------+--------------------------------------+
  75. |<<<LogWarn>>> | Total number of WARN logs
  76. *-------------------------------------+--------------------------------------+
  77. |<<<LogInfo>>> | Total number of INFO logs
  78. *-------------------------------------+--------------------------------------+
  79. |<<<GcNumWarnThresholdExceeded>>> | Number of times that the GC warn
  80. | threshold is exceeded
  81. *-------------------------------------+--------------------------------------+
  82. |<<<GcNumInfoThresholdExceeded>>> | Number of times that the GC info
  83. | threshold is exceeded
  84. *-------------------------------------+--------------------------------------+
  85. |<<<GcTotalExtraSleepTime>>> | Total GC extra sleep time in msec
  86. *-------------------------------------+--------------------------------------+
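A minimal sketch of how a few of the JvmMetrics values above might be combined into derived indicators. The sample record and its values are hypothetical, and the helper names are made up for illustration; only the metric names come from the table.

```python
import json

# Hypothetical JvmMetrics record; the values are invented for illustration.
SAMPLE = """
{
  "MemHeapUsedM": 256.0,
  "MemHeapMaxM": 1024.0,
  "ThreadsBlocked": 0,
  "GcCount": 6,
  "GcTimeMillis": 40
}
"""

def heap_used_fraction(record):
    """Return MemHeapUsedM / MemHeapMaxM, or None if the maximum is not positive."""
    max_mb = record["MemHeapMaxM"]
    if max_mb <= 0:
        return None
    return record["MemHeapUsedM"] / max_mb

def avg_gc_time_millis(record):
    """Return the mean time per GC in msec, or None if no GC has run yet."""
    if record["GcCount"] == 0:
        return None
    return record["GcTimeMillis"] / record["GcCount"]

record = json.loads(SAMPLE)
print(heap_used_fraction(record))
print(avg_gc_time_millis(record))
```

Both helpers return None instead of raising when the denominator is unusable, so a monitoring loop can skip the sample.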
  87. rpc context
  88. * rpc
  89. Each metrics record contains tags such as Hostname
  90. and port (number to which server is bound)
  91. as additional information along with metrics.
  92. *-------------------------------------+--------------------------------------+
  93. || Name || Description
  94. *-------------------------------------+--------------------------------------+
  95. |<<<ReceivedBytes>>> | Total number of received bytes
  96. *-------------------------------------+--------------------------------------+
  97. |<<<SentBytes>>> | Total number of sent bytes
  98. *-------------------------------------+--------------------------------------+
  99. |<<<RpcQueueTimeNumOps>>> | Total number of RPC calls
  100. *-------------------------------------+--------------------------------------+
  101. |<<<RpcQueueTimeAvgTime>>> | Average queue time in milliseconds
  102. *-------------------------------------+--------------------------------------+
  103. |<<<RpcProcessingTimeNumOps>>> | Total number of RPC calls (same as
  104. | RpcQueueTimeNumOps)
  105. *-------------------------------------+--------------------------------------+
  106. |<<<RpcProcessingTimeAvgTime>>> | Average processing time in milliseconds
  107. *-------------------------------------+--------------------------------------+
  108. |<<<RpcAuthenticationFailures>>> | Total number of authentication failures
  109. *-------------------------------------+--------------------------------------+
  110. |<<<RpcAuthenticationSuccesses>>> | Total number of authentication successes
  111. *-------------------------------------+--------------------------------------+
  112. |<<<RpcAuthorizationFailures>>> | Total number of authorization failures
  113. *-------------------------------------+--------------------------------------+
  114. |<<<RpcAuthorizationSuccesses>>> | Total number of authorization successes
  115. *-------------------------------------+--------------------------------------+
  116. |<<<NumOpenConnections>>> | Current number of open connections
  117. *-------------------------------------+--------------------------------------+
  118. |<<<CallQueueLength>>> | Current length of the call queue
  119. *-------------------------------------+--------------------------------------+
  120. |<<<rpcQueueTime>>><num><<<sNumOps>>> | Shows total number of RPC calls
  121. | | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to
  122. | | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>.
  123. *-------------------------------------+--------------------------------------+
  124. |<<<rpcQueueTime>>><num><<<s50thPercentileLatency>>> |
  125. | | Shows the 50th percentile of RPC queue time in milliseconds
  126. | | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to
  127. | | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>.
  128. *-------------------------------------+--------------------------------------+
  129. |<<<rpcQueueTime>>><num><<<s75thPercentileLatency>>> |
  130. | | Shows the 75th percentile of RPC queue time in milliseconds
  131. | | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to
  132. | | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>.
  133. *-------------------------------------+--------------------------------------+
  134. |<<<rpcQueueTime>>><num><<<s90thPercentileLatency>>> |
  135. | | Shows the 90th percentile of RPC queue time in milliseconds
  136. | | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to
  137. | | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>.
  138. *-------------------------------------+--------------------------------------+
  139. |<<<rpcQueueTime>>><num><<<s95thPercentileLatency>>> |
  140. | | Shows the 95th percentile of RPC queue time in milliseconds
  141. | | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to
  142. | | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>.
  143. *-------------------------------------+--------------------------------------+
  144. |<<<rpcQueueTime>>><num><<<s99thPercentileLatency>>> |
  145. | | Shows the 99th percentile of RPC queue time in milliseconds
  146. | | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to
  147. | | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>.
  148. *-------------------------------------+--------------------------------------+
  149. |<<<rpcProcessingTime>>><num><<<sNumOps>>> | Shows total number of RPC calls
  150. | | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to
  151. | | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>.
  152. *-------------------------------------+--------------------------------------+
  153. |<<<rpcProcessingTime>>><num><<<s50thPercentileLatency>>> |
  154. | | Shows the 50th percentile of RPC processing time in milliseconds
  155. | | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to
  156. | | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>.
  157. *-------------------------------------+--------------------------------------+
  158. |<<<rpcProcessingTime>>><num><<<s75thPercentileLatency>>> |
  159. | | Shows the 75th percentile of RPC processing time in milliseconds
  160. | | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to
  161. | | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>.
  162. *-------------------------------------+--------------------------------------+
  163. |<<<rpcProcessingTime>>><num><<<s90thPercentileLatency>>> |
  164. | | Shows the 90th percentile of RPC processing time in milliseconds
  165. | | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to
  166. | | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>.
  167. *-------------------------------------+--------------------------------------+
  168. |<<<rpcProcessingTime>>><num><<<s95thPercentileLatency>>> |
  169. | | Shows the 95th percentile of RPC processing time in milliseconds
  170. | | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to
  171. | | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>.
  172. *-------------------------------------+--------------------------------------+
  173. |<<<rpcProcessingTime>>><num><<<s99thPercentileLatency>>> |
  174. | | Shows the 99th percentile of RPC processing time in milliseconds
  175. | | (<num> seconds granularity) if <<<rpc.metrics.quantile.enable>>> is set to
  176. | | true. <num> is specified by <<<rpc.metrics.percentiles.intervals>>>.
  177. *-------------------------------------+--------------------------------------+
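The quantile metric names above follow a regular pattern, so the names to expect for a given configuration can be enumerated. The sketch below assumes that <<<rpc.metrics.percentiles.intervals>>> holds a comma-separated list of intervals in seconds; the function name is hypothetical.

```python
# Percentiles listed in the table above.
PERCENTILES = (50, 75, 90, 95, 99)

def quantile_metric_names(prefix, intervals):
    """Build the metric names for one prefix (e.g. "rpcQueueTime").

    `intervals` is assumed to be a comma-separated list of seconds,
    such as "60,300".
    """
    names = []
    for raw in intervals.split(","):
        seconds = int(raw.strip())
        names.append(f"{prefix}{seconds}sNumOps")
        for p in PERCENTILES:
            names.append(f"{prefix}{seconds}s{p}thPercentileLatency")
    return names

for name in quantile_metric_names("rpcQueueTime", "60,300"):
    print(name)
```

Each interval contributes one NumOps metric and five percentile metrics per prefix.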
  178. * RetryCache/NameNodeRetryCache
  179. RetryCache metrics are useful for monitoring NameNode failover.
  180. Each metrics record contains the Hostname tag.
  181. *-------------------------------------+--------------------------------------+
  182. || Name || Description
  183. *-------------------------------------+--------------------------------------+
  184. |<<<CacheHit>>> | Total number of RetryCache hits
  185. *-------------------------------------+--------------------------------------+
  186. |<<<CacheCleared>>> | Total number of times the RetryCache was cleared
  187. *-------------------------------------+--------------------------------------+
  188. |<<<CacheUpdated>>> | Total number of RetryCache updates
  189. *-------------------------------------+--------------------------------------+
  190. rpcdetailed context
  191. Metrics of the rpcdetailed context are exposed in a unified manner by the RPC
  192. layer. Two metrics are exposed for each RPC, based on its name.
  193. The metric named "(RPC method name)NumOps" indicates the total number of
  194. method calls, and the metric named "(RPC method name)AvgTime" shows the
  195. average turnaround time for method calls in milliseconds.
  196. * rpcdetailed
  197. Each metrics record contains tags such as Hostname
  198. and port (number to which server is bound)
  199. as additional information along with metrics.
  200. Metrics for RPCs that have not been called are not included
  201. in the metrics record.
  202. *-------------------------------------+--------------------------------------+
  203. || Name || Description
  204. *-------------------------------------+--------------------------------------+
  205. |<methodname><<<NumOps>>> | Total number of times the method is called
  206. *-------------------------------------+--------------------------------------+
  207. |<methodname><<<AvgTime>>> | Average turnaround time of the method in
  208. | milliseconds
  209. *-------------------------------------+--------------------------------------+
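Because every rpcdetailed metric is named by appending NumOps or AvgTime to the method name, a flat record can be regrouped per method. The following is a sketch only; the sample record, including the method names in it, is hypothetical.

```python
SUFFIXES = ("NumOps", "AvgTime")

def group_by_method(record):
    """Regroup a flat rpcdetailed record as {method: {suffix: value}}.

    Keys that do not end in one of the two suffixes (tags such as
    Hostname) are ignored.
    """
    methods = {}
    for name, value in record.items():
        for suffix in SUFFIXES:
            if name.endswith(suffix) and len(name) > len(suffix):
                method = name[: -len(suffix)]
                methods.setdefault(method, {})[suffix] = value
                break
    return methods

# Hypothetical record; method names and values are invented.
sample = {
    "Hostname": "host1",
    "GetFileInfoNumOps": 120,
    "GetFileInfoAvgTime": 0.5,
    "MkdirsNumOps": 3,
    "MkdirsAvgTime": 2.0,
}
print(group_by_method(sample))
```

Methods that were never called do not appear in the record, so they are also absent from the grouped result.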
  210. dfs context
  211. * namenode
  212. Each metrics record contains tags such as ProcessName, SessionId,
  213. and Hostname as additional information along with metrics.
  214. *-------------------------------------+--------------------------------------+
  215. || Name || Description
  216. *-------------------------------------+--------------------------------------+
  217. |<<<CreateFileOps>>> | Total number of files created
  218. *-------------------------------------+--------------------------------------+
  219. |<<<FilesCreated>>> | Total number of files and directories created by create
  220. | or mkdir operations
  221. *-------------------------------------+--------------------------------------+
  222. |<<<FilesAppended>>> | Total number of files appended
  223. *-------------------------------------+--------------------------------------+
  224. |<<<GetBlockLocations>>> | Total number of getBlockLocations operations
  225. *-------------------------------------+--------------------------------------+
  226. |<<<FilesRenamed>>> | Total number of rename <<operations>> (NOT number of
  227. | files/dirs renamed)
  228. *-------------------------------------+--------------------------------------+
  229. |<<<GetListingOps>>> | Total number of directory listing operations
  230. *-------------------------------------+--------------------------------------+
  231. |<<<DeleteFileOps>>> | Total number of delete operations
  232. *-------------------------------------+--------------------------------------+
  233. |<<<FilesDeleted>>> | Total number of files and directories deleted by delete
  234. | or rename operations
  235. *-------------------------------------+--------------------------------------+
  236. |<<<FileInfoOps>>> | Total number of getFileInfo and getLinkFileInfo
  237. | operations
  238. *-------------------------------------+--------------------------------------+
  239. |<<<AddBlockOps>>> | Total number of successful addBlock operations
  240. *-------------------------------------+--------------------------------------+
  241. |<<<GetAdditionalDatanodeOps>>> | Total number of getAdditionalDatanode
  242. | operations
  243. *-------------------------------------+--------------------------------------+
  244. |<<<CreateSymlinkOps>>> | Total number of createSymlink operations
  245. *-------------------------------------+--------------------------------------+
  246. |<<<GetLinkTargetOps>>> | Total number of getLinkTarget operations
  247. *-------------------------------------+--------------------------------------+
  248. |<<<FilesInGetListingOps>>> | Total number of files and directories listed by
  249. | directory listing operations
  250. *-------------------------------------+--------------------------------------+
  251. |<<<AllowSnapshotOps>>> | Total number of allowSnapshot operations
  252. *-------------------------------------+--------------------------------------+
  253. |<<<DisallowSnapshotOps>>> | Total number of disallowSnapshot operations
  254. *-------------------------------------+--------------------------------------+
  255. |<<<CreateSnapshotOps>>> | Total number of createSnapshot operations
  256. *-------------------------------------+--------------------------------------+
  257. |<<<DeleteSnapshotOps>>> | Total number of deleteSnapshot operations
  258. *-------------------------------------+--------------------------------------+
  259. |<<<RenameSnapshotOps>>> | Total number of renameSnapshot operations
  260. *-------------------------------------+--------------------------------------+
  261. |<<<ListSnapshottableDirOps>>> | Total number of snapshottableDirectoryStatus
  262. | operations
  263. *-------------------------------------+--------------------------------------+
  264. |<<<SnapshotDiffReportOps>>> | Total number of getSnapshotDiffReport
  265. | operations
  266. *-------------------------------------+--------------------------------------+
  267. |<<<TransactionsNumOps>>> | Total number of Journal transactions
  268. *-------------------------------------+--------------------------------------+
  269. |<<<TransactionsAvgTime>>> | Average time of Journal transactions in
  270. | milliseconds
  271. *-------------------------------------+--------------------------------------+
  272. |<<<SyncsNumOps>>> | Total number of Journal syncs
  273. *-------------------------------------+--------------------------------------+
  274. |<<<SyncsAvgTime>>> | Average time of Journal syncs in milliseconds
  275. *-------------------------------------+--------------------------------------+
  276. |<<<TransactionsBatchedInSync>>> | Total number of Journal transactions batched
  277. | in sync
  278. *-------------------------------------+--------------------------------------+
  279. |<<<BlockReportNumOps>>> | Total number of block reports processed from
  280. | DataNodes
  281. *-------------------------------------+--------------------------------------+
  282. |<<<BlockReportAvgTime>>> | Average time to process block reports in
  283. | milliseconds
  284. *-------------------------------------+--------------------------------------+
  285. |<<<CacheReportNumOps>>> | Total number of cache reports processed from
  286. | DataNodes
  287. *-------------------------------------+--------------------------------------+
  288. |<<<CacheReportAvgTime>>> | Average time to process cache reports in
  289. | milliseconds
  290. *-------------------------------------+--------------------------------------+
  291. |<<<SafeModeTime>>> | The interval in milliseconds between the start of
  292. | FSNamesystem and the last time safemode was left. \
  293. | (Sometimes not equal to the time spent in safemode;
  294. | see {{{https://issues.apache.org/jira/browse/HDFS-5156}HDFS-5156}})
  295. *-------------------------------------+--------------------------------------+
  296. |<<<FsImageLoadTime>>> | Time loading FS Image at startup in milliseconds
  297. *-------------------------------------+--------------------------------------+
  300. |<<<GetEditNumOps>>> | Total number of edits downloads from SecondaryNameNode
  301. *-------------------------------------+--------------------------------------+
  302. |<<<GetEditAvgTime>>> | Average edits download time in milliseconds
  303. *-------------------------------------+--------------------------------------+
  304. |<<<GetImageNumOps>>> | Total number of fsimage downloads from SecondaryNameNode
  305. *-------------------------------------+--------------------------------------+
  306. |<<<GetImageAvgTime>>> | Average fsimage download time in milliseconds
  307. *-------------------------------------+--------------------------------------+
  308. |<<<PutImageNumOps>>> | Total number of fsimage uploads to SecondaryNameNode
  309. *-------------------------------------+--------------------------------------+
  310. |<<<PutImageAvgTime>>> | Average fsimage upload time in milliseconds
  311. *-------------------------------------+--------------------------------------+
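Most of the namenode metrics above are running totals, so a rate has to be derived from two samples. A minimal sketch, assuming the two records were collected a known number of seconds apart; the function name and the sample values are hypothetical.

```python
def ops_per_second(earlier, later, name, elapsed_seconds):
    """Return the per-second rate of a "Total number of ..." counter.

    Returns None if the counter went backwards, which this sketch
    treats as a sign that the daemon restarted between the samples.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    delta = later[name] - earlier[name]
    if delta < 0:
        return None
    return delta / elapsed_seconds

# Hypothetical samples taken 60 seconds apart.
first = {"CreateFileOps": 1000, "DeleteFileOps": 40}
second = {"CreateFileOps": 1600, "DeleteFileOps": 40}
print(ops_per_second(first, second, "CreateFileOps", 60))
print(ops_per_second(first, second, "DeleteFileOps", 60))
```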
  312. * FSNamesystem
  313. Each metrics record contains tags such as HAState and Hostname
  314. as additional information along with metrics.
  315. *-------------------------------------+--------------------------------------+
  316. || Name || Description
  317. *-------------------------------------+--------------------------------------+
  318. |<<<MissingBlocks>>> | Current number of missing blocks
  319. *-------------------------------------+--------------------------------------+
  320. |<<<ExpiredHeartbeats>>> | Total number of expired heartbeats
  321. *-------------------------------------+--------------------------------------+
  322. |<<<TransactionsSinceLastCheckpoint>>> | Total number of transactions since
  323. | last checkpoint
  324. *-------------------------------------+--------------------------------------+
  325. |<<<TransactionsSinceLastLogRoll>>> | Total number of transactions since last
  326. | edit log roll
  327. *-------------------------------------+--------------------------------------+
  328. |<<<LastWrittenTransactionId>>> | Last transaction ID written to the edit log
  329. *-------------------------------------+--------------------------------------+
  330. |<<<LastCheckpointTime>>> | Time in milliseconds since epoch of last checkpoint
  331. *-------------------------------------+--------------------------------------+
  332. |<<<CapacityTotal>>> | Current raw capacity of DataNodes in bytes
  333. *-------------------------------------+--------------------------------------+
  334. |<<<CapacityTotalGB>>> | Current raw capacity of DataNodes in GB
  335. *-------------------------------------+--------------------------------------+
  336. |<<<CapacityUsed>>> | Current used capacity across all DataNodes in bytes
  337. *-------------------------------------+--------------------------------------+
  338. |<<<CapacityUsedGB>>> | Current used capacity across all DataNodes in GB
  339. *-------------------------------------+--------------------------------------+
  340. |<<<CapacityRemaining>>> | Current remaining capacity in bytes
  341. *-------------------------------------+--------------------------------------+
  342. |<<<CapacityRemainingGB>>> | Current remaining capacity in GB
  343. *-------------------------------------+--------------------------------------+
  344. |<<<CapacityUsedNonDFS>>> | Current space used by DataNodes for non DFS
  345. | purposes in bytes
  346. *-------------------------------------+--------------------------------------+
  347. |<<<TotalLoad>>> | Current number of connections
  348. *-------------------------------------+--------------------------------------+
  349. |<<<SnapshottableDirectories>>> | Current number of snapshottable directories
  350. *-------------------------------------+--------------------------------------+
  351. |<<<Snapshots>>> | Current number of snapshots
  352. *-------------------------------------+--------------------------------------+
  353. |<<<BlocksTotal>>> | Current number of allocated blocks in the system
  354. *-------------------------------------+--------------------------------------+
  355. |<<<FilesTotal>>> | Current number of files and directories
  356. *-------------------------------------+--------------------------------------+
  357. |<<<PendingReplicationBlocks>>> | Current number of blocks pending to be
  358. | replicated
  359. *-------------------------------------+--------------------------------------+
  360. |<<<UnderReplicatedBlocks>>> | Current number of under-replicated blocks
  361. *-------------------------------------+--------------------------------------+
  362. |<<<CorruptBlocks>>> | Current number of blocks with corrupt replicas
  363. *-------------------------------------+--------------------------------------+
  364. |<<<ScheduledReplicationBlocks>>> | Current number of blocks scheduled for
  365. | replication
  366. *-------------------------------------+--------------------------------------+
  367. |<<<PendingDeletionBlocks>>> | Current number of blocks pending deletion
  368. *-------------------------------------+--------------------------------------+
  369. |<<<ExcessBlocks>>> | Current number of excess blocks
  370. *-------------------------------------+--------------------------------------+
  371. |<<<PostponedMisreplicatedBlocks>>> | (HA-only) Current number of blocks
  372. | whose replication is postponed
  373. *-------------------------------------+--------------------------------------+
  374. |<<<PendingDataNodeMessageCount>>> | (HA-only) Current number of pending
  375. | block-related messages for later
  376. | processing in the standby NameNode
  377. *-------------------------------------+--------------------------------------+
  378. |<<<MillisSinceLastLoadedEdits>>> | (HA-only) Time in milliseconds since the
  379. | last time the standby NameNode loaded the edit log.
  380. | Set to 0 on the active NameNode
  381. *-------------------------------------+--------------------------------------+
  382. |<<<BlockCapacity>>> | Current block capacity
  383. *-------------------------------------+--------------------------------------+
  384. |<<<StaleDataNodes>>> | Current number of DataNodes marked stale due to delayed
  385. | heartbeat
  386. *-------------------------------------+--------------------------------------+
  387. |<<<TotalFiles>>> | Current number of files and directories (same as FilesTotal)
  388. *-------------------------------------+--------------------------------------+
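Two derived values that are often computed from the FSNamesystem metrics above are the fraction of raw capacity in use and the age of the last checkpoint. The sketch below uses a hypothetical record with invented values; the helper names are not part of Hadoop.

```python
def capacity_used_fraction(record):
    """Return CapacityUsed / CapacityTotal, or None if the total is not positive."""
    total = record["CapacityTotal"]
    if total <= 0:
        return None
    return record["CapacityUsed"] / total

def checkpoint_age_seconds(record, now_millis):
    """Return seconds elapsed since LastCheckpointTime (milliseconds since epoch)."""
    return (now_millis - record["LastCheckpointTime"]) / 1000.0

# Hypothetical FSNamesystem record; the values are invented.
sample = {
    "CapacityTotal": 4_000_000_000_000,
    "CapacityUsed": 1_000_000_000_000,
    "LastCheckpointTime": 1_700_000_000_000,
}
print(capacity_used_fraction(sample))
print(checkpoint_age_seconds(sample, now_millis=1_700_000_360_000))
```

In practice `now_millis` would come from the collector's clock, for example `int(time.time() * 1000)`.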
  389. * JournalNode
  390. The server-side metrics for a journal from the JournalNode's perspective.
  391. Each metrics record contains the Hostname tag as additional information
  392. along with metrics.
  393. *-------------------------------------+--------------------------------------+
  394. || Name || Description
  395. *-------------------------------------+--------------------------------------+
  396. |<<<Syncs60sNumOps>>> | Number of sync operations (1 minute granularity)
  397. *-------------------------------------+--------------------------------------+
  398. |<<<Syncs60s50thPercentileLatencyMicros>>> | The 50th percentile of sync
  399. | | latency in microseconds (1 minute granularity)
  400. *-------------------------------------+--------------------------------------+
  401. |<<<Syncs60s75thPercentileLatencyMicros>>> | The 75th percentile of sync
  402. | | latency in microseconds (1 minute granularity)
  403. *-------------------------------------+--------------------------------------+
  404. |<<<Syncs60s90thPercentileLatencyMicros>>> | The 90th percentile of sync
  405. | | latency in microseconds (1 minute granularity)
  406. *-------------------------------------+--------------------------------------+
  407. |<<<Syncs60s95thPercentileLatencyMicros>>> | The 95th percentile of sync
  408. | | latency in microseconds (1 minute granularity)
  409. *-------------------------------------+--------------------------------------+
  410. |<<<Syncs60s99thPercentileLatencyMicros>>> | The 99th percentile of sync
  411. | | latency in microseconds (1 minute granularity)
  412. *-------------------------------------+--------------------------------------+
  413. |<<<Syncs300sNumOps>>> | Number of sync operations (5 minute granularity)
  414. *-------------------------------------+--------------------------------------+
  415. |<<<Syncs300s50thPercentileLatencyMicros>>> | The 50th percentile of sync
  416. | | latency in microseconds (5 minute granularity)
  417. *-------------------------------------+--------------------------------------+
  418. |<<<Syncs300s75thPercentileLatencyMicros>>> | The 75th percentile of sync
  419. | | latency in microseconds (5 minute granularity)
  420. *-------------------------------------+--------------------------------------+
  421. |<<<Syncs300s90thPercentileLatencyMicros>>> | The 90th percentile of sync
  422. | | latency in microseconds (5 minute granularity)
  423. *-------------------------------------+--------------------------------------+
  424. |<<<Syncs300s95thPercentileLatencyMicros>>> | The 95th percentile of sync
  425. | | latency in microseconds (5 minute granularity)
  426. *-------------------------------------+--------------------------------------+
  427. |<<<Syncs300s99thPercentileLatencyMicros>>> | The 99th percentile of sync
  428. | | latency in microseconds (5 minute granularity)
  429. *-------------------------------------+--------------------------------------+
  430. |<<<Syncs3600sNumOps>>> | Number of sync operations (1 hour granularity)
  431. *-------------------------------------+--------------------------------------+
  432. |<<<Syncs3600s50thPercentileLatencyMicros>>> | The 50th percentile of sync
  433. | | latency in microseconds (1 hour granularity)
  434. *-------------------------------------+--------------------------------------+
  435. |<<<Syncs3600s75thPercentileLatencyMicros>>> | The 75th percentile of sync
  436. | | latency in microseconds (1 hour granularity)
  437. *-------------------------------------+--------------------------------------+
  438. |<<<Syncs3600s90thPercentileLatencyMicros>>> | The 90th percentile of sync
  439. | | latency in microseconds (1 hour granularity)
  440. *-------------------------------------+--------------------------------------+
  441. |<<<Syncs3600s95thPercentileLatencyMicros>>> | The 95th percentile of sync
  442. | | latency in microseconds (1 hour granularity)
  443. *-------------------------------------+--------------------------------------+
  444. |<<<Syncs3600s99thPercentileLatencyMicros>>> | The 99th percentile of sync
  445. | | latency in microseconds (1 hour granularity)
  446. *-------------------------------------+--------------------------------------+
  447. |<<<BatchesWritten>>> | Total number of batches written since startup
  448. *-------------------------------------+--------------------------------------+
  449. |<<<TxnsWritten>>> | Total number of transactions written since startup
  450. *-------------------------------------+--------------------------------------+
  451. |<<<BytesWritten>>> | Total number of bytes written since startup
  452. *-------------------------------------+--------------------------------------+
  453. |<<<BatchesWrittenWhileLagging>>> | Total number of batches written where this
  454. | | node was lagging
  455. *-------------------------------------+--------------------------------------+
  456. |<<<LastWriterEpoch>>> | Current writer's epoch number
  457. *-------------------------------------+--------------------------------------+
  458. |<<<CurrentLagTxns>>> | The number of transactions by which this JournalNode
  459. | | is lagging
  460. *-------------------------------------+--------------------------------------+
  461. |<<<LastWrittenTxId>>> | The highest transaction id stored on this JournalNode
  462. *-------------------------------------+--------------------------------------+
  463. |<<<LastPromisedEpoch>>> | The last epoch number promised by this node, which
  464. | | will not accept any lower epoch, or 0 if no promises have been made
  465. *-------------------------------------+--------------------------------------+
  466. * datanode
  467. Each metrics record contains tags such as SessionId and Hostname
  468. as additional information along with metrics.
  469. *-------------------------------------+--------------------------------------+
  470. || Name || Description
  471. *-------------------------------------+--------------------------------------+
  472. |<<<BytesWritten>>> | Total number of bytes written to DataNode
  473. *-------------------------------------+--------------------------------------+
  474. |<<<BytesRead>>> | Total number of bytes read from DataNode
  475. *-------------------------------------+--------------------------------------+
  476. |<<<BlocksWritten>>> | Total number of blocks written to DataNode
  477. *-------------------------------------+--------------------------------------+
  478. |<<<BlocksRead>>> | Total number of blocks read from DataNode
  479. *-------------------------------------+--------------------------------------+
  480. |<<<BlocksReplicated>>> | Total number of blocks replicated
  481. *-------------------------------------+--------------------------------------+
  482. |<<<BlocksRemoved>>> | Total number of blocks removed
  483. *-------------------------------------+--------------------------------------+
  484. |<<<BlocksVerified>>> | Total number of blocks verified
  485. *-------------------------------------+--------------------------------------+
  486. |<<<BlockVerificationFailures>>> | Total number of verification failures
  487. *-------------------------------------+--------------------------------------+
  488. |<<<BlocksCached>>> | Total number of blocks cached
  489. *-------------------------------------+--------------------------------------+
  490. |<<<BlocksUncached>>> | Total number of blocks uncached
  491. *-------------------------------------+--------------------------------------+
  492. |<<<ReadsFromLocalClient>>> | Total number of read operations from local client
  493. *-------------------------------------+--------------------------------------+
  494. |<<<ReadsFromRemoteClient>>> | Total number of read operations from remote
  495. | client
  496. *-------------------------------------+--------------------------------------+
  497. |<<<WritesFromLocalClient>>> | Total number of write operations from local
  498. | client
  499. *-------------------------------------+--------------------------------------+
  500. |<<<WritesFromRemoteClient>>> | Total number of write operations from remote
  501. | client
  502. *-------------------------------------+--------------------------------------+
  503. |<<<BlocksGetLocalPathInfo>>> | Total number of operations to get local path
  504. | names of blocks
  505. *-------------------------------------+--------------------------------------+
  506. |<<<FsyncCount>>> | Total number of fsync operations
  507. *-------------------------------------+--------------------------------------+
  508. |<<<VolumeFailures>>> | Total number of volume failures that have occurred
  509. *-------------------------------------+--------------------------------------+
  510. |<<<ReadBlockOpNumOps>>> | Total number of read operations
  511. *-------------------------------------+--------------------------------------+
  512. |<<<ReadBlockOpAvgTime>>> | Average time of read operations in milliseconds
  513. *-------------------------------------+--------------------------------------+
  514. |<<<WriteBlockOpNumOps>>> | Total number of write operations
  515. *-------------------------------------+--------------------------------------+
  516. |<<<WriteBlockOpAvgTime>>> | Average time of write operations in milliseconds
  517. *-------------------------------------+--------------------------------------+
  518. |<<<BlockChecksumOpNumOps>>> | Total number of blockChecksum operations
  519. *-------------------------------------+--------------------------------------+
  520. |<<<BlockChecksumOpAvgTime>>> | Average time of blockChecksum operations in
  521. | milliseconds
  522. *-------------------------------------+--------------------------------------+
  523. |<<<CopyBlockOpNumOps>>> | Total number of block copy operations
  524. *-------------------------------------+--------------------------------------+
  525. |<<<CopyBlockOpAvgTime>>> | Average time of block copy operations in
  526. | milliseconds
  527. *-------------------------------------+--------------------------------------+
  528. |<<<ReplaceBlockOpNumOps>>> | Total number of block replace operations
  529. *-------------------------------------+--------------------------------------+
  530. |<<<ReplaceBlockOpAvgTime>>> | Average time of block replace operations in
  531. | milliseconds
  532. *-------------------------------------+--------------------------------------+
  533. |<<<HeartbeatsNumOps>>> | Total number of heartbeats
  534. *-------------------------------------+--------------------------------------+
  535. |<<<HeartbeatsAvgTime>>> | Average heartbeat time in milliseconds
  536. *-------------------------------------+--------------------------------------+
  537. |<<<BlockReportsNumOps>>> | Total number of block report operations
  538. *-------------------------------------+--------------------------------------+
  539. |<<<BlockReportsAvgTime>>> | Average time of block report operations in
  540. | milliseconds
  541. *-------------------------------------+--------------------------------------+
  542. |<<<CacheReportsNumOps>>> | Total number of cache report operations
  543. *-------------------------------------+--------------------------------------+
  544. |<<<CacheReportsAvgTime>>> | Average time of cache report operations in
  545. | milliseconds
  546. *-------------------------------------+--------------------------------------+
  547. |<<<PacketAckRoundTripTimeNanosNumOps>>> | Total number of ack round trips
  548. *-------------------------------------+--------------------------------------+
  549. |<<<PacketAckRoundTripTimeNanosAvgTime>>> | Average time from ack send to
  550. | | receive minus the downstream ack time in nanoseconds
  551. *-------------------------------------+--------------------------------------+
  552. |<<<FlushNanosNumOps>>> | Total number of flushes
  553. *-------------------------------------+--------------------------------------+
  554. |<<<FlushNanosAvgTime>>> | Average flush time in nanoseconds
  555. *-------------------------------------+--------------------------------------+
  556. |<<<FsyncNanosNumOps>>> | Total number of fsync operations
  557. *-------------------------------------+--------------------------------------+
  558. |<<<FsyncNanosAvgTime>>> | Average fsync time in nanoseconds
  559. *-------------------------------------+--------------------------------------+
  560. |<<<SendDataPacketBlockedOnNetworkNanosNumOps>>> | Total number of packets
  561. | sent
  562. *-------------------------------------+--------------------------------------+
  563. |<<<SendDataPacketBlockedOnNetworkNanosAvgTime>>> | Average waiting time when
  564. | | sending packets in nanoseconds
  565. *-------------------------------------+--------------------------------------+
  566. |<<<SendDataPacketTransferNanosNumOps>>> | Total number of packets sent
  567. *-------------------------------------+--------------------------------------+
  568. |<<<SendDataPacketTransferNanosAvgTime>>> | Average transfer time when sending
  569. | packets in nanoseconds
  570. *-------------------------------------+--------------------------------------+
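
The table above mixes two time units: <<<AvgTime>>> metrics whose names contain <<<Nanos>>> are reported in nanoseconds, while the remaining <<<AvgTime>>> metrics are reported in milliseconds. The following is a minimal Python sketch of normalizing such values to milliseconds. The function name and the sample values are hypothetical; only the naming convention is taken from the table.

```python
def avg_time_to_millis(metric_name, value):
    """Convert a DataNode *AvgTime metric value to milliseconds.

    Assumption (read off the table above): names containing "Nanos" are in
    nanoseconds; every other *AvgTime metric is already in milliseconds.
    """
    if not metric_name.endswith("AvgTime"):
        raise ValueError("not an AvgTime metric: " + metric_name)
    if "Nanos" in metric_name:
        return value / 1_000_000.0
    return float(value)


# Hypothetical sample values, not real measurements.
samples = {
    "FlushNanosAvgTime": 2_500_000.0,
    "HeartbeatsAvgTime": 3.0,
}
normalized = {name: avg_time_to_millis(name, v) for name, v in samples.items()}
```

Normalizing to one unit before plotting or alerting avoids comparing, for example, <<<FsyncNanosAvgTime>>> against <<<WriteBlockOpAvgTime>>> on different scales.
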
  571. yarn context
  572. * ClusterMetrics
  573. ClusterMetrics shows the metrics of the YARN cluster from the
  574. ResourceManager's perspective. Each metrics record contains
  575. Hostname tag as additional information along with metrics.
  576. *-------------------------------------+--------------------------------------+
  577. || Name || Description
  578. *-------------------------------------+--------------------------------------+
  579. |<<<NumActiveNMs>>> | Current number of active NodeManagers
  580. *-------------------------------------+--------------------------------------+
  581. |<<<NumDecommissionedNMs>>> | Current number of decommissioned NodeManagers
  582. *-------------------------------------+--------------------------------------+
  583. |<<<NumLostNMs>>> | Current number of NodeManagers lost because they stopped
  584. | sending heartbeats
  585. *-------------------------------------+--------------------------------------+
  586. |<<<NumUnhealthyNMs>>> | Current number of unhealthy NodeManagers
  587. *-------------------------------------+--------------------------------------+
  588. |<<<NumRebootedNMs>>> | Current number of rebooted NodeManagers
  589. *-------------------------------------+--------------------------------------+
  590. * QueueMetrics
  591. QueueMetrics shows the metrics of an application queue from the
  592. ResourceManager's perspective. Each metrics record shows
  593. the statistics of each queue, and contains tags such as
  594. queue name and Hostname as additional information along with metrics.
  595. For <<<running_>>><num> metrics such as <<<running_0>>>, you can set the
  596. property <<<yarn.resourcemanager.metrics.runtime.buckets>>> in yarn-site.xml
  597. to change the buckets. The default value is <<<60,300,1440>>>.
  598. *-------------------------------------+--------------------------------------+
  599. || Name || Description
  600. *-------------------------------------+--------------------------------------+
  601. |<<<running_0>>> | Current number of running applications whose elapsed time is
  602. | less than 60 minutes
  603. *-------------------------------------+--------------------------------------+
  604. |<<<running_60>>> | Current number of running applications whose elapsed time is
  605. | between 60 and 300 minutes
  606. *-------------------------------------+--------------------------------------+
  607. |<<<running_300>>> | Current number of running applications whose elapsed time is
  608. | between 300 and 1440 minutes
  609. *-------------------------------------+--------------------------------------+
  610. |<<<running_1440>>> | Current number of running applications whose elapsed time is
  611. | more than 1440 minutes
  612. *-------------------------------------+--------------------------------------+
  613. |<<<AppsSubmitted>>> | Total number of submitted applications
  614. *-------------------------------------+--------------------------------------+
  615. |<<<AppsRunning>>> | Current number of running applications
  616. *-------------------------------------+--------------------------------------+
  617. |<<<AppsPending>>> | Current number of applications that have not yet been
  618. | assigned any containers
  619. *-------------------------------------+--------------------------------------+
  620. |<<<AppsCompleted>>> | Total number of completed applications
  621. *-------------------------------------+--------------------------------------+
  622. |<<<AppsKilled>>> | Total number of killed applications
  623. *-------------------------------------+--------------------------------------+
  624. |<<<AppsFailed>>> | Total number of failed applications
  625. *-------------------------------------+--------------------------------------+
  626. |<<<AllocatedMB>>> | Current allocated memory in MB
  627. *-------------------------------------+--------------------------------------+
  628. |<<<AllocatedVCores>>> | Current allocated CPU in virtual cores
  629. *-------------------------------------+--------------------------------------+
  630. |<<<AllocatedContainers>>> | Current number of allocated containers
  631. *-------------------------------------+--------------------------------------+
  632. |<<<AggregateContainersAllocated>>> | Total number of allocated containers
  633. *-------------------------------------+--------------------------------------+
  634. |<<<AggregateContainersReleased>>> | Total number of released containers
  635. *-------------------------------------+--------------------------------------+
  636. |<<<AvailableMB>>> | Current available memory in MB
  637. *-------------------------------------+--------------------------------------+
  638. |<<<AvailableVCores>>> | Current available CPU in virtual cores
  639. *-------------------------------------+--------------------------------------+
  640. |<<<PendingMB>>> | Current pending memory resource requests in MB that are
  641. | not yet fulfilled by the scheduler
  642. *-------------------------------------+--------------------------------------+
  643. |<<<PendingVCores>>> | Current pending CPU allocation requests in virtual
  644. | cores that are not yet fulfilled by the scheduler
  645. *-------------------------------------+--------------------------------------+
  646. |<<<PendingContainers>>> | Current pending resource requests that are not
  647. | yet fulfilled by the scheduler
  648. *-------------------------------------+--------------------------------------+
  649. |<<<ReservedMB>>> | Current reserved memory in MB
  650. *-------------------------------------+--------------------------------------+
  651. |<<<ReservedVCores>>> | Current reserved CPU in virtual cores
  652. *-------------------------------------+--------------------------------------+
  653. |<<<ReservedContainers>>> | Current number of reserved containers
  654. *-------------------------------------+--------------------------------------+
  655. |<<<ActiveUsers>>> | Current number of active users
  656. *-------------------------------------+--------------------------------------+
  657. |<<<ActiveApplications>>> | Current number of active applications
  658. *-------------------------------------+--------------------------------------+
  659. |<<<FairShareMB>>> | (FairScheduler only) Current fair share of memory in MB
  660. *-------------------------------------+--------------------------------------+
  661. |<<<FairShareVCores>>> | (FairScheduler only) Current fair share of CPU in
  662. | virtual cores
  663. *-------------------------------------+--------------------------------------+
  664. |<<<MinShareMB>>> | (FairScheduler only) Minimum share of memory in MB
  665. *-------------------------------------+--------------------------------------+
  666. |<<<MinShareVCores>>> | (FairScheduler only) Minimum share of CPU in virtual
  667. | cores
  668. *-------------------------------------+--------------------------------------+
  669. |<<<MaxShareMB>>> | (FairScheduler only) Maximum share of memory in MB
  670. *-------------------------------------+--------------------------------------+
  671. |<<<MaxShareVCores>>> | (FairScheduler only) Maximum share of CPU in virtual
  672. | cores
  673. *-------------------------------------+--------------------------------------+
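
To illustrate how the <<<running_>>><num> buckets partition running applications, here is a small Python sketch that maps an application's elapsed time to the metric it would be counted under. The function names are hypothetical, and the handling of exact boundary values (for example, exactly 60 minutes counted in <<<running_60>>>) is an assumption, since the table does not specify it.

```python
from bisect import bisect_right

# Documented default of yarn.resourcemanager.metrics.runtime.buckets (minutes).
DEFAULT_BUCKETS = [60, 300, 1440]


def parse_buckets(prop_value):
    """Parse a comma-separated bucket property value into sorted integers."""
    return sorted(int(p) for p in prop_value.split(",") if p.strip())


def running_metric_name(elapsed_minutes, buckets=DEFAULT_BUCKETS):
    """Return the running_<num> metric an application would fall under.

    Assumption: each bucket's lower bound is inclusive.
    """
    idx = bisect_right(buckets, elapsed_minutes)
    lower = 0 if idx == 0 else buckets[idx - 1]
    return "running_%d" % lower


names = [running_metric_name(m) for m in (10, 120, 300, 2000)]
```

With the default buckets, an application that has run for 120 minutes falls under <<<running_60>>>, and one that has run for 2000 minutes falls under <<<running_1440>>>.
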
  674. * NodeManagerMetrics
  675. NodeManagerMetrics shows the statistics of the containers in the node.
  676. Each metrics record contains Hostname tag as additional information
  677. along with metrics.
  678. *-------------------------------------+--------------------------------------+
  679. || Name || Description
  680. *-------------------------------------+--------------------------------------+
  681. |<<<containersLaunched>>> | Total number of launched containers
  682. *-------------------------------------+--------------------------------------+
  683. |<<<containersCompleted>>> | Total number of successfully completed containers
  684. *-------------------------------------+--------------------------------------+
  685. |<<<containersFailed>>> | Total number of failed containers
  686. *-------------------------------------+--------------------------------------+
  687. |<<<containersKilled>>> | Total number of killed containers
  688. *-------------------------------------+--------------------------------------+
  689. |<<<containersIniting>>> | Current number of initializing containers
  690. *-------------------------------------+--------------------------------------+
  691. |<<<containersRunning>>> | Current number of running containers
  692. *-------------------------------------+--------------------------------------+
  693. |<<<allocatedContainers>>> | Current number of allocated containers
  694. *-------------------------------------+--------------------------------------+
  695. |<<<allocatedGB>>> | Current allocated memory in GB
  696. *-------------------------------------+--------------------------------------+
  697. |<<<availableGB>>> | Current available memory in GB
  698. *-------------------------------------+--------------------------------------+
  699. ugi context
  700. * UgiMetrics
  701. UgiMetrics is related to user and group information.
  702. Each metrics record contains Hostname tag as additional information
  703. along with metrics.
  704. *-------------------------------------+--------------------------------------+
  705. || Name || Description
  706. *-------------------------------------+--------------------------------------+
  707. |<<<LoginSuccessNumOps>>> | Total number of successful Kerberos logins
  708. *-------------------------------------+--------------------------------------+
  709. |<<<LoginSuccessAvgTime>>> | Average time for successful Kerberos logins in
  710. | milliseconds
  711. *-------------------------------------+--------------------------------------+
  712. |<<<LoginFailureNumOps>>> | Total number of failed Kerberos logins
  713. *-------------------------------------+--------------------------------------+
  714. |<<<LoginFailureAvgTime>>> | Average time for failed Kerberos logins in
  715. | milliseconds
  716. *-------------------------------------+--------------------------------------+
  717. |<<<getGroupsNumOps>>> | Total number of group resolutions
  718. *-------------------------------------+--------------------------------------+
  719. |<<<getGroupsAvgTime>>> | Average time for group resolution in milliseconds
  720. *-------------------------------------+--------------------------------------+
  721. |<<<getGroups>>><num><<<sNumOps>>> |
  722. | | Total number of group resolutions (<num> seconds granularity). <num> is
  723. | | specified by <<<hadoop.user.group.metrics.percentiles.intervals>>>.
  724. *-------------------------------------+--------------------------------------+
  725. |<<<getGroups>>><num><<<s50thPercentileLatency>>> |
  726. | | Shows the 50th percentile of group resolution time in milliseconds
  727. | | (<num> seconds granularity). <num> is specified by
  728. | | <<<hadoop.user.group.metrics.percentiles.intervals>>>.
  729. *-------------------------------------+--------------------------------------+
  730. |<<<getGroups>>><num><<<s75thPercentileLatency>>> |
  731. | | Shows the 75th percentile of group resolution time in milliseconds
  732. | | (<num> seconds granularity). <num> is specified by
  733. | | <<<hadoop.user.group.metrics.percentiles.intervals>>>.
  734. *-------------------------------------+--------------------------------------+
  735. |<<<getGroups>>><num><<<s90thPercentileLatency>>> |
  736. | | Shows the 90th percentile of group resolution time in milliseconds
  737. | | (<num> seconds granularity). <num> is specified by
  738. | | <<<hadoop.user.group.metrics.percentiles.intervals>>>.
  739. *-------------------------------------+--------------------------------------+
  740. |<<<getGroups>>><num><<<s95thPercentileLatency>>> |
  741. | | Shows the 95th percentile of group resolution time in milliseconds
  742. | | (<num> seconds granularity). <num> is specified by
  743. | | <<<hadoop.user.group.metrics.percentiles.intervals>>>.
  744. *-------------------------------------+--------------------------------------+
  745. |<<<getGroups>>><num><<<s99thPercentileLatency>>> |
  746. | | Shows the 99th percentile of group resolution time in milliseconds
  747. | | (<num> seconds granularity). <num> is specified by
  748. | | <<<hadoop.user.group.metrics.percentiles.intervals>>>.
  749. *-------------------------------------+--------------------------------------+
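
The <<<getGroups>>><num><<<s...>>> names above are derived from the intervals configured in <<<hadoop.user.group.metrics.percentiles.intervals>>>. The Python sketch below expands a property value into the metric names listed in the table. It assumes the property is a comma-separated list of interval lengths in seconds; the function name is hypothetical.

```python
# Percentiles listed in the UgiMetrics table above.
PERCENTILES = (50, 75, 90, 95, 99)


def get_groups_metric_names(intervals_property):
    """Expand an intervals property value into getGroups<num>s* metric names.

    Assumption: the value is a comma-separated list of seconds, e.g. "60,300".
    """
    names = []
    for token in intervals_property.split(","):
        token = token.strip()
        if not token:
            continue
        seconds = int(token)
        names.append("getGroups%dsNumOps" % seconds)
        for pct in PERCENTILES:
            names.append("getGroups%ds%dthPercentileLatency" % (seconds, pct))
    return names


metric_names = get_groups_metric_names("60,300")
```

Each configured interval contributes one <<<NumOps>>> metric and five percentile metrics, so two intervals yield twelve names.
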
  750. metricssystem context
  751. * MetricsSystem
  752. MetricsSystem shows the statistics for metrics snapshots and publishes.
  753. Each metrics record contains Hostname tag as additional information
  754. along with metrics.
  755. *-------------------------------------+--------------------------------------+
  756. || Name || Description
  757. *-------------------------------------+--------------------------------------+
  758. |<<<NumActiveSources>>> | Current number of active metrics sources
  759. *-------------------------------------+--------------------------------------+
  760. |<<<NumAllSources>>> | Total number of metrics sources
  761. *-------------------------------------+--------------------------------------+
  762. |<<<NumActiveSinks>>> | Current number of active sinks
  763. *-------------------------------------+--------------------------------------+
  764. |<<<NumAllSinks>>> | Total number of sinks \
  765. | (BUT usually less than <<<NumActiveSinks>>>,
  766. | see {{{https://issues.apache.org/jira/browse/HADOOP-9946}HADOOP-9946}})
  767. *-------------------------------------+--------------------------------------+
  768. |<<<SnapshotNumOps>>> | Total number of operations to snapshot statistics from
  769. | a metrics source
  770. *-------------------------------------+--------------------------------------+
  771. |<<<SnapshotAvgTime>>> | Average time in milliseconds to snapshot statistics
  772. | from a metrics source
  773. *-------------------------------------+--------------------------------------+
  774. |<<<PublishNumOps>>> | Total number of operations to publish statistics to a
  775. | sink
  776. *-------------------------------------+--------------------------------------+
  777. |<<<PublishAvgTime>>> | Average time in milliseconds to publish statistics to
  778. | a sink
  779. *-------------------------------------+--------------------------------------+
  780. |<<<DroppedPubAll>>> | Total number of dropped publishes
  781. *-------------------------------------+--------------------------------------+
  782. |<<<Sink_>>><instance><<<NumOps>>> | Total number of sink operations for the
  783. | <instance>
  784. *-------------------------------------+--------------------------------------+
  785. |<<<Sink_>>><instance><<<AvgTime>>> | Average time in milliseconds of sink
  786. | operations for the <instance>
  787. *-------------------------------------+--------------------------------------+
  788. |<<<Sink_>>><instance><<<Dropped>>> | Total number of dropped sink operations
  789. | for the <instance>
  790. *-------------------------------------+--------------------------------------+
  791. |<<<Sink_>>><instance><<<Qsize>>> | Current queue length of sink operations \
  792. | (BUT always set to 0 because nothing increments
  793. | this metric, see
  794. | {{{https://issues.apache.org/jira/browse/HADOOP-9941}HADOOP-9941}})
  795. *-------------------------------------+--------------------------------------+
  796. default context
  797. * StartupProgress
  798. StartupProgress metrics show the statistics of NameNode startup.
  799. Four metrics are exposed for each startup phase based on its name.
  800. The startup <phase>s are <<<LoadingFsImage>>>, <<<LoadingEdits>>>,
  801. <<<SavingCheckpoint>>>, and <<<SafeMode>>>.
  802. Each metrics record contains Hostname tag as additional information
  803. along with metrics.
  804. *-------------------------------------+--------------------------------------+
  805. || Name || Description
  806. *-------------------------------------+--------------------------------------+
  807. |<<<ElapsedTime>>> | Total elapsed time in milliseconds
  808. *-------------------------------------+--------------------------------------+
  809. |<<<PercentComplete>>> | Fraction of NameNode startup completed so far \
  810. | (The max value is not 100 but 1.0)
  811. *-------------------------------------+--------------------------------------+
  812. |<phase><<<Count>>> | Total number of steps completed in the phase
  813. *-------------------------------------+--------------------------------------+
  814. |<phase><<<ElapsedTime>>> | Total elapsed time in the phase in milliseconds
  815. *-------------------------------------+--------------------------------------+
  816. |<phase><<<Total>>> | Total number of steps in the phase
  817. *-------------------------------------+--------------------------------------+
  818. |<phase><<<PercentComplete>>> | Fraction of the phase completed so far \
  819. | (The max value is not 100 but 1.0)
  820. *-------------------------------------+--------------------------------------+
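
Because <<<PercentComplete>>> values top out at 1.0 rather than 100, a consumer usually needs to scale them before display. The Python sketch below builds the per-phase metric names from the phases listed above and converts the fraction to a percentage. The record is a hypothetical dictionary with made-up values, not output captured from a NameNode.

```python
# Startup phases named in the StartupProgress description above.
PHASES = ("LoadingFsImage", "LoadingEdits", "SavingCheckpoint", "SafeMode")


def phase_percent(record, phase):
    """Return a phase's completion as a percentage in the range 0 to 100."""
    if phase not in PHASES:
        raise ValueError("unknown phase: " + phase)
    fraction = record[phase + "PercentComplete"]  # documented maximum is 1.0
    return round(fraction * 100.0, 1)


# Hypothetical metrics record for illustration only.
record = {
    "LoadingFsImagePercentComplete": 1.0,
    "LoadingEditsPercentComplete": 0.25,
}
loading_edits_pct = phase_percent(record, "LoadingEdits")
```
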