- ~~ Licensed under the Apache License, Version 2.0 (the "License");
- ~~ you may not use this file except in compliance with the License.
- ~~ You may obtain a copy of the License at
- ~~
- ~~ http://www.apache.org/licenses/LICENSE-2.0
- ~~
- ~~ Unless required by applicable law or agreed to in writing, software
- ~~ distributed under the License is distributed on an "AS IS" BASIS,
- ~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- ~~ See the License for the specific language governing permissions and
- ~~ limitations under the License. See accompanying LICENSE file.
- ---
- Offline Edits Viewer Guide
- ---
- Erik Steffl
- ---
- ${maven.build.timestamp}
- Offline Edits Viewer Guide
- %{toc|section=1|fromDepth=0}
- * Overview
- Offline Edits Viewer is a tool to parse the Edits log file. The current
- processors are mostly useful for conversion between different formats,
- including XML, which is human-readable and easier to edit than the native
- binary format.
- The tool can parse edits format -18 (roughly Hadoop 0.19) and
- later. The tool operates on files only; it does not need a Hadoop cluster
- to be running.
- Input formats supported:
- [[1]] <<binary>>: native binary format that Hadoop uses internally
- [[2]] <<xml>>: XML format, as produced by the xml processor; used if the
- filename has a <<<.xml>>> extension (case insensitive)
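- The extension rule above can be sketched as a small shell helper. This is
- only an illustration of the documented behaviour, not code taken from the
- tool; the function name <<<oev_input_format>>> is hypothetical.

```shell
# Illustrative helper (not part of Hadoop): mirrors the documented rule that
# a case-insensitive .xml extension selects XML input, anything else binary.
oev_input_format() {
  # Lower-case the file name so that .XML, .Xml and .xml all match.
  case $(printf '%s' "$1" | tr '[:upper:]' '[:lower:]') in
    *.xml) echo xml ;;
    *)     echo binary ;;
  esac
}
```

- For example, <<<oev_input_format edits.XML>>> reports xml, while a file
- named <<<edits>>> is treated as binary.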
- The Offline Edits Viewer provides several output processors (unless
- stated otherwise, the output of a processor can be converted back to
- the original edits file):
- [[1]] <<binary>>: native binary format that Hadoop uses internally
- [[2]] <<xml>>: XML format
- [[3]] <<stats>>: prints out statistics; this output cannot be converted
- back to an edits file
- * Usage
- ----
- bash$ bin/hdfs oev -i edits -o edits.xml
- ----
- *-----------------------:-----------------------------------+
- | Flag | Description |
- *-----------------------:-----------------------------------+
- |[<<<-i>>> ; <<<--inputFile>>>] <input file> | Specify the input edits log file to
- | | process. An <<<.xml>>> extension (case insensitive) means XML format; otherwise
- | | binary format is assumed. Required.
- *-----------------------:-----------------------------------+
- |[<<-o>> ; <<--outputFile>>] <output file> | Specify the output filename, if the
- | | specified output processor generates one. If the specified file already
- | | exists, it is silently overwritten. Required.
- *-----------------------:-----------------------------------+
- |[<<-p>> ; <<--processor>>] <processor> | Specify the processor to apply
- | | against the edits file. Currently valid options are
- | | <<<binary>>>, <<<xml>>> (default) and <<<stats>>>.
- *-----------------------:-----------------------------------+
- |<<[-v ; --verbose] >> | Print the input and output filenames and pipe the output of the
- | | processor to the console as well as to the specified file. On extremely large
- | | files, this may increase processing time by an order of magnitude.
- *-----------------------:-----------------------------------+
- |<<[-h ; --help] >> | Display the tool usage and help information and exit.
- *-----------------------:-----------------------------------+
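- The options above can be combined in a small wrapper. The following is a
- minimal sketch, assuming <<<hdfs>>> is on the PATH; the function name
- <<<oev_convert>>> and the <<<OEV_FORCE>>> variable are hypothetical and not
- part of Hadoop. It only uses the <<<-i>>>, <<<-o>>> and <<<-p>>> flags
- described in the table.

```shell
# Sketch only: oev_convert is a hypothetical wrapper, not a Hadoop command.
# Assumes `hdfs` is on PATH.
oev_convert() {
  if [ "$#" -lt 2 ]; then
    echo "usage: oev_convert <input file> <output file> [processor]" >&2
    return 2
  fi
  in_file=$1
  out_file=$2
  processor=${3:-xml}          # xml is the documented default processor

  # Accept only the processors listed in the options table.
  case $processor in
    binary|xml|stats) ;;
    *) echo "unknown processor: $processor" >&2; return 2 ;;
  esac

  # -o silently overwrites an existing file, so refuse unless OEV_FORCE=1.
  if [ -e "$out_file" ] && [ "${OEV_FORCE:-0}" != 1 ]; then
    echo "refusing to overwrite $out_file (set OEV_FORCE=1 to allow)" >&2
    return 1
  fi

  hdfs oev -i "$in_file" -o "$out_file" -p "$processor"
}
```

- A call such as <<<oev_convert edits edits.stats stats>>> would then run the
- stats processor. The overwrite check is a design choice of this sketch,
- added because the tool itself overwrites the output file without warning.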
- * Case study: Hadoop cluster recovery
- If there is a problem with the Hadoop cluster and the edits file is
- corrupted, it is possible to save at least the part of the edits file that
- is correct. This can be done by converting the binary edits to XML,
- editing the XML manually, and then converting it back to binary. The most common
- problem is that the edits file is missing the closing record (the record
- that has opCode -1). The tool should recognize this and close the XML
- output properly.
- If there is no closing record in the XML file, you can add one after
- the last correct record. Anything after the record with opCode -1 is
- ignored.
- Example of a closing record (with opCode -1):
- +----
- <RECORD>
- <OPCODE>-1</OPCODE>
- <DATA>
- </DATA>
- </RECORD>
- +----
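- The recovery procedure described in this section can be sketched as a shell
- function. This is an illustration under stated assumptions, not a tested
- recovery script: <<<hdfs>>> is assumed to be on the PATH, all file names
- are placeholders, and the function names <<<has_closing_record>>> and
- <<<recover_edits>>> are hypothetical. The check for the closing record
- assumes the <<<OPCODE>>> element sits on one line, as in the example above.

```shell
# Sketch only. Assumes `hdfs` is on PATH; all file names are placeholders.

# Succeeds if the XML file contains a record with opCode -1.
has_closing_record() {
  grep -q '<OPCODE>-1</OPCODE>' "$1"
}

# Convert binary edits to XML, let the user fix the XML by hand,
# then convert the result back to binary.
recover_edits() {
  edits_in=$1     # possibly corrupted binary edits file
  edits_xml=$2    # intermediate XML file to edit manually
  edits_out=$3    # repaired binary edits file

  hdfs oev -i "$edits_in" -o "$edits_xml" -p xml || return 1

  if ! has_closing_record "$edits_xml"; then
    echo "no closing record in $edits_xml; add one after the last correct record" >&2
  fi

  # Manual step: open the XML in the user's editor (vi if EDITOR is unset).
  "${EDITOR:-vi}" "$edits_xml" || return 1

  # The default processor is xml, so binary must be requested explicitly.
  hdfs oev -i "$edits_xml" -o "$edits_out" -p binary
}
```

- It could be invoked as <<<recover_edits edits edits.xml edits.recovered>>>.
- Keeping the original edits file untouched and writing the result to a new
- file is a precaution of this sketch, since the output file is overwritten
- silently.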