
HADOOP-3588. Fixed usability bugs for archives. (mahadev)

git-svn-id: https://svn.apache.org/repos/asf/hadoop/core/trunk@672376 13f79535-47bb-0310-9956-ffa450edef68
Mahadev Konar 17 years ago
parent
commit
084576d986

+ 2 - 0
CHANGES.txt

@@ -672,6 +672,8 @@ Release 0.18.0 - Unreleased
 
     HADOOP-3480.  Need to update Eclipse template to reflect current trunk.
     (Brice Arnould via tomwhite)
+  
+    HADOOP-3588. Fixed usability issues with archives. (mahadev)
 
 Release 0.17.1 - Unreleased
 

+ 2 - 1
docs/changes.html

@@ -310,7 +310,7 @@ via the DistributedCache.<br />(Amareshwari Sriramadasu via ddas)</li>
     </ol>
   </li>
   <li><a href="javascript:toggleList('release_0.18.0_-_unreleased_._optimizations_')">  OPTIMIZATIONS
-</a>&nbsp;&nbsp;&nbsp;(9)
+</a>&nbsp;&nbsp;&nbsp;(10)
     <ol id="release_0.18.0_-_unreleased_._optimizations_">
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3274">HADOOP-3274</a>. The default constructor of BytesWritable creates empty
 byte array. (Tsz Wo (Nicholas), SZE via shv)
@@ -329,6 +329,7 @@ DataNodes take 30% less CPU while writing data.<br />(rangadi)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3095">HADOOP-3095</a>. Speed up split generation in the FileInputSplit,
 especially for non-HDFS file systems. Deprecates
 InputFormat.validateInput.<br />(tomwhite via omalley)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3552">HADOOP-3552</a>. Add forrest documentation for Hadoop commands.<br />(Sharad Agarwal via cdouglas)</li>
     </ol>
   </li>
   <li><a href="javascript:toggleList('release_0.18.0_-_unreleased_._bug_fixes_')">  BUG FIXES

+ 1 - 1
docs/hadoop-default.html

@@ -594,7 +594,7 @@ creations/deletions), or "all".</td>
   </td>
 </tr>
 <tr>
-<td><a name="map.sort.class">map.sort.class</a></td><td>org.apache.hadoop.mapred.MergeSorter</td><td>The default sort class for sorting keys.
+<td><a name="map.sort.class">map.sort.class</a></td><td>org.apache.hadoop.util.QuickSort</td><td>The default sort class for sorting keys.
   </td>
 </tr>
 <tr>

+ 21 - 2
docs/hadoop_archives.html

@@ -210,7 +210,7 @@ document.write("Last Published: " + document.lastModified);
         maps to a FileSystem directory. A Hadoop archive always has a *.har
         extension. A Hadoop archive directory contains metadata (in the form 
         of _index and _masterindex) and data (part-*) files. The _index file contains
-        the name of the files that are part of the archive and there location
+        the name of the files that are part of the archive and the location
         within the part files. 
         </p>
 </div>
@@ -245,7 +245,7 @@ document.write("Last Published: " + document.lastModified);
 <h2 class="h3"> How to look up files in archives? </h2>
 <div class="section">
 <p>
-        The archives exposes itself as a filesystem layer. So all the fs shell commands in the archives work but 
+        The archive exposes itself as a filesystem layer. So all the fs shell commands in the archives work but 
         with a different URI. Also, note that archives are immutable. So, rename's, deletes and creates return an error. 
         URI for Hadoop Archives is 
         </p>
@@ -260,6 +260,25 @@ document.write("Last Published: " + document.lastModified);
 <span class="codefrag">
         har:///archivepath/fileinarchive</span>
 </p>
+<p>
+        Here is an example of creating an archive. The input to the archive is /dir. The directory dir contains 
+        the files filea and fileb. To archive /dir to /user/hadoop/foo.har, the command is 
+        </p>
+<p>
+<span class="codefrag">hadoop archive -archiveName foo.har /dir /user/hadoop</span>
+        
+</p>
+<p>
+        To list the files in the created archive 
+        </p>
+<p>
+<span class="codefrag">hadoop dfs -lsr har:///user/hadoop/foo.har</span>
+</p>
+<p>To cat filea in the archive -
+        </p>
+<p>
+<span class="codefrag">hadoop dfs -cat har:///user/hadoop/foo.har/dir/filea</span>
+</p>
 </div>
 	
 </div>

+ 61 - 44
docs/hadoop_archives.pdf

@@ -58,10 +58,10 @@ endobj
 >>
 endobj
 14 0 obj
-<< /Length 1683 /Filter [ /ASCII85Decode /FlateDecode ]
+<< /Length 1800 /Filter [ /ASCII85Decode /FlateDecode ]
  >>
 stream
-GatU4D/ZI5'`O\2_7-i&6GU72RdSZeXh1.2X6hfHe$Ml6"&<8@0XIHep?Q>m`5c7E?)N(lET"JocT0;Jc!kf7=fu[)en2C1XN-$13TB(&CIanK^1Y&aBPgB-pC#Te[$E-(c7aXkE$H[QL_N&2PgUKLQt6"+iSRsFIor0^(Z0-HIg[ctV&pJd(!jAL0XUc<$d5L?L(S45pgs!0F"jh'e2sTiLKrSpHJ_Cc%6j?O,WRL<_2bg!NMq=KKR@i!7!VVg)OPCjGZ9.*bmSQAi!h,jbc&oLZ&OM%d[ZGqU$V8<GAsh-#N/F5\^mN+42XaLjLHEKcY<&AZR<5)nHJ6oG)#c)=uA:CJ5[b.E/GUmgVZHdJWsb7/Y83j:4E4!Y>Xp">c^im@u3YFg$t2[9>a+q-YgkojV</j-cu_lACJ)dM)TVFG_gS/<i88Vo5#i&<k'FOB[ro#DSb4hJF,[LKu/+:_*!QEU!A6:\.P8V:\K'tFVWD:-B.>2aGaS*F_o:n;#t2Z<LFZXNe;Xp#aFEQh<PZ@0S15iXBEbZl3bnlN]]-rQQ&@O;RgMEkH;T0D/U]NcQ#04CLK)B#'^Wn)J#93iF"[9I-b'G33<nZ)pXUj=$TLeDr.&5BAc`rT^kMG'/N!No)IC1ZT^Fs(:4Bb$d_pCre8!a2TSRbDWD\%28n!4m8aTl?U/]r#/?q?A&Bb"otg2meXp7R.0#-4'f1,8[uG*Z*>("7&;cf%l[/g'm'Agr[.+L]7:WuWb.[j;i;JtNE)_!GQ4bF'D4u\`1W$bkAu'\2<7-PJ2;nco/.=rdGk'n(ZF,5fJK9]#I\[PB.PD-d!eKP2.JXV]6gk[[gB?a;Mf)83qi3P-8$d45p&Zot6;jq0-1m3,hJ#mI,=t-ld<:^M%lDj<C4],jh9hsiqn.53A8mRTlOt9VB[R\IhMI9L8;dS3jh[tYKWTQ;/(&OC2Esq^\R>7<RO\2ida^?mp#8fBmEXA(>7.?>7V!G#[I[M`nd3)8JFds@K0[ZiDY$PIOXIuAklDetBu%Gf:7^j:HA7,6D-:?'ngb'pQf73AL.4JD1K)Y2gm[_!7fm7oO\4NYZ`+.5^INa!G'WVr2A([`Dp*LPnkr9k=M([_C"F0lr/Jo6\F\;ABrs\8mD%dUjuHiHg?@`p)&/;c/iK46?u%H$;KVk:hueO)EY!(nYhR\C)&V9LJ^5XL/-VUR$.KTsY'JAtghb#YW"X@/K'$$F.L7rWG<<'3=5_cXTC1Xg;fMT@ddkE:G32PiQY"q0D*aCK!o:^&6*+F5E0QN;1ksAr1)5hC/(?DFK8_W=1Z4-E7U]jsoPidi$r=(%2m[IEIW"!b!Hrs7>4Bi1C;lBqP[p1ZhK7^`f9A.n+dLlh[HdTcCbm=8?QK9i>MEgJE4:IR"2/q9WMboJdAK*,-lSD=o<pKY(p1;jZJo8"B96YH3[m-!k/<ouWQ_TKB\ENAOs*ful?E1_"`aU9q?"@)(s%R<IfWi(,eWVV.rmf-7I02f8E(DoN^<kq8.oM.-eZl`=0)[)0Q<gtUS),<h8k]Ge7Kh>H*M")PRn9NIhihb`?54G]2"sua,X$X#N3?W=7!9Z0X,"nGjk@+^8(@PjcA*"=Adj"oDo,uP`Xon;B!We(N*uX5l%#>Rf,J]rQ-W+c#_[mD=C`TR([IGc%/R,%"FF-:ZUqNG16!'~>
+Gatm<?#SIU'Rf_Zd+a_VQ6khF_9a>GQ8JbFClO?]=GDBmBhPKPWh^?Lqt6^8ah3kZ36d2jKXA[Em`.?^;X[Ym`GK.r^?o%'mEDhY1Y6ZcT5-#!Yo!8.m<YJ6.]bie>l2s<US-GXpasIm>[^+_i]Z-p>/;;E1j^MG]1_HcC\?0CLXkaLYgQ_OERn:_Hohg*b<N]pO'4B!=R4YDZZe":^iE9W:^)VZgQ+C3-DLd(]HTq%F1JQPXAD(Fh?P[jS97\H'Y"4ACC2,Js5%?J\NfSP'RaLkn!r$"]m#s)Oeel@n4GZ"hr=e=P[^daFChA;aX)548]`bl^=)?S[a)Wq);%^Q?UDpSB-dOXO;`%V%Br!#A0hLq9%;qD(#@%4s2rInGleQ#%J[3D,5CO"aMXk5TeV?1Z[.'(+5_r4_E(i-cntaAXr[99Na]=caON*cMROlaLu)>n-nR"FRusqW9.r8D>#d7Ob+gskE%U?Emd.8@D=aZ^9,QDl*uF#M("NrXirMu_eAJrOFr]#&L-SYne%M?6#pZr\S=4\=(9RWhmDqYIE4JhV,=T//HK>Lhd.RZHrP\5d>RRU+e!%,9<POi>ml0D-cA/P6qV>%PIEPdap'Rl[K\(-<iA2m^00"6?qiR:1-J<l"b=NEhBEQD12E<1c4SqgD9,lA,O>l``M[D4SFR31(:gXcU3B4dM[+3#JXnC!1<'<Ib$;J6,?PPA"=Z*UdLE.8[[h$TK3c2-QGP5a:Ufr6OnY3kW4_:7:*UnNq)qZQC,0L]G]-^TbWYj]s:D6;^.B9>bX;pI,BGH8tor@e7=kQtKK>7s$7mBhkG#]926ik=aI?4:DK9l7oCmkKspGZBI%dtmu`#d19nbY,\-jeR?d1m35ld9E5iZfm;8`$h&:erq!P\P_BC#.3"Vu*RN)'V'%kOhE,$o7(*!Uh:(YE["taDte<E5ECHBa[d]Xu8I=,dh%X"r\TW&#G2*dd"fFAD<Z[N7Sb9a$e!*i3hBC#PgdG*7/soX[-pMKPo70H''JrV5XKd8nl@1T9$gt5s;![S2MoQcM/]>lq;?/K<A'Jmu.tQkdSRFd#g`LVkhgDoVAP(ddNR<3BOlu@[[S?\?k;1A9c3F"fgBXUTj%tH2^"i$/TsNpb^R?0>*Bl%[at2cZX`a-R`%-estf/BJA4<FQJbI"+`=M&$K3Af[]d7(uQiocR#qH"J#tke*=BEG0j,lCPQl-%<gQXg/M9+E-@F^[tKkV4T"C'*?9BG:>[H7DsV=R0#Umi1c8V&7#<-J>JLf]q@*[6)j6E=i4>;P_SL@`R3ld4Oh+s*=qSb&bcL:T2B5!@nQkd!iblc^[2UC$mLPtEls)1Z4@r+1jsfD7li)#q=hX#dg!3cZ[N<CA1R.Y*4TKb#)\!#>[P2]o(69g88l@D?Qu*OkL+28t.?"p0,HdOf58_<V"BfS:ap<V[0q>CWVB8hqAV!/s&?9ttU]-Ya3Ijq3JmP&X2tgd=^*h,9Em<A@RKsV/dfc]<<Q@sCYMoaPd@k9$\T+dDG#bdN<GhN40MRKFJ^PY5Ncp9:W9Z;D5ptkr^j<2L&Z'F<D[\qr"o2rTm.u6Js4cnmH)$cQ:N2C1gHJ$c8##/EfSWq[G0t@.\M=eMAeZGf=L'Hdhd\`d`&nqAI7&0o'X[7P41&LACblX2]\'RE"r-_WJ&+4@(dnA1lZ=:llki0?!'ULsipsGb"&$i</onS!^8KM4A\7S:cV`K;iW04c>Tq.EjJoE3DK/V6"Kq4X[bgb\W8"kZ6E6't6Q<\hp#g44[*?FULdXAJ#tr5)?c>Hi!<~>
 endstream
 endobj
 15 0 obj
@@ -72,64 +72,79 @@ endobj
 /Contents 14 0 R
 >>
 endobj
+16 0 obj
+<< /Length 492 /Filter [ /ASCII85Decode /FlateDecode ]
+ >>
+stream
+Gasak9iHZu&A@ZcEi]N:<C4K*lW>0'3c\.0&HZJI8I\Y>IrG*qPA6XL9UNIeG2M'!>A(cp(;'kT<-I(#.>A\XEGYXe)tm\Q_#tS]?A"#@SHW]E!29*/.g%m\@J^Wj,r,g;*.dGc4=r=:!CAg_=]D\NTN$nmW)6j(g<)M.k4iiWTPE;sYIuErOeS1r$ODsJlGMJ'!SO#DSK:`P.p`i*m1n6Id!BJC`-A@\Paqnbdi!pU@4i-Y]1s8$V4ge&BYt'Ep'2hYOIhn</q2#8SF6:$ffft=CQ]tm`;aUn&ok?O<%*B2NAEKs$&j=1F0^IPVA&XiSsj[0;]\lA=A@%^F.(F6K<<.(JiVn.2PQ`\)gkcUI7R*GVUhntk.;Fp4CVN@C'gJ+$kpc.k#n*+#1Jq?l[2%@<9d(o^Po2XnVMKG9]qg!iXj(r4-WIrSA8kiEZ[O2ZMtWYMer$g/OZda$b`pEEF@kq4=olpI)lB#_0lD;\)VZ~>
+endstream
+endobj
 17 0 obj
+<< /Type /Page
+/Parent 1 0 R
+/MediaBox [ 0 0 612 792 ]
+/Resources 3 0 R
+/Contents 16 0 R
+>>
+endobj
+19 0 obj
 <<
  /Title (\376\377\0\61\0\40\0\127\0\150\0\141\0\164\0\40\0\141\0\162\0\145\0\40\0\110\0\141\0\144\0\157\0\157\0\160\0\40\0\141\0\162\0\143\0\150\0\151\0\166\0\145\0\163\0\77)
- /Parent 16 0 R
- /Next 18 0 R
+ /Parent 18 0 R
+ /Next 20 0 R
  /A 9 0 R
 >> endobj
-18 0 obj
+20 0 obj
 <<
  /Title (\376\377\0\62\0\40\0\110\0\157\0\167\0\40\0\164\0\157\0\40\0\143\0\162\0\145\0\141\0\164\0\145\0\40\0\141\0\156\0\40\0\141\0\162\0\143\0\150\0\151\0\166\0\145\0\77)
- /Parent 16 0 R
- /Prev 17 0 R
- /Next 19 0 R
+ /Parent 18 0 R
+ /Prev 19 0 R
+ /Next 21 0 R
  /A 11 0 R
 >> endobj
-19 0 obj
+21 0 obj
 <<
  /Title (\376\377\0\63\0\40\0\110\0\157\0\167\0\40\0\164\0\157\0\40\0\154\0\157\0\157\0\153\0\40\0\165\0\160\0\40\0\146\0\151\0\154\0\145\0\163\0\40\0\151\0\156\0\40\0\141\0\162\0\143\0\150\0\151\0\166\0\145\0\163\0\77)
- /Parent 16 0 R
- /Prev 18 0 R
+ /Parent 18 0 R
+ /Prev 20 0 R
  /A 13 0 R
 >> endobj
-20 0 obj
+22 0 obj
 << /Type /Font
 /Subtype /Type1
 /Name /F3
 /BaseFont /Helvetica-Bold
 /Encoding /WinAnsiEncoding >>
 endobj
-21 0 obj
+23 0 obj
 << /Type /Font
 /Subtype /Type1
 /Name /F5
 /BaseFont /Times-Roman
 /Encoding /WinAnsiEncoding >>
 endobj
-22 0 obj
+24 0 obj
 << /Type /Font
 /Subtype /Type1
 /Name /F1
 /BaseFont /Helvetica
 /Encoding /WinAnsiEncoding >>
 endobj
-23 0 obj
+25 0 obj
 << /Type /Font
 /Subtype /Type1
 /Name /F9
 /BaseFont /Courier
 /Encoding /WinAnsiEncoding >>
 endobj
-24 0 obj
+26 0 obj
 << /Type /Font
 /Subtype /Type1
 /Name /F2
 /BaseFont /Helvetica-Oblique
 /Encoding /WinAnsiEncoding >>
 endobj
-25 0 obj
+27 0 obj
 << /Type /Font
 /Subtype /Type1
 /Name /F7
@@ -138,19 +153,19 @@ endobj
 endobj
 1 0 obj
 << /Type /Pages
-/Count 2
-/Kids [6 0 R 15 0 R ] >>
+/Count 3
+/Kids [6 0 R 15 0 R 17 0 R ] >>
 endobj
 2 0 obj
 << /Type /Catalog
 /Pages 1 0 R
- /Outlines 16 0 R
+ /Outlines 18 0 R
  /PageMode /UseOutlines
  >>
 endobj
 3 0 obj
 << 
-/Font << /F3 20 0 R /F5 21 0 R /F1 22 0 R /F9 23 0 R /F2 24 0 R /F7 25 0 R >> 
+/Font << /F3 22 0 R /F5 23 0 R /F1 24 0 R /F9 25 0 R /F2 26 0 R /F7 27 0 R >> 
 /ProcSet [ /PDF /ImageC /Text ] >> 
 endobj
 9 0 obj
@@ -171,45 +186,47 @@ endobj
 /D [15 0 R /XYZ 85.0 345.532 null]
 >>
 endobj
-16 0 obj
+18 0 obj
 <<
- /First 17 0 R
- /Last 19 0 R
+ /First 19 0 R
+ /Last 21 0 R
 >> endobj
 xref
-0 26
+0 28
 0000000000 65535 f 
-0000004524 00000 n 
-0000004589 00000 n 
-0000004681 00000 n 
+0000005333 00000 n 
+0000005405 00000 n 
+0000005497 00000 n 
 0000000015 00000 n 
 0000000071 00000 n 
 0000000640 00000 n 
 0000000760 00000 n 
 0000000799 00000 n 
-0000004815 00000 n 
+0000005631 00000 n 
 0000000934 00000 n 
-0000004878 00000 n 
+0000005694 00000 n 
 0000001070 00000 n 
-0000004944 00000 n 
+0000005760 00000 n 
 0000001207 00000 n 
-0000002983 00000 n 
-0000005010 00000 n 
-0000003091 00000 n 
-0000003328 00000 n 
-0000003579 00000 n 
-0000003862 00000 n 
-0000003975 00000 n 
-0000004085 00000 n 
-0000004193 00000 n 
-0000004299 00000 n 
-0000004415 00000 n 
+0000003100 00000 n 
+0000003208 00000 n 
+0000003792 00000 n 
+0000005826 00000 n 
+0000003900 00000 n 
+0000004137 00000 n 
+0000004388 00000 n 
+0000004671 00000 n 
+0000004784 00000 n 
+0000004894 00000 n 
+0000005002 00000 n 
+0000005108 00000 n 
+0000005224 00000 n 
 trailer
 <<
-/Size 26
+/Size 28
 /Root 2 0 R
 /Info 4 0 R
 >>
 startxref
-5061
+5877
 %%EOF

+ 10 - 10
src/core/org/apache/hadoop/fs/HarFileSystem.java

@@ -576,7 +576,7 @@ public class HarFileSystem extends FilterFileSystem {
    */
   public FSDataOutputStream create(Path f, int bufferSize) 
                                     throws IOException {
-    throw new IOException("Har: Create not implemented");
+    throw new IOException("Har: Create not allowed");
   }
   
   public FSDataOutputStream create(Path f,
@@ -586,7 +586,7 @@ public class HarFileSystem extends FilterFileSystem {
       short replication,
       long blockSize,
       Progressable progress) throws IOException {
-    throw new IOException("Har: create not implemented.");
+    throw new IOException("Har: create not allowed.");
   }
   
   @Override
@@ -606,7 +606,7 @@ public class HarFileSystem extends FilterFileSystem {
    */
   @Override
   public boolean setReplication(Path src, short replication) throws IOException{
-    throw new IOException("Har: setreplication not implemented");
+    throw new IOException("Har: setreplication not allowed");
   }
   
   /**
@@ -614,7 +614,7 @@ public class HarFileSystem extends FilterFileSystem {
    */
   @Override
   public boolean delete(Path f, boolean recursive) throws IOException { 
-    throw new IOException("Har: delete not implemented");
+    throw new IOException("Har: delete not allowed");
   }
   
   /**
@@ -667,7 +667,7 @@ public class HarFileSystem extends FilterFileSystem {
    * not implemented.
    */
   public boolean mkdirs(Path f, FsPermission permission) throws IOException {
-    throw new IOException("Har: mkdirs not implemented");
+    throw new IOException("Har: mkdirs not allowed");
   }
   
   /**
@@ -675,7 +675,7 @@ public class HarFileSystem extends FilterFileSystem {
    */
   public void copyFromLocalFile(boolean delSrc, Path src, Path dst) throws 
         IOException {
-    throw new IOException("Har: copyfromlocalfile not implemented");
+    throw new IOException("Har: copyfromlocalfile not allowed");
   }
   
   /**
@@ -691,7 +691,7 @@ public class HarFileSystem extends FilterFileSystem {
    */
   public Path startLocalOutput(Path fsOutputFile, Path tmpLocalFile) 
     throws IOException {
-    throw new IOException("Har: startLocalOutput not implemented");
+    throw new IOException("Har: startLocalOutput not allowed");
   }
   
   /**
@@ -699,7 +699,7 @@ public class HarFileSystem extends FilterFileSystem {
    */
   public void completeLocalOutput(Path fsOutputFile, Path tmpLocalFile) 
     throws IOException {
-    throw new IOException("Har: completeLocalOutput not implemented");
+    throw new IOException("Har: completeLocalOutput not allowed");
   }
   
   /**
@@ -707,7 +707,7 @@ public class HarFileSystem extends FilterFileSystem {
    */
   public void setOwner(Path p, String username, String groupname)
     throws IOException {
-    throw new IOException("Har: setowner not implemented");
+    throw new IOException("Har: setowner not allowed");
   }
 
   /**
@@ -715,7 +715,7 @@ public class HarFileSystem extends FilterFileSystem {
    */
   public void setPermission(Path p, FsPermission permisssion) 
     throws IOException {
-    throw new IOException("Har: setPermission not implemented");
+    throw new IOException("Har: setPermission not allowed");
   }
   
   /**

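The HarFileSystem changes above reword every mutating operation's error from "not implemented" to "not allowed", making it clear that archives are immutable by design rather than incomplete. As a hedged, Hadoop-free sketch (all names here are hypothetical, not Hadoop APIs), the pattern is a read-only facade: reads delegate to the wrapped filesystem, while every mutation fails fast with an explicit message.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal filesystem interface; HarFileSystem actually extends
// Hadoop's FilterFileSystem, but the read-only pattern is the same.
interface MiniFs {
    String read(String path) throws IOException;
    void create(String path, String data) throws IOException;
    void delete(String path) throws IOException;
}

// A trivially mutable backing store.
class InMemoryFs implements MiniFs {
    private final Map<String, String> files = new HashMap<>();
    public String read(String path) throws IOException {
        if (!files.containsKey(path)) throw new IOException("No such file: " + path);
        return files.get(path);
    }
    public void create(String path, String data) { files.put(path, data); }
    public void delete(String path) { files.remove(path); }
}

// Read-only facade: delegate reads, reject mutations with "not allowed",
// mirroring what HarFileSystem does for create/delete/mkdirs/etc.
class ReadOnlyFs implements MiniFs {
    private final MiniFs inner;
    ReadOnlyFs(MiniFs inner) { this.inner = inner; }
    public String read(String path) throws IOException { return inner.read(path); }
    public void create(String path, String data) throws IOException {
        throw new IOException("ReadOnlyFs: create not allowed");
    }
    public void delete(String path) throws IOException {
        throw new IOException("ReadOnlyFs: delete not allowed");
    }
}
```

The usability point of the commit is exactly this wording: a user who tries `rm` or `put` against a `har://` URI should learn the operation is forbidden, not that it is missing.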
+ 13 - 2
src/docs/src/documentation/content/xdocs/hadoop_archives.xml

@@ -27,7 +27,7 @@
         maps to a FileSystem directory. A Hadoop archive always has a *.har
         extension. A Hadoop archive directory contains metadata (in the form 
         of _index and _masterindex) and data (part-*) files. The _index file contains
-        the name of the files that are part of the archive and there location
+        the name of the files that are part of the archive and the location
         within the part files. 
         </p>
         </section>
@@ -53,7 +53,7 @@
         <section>
         <title> How to look up files in archives? </title>
         <p>
-        The archives exposes itself as a filesystem layer. So all the fs shell commands in the archives work but 
+        The archive exposes itself as a filesystem layer. So all the fs shell commands in the archives work but 
         with a different URI. Also, note that archives are immutable. So, rename's, deletes and creates return an error. 
         URI for Hadoop Archives is 
         </p><p><code>har://scheme-hostname:port/archivepath/fileinarchive</code></p><p>
@@ -61,6 +61,17 @@
         In that case the URI would look like 
         </p><p><code>
         har:///archivepath/fileinarchive</code></p>
+        <p>
+        Here is an example of creating an archive. The input to the archive is /dir. The directory dir contains 
+        the files filea and fileb. To archive /dir to /user/hadoop/foo.har, the command is 
+        </p>
+        <p><code>hadoop archive -archiveName foo.har /dir /user/hadoop</code>
+        </p><p>
+        To list the files in the created archive 
+        </p>
+        <p><code>hadoop dfs -lsr har:///user/hadoop/foo.har</code></p>
+        <p>To cat filea in the archive -
+        </p><p><code>hadoop dfs -cat har:///user/hadoop/foo.har/dir/filea</code></p>
         </section>
 	</body>
 </document>

+ 26 - 2
src/test/org/apache/hadoop/fs/TestHarFileSystem.java

@@ -123,14 +123,38 @@ public class TestHarFileSystem extends TestCase {
     out.close();
     Configuration conf = mapred.createJobConf();
     HadoopArchives har = new HadoopArchives(conf);
-    String[] args = new String[4];
+    String[] args = new String[3];
+    //check for destination not specified
     args[0] = "-archiveName";
     args[1] = "foo.har";
     args[2] = inputPath.toString();
-    args[3] = archivePath.toString();
     int ret = ToolRunner.run(har, args);
+    assertTrue(ret != 0);
+    args = new String[4];
+    //check for wrong archiveName
+    args[0] = "-archiveName";
+    args[1] = "/d/foo.har";
+    args[2] = inputPath.toString();
+    args[3] = archivePath.toString();
+    ret = ToolRunner.run(har, args);
+    assertTrue(ret != 0);
+    //see if dest is a file
+    args[1] = "foo.har";
+    args[3] = filec.toString();
+    ret = ToolRunner.run(har, args);
+    assertTrue(ret != 0);
+    //this is a valid run
+    args[0] = "-archiveName";
+    args[1] = "foo.har";
+    args[2] = inputPath.toString();
+    args[3] = archivePath.toString();
+    ret = ToolRunner.run(har, args);
     //check for the existence of the archive
     assertTrue(ret == 0);
+    //try running it again. it should not
+    // override the directory
+    ret = ToolRunner.run(har, args);
+    assertTrue(ret != 0);
     Path finalPath = new Path(archivePath, "foo.har");
     Path fsPath = new Path(inputPath.toUri().getPath());
     String relative = fsPath.toString().substring(1);

+ 49 - 35
src/tools/org/apache/hadoop/tools/HadoopArchives.java

@@ -209,6 +209,10 @@ public class HadoopArchives implements Tool {
   }
 
   private boolean checkValidName(String name) {
+    Path tmp = new Path(name);
+    if (tmp.depth() != 1) {
+      return false;
+    }
     if (name.endsWith(".har")) 
       return true;
     return false;
@@ -301,16 +305,16 @@ public class HadoopArchives implements Tool {
    */
   public void archive(List<Path> srcPaths, String archiveName, Path dest) 
   throws IOException {
-    boolean isValid = checkValidName(archiveName);
-    if (!isValid) { 
-      throw new IOException("Invalid archiveName " + archiveName);
-    }
     checkPaths(conf, srcPaths);
     int numFiles = 0;
     long totalSize = 0;
     conf.set(DST_HAR_LABEL, archiveName);
     Path outputPath = new Path(dest, archiveName);
     FileOutputFormat.setOutputPath(conf, outputPath);
+    FileSystem outFs = outputPath.getFileSystem(conf);
+    if (outFs.exists(outputPath) || outFs.isFile(dest)) {
+      throw new IOException("Invalid Output.");
+    }
     conf.set(DST_DIR_LABEL, outputPath.toString());
     final String randomId = DistCp.getRandomId();
     Path jobDirectory = new Path(new JobClient(conf).getSystemDir(),
@@ -620,41 +624,51 @@ public class HadoopArchives implements Tool {
    */
 
   public int run(String[] args) throws Exception {
-    List<Path> srcPaths = new ArrayList<Path>();
-    Path destPath = null;
-    // check we were supposed to archive or 
-    // unarchive
-    String archiveName = null;
-    if (args.length < 2) {
-      System.out.println(usage);
-      throw new IOException("Invalid usage.");
-    }
-    if (!"-archiveName".equals(args[0])) {
-      System.out.println(usage);
-      throw new IOException("Archive Name not specified.");
-    }
-    archiveName = args[1];
-    if (!checkValidName(archiveName)) {
-      throw new IOException("Invalid name for archives. " + archiveName);
-    }
-    for (int i = 2; i < args.length; i++) {
-      if (i == (args.length - 1)) {
-        destPath = new Path(args[i]);
+    try {
+      List<Path> srcPaths = new ArrayList<Path>();
+      Path destPath = null;
+      // check we were supposed to archive or 
+      // unarchive
+      String archiveName = null;
+      if (args.length < 4) {
+        System.out.println(usage);
+        throw new IOException("Invalid usage.");
       }
-      else {
-        srcPaths.add(new Path(args[i]));
+      if (!"-archiveName".equals(args[0])) {
+        System.out.println(usage);
+        throw new IOException("Archive Name not specified.");
       }
-    }
-    // do a glob on the srcPaths and then pass it on
-    List<Path> globPaths = new ArrayList<Path>();
-    for (Path p: srcPaths) {
-      FileSystem fs = p.getFileSystem(getConf());
-      FileStatus[] statuses = fs.globStatus(p);
-      for (FileStatus status: statuses) {
-        globPaths.add(fs.makeQualified(status.getPath()));
+      archiveName = args[1];
+      if (!checkValidName(archiveName)) {
+        System.out.println(usage);
+        throw new IOException("Invalid name for archives. " + archiveName);
       }
+      for (int i = 2; i < args.length; i++) {
+        if (i == (args.length - 1)) {
+          destPath = new Path(args[i]);
+        }
+        else {
+          srcPaths.add(new Path(args[i]));
+        }
+      }
+      if (srcPaths.size() == 0) {
+        System.out.println(usage);
+        throw new IOException("Invalid Usage: No input sources specified.");
+      }
+      // do a glob on the srcPaths and then pass it on
+      List<Path> globPaths = new ArrayList<Path>();
+      for (Path p: srcPaths) {
+        FileSystem fs = p.getFileSystem(getConf());
+        FileStatus[] statuses = fs.globStatus(p);
+        for (FileStatus status: statuses) {
+          globPaths.add(fs.makeQualified(status.getPath()));
+        }
+      }
+      archive(globPaths, archiveName, destPath);
+    } catch(IOException ie) {
+      System.err.println(ie.getLocalizedMessage());
+      return -1;
     }
-    archive(globPaths, archiveName, destPath);
     return 0;
   }
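The reworked run() above tightens argument validation: at least four arguments, a leading -archiveName flag, an archive name that is a bare *.har name (Path depth of 1), at least one source path, and all IOExceptions converted into a non-zero exit code instead of a stack trace. A standalone sketch of just those checks, detached from the Hadoop classes (names and the "/"-based depth approximation are illustrative, not the actual implementation):

```java
// Hypothetical stand-in for the patched validation in HadoopArchives.run().
final class ArchiveArgs {
    // Mirrors checkValidName: a bare name with no directory components
    // (the patch checks new Path(name).depth() != 1) ending in ".har".
    static boolean isValidArchiveName(String name) {
        if (name.contains("/")) return false;
        return name.endsWith(".har");
    }

    // Returns null on a usage error, otherwise the args unchanged.
    // Layout: -archiveName <name.har> <src>... <dest>
    static String[] parse(String[] args) {
        if (args.length < 4) return null;                 // flag + name + >=1 src + dest
        if (!"-archiveName".equals(args[0])) return null; // archive name not specified
        if (!isValidArchiveName(args[1])) return null;    // invalid name for archives
        return args;                                      // last argument is the destination
    }
}
```

This matches the cases the new unit test exercises: a missing destination (only three arguments), a name like /d/foo.har, and a valid invocation, with each failure reported as a non-zero return rather than an uncaught exception.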