
HADOOP-3668. Made editorial changes to HOD documentation. Contributed by Vinod Kumar Vavilapalli.

git-svn-id: https://svn.apache.org/repos/asf/hadoop/core/trunk@673334 13f79535-47bb-0310-9956-ffa450edef68
Hemanth Yamijala 17 years ago
parent commit 2b0ac94c01

+ 21 - 6
docs/changes.html

@@ -56,7 +56,7 @@
 </a></h2>
 <ul id="trunk_(unreleased_changes)_">
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._incompatible_changes_')">  INCOMPATIBLE CHANGES
-</a>&nbsp;&nbsp;&nbsp;(2)
+</a>&nbsp;&nbsp;&nbsp;(3)
     <ol id="trunk_(unreleased_changes)_._incompatible_changes_">
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3595">HADOOP-3595</a>. Remove deprecated methods for mapred.combine.once
 functionality, which was necessary to providing backwards
@@ -70,6 +70,7 @@ compatible combiner semantics for 0.18.<br />(cdouglas via omalley)</li>
   setInputPath(Path)
   setMapOutputCompressionType(CompressionType style)
   setOutputPath(Path)<br />(Amareshwari Sriramadasu via omalley)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3652">HADOOP-3652</a>. Remove deprecated class OutputFormatBase.<br />(Amareshwari Sriramadasu via cdouglas)</li>
     </ol>
   </li>
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._new_features_')">  NEW FEATURES
@@ -85,7 +86,7 @@ All of them default to "\t".<br />(Zheng Shao via omalley)</li>
     </ol>
   </li>
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._improvements_')">  IMPROVEMENTS
-</a>&nbsp;&nbsp;&nbsp;(3)
+</a>&nbsp;&nbsp;&nbsp;(4)
     <ol id="trunk_(unreleased_changes)_._improvements_">
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3577">HADOOP-3577</a>. Tools to inject blocks into name node and simulated
 data nodes for testing.<br />(Sanjay Radia via hairong)</li>
@@ -93,6 +94,7 @@ data nodes for testing.<br />(Sanjay Radia via hairong)</li>
 may be processed by map/reduce.<br />(cdouglas via omalley)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3655">HADOOP-3655</a>. Add additional ant properties to control junit.<br />(Steve
 Loughran via omalley)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3543">HADOOP-3543</a>. Update the copyright year to 2008.<br />(cdouglas via omalley)</li>
     </ol>
   </li>
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._optimizations_')">  OPTIMIZATIONS
@@ -100,14 +102,16 @@ Loughran via omalley)</li>
     <ol id="trunk_(unreleased_changes)_._optimizations_">
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3556">HADOOP-3556</a>. Removed lock contention in MD5Hash by changing the
 singleton MessageDigester by an instance per Thread using
-ThreadLocal.<br />(Ivn de Prado via omalley)</li>
+ThreadLocal.<br />(Iván de Prado via omalley)</li>
     </ol>
   </li>
   <li><a href="javascript:toggleList('trunk_(unreleased_changes)_._bug_fixes_')">  BUG FIXES
-</a>&nbsp;&nbsp;&nbsp;(1)
+</a>&nbsp;&nbsp;&nbsp;(2)
     <ol id="trunk_(unreleased_changes)_._bug_fixes_">
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3563">HADOOP-3563</a>.  Refactor the distributed upgrade code so that it is
 easier to identify datanode and namenode related code.<br />(dhruba)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3640">HADOOP-3640</a>. Fix the read method in the NativeS3InputStream.<br />(tomwhite via
+omalley)</li>
     </ol>
   </li>
 </ul>
@@ -241,7 +245,7 @@ in hadoop user guide.<br />(shv)</li>
     </ol>
   </li>
   <li><a href="javascript:toggleList('release_0.18.0_-_unreleased_._improvements_')">  IMPROVEMENTS
-</a>&nbsp;&nbsp;&nbsp;(44)
+</a>&nbsp;&nbsp;&nbsp;(45)
     <ol id="release_0.18.0_-_unreleased_._improvements_">
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2928">HADOOP-2928</a>. Remove deprecated FileSystem.getContentLength().<br />(Lohit Vjayarenu via rangadi)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3130">HADOOP-3130</a>. Make the connect timeout smaller for getFile.<br />(Amar Ramesh Kamat via ddas)</li>
@@ -331,6 +335,7 @@ reflect that it should only be used in cleanup contexts.<br />(omalley)</li>
 via the DistributedCache.<br />(Amareshwari Sriramadasu via ddas)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3606">HADOOP-3606</a>. Updates the Streaming doc.<br />(Amareshwari Sriramadasu via ddas)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3532">HADOOP-3532</a>. Add jdiff reports to the build scripts.<br />(omalley)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3100">HADOOP-3100</a>. Develop tests to test the DFS command line interface.<br />(mukund)</li>
     </ol>
   </li>
   <li><a href="javascript:toggleList('release_0.18.0_-_unreleased_._optimizations_')">  OPTIMIZATIONS
@@ -357,7 +362,7 @@ InputFormat.validateInput.<br />(tomwhite via omalley)</li>
     </ol>
   </li>
   <li><a href="javascript:toggleList('release_0.18.0_-_unreleased_._bug_fixes_')">  BUG FIXES
-</a>&nbsp;&nbsp;&nbsp;(103)
+</a>&nbsp;&nbsp;&nbsp;(108)
     <ol id="release_0.18.0_-_unreleased_._bug_fixes_">
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-2905">HADOOP-2905</a>. 'fsck -move' triggers NPE in NameNode.<br />(Lohit Vjayarenu via rangadi)</li>
       <li>Increment ClientProtocol.versionID missed by <a href="http://issues.apache.org/jira/browse/HADOOP-2585">HADOOP-2585</a>.<br />(shv)</li>
@@ -562,6 +567,16 @@ not yet resolved.<br />(Amar Ramesh Kamat via ddas)</li>
 current semantics.<br />(lohit vijayarenu via cdouglas)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3480">HADOOP-3480</a>.  Need to update Eclipse template to reflect current trunk.<br />(Brice Arnould via tomwhite)</li>
       <li><a href="http://issues.apache.org/jira/browse/HADOOP-3588">HADOOP-3588</a>. Fixed usability issues with archives.<br />(mahadev)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3536">HADOOP-3536</a>. Uncaught exception in DataBlockScanner.
+(Tsz Wo (Nicholas), SZE via hairong)
+</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3539">HADOOP-3539</a>. Exception when closing DFSClient while multiple files are
+open.<br />(Benjamin Gufler via hairong)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3572">HADOOP-3572</a>. SetQuotas usage interface has some minor bugs.<br />(hairong)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3649">HADOOP-3649</a>. Fix bug in removing blocks from the corrupted block map.<br />(Lohit Vijayarenu via shv)</li>
+      <li><a href="http://issues.apache.org/jira/browse/HADOOP-3604">HADOOP-3604</a>. Work around a JVM synchronization problem observed while
+retrieving the address of direct buffers from compression code by obtaining
+a lock during this call.<br />(Arun C Murthy via cdouglas)</li>
     </ol>
   </li>
 </ul>

+ 3 - 3
docs/hod.html

@@ -218,13 +218,13 @@ Hadoop On Demand (HOD) is a system for provisioning virtual Hadoop clusters over
 <ul>
         
 <li>
-<a href="hod_admin_guide.html">Hod Admin Guide</a> : This guide will walk you through an overview of architecture of HOD, prerequisites, installing various components and dependent software, and configuring HOD to get it up and running.</li>
+<a href="hod_admin_guide.html">HOD Admin Guide</a> : This guide provides an overview of the HOD architecture, Torque resource manager, and various support tools and utilities, and shows you how to install, configure, and run HOD.</li>
         
 <li>
-<a href="hod_user_guide.html">Hod User Guide</a> : This guide will let you know about how to get started on running hod, its various features, command line options and help on troubleshooting in detail.</li>
+<a href="hod_config_guide.html">Hod Configuration Guide</a> : This guide discusses HOD configuration sections and shows you how to work with the most important and commonly used configuration options.</li>
         
 <li>
-<a href="hod_config_guide.html">Hod Configuration Guide</a> : This guide discusses about onfiguring HOD, describing various configuration sections, parameters and their purpose in detail.</li>
+<a href="hod_user_guide.html">Hod User Guide</a> : This guide shows you how to get started using HOD, reviews various HOD features and command line options, and provides detailed troubleshooting help.</li>
       
 </ul>
 </div>

+ 26 - 26
docs/hod.pdf

@@ -47,10 +47,10 @@ endobj
 >>
 endobj
 12 0 obj
-<< /Length 1823 /Filter [ /ASCII85Decode /FlateDecode ]
+<< /Length 1826 /Filter [ /ASCII85Decode /FlateDecode ]
  >>
 stream
-Gat=,968iG&AII3nEA*K#iC^"P.J*NR?#`a4k2^2,Xc*8&N!$Ne\BF.,_ZCj.!ETd*WcE:Kl=_3i:D9fnLh*eT@h_@Za-]\j]FG=3EBV,>j-OQnNq$)[Jmj[a9V%N%Jon$03'Mketjr7eHbD)C]3G]O7hcQG4uX;=2Sq%-ZKu,rc-8Dj*2iV,dX(DJ$2'1RKL`Q#L0h\f-,T".".bV&@GF6=K1an[\[!$`_TYHO1.n,e-'^>2cZKpTkW1\NAj2(5qU'92rGITE](JhZL)>)-K]KJAHtOJ;CH]^/ugS(bh3gih/>A%mc\k#*#bDXNM?]*#d*sk)MW@C"]L&]OKTZC9@enoVJ$1N898Ta/(Act^+5>>D6ti_8p^i*jTErT(g"%[an6b_Lcr?+.qu06.3^]LMSqG*Q*'jQrhB)+D[\mgNPKda\@qd"r1n-ZWl>TWg9kAt^;?"s#;r'YMFhc%*r?*O@u&!tLU78MeWRg>jI3#*4RXO9Ep<d&?=N.p=2iS3RteD2r4ag<hDHEk_4Sb[FOh"j[,e<';'[9f(4?O"&l(WU%0B-f9k[479qQ^QgOfNE]P/[Amb/'"j7Sl1,BOK;9NXDMk:sO:Ef*U66ZBn^O-G?["Mor,^0n#X6FPLPLoUB"dEG)6(i9b;"/ab;=+.Gh/#;DXJJ[fo3-)2C:L>fC)qK2?=jHq-Asoq9$f=,;R7>.6<"Lhe@idIlS$']<Mp`/o1O-5H.n.^A@=dIffm7"5RU[mQN`f:WaVJ_pKY7p&2`\.t):J/Q"hV.Y"0gnNGHn?q?:4Ma_K>o,0(#)jIM*=OFpEY)FH3gSmC&,QoO;,#iP4E`Wid^fX6Y3KP%a,-^6Sb9'Zs\(BdpX+0I6*V5R-ZL.&"m=Q.Vk]'&%UoMbYdkRM<_]+7C2%$UI*!B6<+SFTWVZ@@d)YhHDp9b59^Dn\F*8#Qpq4rR4GSfd%%OCN"Vc&)&F%k13-",VL=W'Yjs9R?Z$^JU5YQXSto3C]fJ!"6@jtJ<Q\)NQBUF4]@c-e`?FMi_+UC\'G,H7)X0,2]*I)7J+=Sfi5S'/*u!`-ke33%5h5kB"*/LEZu9*,1bJsnjSNJU:UDOh!4OVjGl#!Cp?8VAQh>\OL?/6HbVo-U(KeBG`9iHU;pJ;P\S;cK<iku(0UrD6*""OH:LXT(EN'.(#)nk5TV8tbui5j$l0sH'q9@WN-[YAZ=LD!Xs-fJFIp3>N0#h_.5t9hfi<J#:OYn@5^gK'l+i.:r.jKDSG>T7!:@likWl,>P16;a8-_aTi01kXQmA3l!OhZnVD!JC`p.Sk7F=Omk5qBGJ2egJ+2>YABd^496jZVYN'bqO?_`8eUJkY*]opWT=+PiWUA]=*_T`M'>)XlM1:-<.jKh_FjuC9)BOJ_k:ICL5J/S!;3\Um&WOhOV9WV*qZ<fH("hoA=b>BF9hD=-!5@Z;,j\HQC9>#JuFY@`IFa]_S[7'Gj^S@qVoj`._eXI0hM1And8OBb>Gp[=HriTU3Qd)+9fKqlH0'f$cQu;s:@#a>Un6?^sBrgu_9(2SN<:BdN*cA,pf4aV,Z^`mPX_S?s-AC4CEQ(6cHEjSepHf$iHpMGunX0l"A5+B+ZM)068AB>tSZE"_eh<keCVfB>B:'dp,kM-gEZiKZAH&eL73m^VGBu;2%B&AP%'"]bJN.g@U0tWafYe6qlc3V[3+m+o!/eJBZP&nKdA['qc'bHS71M<NR'@o[,<ru^&,5rH%'8(d=.C]Gl@\/CEBNKa5:<+c^el=i+Cd94&!r\mn>W96*e,7n+uOEh\ggX^Cu_stJdd:,mKSXgeCP_aLl1"!SjjL0~>
+Gat=,997gc&AJ$CkdUXON(CX/9=iq)"cB+4Hg:=?Q"ZRf=]?fkY5`o@>7u^skW$B[$]u4fRi0L1&%u2NhE3,+%[$4^2_3c9=4m,p+8.HsmDWLaE"IB^SbMRn7^'JV,guR6+6u[ZU<bnRl;uhlhYPRK+7%H9F]qF_^H.ts6BoKp(S8pb`]-5g-^j_AqKqm1,=`e%B4MsH?kar#ck!JpU+sHUC@hA=G-imGqW[FcVh_0OAqRSTR@;1ifQM(Def,dWU6/p2>@eSR9=)7nZKl492Wf>I<Nh'J!%@P70#C[hQ>CN/D7?,]1>(E!O"J[$id_"^Tek#Cb5c^pAS/1_nllXUP-q$N3S72HUa=O!'BZh*ca>1:0j7R$V`Ft-nc`02$u`I]an.!>O?],86/$[)MnmQ,!qC+tA`pdj^M*6$$WB=uQ%8DqiSj;WY`lA`W*>%,C#=c<Z;k1%FYqi!X-lpVc$d?7@>[oC=@aAGAW>d9+>?O>)d`6$SiJY3-a038Vn;?<bh"H8)#A_`_0EEjflI)I2hS@Fa5dtJFU(%&GdV3`S>(88WG4_lCoJp'F5"uR?LdII#WLU`P!A]s(%'uWQLIi:JB2`LiWYkXEl$?V\:#3G@iXf*=`?qH9tBR7N7G%lh>PQmL4:kBg&R,[p?9"o>3pL9]gY"+?!"#r!rNLRmT\$/(Dt@Jlh\UQB7:pYVC_Kd,Z?n5;R$rd+>^$VH*e\QI0Fa0Q,==QiQ_(t&e/on79uOVXUeX'jqoV7IF`;8A;e$gKY?:LP"jWs,1:;U._Vq#UV`=Lho2:4>HQ\Y$eH)[Yd@X2KNX1!T:?W?i#Wt.hX;0Ol.*?dWrG%$VQOR@eb(VQ;)3;.T.:Z1^m1b8Bdrn!lQGXc-jC`/07h"o[5\/$$1#<s7;io898^i3D]Z1$KIB2)%jjCN&j[n9ht+RppJtC]ErC0=!bLlH_a/gYN(c<ukG'k+]S#tOdV4=R=aCbV-nds8kWkp1"o&L"_oLBf01Eg#&W)PMP.F(UW_Q'8Z>XlgEiD-aXKdBC=I14`R0_dc'M%[&l:sj#]'juPN)^iU`PdfAWSlmq5c_;HH*08&YrYOY5/Ci#[u5UY5j4)bX-gfY6"Rc1X[(b9P(EHb2LSU])PSX=:G+XVmq`3sTpd)Cm]#9A#=c(Kdg4T]kre>a8"D:2h!4@0(K?qfJAXn)K;J\_)cE9ZWYMV8qdjgN0Z/iD(3AHX^:##@I[C^Hp5<h7#WOuD5S`<IohKKdmJLE_;!?Ss&ObRZT7Q6cJ?QjtAct44TrLYL`GQ87,E,-DZG5/]m]tP&"ANZZ*HNiBIR&`dg'$JojbE-^;4t?R1T4IN//>gFCM$C+Dr.tS"D;`00cIFc1k3-I2Ge+E&j@_^[o)K7j9o)[c;gEZU4nTu:?MgbG7GX7Yu>^7ATrRA%kSSlB/_j]EX#\9'BM9raB3rs^5P.g7S\pah+g]F57kg@mij+j&a=O^9:0eLde0552lOpBg2]&IEY[e95Va"7`[D>G)cRhR#!X`i36_Z$_9Wq_?`HC<"#59$1;U'^C#3QU![O0dquC`:Ym]=r;VtGl.na[qeP1R8W)ls)2C`%>$VItpiaq^ZY`A:s)O+4Qr`h2>5%+sV>?3V9X2Irc#Gc)Oq^;pu0>WOJ,b';Y9+mI80c<2-:!<(u+,n\>4J?.I5tNmA-pAi;*`"[=Fn&^f8-&s0i+4'(P6(\>/XZc4J-":EW/ji>KFnVT2?.</jJAP*"coL_3D*Uq:B?XB92rGM_Ic5a5;fM73Lu`XD]>._oiJE_%`8^$hK]f`?'qg7_m-3sM\C7AekGU\cO+31pDn<H*5D~>
 endstream
 endobj
 13 0 obj
@@ -72,7 +72,7 @@ endobj
 15 0 obj
 << /Type /Annot
 /Subtype /Link
-/Rect [ 108.0 467.732 197.328 455.732 ]
+/Rect [ 108.0 467.732 202.656 455.732 ]
 /C [ 0 0 0 ]
 /Border [ 0 0 0 ]
 /A << /URI (hod_admin_guide.html)
@@ -83,10 +83,10 @@ endobj
 16 0 obj
 << /Type /Annot
 /Subtype /Link
-/Rect [ 108.0 428.132 186.648 416.132 ]
+/Rect [ 108.0 428.132 231.324 416.132 ]
 /C [ 0 0 0 ]
 /Border [ 0 0 0 ]
-/A << /URI (hod_user_guide.html)
+/A << /URI (hod_config_guide.html)
 /S /URI >>
 /H /I
 >>
@@ -94,10 +94,10 @@ endobj
 17 0 obj
 << /Type /Annot
 /Subtype /Link
-/Rect [ 108.0 401.732 231.324 389.732 ]
+/Rect [ 108.0 401.732 186.648 389.732 ]
 /C [ 0 0 0 ]
 /Border [ 0 0 0 ]
-/A << /URI (hod_config_guide.html)
+/A << /URI (hod_user_guide.html)
 /S /URI >>
 /H /I
 >>
@@ -188,31 +188,31 @@ endobj
 xref
 0 26
 0000000000 65535 f 
-0000004473 00000 n 
-0000004538 00000 n 
-0000004630 00000 n 
+0000004476 00000 n 
+0000004541 00000 n 
+0000004633 00000 n 
 0000000015 00000 n 
 0000000071 00000 n 
 0000000564 00000 n 
 0000000684 00000 n 
 0000000716 00000 n 
-0000004753 00000 n 
+0000004756 00000 n 
 0000000851 00000 n 
-0000004816 00000 n 
+0000004819 00000 n 
 0000000988 00000 n 
-0000002904 00000 n 
-0000003027 00000 n 
-0000003068 00000 n 
-0000003240 00000 n 
-0000003411 00000 n 
-0000004882 00000 n 
-0000003584 00000 n 
-0000003747 00000 n 
-0000003917 00000 n 
-0000004030 00000 n 
-0000004140 00000 n 
-0000004248 00000 n 
-0000004364 00000 n 
+0000002907 00000 n 
+0000003030 00000 n 
+0000003071 00000 n 
+0000003243 00000 n 
+0000003416 00000 n 
+0000004885 00000 n 
+0000003587 00000 n 
+0000003750 00000 n 
+0000003920 00000 n 
+0000004033 00000 n 
+0000004143 00000 n 
+0000004251 00000 n 
+0000004367 00000 n 
 trailer
 <<
 /Size 26
@@ -220,5 +220,5 @@ trailer
 /Info 4 0 R
 >>
 startxref
-4933
+4936
 %%EOF
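The xref churn in the docs/hod.pdf hunks above is mechanical: object 12's stream grew by 3 bytes (`/Length 1823` → `1826`), object 16's link URI grew by 2 (`hod_user_guide.html` → `hod_config_guide.html`), and object 17's shrank by 2 (the reverse swap), so each offset shifts by the sum of the size changes that occur at lower byte positions. A quick consistency check, with all numbers transcribed from the hunks (the attribution of each shift to a starting byte position is an inference from the old xref offsets, not stated in the diff):

```python
# Consistency check for the docs/hod.pdf xref hunk.
# Three in-place edits change byte positions of later objects:
#   - object 12 (offset 988): stream grew by 3 bytes (/Length 1823 -> 1826)
#   - object 16 (offset 3240): link URI grew by 2 bytes
#   - object 17 (offset 3411): link URI shrank by 2 bytes
def shifted(old: int) -> int:
    delta = 0
    if old > 988:    # past object 12's stream: +3
        delta += 3
    if old > 3240:   # past object 16: +2
        delta += 2
    if old > 3411:   # past object 17: -2
        delta -= 2
    return old + delta

# (old, new) offset pairs transcribed from the xref hunk, plus startxref.
pairs = [
    (4473, 4476), (4538, 4541), (4630, 4633), (15, 15), (71, 71),
    (564, 564), (684, 684), (716, 716), (4753, 4756), (851, 851),
    (4816, 4819), (988, 988), (2904, 2907), (3027, 3030), (3068, 3071),
    (3240, 3243), (3411, 3416), (4882, 4885), (3584, 3587), (3747, 3750),
    (3917, 3920), (4030, 4033), (4140, 4143), (4248, 4251), (4364, 4367),
    (4933, 4936),  # startxref
]
assert all(shifted(old) == new for old, new in pairs)
```

This explains the one irregular entry: object 17's offset moves by +5 (3411 → 3416) because it sits after both the stream growth and object 16's longer URI, while every later object nets +3 again once object 17's own URI shrinks.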

+ 96 - 88
docs/hod_admin_guide.html

@@ -210,7 +210,7 @@ document.write("Last Published: " + document.lastModified);
 <a href="#Configuring+HOD">Configuring HOD</a>
 <ul class="minitoc">
 <li>
-<a href="#Minimal+Configuration+to+get+started">Minimal Configuration to get started</a>
+<a href="#Minimal+Configuration">Minimal Configuration</a>
 </li>
 <li>
 <a href="#Advanced+Configuration">Advanced Configuration</a>
@@ -224,7 +224,7 @@ document.write("Last Published: " + document.lastModified);
 <a href="#Supporting+Tools+and+Utilities">Supporting Tools and Utilities</a>
 <ul class="minitoc">
 <li>
-<a href="#logcondense.py+-+Tool+for+removing+log+files+uploaded+to+DFS">logcondense.py - Tool for removing log files uploaded to DFS</a>
+<a href="#logcondense.py+-+Manage+Log+Files">logcondense.py - Manage Log Files</a>
 <ul class="minitoc">
 <li>
 <a href="#Running+logcondense.py">Running logcondense.py</a>
@@ -235,7 +235,7 @@ document.write("Last Published: " + document.lastModified);
 </ul>
 </li>
 <li>
-<a href="#checklimits.sh+-+Tool+to+update+torque+comment+field+reflecting+resource+limits">checklimits.sh - Tool to update torque comment field reflecting resource limits</a>
+<a href="#checklimits.sh+-+Monitor+Resource+Limits">checklimits.sh - Monitor Resource Limits</a>
 <ul class="minitoc">
 <li>
 <a href="#Running+checklimits.sh">Running checklimits.sh</a>
@@ -251,7 +251,8 @@ document.write("Last Published: " + document.lastModified);
 <h2 class="h3">Overview</h2>
 <div class="section">
 <p>The Hadoop On Demand (HOD) project is a system for provisioning and
-managing independent Hadoop MapReduce and HDFS instances on a shared cluster 
+managing independent Hadoop Map/Reduce and Hadoop Distributed File System (HDFS)
+instances on a shared cluster 
 of nodes. HOD is a tool that makes it easy for administrators and users to 
 quickly setup and use Hadoop. It is also a very useful tool for Hadoop developers 
 and testers who need to share a physical cluster for testing their own Hadoop 
@@ -262,20 +263,20 @@ running Hadoop instances. At present it runs with the <a href="http://www.cluste
 resource manager</a>.
 </p>
 <p>
-The basic system architecture of HOD includes components from:</p>
+The basic system architecture of HOD includes these components:</p>
 <ul>
   
-<li>A Resource manager (possibly together with a scheduler),</li>
+<li>A Resource manager (possibly together with a scheduler)</li>
   
-<li>HOD components, and </li>
+<li>Various HOD components</li>
   
-<li>Hadoop Map/Reduce and HDFS daemons.</li>
+<li>Hadoop Map/Reduce and HDFS daemons</li>
 
 </ul>
 <p>
 HOD provisions and maintains Hadoop Map/Reduce and, optionally, HDFS instances 
 through interaction with the above components on a given cluster of nodes. A cluster of
-nodes can be thought of as comprising of two sets of nodes:</p>
+nodes can be thought of as comprising two sets of nodes:</p>
 <ul>
   
 <li>Submit nodes: Users use the HOD client on these nodes to allocate clusters, and then
@@ -291,22 +292,22 @@ running jobs on them.
 </p>
 <ul>
   
-<li>The user uses the HOD client on the Submit node to allocate a required number of
-cluster nodes, and provision Hadoop on them.</li>
+<li>The user uses the HOD client on the Submit node to allocate a desired number of
+cluster nodes and to provision Hadoop on them.</li>
   
-<li>The HOD client uses a Resource Manager interface, (qsub, in Torque), to submit a HOD
-process, called the RingMaster, as a Resource Manager job, requesting the user desired number 
-of nodes. This job is submitted to the central server of the Resource Manager (pbs_server, in Torque).</li>
+<li>The HOD client uses a resource manager interface (qsub, in Torque) to submit a HOD
+process, called the RingMaster, as a Resource Manager job, to request the user's desired number 
+of nodes. This job is submitted to the central server of the resource manager (pbs_server, in Torque).</li>
   
-<li>On the compute nodes, the resource manager slave daemons, (pbs_moms in Torque), accept
-and run jobs that they are given by the central server (pbs_server in Torque). The RingMaster 
+<li>On the compute nodes, the resource manager slave daemons (pbs_moms in Torque) accept
+and run jobs that they are assigned by the central server (pbs_server in Torque). The RingMaster 
 process is started on one of the compute nodes (mother superior, in Torque).</li>
   
-<li>The Ringmaster then uses another Resource Manager interface, (pbsdsh, in Torque), to run
+<li>The RingMaster then uses another resource manager interface (pbsdsh, in Torque) to run
 the second HOD component, HodRing, as distributed tasks on each of the compute
 nodes allocated.</li>
   
-<li>The Hodrings, after initializing, communicate with the Ringmaster to get Hadoop commands, 
+<li>The HodRings, after initializing, communicate with the RingMaster to get Hadoop commands, 
 and run them accordingly. Once the Hadoop commands are started, they register with the RingMaster,
 giving information about the daemons.</li>
   
@@ -317,18 +318,20 @@ some obtained from options given by user in its own configuration file.</li>
 JobTracker and HDFS daemons.</li>
 
 </ul>
-<p>The rest of the document deals with the steps needed to setup HOD on a physical cluster of nodes.</p>
+<p>The rest of this document describes how to setup HOD on a physical cluster of nodes.</p>
 </div>
 
 
 <a name="N10056"></a><a name="Pre-requisites"></a>
 <h2 class="h3">Pre-requisites</h2>
 <div class="section">
+<p>To use HOD, your system should include the following hardware and software
+components.</p>
 <p>Operating System: HOD is currently tested on RHEL4.<br>
-Nodes : HOD requires a minimum of 3 nodes configured through a resource manager.<br>
+Nodes : HOD requires a minimum of three nodes configured through a resource manager.<br>
 </p>
 <p> Software </p>
-<p>The following components are to be installed on *ALL* the nodes before using HOD:</p>
+<p>The following components must be installed on ALL nodes before using HOD:</p>
 <ul>
  
 <li>Torque: Resource manager</li>
@@ -337,7 +340,7 @@ Nodes : HOD requires a minimum of 3 nodes configured through a resource manager.
 <a href="http://www.python.org">Python</a> : HOD requires version 2.5.1 of Python.</li>
 
 </ul>
-<p>The following components can be optionally installed for getting better
+<p>The following components are optional and can be installed to obtain better
 functionality from HOD:</p>
 <ul>
  
@@ -361,7 +364,7 @@ nodes.
 </div>
 
 
-<a name="N1008A"></a><a name="Resource+Manager"></a>
+<a name="N1008D"></a><a name="Resource+Manager"></a>
 <h2 class="h3">Resource Manager</h2>
 <div class="section">
 <p>  Currently HOD works with the Torque resource manager, which it uses for its node
@@ -376,48 +379,49 @@ nodes.
   Users may wish to subscribe to TORQUE&rsquo;s mailing list or view the archive for questions,
   comments <a href="http://www.clusterresources.com/pages/resources/mailing-lists.php">here</a>.
 </p>
-<p>For using HOD with Torque:</p>
+<p>To use HOD with Torque:</p>
 <ul>
  
-<li>Install Torque components: pbs_server on one node(head node), pbs_mom on all
+<li>Install Torque components: pbs_server on one node (head node), pbs_mom on all
   compute nodes, and PBS client tools on all compute nodes and submit
-  nodes. Perform atleast a basic configuration so that the Torque system is up and
-  running i.e pbs_server knows which machines to talk to. Look <a href="http://www.clusterresources.com/wiki/doku.php?id=torque:1.2_basic_configuration">here</a>
+  nodes. Perform at least a basic configuration so that the Torque system is up and
+  running, that is, pbs_server knows which machines to talk to. Look <a href="http://www.clusterresources.com/wiki/doku.php?id=torque:1.2_basic_configuration">here</a>
   for basic configuration.
 
   For advanced configuration, see <a href="http://www.clusterresources.com/wiki/doku.php?id=torque:1.3_advanced_configuration">here</a>
 </li>
  
 <li>Create a queue for submitting jobs on the pbs_server. The name of the queue is the
-  same as the HOD configuration parameter, resource-manager.queue. The Hod client uses this queue to
-  submit the Ringmaster process as a Torque job.</li>
+  same as the HOD configuration parameter, resource-manager.queue. The HOD client uses this queue to
+  submit the RingMaster process as a Torque job.</li>
  
-<li>Specify a 'cluster name' as a 'property' for all nodes in the cluster.
-  This can be done by using the 'qmgr' command. For example:
-  qmgr -c "set node node properties=cluster-name". The name of the cluster is the same as
+<li>Specify a cluster name as a property for all nodes in the cluster.
+  This can be done by using the qmgr command. For example:
+  <span class="codefrag">qmgr -c "set node node properties=cluster-name"</span>. The name of the cluster is the same as
   the HOD configuration parameter, hod.cluster. </li>
  
-<li>Ensure that jobs can be submitted to the nodes. This can be done by
-  using the 'qsub' command. For example:
-  echo "sleep 30" | qsub -l nodes=3</li>
+<li>Make sure that jobs can be submitted to the nodes. This can be done by
+  using the qsub command. For example:
+  <span class="codefrag">echo "sleep 30" | qsub -l nodes=3</span>
+</li>
 
 </ul>
 </div>
 
 
-<a name="N100C4"></a><a name="Installing+HOD"></a>
+<a name="N100CC"></a><a name="Installing+HOD"></a>
 <h2 class="h3">Installing HOD</h2>
 <div class="section">
-<p>Now that the resource manager set up is done, we proceed on to obtaining and
-installing HOD.</p>
+<p>Once the resource manager is set up, you can obtain and
+install HOD.</p>
 <ul>
  
-<li>If you are getting HOD from the Hadoop tarball,it is available under the 
+<li>If you are getting HOD from the Hadoop tarball, it is available under the 
   'contrib' section of Hadoop, under the root  directory 'hod'.</li>
  
 <li>If you are building from source, you can run ant tar from the Hadoop root
-  directory, to generate the Hadoop tarball, and then pick HOD from there,
-  as described in the point above.</li>
+  directory to generate the Hadoop tarball, and then get HOD from there,
+  as described above.</li>
  
 <li>Distribute the files under this directory to all the nodes in the
   cluster. Note that the location where the files are copied should be
@@ -430,18 +434,21 @@ installing HOD.</p>
 </div>
 
 
-<a name="N100DD"></a><a name="Configuring+HOD"></a>
+<a name="N100E5"></a><a name="Configuring+HOD"></a>
 <h2 class="h3">Configuring HOD</h2>
 <div class="section">
-<p>After HOD installation is done, it has to be configured before we start using
-it.</p>
-<a name="N100E6"></a><a name="Minimal+Configuration+to+get+started"></a>
-<h3 class="h4">Minimal Configuration to get started</h3>
+<p>You can configure HOD once it is installed. The minimal configuration needed
+to run HOD is described below. More advanced configuration options are discussed
+in the HOD Configuration Guide.</p>
+<a name="N100EE"></a><a name="Minimal+Configuration"></a>
+<h3 class="h4">Minimal Configuration</h3>
+<p>To get started using HOD, the following minimal configuration is
+  required:</p>
 <ul>
  
-<li>On the node from where you want to run hod, edit the file hodrc
-  which can be found in the &lt;install dir&gt;/conf directory. This file
-  contains the minimal set of values required for running hod.</li>
+<li>On the node from where you want to run HOD, edit the file hodrc
+  located in the &lt;install dir&gt;/conf directory. This file
+  contains the minimal set of values required to run hod.</li>
  
 <li>
 
@@ -461,7 +468,7 @@ it.</p>
 <li>${HADOOP_HOME}: Location of Hadoop installation on the compute and
     submit nodes.</li>
    
-<li>${RM_QUEUE}: Queue configured for submiting jobs in the resource
+<li>${RM_QUEUE}: Queue configured for submitting jobs in the resource
     manager configuration.</li>
    
 <li>${RM_HOME}: Location of the resource manager installation on the
@@ -474,9 +481,9 @@ it.</p>
 
 <li>
 
-<p>The following environment variables *may* need to be set depending on
+<p>The following environment variables may need to be set depending on
   your environment. These variables must be defined where you run the
-  HOD client, and also be specified in the HOD configuration file as the
+  HOD client and must also be specified in the HOD configuration file as the
   value of the key resource_manager.env-vars. Multiple variables can be
   specified as a comma separated list of key=value pairs.</p>
 
@@ -484,7 +491,7 @@ it.</p>
 <ul>
    
 <li>HOD_PYTHON_HOME: If you install python to a non-default location
-    of the compute nodes, or submit nodes, then, this variable must be
+    of the compute nodes, or submit nodes, then this variable must be
     defined to point to the python executable in the non-standard
     location.</li>
     
@@ -493,47 +500,46 @@ it.</p>
 </li>
 
 </ul>
-<a name="N10117"></a><a name="Advanced+Configuration"></a>
+<a name="N10122"></a><a name="Advanced+Configuration"></a>
 <h3 class="h4">Advanced Configuration</h3>
-<p> You can review other configuration options in the file and modify them to suit
- your needs. Refer to the <a href="hod_config_guide.html">Configuration Guide</a> for information about the HOD
- configuration.
-    </p>
+<p> You can review and modify other configuration options to suit
+ your specific needs. Refer to the <a href="hod_config_guide.html">Configuration
+ Guide</a> for more information.</p>
 </div>
 
   
-<a name="N10126"></a><a name="Running+HOD"></a>
+<a name="N10131"></a><a name="Running+HOD"></a>
 <h2 class="h3">Running HOD</h2>
 <div class="section">
-<p>You can now proceed to <a href="hod_user_guide.html">HOD User Guide</a> for information about how to run HOD,
-    what are the various features, options and for help in trouble-shooting.</p>
+<p>You can run HOD once it is configured. Refer to <a href="hod_user_guide.html">the HOD User Guide</a> for more information.</p>
 </div>
 
   
-<a name="N10134"></a><a name="Supporting+Tools+and+Utilities"></a>
+<a name="N1013F"></a><a name="Supporting+Tools+and+Utilities"></a>
 <h2 class="h3">Supporting Tools and Utilities</h2>
 <div class="section">
-<p>This section describes certain supporting tools and utilities that can be used in managing HOD deployments.</p>
-<a name="N1013D"></a><a name="logcondense.py+-+Tool+for+removing+log+files+uploaded+to+DFS"></a>
-<h3 class="h4">logcondense.py - Tool for removing log files uploaded to DFS</h3>
-<p>As mentioned in 
-         <a href="hod_user_guide.html#Collecting+and+Viewing+Hadoop+Logs">this section</a> of the
-         <a href="hod_user_guide.html">HOD User Guide</a>, HOD can be configured to upload
+<p>This section describes supporting tools and utilities that can be used to
+    manage HOD deployments.</p>
+<a name="N10148"></a><a name="logcondense.py+-+Manage+Log+Files"></a>
+<h3 class="h4">logcondense.py - Manage Log Files</h3>
+<p>As mentioned in the 
+         <a href="hod_user_guide.html#Collecting+and+Viewing+Hadoop+Logs">HOD User Guide</a>,
+         HOD can be configured to upload
          Hadoop logs to a statically configured HDFS. Over time, the number of logs uploaded
-         to DFS could increase. logcondense.py is a tool that helps administrators to clean-up
-         the log files older than a certain number of days. </p>
-<a name="N1014E"></a><a name="Running+logcondense.py"></a>
+         to HDFS could increase. logcondense.py is a tool that helps
+         administrators to remove log files uploaded to HDFS. </p>
+<a name="N10155"></a><a name="Running+logcondense.py"></a>
 <h4>Running logcondense.py</h4>
 <p>logcondense.py is available under hod_install_location/support folder. You can either
-        run it using python, for e.g. <em>python logcondense.py</em>, or give execute permissions 
+        run it using python, for example, <em>python logcondense.py</em>, or give execute permissions 
         to the file, and directly run it as <em>logcondense.py</em>. logcondense.py needs to be 
         run by a user who has sufficient permissions to remove files from locations where log 
-        files are uploaded in the DFS, if permissions are enabled. For e.g. as mentioned in the
+        files are uploaded in the HDFS, if permissions are enabled. For example as mentioned in the
         <a href="hod_config_guide.html#3.7+hodring+options">configuration guide</a>, the logs could
         be configured to come under the user's home directory in HDFS. In that case, the user
         running logcondense.py should have super user privileges to remove the files from under
         all user home directories.</p>
-<a name="N10162"></a><a name="Command+Line+Options+for+logcondense.py"></a>
+<a name="N10169"></a><a name="Command+Line+Options+for+logcondense.py"></a>
 <h4>Command Line Options for logcondense.py</h4>
 <p>The following command line options are supported for logcondense.py.</p>
 <table class="ForrestTable" cellspacing="1" cellpadding="4">
@@ -593,8 +599,9 @@ it.</p>
               <td colspan="1" rowspan="1">--dynamicdfs</td>
               <td colspan="1" rowspan="1">If true, this will indicate that the logcondense.py script should delete HDFS logs
               in addition to Map/Reduce logs. Otherwise, it only deletes Map/Reduce logs, which is also the
-              default if this option is not specified. This option is useful if dynamic DFS installations 
-              are being provisioned by HOD, and the static DFS installation is being used only to collect 
+              default if this option is not specified. This option is useful if
+              dynamic HDFS installations 
+              are being provisioned by HOD, and the static HDFS installation is being used only to collect 
               logs - a scenario that may be common in test clusters.</td>
               <td colspan="1" rowspan="1">false</td>
             
@@ -606,33 +613,34 @@ it.</p>
 <p>
 <em>python logcondense.py -p ~/hadoop-0.17.0/bin/hadoop -d 7 -c ~/hadoop-conf -l /user</em>
 </p>
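The -d option above gives the retention period in days. As an illustration only (hypothetical helper names; the real logcondense.py also walks HDFS via the hadoop binary given with -p), the age check at the heart of such a script can be sketched as:

```python
import time

SECONDS_PER_DAY = 24 * 60 * 60

def is_older_than(mtime_epoch, days, now=None):
    """Return True if a log's modification time is more than `days` days old."""
    now = time.time() if now is None else now
    return (now - mtime_epoch) > days * SECONDS_PER_DAY

def select_logs_to_delete(logs, days, now):
    """logs: list of (path, mtime_epoch); keep only entries older than `days` days."""
    return [path for path, mtime in logs if is_older_than(mtime, days, now)]

# With a 7-day cutoff, only the 10-day-old log is selected for deletion.
now = 1_000_000_000
logs = [("/user/alice/hod-logs/1", now - 10 * SECONDS_PER_DAY),
        ("/user/alice/hod-logs/2", now - 2 * SECONDS_PER_DAY)]
print(select_logs_to_delete(logs, 7, now))
```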
-<a name="N10205"></a><a name="checklimits.sh+-+Tool+to+update+torque+comment+field+reflecting+resource+limits"></a>
-<h3 class="h4">checklimits.sh - Tool to update torque comment field reflecting resource limits</h3>
-<p>checklimits is a HOD tool specific to Torque/Maui environment
+<a name="N1020C"></a><a name="checklimits.sh+-+Monitor+Resource+Limits"></a>
+<h3 class="h4">checklimits.sh - Monitor Resource Limits</h3>
+<p>checklimits.sh is a HOD tool specific to the Torque/Maui environment
       (<a href="http://www.clusterresources.com/pages/products/maui-cluster-scheduler.php">Maui Cluster Scheduler</a> is an open source job
       scheduler for clusters and supercomputers, from clusterresources). The
       checklimits.sh script
-      updates torque comment field when newly submitted job(s) violate/cross
-      over user limits set up in Maui scheduler. It uses qstat, does one pass
-      over torque job list to find out queued or unfinished jobs, runs Maui
+      updates the torque comment field when newly submitted job(s) violate or
+      exceed the user limits set up in the Maui scheduler. It uses qstat, does one
+      pass over the torque job list to determine queued or unfinished jobs, runs the Maui
       tool checkjob on each job to see if user limits are violated and then
       runs torque's qalter utility to update job attribute 'comment'. Currently
       it updates the comment as <em>User-limits exceeded. Requested:([0-9]*)
       Used:([0-9]*) MaxLimit:([0-9]*)</em> for those jobs that violate limits.
       This comment field is then used by HOD to behave accordingly depending on
       the type of violation.</p>
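The comment string above has a fixed shape, so a consumer can recover the three numbers with the same regular expression. A minimal sketch of such parsing (illustrative; not HOD's actual code):

```python
import re

# The comment format written by checklimits.sh, per the documentation above.
COMMENT_PATTERN = re.compile(
    r"User-limits exceeded\. Requested:([0-9]*) Used:([0-9]*) MaxLimit:([0-9]*)"
)

def parse_limit_comment(comment):
    """Extract (requested, used, max_limit) from a checklimits comment, or None."""
    m = COMMENT_PATTERN.search(comment)
    if not m:
        return None
    return tuple(int(g) for g in m.groups())

print(parse_limit_comment("User-limits exceeded. Requested:10 Used:25 MaxLimit:30"))
```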
-<a name="N10215"></a><a name="Running+checklimits.sh"></a>
+<a name="N1021C"></a><a name="Running+checklimits.sh"></a>
 <h4>Running checklimits.sh</h4>
-<p>checklimits.sh is available under hod_install_location/support
-        folder. This is a shell script and can be run directly as <em>sh
+<p>checklimits.sh is available under the hod_install_location/support
+        folder. This shell script can be run directly as <em>sh
         checklimits.sh </em>or as <em>./checklimits.sh</em> after enabling
         execute permissions. Torque and Maui binaries should be available
         on the machine where the tool is run and should be in the path
-        of the shell script process. In order for this tool to be able to update
-        comment field of jobs from different users, it has to be run with
-        torque administrative privileges. This tool has to be run repeatedly
-        after specific intervals of time to frequently update jobs violating
-        constraints, for e.g. via cron. Please note that the resource manager
+        of the shell script process. To update the
+        comment field of jobs from different users, this tool must be run with
+        torque administrative privileges. This tool must be run at regular
+        intervals, for example via cron, so that the comments of jobs violating
+        constraints stay up to date. Note that the resource manager
         and scheduler commands used in this script can be expensive and so
         it is better not to run this inside a tight loop without sleeping.</p>
 </div>

File diff suppressed because it is too large
+ 6 - 6
docs/hod_admin_guide.pdf


+ 43 - 43
docs/hod_config_guide.html

@@ -201,7 +201,7 @@ document.write("Last Published: " + document.lastModified);
 <a href="#2.+Sections">2. Sections</a>
 </li>
 <li>
-<a href="#3.+Important+%2F+Commonly+Used+Configuration+Options">3. Important / Commonly Used Configuration Options</a>
+<a href="#3.+HOD+Configuration+Options">3. HOD Configuration Options</a>
 <ul class="minitoc">
 <li>
 <a href="#3.1+Common+configuration+options">3.1 Common configuration options</a>
@@ -232,29 +232,28 @@ document.write("Last Published: " + document.lastModified);
 <a name="N1000C"></a><a name="1.+Introduction"></a>
 <h2 class="h3">1. Introduction</h2>
 <div class="section">
-<p>Configuration options for HOD are organized as sections and options 
-      within them. They can be specified in two ways: a configuration file 
+<p>This document explains some of the most important and commonly used 
+      Hadoop On Demand (HOD) configuration options. Configuration options 
+      can be specified in two ways: a configuration file 
       in the INI format, and as command line options to the HOD shell, 
       specified in the format --section.option[=value]. If the same option is 
       specified in both places, the value specified on the command line 
       overrides the value in the configuration file.</p>
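The precedence rule just stated (a --section.option=value on the command line wins over the hodrc file) can be sketched as follows. The [hod] section and debug option are real HOD names from this guide; the merge helper itself is illustrative, not HOD's implementation:

```python
import configparser

def effective_value(ini_text, section, option, cli_overrides):
    """Return the option value, letting a command-line --section.option=value
    override the value from the INI-format configuration file."""
    cfg = configparser.ConfigParser()
    cfg.read_string(ini_text)
    key = "%s.%s" % (section, option)
    if key in cli_overrides:          # command line wins
        return cli_overrides[key]
    return cfg.get(section, option)

hodrc = "[hod]\ndebug = 3\n"
print(effective_value(hodrc, "hod", "debug", {}))                  # file value
print(effective_value(hodrc, "hod", "debug", {"hod.debug": "4"}))  # override
```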
 <p>
-        To get a simple description of all configuration options, you can type
+        To get a simple description of all configuration options, type:
       </p>
 <table class="ForrestTable" cellspacing="1" cellpadding="4">
 <tr>
 <td colspan="1" rowspan="1"><span class="codefrag">$ hod --verbose-help</span></td>
 </tr>
 </table>
-<p>This document explains some of the most important or commonly used
-      configuration options in some more detail.</p>
 </div>
     
     
-<a name="N10024"></a><a name="2.+Sections"></a>
+<a name="N10021"></a><a name="2.+Sections"></a>
 <h2 class="h3">2. Sections</h2>
 <div class="section">
-<p>The following are the various sections in the HOD configuration:</p>
+<p>HOD organizes configuration options into these sections:</p>
 <ul>
         
 <li>  hod:                  Options for the HOD client</li>
@@ -266,20 +265,21 @@ document.write("Last Published: " + document.lastModified);
         
 <li>  hodring:              Options for the HodRing processes</li>
         
-<li>  gridservice-mapred:   Options for the MapReduce daemons</li>
+<li>  gridservice-mapred:   Options for the Map/Reduce daemons</li>
         
 <li>  gridservice-hdfs:     Options for the HDFS daemons.</li>
       
 </ul>
-<p>The next section deals with some of the important options in the HOD 
-        configuration.</p>
 </div>
     
     
-<a name="N10046"></a><a name="3.+Important+%2F+Commonly+Used+Configuration+Options"></a>
-<h2 class="h3">3. Important / Commonly Used Configuration Options</h2>
+<a name="N10040"></a><a name="3.+HOD+Configuration+Options"></a>
+<h2 class="h3">3. HOD Configuration Options</h2>
 <div class="section">
-<a name="N1004C"></a><a name="3.1+Common+configuration+options"></a>
+<p>The following section describes configuration options common to most 
+      HOD sections, followed by sections that describe options 
+      specific to each HOD section.</p>
+<a name="N10049"></a><a name="3.1+Common+configuration+options"></a>
 <h3 class="h4">3.1 Common configuration options</h3>
 <p>Certain configuration options are defined in most of the sections of 
         the HOD configuration. Options defined in a section, are used by the
@@ -293,7 +293,7 @@ document.write("Last Published: " + document.lastModified);
                       directories under the directory specified here.</li>
           
           
-<li>debug: A numeric value from 1-4. 4 produces the most log information,
+<li>debug: Numeric value from 1-4. 4 produces the most log information,
                    and 1 the least.</li>
           
           
@@ -303,11 +303,11 @@ document.write("Last Published: " + document.lastModified);
           </li>
           
           
-<li>xrs-port-range: A range of ports, among which an available port shall
+<li>xrs-port-range: Range of ports, among which an available port shall
                             be picked for use to run an XML-RPC server.</li>
           
           
-<li>http-port-range: A range of ports, among which an available port shall
+<li>http-port-range: Range of ports, among which an available port shall
                              be picked for use to run an HTTP server.</li>
           
           
@@ -319,21 +319,21 @@ document.write("Last Published: " + document.lastModified);
                               
         
 </ul>
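As an illustration, the common options above might appear in a hodrc file like this (the option names are from this guide; every value is a placeholder for your environment, not a recommendation):

```ini
[hod]
temp-dir        = /tmp/hod
debug           = 3
xrs-port-range  = 32768-65536
http-port-range = 8000-9000
```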
-<a name="N1006E"></a><a name="3.2+hod+options"></a>
+<a name="N1006B"></a><a name="3.2+hod+options"></a>
 <h3 class="h4">3.2 hod options</h3>
 <ul>
           
-<li>cluster: A descriptive name given to the cluster. For Torque, this is
+<li>cluster: Descriptive name given to the cluster. For Torque, this is
                      specified as a 'Node property' for every node in the cluster.
                      HOD uses this value to compute the number of available nodes.</li>
           
           
-<li>client-params: A comma-separated list of hadoop config parameters
+<li>client-params: Comma-separated list of hadoop config parameters
                            specified as key-value pairs. These will be used to
                            generate a hadoop-site.xml on the submit node that 
-                           should be used for running MapReduce jobs.</li>
+                           should be used for running Map/Reduce jobs.</li>
           
-<li>job-feasibility-attr: A regular expression string that specifies
+<li>job-feasibility-attr: Regular expression string that specifies
                            whether and how to check job feasibility - resource
                            manager or scheduler limits. The current
                            implementation corresponds to the torque job
@@ -342,20 +342,20 @@ document.write("Last Published: " + document.lastModified);
                            of limit violation is triggered and either
                            deallocates the cluster or stays in queued state
                            according as the request is beyond maximum limits or
-                           the cumulative usage has crossed maxumum limits. 
+                           the cumulative usage has crossed maximum limits. 
                            The torque comment attribute may be updated
-                           periodically by an external mechanism. For e.g.,
+                           periodically by an external mechanism. For example,
-                           comment attribute can be updated by running <a href="hod_admin_guide.html#checklimits.sh+-+Tool+to+update+torque+comment+field+reflecting+resource+limits">
+                           comment attribute can be updated by running the <a href="hod_admin_guide.html#checklimits.sh+-+Monitor+Resource+Limits">
                            checklimits.sh</a> script in hod/support directory,
                            and then setting job-feasibility-attr equal to the
-                           value TORQUE_USER_LIMITS_COMMENT_FIELD i.e
+                           value TORQUE_USER_LIMITS_COMMENT_FIELD,
                            "User-limits exceeded. Requested:([0-9]*)
-                           Used:([0-9]*) MaxLimit:([0-9]*)" will make HOD
+                           Used:([0-9]*) MaxLimit:([0-9]*)", will make HOD
                            behave accordingly.
                            </li>
          
 </ul>
-<a name="N10085"></a><a name="3.3+resource_manager+options"></a>
+<a name="N10082"></a><a name="3.3+resource_manager+options"></a>
 <h3 class="h4">3.3 resource_manager options</h3>
 <ul>
           
@@ -368,7 +368,7 @@ document.write("Last Published: " + document.lastModified);
                         found.</li> 
           
           
-<li>env-vars: This is a comma separated list of key-value pairs, 
+<li>env-vars: Comma-separated list of key-value pairs, 
                       expressed as key=value, which would be passed to the jobs 
                       launched on the compute nodes. 
                       For example, if the python installation is 
@@ -378,23 +378,23 @@ document.write("Last Published: " + document.lastModified);
                       can then use this variable.</li>
         
 </ul>
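The env-vars value is a comma-separated list of key=value pairs, as described above. A sketch of how such a value can be parsed into an environment mapping (the helper name is illustrative; HOD's own parsing may differ):

```python
def parse_env_vars(spec):
    """Parse a 'k1=v1,k2=v2' string into a dict, skipping empty entries."""
    result = {}
    for pair in spec.split(","):
        pair = pair.strip()
        if not pair:
            continue
        key, _, value = pair.partition("=")
        result[key.strip()] = value.strip()
    return result

# HOD_PYTHON_HOME is the variable mentioned later in this documentation;
# the paths here are hypothetical.
print(parse_env_vars("HOD_PYTHON_HOME=/foo/bin/python,JAVA_HOME=/usr/java"))
```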
-<a name="N10098"></a><a name="3.4+ringmaster+options"></a>
+<a name="N10095"></a><a name="3.4+ringmaster+options"></a>
 <h3 class="h4">3.4 ringmaster options</h3>
 <ul>
           
-<li>work-dirs: These are a list of comma separated paths that will serve
+<li>work-dirs: Comma-separated list of paths that will serve
                        as the root for directories that HOD generates and passes
-                       to Hadoop for use to store DFS / MapReduce data. For e.g.
+                       to Hadoop for use to store DFS and Map/Reduce data. For example,
                        this is where DFS data blocks will be stored. Typically,
                        as many paths are specified as there are disks available
                        to ensure all disks are being utilized. The restrictions
                        and notes for the temp-dir variable apply here too.</li>
           
-<li>max-master-failures: It defines how many times a hadoop master
+<li>max-master-failures: Number of times a hadoop master
                        daemon can fail to launch, beyond which HOD will fail
                        the cluster allocation altogether. In HOD clusters,
                        sometimes there might be a single or few "bad" nodes due
-                       to issues like missing java, missing/incorrect version
+                       to issues like missing java, missing or incorrect version
                        of Hadoop etc. When this configuration variable is set
                        to a positive integer, the RingMaster returns an error
                        to the client only when the number of times a hadoop
@@ -408,11 +408,11 @@ document.write("Last Published: " + document.lastModified);
                        </li>
         
 </ul>
-<a name="N100A8"></a><a name="3.5+gridservice-hdfs+options"></a>
+<a name="N100A5"></a><a name="3.5+gridservice-hdfs+options"></a>
 <h3 class="h4">3.5 gridservice-hdfs options</h3>
 <ul>
           
-<li>external: If false, this indicates that a HDFS cluster must be 
-                      bought up by the HOD system, on the nodes which it 
+<li>external: If false, indicates that an HDFS cluster must be 
+                      brought up by the HOD system, on the nodes which it 
                       allocates via the allocate command. Note that in that case,
                       when the cluster is de-allocated, it will bring down the 
@@ -440,7 +440,7 @@ document.write("Last Published: " + document.lastModified);
                   Hadoop on the cluster.</li>
           
           
-<li>server-params: A comma-separated list of hadoop config parameters
-                           specified key-value pairs. These will be used to
+<li>server-params: Comma-separated list of hadoop config parameters
+                           specified as key-value pairs. These will be used to
                            generate a hadoop-site.xml that will be used by the
                            NameNode and DataNodes.</li>
@@ -449,15 +449,15 @@ document.write("Last Published: " + document.lastModified);
 <li>final-server-params: Same as above, except they will be marked final.</li>
         
 </ul>
-<a name="N100C7"></a><a name="3.6+gridservice-mapred+options"></a>
+<a name="N100C4"></a><a name="3.6+gridservice-mapred+options"></a>
 <h3 class="h4">3.6 gridservice-mapred options</h3>
 <ul>
           
-<li>external: If false, this indicates that a MapReduce cluster must be
-                      bought up by the HOD system on the nodes which it allocates
-                      via the allocate command.
-                      If true, if will try and connect to an externally 
+<li>external: If false, indicates that a Map/Reduce cluster must be
+                      brought up by the HOD system on the nodes which it allocates
+                      via the allocate command.
+                      If true, it will try to connect to an externally 
-                      configured MapReduce system.</li>
+                      configured Map/Reduce system.</li>
           
           
 <li>host: Hostname of the externally configured JobTracker, if any</li>
@@ -473,7 +473,7 @@ document.write("Last Published: " + document.lastModified);
                   located</li>
           
           
-<li>server-params: A comma-separated list of hadoop config parameters
-                           specified key-value pairs. These will be used to
+<li>server-params: Comma-separated list of hadoop config parameters
+                           specified as key-value pairs. These will be used to
                            generate a hadoop-site.xml that will be used by the
                            JobTracker and TaskTrackers</li>
@@ -482,7 +482,7 @@ document.write("Last Published: " + document.lastModified);
 <li>final-server-params: Same as above, except they will be marked final.</li>
         
 </ul>
-<a name="N100E6"></a><a name="3.7+hodring+options"></a>
+<a name="N100E3"></a><a name="3.7+hodring+options"></a>
 <h3 class="h4">3.7 hodring options</h3>
 <ul>
           
@@ -505,8 +505,8 @@ document.write("Last Published: " + document.lastModified);
                                    cluster node's local file path, use the format 'file://path'.
 
                                    When clusters are deallocated by HOD, the hadoop logs will
-                                   be deleted as part of HOD's cleanup process. In order to
-                                   persist these logs, you can use this configuration option.
+                                   be deleted as part of HOD's cleanup process. To ensure these
+                                   logs persist, you can use this configuration option.
 
                                    The format of the path is 
                                    value-of-this-option/userid/hod-logs/cluster-id
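The upload path format above (value-of-this-option/userid/hod-logs/cluster-id) can be illustrated as follows; the destination URI, user, and cluster id here are hypothetical examples:

```python
import posixpath

def hod_log_destination(log_destination_uri, userid, cluster_id):
    """Build the upload path for a cluster's gzipped hadoop logs, following
    the value-of-this-option/userid/hod-logs/cluster-id format."""
    return posixpath.join(log_destination_uri, userid, "hod-logs", cluster_id)

print(hod_log_destination("hdfs://namenode:50000/user", "alice",
                          "123.torque.example.com"))
```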

File diff suppressed because it is too large
+ 4 - 4
docs/hod_config_guide.pdf


+ 4 - 4
docs/hod_user_guide.html

@@ -290,7 +290,7 @@ document.write("Last Published: " + document.lastModified);
 <a name="Introduction" id="Introduction"></a>
 <p>Hadoop On Demand (HOD) is a system for provisioning virtual Hadoop clusters over a large physical cluster. It uses the Torque resource manager to do node allocation. On the allocated nodes, it can start Hadoop Map/Reduce and HDFS daemons. It automatically generates the appropriate configuration files (hadoop-site.xml) for the Hadoop daemons and client. HOD also has the capability to distribute Hadoop to the nodes in the virtual cluster that it allocates. In short, HOD makes it easy for administrators and users to quickly setup and use Hadoop. It is also a very useful tool for Hadoop developers and testers who need to share a physical cluster for testing their own Hadoop versions.</p>
 <p>HOD supports Hadoop from version 0.15 onwards.</p>
-<p>The rest of the documentation comprises of a quick-start guide that helps you get quickly started with using HOD, a more detailed guide of all HOD features, command line options, known issues and trouble-shooting information.</p>
+<p>The rest of this document comprises a quick-start guide that helps you get started quickly with HOD, a more detailed guide to all HOD features, and a troubleshooting section.</p>
 </div>
   
 <a name="N1001E"></a><a name="Getting+Started+Using+HOD"></a>
@@ -470,7 +470,7 @@ document.write("Last Published: " + document.lastModified);
 <strong> Operation <em>list</em></strong>
 </p>
 <a name="Operation_list" id="Operation_list"></a>
-<p>The list operation lists all the clusters allocated so far by a user. The cluster directory where the hadoop-site.xml is stored for the cluster, and it's status vis-a-vis connectivity with the JobTracker and/or HDFS is shown. The list operation has the following syntax:</p>
+<p>The list operation lists all the clusters allocated so far by a user. For each cluster, it shows the cluster directory where the hadoop-site.xml is stored, and its status vis-a-vis connectivity with the JobTracker and/or HDFS. The list operation has the following syntax:</p>
 <table class="ForrestTable" cellspacing="1" cellpadding="4">
       
         
@@ -681,7 +681,7 @@ document.write("Last Published: " + document.lastModified);
 </tr>
     
 </table>
-<p>Under the root directory specified above in the path, HOD will create a create a path user_name/torque_jobid and store gzipped log files for each node that was part of the job.</p>
+<p>Under the root directory specified above in the path, HOD will create a path user_name/torque_jobid and store gzipped log files for each node that was part of the job.</p>
 <p>Note that to store the files to HDFS, you may need to configure the <span class="codefrag">hodring.pkgs</span> option with the Hadoop version that matches the HDFS mentioned. If not, HOD will try to use the Hadoop version that it is using to provision the Hadoop cluster itself.</p>
 <a name="N10359"></a><a name="Auto-deallocation+of+Idle+Clusters"></a>
 <h3 class="h4"> Auto-deallocation of Idle Clusters </h3>
@@ -716,7 +716,7 @@ document.write("Last Published: " + document.lastModified);
     
 </table>
 <p>
-<em>Note:</em> Due to restriction in the underlying Torque resource manager, names which do not start with a alphabet or contain a 'space' will cause the job to fail. The failure message points to the problem being in the specified job name.</p>
+<em>Note:</em> Due to a restriction in the underlying Torque resource manager, names that do not start with an alphabetic character or that contain a 'space' will cause the job to fail. The failure message points to the problem being in the specified job name.</p>
 <a name="N103A6"></a><a name="Capturing+HOD+exit+codes+in+Torque"></a>
 <h3 class="h4"> Capturing HOD exit codes in Torque </h3>
 <a name="Capturing_HOD_exit_codes_in_Torq" id="Capturing_HOD_exit_codes_in_Torq"></a>

File diff suppressed because it is too large
+ 1 - 1
docs/hod_user_guide.pdf


+ 3 - 0
src/contrib/hod/CHANGES.txt

@@ -62,6 +62,9 @@ Release 0.18.0 - Unreleased
     script.exitcode file when a cluster directory is specified as a relative
     path. (Vinod Kumar Vavilapalli via yhemanth)
 
+    HADOOP-3668. Makes editorial changes to HOD documentation.
+    (Vinod Kumar Vavilapalli via yhemanth)
+
 Release 0.17.0 - 2008-05-18
 
   INCOMPATIBLE CHANGES

+ 3 - 3
src/docs/src/documentation/content/xdocs/hod.xml

@@ -38,9 +38,9 @@ Hadoop On Demand (HOD) is a system for provisioning virtual Hadoop clusters over
         <title>Documentation</title>
       <p>Please go through the following to know more about using HOD</p>
       <ul>
-        <li><a href="hod_admin_guide.html">Hod Admin Guide</a> : This guide will walk you through an overview of architecture of HOD, prerequisites, installing various components and dependent software, and configuring HOD to get it up and running.</li>
-        <li><a href="hod_user_guide.html">Hod User Guide</a> : This guide will let you know about how to get started on running hod, its various features, command line options and help on troubleshooting in detail.</li>
-        <li><a href="hod_config_guide.html">Hod Configuration Guide</a> : This guide discusses about onfiguring HOD, describing various configuration sections, parameters and their purpose in detail.</li>
+        <li><a href="hod_admin_guide.html">HOD Admin Guide</a> : This guide provides an overview of the HOD architecture, Torque resource manager, and various support tools and utilities, and shows you how to install, configure, and run HOD.</li>
+        <li><a href="hod_config_guide.html">HOD Configuration Guide</a> : This guide discusses HOD configuration sections and shows you how to work with the most important and commonly used configuration options.</li>
+        <li><a href="hod_user_guide.html">HOD User Guide</a> : This guide shows you how to get started using HOD, reviews various HOD features and command line options, and provides detailed troubleshooting help.</li>
       </ul>
     </section>
   </body>

+ 81 - 74
src/docs/src/documentation/content/xdocs/hod_admin_guide.xml

@@ -17,7 +17,8 @@
 <title>Overview</title>
 
 <p>The Hadoop On Demand (HOD) project is a system for provisioning and
-managing independent Hadoop MapReduce and HDFS instances on a shared cluster 
+managing independent Hadoop Map/Reduce and Hadoop Distributed File System (HDFS)
+instances on a shared cluster 
 of nodes. HOD is a tool that makes it easy for administrators and users to 
 quickly setup and use Hadoop. It is also a very useful tool for Hadoop developers 
 and testers who need to share a physical cluster for testing their own Hadoop 
@@ -30,17 +31,17 @@ resource manager</a>.
 </p>
 
 <p>
-The basic system architecture of HOD includes components from:</p>
+The basic system architecture of HOD includes these components:</p>
 <ul>
-  <li>A Resource manager (possibly together with a scheduler),</li>
-  <li>HOD components, and </li>
-  <li>Hadoop Map/Reduce and HDFS daemons.</li>
+  <li>A resource manager (possibly together with a scheduler)</li>
+  <li>Various HOD components</li>
+  <li>Hadoop Map/Reduce and HDFS daemons</li>
 </ul>
 
 <p>
 HOD provisions and maintains Hadoop Map/Reduce and, optionally, HDFS instances 
 through interaction with the above components on a given cluster of nodes. A cluster of
-nodes can be thought of as comprising of two sets of nodes:</p>
+nodes can be thought of as comprising two sets of nodes:</p>
 <ul>
   <li>Submit nodes: Users use the HOD client on these nodes to allocate clusters, and then
 use the Hadoop client to submit Hadoop jobs. </li>
@@ -54,18 +55,18 @@ running jobs on them.
 </p>
 
 <ul>
-  <li>The user uses the HOD client on the Submit node to allocate a required number of
-cluster nodes, and provision Hadoop on them.</li>
-  <li>The HOD client uses a Resource Manager interface, (qsub, in Torque), to submit a HOD
-process, called the RingMaster, as a Resource Manager job, requesting the user desired number 
-of nodes. This job is submitted to the central server of the Resource Manager (pbs_server, in Torque).</li>
-  <li>On the compute nodes, the resource manager slave daemons, (pbs_moms in Torque), accept
-and run jobs that they are given by the central server (pbs_server in Torque). The RingMaster 
+  <li>The user uses the HOD client on the Submit node to allocate a desired number of
+cluster nodes and to provision Hadoop on them.</li>
+  <li>The HOD client uses a resource manager interface (qsub, in Torque) to submit a HOD
+process, called the RingMaster, as a resource manager job, to request the user's desired number 
+of nodes. This job is submitted to the central server of the resource manager (pbs_server, in Torque).</li>
+  <li>On the compute nodes, the resource manager slave daemons (pbs_moms in Torque) accept
+and run jobs that they are assigned by the central server (pbs_server in Torque). The RingMaster 
 process is started on one of the compute nodes (mother superior, in Torque).</li>
-  <li>The Ringmaster then uses another Resource Manager interface, (pbsdsh, in Torque), to run
+  <li>The RingMaster then uses another resource manager interface (pbsdsh, in Torque) to run
 the second HOD component, HodRing, as distributed tasks on each of the compute
 nodes allocated.</li>
-  <li>The Hodrings, after initializing, communicate with the Ringmaster to get Hadoop commands, 
+  <li>The HodRings, after initializing, communicate with the RingMaster to get Hadoop commands, 
 and run them accordingly. Once the Hadoop commands are started, they register with the RingMaster,
 giving information about the daemons.</li>
   <li>All the configuration files needed for Hadoop instances are generated by HOD itself, 
@@ -74,24 +75,25 @@ some obtained from options given by user in its own configuration file.</li>
 JobTracker and HDFS daemons.</li>
 </ul>
 
-<p>The rest of the document deals with the steps needed to setup HOD on a physical cluster of nodes.</p>
+<p>The rest of this document describes how to set up HOD on a physical cluster of nodes.</p>
 
 </section>
 
 <section>
 <title>Pre-requisites</title>
-
+<p>To use HOD, your system should include the following hardware and software
+components.</p>
 <p>Operating System: HOD is currently tested on RHEL4.<br/>
-Nodes : HOD requires a minimum of 3 nodes configured through a resource manager.<br/></p>
+Nodes: HOD requires a minimum of three nodes configured through a resource manager.<br/></p>
 
 <p> Software </p>
-<p>The following components are to be installed on *ALL* the nodes before using HOD:</p>
+<p>The following components must be installed on ALL nodes before using HOD:</p>
 <ul>
  <li>Torque: Resource manager</li>
  <li><a href="ext:hod/python">Python</a> : HOD requires version 2.5.1 of Python.</li>
 </ul>
 
-<p>The following components can be optionally installed for getting better
+<p>The following components are optional and can be installed to obtain better
 functionality from HOD:</p>
 <ul>
  <li><a href="ext:hod/twisted-python">Twisted Python</a>: This can be
@@ -129,27 +131,27 @@ nodes.
   href="ext:hod/torque-mailing-list">here</a>.
 </p>
 
-<p>For using HOD with Torque:</p>
+<p>To use HOD with Torque:</p>
 <ul>
- <li>Install Torque components: pbs_server on one node(head node), pbs_mom on all
+ <li>Install Torque components: pbs_server on one node (head node), pbs_mom on all
   compute nodes, and PBS client tools on all compute nodes and submit
-  nodes. Perform atleast a basic configuration so that the Torque system is up and
-  running i.e pbs_server knows which machines to talk to. Look <a
+  nodes. Perform at least a basic configuration so that the Torque system is up and
+  running, that is, pbs_server knows which machines to talk to. Look <a
   href="ext:hod/torque-basic-config">here</a>
   for basic configuration.
 
   For advanced configuration, see <a
   href="ext:hod/torque-advanced-config">here</a></li>
  <li>Create a queue for submitting jobs on the pbs_server. The name of the queue is the
-  same as the HOD configuration parameter, resource-manager.queue. The Hod client uses this queue to
-  submit the Ringmaster process as a Torque job.</li>
- <li>Specify a 'cluster name' as a 'property' for all nodes in the cluster.
-  This can be done by using the 'qmgr' command. For example:
-  qmgr -c "set node node properties=cluster-name". The name of the cluster is the same as
+  same as the HOD configuration parameter, resource-manager.queue. The HOD client uses this queue to
+  submit the RingMaster process as a Torque job.</li>
+ <li>Specify a cluster name as a property for all nodes in the cluster.
+  This can be done by using the qmgr command. For example:
+  <code>qmgr -c "set node node properties=cluster-name"</code>. The name of the cluster is the same as
   the HOD configuration parameter, hod.cluster. </li>
- <li>Ensure that jobs can be submitted to the nodes. This can be done by
-  using the 'qsub' command. For example:
-  echo "sleep 30" | qsub -l nodes=3</li>
+ <li>Make sure that jobs can be submitted to the nodes. This can be done by
+  using the qsub command. For example:
+  <code>echo "sleep 30" | qsub -l nodes=3</code></li>
 </ul>
 
 </section>
@@ -157,14 +159,14 @@ nodes.
 <section>
 <title>Installing HOD</title>
 
-<p>Now that the resource manager set up is done, we proceed on to obtaining and
-installing HOD.</p>
+<p>Once the resource manager is set up, you can obtain and
+install HOD.</p>
 <ul>
- <li>If you are getting HOD from the Hadoop tarball,it is available under the 
+ <li>If you are getting HOD from the Hadoop tarball, it is available under the 
   'contrib' section of Hadoop, under the root  directory 'hod'.</li>
  <li>If you are building from source, you can run ant tar from the Hadoop root
-  directory, to generate the Hadoop tarball, and then pick HOD from there,
-  as described in the point above.</li>
+  directory to generate the Hadoop tarball, and then get HOD from there,
+  as described above.</li>
  <li>Distribute the files under this directory to all the nodes in the
   cluster. Note that the location where the files are copied should be
   the same on all the nodes.</li>
@@ -176,14 +178,17 @@ installing HOD.</p>
 <section>
 <title>Configuring HOD</title>
 
-<p>After HOD installation is done, it has to be configured before we start using
-it.</p>
+<p>You can configure HOD once it is installed. The minimal configuration needed
+to run HOD is described below. More advanced configuration options are discussed
+in the HOD Configuration Guide.</p>
 <section>
-  <title>Minimal Configuration to get started</title>
+  <title>Minimal Configuration</title>
+  <p>To get started using HOD, the following minimal configuration is
+  required:</p>
 <ul>
- <li>On the node from where you want to run hod, edit the file hodrc
-  which can be found in the &lt;install dir&gt;/conf directory. This file
-  contains the minimal set of values required for running hod.</li>
+ <li>On the node from where you want to run HOD, edit the file hodrc
+  located in the &lt;install dir&gt;/conf directory. This file
+  contains the minimal set of values required to run HOD.</li>
  <li>
 <p>Specify values suitable to your environment for the following
   variables defined in the configuration file. Note that some of these
@@ -196,7 +201,7 @@ it.</p>
     'node property' as mentioned in resource manager configuration.</li>
    <li>${HADOOP_HOME}: Location of Hadoop installation on the compute and
     submit nodes.</li>
-   <li>${RM_QUEUE}: Queue configured for submiting jobs in the resource
+   <li>${RM_QUEUE}: Queue configured for submitting jobs in the resource
     manager configuration.</li>
    <li>${RM_HOME}: Location of the resource manager installation on the
     compute and submit nodes.</li>
@@ -204,15 +209,15 @@ it.</p>
 </li>
 
 <li>
-<p>The following environment variables *may* need to be set depending on
+<p>The following environment variables may need to be set depending on
   your environment. These variables must be defined where you run the
-  HOD client, and also be specified in the HOD configuration file as the
+  HOD client and must also be specified in the HOD configuration file as the
   value of the key resource_manager.env-vars. Multiple variables can be
   specified as a comma separated list of key=value pairs.</p>
 
   <ul>
    <li>HOD_PYTHON_HOME: If you install python to a non-default location
-    of the compute nodes, or submit nodes, then, this variable must be
+    of the compute nodes, or submit nodes, then this variable must be
     defined to point to the python executable in the non-standard
     location.</li>
     </ul>
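Putting the variables above together, a minimal hodrc fragment might look like the sketch below. The section placement of java-home and the option name queue are assumptions based on the Configuration Guide, and all values are placeholders; substitute values from your own environment:

```ini
[hod]
java-home = /usr/lib/jvm/java          ; ${JAVA_HOME} on compute and submit nodes
cluster   = mycluster                  ; ${CLUSTER_NAME}, the torque node property

[resource_manager]
queue    = batch                       ; ${RM_QUEUE} from the resource manager config
env-vars = HOD_PYTHON_HOME=/usr/local/python/bin/python
```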
@@ -222,38 +227,38 @@ it.</p>
 
   <section>
     <title>Advanced Configuration</title>
-    <p> You can review other configuration options in the file and modify them to suit
- your needs. Refer to the <a href="hod_config_guide.html">Configuration Guide</a> for information about the HOD
- configuration.
-    </p>
+    <p> You can review and modify other configuration options to suit
+ your specific needs. Refer to the <a href="hod_config_guide.html">Configuration
+ Guide</a> for more information.</p>
   </section>
 </section>
 
   <section>
     <title>Running HOD</title>
-    <p>You can now proceed to <a href="hod_user_guide.html">HOD User Guide</a> for information about how to run HOD,
-    what are the various features, options and for help in trouble-shooting.</p>
+    <p>You can run HOD once it is configured. Refer to <a
+    href="hod_user_guide.html">the HOD User Guide</a> for more information.</p>
   </section>
 
   <section>
     <title>Supporting Tools and Utilities</title>
-    <p>This section describes certain supporting tools and utilities that can be used in managing HOD deployments.</p>
+    <p>This section describes supporting tools and utilities that can be used to
+    manage HOD deployments.</p>
     
     <section>
-      <title>logcondense.py - Tool for removing log files uploaded to DFS</title>
-      <p>As mentioned in 
-         <a href="hod_user_guide.html#Collecting+and+Viewing+Hadoop+Logs">this section</a> of the
-         <a href="hod_user_guide.html">HOD User Guide</a>, HOD can be configured to upload
+      <title>logcondense.py - Manage Log Files</title>
+      <p>As mentioned in the 
+         <a href="hod_user_guide.html#Collecting+and+Viewing+Hadoop+Logs">HOD User Guide</a>,
+         HOD can be configured to upload
          Hadoop logs to a statically configured HDFS. Over time, the number of logs uploaded
-         to DFS could increase. logcondense.py is a tool that helps administrators to clean-up
-         the log files older than a certain number of days. </p>
+         to HDFS could increase. logcondense.py is a tool that helps
+         administrators remove log files older than a certain number of days from HDFS. </p>
       <section>
         <title>Running logcondense.py</title>
         <p>logcondense.py is available under hod_install_location/support folder. You can either
-        run it using python, for e.g. <em>python logcondense.py</em>, or give execute permissions 
+        run it using python, for example, <em>python logcondense.py</em>, or give execute permissions 
         to the file, and directly run it as <em>logcondense.py</em>. logcondense.py needs to be 
         run by a user who has sufficient permissions to remove files from locations where log 
-        files are uploaded in the DFS, if permissions are enabled. For e.g. as mentioned in the
+        files are uploaded in HDFS, if permissions are enabled. For example, as mentioned in the
         <a href="hod_config_guide.html#3.7+hodring+options">configuration guide</a>, the logs could
         be configured to come under the user's home directory in HDFS. In that case, the user
         running logcondense.py should have super user privileges to remove the files from under
@@ -302,8 +307,9 @@ it.</p>
               <td>--dynamicdfs</td>
               <td>If true, this will indicate that the logcondense.py script should delete HDFS logs
               in addition to Map/Reduce logs. Otherwise, it only deletes Map/Reduce logs, which is also the
-              default if this option is not specified. This option is useful if dynamic DFS installations 
-              are being provisioned by HOD, and the static DFS installation is being used only to collect 
+              default if this option is not specified. This option is useful if
+              dynamic HDFS installations 
+              are being provisioned by HOD, and the static HDFS installation is being used only to collect 
               logs - a scenario that may be common in test clusters.</td>
               <td>false</td>
             </tr>
@@ -314,14 +320,15 @@ it.</p>
       </section>
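Because uploaded logs accumulate over time, administrators typically run logcondense.py on a schedule. A hypothetical crontab entry is sketched below; the install path is a placeholder, and any of the options described in the table above can be appended to the command:

```
# Prune uploaded logs every night at 2:00; run as a user with rights to
# delete files under the configured log locations in HDFS.
0 2 * * * python /opt/hadoop/contrib/hod/support/logcondense.py
```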
     </section>
     <section>
-      <title>checklimits.sh - Tool to update torque comment field reflecting resource limits</title>
-      <p>checklimits is a HOD tool specific to Torque/Maui environment
+      <title>checklimits.sh - Monitor Resource Limits</title>
+      <p>checklimits.sh is a HOD tool specific to the Torque/Maui environment
       (<a href="ext:hod/maui">Maui Cluster Scheduler</a> is an open source job
       scheduler for clusters and supercomputers, from clusterresources). The
       checklimits.sh script
-      updates torque comment field when newly submitted job(s) violate/cross
+      updates the torque comment field when newly submitted job(s) go
       over user limits set up in Maui scheduler. It uses qstat, does one pass
-      over torque job list to find out queued or unfinished jobs, runs Maui
+      over the torque job-list to determine queued or unfinished jobs, runs Maui
       tool checkjob on each job to see if user limits are violated and then
       runs torque's qalter utility to update job attribute 'comment'. Currently
       it updates the comment as <em>User-limits exceeded. Requested:([0-9]*)
@@ -330,16 +337,16 @@ it.</p>
       the type of violation.</p>
       <section>
         <title>Running checklimits.sh</title>
-        <p>checklimits.sh is available under hod_install_location/support
-        folder. This is a shell script and can be run directly as <em>sh
+        <p>checklimits.sh is available under the hod_install_location/support
+        folder. This shell script can be run directly as <em>sh
         checklimits.sh </em>or as <em>./checklimits.sh</em> after enabling
         execute permissions. Torque and Maui binaries should be available
         on the machine where the tool is run and should be in the path
-        of the shell script process. In order for this tool to be able to update
-        comment field of jobs from different users, it has to be run with
-        torque administrative privileges. This tool has to be run repeatedly
+        of the shell script process. To update the
+        comment field of jobs from different users, this tool must be run with
+        torque administrative privileges. This tool must be run repeatedly
         after specific intervals of time to frequently update jobs violating
-        constraints, for e.g. via cron. Please note that the resource manager
+        constraints, for example via cron. Please note that the resource manager
         and scheduler commands used in this script can be expensive and so
         it is better not to run this inside a tight loop without sleeping.</p>
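Since the script must run repeatedly, cron is a natural fit, as the text notes. A hypothetical crontab entry for a torque administrative account is shown below; the interval and install path are placeholders, and the interval should stay generous because the underlying resource manager commands are expensive:

```
# Refresh torque comment fields every 15 minutes.
*/15 * * * * /opt/hadoop/contrib/hod/support/checklimits.sh
```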
       </section>

+ 35 - 35
src/docs/src/documentation/content/xdocs/hod_config_guide.xml

@@ -16,26 +16,26 @@
     <section>
       <title>1. Introduction</title>
     
-      <p>Configuration options for HOD are organized as sections and options 
-      within them. They can be specified in two ways: a configuration file 
+      <p>This document explains some of the most important and commonly used 
+      Hadoop On Demand (HOD) configuration options. Configuration options 
+      can be specified in two ways: a configuration file 
       in the INI format, and as command line options to the HOD shell, 
       specified in the format --section.option[=value]. If the same option is 
       specified in both places, the value specified on the command line 
       overrides the value in the configuration file.</p>
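For instance, the --section.option[=value] form lets you override a config-file setting for a single invocation. The sketch below only composes and prints the command; the cluster directory, node count, and option value are hypothetical:

```shell
# Override the [hod] section's 'cluster' option on the command line.
OVERRIDE="--hod.cluster=testcluster"
CMD="hod allocate -d $HOME/hodclusters/test -n 4 $OVERRIDE"
echo "$CMD"   # preview; run the command itself on a configured submit node
```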
       
       <p>
-        To get a simple description of all configuration options, you can type
+        To get a simple description of all configuration options, type:
       </p>
       <table><tr><td><code>$ hod --verbose-help</code></td></tr></table>
       
-      <p>This document explains some of the most important or commonly used
-      configuration options in some more detail.</p>
+
     </section>
     
     <section>
       <title>2. Sections</title>
     
-      <p>The following are the various sections in the HOD configuration:</p>
+      <p>HOD organizes configuration options into these sections:</p>
       
       <ul>
         <li>  hod:                  Options for the HOD client</li>
@@ -43,19 +43,19 @@
          to use, and other parameters for using that resource manager</li>
         <li>  ringmaster:           Options for the RingMaster process, </li>
         <li>  hodring:              Options for the HodRing processes</li>
-        <li>  gridservice-mapred:   Options for the MapReduce daemons</li>
+        <li>  gridservice-mapred:   Options for the Map/Reduce daemons</li>
         <li>  gridservice-hdfs:     Options for the HDFS daemons.</li>
       </ul>
     
-      
-      <p>The next section deals with some of the important options in the HOD 
-        configuration.</p>
     </section>
     
     <section>
-      <title>3. Important / Commonly Used Configuration Options</title>
-  
+      <title>3. HOD Configuration Options</title>
   
+      <p>The following section describes configuration options common to most 
+      HOD sections, followed by sections that describe configuration options 
+      specific to each HOD section.</p>
+      
       <section> 
         <title>3.1 Common configuration options</title>
         
@@ -70,7 +70,7 @@
                       sure that the users who will run hod have rights to create 
                       directories under the directory specified here.</li>
           
-          <li>debug: A numeric value from 1-4. 4 produces the most log information,
+          <li>debug: Numeric value from 1-4. 4 produces the most log information,
                    and 1 the least.</li>
           
           <li>log-dir: Directory where log files are stored. By default, this is
@@ -78,10 +78,10 @@
                      temp-dir variable apply here too.
           </li>
           
-          <li>xrs-port-range: A range of ports, among which an available port shall
+          <li>xrs-port-range: Range of ports, among which an available port shall
                             be picked for use to run an XML-RPC server.</li>
           
-          <li>http-port-range: A range of ports, among which an available port shall
+          <li>http-port-range: Range of ports, among which an available port shall
                              be picked for use to run an HTTP server.</li>
           
           <li>java-home: Location of Java to be used by Hadoop.</li>
@@ -96,15 +96,15 @@
         <title>3.2 hod options</title>
         
         <ul>
-          <li>cluster: A descriptive name given to the cluster. For Torque, this is
+          <li>cluster: Descriptive name given to the cluster. For Torque, this is
                      specified as a 'Node property' for every node in the cluster.
                      HOD uses this value to compute the number of available nodes.</li>
           
-          <li>client-params: A comma-separated list of hadoop config parameters
+          <li>client-params: Comma-separated list of hadoop config parameters
                            specified as key-value pairs. These will be used to
                            generate a hadoop-site.xml on the submit node that 
-                           should be used for running MapReduce jobs.</li>
-          <li>job-feasibility-attr: A regular expression string that specifies
+                           should be used for running Map/Reduce jobs.</li>
+          <li>job-feasibility-attr: Regular expression string that specifies
                            whether and how to check job feasibility - resource
                            manager or scheduler limits. The current
                            implementation corresponds to the torque job
@@ -113,16 +113,16 @@
                            of limit violation is triggered and either
                            deallocates the cluster or stays in queued state
                           depending on whether the request is beyond maximum limits or
-                           the cumulative usage has crossed maxumum limits. 
+                           the cumulative usage has crossed maximum limits. 
                            The torque comment attribute may be updated
-                           periodically by an external mechanism. For e.g.,
+                           periodically by an external mechanism. For example,
                            comment attribute can be updated by running <a href=
 "hod_admin_guide.html#checklimits.sh+-+Tool+to+update+torque+comment+field+reflecting+resource+limits">
                            checklimits.sh</a> script in hod/support directory,
                            and then setting job-feasibility-attr equal to the
-                           value TORQUE_USER_LIMITS_COMMENT_FIELD i.e
+                           value TORQUE_USER_LIMITS_COMMENT_FIELD, i.e.
                            "User-limits exceeded. Requested:([0-9]*)
-                           Used:([0-9]*) MaxLimit:([0-9]*)" will make HOD
+                           Used:([0-9]*) MaxLimit:([0-9]*)", will make HOD
                            behave accordingly.
                            </li>
          </ul>
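As described above, feasibility checking can be wired to the comment field that the checklimits.sh tool maintains. A hypothetical hodrc fragment (section placement follows the hod options heading above):

```ini
[hod]
; Use the torque comment field updated by hod/support/checklimits.sh
job-feasibility-attr = TORQUE_USER_LIMITS_COMMENT_FIELD
```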
@@ -139,7 +139,7 @@
                         which the executables of the resource manager can be 
                         found.</li> 
           
-          <li>env-vars: This is a comma separated list of key-value pairs, 
+          <li>env-vars: Comma-separated list of key-value pairs, 
                       expressed as key=value, which would be passed to the jobs 
                       launched on the compute nodes. 
                       For example, if the python installation is 
@@ -154,18 +154,18 @@
         <title>3.4 ringmaster options</title>
         
         <ul>
-          <li>work-dirs: These are a list of comma separated paths that will serve
+          <li>work-dirs: Comma-separated list of paths that will serve
                        as the root for directories that HOD generates and passes
-                       to Hadoop for use to store DFS / MapReduce data. For e.g.
+                       to Hadoop to use for storing DFS and Map/Reduce data. For example,
                        this is where DFS data blocks will be stored. Typically,
                        as many paths are specified as there are disks available
                        to ensure all disks are being utilized. The restrictions
                        and notes for the temp-dir variable apply here too.</li>
-          <li>max-master-failures: It defines how many times a hadoop master
+          <li>max-master-failures: Number of times a hadoop master
                        daemon can fail to launch, beyond which HOD will fail
                        the cluster allocation altogether. In HOD clusters,
                        sometimes there might be a single or few "bad" nodes due
-                       to issues like missing java, missing/incorrect version
+                       to issues like missing java, missing or incorrect version
                        of Hadoop etc. When this configuration variable is set
                        to a positive integer, the RingMaster returns an error
                        to the client only when the number of times a hadoop
@@ -184,7 +184,7 @@
         <title>3.5 gridservice-hdfs options</title>
         
         <ul>
-          <li>external: If false, this indicates that a HDFS cluster must be 
+          <li>external: If false, indicates that an HDFS cluster must be 
                      brought up by the HOD system, on the nodes which it 
                       allocates via the allocate command. Note that in that case,
                       when the cluster is de-allocated, it will bring down the 
@@ -207,7 +207,7 @@
                   located. This can be used to use a pre-installed version of
                   Hadoop on the cluster.</li>
           
-          <li>server-params: A comma-separated list of hadoop config parameters
+          <li>server-params: Comma-separated list of hadoop config parameters
                             specified as key-value pairs. These will be used to
                            generate a hadoop-site.xml that will be used by the
                            NameNode and DataNodes.</li>
@@ -220,11 +220,11 @@
         <title>3.6 gridservice-mapred options</title>
         
         <ul>
-          <li>external: If false, this indicates that a MapReduce cluster must be
+          <li>external: If false, indicates that a Map/Reduce cluster must be
                      brought up by the HOD system on the nodes which it allocates
                       via the allocate command.
                      If true, it will try to connect to an externally 
-                      configured MapReduce system.</li>
+                      configured Map/Reduce system.</li>
           
           <li>host: Hostname of the externally configured JobTracker, if any</li>
           
@@ -235,7 +235,7 @@
           <li>pkgs: Installation directory, under which bin/hadoop executable is 
                   located</li>
           
-          <li>server-params: A comma-separated list of hadoop config parameters
+          <li>server-params: Comma-separated list of hadoop config parameters
                             specified as key-value pairs. These will be used to
                            generate a hadoop-site.xml that will be used by the
                            JobTracker and TaskTrackers</li>
@@ -266,8 +266,8 @@
                                    cluster node's local file path, use the format 'file://path'.
 
                                    When clusters are deallocated by HOD, the hadoop logs will
-                                   be deleted as part of HOD's cleanup process. In order to
-                                   persist these logs, you can use this configuration option.
+                                   be deleted as part of HOD's cleanup process. To ensure these
+                                   logs persist, you can use this configuration option.
 
                                    The format of the path is 
                                    value-of-this-option/userid/hod-logs/cluster-id

+ 4 - 4
src/docs/src/documentation/content/xdocs/hod_user_guide.xml

@@ -14,7 +14,7 @@
     <title> Introduction </title><anchor id="Introduction"></anchor>
   <p>Hadoop On Demand (HOD) is a system for provisioning virtual Hadoop clusters over a large physical cluster. It uses the Torque resource manager to do node allocation. On the allocated nodes, it can start Hadoop Map/Reduce and HDFS daemons. It automatically generates the appropriate configuration files (hadoop-site.xml) for the Hadoop daemons and client. HOD also has the capability to distribute Hadoop to the nodes in the virtual cluster that it allocates. In short, HOD makes it easy for administrators and users to quickly setup and use Hadoop. It is also a very useful tool for Hadoop developers and testers who need to share a physical cluster for testing their own Hadoop versions.</p>
   <p>HOD supports Hadoop from version 0.15 onwards.</p>
-  <p>The rest of the documentation comprises of a quick-start guide that helps you get quickly started with using HOD, a more detailed guide of all HOD features, command line options, known issues and trouble-shooting information.</p>
+  <p>The rest of this document comprises a quick-start guide that helps you get started with HOD quickly, a more detailed guide to all HOD features, and a trouble-shooting section.</p>
   </section>
   <section>
 		<title> Getting Started Using HOD </title><anchor id="Getting_Started_Using_HOD_0_4"></anchor>
@@ -110,7 +110,7 @@
   <section><title> Provisioning and Managing Hadoop Clusters </title><anchor id="Provisioning_and_Managing_Hadoop"></anchor>
   <p>The primary feature of HOD is to provision Hadoop Map/Reduce and HDFS clusters. This is described above in the Getting Started section. Also, as long as nodes are available, and organizational policies allow, a user can use HOD to allocate multiple Map/Reduce clusters simultaneously. The user would need to specify different paths for the <code>cluster_dir</code> parameter mentioned above for each cluster he/she allocates. HOD provides the <em>list</em> and the <em>info</em> operations to enable managing multiple clusters.</p>
   <p><strong> Operation <em>list</em></strong></p><anchor id="Operation_list"></anchor>
-  <p>The list operation lists all the clusters allocated so far by a user. The cluster directory where the hadoop-site.xml is stored for the cluster, and it's status vis-a-vis connectivity with the JobTracker and/or HDFS is shown. The list operation has the following syntax:</p>
+  <p>The list operation lists all the clusters allocated so far by a user. The cluster directory where the hadoop-site.xml is stored for the cluster, and its status vis-a-vis connectivity with the JobTracker and/or HDFS is shown. The list operation has the following syntax:</p>
     <table>
       
         <tr>
@@ -219,7 +219,7 @@
    <table><tr><td><code>log-destination-uri = hdfs://host123:45678/user/hod/logs</code> or</td></tr>
     <tr><td><code>log-destination-uri = file://path/to/store/log/files</code></td></tr>
     </table>
-  <p>Under the root directory specified above in the path, HOD will create a create a path user_name/torque_jobid and store gzipped log files for each node that was part of the job.</p>
+  <p>Under the root directory specified above in the path, HOD will create a path user_name/torque_jobid and store gzipped log files for each node that was part of the job.</p>
   <p>Note that to store the files to HDFS, you may need to configure the <code>hodring.pkgs</code> option with the Hadoop version that matches the HDFS mentioned. If not, HOD will try to use the Hadoop version that it is using to provision the Hadoop cluster itself.</p>
   </section>
   <section><title> Auto-deallocation of Idle Clusters </title><anchor id="Auto_deallocation_of_Idle_Cluste"></anchor>
@@ -242,7 +242,7 @@
           <td><code>$ hod allocate -d cluster_dir -n number_of_nodes -N name_of_job</code></td>
         </tr>
     </table>
-  <p><em>Note:</em> Due to restriction in the underlying Torque resource manager, names which do not start with a alphabet or contain a 'space' will cause the job to fail. The failure message points to the problem being in the specified job name.</p>
+  <p><em>Note:</em> Due to a restriction in the underlying Torque resource manager, names that do not start with an alphabetic character or that contain a space will cause the job to fail. The failure message points to the problem being in the specified job name.</p>
   </section>
   <section><title> Capturing HOD exit codes in Torque </title><anchor id="Capturing_HOD_exit_codes_in_Torq"></anchor>
  <p>HOD exit codes are captured in the Torque exit_status field. This will help users and system administrators to distinguish successful runs from unsuccessful runs of HOD. The exit codes are 0 if allocation succeeded and all hadoop jobs ran on the allocated cluster correctly. They are non-zero if allocation failed or some of the hadoop jobs failed on the allocated cluster. The exit codes that are possible are mentioned in the table below. <em>Note: Hadoop job status is captured only if the version of Hadoop used is 0.16 or above.</em></p>

Some files were not shown because too many files changed in this diff