
SUBMARINE-83. Refine the documents of submarine targeting 0.2.0 release. Contributed by Zhankun Tang.

Zhankun Tang, 6 years ago
parent
commit 03aa70fe19

+ 3 - 4
hadoop-submarine/hadoop-submarine-core/README.md

@@ -37,11 +37,12 @@
     \__________________________________________________________/ (_)
 ```
 
-Submarine is a project which allows infra engineer / data scientist to run *unmodified* Tensorflow programs on YARN.
+Submarine is a project which allows infra engineers / data scientists to run
+*unmodified* Tensorflow or PyTorch programs on YARN or Kubernetes.
 
 Goals of Submarine:
 - It allows jobs easy access data/models in HDFS and other storages.
-- Can launch services to serve Tensorflow/MXNet models.
+- Can launch services to serve Tensorflow/PyTorch models.
 - Support run distributed Tensorflow jobs with simple configs.
 - Support run user-specified Docker images.
 - Support specify GPU and other resources.
@@ -51,5 +52,3 @@ Goals of Submarine:
 Please jump to [QuickStart](src/site/markdown/QuickStart.md) guide to quickly understand how to use this framework.
 
 Please jump to [Examples](src/site/markdown/Examples.md) to try other examples like running Distributed Tensorflow Training for CIFAR 10.
-
-If you're a developer, please find [Developer](src/site/markdown/DeveloperGuide.md) guide for more details.

+ 0 - 24
hadoop-submarine/hadoop-submarine-core/src/site/markdown/DeveloperGuide.md

@@ -1,24 +0,0 @@
-<!---
-  Licensed under the Apache License, Version 2.0 (the "License");
-  you may not use this file except in compliance with the License.
-  You may obtain a copy of the License at
-
-   http://www.apache.org/licenses/LICENSE-2.0
-
-  Unless required by applicable law or agreed to in writing, software
-  distributed under the License is distributed on an "AS IS" BASIS,
-  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-  See the License for the specific language governing permissions and
-  limitations under the License. See accompanying LICENSE file.
--->
-
-# Developer Guide
-
-By default, Submarine uses YARN service framework as runtime. If you want to add your own implementation, you can add a new `RuntimeFactory` implementation and configure following option to `submarine.xml` (which should be placed under same `$HADOOP_CONF_DIR`)
-
-```
-<property>
-  <name>submarine.runtime.class</name>
-  <value>... full qualified class name for your runtime factory ... </value>
-</property>
-```

+ 1 - 3
hadoop-submarine/hadoop-submarine-core/src/site/markdown/Examples.md

@@ -18,6 +18,4 @@ Here're some examples about Submarine usage.
 
 [Running Distributed CIFAR 10 Tensorflow Job](RunningDistributedCifar10TFJobs.html)
 
-[Running Standalone CIFAR 10 PyTorch Job](RunningSingleNodeCifar10PTJobs.html)
-
-[Running Zeppelin Notebook on YARN](RunningZeppelinOnYARN.html)
+[Running Standalone CIFAR 10 PyTorch Job](RunningSingleNodeCifar10PTJobs.html)

+ 2 - 3
hadoop-submarine/hadoop-submarine-core/src/site/markdown/Index.md

@@ -12,7 +12,8 @@
   limitations under the License. See accompanying LICENSE file.
 -->
 
-Submarine is a project which allows infra engineer / data scientist to run *unmodified* Tensorflow programs on YARN.
+Submarine is a project which allows infra engineers / data scientists to run
+*unmodified* Tensorflow or PyTorch programs on YARN or Kubernetes.
 
 Goals of Submarine:
 
@@ -43,6 +44,4 @@ Click below contents if you want to understand more.
 
 - [How to write Dockerfile for Submarine PyTorch jobs](WriteDockerfilePT.html)
 
-- [Developer guide](DeveloperGuide.html)
-
 - [Installation guides](HowToInstall.html)

+ 17 - 2
hadoop-submarine/hadoop-submarine-core/src/site/markdown/QuickStart.md

@@ -18,7 +18,7 @@
 
 Must:
 
-- Apache Hadoop 3.1.x, YARN service enabled.
+- Apache Hadoop version newer than 2.7.3
 
 Optional:
 
@@ -37,6 +37,20 @@ For more details, please refer to:
 
 - [How to write Dockerfile for Submarine PyTorch jobs](WriteDockerfilePT.html)
 
+## Submarine runtimes
+Since Submarine 0.2.0, two runtimes are supported: the YARN native service
+runtime and LinkedIn's TonY runtime. Each runtime supports both the Tensorflow
+and PyTorch frameworks, and users don't need to change how they submit jobs,
+because both runtimes implement the same interface.
+
+To use the TonY runtime, set the value below in the Submarine configuration.
+
+|Configuration Name | Value |
+|:---- |:---- |
+| `submarine.runtime.class` | org.apache.hadoop.yarn.submarine.runtimes.tony.TonyRuntimeFactory |
+
+For more details about the TonY runtime, please check the [TonY runtime guide](TonYRuntimeGuide.html).
+
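As a point of reference, a minimal sketch of this setting in `submarine.xml` (which, per the developer guide removed in this commit, is placed under the same `$HADOOP_CONF_DIR`), assuming the standard Hadoop `<configuration>`/`<property>` XML format:

```
<!-- Sketch only: switch Submarine from the default YARN native service
     runtime to the TonY runtime. Assumes submarine.xml uses the standard
     Hadoop XML property format under $HADOOP_CONF_DIR. -->
<configuration>
  <property>
    <name>submarine.runtime.class</name>
    <value>org.apache.hadoop.yarn.submarine.runtimes.tony.TonyRuntimeFactory</value>
  </property>
</configuration>
```
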
 ## Run jobs
 
 ### Commandline options
@@ -164,7 +178,8 @@ See below screenshot:
 
 ![alt text](./images/tensorboard-service.png "Tensorboard service")
 
-If there is no hadoop client, we can also use the java command and the uber jar, hadoop-submarine-all-*.jar, to submit the job.
+Since v0.2.0, if there is no Hadoop client, we can also use the java command
+and the uber jar, hadoop-submarine-all-*.jar, to submit the job.
 
 ```
 java -cp /path-to/hadoop-conf:/path-to/hadoop-submarine-all-*.jar \

+ 0 - 37
hadoop-submarine/hadoop-submarine-core/src/site/markdown/RunningZeppelinOnYARN.md

@@ -1,37 +0,0 @@
-<!---
-  Licensed under the Apache License, Version 2.0 (the "License");
-  you may not use this file except in compliance with the License.
-  You may obtain a copy of the License at
-
-   http://www.apache.org/licenses/LICENSE-2.0
-
-  Unless required by applicable law or agreed to in writing, software
-  distributed under the License is distributed on an "AS IS" BASIS,
-  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-  See the License for the specific language governing permissions and
-  limitations under the License. See accompanying LICENSE file.
--->
-
-# Running Zeppelin Notebook On Submarine
-
-This is a simple example about how to run Zeppelin notebook by using Submarine.
-
-## Step 1: Build Docker Image
-
-Go to `src/main/docker/zeppelin-notebook-example`, build the Docker image. Or you can use the prebuilt one: `hadoopsubmarine/zeppelin-on-yarn-gpu:0.0.1`
-
-## Step 2: Launch the notebook on YARN
-
-Submit command to YARN:
-
-`yarn app -destroy zeppelin-notebook;
-yarn jar path-to/hadoop-yarn-applications-submarine-3.2.0-SNAPSHOT.jar \
-   job run --name zeppelin-notebook \
-   --docker_image hadoopsubmarine/zeppelin-on-yarn-gpu:0.0.1 \
-   --worker_resources memory=8G,vcores=2,gpu=1 \
-   --num_workers 1 \
-   -worker_launch_cmd "/usr/local/bin/run_container.sh"`
-
-Once the container got launched, you can go to `YARN services` UI page, access the `zeppelin-notebook` job, and go to the quicklink `notebook` by clicking `...`.
-
-The notebook is secured by admin/admin user name and password.

+ 5 - 5
hadoop-submarine/hadoop-submarine-tony-runtime/src/site/markdown/QuickStart.md → hadoop-submarine/hadoop-submarine-core/src/site/markdown/TonYRuntimeGuide.md

@@ -247,16 +247,16 @@ CLASSPATH=$(hadoop classpath --glob): \
 /home/pi/hadoop/TonY/tony-cli/build/libs/tony-cli-0.3.2-all.jar \
 
 java org.apache.hadoop.yarn.submarine.client.cli.Cli job run --name tf-job-001 \
- --framework tensorflow \
  --num_workers 2 \
  --worker_resources memory=3G,vcores=2 \
  --num_ps 2 \
  --ps_resources memory=3G,vcores=2 \
  --worker_launch_cmd "venv.zip/venv/bin/python mnist_distributed.py" \
  --ps_launch_cmd "venv.zip/venv/bin/python mnist_distributed.py" \
- --insecure
+ --insecure \
  --conf tony.containers.resources=PATH_TO_VENV_YOU_CREATED/venv.zip#archive,PATH_TO_MNIST_EXAMPLE/mnist_distributed.py, \
-PATH_TO_TONY_CLI_JAR/tony-cli-0.3.2-all.jar
+PATH_TO_TONY_CLI_JAR/tony-cli-0.3.2-all.jar \
+--conf tony.application.framework=pytorch
 
 ```
 You should then be able to see links and status of the jobs from command line:
@@ -284,7 +284,6 @@ CLASSPATH=$(hadoop classpath --glob): \
 /home/pi/hadoop/TonY/tony-cli/build/libs/tony-cli-0.3.2-all.jar \
 
 java org.apache.hadoop.yarn.submarine.client.cli.Cli job run --name tf-job-001 \
- --framework tensorflow \
  --docker_image hadoopsubmarine/tf-1.8.0-cpu:0.0.3 \
  --input_path hdfs://pi-aw:9000/dataset/cifar-10-data \
  --worker_resources memory=3G,vcores=2 \
@@ -297,5 +296,6 @@ java org.apache.hadoop.yarn.submarine.client.cli.Cli job run --name tf-job-001 \
  --env HADOOP_COMMON_HOME=/hadoop-3.1.0 \
  --env HADOOP_HDFS_HOME=/hadoop-3.1.0 \
  --env HADOOP_CONF_DIR=/hadoop-3.1.0/etc/hadoop \
- --conf tony.containers.resources=--conf tony.containers.resources=/home/pi/hadoop/TonY/tony-cli/build/libs/tony-cli-0.3.2-all.jar
+ --conf tony.containers.resources=PATH_TO_TONY_CLI_JAR/tony-cli-0.3.2-all.jar \
+ --conf tony.application.framework=pytorch
 ```

+ 0 - 29
hadoop-submarine/hadoop-submarine-tony-runtime/src/site/resources/css/site.css

@@ -1,29 +0,0 @@
-/*
-* Licensed to the Apache Software Foundation (ASF) under one or more
-* contributor license agreements.  See the NOTICE file distributed with
-* this work for additional information regarding copyright ownership.
-* The ASF licenses this file to You under the Apache License, Version 2.0
-* (the "License"); you may not use this file except in compliance with
-* the License.  You may obtain a copy of the License at
-*
-*     http://www.apache.org/licenses/LICENSE-2.0
-*
-* Unless required by applicable law or agreed to in writing, software
-* distributed under the License is distributed on an "AS IS" BASIS,
-* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-* See the License for the specific language governing permissions and
-* limitations under the License.
-*/
-#banner {
-  height: 93px;
-  background: none;
-}
-
-#bannerLeft img {
-  margin-left: 30px;
-  margin-top: 10px;
-}
-
-#bannerRight img {
-  margin: 17px;
-}

+ 0 - 28
hadoop-submarine/hadoop-submarine-tony-runtime/src/site/site.xml

@@ -1,28 +0,0 @@
-<!--
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
-   http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. See accompanying LICENSE file.
--->
-<project name="Apache Hadoop ${project.version}">
-
-  <skin>
-    <groupId>org.apache.maven.skins</groupId>
-    <artifactId>maven-stylus-skin</artifactId>
-    <version>${maven-stylus-skin.version}</version>
-  </skin>
-
-  <body>
-    <links>
-      <item name="Apache Hadoop" href="http://hadoop.apache.org/"/>
-    </links>
-  </body>
-
-</project>