~~ Licensed under the Apache License, Version 2.0 (the "License");
~~ you may not use this file except in compliance with the License.
~~ You may obtain a copy of the License at
~~
~~   http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing, software
~~ distributed under the License is distributed on an "AS IS" BASIS,
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~~ See the License for the specific language governing permissions and
~~ limitations under the License. See accompanying LICENSE file.
  ---
  C API libhdfs
  ---
  ---
  ${maven.build.timestamp}

C API libhdfs

%{toc|section=1|fromDepth=0}
* Overview

  libhdfs is a JNI-based C API for Hadoop's Distributed File System
  (HDFS). It provides C APIs to a subset of the HDFS APIs to manipulate
  HDFS files and the filesystem. libhdfs is part of the Hadoop
  distribution and comes pre-compiled in
  <<<${HADOOP_PREFIX}/libhdfs/libhdfs.so>>>.
* The APIs

  The libhdfs APIs are a subset of the {{{hadoop fs APIs}}}.

  The header file for libhdfs describes each API in detail and is
  available in <<<${HADOOP_PREFIX}/src/c++/libhdfs/hdfs.h>>>.
* A Sample Program

----
\#include "hdfs.h"

\#include <fcntl.h>   /* O_WRONLY, O_CREAT */
\#include <stdio.h>
\#include <stdlib.h>
\#include <string.h>

int main(int argc, char **argv) {

    /* Connect to the namenode named in the configuration. */
    hdfsFS fs = hdfsConnect("default", 0);
    const char* writePath = "/tmp/testfile.txt";
    hdfsFile writeFile = hdfsOpenFile(fs, writePath, O_WRONLY|O_CREAT, 0, 0, 0);
    if (!writeFile) {
        fprintf(stderr, "Failed to open %s for writing!\n", writePath);
        exit(-1);
    }
    const char* buffer = "Hello, World!";
    /* Write the string including its trailing NUL byte. */
    tSize num_written_bytes = hdfsWrite(fs, writeFile, (void*)buffer,
                                        strlen(buffer)+1);
    if (num_written_bytes != (tSize)(strlen(buffer)+1)) {
        fprintf(stderr, "Failed to write %s!\n", writePath);
        exit(-1);
    }
    if (hdfsFlush(fs, writeFile)) {
        fprintf(stderr, "Failed to 'flush' %s\n", writePath);
        exit(-1);
    }
    hdfsCloseFile(fs, writeFile);
    hdfsDisconnect(fs);
    return 0;
}
----
* How To Link With The Library

  See the Makefile for <<<hdfs_test.c>>> in the libhdfs source directory
  (<<<${HADOOP_PREFIX}/src/c++/libhdfs/Makefile>>>) or use a command like:

  <<<gcc above_sample.c -I${HADOOP_PREFIX}/src/c++/libhdfs -L${HADOOP_PREFIX}/libhdfs -lhdfs -o above_sample>>>
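
  Because libhdfs starts a JVM through JNI, on many platforms the linker
  must also be told where to find <<<libjvm>>>. The exact library path
  below assumes an amd64 JRE layout and will vary by platform and JDK:

----
gcc above_sample.c \
    -I${HADOOP_PREFIX}/src/c++/libhdfs \
    -L${HADOOP_PREFIX}/libhdfs -lhdfs \
    -L${JAVA_HOME}/jre/lib/amd64/server -ljvm \
    -o above_sample
----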
* Common Problems

  The most common problem is that the <<<CLASSPATH>>> is not set properly
  when calling a program that uses libhdfs. Make sure you set it to all
  the Hadoop jars needed to run Hadoop itself. Currently, there is no way
  to programmatically generate the classpath, but a good bet is to include
  all the jar files in <<<${HADOOP_PREFIX}>>> and <<<${HADOOP_PREFIX}/lib>>>
  as well as the right configuration directory containing
  <<<hdfs-site.xml>>>. A sketch of one way to build such a classpath
  follows.
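
  For example, a shell fragment along these lines builds the classpath by
  globbing the jars; the <<<HADOOP_CONF_DIR>>> variable naming your
  configuration directory is an assumption:

----
\# Directory containing hdfs-site.xml (variable name is an assumption)
CLASSPATH=${HADOOP_CONF_DIR}
for f in ${HADOOP_PREFIX}/*.jar ${HADOOP_PREFIX}/lib/*.jar; do
  CLASSPATH=${CLASSPATH}:${f}
done
export CLASSPATH
----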
* Thread Safe

  libhdfs is thread safe.
* Concurrency and Hadoop FS "handles"

  The Hadoop FS implementation includes an FS handle cache which caches
  based on the URI of the namenode along with the user connecting. So,
  all calls to <<<hdfsConnect>>> will return the same handle, but calls
  to <<<hdfsConnectAsUser>>> with different users will return different
  handles. But, since HDFS client handles are completely thread safe,
  this has no bearing on concurrency.
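
  As an illustration, the following sketch shows which calls share a
  handle. The three-argument <<<hdfsConnectAsUser>>> signature shown here
  varies between Hadoop versions, so treat it as an assumption:

----
\#include "hdfs.h"

\#include <stdio.h>

int main(void) {
    hdfsFS a = hdfsConnect("default", 0);
    hdfsFS b = hdfsConnect("default", 0);
    /* Same namenode URI and same user: the cache returns one handle. */
    printf("same handle: %s\n", a == b ? "yes" : "no");

    /* A different user keys a different cache entry, hence a new handle. */
    hdfsFS c = hdfsConnectAsUser("default", 0, "someuser");
    printf("distinct handle for someuser: %s\n", a != c ? "yes" : "no");
    return 0;
}
----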
* Concurrency and libhdfs/JNI

  The libhdfs calls to JNI should always create thread-local storage, so
  (in theory) libhdfs should be as thread safe as the underlying calls to
  the Hadoop FS.
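
  For instance, a minimal sketch of concurrent use might look like the
  following; it assumes the file written by the sample program above
  exists. Each thread obtains the cached FS handle and performs its own
  I/O:

----
\#include "hdfs.h"

\#include <fcntl.h>
\#include <pthread.h>
\#include <stdio.h>

static void* reader(void* arg) {
    /* Each thread gets the same cached FS handle; handles are thread safe. */
    hdfsFS fs = hdfsConnect("default", 0);
    hdfsFile f = hdfsOpenFile(fs, "/tmp/testfile.txt", O_RDONLY, 0, 0, 0);
    if (f) {
        char buf[32];
        tSize n = hdfsRead(fs, f, buf, sizeof(buf));
        fprintf(stderr, "read %d bytes\n", (int)n);
        hdfsCloseFile(fs, f);
    }
    return NULL;
}

int main(void) {
    pthread_t t[4];
    int i;
    for (i = 0; i < 4; i++) pthread_create(&t[i], NULL, reader, NULL);
    for (i = 0; i < 4; i++) pthread_join(t[i], NULL);
    return 0;
}
----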