Libhdfs++ Coding Standards

  • Libhdfs++ Coding Standards
    • Introduction
    • Automated Formatting
    • Explicit Block Scopes
    • Comments
    • Portability

Introduction

The foundation of the libhdfs++ project's coding standards is Google's C++ style guide. It can be found here:

https://google.github.io/styleguide/cppguide.html

On top of Google's guide, several small restrictions adopted from Sun's Java standards and Hadoop convention must also be followed, along with the portability requirements described in the Portability section.

Automated Formatting

Prior to submitting a patch for code review, run LLVM's formatting tool, clang-format, on the .h, .c, and .cc files included in the patch. Use the -style=google switch when doing so.

Example pre-submission usage:

$ clang-format -i -style=google temp_file.cc
  • note: On some Linux distributions clang-format is available in the package repositories but doesn't show up without an appended version number. On Ubuntu you'll find it with:

    "apt-get install clang-format-3.6"
    

Explicit Block Scopes

Always add braces to conditional and loop bodies, even if the body could fit on a single line.

BAD:

if (foo)
  Bar();

if (foo)
  Bar();
else
  Baz();

for (int i=0; i<10; i++)
  Bar(i);

GOOD:

if (foo) {
  Bar();
}

if (foo) {
  Bar();
} else {
  Baz();
}

for (int i=0; i<10; i++) {
  Bar(i);
}

Comments

Use the /* comment */ style to maintain consistency with the rest of the Hadoop code base.

BAD:

//this is a bad single line comment
/*
  this is a bad block comment
*/

GOOD:

/* this is a single line comment */

/**
 * This is a block comment.  Note that nothing is on the first
 * line of the block.
 **/

Portability

Please make sure you write code that is portable.

  • All code must be able to build using GCC and LLVM.
    • In the future we hope to support other compilers as well.
  • Don't make assumptions about endianness or architecture.
    • Don't do clever things with pointers or intrinsics.
  • Don't write code that could force a non-aligned word access.
    • This causes performance issues on most architectures and isn't supported at all on some.
    • Generally the compiler will prevent this unless you are doing clever things with pointers, e.g. abusing placement new or reinterpreting a pointer into a pointer to a wider type.
  • If a type needs to be a specific width, make sure to specify it.
    • int32_t my_32_bit_wide_int
  • Avoid using compiler dependent pragmas or attributes.
    • If there is a justified and unavoidable reason for using these you must document why. See examples below.

BAD:

struct Foo {
  int32_t x_;
  char y_;
  int32_t z_;
  char w_;
} __attribute__((packed));
/**
 * "I didn't profile and identify that this is causing
 * significant memory overhead but I want to pack it to
 * save 6 bytes"
 **/
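
Where memory overhead really has been measured, reordering members is often a portable alternative to a packed attribute. The sketch below is only an illustration: PaddedFoo and ReorderedFoo are hypothetical names, and the member w_ is an assumed name for the fourth field. On common ABIs where int32_t is 4-byte aligned, PaddedFoo occupies 16 bytes and ReorderedFoo 12, but the standard does not guarantee those numbers, so the code only asserts the relative ordering.

```cpp
#include <cstdint>

/* Same member types as the BAD example, declared without packing. */
struct PaddedFoo {
  int32_t x_;
  char y_;   /* typically followed by padding so z_ stays aligned */
  int32_t z_;
  char w_;   /* typically followed by tail padding */
};

/* Hypothetical reordering: widest members first, chars grouped last. */
struct ReorderedFoo {
  int32_t x_;
  int32_t z_;
  char y_;
  char w_;
};

/* Reordering never makes the struct larger than the interleaved layout. */
static_assert(sizeof(ReorderedFoo) <= sizeof(PaddedFoo),
              "grouping members by width should not add padding");
```

Unlike a packed struct, every member of ReorderedFoo stays naturally aligned, so no non-aligned word access is introduced.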

NECESSARY: Still not good but required for short-circuit reads.

struct FileDescriptorMessage {
  struct cmsghdr msg_;
  int file_descriptors_[2];
} __attribute__((packed));
/**
 * This is actually needed for short-circuit reads.
 * "struct cmsghdr" is well defined on UNIX systems.
 * This mechanism relies on the fact that any passed
 * ancillary data is _directly_ following the cmsghdr.
 * The kernel interprets any padding as real data.
 **/
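
As a rough illustration of the endianness and alignment points in the list above, the following sketch reads integers out of a byte buffer without reinterpreting pointers. ReadBigEndian32 and ReadUnalignedInt32 are hypothetical helper names, not part of libhdfs++; the first assembles a value byte by byte so the result does not depend on host byte order, and the second uses std::memcpy so no non-aligned word access is forced.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

/* Hypothetical helper: decode a big-endian 32-bit value at buf + offset. */
inline uint32_t ReadBigEndian32(const unsigned char *buf, size_t offset) {
  const unsigned char *p = buf + offset;
  /* Shifts operate on values, so host endianness does not matter. */
  return (static_cast<uint32_t>(p[0]) << 24) |
         (static_cast<uint32_t>(p[1]) << 16) |
         (static_cast<uint32_t>(p[2]) << 8) |
         static_cast<uint32_t>(p[3]);
}

/* Hypothetical helper: read a host-order int32_t from any byte offset. */
inline int32_t ReadUnalignedInt32(const unsigned char *buf, size_t offset) {
  int32_t value;
  /* memcpy has no alignment requirement on its source pointer. */
  std::memcpy(&value, buf + offset, sizeof(value));
  return value;
}
```

Both helpers avoid the pattern of casting buf + offset to a pointer to a wider type, which is the kind of pointer cleverness the list warns about.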