FX-ALPHA C and C++ Coding Conventions

19 Jan 95

David H. Leserman


Table of Contents


1. Introduction

We can judge the quality of source code along several axes: satisfaction of functional and performance requirements, correctness (that is, the absence of defects or bugs), and ease of maintenance. When code is easier to understand, it is easier to maintain.

Coding conventions help us write code that is understandable. When code is understandable, it can be reviewed, maintained, and extended more easily. Coding conventions are useful when many programmers are working on a project, when a programmer other than the original author will take over a body of code for maintenance, and for communicating effectively during code reviews. Coding guidelines can also help reduce certain kinds of coding errors.

Because this document is short, rationale and many nuances are not covered. Refer to the following references for additional guidance.

Where the references differ, this document prevails.

Remember, the overriding consideration is communication. As software authors we have a dual audience: the compiler and other people.

2. Standards

Adhere to relevant standards.
  1. For C language programs, use ANSI C. (This implies that function prototypes are required.) Use the -Aa option in both the HP C and C++ compilers for ANSI C.
  2. For C++, the bible is the ARM, that is, the Annotated Reference Manual by Ellis and Stroustrup. We will attempt to maintain consistency with the progress of the C++ WG21 evolving draft standard.
  3. Use standard libraries. For example, learn the functionality of the C Standard Library and use those functions, not your own. Include (#include) the header files that come with the libraries you use. Don't retype the function prototypes.
  4. Comply with POSIX standards where reasonable.
  5. Avoid vendor-specific extensions.

3. Names

Choice of names is the most important single style element for understandability. Names should be meaningful. Choose a verb for a function name that reflects the action taken by the function. (An exception is listed as a rule below.) Choose a noun for a variable name. Choose a file name to reflect the content of the file.

McConnell has a good description of how to choose variable names. He states:

"The most important consideration in naming a variable is that the name fully and accurately describe the entity the variable represents. An effective technique for coming up with a good name is to state in words what the variable represents. Often that statement itself is the best variable name..." (p. 186)
"When you find yourself `figuring out' a section of code, consider renaming the variables. It's OK to figure out murder mysteries, but you shouldn't need to figure out code. You should be able to read it." (p. 193)

Choose names to reflect elements of the problem domain rather than the computational solution. For instance, SAOreport is a better name than inputRecord for a variable that is an input record containing an SAO report.

The length of a name should reflect its scope. A local variable in a small function has very limited scope. Here, max might be a suitable name although maxTemp is probably better. A global variable has wide scope. Globally, we might need a name like maxTempForMostRecentSAO.

As a practical matter, limit the names of object files to 14 characters, the names of C and C++ identifiers to 31 characters, and do not begin a name with an underscore (except as noted in point i below) or a double underscore.

See McConnell (Chap. 9, esp. Sec. 9.2) for further discussion of the following items.

  1. When using opposites, be precise. For example use first/last, not first/end.
  2. Use meaningful loop index names (as shown in the example below) for long (>15 lines) or nested loops. Loop index names such as i are acceptable for short loops. for (windowIndex = 0; windowIndex <= windowCount; windowIndex++) { /* ... */ }
  3. "Think of a better name than flag for status variables. CharacterType = CONTROL_CHARACTER is more meaningful than StatusFlag = 0x80."
  4. Think twice about short variable names, for example, temp. Is it temperature? Is it temporary? Temporary what? If it's not clear from context, use a better name.
  5. "Give boolean variables names that imply True or False." For instance, success is initialized to False and set to True when the related operation has succeeded.

    Now for some boring but helpful rules:

  6. Use UPPERCASE_IDENTIFIERS for #defined constant and macro identifiers, for constants, for the values of an enumerated type (enumerators), and for global variables (that is, variables that are visible in more than one linkage unit).
  7. Use UpperMixedCase (where the first character is upper case) for user defined types (including enumerations, classes, structs, and unions) and for names of static variables with file scope.
  8. Use lowerMixedCase (where the first character is lower case) for local variable names, function names, formal parameter names, and members of C language structs and unions.
  9. Use _lowerMixedCase (where the first character is an underscore and the next is lower case) for member variables of C++ classes. This helps the reader to distinguish a member variable in a member function.
  10. When using typedef, reflect the style appropriate to the underlying type. For example, use lowerMixedCase for a typedef-name for a function or a built-in type and UpperMixedCase for a typedef-name for a class, struct, or union. If the purpose of the typedef is to transparently switch between underlying types, choose any appropriate naming style.
  11. Exception for C++ access function names: Function names are generally verbs. However, when a member function returns the value of a member variable, the name of the variable sans the leading underscore is the preferred name for the function. For example, if class Parcel has a member named _temperature, then Parcel::temperature() is preferred over Parcel::getTemperature().
  12. Use only generally accepted abbreviations.
  13. Sometimes it is appropriate to use an uppercase abbreviation in a mixed case name. If so, you may use an underscore as a separator between words. SAO_Report is preferred although SAOreport and saoReport are both acceptable.
  14. All identifiers that have global scope in a function library should have a lowercase, two character prefix separated from the identifier by an underscore. For example, gr_newSegment() could be the name of a function in the graphics library. Because the name space is encapsulated, names in a class library need not have a prefix.
  15. Use UpperMixedCase for the name of a file that declares or defines a C++ class. Otherwise, use lowerMixedCase for file names. Suffixes for source code file names are .c for C language definition files, .h for C language include files, .C for C++ definition files, and .H for C++ include files.
  16. Write numbers in decimal, unless there is a good reason to use another base. Clearly document any numbers based upon underlying hardware or software services.
  17. Names of formal parameters should appear in both the function prototype and the function definition. If a function has no formal parameters, use the void keyword instead of an empty parameter list (to compensate for the inconsistency between ANSI C and C++).

Note that many of the above rules are based on, but not identical to, the Identifier Format Conventions in the FSL Conventions.

4. Safe Coding

Safe coding conventions help to catch errors.
  1. Clearly document any assumptions made by functions in the comment block that precedes the function body.
  2. Be sure that every variable is initialized before use.
  3. Avoid type-casts whenever possible. When casting is unavoidable, an explicit cast is preferred over an implicit (compiler provided) one.
  4. Select restrictive argument types. This primarily includes the keywords unsigned and const, and using enumerations in preference to simple integers.
  5. Be sure that all functions have an explicit return type, even if that type is void.
  6. "A ... function must never return a reference or a pointer to a local variable." (Ellemtel Rule 34.) This rule applies to all C functions and to public C++ functions.
  7. "Do not write code which is dependent on the lifetime of a temporary object." (Ellemtel Prt. Rec. 16.)
  8. "Do not allocate memory and expect that someone else will deallocate it later." (Ellemtel Rec. 58.)
  9. "Always assign a new value to a pointer that points to deallocated memory." (Ellemtel Rec. 59.)
  10. "Always use inclusive lower limits and exclusive upper limits." (Ellemtel Rec. 52.) Also see Koenig Sec. 3.6 and Formatting point f in Section 8.
  11. If a C function has no parameters, use the keyword void to ensure consistency with C++. For example, int fnThatTakesNoParams(void);
  12. Use an explicit status parameter for a function that returns a status. Do not overload data-passing parameters or the function return value. That is, do not use a special, reserved data value to indicate a return status.
  13. Other error handling conventions are yet to be determined.

5. Structure

  1. Declare identifiers in a reasonably small scope. See Ellemtel Rec. 57.
  2. Strive for low coupling between cohesive modules. "Coupling measures the amount of interaction and dependency between components... Lower coupling, that is, fewer and simpler interactions, characterizes software that is easier to understand and maintain." (Cargill, p. 82.) "Each class [or module] should serve a single, coherent purpose." (Cargill, p. 174.)
  3. Write cohesive functions. A function should do only one thing and be fairly short. Functions over 100 lines should be relatively rare, but there are cases where this is unavoidable (such as a switch statement with numerous branches, each containing only a line or two).
  4. "Avoid functions with many arguments." (Ellemtel Rec. 41.) Three or fewer arguments is good; five or more is questionable.
  5. Use symbolic names for literal constants that appear in procedural code. For example, use PI instead of 3.14159, MAX_SAO_COUNT instead of 300, and ACK instead of `\6'. Exceptions include zero, very common strings (such as "rb" in an fopen call), and NULL in C++ (see point s in Section 7). Symbolic identifiers document the code and often make it trivial to make changes. Use appropriately scoped const variables or enumerations. Use of #define is discouraged.
  6. Performance optimization takes time and often results in violation of encapsulation boundaries. "Optimize code only if you know that you have a performance problem. Think twice before you begin." (Ellemtel Rec. 1.) "Don't guess. Use an execution profiler to isolate performance problems." (Cargill, Chap. 7.)

6. Portability

  1. "Never specify relative UNIX names in #include directives." (Ellemtel Rule 10.) Use the #include "filename" for user-prepared include files and #include <filename for include files from libraries. (Ellemtel Recs. 10 and 11.)
  2. "Avoid the direct use of [built-in] data types in declarations." (Ellemtel Prt. Rec. 1.)
  3. "Do not assume that an int and a long have the same size." (Ellemtel Prt. Rec. 2.)
  4. "Do not assume that an int is 32 bits long (it may be only 16)." (Ellemtel Prt. Rec. 3.)
  5. "Do not assume that a char is signed or unsigned." (Ellemtel Prt. Rec. 4.)
  6. "Place machine-dependent code in a special file so that it may be easily located when porting code from one machine to another." (Ellemtel Rec. 5.) This may also apply to software-package-dependent code.

7. C++ Specific Considerations

Most of the C conventions carry over to C++. Additional conventions include:
  1. "A class should define a set of objects... Concentrate on common abstractions in a base class... A public derived class should be a specialization of its base." (Cargill, Chap 1.)
  2. "Define a class interface consistently - avoid surprises." (Cargill, Chap. 2.)
  3. Define constructors such that any newly constructed object is in a well defined state. See Cargill, Chap. 2.
  4. "Avoid inheritance for parts-of relations." (Ellemtel Rec. 37.) "Recognize inheritance for implementation; use a private base class or (preferably) a member object [as an alternative]." (Cargill, Chap 3.)
  5. Pass an argument (of a heavyweight user-defined type) via a pointer if the function will modify the argument value; otherwise, use a constant reference (const &). This reduces parameter passing overhead and helps to distinguish parameters used for input from those used for output. See Ellemtel Recs. 42 and 43.
  6. For efficiency, "minimize the number of temporary objects that are created as return values from functions or as arguments to functions." (Ellemtel Rec. 46.)
  7. Declare copy constructor and assignment operator member functions for any class that contains any data which cannot simply be bitcopied. See Meyers, Item 11.
  8. "Use operator overloading sparingly and in a uniform manner." (Ellemtel Recs. 35, 36 and 44.) Paraphrasing Cargill (Chap. 5), the meaning of overloaded operators should be natural and consistent; the set of overloaded operators should be complete.
  9. Declare operator= to take const T& as the right-hand-side parameter and to return T&. See Meyers, Item 15. This differs from the Ellemtel solution.
  10. "When defining operator=, remember x = x." (Cargill, Chaps. 2 and 5.) Also see Ellemtel Rule 28.
  11. "A public member function must never return a non-const reference or pointer to member data." (Ellemtel Rule 29.)
  12. Use inline functions rather than #define to obtain more efficient code. See Ellemtel Rule 35. However, carefully consider the impact of making a function inline. The performance consequences can be surprising.
  13. "Usually, the destructor in a public base class should be virtual." (Cargill, Chap. 4.)
  14. If a member function has been declared virtual in a base class, include the virtual keyword in any declaration of that function in a derived class.
  15. Declare member variables as private or protected. Use access member functions to access to non-public values. See point k in Section 3.
  16. "If possible, always use initialization instead of assignment." (Ellemtel Rule 41.) But, do not write code that depends upon the order of initialization. See Meyers, Item 13.
  17. "Do not assume that static objects are initialized in any special order." (Ellemtel Prt. Rec. 15.)
  18. Within a class declaration, declare public elements defined first, protected elements next, and private elements last. Within each visibility category, declare any constructors first and any destructor next.
  19. Do not use the symbol NULL. Use 0 instead. See Ellemtel Rule 42.
  20. "Do not use malloc, realloc or free." Use new and delete instead. (Ellemtel Rule 50.)
  21. "Always provide empty brackets ("[]") for delete when deallocating arrays." (Ellemtel Rule 51.)
  22. Avoid multiple inheritance. See Cargill, Chap. 9 and Meyers, Item 43.
  23. Global functions should be fully scoped, that is, prepend the :: operator.
  24. In a .H file, avoid including a file that defines a class when a forward declaration would be sufficient (that is, when the class is accessed only via pointers and references). See Ellemtel Rule 9.
  25. "No member functions are to be defined within the class definition." (Ellemtel Rule 21.)
  26. Stream IO is preferred to C standard IO. At any rate, avoid mixing the two.
  27. Global new() will be defined in the foundation library and will handle out-of-memory errors. Include the appropriate file and don't bother to check the return value from new().

8. Format

Imposing constraints on the format of syntactic elements makes the code easier to read because it has a consistent appearance.
  1. Keep source line length to 79 characters or less.
  2. Write each statement on a separate line.
  3. Use 4 spaces to indent code. Avoid tabs - they are interpreted differently by different editors and printers.
  4. The preferred indentation for a continuation line is 2 additional spaces.
  5. "Use parentheses to clarify the order of evaluation for operators in expressions." (Ellemtel Rec. 56.)
  6. The following preferred brace placement and indentation conventions are a slight relaxation of the FSL Conventions. for (i = 0; i < MaxCount; i++) { /* body of loop */ } typedef struct { /* body of struct */ } name; class example { public: example(); ~example(); int value() const; void setValue(int val); private: int _value; };
  7. If a loop does all of its work in the control statement, acknowledge the null statement by placing the closing semicolon on a separate line. Alternately, a set of empty braces may be used. while ((c = getchar()) != NULL) ;
  8. Indent switch statements as follows: switch (ch) { case `\n': case `\r': /* code */ break; case `\0': /* code */ break; default: /* code */ break; } Multiple case statements may share a common clause provided all of the clause is shared. Include a break statement in each clause. Always include a default clause.
  9. Goto statements are discouraged.

9. Comments

Comments should add value. Be clear and concise. Don't bother to repeat the obvious.
  1. Write comments in grammatically correct English.
  2. Provide block comments for all functions and for blocks of code that merit explanation. Make block comments stand out by separating them from code with an extra comment line or a blank line. For example: /* * This is a K&R block comment. Many variants exist, * but there should be white space separating comment * text from code. */
  3. Embedded comments, for example to clarify control flow, do not require emphasis as do block comments.
  4. In C++, use // to begin a comment. // This is a C++ embedded comment. It is not separated // from code by white space.
  5. Comment individual lines of code as necessary. For example, in a Finite State Machine class: void reset(); // move to start state void advance(char); // advance one state transition
  6. Avoid comments to indicate what closing braces close unless the comment contributes significantly to the understandability of the code.
  7. Header comments are discussed in a separate document entitled FX-ALPHA C and C++ Header Templates.

10. Addendum 1 (12/27/94)

These items augment the original 4/28/94 version of this document.

    Variable Naming

  1. _<units> may be appended to the end of variable names that represent meteoro-logical values. Examples are: range_km, maxTemp_C, and height_ftMSL.
  2. _UpperMixedCase may be used for class static member variables. See also item 3.f.

    Function Parameters

  3. Give formal parameters meaningful names. Use the same formal parameter names in the function prototype and the definition. See also item 3.q.
  4. If a function returns only one value, use the function return value.
  5. If a function returns a success-or-failure status, success will be represented by the bool value true and failure by the bool value false.
  6. If a function will return a value via one or more function parameters, those parameters should appear after all input parameters for which there are not default values.
  7. In C++, use pointers for those function parameters that return a value and use pass-by-value or const references for parameters that do not return a value.
  8. See also Ellemtel Chap. 9, Functions.

    Directory Naming

  9. The main criteria for choosing a directory name is that, in the context of its location in the directory hierarchy, it conveys to the reader a clear understanding of its contents. This is more difficult with the artificially flat directory structure imposed by RECOURSE.
  10. Directory names are lowerMixedCase, that is, they begin with a lower-case character and then used mixed case as appropriate.

    File Naming (See item 3.o.)

  11. Use the .H suffix for an include file that can be used only by C++ includers. Use the .h suffix for an include file that can by used by C or C++ includers. If you need the C++ compiler to compile a file, then that file is considered to be a C++ implementation. Otherwise it is a C implementation. Use the .C suffix for a C++ implementation. Use the .c suffix for a C implementation.
  12. There is generally only one class per .H/.C pair. The name of a .H/.C pair is exactly the class name which must begin with an uppercase alphabetic character.
  13. Use lowerMixedCase for the names of C language files. (As for variable names, acronyms may be in uppercase.)
  14. While function names are generally verbs, C language file names are generally nouns. For example, use the file name SAO_Decoder for a file in which the primary function is named decodeSAO().

    Libraries

  15. Provide at most one library file (archive) per source-code directory.
  16. Library files are named as follows: libxxx.a where xxx is identical to the directory name associated with the library and always begins with a lowercase alphabetic character. (This convention is currently violated in Backbone-1.)
  17. When the string "Lib" appears in a directory name, it may be omitted from the library name. For example, the library for the directory uiLib may be named libuiLib.a or uilib.a.

    Environment Variable Naming

  18. Environment variables that expand to path names should expand to an absolute rather than a relative path name. For example, /data/fx-2/netcdf is a good name; ../data a bad one.
  19. Assume that environment variables expanding to path names omit the trailing slash. For example, assume ~fxa/data; not ~fxa/data/.

11. Addendum 2 (1/19/95)

These items augment the revised 12/27/94 version of this document.

    Global Variables

  1. Do not use global variables (that is, those that are visible in more than one linkage unit) without good reason; their use is strongly discouraged. Global variables are difficult to track, make it difficult to change or correct code without changing all clients of the code, make debugging more difficult, and are tough for compilers to optimize. They are generally considered bad style, especially for large multi-developer projects like ours. See also item 5a.
  2. If you have a good reason to use a global variable, adhere to the naming conventions that are described in the introduction to Section 3. The requirement to make global variable names UPPERCASE has been struck from item 3.f (as of the date of this addendum). Instead, use a prefix as described in item 3.n for global functions.

    File-Scoped Variables

  3. Use file-scoped variables (that is, variables that are visible in a single linkage unit) with care. They are useful when a variable is to be shared by many functions within a module (linkage unit). They are dangerous because, as for global variables, it is easy to assign a value but difficult to determine where that assignment took place when debugging.
  4. As for global variables, use a meaningful prefix for file-scoped variables to indicate the scope. For example, use prefix m for module.
 
HTML version created by Joe Wakefield, 18 Jun 97.