How to make a const-correct codebase in 4300 easy steps

One of the codebases that I work on is theoretically C++, but if you peer under the hood, it looks more like 1990-vintage C. It’s 500 KLOC of almost purely procedural code, with lots of structs and few true objects. More dusty and brittle than I’d like.

Image credit: Tim Gillin (Flickr)

I am not a C++ bigot; I first began to do serious, professional coding in C, not long after this codebase got its start. I understand the C-isms pretty well. And although I think Linus got carried away in his rant about the ugliness of C++, I can appreciate the ways that lean C sometimes makes its descendant look ugly and inefficient. (Though C++11 and 14 are making this less true…)

This means that I don’t consider the C-like style of this particular codebase a fatal flaw, in and of itself.

However, since my early coding adventures I’ve been converted to the advantages of OOP for complex projects and large teams, and I’ve also accumulated a lot of battlescars around multithreading. Our codebase needs OOP to solve some encapsulation antipatterns, and it needs RAII and C++11-style mutexing in the worst way. Its old, single-threaded mindset makes far too many things far too slow and error-prone.

A few months ago, we decided to make an investment to solve these problems.

To do it right, I had the team add a story to our scrum backlog about making the codebase const-correct. And therein lies a tale worth recounting…

Continue reading

Good fences make good neighbors

In Robert Frost’s poem, “Mending Wall”, two farmers meet each spring to rebuild the rock wall between their properties. One farmer is the narrator. He notes that the unseen forces of winter and weather always cause some decay (“something there is that doesn’t love a wall”), and he wonders why the wall is necessary. There’s apple orchard on one side, and pine forest on the other–it’s not as if something will be kept in or out. The other farmer answers with the repeated aphorism “good fences make good neighbors.”

photo credit: DragonWoman (Flickr)

This poem could be a treatise for the principle of encapsulation in software. In software as in life:

  • Something there is that doesn’t love a wall.
  • Good fences make good neighbors.

What doesn’t love a wall?

Subroutines, formal interfaces, data hiding, class hierarchies, the pimpl idiom, and similar mechanisms all create barriers in software between consumers and providers of functionality. These techniques are well known, but we still have codebases littered with protected data members, unnecessary class declarations in headers, goto, and other suboptimal choices.

Why? Continue reading

How Sutter’s Wrong About const in C++ 11

Herb Sutter recently gave a talk about how the const keyword and the mutable keyword have subtle but profoundly different semantics in C++ 11. In a nutshell, he says that C++ 11 corrects the wishy-washy definition of const in C++ 98; const used to mean “logically constant,” but now it means thread-safe. And mutable now means thread-safe as well. His summary slide says:

const == mutable == thread safe (bitwise const or internally synchronized)

Editor’s note: Since this post was written, Herb has updated his slide. See Herb’s note in the comment stream below.

Now, I think Herb’s talk is quite informative, and I don’t dispute the core of what he was trying to convey. It’s a good insight, well worth the community’s attention. I learned something important; I recommend that you watch the talk. Using const well is an essential skill. But I think in his enthusiasm about the way the language has evolved to make semantics clearer, Herb does us a disservice by oversimplifying.

When Herb uses the C++ == operator to boil his point down to a pithy summary, he’s implying true equivalence; what’s on one side of the operator is, for all intents and purposes, identical to or indistinguishable from what’s on the other side. And while const and mutable and thread-safe are highly related concepts, they are not equivalent enough to each other for ==.

To understand why, answer the following question: Why would good code use const and/or mutable even if it’s single-threaded?

Ah. I imagine you nodding your head sagely. You see where I’m going, don’t you?

These two keywords don’t just define semantics for cross-thread access; they define the semantics a variable or object supports when accessed by various scopes (e.g., subroutines or code blocks) on the same thread. If you pass a const Widget & to a function, that function can’t call Widget::modifyState() even if it’s the only thread in the universe. If you declare a m_lazy_init member variable to be mutable, you are telling the compiler to let you change it where it would normally be disallowed, including on the same thread.

So: const means unchangeable in whatever scope sees const (including many threads), which is why it also implies thread-safe (if all threads see const); mutable means changing safely in one or many threads, which is why it also implies thread-safe (if all threads see const). In C++ 98, these semantics were a bit loose. You could use them carelessly, cast away parts of their guarantees, and generally operate as a law unto yourself. In C++ 11 the semantics of const and mutable are explicit and exacting; the standard library demands thread-safe copy construction. As a result, their role in thread safety is clarified, and we all write better code. Mutexes and atomics and certain kinds of queues are inherently safe to change from any thread; they deserve and require the mutable keyword.

Instead of Herb’s final equation, I’d propose a Venn diagram:

The const and mutable keywords are not equivalent in C++ 11, but they do share guarantees about thread safety.

The const and mutable keywords are not equivalent in C++ 11, but they do share guarantees about thread safety.

Put Your Const Foot Forward

Here are two C++ style habits that I recommend. Neither is earth-shattering, but both have a benefit that I find useful. Both relate to the order in which constness shows up in your syntax.

1. When you write a conditional, consider putting the constant (unchangeable) value first:

if (0 == i)

… instead of:

if (i == 0)

The reason is simple. If you forget/mis-type and accidentally write a single = instead of two, making the expression into an assignment, you’ll get a compile error, instead of subtle and difficult-to-find misbehavior. (Thanks to my friend Doug for reminding me about this one not long ago.)

Ah, the joys of pointers… Image credit: xkcd.

2. With any data types that involve pointers, prefer putting the const keyword after the item that it modifies:

char const * VERSION = "2.5";

… instead of:

const char * VERSION = "2.5";

This rule is simple to follow, and it makes semantics about constness crystal clear. It lets you read data types backwards (from right to left) to get their semantics in plain English, which helps uncover careless errors. In either of the declarations of VERSION given above, the coder probably intends to create a constant, but that’s not what his code says. The semantics of the two are identical, as far as a compiler is concerned, but the first variant makes the mistake obvious. Reading right-to-left, the data type of VERSION is “pointer to const char” — so VERSION could be incremented or reassigned.

Use the right-to-left rule in reverse to solve the problem. If we want a “const pointer to const char”, then we want:

char const * const VERSION = "2.5";

That is a true string literal constant in C++. (Thanks to my friend Julie for teaching me this one.)

This might seem like nit-picky stuff, but if you ever get into const_iterator classes and STL containers, this particular habit helps you write or use templates with much greater comfort. It also helps if you have pointers to pointers and the like. (For consistency, I prefer to follow the same convention for reference variables as well. However, this is not especially important, since references are inherently immutable and therefore never bind to const.)

Action Item

Share a tip of your own, or tell me why you prefer different conventions.