3 reasons to prefer references over pointers (C++)

I still remember what it was like, as a C programmer, to be introduced to the newfangled concept of references in C++. I thought: “This is dumb. It’s just another way to use pointers. More syntactic sugar for no good reason…”

For a long time, I thought of pointers vs. references as a stylistic choice, and I’ve run into lots of old C pros who feel the same. (The debate on this comment stream is typical.) If you’re one of them, let me see if I can explain why I now think I was wrong, and maybe convince you to use references where it makes sense. I won’t try to enumerate every reason–just hit a few highlights.

image credit: xkcd.com

1. References have clearer semantics

NULL is a perfectly valid value for a pointer, but you have to do some headstands to create a reference to NULL. Because of these headstands, and because testing a reference for NULL-ness is a bit arcane, you can assume that references are not expected to be NULL. Consider this function prototype, typical of so much “C++” code written by folks with a C mindset:

void setClient(IClient * client);

With only that line, you don’t know very much. Will client’s state change during the call? Is it valid to pass NULL? In the body of the call, will client ever point to anything other than the value it had at the top of the function?

Veteran C programmers recognize that the semantics are unclear, so they come up with doc conventions to plug the gap, and they check assumptions at the top of the function:

/**
 * @param client IN, cannot be NULL
 */
void setClient(IClient * client) {
    if (client != NULL) { //...do something

This is fine, except why depend on a comment and a programmer’s inclination to read and obey, when you can enforce your intentions at compile time, and write less code, too? Continue reading

How Sutter’s Wrong About const in C++ 11

Herb Sutter recently gave a talk about how the const keyword and the mutable keyword have subtle but profoundly different semantics in C++ 11. In a nutshell, he says that C++ 11 corrects the wishy-washy definition of const in C++ 98; const used to mean “logically constant,” but now it means thread-safe. And mutable now means thread-safe as well. His summary slide says:

const == mutable == thread safe (bitwise const or internally synchronized)

Editor’s note: Since this post was written, Herb has updated his slide. See Herb’s note in the comment stream below.

Now, I think Herb’s talk is quite informative, and I don’t dispute the core of what he was trying to convey. It’s a good insight, well worth the community’s attention. I learned something important; I recommend that you watch the talk. Using const well is an essential skill. But I think in his enthusiasm about the way the language has evolved to make semantics clearer, Herb does us a disservice by oversimplifying.

When Herb uses the C++ == operator to boil his point down to a pithy summary, he’s implying true equivalence; what’s on one side of the operator is, for all intents and purposes, identical to or indistinguishable from what’s on the other side. And while const and mutable and thread-safe are highly related concepts, they are not equivalent enough to each other for ==.

To understand why, answer the following question: Why would good code use const and/or mutable even if it’s single-threaded?

Ah. I imagine you nodding your head sagely. You see where I’m going, don’t you?

These two keywords don’t just define semantics for cross-thread access; they define the semantics a variable or object supports when accessed by various scopes (e.g., subroutines or code blocks) on the same thread. If you pass a const Widget & to a function, that function can’t call Widget::modifyState() even if it’s the only thread in the universe. If you declare a m_lazy_init member variable to be mutable, you are telling the compiler to let you change it where it would normally be disallowed, including on the same thread.

So: const means unchangeable in whatever scope sees const (including many threads), which is why it also implies thread-safe (if all threads see const); mutable means changing safely in one or many threads, which is why it also implies thread-safe (if all threads see const). In C++ 98, these semantics were a bit loose. You could use them carelessly, cast away parts of their guarantees, and generally operate as a law unto yourself. In C++ 11 the semantics of const and mutable are explicit and exacting; the standard library demands thread-safe copy construction. As a result, their role in thread safety is clarified, and we all write better code. Mutexes and atomics and certain kinds of queues are inherently safe to change from any thread; they deserve and require the mutable keyword.

Instead of Herb’s final equation, I’d propose a Venn diagram:

The const and mutable keywords are not equivalent in C++ 11, but they do share guarantees about thread safety.

The const and mutable keywords are not equivalent in C++ 11, but they do share guarantees about thread safety.

Put Your Const Foot Forward

Here are two C++ style habits that I recommend. Neither is earth-shattering, but both have a benefit that I find useful. Both relate to the order in which constness shows up in your syntax.

1. When you write a conditional, consider putting the constant (unchangeable) value first:

if (0 == i)

… instead of:

if (i == 0)

The reason is simple. If you forget/mis-type and accidentally write a single = instead of two, making the expression into an assignment, you’ll get a compile error, instead of subtle and difficult-to-find misbehavior. (Thanks to my friend Doug for reminding me about this one not long ago.)

Ah, the joys of pointers… Image credit: xkcd.

2. With any data types that involve pointers, prefer putting the const keyword after the item that it modifies:

char const * VERSION = "2.5";

… instead of:

const char * VERSION = "2.5";

This rule is simple to follow, and it makes semantics about constness crystal clear. It lets you read data types backwards (from right to left) to get their semantics in plain English, which helps uncover careless errors. In either of the declarations of VERSION given above, the coder probably intends to create a constant, but that’s not what his code says. The semantics of the two are identical, as far as a compiler is concerned, but the first variant makes the mistake obvious. Reading right-to-left, the data type of VERSION is “pointer to const char” — so VERSION could be incremented or reassigned.

Use the right-to-left rule in reverse to solve the problem. If we want a “const pointer to const char”, then we want:

char const * const VERSION = "2.5";

That is a true string literal constant in C++. (Thanks to my friend Julie for teaching me this one.)

This might seem like nit-picky stuff, but if you ever get into const_iterator classes and STL containers, this particular habit helps you write or use templates with much greater comfort. It also helps if you have pointers to pointers and the like. (For consistency, I prefer to follow the same convention for reference variables as well. However, this is not especially important, since references are inherently immutable and therefore never bind to const.)

Action Item

Share a tip of your own, or tell me why you prefer different conventions.

What Should Be In The Next C++ Compiler?

Herb Sutter (champion of VC++ compiler evolution at Microsoft) posted an interesting poll today. He wants to know what the C++ developer community thinks about priorities for the next release.

Go vote.

Editorial comment: I’m not surprised that conformance is winning over performance or fancy features. MS’s compiler has always been easy to use in a vacuum, but a pain when interoperability and cross-platform matter. When Herb went to MS, things improved drastically, but my perception is that gcc leapfrogged VC++ a year or two back, as C++11 began to gel. (VS 2012 probably gets back to parity again; I haven’t poked into it deeply, yet.)