A Quibble With Fowler’s “Optimize Later” Notion

In Refactoring, Martin Fowler (a brilliant engineer whom I greatly admire) articulates an idea that I have heard from smart engineers for a long time: first make it work, then make it fast. He puts it this way:

“Until I profile I cannot tell how much time is needed for the loop to calculate or whether the loop is called often enough for it to affect the overall performance of the system. Don’t worry about this while refactoring. When you optimize you will have to worry about it, but you will then be in a much better position to do something about it, and you will have more options to optimize effectively.”

I mostly agree. Certainly, premature optimization can cause lots of problems (pollute an otherwise clean design, overvalue corner cases, dilute conceptual integrity), and profiler-driven optimization (science, not black magic!) is the way to get the best results. Donald Knuth famously observed that “premature optimization is the root of all evil” — a bit over the top, maybe, yet true often enough to give me fits.
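As a small illustration of what profiler-driven optimization looks like in practice, here is a minimal Python sketch built on the standard library's cProfile and pstats modules. The functions total_with_loop, total_with_sum, and profile_call are hypothetical names invented for this example; nothing here comes from Refactoring. The sketch deliberately does not say which variant is faster, because that is exactly what the measurement is for.

```python
import cProfile
import io
import pstats


def total_with_loop(values):
    """Hypothetical candidate A: sum of squares with an explicit loop."""
    total = 0
    for v in values:
        total += v * v
    return total


def total_with_sum(values):
    """Hypothetical candidate B: same result via a generator and built-in sum()."""
    return sum(v * v for v in values)


def profile_call(func, *args):
    """Run func under cProfile; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)  # show the top 5 entries only
    return result, buffer.getvalue()


if __name__ == "__main__":
    data = list(range(100_000))
    for candidate in (total_with_loop, total_with_sum):
        result, report = profile_call(candidate, data)
        print(candidate.__name__, result)
        print(report)
```

Reading the two reports side by side replaces a guess about the loop with a measurement, which is the position Fowler describes being in once you profile.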

But implicit in Fowler’s advice are the following problematic notions:

  • Optimization is a separate activity from ordinary coding. In particular, it is separate from refactoring.
  • Between the time that you code an original or refactored version, and the time you optimize, the existence of unoptimized code will have negligible effect on how the rest of the code’s ecosystem evolves.

The flaws in the first notion should be obvious; optimization often requires concomitant refactoring. I won’t beat that dead horse here. The second idea, however, deserves further comment.

Sometimes the only time to optimize is before decisions get made :-). Image credit: xkcd

Before you get around to optimizing, what happens if programmers go looking for an API that does X, find your works-correctly-but-suboptimally function, and wrinkle their noses? “Code smell!” they cry. And they write their own function that does a binary rather than linear search, etc. They don’t have time to investigate whether the original version was coded that way for a reason (and thus should simply be refactored); they just need something that works AND that is fast, and your function doesn’t cut it.

I have seen this happen over and over and over again. Late optimization, while smart in many cases, must be managed (communicated, commented, evangelized, trained, reinforced, audited, planned for) carefully or else it will provoke a lot of NIH and contempt from (ironically) less savvy programmers. Result: more guck that needs refactoring.
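One low-cost way to do the "commented" part of that management is to state the deferral in the function itself. The sketch below is a hypothetical Python illustration: the names find_index_linear and find_index_binary, and the wording of the PERF NOTE, are assumptions made up for this example. It shows a deliberately simple linear search that announces its own status, plus the binary-search version it could later be refactored into under the same contract.

```python
from bisect import bisect_left


def find_index_linear(sorted_ids, target):
    """Return the index of target in sorted_ids, or -1 if absent.

    PERF NOTE (deliberate): linear scan, O(n). Kept simple on purpose and
    not yet profiled. If this shows up in a profile, optimize it HERE
    instead of writing a parallel function elsewhere.
    """
    for index, value in enumerate(sorted_ids):
        if value == target:
            return index
    return -1


def find_index_binary(sorted_ids, target):
    """Same contract as find_index_linear, using binary search, O(log n).

    Assumes sorted_ids is sorted in ascending order.
    """
    index = bisect_left(sorted_ids, target)  # leftmost position where target fits
    if index < len(sorted_ids) and sorted_ids[index] == target:
        return index
    return -1


if __name__ == "__main__":
    ids = [2, 3, 5, 7, 11, 13]
    print(find_index_linear(ids, 7), find_index_binary(ids, 7))
```

Because both versions honor the same contract, the later optimization is a drop-in refactoring of the original function rather than a second, competing API.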

Action Item

Find a notoriously sub-optimized function in your code. Study how its existence in non-optimal form has influenced how other code has evolved.

Good Code Is Named Right

(Another post in my “What is ‘Good Code’?” series…)

A rose by any other name may smell as sweet, but in software, the names you choose have consequences.

Rosa berberifolia. Photo credit: I Believe I Can Fry (Flickr).

Names can confuse or cohere. In The Mythical Man-Month, Fred Brooks emphasizes the need for code to have “conceptual integrity.” He means that code should embody a unifying and consistent vision, with minimal distraction or dissonance. Names of classes, functions, applications, interfaces, and resources in RESTful URLs all reflect the code’s cohesiveness or its chaos.

I once worked with an engineer who liked to pull variable names out of the random hopper at the top of his brain: “apple”, “banana”, “ick”… Although his code provoked an occasional snort of amusement, it didn’t do much to guide later readers into a productive mindset.

One way I can distinguish a mediocre engineer from a great one is by the quality of their language, particularly the names they choose. Mediocre engineers are sloppy and inconsistent in their names, because they undervalue the way their code communicates to human beings. Mediocre engineers think that comments are for humans, and code is for computers. But code written in Java or C++ or Ruby doesn’t communicate to computers at all, folks; it has to be turned into op-codes and 1s and 0s before a computer can use it! Code is human language. Comments are like parenthetical asides in normal human speech: needed occasionally, but annoying if they restate the obvious and distract from the flow.

Good engineers understand this. It bothers them if something is called a “Controller” in the code, but it fails to implement IController. It bothers them if .ReadLine() doesn’t always read a line of text from a file; when they run across such a function, they are prone to rename it ReadUpToAFullLine() so the function’s semantics are obvious. If they implement a method that calculates a standard deviation, they are likely to name it something like calcStandardDeviation() instead of stdv() or calc(). (This is not about naming conventions, BTW. I don’t have a problem with short forms or whatever casing convention you prefer; I’m just emphasizing clarity.) Code from great engineers says what they mean, and means what they say. Notice how Martin Fowler (a great engineer) takes this for granted as he discusses an appropriate name for a class in Refactoring:

“Does the price class represent an algorithm for calculating the price (in which case I prefer to call it Pricer or PricingStrategy), or does it represent a state of the movie (Star Trek X is a new release). At this stage the choice of pattern (and name) reflects how you want to think about the structure. At the moment I’m thinking about this as a state of a movie. If I later decide a strategy communicates my intention better, I will refactor to do this by changing the names.”

Somewhere (maybe Scott Meyers?) I remember reading an expert’s lament about people naming classes FooManager, BarManager, etc. His point was that “Manager” says little or nothing about the class’s responsibilities. I agree (although I must admit I’ve written a few XManager classes in my time :-).
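The naming points above can be made concrete with a short Python sketch. Every name in it is hypothetical and chosen only to mirror the examples in the text: calc versus a name that says which statistic is computed, and a reader function whose name admits that it may return less than a full line. The choice of the population (rather than sample) standard deviation is also just an assumption for the example.

```python
import io
import math


def calc(xs):
    """Vague name: nothing tells the reader what is being calculated."""
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))


def calc_population_standard_deviation(values):
    """Same computation as calc(); the name now states what is computed."""
    mean = sum(values) / len(values)
    variance = sum((value - mean) ** 2 for value in values) / len(values)
    return math.sqrt(variance)


def read_up_to_a_full_line(stream, max_chars):
    """Read at most max_chars characters, stopping after a newline if one comes first.

    The name warns callers that they may get back only part of a line.
    """
    return stream.readline(max_chars)


if __name__ == "__main__":
    print(calc_population_standard_deviation([2, 4, 4, 4, 5, 5, 7, 9]))
    print(repr(read_up_to_a_full_line(io.StringIO("hello world\nnext"), 5)))
```

The two statistics functions behave identically; only the second can be understood at the call site without opening the body or reading a comment.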

Truly great engineers take the language insight of good engineers one step further. Not only do they want clear and consistent names–they want their code to resonate to a unifying metaphor.

In the early days of ecommerce (I was writing CC processing stuff in about 1996), nobody talked about “shopping carts.” You just wrote code that accepted credit cards, and you kept track of what the user wanted to buy until they were ready to pay. You accumulated customer state in your session, or maybe your db, in whatever way you could cobble together. Messy. Once the shopping cart metaphor was introduced, it was easy to see how you could let a customer change quantities at the last minute, handle partial payments with different cards, apply discounts and coupons, and so forth.
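To show how a metaphor suggests its own operations, here is a minimal Python sketch of a cart. It is purely illustrative: the ShoppingCart class, its method names, and the single fractional discount are assumptions for this example, not a description of any real ecommerce system, and partial payments across several cards are left out to keep it short.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class ShoppingCart:
    """Hypothetical cart: maps item name -> (unit price, quantity)."""
    items: dict = field(default_factory=dict)
    discount_rate: Decimal = Decimal("0")

    def add(self, name, unit_price, quantity=1):
        """Add an item, or increase its quantity if it is already in the cart."""
        _, existing = self.items.get(name, (unit_price, 0))
        self.items[name] = (Decimal(unit_price), existing + quantity)

    def change_quantity(self, name, quantity):
        """Set a new quantity; zero or less removes the item from the cart."""
        if name not in self.items:
            raise KeyError(name)
        if quantity <= 0:
            del self.items[name]
        else:
            unit_price, _ = self.items[name]
            self.items[name] = (unit_price, quantity)

    def apply_coupon(self, rate):
        """Apply a fractional discount, e.g. '0.10' for 10% off the subtotal."""
        self.discount_rate = Decimal(rate)

    def total(self):
        """Return the discounted total of everything currently in the cart."""
        subtotal = sum(
            (price * qty for price, qty in self.items.values()), Decimal("0")
        )
        return subtotal * (Decimal("1") - self.discount_rate)


if __name__ == "__main__":
    cart = ShoppingCart()
    cart.add("book", "10.00", 2)
    cart.change_quantity("book", 1)
    cart.apply_coupon("0.10")
    print(cart.total())
```

Once the state lives in something called a cart, last-minute quantity changes and coupons become obvious methods on it instead of ad hoc session bookkeeping.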

The power of metaphor in code is so pervasive that it may be invisible unless you’re looking for it. Good metaphor leaks from coders to their managers and marketers and support staff and tech writers–and because it explains so much, so clearly and concisely, the audience gloms onto it immediately. From there it leaks out to customers and the blogosphere, and we start taking it for granted. Which says more to you: “a software application that lets you pretend to be running a full OS with simulated hardware” or “virtual machine”? How about “self-replicating program that subverts the normal purpose of software” or “virus”?

Action Item

Find a place in code where comments are compensating for a class, function, or variable with a less-than-ideal name, and fix it.

Extra Credit

Find a place in code where you have a weak or inconsistent metaphor. List implications of that metaphor problem. Brainstorm improvements; if one of the improvements seems particularly helpful, implement it.

Good Code Is Optimized

(Another post in my “What is ‘Good Code’?” series…)

Yes, optimized.

But for what?

A lot of programmers seem to think that raw speed of execution is the only possible answer. If pushed, they may admit it’s also possible to optimize for minimal memory usage or minimal size of executable. Compilers have switches for that.

Get out of the box. Photo credit: lel4nd (Flickr).

Emerson said, “A foolish consistency is the hobgoblin of little minds.” In modern terms, he was deploring the lazy instinct to accept established wisdom instead of thinking outside the box. And I think optimization is one of those topics where we need a larger vision.

What about optimizing for:

  • Speed of coding (sometimes programmer time is the most constrained resource…)?
  • Ease of use (often, a low learning curve and productive users outweigh all other factors…)?
  • Speed of testing (sometimes provably correct is the most important success criterion…)?
  • Full utilization (the major promise of physical-to-virtual-to-cloud migration)?
  • Ease of understanding and maintenance?
  • Integration with external systems?

Selecting the criteria against which you optimize is more than a technical question. It’s a strategic one, driven by business and organizational goals. Programmers who relentlessly pursue speed of execution to the exclusion of other considerations are not doing their teams or their companies a favor.
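A tiny Python sketch can show how the optimization target changes the code even when the behavior is identical. Both function names below are hypothetical. The first version is written for ease of understanding; the second uses a well-known bit trick that is shorter and likely faster, although whether the difference matters is something to measure rather than assume.

```python
def is_power_of_two_readable(n):
    """Optimized for ease of understanding: halve n until it is odd, then check for 1."""
    if n < 1:
        return False
    while n % 2 == 0:
        n //= 2
    return n == 1


def is_power_of_two_terse(n):
    """Optimized for brevity and (likely) speed.

    A power of two has exactly one bit set, so clearing the lowest set bit
    with n & (n - 1) leaves zero.
    """
    return n > 0 and (n & (n - 1)) == 0


if __name__ == "__main__":
    for candidate in (0, 1, 6, 64, 1000, 1024):
        print(candidate, is_power_of_two_readable(candidate), is_power_of_two_terse(candidate))
```

Neither version is "the" optimized one in the abstract; each is optimized against a different criterion, and picking between them depends on who will maintain the code and how hot the call site really is.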

Action Item

Optimize one function for ease of understanding and maintenance. Make a short list of how your choices differed from the ones you would have made when optimizing for speed of execution.

Good Code Is Balanced

In my first post about what constitutes “good code,” I claimed we were dealing with a complex question. This is why I distrust short answers.

So many competing concerns must be balanced to achieve goodness:

  • Testability
  • Maintainability
  • Short-term revenue pressures
  • Long-term strategic value
  • Performance (many aspects)
  • Scalability (up, down, across)
  • Ease of use
  • Supportability
  • Conceptual integrity
  • Alignment with the skills, temperament, interests, and tools of the team that owns it
  • Cost vs. benefit (for some problems, quick and dirty is definitely “right”)
  • Simplicity (separation of concerns)

More items undoubtedly belong on the list. Quite a balancing act!

Someone’s got this “balance” thing down! Photo credit: joãokẽdal (Flickr).

Action Item

Pick a module, application, or subsystem that you know well, and grade its code according to how much its coders emphasize a few different dimensions (e.g., performance, testability, scalability, ease of use). Do you like the balance? Are any attributes being neglected?

What Is “Good Code”?

This is one of those questions that I often ask when I am interviewing a programmer for a job, trying to get a sense of the engineer’s maturity with the craft. (And for the record, I don’t think the question has a “right” answer. Certainly there is no ideal one-sentence response.)

Sometimes I get answers like this:

  • “Whatever gets the job done.”
  • “Whatever sells.”
  • “Whatever solves the customer problem.”

Answering tough interview questions. Photo credit: bpsusf (Flickr).

Such answers tell me that an engineer is practical, business-aware, and customer-focused–all useful traits. Pragmatism is usually learned in the economic school of hard knocks, and it’s a critical perspective that should never be forgotten. But I don’t get a warm fuzzy from pure pragmatism; it lacks vision or love of craft.

Another kind of answer focuses on cleverness:

  • “I wrote a burst sort once that could beat stdlib qsort. It’s counterintuitive, I know, but the way burst sort works cache…”
  • “You wouldn’t believe how much you can say with a 3-line statement in Python…”

An engineer with this type of perspective also has praiseworthy qualities–an appreciation for elegance, a desire to achieve. But I find these answers unsatisfying as well. For one thing, the statements are lonely; notice how little they imply about who will build upon or use the finished product. In this vein, I love Martin Fowler’s warning in Refactoring:

“Any fool can write code that a computer can understand. Good programmers write code that humans can understand.”

I’ll also include a “miscellaneous” response category, which encompasses stuff like:

  • “Good code has to be scalable, make efficient use of resources, optimize for good performance, …”
  • “Good code is maintainable and testable.”
  • “KISS — keep it simple, stupid.”

All true (the first statement less than the other two, IMO), but all less than fully satisfying.

So what answer would impress me?

Albert Einstein supposedly said, “Make things as simple as possible, but not simpler.” Well, I think good code is quite a complicated subject, and the first thing that would impress me is an acknowledgement that I’ve posed a very difficult question indeed.

Since I’ve already mentioned one of my favorite quotes about simplicity, I think I’ll mention the other here, as well. Oliver Wendell Holmes: “I would not give a fig for the simplicity this side of complexity, but I would give my life for the simplicity on the other side of complexity.”

Holmes was talking, I believe, about wrestling with a difficult, multifaceted problem, and distilling it down to its essence only after all its dimensions are fully understood.

I’m going to post a few observations on what I think constitutes good code in coming days. These will be glimpses of zen that I’ve occasionally stumbled upon on the “other side of complexity” as I’ve wrestled with the craft through my career.

I’ll be curious to know what you think, as well.

(Read more posts in my “What is ‘Good Code’?” series…)