If X works for 1 ___ [minute | user | computer | customer | …], then 100X ought to work for 100, right? And 1000X for 1000?
Sorry, Charlie. No dice.
One of my favorite books, Universal Principles of Design, includes a fascinating discussion of our tendency to succumb to scaling fallacies. The book makes its case using the strength of ants and winged flight as examples.
Have you ever heard that an ant can lift many times its own weight, and that if one were the size of a human, it could hoist a car over its head with ease? The first part of that assertion is true, but the conclusion folks draw from it is completely bogus. Exoskeletons cease to be a viable structure on which to anchor muscle and tissue at sizes well below that of your average grown-up; the strength-to-weight ratio just isn’t good enough. Chitin is only about as tough as fingernails.
I’d long understood the flaws in the big-ant-lifting-cars idea, but the flight example from the book was virgin territory for me.
Humans are familiar with birds and insects that fly. We know they have wings that beat the air. We naively assume that at much larger and much smaller scales, the same principles apply. But it turns out that at the micro scale, wings don’t move enough air molecules to be helpful when they flap, and at the giant scale (say, the size of an elephant), flapping wings become impractical due to structural challenges.
What does this have to do with software?
For one thing, what works in small codebases often doesn’t work in large ones. The need for disciplined practices such as continuous integration, TDD, encapsulation, loose coupling, and so forth just isn’t pressing if you’re writing a 50-line bash script for your own consumption. This is one reason why I think Steve Yegge’s claim that size (not poor design) is code’s worst enemy is actually quite profound.
Setting aside the way scale affects processes and teams, think about what it does to product.
If you’ve ever tried to apply a design that works well at one scale to a problem domain that’s a couple orders of magnitude different, you know that scaling is often far more difficult than a simple linear calculation would suggest.
Grep is a great tool for finding text in files. It can crunch through all the files in a directory in a second or so, and all the files on my hard drive in a handful of minutes. But using grep to search all the documents in a company’s archives is impractical.
A traditional enterprise search product is also a great tool for finding text in files. It’s too cumbersome to set up to do the quick-and-dirty, small-scale work of grep, but it comes into its own when you need to find all emails sent by a company in the past decade that might be relevant to a lawsuit.
Enter big data…
Google’s indexing of the internet is essentially a scaled-up, incredibly sophisticated, optimized version of traditional enterprise search. Last I heard, over 4 million (!) servers were behind http://www.google.com, servicing the queries that all of us feed it. It’s impressive, miraculous even, how effective the Google service (and Bing, and other competitors) has managed to be. I don’t have major complaints about the user experience.
But it’s the wrong architecture for internet scale. We’re paying way too much for power and hardware to keep these sites running; we need something radically different, which is why technologies like the one I helped productize at Perfect Search are the wave of the future. Perfect Search can sustain query speeds that are hundreds or thousands of times faster than a traditional index; sooner or later, the world will figure out that that matters.
Use better scaling assumptions
The next time you have to plan for a scale that’s well outside original tolerances (whether that scale has to do with numbers of machines/threads/events, or with size of deployment, or whatever), try using the 80:20 rule instead of a linear formula as your guide: the 20% of the cases at the high end of what you’re targeting will take 80% of the effort to address correctly. Recurse: of the 20% at the high end, 80% of those cases might be handled by a modest adjustment of the design, but 20% need a radical improvement. And recurse again.
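The recursive 80:20 estimate can be sketched in a few lines. This is a minimal illustration, not a planning tool; the `effort_share` name and the fixed 80/20 split are my own assumptions for the sketch:

```python
# Hypothetical sketch of the recursive 80:20 effort estimate.
# At each level, the top 20% of remaining cases absorb 80% of remaining effort.
def effort_share(levels):
    """Return (fraction of cases, fraction of total effort) per recursion level."""
    cases, effort = 1.0, 1.0
    out = []
    for _ in range(levels):
        cases *= 0.20   # the hardest 20% of what's left...
        effort *= 0.80  # ...soak up 80% of the remaining effort
        out.append((cases, effort))
    return out

for cases, effort in effort_share(3):
    print(f"top {cases:.1%} of cases -> {effort:.0%} of total effort")
```

Run it and the numbers get stark fast: by the third recursion, roughly half your total effort is going to less than one percent of the cases at the extreme high end of the scale.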
I don’t think there’s anything magical about 80:20; maybe a simple exponential function is easier and more accurate. Or maybe you have a stair-step progression. Or maybe you simply can’t progress to the higher scale at all, because money or physics or some other factor imposes a hard limit that no amount of hand-waving will overcome.
Bottom line: part of your job as a designer/architect is to understand these issues, wrestle with them, and provide a balanced roadmap that anticipates the best all-around compromises.
Analyze the amount of RAM, CPU, or disk that your program uses at several different scales. Is the curve linear, exponential, logarithmic, a stair-step? What does this tell you about business goals like selling to bigger customers?
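One hedged way to run that analysis: sample your program’s cost at a few doubling scales and compare successive ratios. The `classify_growth` helper and its thresholds below are my own rough assumptions, and the workload is a deterministic stand-in (counting operations rather than wall-clock time or bytes) so the sketch is repeatable; swap in your real RAM/CPU/disk measurements:

```python
# Rough heuristic for the shape of a cost curve, assuming sizes double
# at each sample: cost ratio near 1 -> flat/logarithmic, near 2 -> linear,
# 4 or more -> superlinear (quadratic or worse).
def classify_growth(sizes, costs):
    ratios = [b / a for a, b in zip(costs, costs[1:])]
    avg = sum(ratios) / len(ratios)
    if avg < 1.2:
        return "roughly constant/logarithmic"
    if avg < 3.0:
        return "roughly linear"
    return "superlinear (watch out at customer scale)"

# Stand-in workload: counts operations instead of measuring time or memory,
# so the result is deterministic. Replace with your program's real numbers.
def ops(n):
    return sum(1 for _ in range(n))

sizes = [1_000, 2_000, 4_000, 8_000]
print(classify_growth(sizes, [ops(n) for n in sizes]))
```

A superlinear verdict is exactly the business signal the exercise is after: a curve that looks harmless in the demo environment can price you out of the big-customer deal.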