The Normal Curve

I don’t think posting pieces of chapters is working for any of us, so I’m changing the plan.

We have 16 chapters to go in the book, so I’ll be posting two chapters in their entirety each week for the next eight weeks. The chapters will be posted late on Sunday and Wednesday nights, so you will have several days to read and comment.

Down at the Specks Howard School of Blogging Technique they teach that this is blogging suicide because these chapters are up to 7000 words long! Blog readers are supposed to have short attention spans so I’ll supposedly lose readers by doing it this way. But I think Specks is wrong and smart readers want more to read, not less — if the material is good. You decide.

See you in the comments, then, and again late Wednesday night.

ACCIDENTAL EMPIRES — CHAPTER TWO

THE TYRANNY OF THE NORMAL DISTRIBUTION

This chapter is about smart people. My own, highly personal definition of what it means to be smart has changed over the years. When I was in the second grade, smart meant being able to read a word like Mississippi and then correctly announce how many syllables it had (four, right?). During my college days, smart people were the ones who wrote the most complex and amazing computer programs. Today, at college plus twenty years or so, my definition of smart means being able to deal honestly with people yet somehow avoid the twin perils of either pissing them off or of committing myself to a lifetime of indentured servitude by trying too hard to be nice. In all three cases, being smart means accomplishing something beyond my current level of ability, which is probably the way most other folks define it. Even you.

But what if nothing is beyond your ability? What if you’ve got so much brain power that little things like getting through school and doing brain surgery (or getting through school while doing brain surgery) are no big sweat? Against what, then, do you measure yourself?

Back in the 1960s at MIT, there was a guy named Harvey Allen, a child of privilege for whom everything was just that easy, or at least that’s the way it looked to his fraternity brothers. Every Sunday morning, Harvey would wander down to the frat house dining room and do the New York Times crossword puzzle before breakfast—the whole puzzle, even to the point of knowing off the top of his head that Nunivak is the seven-letter name for an island in the Bering Sea off the southwestern coast of Alaska.

One of Harvey Allen’s frat brothers was Bob Metcalfe, who noticed this trick of doing crossword puzzles in the time it took the bacon to fry and was in awe. Metcalfe, no slouch himself, eventually received a Ph.D., invented the most popular way of linking computers together, started his own company, became a multimillionaire, put his money and name on two MIT professorships, moved into a 10,000-square-foot Bernard Maybeck mansion in California, and still can’t finish the New York Times crossword, which continues to be his definition of pure intelligence.

Not surprisingly, Harvey Allen hasn’t done nearly as much with his professional life as Bob Metcalfe has because Harvey Allen had less to prove. After all, he’d already done the crossword puzzle.

Now we’re sitting with Matt Ocko, a clever young programmer who is working on the problem of seamless communication between programs running on all different types of computers, which is something along the lines of getting vegetables to talk with each other even when they don’t want to. It’s a big job, but Matt says he’s just the man to do it.

Back in North Carolina, Matt started DaVinci Systems to produce electronic mail software. Then he spent a year working as a programmer at Microsoft. Returning to DaVinci, he wrote an electronic mail program now used by more than 500,000 people, giving Matt a net worth of $1.5 million. Eventually he joined a new company, UserLand Software, to work on the problem of teaching vegetables to talk. And somewhere in there, Matt Ocko went to Yale. He is 22 years old.

Sitting in a restaurant, Matt drops every industry name he can think of and claims at least tangential involvement with every major computer advance since before he was born. Synapses snapping, neurons straining near the breaking point—for some reason he’s putting a terrific effort into making me believe what I always knew to be true: Matt Ocko is a smart kid. Like Bill Gates, he’s got something to prove. I ask him if he ever does the New York Times crossword.

Personal computer hardware and software companies, at least the ones that are doing new and interesting work, are all built around technical people of extraordinary ability. They are a mixture of Harvey Allens and Bob Metcalfes—people who find creativity so effortless that invention becomes like breathing or who have something to prove to the world. There are more Bob Metcalfes in this business than Harvey Allens but still not enough of either type.

Both types are exceptional. They are the people who are left unchallenged by the simple routine of making a living and surviving in the world and are capable, instead, of first imagining and then making a living from whole new worlds they’ve created in the computer. When balancing your checking account isn’t, by itself, enough, why not create an alternate universe where checks don’t exist, nobody really dies, and monsters can be killed by jumping on their heads? That’s what computer game designers do. They define what it means to be a sky and a wall and a man, and to have color, and what should happen when man and monster collide, while the rest of us just try to figure out whether interest rates have changed enough to justify refinancing our mortgages.

Who are these ultrasmart people? We call them engineers, programmers, hackers, and techies, but mainly we call them nerds.

Here’s your father’s image of the computer nerd: male, a sloppy dresser, often overweight, hairy, and with poor interpersonal communication skills. Once again, Dad’s wrong. Those who work with nerds but who aren’t themselves programmers or engineers imagine that nerds are withdrawn—that is, until they have some information the nerd needs or find themselves losing an argument with him. Then they learn just how expressive a nerd can be. Nerds are expressive and precise in the extreme but only when they feel like it. They look the way they do as a deliberate statement about personal priorities, not because they’re lazy. Their mode of communication is so precise that they can seem almost unable to communicate. Call a nerd Mike when he calls himself Michael and he likely won’t answer, since you couldn’t possibly be referring to him.

Out on the grass beside the Department of Computer Science at Stanford University, a group of computer types has been meeting every lunchtime for years and years just to juggle together. Groups of two, four, and six techies stand barefoot in the grass, surrounded by Rodin sculptures, madly flipping Indian clubs through the air, apparently aiming at each other’s heads. As a spectator, the big thrill is to stand in the middle of one of these unstable geometric forms, with the clubs zipping past your head, experiencing what it must be like to be the nucleus of an especially busy atom. Standing with your head in their hands is a good time, too, to remember that these folks are not the way they look. They are precise, careful, and . . .

POW!!

“Oh, SHIT!!!!!!”

“Sorry, man. You okay?”

One day in the mid-1980s, Time, Newsweek, and the Wall Street Journal simultaneously discovered the computer culture, which they branded instantly and forever as a homogenized group they called nerds, who were supposed to be uniformly dressed in T-shirts and reeking of Snickers bars and Jolt cola.

Or just reeking. Nat Goldhaber, who founded a software company called TOPS, used to man his company’s booth at computer trade shows. Whenever a particularly foul-smelling man would come in the booth, Goldhaber would say, “You’re a programmer, aren’t you?” “Why, yes,” he’d reply, beaming at being recognized as a stinking god among men.

The truth is that there are big differences in techie types. The hardware people are radically different from the software people, and on the software side alone, there are at least three subspecies of programmers, two of which we are interested in here.

Forget about the first subspecies, the lumpenprogrammers, who typically spend their careers maintaining mainframe computer code at insurance companies. Lumpenprogrammers don’t even like to program but have discovered that by the simple technique of leaving out the comments—the clues, labels, and directions written in English that they are supposed to sprinkle in among their lines of computer code—their programs are rendered undecipherable by others, guaranteeing them a lifetime of dull employment.

The two programmer subspecies that are worthy of note are the hippies and the nerds. Nearly all great programmers are one type or the other. Hippy programmers have long hair and deliberately, even pridefully, ignore the seasons in their choice of clothing. They wear shorts and sandals in the winter and T-shirts all the time. Nerds are neat little anal-retentive men with penchants for short-sleeved shirts and pocket protectors. Nerds carry calculators; hippies borrow calculators. Nerds use decongestant nasal sprays; hippies snort cocaine. Nerds typically know forty-six different ways to make love but don’t know any women.

Hippies know women.

In the actual doing of that voodoo that they do so well, there’s a major difference, too, in the way that hippies and nerds write computer programs. Hippies tend to do the right things poorly; nerds tend to do the wrong things well. Hippie programmers are very good at getting a sense of the correct shape of a problem and how to solve it, but when it comes to the actual code writing, they can get sloppy and make major errors through pure boredom. For hippie programmers, the problem is solved when they’ve figured out how to solve it rather than later, when the work is finished and the problem no longer exists. Hippies live in a world of ideas. In contrast, the nerds are so tightly focused on the niggly details of making a program feature work efficiently that they can completely fail to notice major flaws in the overall concept of the project.

Conventional wisdom says that asking hippies and nerds to work together might lead to doing the wrong things poorly, but that’s not so. With the hippies dreaming and the nerds coding, a good combination of the two can help keep a software development project both on course and on schedule. The real problem is finding such superprogrammers in the first place. Often they hide.

 

**********

 

Back in the 1950s, a Harvard psychologist named George A. Miller wrote “The Magical Number Seven, Plus or Minus Two,” a landmark journal article. Miller studied short-term memory, especially the quick memorization of random sequences of numbers. He wanted to know, going into the study, how many numbers people could be reliably expected to remember a few minutes after having been told those numbers only once.

The answer—the magical number—was about seven. Grab some people off the street, tell them to remember the numbers 2-4-3-5-1-8-3 in that order, and most of them could, at least for a while. There was variation in ability among Miller’s subjects, with some people able to remember eight or nine numbers and an equal number of people able to remember only five or six numbers, so he figured that seven (plus or minus two) numbers accurately represented the ability range of nearly the entire population.

Miller’s concept went beyond numbers, though, to other organizations of data. For example, most of us can remember about seven recently learned pieces of similarly classified data, like names, numbers, or clues in a parlor game.

You’re exposed to Miller’s work every time you dial a telephone, because it was a factor in AT&T’s decision to standardize on seven-digit local telephone numbers. Using longer numbers would have eliminated the need for area codes, but then no one would ever be able to remember a number without first writing it down.

Even area codes follow another bit of Miller’s work. He found that people could remember more short-term information if they first subdivided the information into pieces—what Miller called “chunks.” If I tell you that my telephone number is (707) 525-9519 (it is; call any time), you probably remember the area code as a separate chunk of information, a single data point that doesn’t significantly affect your ability to remember the seven-digit number that follows. The area code is stored in memory as a single three-digit number—707—related to your knowledge of geography and the telephone system, rather than as the random sequence of one-digit numbers—7-0-7—that relates to nothing in particular.

We store and recall memories based on their content, which explains why jokes are remembered by their punch lines, eliminating the possibility of mistaking “Why did the chicken cross the road?” for “How do you get to Carnegie Hall?” It’s also why remembering your way home doesn’t interfere with remembering your way to the bathroom: the sets of information are maintained as different chunks in memory.

Some very good chess players use a form of chunking to keep track of the progress of a game by taking it to a higher level of abstraction in their minds. Instead of remembering the changing positions of each piece on the board, they see the game in terms of flowing trends, rather like the intuitive grammar rules that most of us apply without having to know their underlying definitions. But the very best chess players don’t play this way at all: they effortlessly remember the positions of all the pieces.

As in most other statistical studies, Miller used a random sample of a few hundred subjects intended to represent the total population of the world. It was cheaper than canvassing the whole planet, and not significantly less accurate. The study relied on Miller’s assurance that the population of the sample studied and that of the world it represented were both “normal”—a statistical term that allows us to generalize accurately from a small, random sample to a much larger population from which that sample has been drawn.

Avoiding a lengthy explanation of bell-shaped curves and standard deviations, please trust George Miller and me when we tell you that this means 99.7 percent of all people can remember seven (plus or minus two) numbers. Of course, that leaves 0.3 percent, or 3 out of every 1,000 people, who can remember either fewer than five numbers or more than nine. As true believers in the normal distribution, we know it’s symmetrical, which means that just about as many people can remember more than nine numbers as can remember fewer than five.

In fact, there are learning-impaired people who can’t remember even one number, so it should be no surprise that 0.15 percent, or 3 out of every 2,000 people, can remember fewer than five numbers, given Miller’s test. Believe me, those three people are not likely to be working as computer programmers. It is the 0.15 percent on the other side of the bell curve that we’re interested in—the 3 out of every 2,000 people who can remember more than nine numbers. There are approximately 375,000 such people living in the United States, and most of them would make terrific computer programmers, if only we could find them.
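For the statistically curious, here is a minimal sketch, in Python, of where those tail figures come from. It assumes the usual three-standard-deviation cutoffs of a normal distribution and a rough 250 million U.S. population; the cutoff and the population figure are my assumptions for illustration, since the chapter quotes only the rounded percentages and the 375,000 result.

```python
import math

def upper_tail(sigmas: float) -> float:
    """Fraction of a normal population more than `sigmas` standard deviations above the mean."""
    return 0.5 * math.erfc(sigmas / math.sqrt(2))

# One-sided and two-sided three-sigma tails. The chapter rounds the exact
# values (0.135% and 0.27%) up to "0.15 percent" and "0.3 percent".
one_side = upper_tail(3.0)      # ~0.00135, roughly 3 in every 2,000 people
both_sides = 2 * one_side       # ~0.0027,  roughly 3 in every 1,000 people

US_POPULATION = 250_000_000     # assumed early-1990s figure, not given in the chapter

print(f"beyond +3 sigma:       {one_side:.3%}")
print(f"outside +/- 3 sigma:   {both_sides:.3%}")
print(f"implied U.S. count:    {one_side * US_POPULATION:,.0f}")
print(f"with chapter rounding: {0.0015 * US_POPULATION:,.0f}")  # the 375,000 quoted above
```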

So here’s my plan for leading the United States back to dominance of the technical world. We’ll run a short-term memory contest. I like the idea of doing it like those correspondence art schools that advertise on matchbook covers and run ads in women’s magazines and Popular Mechanics—you know, the ones that want you to “draw Skippy.”

“Win Big Bucks Just by Remembering 12 Numbers!” our matchbooks would say.

Wait, I have a better idea! We could have the contest live on national TV, and the viewers would call in on a 900 number that would cost them a couple of bucks each to play. We’d find thousands of potential top programmers who all this time were masquerading as truck drivers and cotton gin operators and beauticians in Cheyenne, Wyoming—people you’d never in a million years know were born to write software. The program would be self-supporting, too, since we know that less than 1 percent of the players would be winners. And the best part of all about this plan is that it’s my idea. I’ll be rich!

Behind my dreams of glory lies the fact that nearly all of the best computer programmers and hardware designers are people who would fall off the right side of George Miller’s bell curve of short-term memory ability. This doesn’t mean that being able to remember more than nine numbers at a time is a prerequisite for writing a computer program, just that being able to remember more than nine numbers at a time is probably a prerequisite for writing a really good computer program.

Writing software or designing computer hardware requires keeping track of the complex flow of data through a program or a machine, so being able to keep more data in memory at a time can be very useful. In this case, the memory we are talking about is the programmer’s, not the computer’s.

The best programmers find it easy to remember complex things. Charles Simonyi, one of the world’s truly great programmers, once lamented the effect age was having on his ability to remember. “I have to really concentrate, and I might even get a headache just trying to imagine something clearly and distinctly with twenty or thirty components,” Simonyi said. “When I was young, I could easily imagine a castle with twenty rooms with each room having ten different objects in it. I can’t do that anymore.”

Stop for a moment and look back at that last paragraph. George Miller showed us that only 3 in 2,000 people can remember more than nine simultaneous pieces of short-term data, yet Simonyi looked wistfully back at a time when he could remember 200 pieces of data, and still claimed to be able to think simultaneously of 30 distinct data points. Even in his doddering middle age (Simonyi is still in his forties), that puts the Hungarian so far over on the right side of Miller’s memory distribution that he is barely on the same planet with the rest of us. And there are better programmers than Charles Simonyi.

Here is a fact that will shock people who are unaware of the way computers and software are designed: at the extreme edges of the normal distribution, there are programmers who are 100 times more productive than the average programmer simply on the basis of the number of lines of computer code they can write in a given period of time. Going a bit further, since some programmers are so accomplished that their programming feats are beyond the ability of most of their peers, we might say that they are infinitely more productive for really creative, leading-edge projects.

The trick to developing a new computer or program, then, is not to hire a lot of smart people but to hire a few very smart people. This rule lies at the heart of most successful ventures in the personal computer industry.

Programs are written in a code that’s referred to as a computer language, and that’s just what it is—a language, complete with subjects and verbs and all the other parts of speech we used to be able to name back in junior high school. Programmers learn to speak the language, and good programmers learn to speak it fluently. The very best programmers go beyond fluency to the level of art, where, like Shakespeare, they create works that have value beyond that even recognized or intended by the writer. Who will say that Shakespeare isn’t worth a dozen lesser writers, or a hundred, or a thousand? And who can train a Shakespeare? Nobody; they have to be born.

But in the computer world, there can be such a thing as having too much gray matter. Most of us, for example, would decide that Bob Metcalfe was more successful in his career than Harvey Allen, but that’s because Metcalfe had things to prove to himself and the world, while Harvey Allen, already supreme, did not.

Metcalfe chose being smart as his method of gaining revenge against those kids who didn’t pick him for their athletic teams back in school on Long Island, and he used being smart as a weapon against the girls who broke his heart or even in retaliation for the easy grace of Harvey Allen. Revenge is a common motivation for nerds who have something to prove.

The Harvey Allens of the world can apply their big brains to self-delusion, too, with great success. Donald Knuth is a Stanford computer science professor generally acknowledged as having the biggest brain of all—so big that it is capable on occasion of seeing things that aren’t really there. Knuth, a nice guy whose first-ever publication was “The Potrzebie System of Weights and Measures” (“one-millionth of a potrzebie is a farshimmelt potrzebie”), in the June 1957 issue of Mad magazine, is better known for his multivolume work The Art of Computer Programming, the seminal scholarly work in his field.

The first volume of Knuth’s series (dedicated to the IBM 650 computer, “in remembrance of many pleasant evenings”) was printed in the late 1960s using old-fashioned but beautiful hot-type printing technology, complete with Linotype machines and the sharp smell of molten lead. Volume 2, which appeared a few years later, used photo-offset printing to save money for the publisher (the publisher of this book, in fact). Knuth didn’t like the change from hot type to cold, from Lino to photo, and so he took a few months off from his other work, rolled up his sleeves, and set to work computerizing the business of setting type and designing type fonts. Nine years later, he was done.

Knuth’s idea was that, through the use of computers, photo-offset printing—and especially the printing of numbers and mathematical formulas—could be made as beautiful as hot type. This was like Prometheus giving fire to humans, and as ambitious, though well within the capability of Knuth’s largest of all brains.

He invented a text formatting language called TeX, which could drive a laser printer to place type images on the page as well as or better than the old linotype, and he invented another language, Metafont, for designing whole families of fonts. Draw a letter “A,” and Metafont could generate a matching set of the other twenty-five letters of the alphabet.

When he was finished, Don Knuth saw that what he had done was good, and said as much in volume 3 of The Art of Computer Programming, which was typeset using the new technology.

It was a major advance, and in the introduction he proudly claimed that the printing once again looked just as good as the hot type of volume 1.

Except it didn’t.

Reading his introduction to volume 3, I had the feeling that Knuth was wearing the emperor’s new clothes. Squinting closely at the type in volume 3, I saw the letters had that telltale look of a low-resolution laser printer—not the beautiful, smooth curves of real type or even of a photo typesetter. There were “jaggies”—little bumps that make all the difference between good type and bad. Yet here was Knuth, writing the same letters that I was reading, and claiming that they were beautiful.

“Donnie,” I wanted to say. “What are you talking about? Can’t you see the jaggies?”

But he couldn’t. Donald Knuth’s gray matter, far more powerful than mine, was making him look beyond the actual letters and words to the mathematical concepts that underlay them. Had a good enough laser printer been available, the printing would have been beautiful, so that’s what Knuth saw and I didn’t. This effect of mind over what matters is both a strength and a weakness for those, like Knuth, who would break radical new ground with computers.

Unfortunately for printers, most of the rest of the world sees like me. The tyranny of the normal distribution is that we run the world as though it were populated entirely by Bob Cringelys, completely ignoring the Don Knuths among us. Americans tend to look at research like George Miller’s and use it to custom-design cultural institutions that work at our most common level of mediocrity—in this case, the number seven. We cry about Japanese or Korean students having higher average math scores in high school than do American students. “Oh, no!” the editorials scream. “Johnny will never learn FORTRAN!” In fact, average high school math scores have little bearing on the state of basic research or of product research and development in Japan, Korea, or the United States. What really matters is what we do with the edges of the distribution rather than the middle. Whether Johnny learns FORTRAN is relevant only to Johnny, not to America. Whether Johnny learns to read matters to America.

This mistaken trend of attributing average levels of competence or commitment to the whole population extends far beyond human memory and computer technology to areas like medicine. Medical doctors, for example, say that spot weight reduction is not possible. “You can reduce body fat overall through dieting and exercise, but you can’t take fat just off your butt,” they lecture. Bodybuilders, who don’t know what the doctors know, have been doing spot weight reduction for years. What the doctors don’t say out loud when they make their pronouncements on spot reduction is that their definition of exercise is 20 minutes, three times a week. The bodybuilder’s definition of exercise is more like 5 to 7 hours, five times a week—up to thirty-five times as much.

Doctors might protest that average people are unlikely to spend 35 hours per week exercising, but that is exactly the point: Most of us wouldn’t work 36 straight hours on a computer program either, but there are programmers and engineers who thrive on working that way.

Average populations will always achieve only average results, but what we are talking about are exceptional populations seeking extraordinary results. In order to make spectacular progress, to achieve profound results in nearly any field, what is required is a combination of unusual ability and profound dedication—very unaverage qualities for a population that typically spends 35 hours per week watching television and less than 1 hour exercising.

Brilliant programmers and champion bodybuilders already have these levels of ability and motivation in their chosen fields. And given that we live in a society that can’t seem to come up with coherent education or exercise policies, it’s good that the hackers and iron-pumpers are self-motivated. Hackers will seek out and find computing problems that challenge them. Bodybuilders will find gyms or found them. We don’t have to change national policy to encourage bodybuilders or super-programmers.

All we have to do is stay out of their way.