1960s, a pair of brothers employed by IBM used the company's
computers to analyze baseball strategies and players. But the
desire to use statistics to make baseball efficient — to measure and
value precisely the events that occur on a baseball field, to give
the numbers new powers of language — only became potent when
it became practical.

When Bill James published his 1977 Baseball Abstract, two
changes were about to occur that would make his questions not
only more answerable but also more valuable. First came radical
advances in computer technology: this dramatically reduced the
cost of compiling and analyzing vast amounts of baseball data.
Then came the boom in baseball players' salaries: this dramati-
cally raised the benefits of having such knowledge. "If we're going
to pay these guys $150,000 a year to do this," James concluded in
his essay on fielding, "we should at least know how good they
are — which means knowing how much they allowed in the field
just as much as it means knowing how much they created at bat."
If this sounded compelling when baseball players were paid
$150,000 a year, it sounded one hundred times more so when they
were paid $15 million a year.

James's first proper essay was the preview to an astonishing lit-
erary career. There was but one question he left unasked, and it
vibrated between his lines: if gross miscalculations of a person's
value could occur on a baseball field, before a live audience of
thirty thousand, and a television audience of millions more, what
did that say about the measurement of performance in other lines
of work? If professional baseball players could be over- or under-
valued, who couldn't? Bad as they may have been, the statistics
used to evaluate baseball players were probably far more accurate
than anything used to measure the value of people who didn't play
baseball for a living.

Still, had he left off writing in 1977, James would have been dis-
missed as just another crank who didn't know when to shut up
about box scores. He didn't leave off in 1977. It didn't occur to him
to be disappointed by a sale of seventy-five copies; he was encour-
aged! No author has ever been so energized by so little. As James's
wife, Susan McCarthy, later put it, "instead of one page of a stolen
base study lying on top of a couple of pages of pitcher data in the
dungeon of a Stokely Van Camp's cardboard box for years and
years, ideas and questions about issues he had been chewing on for
a long time took up residence in a climate that allowed for growth
and maturation."*

In 1978, James came out with a second book, and this time,
before entering his discussion, he checked his modesty at the
door. The book was titled 1978 Baseball Abstract: The 2nd
Annual Edition of Baseball's Most Informative and Imaginative
Review. "I would like to produce here the most complete,
detailed, and comprehensive picture of the game of baseball avail-
able anywhere," he wrote, "and I would like to avoid repeating
anything that has ever been written before."

Word had spread this time: 250 people bought a copy. To an
author who viewed a sale of 75 copies as encouragement, the sale
of 250 was a bonanza. James's pen was now an unstoppable force.
Every winter for the next nine years he wrote with greater confi-
dence; every spring his growing audience found relatively less
space devoted to numbers and more to James's words. The words
might run on for many pages but they were typically presented as
digressions from the numbers. Wishing to convey the history of
his obsession with baseball, for instance, James buried it in a dis-
cussion of the year-end stats of the Kansas City Royals. Unable to
suppress his distaste for the rich men who bought baseball teams
and spent huge sums of money on players, he left off writing about
the Atlanta Braves and picked up the subject of their new owner,

* An invisible subplot of baseball fanaticism is its effect on the spouses of the
fanatics. "Bill hid his interest in baseball when we first started dating," said his
wife. "If I had known the extent of it, I'm not sure we'd have gotten very far."


"Ted Turner," he wrote, "seems never to have been tempted by
moderation, by dignity or restraint. He is a man who plays hard at
gentleman's games and whines when he loses that the victor was
not a gentleman. No matter how hard he flees, he will always be
pursued by an Awful Commonness, and that is what makes him a
winner." (Yankees fans would soon learn that James was capable
of greater contempt: "Turner is the man Steinbrenner dreams of

The Baseball Abstracts were one long, elaborate aside, and the
aside raised all sorts of strange new questions: If Mike Schmidt hit
against the Cubs all the time, what would he hit? Did fleet young
black players, as it seemed to James, actually lose their speed later
in their careers than fleet young white players? Who were the best
dead hitters? Even the most obscure questions about baseball, and
its history, had practical implications. To calculate what Mike
Schmidt would hit if he hit only against the Chicago Cubs, you
needed to understand how hitting in Wrigley Field differed from
hitting in other parks. To compare white and black speedsters, you
needed to find a way to measure speed on the base paths and in the
field; and once you'd done that, you might begin to ask questions
about the importance of foot speed. To determine the best dead
hitters, you needed to build tools to evaluate them, and those
tools worked just as well on the living.

That last problem preoccupied James. From his second season
on, he more or less set baseball defense to one side and concen-
trated on baseball offense. He explained to the readers of the sec-
ond Abstract that his book contained roughly forty thousand
baseball statistics. A few of them had been easy for him to obtain,
but "the bulk of them were compiled one by one, picked out of the
box scores and laboriously sorted into groups of about 30 or so,
groups with titles like 'Double Plays turned in games started by
Nino Espinosa,' and 'Triples hit by Larry Parrish in July.'" He
freely admitted that collecting baseball statistics was, on the face
of it, a bizarre way to spend one's time — unless one was obsessed


by the baseball offense. "I am a mechanic with numbers," he
wrote to readers of the third Abstract,

tinkering with the records of baseball games to see how the
machinery of the baseball offense works. I do not start with the
numbers any more than a mechanic starts with a monkey
wrench. I start with the game, with the things that I see there
and the things that people say there. And I ask: Is it true? Can
you validate it? Can you measure it? How does it fit with the
rest of the machinery? And for those answers I go to the record
books. . . . What is remarkable to me is that I have so little com-
pany. Baseball keeps copious records, and people talk about
them and argue about them and think about them a great deal.
Why doesn't anybody use them? Why doesn't anybody say, in
the face of this contention or that one, "Prove it"?

For what now seem like obvious reasons the baseball offense
was more interesting to James than the other two potentially big
fields of research, fielding and pitching. Hitting statistics were
abundant and had, for James, the powers of language. They were,
in his Teutonic coinage, "imagenumbers." Literary material.
When you read them, they called to mind pictures. "Let us start
with the number 191 in the hit column," he wrote,

and with the assertion that it is not possible for a flake (I would
hope that no one reading this book doesn't know what a flake is)
to get 191 hits in a season. It is possible for a bastard to do this.
It is possible for a warthog to do this. It is possible for many peo-
ple whom you would not want to marry your sister to do this.
But to get 191 hits in a season demands (or seems to demand,
which is as good for the drama) a consistency, a day-in, day-out
devotion, a self-discipline, a willingness to play with pain and
(to some degree) a predisposition to the team game which is
wholly inconsistent with flakiness. It is entirely possible, on the
other hand, for a flake to hit 48 homers. Hitting 48 homers is
something done by large, slow men three-quarters thespian. . . .

James was an aesthete. But he was also a pragmatist: he had
happened upon something broken and wanted to fix it. But he
could only fix what he had the tools to fix. The power of statisti-
cal analysis depends on sample size: the larger the pile of data the
analyst has to work with, the more confidently he can draw spe-
cific conclusions about it. A right-handed hitter who has gone two
for ten against left-handed pitching cannot as reliably be predicted
to hit .200 against lefties as a hitter who has gone 200 for 1,000.
The offensive statistics available to James in 1978 were suffi-
ciently comprehensive to reach specific, meaningful conclusions.
Offense he could fix. He couldn't fix fielding because, as he had
explained in his first Abstract, there wasn't the data available to
make a meaningful appraisal of fielding. Pitching didn't need to be
fixed. Or, at any rate, James didn't think it did.
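
The sample-size point can be made concrete with a short calculation. The sketch below is an illustration, not anything James wrote: it assumes each at bat is an independent trial with the same chance of a hit (a simple binomial model) and applies the textbook standard error of a proportion. The function name `batting_average_standard_error` is hypothetical.

```python
import math


def batting_average_standard_error(hits: int, at_bats: int) -> float:
    """Standard error of an observed batting average.

    Assumption for illustration only: every at bat is an independent
    trial with the same hit probability (a binomial model).
    """
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    average = hits / at_bats
    return math.sqrt(average * (1 - average) / at_bats)


# Both hitters from the passage have the same observed average, .200.
small_sample = batting_average_standard_error(2, 10)
large_sample = batting_average_standard_error(200, 1000)

print(f"2 for 10:     .200 +/- {small_sample:.3f}")
print(f"200 for 1000: .200 +/- {large_sample:.3f}")
```

Under that assumption the uncertainty shrinks with the square root of the number of at bats, so the estimate built on 1,000 at bats is ten times tighter than the one built on 10.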

In 1979, in the third, now annual, Baseball Abstract, James
wrote, "a hitter should be measured by his success in that which
he is trying to do, and that which he is trying to do is create runs.
It is startling, when you think about it, how much confusion there
is about this. I find it remarkable that, in listing offenses, the
league will list first — meaning best — not the team which scored
the most runs, but the team with the highest batting average. It
should be obvious that the purpose of an offense is not to compile
a high batting average." Because it was not obvious, at least to
the people who ran baseball, James smelled a huge opportunity.
How did runs score? "We can't directly see how many runs each
player creates," he wrote, "but we can see how many runs each
team creates."

He set out to build a model to predict how many runs a team
would score, given its number of walks, hits, stolen bases, etc.
He'd dig out the numbers for, say, the 1975 Red Sox. (Walks by
individual players were still hard to find in 1975, thanks to Henry
Chadwick, but team totals were available.) He could also find out
how many runs the 1975 Red Sox scored. What he needed to deter-
mine was the relative importance to the team's scoring of the var-
ious things Red Sox players did at the plate and on the base
paths — that is, assign weights to outs, walks, steals, singles, dou-
bles, etc. There was nothing elegant or principled in the way he
went about solving the problem. He simply tried out various equa-
tions on the right side of the equals sign until he found one that
gave him the team run totals on the left side. The first version of
what James called his "Runs Created" formula looked like this:

Runs Created = (Hits + Walks) × Total Bases / (At Bats + Walks)

Crude as it was, the equation could fairly be described as a sci-
entific hypothesis: a model that would predict the number of runs
a team would score given its walks, steals, singles, doubles, etc.
You could plug actual numbers from past seasons into the right
side and see if they gave you the runs the team scored that season.
James was, in a sense, trying to predict the past. If the actual num-
ber of runs scored by the 1975 Boston Red Sox differed dramati-
cally from the predicted number, his model was clearly false. If
they were identical, James was probably onto something. As it
turned out, James was onto something. His model came far closer,
year in and year out, to describing the run totals of every big
league baseball team than anything the teams themselves had
come up with.
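
The "predict the past" test described above can be sketched in a few lines. This is a minimal illustration of the formula as quoted, not James's own procedure. The team totals and the runs-scored figure are invented round numbers chosen only to show the mechanics; they are not the 1975 Red Sox or any real team, and the names `runs_created` and `hypothetical_team` are assumptions.

```python
def runs_created(hits: int, walks: int, total_bases: int, at_bats: int) -> float:
    """First version of Runs Created, exactly as quoted in the text:
    (Hits + Walks) x Total Bases / (At Bats + Walks).
    """
    at_bats_plus_walks = at_bats + walks
    if at_bats_plus_walks <= 0:
        raise ValueError("at_bats + walks must be positive")
    return (hits + walks) * total_bases / at_bats_plus_walks


# HYPOTHETICAL season totals, invented for illustration (not real data).
hypothetical_team = {
    "hits": 1500,
    "walks": 500,
    "total_bases": 2200,
    "at_bats": 5500,
}
hypothetical_runs_scored = 750  # also invented

# Plug past totals into the right side, compare with what "happened".
predicted = runs_created(**hypothetical_team)
error = predicted - hypothetical_runs_scored

print(f"predicted runs: {predicted:.1f}")
print(f"actual runs:    {hypothetical_runs_scored}")
print(f"error:          {error:+.1f} ({error / hypothetical_runs_scored:+.1%})")
```

Repeating that comparison for many teams and seasons, and looking at how large the errors are, is the sense in which the equation works as a testable hypothesis.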

That, in turn, implied that professional baseball people had a
false view of their offenses. It implied, specifically, that they didn't
place enough value on walks and extra base hits, which featured
prominently in the "Runs Created" model, and placed too much
value on batting average and stolen bases, which James didn't even
bother to include. It implied that sacrifices of any sort were aptly
named, as they made no contribution whatsoever. That is: outs
were more precious than baseball people believed, or seemed to
believe. Not all baseball people, of course. The Jamesean analysis
was consistent with an approach to the game championed most
vocally by the former manager of the Baltimore Orioles, Earl
Weaver. Weaver designed his offenses to maximize the chances of
a three-run homer. He didn't bunt, and he had a special taste for
guys who got on base and guys who hit home runs. Big ball, as
opposed to small ball.

But once again, the details of James's equation didn't matter all
that much. He was creating opportunities for scientists as much
as doing science himself. Other, more technically adroit people
would soon generate closer approximations of reality. What mat-
tered was (a) it was a rational, testable hypothesis; and (b) James
made it so clear and interesting that it provoked a lot of intelligent
people to join the conversation. "The fact that the formulas work
with the accuracy that they do is a way of saying there are essen-
tially stable relationships between batting average, home runs,
walks, other offensive elements — and runs," wrote James.

This kind of talk was catnip to people whose lives were devoted
to discovering stable relationships in a seemingly unstable world:
physicists, biologists, economists. There was a young statistician
at the RAND Corporation, a future chair of the Harvard statistics
department, named Carl Morris. "I'd been thinking about
advanced ideas in baseball analysis," said Morris, "and was
impressed that someone else was, too, who wrote about it in a
very interesting way." Morris counted the days until the next
Baseball Abstract appeared. James pointed the way to big ques-
tions that Morris could address more rigorously than even James.

There was also a bright young government economist with the
Office of Management and Budget named Eddie Epstein. He stum-
bled across the Abstract and decided he was in the wrong line of
work. "I read the Abstract," he said, "and the light bulb went off:
I can do this! The way Bill laid out very clearly what could be
gleaned from these mountains of baseball data. In the past an
awful lot was thought to be unknowable." Epstein began to pester
Edward Bennett Williams, the owner of the Baltimore Orioles, for
a job.

Then there were the few hobbyists who had been active before
James began writing his Abstracts. Dick Cramer was a research
scientist for the pharmaceutical company then called SmithKline
French (now GlaxoSmithKline), and so had access to a computer.
By day he used the SmithKline computers to discover new drugs
and by night he used them to test his own theories about baseball.
For instance, Cramer had a hypothesis about clutch hitting: it
didn't exist. No matter what the announcers said, and what the
coaches believed, major league baseball players did not perform
particularly well — or particularly badly — in critical situations. On
the one hand, it made a funny kind of sense: no one who behaved
differently under pressure would ever make it to the big leagues.
On the other hand, it contradicted the sacred, received wisdom in
baseball. The sheer counterintuitiveness of his notion delighted
Cramer. "It violates everyone's personal experience of pressure,
and how they cope with it," he said. And yet it was true, or impos-
sible to disprove. Cramer had tested it and found no evidence that
players hit differently in one situation than any other — with a pair
of exceptions. Some left-handed hitters fared worse against lefties
than righties, and some right-handed hitters fared worse against
righties than lefties.

Cramer's work has subsequently withstood intense, repeated
critical scrutiny, but until Bill James came along no one paid it
any attention. "Until Bill came along," Cramer says, "it was just
three or four of us writing letters to each other. Even my own fam-
ily would say, 'This is a crazy way to spend your time.'"

Cramer, like James, understood that the search for baseball
knowledge was constrained by the raw statistics, and began to
think seriously about starting a company to collect better data
about Major League Baseball games than did Major League Base-
ball. One of the men to whom Cramer wrote letters on the subject
was Pete Palmer. Palmer worked as an engineer at Raytheon, on
the software that supported the radar station in the Aleutian
Islands that monitored Russian test missiles. At least that's what
he did for money; for love he sat down with his charts and slide
rule and analyzed baseball strategies. Both Palmer and Cramer had
separately created their own models of the baseball offense that
differed trivially from James's. (Together, they later dreamed up
the stat now widely used to capture the primary importance to
offense of slugging and on-base percentages: OPS, an acronym for
on base plus slugging.) Palmer really was a gifted statistical mind,
and he had done a lot of work, just for the hell of it, that demon-
strated the foolishness of many conventional baseball strategies.
Bunts, stolen bases, hit and runs — they all were mostly self-
defeating and all had a common theme: fear of public humiliation.
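
For readers who want to see what "on base plus slugging" adds up, here is a rough sketch. It rests on two assumptions not stated in the text: on-base percentage is simplified to (hits + walks) / (at bats + walks), leaving out the hit-by-pitch and sacrifice-fly terms of the official definition, and slugging percentage is total bases / at bats. The player totals are invented for illustration and the function names are hypothetical.

```python
def on_base_percentage(hits: int, walks: int, at_bats: int) -> float:
    """Simplified OBP (assumption: ignores hit-by-pitch and sacrifice flies)."""
    return (hits + walks) / (at_bats + walks)


def slugging_percentage(total_bases: int, at_bats: int) -> float:
    """Total bases per at bat."""
    return total_bases / at_bats


def ops(hits: int, walks: int, total_bases: int, at_bats: int) -> float:
    """On base plus slugging: the plain sum of the two rates."""
    return on_base_percentage(hits, walks, at_bats) + slugging_percentage(
        total_bases, at_bats
    )


# HYPOTHETICAL player season, invented for illustration (not real data).
obp = on_base_percentage(hits=150, walks=50, at_bats=550)
slg = slugging_percentage(total_bases=220, at_bats=550)
total = ops(hits=150, walks=50, total_bases=220, at_bats=550)

print(f"OBP {obp:.3f} + SLG {slg:.3f} = OPS {total:.3f}")
```

With the same simplification, the Runs Created formula quoted earlier is just this on-base rate multiplied by total bases, which is one way to see why the two measures lean on the same ingredients.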

"Managers tend to pick a strategy that is least likely to fail
rather than pick a strategy that is most efficient," said Palmer.
"The pain of looking bad is worse than the gain of making the best
move." Palmer had written a book back in the 1960s revealing all
this. The manuscript was still gathering dust on his desk when
Bill James came along and created a market for it. In 1984, in the
wake of Bill James's success, he was able to publish it: The Hid-
den Game of Baseball. "Bill proved there were buyers for this kind
of thing," Palmer says. "I'm not sure the book would have seen
the light of day otherwise."

James's literary powers combined with his willingness to
answer his mail to create a movement. Research scientists at big
companies, university professors of physics and economics and
life sciences, professional statisticians, Wall Street analysts, bored
lawyers, math wizards unable to hold down regular jobs — all these
people were soon mailing James their ideas, criticisms, models,
and questions. His readership must have been one of the strangest
groups of people ever assembled under one idea. Before he found a
publisher, James had four readers he considered "celebrities."
They were:

Norman Mailer

Baseball writer Dan Okrent

William Goldman, the screenwriter (Butch Cassidy and the Sundance Kid)

The guy who played "Squiggy" on the TV sitcom Laverne & Shirley


James's readers were hard to classify because he was hard to
classify. The sheer quantity of brain power that hurled itself vol-
untarily and quixotically into the search for new baseball knowl-
edge was either exhilarating or depressing, depending on how you
felt about baseball. The same intellectual resources might have
cured the common cold, or put a man on Pluto; instead, it was
used to divine the logic hidden inside a baseball game, and create
whole new ways of second-guessing the manager.

Four years into his experiment James was still self-publishing
his Baseball Abstract but he was overwhelmed by reader mail.
What began as an internal monologue became, first, a discussion
among dozens of resourceful people, and then, finally, a series of
arguments in which fools were not tolerated. (Most witheringly
not by James: "Is baseball really 75% pitching? James J. Skipper
attempted to answer this question in the 1980 Baseball Research
Journal, by the ingenious method of asking everybody in sight
what percentage of baseball was pitching, totaling up their answers,
and dividing by the number of people in sight. . . .") By 1981, in
response to a pile of letters asking him what he thought about a
new baseball offense model created by the sports journalist Thomas
Boswell, James was able and willing to write that "the world needs
another offensive rating system like Custer needed more Indians
(or, for that matter, like the Indians needed another Custer). . . .
What we really need is for the amateurs to clear the floor." There
was now such a thing as intellectually rigorous baseball analysis.
James had given the field of study its name: sabermetrics.*

The swelling crowd of disciples and correspondents made
James's movement more potent in a couple of ways. One was that
it now had a form of peer review: by the early 1980s all the statis-
tical work was being vetted by people, unlike James, who had a
deep interest in, and understanding of, statistical theory. Baseball
studies, previously an eccentric hobby, became formalized along
the lines of an academic discipline. In one way it was an even
more effective instrument of progress: all these exquisitely trained,
brilliantly successful scientists and mathematicians were working
for love, not money. And for a certain kind of hyperkinetic ana-
lytical, usually male mind there was no greater pleasure than
searching for new truths about baseball. "Baseball is a soap opera
that lends itself to probabilistic thinking," is how Dick Cramer
described the pleasure.

The other advantage was that the growing army of baseball ana-
lysts was willing and able to generate new baseball data. James
was forever moaning about the paucity of the information kept by
major league baseball teams. Early in one of his Baseball
Abstracts he had explained to his readers that "the answers that I
arrive at — and thus the methods I have chosen — are never wholly
satisfactory, almost never wholly disappointing. The most consis-
tent problems that I have arise from the limitations on my infor-
mation sources. All I have is the box scores." The reason he
couldn't get more than the box scores is that the company that
kept the score sheets for Major League Baseball, the Elias Sports
Bureau, was perfectly unhelpful when James asked it for access to
them. "The problem with the Elias Bureau," he wrote, "is that the
Elias Bureau never turns loose of a statistic unless they get a dol-
lar for it. Their overarching concern in life is to get every dollar

* The name derives from SABR, the acronym of the Society for American Base-
ball Research. In 2002, the society had about seven thousand members.


they can from you and give you as little as possible in return for
it — like a lot of other businesses, I suppose, only with a more
naked display of greed than is really usual."

James was shocked by the indifference of baseball insiders to
the fans who took more than an ordinary interest. Major League
Baseball had no sense of the fans as customers, and so hadn't the
first clue of what the customer wanted. The customer wanted
stats and Major League Baseball did its best not to give them to
him. The people inside Major League Baseball were, if anything,
hostile to the people outside Major League Baseball who wished to
study the game. That struck James, who by now had perfected the
art of sounding like a sane man in an insane world, as mad. "The
entire basis of professional sports is the public's interest in what
is going on," he wrote. "To deny the public access to information
that it cares about is the logical equivalent of locking the stadiums
and playing the games in private so that no one will find out what
is happening."

In 1984, James wrote to what was now a rapidly expanding
crowd of baseball nuts and proposed a radical idea: Take the accu-
mulation of baseball statistics out of the hands of baseball insid-
ers. Build an organization of hundreds of volunteer scorekeepers
who would collect the stuff you needed to know to reduce base-
ball to a science. "What I propose here, so far as I know for the first
time in a century, is to start over. ... I am proposing to re-build
the box score, not around the old one, but around the tool from
which the box score is assembled: the score sheet." He then
explained that much of the data collected by professional baseball
teams — say, how right-handed hitters fared against left-handed
pitching — wasn't available to the public. Worse, baseball teams
didn't have the sense to know what to collect, and so an awful lot

