The High Boskage House Baseball-Analysis Web Site
baseball team and player performance examined realistically and accurately


About Modern Baseball Analysis


Baseball Analysis: An Overview

What Kind of Baseball Information Have We Here?

First: Hello--thanks for dropping by.

This site is not a baseball "statistics" site as such--the internet (and, for that matter, your daily newspaper) can supply you with tons of raw baseball statistics. This site is about analyzing baseball statistics to extract deep and powerful measures showing what batters, pitchers, and teams are really doing toward scoring and giving up runs and--the bottom line--winning baseball games. We know how important those special measures are because this was our business for quite a long time.

There are several extensive introductory articles in the pages of this site, and those are where you will find the detailed explanations of the principles and practices that underlie the baseball measures we present on our main display pages. Farther down this page we will give you links to those explanatory essays, but first we will just summarize the highlights.

It has been recognized for over half a century now that the winning of baseball games, and the underlying constituents of winning--the scoring and yielding of runs--are processes capable of numerical analysis using the statistics of the game as raw materials. To put it simply, we can take just a few familiar team stats--batting average, slugging average, and on-base percentage are one such set--and (assuming these are from some reasonable number of games) predict with striking accuracy a team's runs-scored total for those games. And, of course, what we can do for batters scoring runs we can turn around and do for pitchers yielding runs. Moreover, given team runs-scored and runs-yielded figures, we can predict with similarly striking accuracy how many of its baseball games a team will have won.
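To make the idea concrete, here is a minimal sketch. It does not use this site's own formula, which this page does not give. It assumes two well-known published stand-ins: Bill James's basic Runs Created for estimating runs from ordinary stats, and his "Pythagorean" expectation for turning runs scored and allowed into wins. The function names and the sample team line are hypothetical, chosen only for illustration.

```python
def runs_created(ab, h, bb, tb):
    """Basic Runs Created (Bill James): (H + BB) * TB / (AB + BB).

    Roughly on-base percentage times total bases, so it uses the same
    information as batting average, slugging average, and on-base percentage.
    """
    return (h + bb) * tb / (ab + bb)


def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def projected_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins over a schedule of `games` games."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)


# Hypothetical team-season totals, for illustration only.
estimated_runs = runs_created(ab=5500, h=1450, bb=550, tb=2300)
estimated_wins = projected_wins(runs_scored=800, runs_allowed=700)

print(f"estimated runs scored: {estimated_runs:.1f}")
print(f"projected wins: {estimated_wins:.1f}")
```

The same run estimator can be applied to the stats a pitching staff allows, which gives the runs-yielded side of the calculation.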

And we can go a step further--a very, very important step. Using the same stats we would use to analyze a team, we can analyze a single man, batter or pitcher. The results we get may best be thought of as the runs that would be scored or given up by, as appropriate, a lineup or a pitching staff made up of exact clones of that man. With that kind of information, we can put together the performance stats for the individual men on a baseball team and derive projected team-total values--and from them, calculate games-won values. In other words, given the identities of the men on a baseball team, we can forecast with substantial accuracy how many games that team will win. (But that does not mean that, for example, we can win lots of money at a baseball sports book, because no one can know in advance how much playing time a given manager will give a specific player, nor who will get injured for how long, nor what trades a team may make when.) But we can say quite exactly just how much a given batter or pitcher is (or is not) doing to help his team win baseball games--and those measures are not some arbitrary relative "rankings": they are absolute numbers that can be combined to give familiar and important real-world values (like runs scored) for an entire baseball team.
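The "lineup of clones" idea can be sketched as a per-game rate. This is an illustration and not this site's own measure: it assumes Bill James's basic Runs Created as the run estimator and treats at-bats minus hits as the batter's outs, scaling to the 27 outs of a nine-inning game. The function name and the sample batting line are hypothetical.

```python
def runs_per_27_outs(ab, h, bb, tb):
    """Runs per game for a lineup made up entirely of one batter.

    Uses basic Runs Created, (H + BB) * TB / (AB + BB), and counts
    outs simply as AB - H (no double plays or caught stealing).
    """
    runs = (h + bb) * tb / (ab + bb)
    outs = ab - h
    return 27 * runs / outs


# Hypothetical batting line, for illustration only.
clone_rate = runs_per_27_outs(ab=600, h=180, bb=70, tb=300)
print(f"runs per game from nine clones: {clone_rate:.2f}")
```

Because the result is in runs per game rather than a ranking, it can be compared directly with a league's average runs per game.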

Now as anyone remotely familiar with baseball knows, one of the crucial problems in comparing men's abilities and performances is that the players throughout major-league baseball are not competing on the proverbial "level playing field." A given batter playing in, for example, Coors Field would compile very different statistics than he would playing in, again for example, AT&T Park (or whatever they may be calling it this year), even though his actual abilities would be the same.

The 30 quite different baseball parks affect statistics 30 different ways. Nowadays you can find many places that purport to show such park effects on each baseball statistic, often to several decimal places. We here used to calculate, and use, such correction factors to generate "park-neutral" results, but we have--at least for now--stopped doing so. The reasons we stopped are given at some length in our page on park effects, but in essence it has become virtually impossible, in our opinion, to get meaningful results, chiefly owing to two things. For one, parks now have results-affecting structural changes made with striking frequency (you might be surprised at how little it takes to have some effect), such that "historical" data--even as recent as last year's--is too often meaningless; and even if last year's data is usable, it is nowadays unlikely that several seasons' worth of commensurable data exists, meaning we'd be working with an undesirably small data sample. And for another, the vagaries of scheduling have made it hard (even putting aside changes in the parks) to get a set of normalizable data representing some sort of standard against which to compare a given park. The old technique was to compare home data with away data, but "away" now represents drastically different combinations of parks (something interleague play has further corrupted) from team to team. It may yet be possible to derive some usable measures, and it's something we are working on, but till we get a result we're satisfied with, we prefer to present results that are frankly unadjusted rather than results "adjusted" by a method that may or may not be particularly sound.
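For readers who have not seen it, the old home-versus-away technique can be sketched as below. This is a generic textbook form and not necessarily the calculation this site formerly used: it compares the total runs (by both teams) per game in a team's home games with the same figure for its away games. The function name and the sample numbers are hypothetical, and the objections raised above (changing parks, uneven "away" schedules, small samples) apply to it in full.

```python
def simple_park_factor(home_rs, home_ra, home_games,
                       away_rs, away_ra, away_games):
    """Ratio of total runs per game at home to total runs per game away.

    A value above 1.0 suggests the home park inflates scoring;
    a value below 1.0 suggests it suppresses scoring.
    """
    home_rate = (home_rs + home_ra) / home_games
    away_rate = (away_rs + away_ra) / away_games
    return home_rate / away_rate


# Hypothetical one-season totals, for illustration only.
factor = simple_park_factor(home_rs=450, home_ra=430, home_games=81,
                            away_rs=380, away_ra=370, away_games=81)
print(f"park factor: {factor:.3f}")
```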

Park-specific biases are not the only normalization applicable to baseball stats. Another factor is the undoubted and major "juicing" that the baseball itself underwent sometime in 1993: results from 1977 (when the brand of ball was changed) through 1992 are simply not commensurable with results from 1994 on (with 1993 as a bizarre in-between thing of its own). In years past, we used to apply a correction for that change: originally, we knocked back current data to the older levels; later, we kicked up older data to the current levels; now, since stats from 1993 and earlier don't show in many men's résumés and don't have much impact on career totals when they do, we have just plain stopped making historical corrections. (Earlier hopes notwithstanding, the SillyBall--as we call it--is obviously a permanent change.) But the phenomenon is worth remembering for those interested in historical evaluations.

Going back for a moment to the issue of predicting runs scored or runs yielded from ordinary stats: we want to emphasize the special significance of such measures for pitchers. The usual measure of pitching performance, the ERA, is actually in many ways a better measure of pitchers than any one conventional offensive stat is for batters; but the fact remains that the ERA nevertheless has several severe defects. One is that it can be influenced, often strongly so, by factors not well within any pitcher's control. Such influences--in simplest terms, "luck"--will more or less average out over the long run, but, for pitchers much more so than for batters, the "long run," even for a starter, can be more than one full season. And for relievers, who often come into a game with one or two men already out and don't stay in long, the ERA is nearly meaningless. The measure that we calculate, on the other hand, shows the actual quality level at which any given pitcher (or staff) is really performing. In the long run, it and the ERA will eventually come into pretty close agreement, but, as someone once remarked, "in the long run, we are all dead"; in the short to medium run, our figure is a much more important and accurate measure of pitching quality than anything else now available.


How Is All This Baseball Information Presented?

The way the information pages on this site are laid out is simple enough. You should have no trouble at all picking it up from the Site Directory at the bottom of every page of this site.

Besides the complete player and pitcher listings, we also include a page with lists for all "regular" pitchers and batters--all those meeting a minimum-playing-time criterion (in plate appearances or batters faced, as appropriate, with starters and relievers considered separately), which minimum changes daily through the season. Using those lists, one can see at once how the more significant players and pitchers compare all across major-league baseball.

Perhaps chiefest in importance of all the results pages on this site, we have a Team-Performance page on which the performance of each of the 30 major-league baseball teams as a unit is shown in terms of both actual and projected runs scored, runs allowed, and games won. That page may be the chiefest here because it shows how well each team is currently playing in terms of how many games it appears aimed--by the quality of its play, not its actual record so far--toward winning on the season, a very simple, easily understood datum. (There is a separate explanation page here about that table, to help make its many columns readily comprehensible.)


How Is This Site Operated?

We try to keep the baseball stats on these pages up to date daily, but please understand that this site is a sideline at a pretty busy place of business. That also means that we may not be able to respond individually to all e-mail . . . but be assured that we will read it all! As a rule, updating will happen around 6:00 a.m. Pacific Time (9:00 a.m. Eastern Time), but that is not guaranteed. Our software creates and posts the updates automatically, but those procedures can fail if, as is often the case, there is a defect in the raw data we begin with, and we may not discover and be able to repair such failures till later in the day, if at all. It is also by no means uncommon for our supposedly professional sources of raw baseball data to fail to post their information on time. Check the "through games of" notice on each of our data pages to be sure of what you're seeing. (If the software detects bad raw data, it leaves the previous day's pages in place.)

The data and measures were formerly only for the current major-league season. Starting with 2009, we are also making available full career results--both season-by-season and cumulative--for each man appearing on a major-league roster during the current season; those pages include the current season's data in the cumulative career numbers, so those change every day, just as all the other results pages do. (But we do not, at this time, have or plan on minor-league stats, though methods exist to convert them with reasonable reliability into major-league equivalencies; perhaps some other year . . . .) Those career pages are not separately listed anywhere: you access them by clicking on a man's name wherever you find it on one of the regular listings--by team, by position or role, whatever.

Even after years of operation, this entire site is always under development. Just as with the old beat writers' observation, "You come out to the ballyard every day for twenty years, and every day you see something you never saw before," so with this site: the only constant is change. And if you miss something you would like to see, or see something you would as soon miss, please . . . let us know. There is a simple click-on "email me" link atop and/or at the bottom of every page of this site (we're changing page format, a page at a time--the newer ones have the link in both places, the older ones only at the bottom) to make it easy for you. Take some time, explore the site, then send us your thoughts. And, again, thank you both for visiting here and for any feedback.

One other thing: all pages on this web site have been third-party verified as being 100% fully and strictly compliant with the latest official written standards (currently version 1.0 "Transitional") for Extensible HyperText Markup Language ("XHTML"). That means that any properly designed web browser should display these pages exactly as they were meant to be seen. If you experience any systemic trouble reading these pages, your web browser is likely to be ill-designed, as so many are, including--notably including--the most famous names (can you spell "Micro$oft"?).



Baseball Analysis: Some Details

You can find what we imagine is all you'd want to know--and very possibly more than you want to know--in the "White Papers" about baseball analysis listed in the Site Directory at the bottom of this and every page of this site. But we'll abstract that list here for your convenience:


General Background:
About Particular Pages Here:


Baseball Analysis: Some History

Since that joyous day in 1845 when baseball was played for the very first time--on the grounds called by the cosmically appropriate name The Elysian Fields--followers of the art, both amateur and professional, have been seeking measures of player and team performance. There is a fine summary of that search in the second chapter of John Thorn and Pete Palmer's book The Hidden Game of Baseball, so we won't recount it at length here.

Let it suffice to say that a myriad of such baseball measures have been proposed, and the count grows daily; and each proposed tool has had claimed for it, by its adherents, various reasons why it is important. Till relatively modern times, all such tools--the good and the bad alike--were relative measures: they compared, in one way or another, one man with another or one team with another. They were excellent conversation starters but little else. It was not till the middle of the twentieth century that a serious and sustained effort was made to derive some sort of absolute baseball measure--one that would allow calculation of actual, real-world values of importance (such as games won, the ultimate measure of importance).

In the August 2, 1954, issue of Life magazine, Branch Rickey, possibly the finest mind ever to grace baseball management, set forth a formula (developed with the aid of mathematicians from M.I.T.) that would, in a crude way, predict how many games a team would win based on various commonly available team statistics. In 1964, a Johns Hopkins professor named Earnshaw Cook put out a book titled Percentage Baseball, wherein he derived--by using somewhat arcane mathematical methods (stochastic analysis)--criteria for both teams and individual players that were reasonably successful absolute measures.

More recently, the highly articulate and provocative writing style of Bill James propelled these forms of analysis into the consciousness of the general baseball public, and even the ranks of baseball management. Nowadays, largely because of the success of James' books, everybody and his cousin is producing both books and proprietary measures with which to fill the pages of those books. While a few of these are good and several are awful (no names--you know who you are, or you should), the point of greatest interest to the public is that almost all of these books and measures, no matter how differently expounded, reach pretty much the same results. (Here is one example, and here's another.) On the front page of our Baseball Bookshop, we list a few--by no means all--of the more valuable works now available on these topics.

That consensus of results will, of course, tell the thoughtful one vital thing: this is now a science. Theoretical physicists still argue, with some heat, over small and abstruse side issues about Einstein's principles; but those debates are of little consequence to anybody besides the participants, whereas nuclear power plants--and weapons--are an everyday fact of life. So in baseball: the practitioners of analysis--which some like to call SABRemetrics (often mistakenly rendered as sabermetrics) after SABR, an excellent nonprofit research organization which has done outstanding work in restoring missing baseball records and developing standards for measuring performance, although analysis of the sort we discuss here was not their primary focus--still quibble and squabble over the minor arcana, but on the whole no reasonable person (which omits many professionally associated with the sport) doubts any more that analysis is the only basis for a true understanding of the inner workings of the subtle and wonderful game of baseball.






You loaded this page on Thursday, 2 July 2009, at 10:32 pm EDT;
it was last modified on Tuesday, 17 March 2009, at 5:35 am EDT.

Site Mechanics:



Site Directory:

 This site's Front Page
 Late News about the site



(team and player performance evaluations, updated daily)
The Performance Stats:
    Team Measures:
    Player Measures:

(meanings and explanations of the things on this site)
Baseball-Analysis Background:
    For You Rookies:
 what this site is all about--what it is telling you about baseball, and how, and why
    Some Baseball Analysis Theory:
 a semi-technical backgrounding on modern baseball analysis
    Baseball Stat Definitions:
 the standard and the unique statistics we present
    The "Quality of Pitching" Measures:
 why they are the best way to evaluate pitching
    The SillyBall:
 why baseball before and after 1993 is really two different games
    Fielding and Defense in Baseball
 how important defense is or isn't in baseball, and how to correctly evaluate it
    Baseball Data Normalization:
 why raw stats need "correction", and how and why we can and cannot apply it
    "Steroids" and Other "Performance-Enhancing Drugs":
 why just about everything you think you know about them is wrong
(now a full-fledged site of its own)



(miscellaneous but not unimportant)
Some Miscellaneous Information:
    The Team-Performance Table
 there is a lot in that Table, and this explains what it all is
    The HBH Baseball-Analysis Formula Tested
 what we get when we apply it to half a century of team stats
    The Pitfalls of Park Factors
 an explicit, detailed demonstration of how and why they are so dubious
    About High Boskage House
 who we are and why we might know what we're talking about
    Links To A Select Few Other Useful Baseball Sites
 including those that link to this one



(new, used--find any book, anywhere in the world)
The High Boskage House Baseball Shop:
    What Makes This "Baseball Shop" Special:
    Baseball Books Available Today:


Site Info:

This site is one of The Owlcroft Company family of web sites. Please click on the link (or the owl) to see a menu of our other diverse user-friendly, helpful sites.

Like all our sites, this one is hosted at the highly regarded Pair Networks, whom we strongly recommend. We invite you to click on the Pair link (or their logo) for more information on getting your site or sites hosted on a first-class service.

Comments? Criticisms? Questions?

Please, e-mail me by clicking here.

(Or, if you cannot email from your browser, send mail to webmaster@highboskage.com)

All content copyright © 1999 - 2009 The Owlcroft Company.

This web page is strictly compliant with the W3C (World Wide Web Consortium)
Extensible HyperText Markup Language (XHTML) Protocol v1.0 (Transitional).


