Analysis of Baseball Performance From Baseball Stats: An Overview

What Kind of Baseball Information Have We Here?

First: Hello--thanks for dropping by.

This site is not a baseball statistics site as such--the internet (and, for that matter, your daily newspaper) can supply you with tons of raw baseball statistics. This site is about analyzing baseball statistics to extract deep and powerful measures showing what batters, pitchers, and teams are really doing toward scoring and giving up runs and--the bottom line--winning baseball games. We know how important those special measures are because this was our business for quite a long time.

There are several extensive introductory articles in the pages of this site, and those are where you will find the detailed explanations of the principles and practices that underlie the baseball measures we present on our main display pages. Farther down this page we will give you links to those explanatory essays, but first we will just summarize the highlights.

It has been recognized for over half a century now that the winning of baseball games, and the underlying constituents of winning--the scoring and yielding of runs--are processes capable of numerical analysis using the statistics of the game as raw materials. To put it simply, we can take just a few familiar team stats--batting average, slugging average, and on-base percentage are one such set--and (assuming these are from some reasonable number of games) predict with striking accuracy a team's runs-scored total for those games. And, of course, what we can do for batters scoring runs we can turn around and do for pitchers yielding runs. Moreover, given team runs-scored and runs-yielded figures, we can predict with similarly striking accuracy how many of its baseball games a team will have won.

And we can go a step further--a very, very important step. Using the same stats we would use to analyze a team, we can analyze a single man, batter or pitcher.
The results we get may best be thought of as the runs that would be scored or given up by, as appropriate, a lineup or a pitching staff made up of exact clones of that man. With that kind of information, we can put together the performance stats for the individual men on a baseball team and derive projected team-total values--and from them, calculate games-won values. In other words, given the identities of the men on a baseball team, we can forecast with substantial accuracy how many games that team will win. (But that does not mean that, for example, we can win lots of money at a baseball sports book, because no one can know in advance how much playing time a given manager will give a specific player, nor who will get injured for how long, nor what trades a team may make when.) But we can say quite exactly just how much a given batter or pitcher is (or is not) doing to help his team win baseball games--and those measures are not some arbitrary relative "rankings"; they are absolute numbers that can be combined to give familiar and important real-world values (like runs scored) for an entire baseball team.

Now as anyone remotely familiar with baseball knows, one of the crucial problems in comparing men's abilities and performances is that the players throughout major-league baseball are not competing on the proverbial "level playing field." A given batter playing in, for example, Coors Field would compile very different statistics than he would playing in, again for example, SBC Park, even though his actual abilities would be the same. The 30 quite different baseball parks affect statistics 30 different ways. One of the very important things that we here at High Boskage House do is to "normalize" out those park differences (using adjustment factors calculated in tedious but straightforward ways from published home/away "split" data for teams over the past several years).
The results we want and get are, in effect, the stats a player would have posted playing all his games in one imaginary ballpark that is an exact average in its effects of all the actual ballparks in the league in question. (It is difficult to combine the two leagues into one imaginary ballpark owing to an insufficiency of park-to-park data, even with that abomination, inter-league play.)

Removing park-specific biases is not the only normalization we do on stats. Another thing that anyone remotely familiar with baseball knows is that there has been a major surge in offense levels in the last decade or so. The causes of that surge have been debated--if we can call the incredible follies uttered on the subject "debate"--with very much more heat than light being generated in consequence. It is now, with several years' perspective to draw on, no longer deniable that what happened was (just as we maintained right from when it all started) that there was a material change made in the baseball itself prior to the 1994 (or possibly 1993) season, and the offensive (in more than one sense) surge is nearly or, most likely, entirely due to that change.

That's fine in itself (and we examine the matter at much greater depth elsewhere), but the question remains: what do we do with today's numbers? That is not so much an analysis question as a philosophical question. Most folk interested in and familiar with baseball formed their semi-conscious concepts of "normal" baseball stats sometime during the period (which ranges from 1977, the last time the baseball was changed--only then it was officially acknowledged--through at least 1992) when numbers in the game were quite stable from year to year, save for the one minor freak show in 1987. Quite clearly, with the 1993 (or '94) season baseball entered a whole new era. Numbers are again pretty stable from year to year, but those numbers are rather outlandish-looking to anyone not pretty new to the game.
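
Returning for a moment to the park adjustment described above: the general idea can be illustrated with a minimal sketch in Python. It uses a common public simplification (a team's runs per game at home divided by its runs per game on the road), not the adjustment factors actually calculated for this site. The function names and the sample totals are hypothetical, chosen only for illustration.

```python
def park_factor(home_rs, home_ra, home_games, road_rs, road_ra, road_games):
    """Run environment of a team's home park relative to its road games.

    Uses runs scored plus runs allowed so that the team's own quality
    largely cancels out. A value above 1.0 suggests a hitter-friendly park.
    """
    home_rate = (home_rs + home_ra) / home_games
    road_rate = (road_rs + road_ra) / road_games
    return home_rate / road_rate


def neutralize(raw_value, factor):
    """Scale a full-season total toward an average park.

    Assumption: half of the games were played at home, so only half of
    the season was exposed to the home park's effect.
    """
    season_factor = (1.0 + factor) / 2.0
    return raw_value / season_factor


# Hypothetical home/away split totals for one team (not real data).
pf = park_factor(home_rs=450, home_ra=430, home_games=81,
                 road_rs=380, road_ra=370, road_games=81)
neutral_runs = neutralize(830, pf)
print(f"park factor: {pf:.3f}, park-neutral runs: {neutral_runs:.1f}")
```

A single season of splits is noisy, which is one reason to pool several years of data, as described above, before trusting any such factor.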
What we here at High Boskage House are doing for now is also normalizing all stats for the era: adjusting them so that they correspond to what would have been achieved playing in an average season anywhere between 1977 and 1992. If, as is now certain, the SillyBall (as we call it) is here to stay, some year not too far off--perhaps even sometime this season--we will have to begin rendering data in new-era (uncorrected) form; but for now we continue to make it all look like what most of us would call "normal" numbers.

Going back for a moment to the issue of predicting runs scored or runs yielded from ordinary stats: we want to emphasize the special significance of such measures for pitchers. The usual measure of pitching performance, the ERA, is actually in many ways a better measure of pitchers than any one conventional offensive stat is for batters; but the fact remains that the ERA nevertheless has several severe defects. One is that it can be influenced, often strongly so, by factors not well within any pitcher's control. Such influences--in simplest terms, "luck"--will more or less average out over the long run, but, for pitchers much more so than for batters, the "long run," even for a starter, can be more than one full season. And for relievers, who often come into a game with one or two men already out and don't stay in long, the ERA is nearly meaningless. The measure that we calculate, on the other hand, shows the actual quality level at which any given pitcher (or staff) is really performing. In the long run, it and the ERA will eventually come into pretty close agreement, but, as someone once remarked, "in the long run, we are all dead"; in the short to medium run, our figure is a much more important and accurate measure of pitching quality than anything else now available.

How Is All This Baseball Information Presented?

The way the information pages on this site are laid out is simple enough.
You should have no trouble at all picking it up from the Site Directory at the bottom of every page of this site.

We include lists of all "regular" pitchers and batters--all those meeting a minimum-playing-time criterion (in plate appearances or batters faced, as appropriate), which minimum changes daily through the season (unfortunately, that effectively limits the pitching list to starters). Using those lists, one can see at once how the various regular batters and pitchers compare all across major-league baseball (and recall that these numbers are comparable, owing to the park normalizations we use).

Perhaps chiefest in importance, we have a Team-Performance page on which the performance of all 30 major-league baseball teams as units is shown in terms of both actual and projected runs scored, runs allowed, and games won. That page may be the single most important on the site because it shows how well each team is currently playing in terms of how many games it appears aimed--by the quality of its play, not its actual record so far--toward winning on the season, a very simple, easily understood datum.

How Is This Site Operated?

We try to keep the baseball stats on these pages up to date daily, but please understand that this site is a sideline at a pretty busy place of business. That also means that we may not be able to respond individually to all e-mail . . . but be assured that we will read it all!

As a rule, updating will happen around 6:00 a.m. Pacific Time (9:00 a.m. Eastern Time), but that is not guaranteed. Our software creates and posts the updates automatically, but those procedures can fail if, as is often the case, there is a defect in the raw data we begin with, and we may not discover and be able to repair such failures until later in the day. It is also by no means uncommon for our supposedly professional sources of raw baseball data to fail to post their information in a timely manner.
Check the "through games of" notice on each of our data pages to be sure of what you're seeing. (If the software detects bad raw data, it leaves the previous day's pages in place.)

The data and measures are only for the current major-league season. While we do keep career data for all active major-leaguers, and higher-level minor-leaguers, those data are proprietary to our consulting operations and will not be available on this site.

Also, keep in mind that the data normalization we have referred to will always have a few limitations for current-season data that do not apply to our final, post-season evaluations; those limitations derive from certain implicit assumptions that are not quite true mid-season--for instance, that all the major-league ballparks are essentially the same as they were last year, that each team has played equally at home and on the road, that its games have been played against good and bad teams in the prevailing proportions, that the baseball has not been further tampered with, and that results obtain equally throughout the season (in reality, cool spring weather usually has an impact). Nevertheless, even the accumulation of error from all of these not-quite-valid assumptions is not likely to be so large as to invalidate or even materially compromise the utility of the baseball measures being calculated.

Even after years of operation, this entire site is always under development. Just as with the old beat writers' observation, "You come out to the ballyard every day for twenty years, and every day you see something you never saw before," so with this site; the only constant is change. And if you miss something you would like to see, or see something you would as soon miss, please . . . let us know. There is a simple click-on-me response link near the bottom of every page of this site to make it easy for you. Take some time, explore the site, then send us your thoughts. And, again, thank you both for visiting here and for any feedback.

One other thing: all pages on this web site have been third-party verified as being 100% fully and strictly compliant with the latest official written standards (currently version 1.0 "Transitional") for Extensible HyperText Markup Language ("XHTML"). That means that any properly designed web browser should display these pages exactly as they were meant to be seen. If you experience any systemic trouble reading these pages, your web browser is likely to be, as with so many, including--notably including--the most famous names (can you spell "Micro$oft"?), ill-designed.

Analysis of Baseball Performance From Baseball Stats: Some Details

Analysis of Baseball Performance From Baseball Stats: Some History

Let it suffice to say that a myriad of such baseball measures have been proposed, and the count grows daily; and each proposed tool has claimed for it by its adherents various reasons why it is important. Until relatively modern times, all such tools--the good and the bad alike--were relative measures: they compared, in one or another way or ways, one man with another or one team with another. They were excellent conversation starters but little else.

It was not until mid-century that a serious and sustained effort was made to derive some sort of absolute baseball measure--one that would allow calculation of actual, real-world values of importance (such as games won, the ultimate measure of importance). In the August 2, 1954, issue of Life magazine, Branch Rickey, possibly the finest mind ever to grace baseball management, set forth a formula (developed with the aid of mathematicians from M.I.T.) that would, in a crude way, predict how many games a team would win based on various commonly available team statistics. In 1964, a Johns Hopkins professor named Earnshaw Cook put out a book titled Percentage Baseball, wherein he derived--by using somewhat arcane mathematical methods (stochastic analysis)--criteria for both teams and individual players that were reasonably successful absolute measures.

More recently, the highly articulate and provocative writing style of Bill James propelled these forms of analysis into the consciousness of the general baseball public, and even the ranks of baseball management. Nowadays, largely because of the success of James' books, everybody and his cousin is producing both books and proprietary measures with which to fill the pages of those books.
While a few of these are good and several are awful (no names--you know who you are, or you should), the point of greatest interest to the public is that almost all of these books and measures, no matter how differently expounded, reach pretty much the same results. (Here is one example, and here's another.)

That fact will, of course, tell the thoughtful one vital thing: this is now a science. Theoretical physicists still argue, with some heat, over small and abstruse side issues about Einstein's principles; but those debates are of little consequence to anybody besides the participants, whereas nuclear power plants--and weapons--are an everyday fact of life. So in baseball: the practitioners of analysis--which some like to call SABRemetrics (often mistakenly rendered as sabermetrics) after SABR, an excellent nonprofit research organization which has done outstanding work in restoring missing baseball records and developing standards for measuring performance, although analysis of the sort we discuss here was not their primary focus--still quibble and squabble over the minor arcana, but on the whole no reasonable person doubts any more that analysis is the only basis for a true understanding of the inner workings of the subtle and wonderful game of baseball.
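
To make the idea of an absolute measure concrete, here is a minimal sketch in Python built on two of Bill James's well-known public formulas: basic Runs Created and the Pythagorean expectation. These are stand-ins chosen because they are widely published; they are not the proprietary measures calculated for this site, and the function names and sample season totals are hypothetical.

```python
def runs_created(hits, walks, total_bases, at_bats):
    """Basic Runs Created: (H + BB) * TB / (AB + BB).

    Roughly, an on-base term multiplied by an advancement term,
    scaled by opportunities.
    """
    return (hits + walks) * total_bases / (at_bats + walks)


def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed.

    The classic form uses an exponent of 2; other published variants
    use slightly smaller values.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def projected_wins(runs_scored, runs_allowed, games=162):
    """Expected wins over a schedule of the given length."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed)


# Hypothetical team season totals (not real data).
estimated_runs = runs_created(hits=1500, walks=550,
                              total_bases=2300, at_bats=5500)
wins = projected_wins(runs_scored=800, runs_allowed=700)
print(f"estimated runs: {estimated_runs:.0f}, projected wins: {wins:.1f}")
```

Because both functions return real-world quantities (runs and wins) and not rankings, their outputs can be added up across players or compared directly against a team's actual record, which is the sense of "absolute" used above.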
Measures calculated by High Boskage House Baseball Operations, using proprietary techniques.
All data soon will be (but is not yet) normalized for park effects and seasonal variations.