The High Boskage House Baseball-Analysis Web Site: baseball team and player performance examined realistically and accurately
Probability and Baseball

To begin by discussing "probability", an abstruse field of mathematics, on a baseball site may seem to be putting the cart before the horse--but in reality, probability theory is the horse that pulls the analysis cart. Probability theory can get pretty hairy pretty quickly; fortunately, for our purposes we need only grasp a few basic ideas that in any event correlate well with our everyday experience of baseball and the world--as they should, since at bottom everything in the universe is probabilistic.

(As a sidelight, it is both interesting and amusing that probability theory, which began in the seventeenth century, was originally contrived to answer the questions some aristocratic gentlemen had about the true odds in certain dice games they favored. So probability theory's relationship with games is historically sound.)

The main reason we need a little insight into probability is to understand just what it means to say we have a "formula" for team runs scored or some such. Some formulas are not probabilistic: if we know the velocity of the baseball and the distance from the pitcher's hand to the plate and the resistance of the air, we can calculate exactly the time the baseball will take to get to the batter--not "more or less" but exactly. But obviously we cannot calculate the runs a team will score from its plate appearances, hits, walks, and so on with that same kind of absolute, cause-and-effect, dead-on accuracy. What we can calculate with near-perfect accuracy is the probable number of runs the team will have scored.

But if we say this or that baseball formula is correct as to the probable result, what do we have? If the team actually scored 800 runs and the formula said 775, what--if anything--do we know? To gain understanding, we need no more complex tool than a coin, which we will, in imagination, undertake to toss.
We all understand implicitly that a fair, balanced coin will come up heads half the time (and tails the other half), which is why so many decisions that we want to make in an unbiased way are made by tossing a coin; so much is elementary. Or is it? Just what do we really mean by "half the time"? That needs to be looked at more closely.

We certainly do not expect that a tossed coin will come up heads-tails-heads-tails et cetera forever. Nor do we require or expect that if we toss it 10 times we always and surely get 5 heads and 5 tails. In short, what we mean by "half heads" is that the more times we toss the coin, the more closely we expect the percentage of heads to approach 50 percent. In 10 tosses, 7 heads is of no account; in 100 tosses, 70 heads would surprise us mightily; and in 10,000 tosses, 7,000 heads would shock us to the core (or, more correctly, would absolutely, positively convince us that the coin is in fact not a fair, balanced coin).

As a matter of record, there is a precise, if complex, mathematical relationship between the number of times we "sample" something and our confidence that our samples represent "true" or very-long-term expected results. That is what is referred to in, for example, political polling results when the pollster speaks of "a margin of error of plus-or-minus X percent." That is, the more coin tosses we make, the closer we expect the heads percentage to approach 50%; the relation is a formula that tells us in definite numbers how far off from 50% heads we typically expect to be for a given number of tosses.

An interesting sidelight is that added data become progressively less helpful in improving confidence. If a rookie bats .300 in his first full year, we are hopeful but not fully convinced; but if he bats .300 or so again the next year, we feel that the team has found a real .300 hitter. If he then bats .300 for his career--well, we already expected that, didn't we?
The increase in at-bats from the 1200 or so of his first two seasons to the perhaps 10 times as many of his full career gave us less new information than the mere doubling of the number from his first year to his second. Mathematically stated, confidence goes up as the square root of the data multiple: that is, it takes four times as much data as we have to double our confidence in what the data are telling us. That's one big reason why pollsters can determine pretty well, for example, how popular a given TV show is by surveying only a few hundred households.

If we do the math--which we won't here--using typical baseball-team numbers, we find that over a full 162-game season, for runs scored we can expect an average variation from target of a little under 3 percent (about 2.9%) from chance alone. Over a more restricted period of time--say the first month of a season--the expected average scatter rises significantly, to around 6%, owing to the reduced data sample.

Probability theory tells us more than just the expected average error. It also tells us how we should expect any actual set of results to be distributed around that average. Think of it this way: if we firmly clamp a rifle in a vise so that it is aimed directly and precisely at the bullseye of a target some distance away and then fire that rifle a number of times, what do we expect the pattern of the holes in the target to look like? If we did the test in an ideal indoor windless test room we might just get a large series of dead bullseyes; but if we do it outdoors, where there is a wind blowing in a moderate but randomly variable (in both direction and velocity) way, what we expect is a scattering--but one centered on the bullseye. Moreover, we expect to see most of the holes fairly near the bullseye, a few a little ways out, and perhaps an occasional one quite a ways out (such things are thus appropriately known as "outliers").
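The same behavior can be reproduced with the coin from earlier in place of the rifle. What follows is only a minimal Python sketch, assuming a perfectly fair coin; the function names and the batch sizes are made up for illustration. It compares the typical scatter that theory predicts for the heads fraction (the standard deviation, the square root of p(1 - p)/n) with what a simulation actually produces, and counts how many batches land within that typical scatter of the 50% target.

```python
import math
import random
import statistics
from typing import List


def expected_scatter(tosses: int, p_heads: float = 0.5) -> float:
    """Theoretical standard deviation of the heads fraction over `tosses` tosses."""
    return math.sqrt(p_heads * (1.0 - p_heads) / tosses)


def simulate_heads_fractions(tosses: int, batches: int, rng: random.Random) -> List[float]:
    """Toss a fair coin `tosses` times, `batches` times over; return each batch's heads fraction."""
    fractions = []
    for _ in range(batches):
        heads = sum(1 for _ in range(tosses) if rng.random() < 0.5)
        fractions.append(heads / tosses)
    return fractions


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed so reruns give the same numbers
    # Each step quadruples the number of tosses, so theory says the scatter halves.
    for tosses in (100, 400, 1600):
        fractions = simulate_heads_fractions(tosses, batches=500, rng=rng)
        observed = statistics.pstdev(fractions)
        theory = expected_scatter(tosses)
        near = sum(1 for f in fractions if abs(f - 0.5) <= theory) / len(fractions)
        print(f"{tosses:5d} tosses: theory {theory:.4f}, observed {observed:.4f}, "
              f"share within one typical scatter of 50%: {near:.0%}")
```

Each step in the loop uses four times as many tosses as the one before, so the theoretical scatter is cut in half each time, which is the square-root rule in action. Most batches should land close to 50%, with a minority farther out.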
If we measured the distance of each hole from the exact bullseye center and then made a little graph plotting number of holes (data points) against distance from the bullseye (expected norm), the graph, given enough shots (data) to show its shape clearly, would look something like a cross-section of the Liberty Bell, which is why such distributions are called "bell curves" (you've probably heard the term). (Technically they're "Gaussian distributions," named after the mathematician Carl Friedrich Gauss.)

The point of this digression is that if you take the results of any competent analysis of baseball statistics--let's say HBH's "TOP" formula--and repeatedly compare its predictions against real-world results, you expect to see a bell curve whose exact size and shape depend on definitely known numbers. If that is the case, you have good cause to say that the formula is correct. There are minor differences in accuracy between the various formulae from various sources, but those differences are very small compared to the degree to which all of them, however and by whomever derived, generally agree with one another and with the expected scatter patterns probability mathematics demands of an accurate formula.

So that you can see that we put our money where our mouth is, we include this demonstration tabulation of the HBH TOP formula tested on a full half-century of baseball stats. (You can also take a look at short-term results on the Team-Performance page on this site, but we don't link it at this specific spot because you should read more before going there.)

The Logic of Baseball Analysis

Winning Games

Individual baseball games are, obviously, won or lost based on a very clear and simple rule: the team that scores more runs than it gives up by the end of the game wins.
Less obvious is that there is a definite and clear relationship between the runs a team scores and gives up over a series of games and the percentage of games it wins in that series (and, again, that of course is a probabilistic relation). Given that fact, if we knew how many runs a team could be expected to score and give up over a season, we could predict with reasonable accuracy how many games they would win in that season.

There are numerous versions of this formula; they often look very different, but when one does an engineering analysis with typical baseball numbers, they essentially resolve into the same thing. Naturally, they thus also each give almost exactly the same results for a given set of games and runs figures. Cook, in Percentage Baseball, used simply

1/2 x R/OR

(where R and OR are Runs and Opponents' Runs) to get the expected win percentage. Bill James has used his so-called "Pythagorean" formula, not easily reproduced on a web page. We at High Boskage House have yet another. None of them is really right, because "right" here would be a very messy probabilistic equation based on typical scatter of runs scored around its average value for a team (that is not a simple bell curve, because it is constrained at one side, the lower limit--you can't score fewer than zero runs--but there is no upper limit, especially with the SillyBall). But they all work quite well enough.

(An important aspect of baseball statistics is that in the real world they don't actually have a very large range of values: for example, once past a few at-bats, no one hits .007 or .731; no team's seasonal run total is 37 or 23,469; and so on. Because of that narrowness, an equation relating values can be rather drastically wrong overall, but still manage to give tolerably correct answers within that narrow range. To a human, the world looks flat, because the very narrow segment of it we see is curved so slightly we can't notice it.
In mathematical terms, a linear approximation can work on almost any function if it only has to deal with a very narrow segment.)

To make this less mystical, consider a team that plays a fairly large number of games against another team or set of teams--in fact, a typical baseball season. If, at the end of that time, the team has scored exactly as many runs as it has given up, it is no great leap of logic to say that on balance they have been neither better nor worse a team than their average opponent. That being so, we would expect that the most reasonable outcome is that they have won no more than they have lost: that they are .500 in those games. All games-won formulae thus must meet the test that at equal numbers of runs scored and runs yielded, they predict a .500 win percentage. (Consider, for example, Cook's formula as given above.)

Moreover, we certainly feel that if the team has outscored their opponents by a little, they should have won a little more than half their games; and if they outscored them a lot, then they should have won at well over a .500 clip. The various formulae quantify those expectations, giving specific, reasonably reliable win percentages for specific R and OR run sets.

Scoring Runs

If we can--and we indeed can--project probable games won from runs scored and runs yielded over any arbitrary set of games (with, of course, increasing accuracy as the number of games in the series rises), we would next like to be able to project runs scored and yielded for a team based on who is playing and pitching for it. If we could do that as well--and here too we can--we could then project with some accuracy a team's ultimate win percentage just from the identities of its player personnel.

The essence of scoring runs in baseball is remarkably straightforward: put runners on base and then drive them in. The background is the ticking clock of baseball--outs.
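Before going on to outs, the games-won formulae discussed in the Winning Games section can be compared directly. The following is a hedged Python sketch, not the High Boskage House formula (which this page does not give). Cook's form is taken as quoted above. The "Pythagorean" form is supplied here as an assumption, in its commonly published version, R squared over the sum of R squared and OR squared. The function names are made up for illustration.

```python
def cook_win_pct(runs: float, opp_runs: float) -> float:
    """Cook's linear form as quoted in the text: one half times R/OR."""
    return 0.5 * runs / opp_runs


def pythagorean_win_pct(runs: float, opp_runs: float, exponent: float = 2.0) -> float:
    """Commonly published "Pythagorean" form (exponent 2 is an assumption here)."""
    r = runs ** exponent
    o = opp_runs ** exponent
    return r / (r + o)


if __name__ == "__main__":
    opp_runs = 750
    # Realistic season totals first, then one absurd total to show where
    # the linear form stops being usable.
    for runs in (650, 700, 750, 800, 850, 1500):
        print(f"R={runs:4d} OR={opp_runs}: Cook {cook_win_pct(runs, opp_runs):.3f}, "
              f"Pythagorean {pythagorean_win_pct(runs, opp_runs):.3f}")
```

Both functions return exactly .500 when runs scored equal runs yielded, and they stay within a few thousandths of each other across realistic run totals. Only at an unrealistic total, such as twice the opponents' runs, does the linear form drift badly (it would predict a 1.000 win percentage), which is the narrow-range point made above.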
Of all the many and diverse numbers in baseball analysis, none is nearly so important as this one: three, the three outs that define an inning.

Another thing that probability mathematics tells us is that the chance of two things both happening is the chance of one multiplied by the independent chance of the other. The chance of a man getting on base is very simply expressed by a now-familiar (if late-arriving) stat: the on-base percentage. To get the chances of a man at the plate becoming a run scored, we need to take his on-base percentage and multiply it by some factor representing the chances that a teammate will knock him in. (We do need, naturally, to make some adjustment to the raw on-base percentage to allow for the facts that the man may get on by an error, and also that he may be put out on the basepaths even after having reached safely.)

As an aside, we need to remember at all times--which many discussions and analyses we have seen do not--that the batter at the plate is also a base runner. That is, there is always at least one runner on for every batter: himself. He is the base runner on "zeroth base." What he does as a batter independently affects him as a base runner (analysts sometimes forget that, but the Rules Of Baseball don't, referring to the "batter-runner"). It is as if there are two men at the plate: a runner, just like a runner at any other base, and a batter who does what he does and then fades into thin air as the base runners (or runner--himself) do whatever is appropriate for what he as a batter did.

The mechanics of what such an "RBI factor" might comprise, and of how it is derived, are somewhat complicated. Evidently base hits are going to be very important, and extra-base hits especially so; but walks have some value, and even minor factors like wild pitches and balks are not utterly negligible.
In fact, the details of both the philosophy and practice of calculating an RBI factor of some sort are largely (but by no means wholly) what distinguish one school of analysis from another. High Boskage House has its own methods, which we will not detail here for a variety of reasons, most notably brevity. (The whole shebang appears on the graph that appears on the HBH TOP formula tested page.)

In many workers' formulations, the occurrence rate of Total Bases (the sum of all hits weighted by bases per hit--that is, for example, triples are 3 and singles are 1) is the only determinant in their RBI factor, whatever they call it (if they call it by a name at all). That can actually give a pretty fair result, and it has the virtue of simplicity. The first runs-scored formula Bill James widely published was indeed that simple:

(Hits + Walks) x (Total Bases) / (At-Bats + Walks)

Since Hits + Walks is, roughly anyway, the available base-runner total, and At-Bats + Walks is--also roughly--the plate-appearances total, manifestly James' "RBI Factor" in this formulation was indeed just the Total Bases rate (TB/PA, more or less).

Note that James did not use the on-base percentage, or any rough equivalent of it: he used what amounts to the actual number of base runners. That's OK for a quick, simple formulation which will serve to demonstrate how well analytic methods work, but it limits the utility of the formulation to evaluating what has happened; you cannot use it to predict what likely will happen because to know how many men will reach base, you need to state your formula in terms of an on-base rate.

And that brings us to another important point. The on-base percentage and an RBI factor, when multiplied, give the chance that a given batter will become a run scored; but the actual number of runs scored also depends on how many men come to the plate so as to have that chance.
That number, actual total team plate appearances, varies significantly from team to team and year to year; but it does not do so without cause. Remember outs as the ticking clock of an inning: the less likely a team is to make an out at the plate, the more men they will get to the plate over the long haul. That can be stated quite precisely in a mathematical formulation, but its essence for the purposes of understanding is this: a team's on-base percentage has a form of compound-interest effect on run scoring. First, it directly increases the chance that any one batter will ultimately become a run scored; and second, it increases the number of men who will get to have that chance.

It is for these reasons that the single most important baseball statistic viewed in isolation is, far and away, the on-base percentage; actual run scoring tracks on-base percentage more closely than any other single statistic (as we now understand that it should).

And one more time: if you have any doubts that the HBH run-scoring equation works, and works very well, look over the actual results again.

Rating Players

Now consider this: what we can calculate for a team from its statistics, we can also calculate for any one batter from his personal statistics. If we then set the number of available outs to what it is for a full team for a full season, we get a number that sums up that man's ability to contribute to his team's scoring of runs in one number; we can think of it as the runs that would be scored in a season by a team made up entirely of exact clones of that man. High Boskage House calculates just such a measure, which we call the Total Offensive Productivity, or just TOP. It is shown for all batters listed anywhere in these pages; in the by-team batting lists, the batters are arranged in order of descending TOP.
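The ideas in the last few paragraphs can be strung together in a short Python sketch. It is emphatically not the HBH TOP formula, which this page does not give; it swaps in the simple James formula quoted earlier. Several things in it are assumptions made for illustration only: the names `BattingLine`, `clone_team_runs` and the rest, the rough on-base rate (Hits + Walks) / (At-Bats + Walks), the treatment of outs as At-Bats minus Hits, and the default of 162 x 27 season outs (the real-world figure is slightly lower).

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Hypothetical container for the four counts the simple formula needs."""
    at_bats: int
    hits: int
    walks: int
    total_bases: int


def runs_created_basic(line: BattingLine) -> float:
    """The simple James formula quoted in the text: (H + BB) x TB / (AB + BB)."""
    return (line.hits + line.walks) * line.total_bases / (line.at_bats + line.walks)


def on_base_rate(line: BattingLine) -> float:
    """Rough on-base rate (assumption: ignores errors, hit batsmen, and so on)."""
    return (line.hits + line.walks) / (line.at_bats + line.walks)


def total_base_rate(line: BattingLine) -> float:
    """Total bases per (rough) plate appearance, standing in for an RBI factor."""
    return line.total_bases / (line.at_bats + line.walks)


def expected_plate_appearances(outs: float, on_base: float) -> float:
    """Plate appearances needed to use up `outs` when each one makes an out with chance 1 - on_base."""
    return outs / (1.0 - on_base)


def clone_team_runs(line: BattingLine, season_outs: float = 162 * 27) -> float:
    """Runs for a team of clones: chance of scoring per batter times batters sent up."""
    obr = on_base_rate(line)
    chance_of_scoring = obr * total_base_rate(line)
    return chance_of_scoring * expected_plate_appearances(season_outs, obr)


if __name__ == "__main__":
    # Made-up team-sized totals, purely for illustration.
    team = BattingLine(at_bats=5500, hits=1450, walks=550, total_bases=2300)
    print(f"simple formula: {runs_created_basic(team):.1f} runs")
    print(f"clone team over a full season of outs: {clone_team_runs(team):.1f} runs")
```

The on-base rate appears twice in `clone_team_runs`: once in the chance that a batter scores, and once in the number of plate appearances, which is the compound-interest effect described above. When `season_outs` is set to the line's own At-Bats minus Hits, the result collapses back to the simple James formula.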
(There is a now-popular metric called RC27, "Runs Created per 27 outs", but you cannot convert a TOP to the equivalent of an RC27 value just by dividing by the 162 games in a season, the reason being that there are just slightly fewer than 27 outs in the average real-world baseball game, so that the TOP--which is oriented to real-world results--divided by 162 will be slightly lower than a corresponding RC27 value, even if both "agreed" in their evaluation.) Moreover, what one can calculate for a batter, one can correspondingly calculate for a pitcher, using the numbers that he gives up to batters. You will find on this site just such calculations, which yield a novel and very, very important measure that we call the "Quality of Pitching" stat (there is also a closely related stat that we call the TPP, for Total Pitching Productivity, because the term pairs nicely with the TOP). There is a separate page on this site that discusses those measures further, but you would be best off to finish this page before jumping there. Two other and somewhat related points need mention. (Actually, they need extensive discussion, and we hope in future, as we gradually expand these notes, to give them that discussion.) One is the predictability of individual men's performances. As we said earlier, there is a sort of law of diminishing returns for the meaning of increasing data; by the time we have roughly the equivalent of two seasons' full-time play for a batter, we have enough data to have defined his norms of performance pretty well. Pitchers, for complex reasons, take more time to evaluate, although using the TPP instead of the ERA gives results in time periods comparable to those needed for batters. Moreover, by the time a batter reaches double-A ball, he has become pretty much what he will be; if we have two seasons' worth of data above class A ball, we have the man defined. 
It is precisely that predictability that makes it possible to "engineer" a baseball team in a manner quite comparable to the process of engineering an automobile engine. By knowing the data for the components and the equations for how those components interact, we can design an engine or a team to meet a specified set of performance criteria.

It is crucially important to understand what we are saying here: we are not saying that we can predict accurately how every man will do in a given season from how he has done in the past; that is, as common sense suggests, impossible. But, just as we certainly cannot predict how a pair of dice will come up in any given throw or small number of throws--which is why people gamble--we can equally certainly predict with great accuracy how much money a craps table will likely take in on one shift because we know the tendencies of the dice well. So with a ball club: if we know the tendencies of the batters and pitchers--what they have done in the past--we can predict with good accuracy how the cumulative results of 25 men over a full season will come out. We know some will be surprisingly low and some surprisingly high, but--most of the time--the net will be on target. (A full season for a ball club is enough for acceptable precision, but there is always room for the occasional burst of especially good or bad luck seen by most fans--as well as professionals who should know better--as either "clutch performance" or "choking." How come no sane craps player ever refers to "clutch" dice or "choking" dice when they win or lose?)

The second point, related to what we just said, is that minor-league statistics--long thought by most baseball professionals and fans to be nearly meaningless--can be translated so that we see what the man would have achieved playing at that same level of ability in a major-league ballpark against major-league competition.
(That realization, and the mechanics to implement it, together make up one of Bill James' most valuable contributions--probably his most valuable--to the art.)

Finally, we repeat that all statistics, to be useful, must be comparable. There is a separate page on this site that discusses "normalization" processes for stats and why they have become a statistical nightmare (we no longer apply them, preferring honestly raw results to results "adjusted" by a dubiously derived factor).
Comments? Criticisms? Questions? Please, e-mail me by clicking here. (Or, if you cannot email from your browser, send mail to webmaster@highboskage.com)