A Graphic Demonstration of the TOP-Formula Accuracy

(One of several "White Papers" on baseball analysis to be found on this site--see the Site Directory below.)

(The data here only run through and including the 2001 season because that's when we made this page and its graph,
and it's a nuisance to re-make the graph; but almost half a century of beautiful matches ought to suffice.)

Summary of Results

Full results appear farther down this page, but first this summary:

The graph below shows the results of applying the full High Boskage House formula for runs scored (derived from team Plate Appearances, Hits, Total Bases, Walks, Steals, Caught Stealing, and Sacrifice Bunts) to almost a half-century, 48 full years, of actual major-league data: all teams in all years from 1954 through 2001 inclusive (there were scoring-rule changes in earlier years that make certain raw data suspect).

Shown below are actual team runs scored, predicted team runs scored (rounded to the nearest whole number), and the difference between prediction and actuality as a percentage. As expected, the average size of the run-projection errors is well under 3% (about 2½%). For those who know statistical theory, the Standard Deviation of the errors is 22.31 runs (without any regression "curve-fitting" cooking of the results).

The overall results are these:

For 1138 Team-Seasons Evaluated:
    "Expected" Average Error Size:    19.92 runs/team/season    2.84%
    Actual Average Error Size:        17.58 runs/team/season    2.51%
    Cumulative Error:                 -0.02 run/team/season     circa 0%

The "Error Sizes" disregard whether the error is high or low--they measure its size. The "Cumulative Error" allows plus and minus errors to cancel; as we should expect, it is virtually zero, far less than a run a team a season.
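
For readers who prefer code to words, here is a minimal sketch of the difference between those two measures. The function name, the sample numbers, and the sign convention (projected minus actual) are assumptions made for illustration; they are not taken from the tabulated data or from our actual programs.

```python
def error_summary(projected, actual):
    """Return (average error size, cumulative error), in runs per team-season.

    Assumed sign convention: error = projected runs - actual runs.
    """
    if len(projected) != len(actual) or not projected:
        raise ValueError("need two non-empty lists of equal length")
    diffs = [p - a for p, a in zip(projected, actual)]
    # "Error size" ignores direction: average the absolute differences.
    avg_size = sum(abs(d) for d in diffs) / len(diffs)
    # "Cumulative error" lets high and low misses cancel: average the signed differences.
    cumulative = sum(diffs) / len(diffs)
    return avg_size, cumulative


# Made-up team-seasons, for illustration only.
projected = [700, 650, 810, 725]
actual = [715, 640, 800, 740]
avg_size, cumulative = error_summary(projected, actual)
print(avg_size, cumulative)
```

With these made-up numbers the individual misses are 10 or 15 runs each, yet the signed average is much smaller because highs and lows partly cancel, which is the same effect seen (far more strongly) over 1138 real team-seasons.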

(The statistical formulae used to calculate those data are shown at the bottom of this page.)

Full Results

And now the year-by-year, team-by-team numbers. (Note: "Error" rates from years of fewer than 162 games--such as 1981--will, of course, be a bit larger than typical.) Since "a picture is worth a thousand words", here is a graph of the results: the red line is exact accuracy, and, as you can see, the results are a truly beautiful approximation to that red line.

Projected-vs.-Actual Runs Graph

Notice especially that accuracy remains excellent at the extremes: down at 329 annual runs (the lowest) and also up at 993 annual runs (the second highest), there are points bang on the line. That indicates a truly general and accurate predictive method, not just one that is OK where the data bunch up.

For those who might want to see those data in tabular form, we have provided them on a separate page (separate owing to its length--it may take some time to fully download). If you want to check them, though, here are the tabulated data.

[You can return to the Baseball-Analysis Theory discussion page or look over the calculation methods below.]


The Calculations

The methodology of the TOP (projected-runs) calculation is explained elsewhere on this site.

As to the statistical measures:

The expected error is calculated using the standard statistical probability formulae. The expected average error is 79.79% of one Standard Deviation (for a normal distribution, the average absolute deviation from the mean is the square root of 2/pi, about 0.7979, times the Standard Deviation). The Standard Deviation for any one team-season is, in turn, the square root of npq, where n is the number of data samples, p is the probability of a success, and q is the probability of a failure (by definition, then, q=1-p). For this tabulation, a "success" is a run scored and a "data sample" is a batter at the plate. Thus, the probability of a success is the calculated expected runs (TOP) divided by the team's total of batter plate appearances (PA); we use the calculated, not the actual, runs because we want to know the S.D. for the projection, although in practice it would matter little which was used. So, the expected average error per team-season is just:

err = 0.7979 x SquareRoot(PA x (TOP/PA) x (1 - TOP/PA))

We then average the individual expected errors for an overall average expected error figure (the individual expected-error figures per team-season will be rather similar because neither PAs nor TOPs vary all that much from one to another).
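
The calculation just described can be sketched in a few lines of code. This is only an illustration under stated assumptions: the function names and the three sample team-seasons are hypothetical, and the constant is computed as the square root of 2/pi (about 0.7979) rather than typed in.

```python
import math

# Average absolute deviation of a normal distribution, in units of its
# Standard Deviation: sqrt(2/pi), about 0.7979.
MAD_FACTOR = math.sqrt(2 / math.pi)


def expected_error(pa, top):
    """Expected average error, in runs, for one team-season.

    pa  -- team plate appearances (the number of "data samples", n)
    top -- calculated (projected) runs for that team-season
    """
    p = top / pa   # probability that a plate appearance yields a run
    q = 1 - p      # probability that it does not
    return MAD_FACTOR * math.sqrt(pa * p * q)


def average_expected_error(team_seasons):
    """Average the individual expected errors over (pa, top) pairs."""
    errors = [expected_error(pa, top) for pa, top in team_seasons]
    return sum(errors) / len(errors)


# Hypothetical (PA, TOP) pairs, for illustration only.
team_seasons = [(6200, 750), (6100, 680), (6300, 820)]
one = expected_error(6200, 750)
overall = average_expected_error(team_seasons)
print(round(one, 2), round(overall, 2))
```

Because the result grows only with the square root of the run total, a shortened season gives a smaller expected error in runs but a larger one as a percentage of runs scored.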








This page was last modified on Thursday, 3 April 2008, at 21:50 EDT.

Measures calculated by High Boskage House Baseball Operations, using proprietary techniques.

All data soon will be (but are not yet) normalized for park effects and seasonal variations.


Site Directory:


Baseball "White Papers"--meanings and explanations of the things on this site
 
General Background:
    For You Rookies: what this site is all about--what it is telling you about baseball, and how, and why
    Some Baseball Analysis Theory: a semi-technical backgrounding on modern baseball analysis
    Baseball Stat Definitions: the standard and the unique statistics we present here, defined
    Baseball Data Normalization: how we correct for what, and why we need to
    The "Quality of Pitching" Measures: why they are the best way to evaluate pitching performance
    "Steroids": why just about everything you think you know about them is wrong
Now a site of its own! steroids-and-baseball.com (the link above gets you there)
    "The SillyBall": why baseball before and after 1993 is really two different games
 
About Particular Pages Here:
    The Team-Performance Table: there is a lot in that Table, and this explains what it all is
    The Team-Defense Table: how important defense is or isn't in baseball, and how to correctly evaluate it



All content copyright 2000 - 2005, The Owlcroft Company


