About this report
As I mentioned previously, 1994 has a weird and blurry place in my memory. I was young, obsessed with baseball, software and other teenaged things. When I think back to the season before the strike, it all seemed bigger - like offensive production hit an absolutely massive peak with some massive records (like Roger Maris' 61 home run season) seeming poised to fall and Tony Gwynn making a serious run for .400.
Weirdly, I do not remember any pitching performances whatsoever; in my memory, 1994 is the year of the bat. While digging into Retrosheet data, I started to come up with an idea. So I wrote some code, analyzed the data and started trying to answer a simple question.
Did MLB's offensive environment shift in 1994 in a way that exceeded the normal seasonal and park variation visible in nearby years?
Well, not so simple. This report is a first-pass answer. It uses Retrosheet team-game batting totals and game context from 1989 through 1998. Every season is cut off after August 11, matching the final day of the 1994 regular season. That gives 1994 a fair comparison window instead of comparing a 113-game player season or a strike-shortened league total to a full 162-game season. The cutoff creates some weird data and I am not convinced the methodology is right, so we'll file this one away as an exploratory pass. It does show some league-wide trends that have been reported before, so the methodology is at least close.
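The August 11 cutoff can be sketched in a few lines. This is a minimal illustration rather than the actual pipeline: the `within_window` helper, the `game_id` and `date` keys, and the sample games are all made up for the example.

```python
from datetime import date

# August 11 was the last day of play in 1994; every season is cut there.
CUTOFF = (8, 11)

def within_window(game_date, first_season=1989, last_season=1998):
    """True when a game falls in a covered season and on or before August 11."""
    if not first_season <= game_date.year <= last_season:
        return False
    return (game_date.month, game_date.day) <= CUTOFF

# Made-up games, only to show which ones survive the filter.
games = [
    {"game_id": "A", "date": date(1994, 8, 11)},  # last day before the strike: kept
    {"game_id": "B", "date": date(1996, 8, 12)},  # one day past the cutoff: dropped
    {"game_id": "C", "date": date(1993, 4, 5)},   # early season: kept
    {"game_id": "D", "date": date(1999, 5, 1)},   # outside 1989-1998: dropped
]

kept = [g["game_id"] for g in games if within_window(g["date"])]
print(kept)
```

Comparing `(month, day)` tuples keeps the rule identical in every season without building a separate cutoff date per year.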
What the data says
1994 was an unusual offensive season within the ten-season window that I analyzed.
Through August 11, the 1994 season ranked second out of ten seasons in runs per team-game, home runs per team-game, batting average, slugging, OPS, ISO, and home runs per plate appearance. It ranked third in BABIP. Compared with the same point in 1993:
- runs per team-game increased by 6.6%;
- home runs per team-game increased by 16.3%;
- ISO increased by .0158;
- home runs per plate appearance increased by 14.6%.
That is strong evidence that the 1994 offensive environment was not a memory trick constructed from Gwynn, Bagwell, Thomas, Williams, a very good Yankees outfield and those little distortions caused by 32 years of time.
But it is also not any kind of indication that something special happened leading up to the strike. Instead, it's just part of a wider and more interesting trend that culminates with a lot of the big years in the 'steroid era'.
The offensive rise had already begun in 1993. That year had the largest year-over-year increase in runs and home-run rate in this window. Then 1994 stayed high, and 1996 exceeded 1994 across most of the same measures. The cleanest summary is not "something suddenly changed in 1994." It is:
MLB entered a high-offense stretch beginning in 1993. 1994 was a major part of it, but not an isolated outlier.
And so while there were some amazing individual seasons cut off in 1994, it was just part of a broader trend that continued with things like Brady Anderson's 50 home runs in 1996 and the Sammy Sosa/Mark McGwire race to chase down Roger Maris' record.
The Coors question
The 1994 Colorado Rockies played their second and final season at Mile High Stadium before moving to Coors Field in 1995. A park at that altitude is a very large potential confounder, especially for a report that asks questions about league-wide offense. I wrote about it previously using a different way to analyze the contributions from a stadium.
So the report runs a test where I leave one park out. For each 1994 park, it removes every game played there and recalculates the remaining MLB rate.
Mile High clearly raised scoring. Its 57 games produced 5.90 runs per team-game, well above the league rate. But removing those games changes the MLB home run rate from 1.0331 to 1.0324 home runs per team-game. That is a decline of about 0.07%.
In other words, Mile High mattered to the 1994 scoring context, but it does not explain the league-wide home run signal. The Denver park effect is real whether you analyze individual performance or overall performance, but it's not the only thing going on here.
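The leave-one-park-out idea can be sketched as below. The function names, the `park` and `hr` keys, and the sample rows are assumptions for illustration and are not taken from the real data. Only the last lines use numbers from this report, to check the 0.07% figure.

```python
def hr_per_team_game(rows):
    """Home runs per team-game from summed totals (one row = one team in one game)."""
    rows = list(rows)
    if not rows:
        raise ValueError("no team-game rows left to compute a rate")
    return sum(r["hr"] for r in rows) / len(rows)

def leave_one_park_out(rows):
    """Map each park to the league rate recomputed without that park's games."""
    parks = sorted({r["park"] for r in rows})
    return {p: hr_per_team_game(r for r in rows if r["park"] != p) for p in parks}

# Made-up team-game rows: one hitter-friendly park and two ordinary ones.
sample = [
    {"park": "HIGH", "hr": 2}, {"park": "HIGH", "hr": 3},
    {"park": "LOW1", "hr": 1}, {"park": "LOW1", "hr": 0},
    {"park": "LOW2", "hr": 1}, {"park": "LOW2", "hr": 1},
]
without = leave_one_park_out(sample)
print(without)

# Checking the reported change with the two rates quoted above.
with_mile_high, without_mile_high = 1.0331, 1.0324
pct_change = (without_mile_high - with_mile_high) / with_mile_high * 100
print(round(pct_change, 2))
```

In the toy data, dropping the one extreme park moves the rate a lot because it holds a third of the games. In the real 1994 window, Mile High is a much smaller share of all games, which is why the league rate barely moves.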
What this report cannot tell us
This is descriptive baseline analysis, not a causal model.
It cannot determine whether:
- performance enhancing drugs were more or less widespread in 1994;
- baseball construction changed (the balls got juiced);
- any change was intentional;
- ownership, players, umpires, or anyone else made a decision;
- the rise came from player talent, training, pitching quality, expansion, park changes, weather, rules, scoring conventions, or a combination of several factors;
- 1994 was unique relative to a wider historical sample.
The weather fields in the Retrosheet game context are useful for future work, but their coverage is inconsistent in early seasons. And so this report describes the fields but does not try to build any kind of weather-adjusted model to find a fit.
1995 also deserves caution. Its rate statistics are useful, but the season opened late because of the strike, so the number of games played is notably lower. It shows rates of production, but over a season that looked very different.
The connection to the strike attendance report
The MLB Attendance and the 1994 Strike report looks at the same historical moment from a completely different direction.
That report shows that raw 1994 attendance is a bad measure because the season stopped. Attendance per opening actually remained strong in 1994, while the more obvious attendance wound appears in 1995 when baseball returned.
This report adds a small piece of on-field context. Fans were treated to a very exciting game in 1994. The game was producing unusually high offense before the strike stopped it. That is interesting historical context, but it is not proof that offense drove attendance or that the strike damaged the sport more because it interrupted an exciting season. The two reports should sit beside each other because they look at the same era through different questions.
In the future, I think I should write another report that looks at attendance versus offensive performance.
Methodology
The source data comes from Retrosheet:
- simplified teamstats.csv for value-only team-game batting totals;
- main gameinfo.csv for game date, park, home/away context, and available environmental fields.
The analysis includes regular-season games from 1989 through 1998, through August 11 in every season.
For the first version, the main MLB rates are:
- runs per team-game;
- home runs per team-game;
- batting average;
- on-base percentage;
- slugging percentage;
- OPS;
- ISO;
- home runs per plate appearance;
- BABIP.
The pipeline aggregates the raw totals first and then calculates each rate from the aggregated denominator. It does not average game-level OPS, batting average, or home-run rates. Rolling values are calendar-day windows calculated from rolling totals.
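Here is a small sketch of "aggregate first, then divide". The field names are hypothetical and may not match the real teamstats.csv headers. The formulas are the common textbook definitions, which may differ in detail from what the pipeline uses, and the trailing window is one possible reading of "calendar-day windows". All numbers are made up.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical field names; the real teamstats.csv headers may differ.
FIELDS = ("ab", "h", "d", "t", "hr", "bb", "hbp", "sf", "so", "r")

def aggregate(rows):
    """Sum raw counting stats over team-game rows before computing any rate."""
    totals = Counter()
    for row in rows:
        for field in FIELDS:
            totals[field] += row[field]
    totals["team_games"] = len(rows)
    return totals

def rates(t):
    """Rates from pooled totals, using common definitions (an assumption)."""
    pa = t["ab"] + t["bb"] + t["hbp"] + t["sf"]  # simplified: ignores sacrifice bunts
    total_bases = t["h"] + t["d"] + 2 * t["t"] + 3 * t["hr"]
    avg = t["h"] / t["ab"]
    obp = (t["h"] + t["bb"] + t["hbp"]) / pa
    slg = total_bases / t["ab"]
    return {
        "runs_per_team_game": t["r"] / t["team_games"],
        "hr_per_team_game": t["hr"] / t["team_games"],
        "avg": avg, "obp": obp, "slg": slg, "ops": obp + slg, "iso": slg - avg,
        "hr_per_pa": t["hr"] / pa,
        "babip": (t["h"] - t["hr"]) / (t["ab"] - t["so"] - t["hr"] + t["sf"]),
    }

def rolling_runs_per_team_game(daily, window_days=7):
    """daily maps date -> (runs, team_games); rate over a trailing calendar window."""
    out = {}
    for day in sorted(daily):
        start = day - timedelta(days=window_days - 1)
        in_window = [v for d, v in daily.items() if start <= d <= day]
        out[day] = sum(r for r, _ in in_window) / sum(g for _, g in in_window)
    return out

# Two made-up team-games with very different at-bat counts.
sample = [
    {"ab": 35, "h": 10, "d": 2, "t": 0, "hr": 1, "bb": 4, "hbp": 0, "sf": 1, "so": 6, "r": 5},
    {"ab": 28, "h": 4, "d": 0, "t": 0, "hr": 0, "bb": 2, "hbp": 0, "sf": 0, "so": 9, "r": 1},
]
pooled = rates(aggregate(sample))
naive_avg = sum(r["h"] / r["ab"] for r in sample) / len(sample)
print(pooled["avg"], naive_avg)  # the two numbers differ

daily = {date(1994, 4, 4): (10, 2), date(1994, 4, 5): (6, 2), date(1994, 4, 12): (8, 2)}
rolling = rolling_runs_per_team_game(daily)
print(rolling)
```

The pooled batting average weights each game by its at-bats, while the naive mean of game-level averages treats a 28 at-bat game and a 35 at-bat game as equal. That gap is the reason to divide only after summing.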
For the American League and National League output, the scope follows the home team's league. That means both batting lines in an interleague game belong to the host environment once interleague play began in 1997. MLB rows contain every included game.
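The scoping rule can be sketched like this. The `rows_by_scope` helper, the key names, and the game ids and team labels are invented for the example.

```python
from collections import defaultdict

def rows_by_scope(team_game_rows, home_league):
    """Group batting lines by the home team's league, plus an all-games MLB scope."""
    scopes = defaultdict(list)
    for row in team_game_rows:
        league = home_league[row["game_id"]]  # host league, even for the visiting team
        scopes[league].append(row)
        scopes["MLB"].append(row)
    return scopes

# Made-up games: G1 is an interleague game hosted by an AL club, G2 is an NL game.
home_league = {"G1": "AL", "G2": "NL"}
team_game_rows = [
    {"game_id": "G1", "team": "AL_HOME", "hr": 1},
    {"game_id": "G1", "team": "NL_VISITOR", "hr": 2},
    {"game_id": "G2", "team": "NL_HOME", "hr": 0},
    {"game_id": "G2", "team": "NL_VISITOR_2", "hr": 1},
]

scopes = rows_by_scope(team_game_rows, home_league)
print({name: len(rows) for name, rows in scopes.items()})
```

Under this rule the NL visitor's batting line in G1 counts toward the AL scope, because the game was played in an AL environment.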
My data quality check didn't find any missing rows or any rows malformed by things like team-name mismatches. In other words, I did fine, but it would be better if someone else checked my work.
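For anyone who wants to check the work, a quality pass could look roughly like the sketch below. These particular checks, the field names, and the sample rows are my illustration of the idea and are not the checks the pipeline actually runs.

```python
from collections import Counter

REQUIRED = ("game_id", "team", "ab", "h", "hr")  # hypothetical field names

def quality_issues(team_game_rows, game_ids, known_teams):
    """Return (location, message) pairs for rows that look missing or malformed."""
    issues = []
    lines_per_game = Counter()
    for index, row in enumerate(team_game_rows):
        missing = [f for f in REQUIRED if row.get(f) in (None, "")]
        if missing:
            issues.append((index, "missing fields: " + ", ".join(missing)))
            continue
        if row["team"] not in known_teams:
            issues.append((index, "unknown team code: " + row["team"]))
        if row["game_id"] not in game_ids:
            issues.append((index, "no game context for: " + row["game_id"]))
        if row["h"] > row["ab"] or row["hr"] > row["h"]:
            issues.append((index, "impossible batting totals"))
        lines_per_game[row["game_id"]] += 1
    for game_id in sorted(game_ids):
        if lines_per_game[game_id] != 2:  # every game should have two batting lines
            issues.append((game_id, "expected 2 team rows, found %d" % lines_per_game[game_id]))
    return issues

# Made-up rows: one clean game, and one game with a bad team code and a missing line.
rows = [
    {"game_id": "G1", "team": "AAA", "ab": 34, "h": 9, "hr": 1},
    {"game_id": "G1", "team": "BBB", "ab": 31, "h": 6, "hr": 0},
    {"game_id": "G2", "team": "???", "ab": 33, "h": 8, "hr": 2},
]
issues = quality_issues(rows, game_ids={"G1", "G2"}, known_teams={"AAA", "BBB"})
print(issues)
```

An empty list from a check like this says the rows are internally consistent. It does not say the totals match the official record.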
Accessibility notes
This report is designed around text and tables first.
The line chart is there to make the season pattern easier to scan, but the season table contains the same numbers. Both tables use captions, column headers, and regular buttons for sorting. The chart metric selector is a normal form control, and the result counts announce updates through a live region.
This page was scanned for accessibility with Siteimp and tested with NVDA.
I am still learning accessibility, so I make mistakes. You would maybe think that after 16 years I could have stopped learning by now, but thus far it's been a constant learning process. But... any day now!!?? :) If something does not work with your screen reader, keyboard, zoom settings, or browser setup, please contact me so I can fix it.
What did we learn?
The 1994 season really was a high-offense season. That is visible in the league-wide data, not just in the famous names that survived in my teenaged brain.
It was high enough to deserve investigation. It was not clean enough to make any deeper claims about a juiced baseball or even any steroid use, though the numbers sure line up with the steroid era.
The strongest conclusion is that 1994 is part of an offensive surge from 1993 through 1996. The same surge led to the famed Sammy Sosa/Mark McGwire race to beat Roger Maris' famous home run record. The next useful version of this report would widen the historical window, add an expansion-era sensitivity pass that removes Rockies and Marlins games, and decide whether the available weather coverage can support a properly limited model.
For now, I am happy with this small first pass and am especially happy that I have at least three more reports to write! Baseball rules.