Cal vs. Ranked Opponents in the Tedford Era

My fanpost on Cal vs. the Pac-10 in the Tedford Era led me to one major conclusion: It's hard to figure out anything from those numbers because the sample sizes are way too small, and too easily skewed by individual games like (for example) last year's Washington and Washington State games, or by factors that don't show up in the numbers like special teams in the UCLA games (thanks norcalnick).

I thought that there was room for further exploration of what variables correlated the most closely to victory, and there were four potential areas that came to mind:

1. Use a sample consisting of all of our games in the Tedford Era. The advantage here would be that the sample would be bigger. The disadvantage is that the sample would include a lot of games against extremely strong teams (like USC) and extremely weak teams (again, like Washington State), and so would have a higher deviation (I don't know how to express this in statistical terms - whatever we come up with would be an average that may not provide much predictive value for the more important games that we want to look at).

2. Use a sample consisting of all games against ranked opponents in the Tedford Era. This would be a smaller sample, but theoretically at least one with relatively little deviation, and one that would have some predictive value as far as our tougher games go.

3. Look at specific intervals within the range of data for each variable. While not statistically significant, this could provide examples of how Cal's performance may not match its statistics.

4. Look at specific scenarios. Even smaller samples, but interesting.

I decided to look at correlation coefficients using the first sample to get a "big picture" kind of view, and then use the second sample to explore in more detail the implications of our past record against USC and other ranked teams.

[Before I go any further: big shoutout to Royrules once again for sharing his data with me, without which this post would not be possible.

Also, note that I have no real knowledge of statistics, this is just me messing around with numbers and trying to figure out what they mean. If anyone has suggestions on how to better examine this data, that would be awesome.]


1. Correlation Coefficients (all Tedford-era games) (N=89)


I used Excel to calculate the correlation between each of the following variables and the outcome of each game (defined as points margin - the most objective measure I could think of):

(Note: A coefficient close to 1 means a strong positive correlation, close to -1 means a strong inverse correlation, and close to 0 means little or no correlation)

Rush Off 0.62
Pass Off -0.10
Rush Def -0.41
Pass Def -0.15
Total Off 0.43
Total Def -0.22
Net TO 0.55
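
For anyone who wants to reproduce this kind of number outside Excel, here is a minimal sketch in Python. It assumes Excel's correlation function is computing the standard Pearson correlation coefficient; the function name `pearson_r` and the sample rows are made up for illustration and are not taken from Royrules' data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the mean (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a column is constant")
    return cov / math.sqrt(var_x * var_y)

# Made-up example rows (NOT real Cal data): rushing yards and points margin.
rush_yards = [95, 140, 210, 180, 60]
points_margin = [-14, 3, 21, 7, -24]
print(round(pearson_r(rush_yards, points_margin), 2))
```

Each coefficient in the list above would come from one call like this: one column of a stat, one column of points margin.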

 

A couple of thoughts here:

  • The ground game, both ways, clearly correlates more closely to points margin than the passing game
  • Offense correlates more closely than defense. Could this be because our defense is fairly consistent, and victory or defeat depends more on how the offense does?
  • Passing offense has a negative correlation to points margin. Is this because we abandon the run and start airing it out when we're behind, like the 08 Maryland game where Riley ended up with 423 passing yards?

Of course, these numbers don't tell us much out of context. You can argue that a 7 point win over USC is more impressive than a 21 point win over Pradesh A&M, which is why I decided to look at the second set of data: ranked opponents only.

2. Correlation Coefficients (Tedford-era games against ranked opponents) (N=25)

 

Rush Off 0.39
Pass Off 0.06
Rush Def -0.26
Pass Def -0.05
Total Off 0.37
Total Def -0.28
Net TO 0.70
3D% 0.36
Opp 3D% -0.39

 

Observations:

  • The passing game's correlation to points margin dwindles to almost nothing
  • Turnovers are HUGE - almost twice the correlation coefficient of any other variable

Now you may be thinking: maybe these correlations are just flukes, given the small sample size. I was curious about that as well, so I decided to break down our record against ranked opponents based on individual variables into intervals. That will allow us to see what happened in all cases where variable = x (positive TO margin, more than 200 rushing yards, whatever).
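
One rough way to put a number on the fluke question is the standard t-test for a correlation coefficient, t = r * sqrt((N - 2) / (1 - r^2)). The sketch below plugs in a few of the N=25 coefficients from section 2. The function name is made up, the cutoff of roughly 2.07 is the usual two-sided 5% critical value for 23 degrees of freedom, and the test assumes roughly normal data, which may not hold for football stats.

```python
import math

def t_statistic(r, n):
    """t statistic for testing whether a Pearson r differs from zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))

N = 25               # games against ranked opponents
T_CRITICAL = 2.069   # two-sided 5% cutoff for N - 2 = 23 degrees of freedom

# Coefficients copied from the ranked-opponent list in section 2.
coefficients = {"Rush Off": 0.39, "Pass Off": 0.06, "Total Off": 0.37, "Net TO": 0.70}

for name, r in coefficients.items():
    t = t_statistic(r, N)
    verdict = "unlikely to be a fluke" if abs(t) > T_CRITICAL else "could be a fluke"
    print(f"{name}: r={r:+.2f}, t={t:+.2f} ({verdict})")
```

By this yardstick the turnover correlation (t of about 4.7) clears the bar comfortably, while rushing offense at 0.39 (t of about 2.03) falls just short of it.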

3. Cal's Record for Specific Intervals of Data (Overall record: 11-14)

These are mostly what you would expect, but a few are pretty interesting.

 

Turnover Margin:

When TO Margin > 0: 7-0
When TO Margin = 0: 3-6
When TO Margin < 0: 1-8

Observations: Res ipsa loquitur.
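
A breakdown like this is just a tally of wins and losses per bucket. Here is a minimal sketch in Python, assuming each game is stored as a (turnover margin, points margin) pair. The function names are hypothetical and the rows are invented for illustration; they are not the actual 25 games.

```python
from collections import defaultdict

def record_by_bucket(games, bucket_of):
    """Tally [wins, losses] for each bucket label returned by bucket_of(stat)."""
    records = defaultdict(lambda: [0, 0])
    for stat, points_margin in games:
        label = bucket_of(stat)
        if points_margin > 0:
            records[label][0] += 1
        else:
            records[label][1] += 1  # assumes no ties, so margin <= 0 is a loss
    return dict(records)

def to_margin_bucket(margin):
    """Map a turnover margin to one of the three intervals used above."""
    if margin > 0:
        return "TO Margin > 0"
    if margin == 0:
        return "TO Margin = 0"
    return "TO Margin < 0"

# Invented (turnover margin, points margin) rows, NOT real game data.
games = [(2, 10), (1, 3), (0, -7), (0, 14), (-1, -21), (-3, -10)]
for label, (wins, losses) in record_by_bucket(games, to_margin_bucket).items():
    print(f"When {label}: {wins}-{losses}")
```

Swapping in a different `bucket_of` function (rushing yards in 50-yard bands, say) would give the other breakdowns in this section.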

Third Down Percentage:

When 3D% <= 33%: 2-8
When 3D% > 33%: 6-4
When 3D% > 40%: 5-2

When Opp 3D% <= 33%: 4-5
When Opp 3D% > 33%: 4-7
When Opp 3D% > 40%: 2-5

When 3D% > Opp 3D%: 4-3
When 3D% <= Opp 3D%: 3-8

Observations: Nothing unexpected here.

Points:

When Cal scores < 21: 0-9
When Cal scores 21-30: 2-2
When Cal scores 31-40: 3-3
When Cal scores 41-50: 5-0
When Cal scores 50+: 1-0

When opponent scores < 21: 3-1
When opponent scores 21-30: 6-5
When opponent scores 31-40: 2-3
When opponent scores 41-50: 0-4
When opponent scores 50+: n/a

Observations: Nothing unexpected here.

Rushing Yards

When Cal has < 101: 0-8
When Cal has 101-150: 6-0
When Cal has 151-200: 1-3
When Cal has 201-250: 4-2
When Cal has 250+: 0-1

Observations: Looks like our losses come when we can't get the ground game going at all, which makes sense. When we get more than 100 yards, we're 11-6 (.647). And if that 1-3 looks weird, note that those 3 losses were all against USC.
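
The winning percentage for more than 100 rushing yards comes from adding up the four rows above. A quick check in Python, using only the records listed in this section (the helper name is made up):

```python
def combine(records):
    """Sum (wins, losses) pairs and return wins, losses, and winning percentage."""
    wins = sum(w for w, _ in records)
    losses = sum(l for _, l in records)
    return wins, losses, wins / (wins + losses)

# Rows for 101-150, 151-200, 201-250 and 250+ rushing yards.
wins, losses, pct = combine([(6, 0), (1, 3), (4, 2), (0, 1)])
print(f"{wins}-{losses}, {pct:.3f}")  # prints "11-6, 0.647"
```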

Passing Yards

When Cal has < 151: 0-4
When Cal has 151-200: 2-2
When Cal has 201-250: 5-3
When Cal has 251-300: 3-3
When Cal has 300+: 1-2

Observations: I interpret these numbers to mean that as long as we get some kind of passing game going, outcome has relatively less correlation to passing yardage than to rushing yardage.

Opponent Rushing Yards

When opponent has < 101: 5-2
When opponent has 101-150: 2-4
When opponent has 151-200: 2-5
When opponent has 201-250: 2-2
When opponent has 250+: 0-1

Observations: Seems like we have to shut down the run to win, which again makes sense. We are .333 when the opponent has more than 100 rushing yards.

Opponent Passing Yards

When opponent has < 151: 1-2
When opponent has 151-200: 2-1
When opponent has 201-250: 1-7
When opponent has 251-300: 4-2
When opponent has 300+: 3-2

Observations: Doesn't seem to have much correlation. Could the 7-4 when opponents are 250+ be because opponents throw more when behind? Could the 1-7 when opponents are 201-250 be because opponents don't need to throw as much when they have a strong running game going? Lots of possibilities here.

Total Offensive Yards

When Cal has < 301: 0-4
When Cal has 301-350: 1-3
When Cal has 351-400: 5-2
When Cal has 401-450: 2-2
When Cal has 451-500: 3-2
When Cal has 500+: 0-1

Observations: Not much of a pattern here past 350.

Opponent Total Offensive Yards

When opponent has < 301: 1-1
When opponent has 301-350: 1-1
When opponent has 351-400: 5-5
When opponent has 401-450: 2-2
When opponent has 451-500: 2-1
When opponent has 500+: 0-4

Observations: No pattern here either under 500.

4. Cal's Record in Specific Scenarios

Averages

Stat       Avg     Home    Away    Win     Loss    USC     Oregon  Non-USC
Pts        29.4    28.80   26.24   39.45   21.43   16.43   30.50   34.39
Opp Pts    27.9    22.30   27.94   21.55   32.93   25.14   22.75   29.00
Total      57.3    51.10   54.18   61.00   54.36   41.57   53.25   63.39
Net        1.44    6.50    -1.71   17.91   -11.50  -8.71   7.75    5.39
Rush Off   152     157.80  130.35  172.45  135.50  126.14  173.75  161.72
Pass Off   231     211.20  215.76  241.09  223.43  208.43  194.50  240.06
Rush Def   146     137.90  133.18  120.82  165.29  147.86  152.50  144.89
Pass Def   264     218.40  259.18  269.82  258.71  225.86  227.50  278.28
Total Off  383     369.00  346.12  413.55  358.93  334.57  368.25  401.78
Total Def  409     356.30  392.35  390.64  424.00  373.71  380.00  423.17
Net TO     0.16    0.30    0.06    1.91    -1.21   -1.57   0.50    0.83
3D%        0.35    0.33    0.30    0.42    0.30    0.39    0.36    0.32
Opp 3D%    0.37    0.31    0.35    0.33    0.40    0.38    0.32    0.36

 

Observations:

  • Win-loss doesn't really tell us anything. Obviously statistics in games that Cal lost are going to be a lot worse.
  • Cal is pretty strong on the road. Cal loses a net of about 50 yards and a TD by playing away, but that's less than I expected.
  • Even when Cal loses, we're not getting blown out yardage-wise. It's scoring and turnovers that are the problem.
  • Cal is surprisingly strong against USC (especially keeping in mind the caliber of some of those USC teams we played). Offense clearly suffers, as does turnover margin, but it seems like our defense holds up well against them.
  • It's interesting to see how the numbers change when you remove the USC games from the mix.
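
Each column in the averages table is the same calculation with a different filter on the games. Here is a minimal sketch in Python, assuming each game is a dictionary with made-up field names (`opp`, `home`, `pts`, `opp_pts`). The rows are invented for illustration and are not real results.

```python
def average_by(games, stat, keep):
    """Mean of one stat over the games for which keep(game) is true."""
    values = [g[stat] for g in games if keep(g)]
    if not values:
        return None  # no games match this scenario
    return sum(values) / len(values)

# Invented rows for illustration only, NOT real game data.
games = [
    {"opp": "USC", "home": True, "pts": 17, "opp_pts": 23},
    {"opp": "Oregon", "home": False, "pts": 31, "opp_pts": 24},
    {"opp": "USC", "home": False, "pts": 10, "opp_pts": 24},
    {"opp": "Other", "home": True, "pts": 45, "opp_pts": 31},
]

print("Home:", average_by(games, "pts", lambda g: g["home"]))
print("USC:", average_by(games, "pts", lambda g: g["opp"] == "USC"))
print("Non-USC:", average_by(games, "pts", lambda g: g["opp"] != "USC"))
```

The same helper would cover the Win and Loss columns with a filter on `pts > opp_pts`.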

Overall Conclusions

  • The conventional wisdom about games being won or lost on the ground is true. Cal's rushing offense and defense correlate more closely to the final points margin than do passing offense or defense. It seems that a certain baseline amount of passing yardage is necessary (makes sense, an offense has to be somewhat balanced) but above that baseline rushing yardage is more closely correlated to success than passing yardage. Of course, this could be explained either as Cal being more successful when we are able to run the ball or as Cal running the ball more when we are already winning.
  • Yardage numbers and 3rd down % don't always correlate to the final score. The most obvious explanations that come to mind for this: either team being unable to score in the redzone, special teams plays, and turnovers.
  • Cal can win on the road. Cal can beat USC, too. The fact that our average statistics against them over the last 7 years are so close suggests as much. It's just a question of the other variables (again, things like special teams and turnovers) falling into place.
  • Let's repeat that, because it seems like it might be the single biggest factor at work: turnovers. Turnovers turnovers turnovers. Here's another factoid: in our seven games against USC, Cal's final turnover margins were 0, -1, -3, -5, -2, 0, and 0.

Obviously none of these conclusions are particularly groundbreaking (most of them are pretty damn obvious), but it's always nice to see things backed up by numbers and charts and stuff. </nerd> Again, if anyone has ideas for how we can more closely examine any of these things, that would be awesome.

The opinions expressed in a FanPost are, in every way, reflective of the opinions of every California Golden Blogs Marshawnthusiast. Moreover, they are reflective of every employee of SBNation, including Tyler "Blez" Bleszinski.


Comments


Rec’d for awesomeness, flagged for misspelling the word “era”

ALL HAIL SUPREME LEADER AVINASH!

www.CaliforniaGoldenBlogs.com

by TwistNHook on Sep 10, 2009 1:13 PM PDT

Wow…I’m a moron. This is what happens when you write fanposts at 1am.

dboneisloose

by HolmoePhobe on Sep 10, 2009 1:40 PM PDT

Great post, lots of good data in there

by Kai on Sep 10, 2009 1:39 PM PDT

Good insights, couple of thoughts

These stats could support the idea that the run game is far more critical to any success Cal has than passing, which could take some of the onus off the poor quarterback play the past few seasons.

*USC and Maryland we couldn’t run the ball at all last season
*UCLA, Furd, ASU in 07 our performances were similarly underwhelming
*All of our losses in 06 had a lot to do with inability to run the football.

However I find it hard to believe that QB play plays no part in how well our passing offense produces points. I’d have to say that passing yardage might not be the stat you want to be looking at (since yardage is not the crucial part of the passing game the way running is), but maybe passing yards per attempt per game, or completion percentage.

Contact if you want to chat: bearsnecessities@gmail.com

by Avinash Kunnath on Sep 10, 2009 2:15 PM PDT

That’s a good point.

It would be really interesting to look at rushing yards per carry, passing yards per attempt, run-pass ratio, etc. but unfortunately Royrules’ data does not include those categories, and I don’t have time right now to input them for 80-something games. Maybe as a longer-term project.

dboneisloose

by HolmoePhobe on Sep 10, 2009 3:35 PM PDT

To add on to that

Situational passing may matter as well.

The relative value of completions in the first half might be greater than in the second half due to its effect on the defense with respect to the running game.

The hypothesis would run something like this: If the passing game flounders in the first half, the defense is able to key in on the running game. The running game being less effective, the team may fall behind and the offense may have to abandon the run game in order to keep itself in the game. The effect is predictability in play calling in both halves. Ineffective passing in the first half allows the defense to concentrate on the run. Ineffective run game plus point differential plus time constraints in the second half permits the defense to employ more effective pass defenses.

by Nashville on Sep 11, 2009 8:33 AM PDT

another item to look at might be scoring by quarter

and off/def production by quarter to get an idea of how we win and when it matters to play our best best best.

Go Bears Go

by Rocksanddirt on Sep 11, 2009 11:49 AM PDT

Interesting post

Did you calculate these correlation coefficients by constructing a new model for each variable, or are these coefficients from multiple linear regression models? If the former, it would be interesting to try different combinations of these variables to see if there is any interaction… I’d suspect that rushing/passing yards may interact, for some of the reasons you gave above. Given the small sample size though, I don’t think it would be wise to include too many more variables in any given model.

by Mister Pie on Sep 12, 2009 8:56 PM PDT

No – I just used Excel’s correlate function to compare two columns of data. I would love to put together a model using multiple variables, but I don’t know nearly enough about statistics.

dboneisloose

by HolmoePhobe on Sep 13, 2009 10:20 PM PDT

0-8 in <100 yrd rushing!

This sort of shocked me to know we have had 8 games where we rushed below 100. That’s a third of the games in this sample. I’m beginning to understand our slight underachievement these past couple years.

by YleeXOtee on Sep 15, 2009 12:16 PM PDT
