Prepping for the BC: A Statistical Analysis (updated!)

Re: Prepping for the Breeders' Cup: A Statistical Analysis

Postby Tessablue » Mon Oct 30, 2017 9:49 pm

Man I hadn't considered how unseasonably warm it's been on the east coast this year.... definitely something else to think about.

Kennedy wrote: I wonder if it might be easy to translate these findings into Impact Values? So you take a 10/1 shot and say that they are expected to win xx% of the time based strictly on the fact that they are 10/1. Then you could potentially isolate 10/1 entrants who last prepped at Belmont vs the overall 10/1 average. Or the same with Santa Anita. It may be interesting to draw out if a 10/1 from California is a "better" or more likely winner than a 10/1 from another locale (when the BC is hosted in Cal). Or is it even the other way around? Do the locals take more money and a 10/1 from another regional base is actually a better play?

This may be particularly interesting to note with Europeans. How many times have we seen well-backed Europeans lose to other Europeans in the BC who are actually quite good but for whatever reason aren't as well received at the windows. A lot of the European winners are not the odds-on choices.

That would be an ideal way to look at it. This system sort of has to work around the fact that I don't know much about parimutuel math or odds distributions (aside from the classic 33/66% favorite performance rates). Just looking at the raw data, you can pull out horses who go off between 8-1 and 12-1... Belmont preppers in this range have a mean finish of 5th and differential of -16.5. Keeneland horses (very few, so no conclusions here) finish 3rd with a +0.24, Santa Anita horses finish 5th with a -16.2 (these all account for field size). So I don't know how those compare to the previous outcomes of ~10-1 horses, but it looks like the bettors aren't great at evaluating these horses and the prep location effects are more from false favorites and missed longshots.
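
For anyone who wants to try Kennedy's idea, here is a minimal sketch of one way to read it: turn each horse's odds into an implied win probability, add those up per prep track to get expected wins, and compare with actual wins. This is an actual-over-expected ratio, which may not match the conventional Impact Value formula. The function names and the starters are hypothetical, and the `1 / (odds + 1)` conversion is an assumption that ignores takeout.

```python
from collections import defaultdict

def implied_win_prob(odds_to_one):
    """Win probability implied by odds such as 10-1 (assumption: takeout ignored)."""
    return 1.0 / (odds_to_one + 1.0)

def actual_vs_expected(starters):
    """Return {prep_track: (actual_wins, expected_wins, ratio)}.

    Each starter is a (prep_track, odds_to_one, won) tuple.
    """
    actual = defaultdict(int)
    expected = defaultdict(float)
    for prep_track, odds, won in starters:
        actual[prep_track] += 1 if won else 0
        expected[prep_track] += implied_win_prob(odds)
    return {
        track: (actual[track], expected[track], actual[track] / expected[track])
        for track in expected
    }

# Hypothetical starters, invented purely for illustration (not real results).
starters = [
    ("Belmont", 10.0, False),
    ("Belmont", 9.0, True),
    ("Belmont", 12.0, False),
    ("Santa Anita", 8.0, False),
    ("Santa Anita", 11.0, False),
    ("Santa Anita", 10.0, True),
]

for track, (wins, exp, ratio) in actual_vs_expected(starters).items():
    print(f"{track}: {wins} wins vs {exp:.2f} expected (ratio {ratio:.2f})")
```

A ratio above 1 would suggest that group wins more often than its odds imply, and below 1 less often, though with samples this small any single ratio is mostly noise.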

I'm fascinated by the European question as well, can't wait to get to that part. My instinct here is that Americans are pretty bad at evaluating overseas horses- I certainly am, and I usually just bet them based on their offshore odds. I wonder if I could find offshore odds for past BCs and compare them to the post-time odds here? That might be one way to evaluate it.

Somnambulist wrote: The betting public is interesting (Cuvee being listed as an example of an underperformer comes to mind) and there is seriously way more attention to the lead-up to these races than there is for most others, save the TC. People watch works and formulate opinions based off weeks (which I do NOT agree with) of trainer comments, works, dozens of articles, and worse now... social media. It garners way more casual attention from the nation as a whole.

Who the betting public decides to stand behind is an exercise in collective decision making I'd like to see expounded on more. For the 20 or so people globally who care.

TB, honestly I'd like to see your formula applied to claiming and allowance races because I view the TC races and the BC to be an anomaly in terms of how people come to their decisions.

Ooh I would love to see that too! Unfortunately Equibase charts are protected from export, so I have to input all the horses manually, but it is definitely a project for the offseason and I'd love to see how bettor capabilities vary from track to track. I agree with you that bettors approach these races very differently, and there's also that aspect of casual bettors and non-bettors influencing the win pool. Win odds are probably one of the worst parameters for this, because they are swayed by the public in a way that exotic pools are not.

Worth mentioning that Baffert today said he doesn't think there's any advantage for SA horses going to Del Mar, because "it's totally different down there." Of course, Baffert says a lot of things.

Postby BaroqueAgain1 » Mon Oct 30, 2017 9:59 pm

I don't think he's BS'tting here, though.

Postby Tessablue » Tue Nov 07, 2017 9:13 pm

Another BC in the books... unfortunately I never got a chance to even begin to look at historical turf races, but I did want to take a retrospective look at Del Mar to see how the trends previously discussed held up. Did it look much like a Santa Anita or Hollywood Breeders' Cup?

First off, a lot has been made about the number of flops and disappointments this year. But if we actually look at the performances of dirt runners (note that all of these numbers are for dirt only), the event was on aggregate a surprisingly formful one. Remember that a negative figure means horses ran worse than expected, while a positive one means they ran surprisingly well relative to their odds. If we rank the California BCs by their median true differential, from worst to best, we get this:

2012 -19.10
2013 -16.20
2003 -14.43
2014 -9.89
2017 -9.60
1997 -9.18
1993 -8.27
2016 -7.59

So 2017 was... actually pretty average, surprisingly. Now because this is a very rough measurement it is entirely possible that the overperformance of longshots sort of washed out the disappointing performances by favorites, but as we'll soon see, 2017 wasn't particularly unusual in other regards as well.
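
As a quick sanity check on "pretty average", the ranking above can be reproduced from the eight listed medians. This is only a sketch that works on the yearly summary numbers already posted; it does not recompute the true differential itself, and the variable names are my own.

```python
from statistics import median

# Median true differentials for the California BCs, copied from the list above.
median_diff_by_year = {
    2012: -19.10, 2013: -16.20, 2003: -14.43, 2014: -9.89,
    2017: -9.60, 1997: -9.18, 1993: -8.27, 2016: -7.59,
}

# Rank from worst (most negative) to best.
ranked = sorted(median_diff_by_year.items(), key=lambda item: item[1])
rank_2017 = [year for year, _ in ranked].index(2017) + 1
middle = median(median_diff_by_year.values())

print(f"2017 ranks {rank_2017} of {len(ranked)} (1 = worst)")
print(f"middle of the eight yearly medians: {middle:.3f}")
```

2017 comes out 5th of 8, and its -9.60 sits right next to the midpoint of the eight yearly values (about -9.7), which is consistent with calling it an average year by this measure.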

Next, let's look at prep location:
This isn't all that dissimilar to what I've seen with other years. Probably the biggest surprise here is the atrocious performance by the Keeneland contingent. Keeneland horses weren't much respected at the windows, with median odds of 20-1, but they vastly underperformed as a whole (only two horses received a score better than -18: Bar of Gold and Givemeaminit). Otherwise, the prep performances can be summed up roughly like this:

Saratoga = Santa Anita
Belmont = Del Mar

So while there was no obvious bias towards local preppers, it must be noted that all Del Mar-prepping horses were coming off of layoffs. Of course, so were the Saratoga horses. Saratoga preppers were quite rare in the past but are beginning to accumulate pretty quickly, so this is certainly a future angle to watch.

Now one of the most interesting points of analysis I previously found involved the discrepancy between Belmont prep winners and Belmont prep losers. Did that trend hold true for this BC as well?
...sort of! Belmont losers did do better, and in fact both juvenile race winners fit that profile, but none of the winners did particularly well this year. I'd say the jury is still out in this regard, but if I had to take something away from it, I'd say that as bettors we do tend to value prep wins more than we apparently should.

One last thing. Much has been made of the apparent dead rail on Friday and Saturday, by which I mean it's something I've complained about a decent amount. I wanted to see if that assumption was backed up by more than just the eye test, so I divided the dirt competitors into groups and did this:
Keep in mind that post position doesn't always reflect where a horse ends up during a race, and posts 1-4 had slightly higher odds than the others (median ~20-1 vs 14-1 and 15-1). This also could be completely normal; I haven't yet input this data from previous years. It's also a pretty arbitrary way to divide up these horses. And with all that said, the result is... a shrug of the shoulders. Posts 1-4 did perform worse relative to outside posts if you look at the median, but it's not statistically significant or all that close to significance. I do plan on re-examining this angle using Trakus data at some point in the future.
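
For anyone curious how a check like this might be run, here is a minimal sketch using a permutation test on the difference in medians between inside and outside posts. To be clear, this is an assumption on my part about a reasonable test, not necessarily the one used above, and it lumps all outside posts into one group. The function name and the differentials are hypothetical.

```python
import random
from statistics import median

def permutation_test_median(group_a, group_b, n_shuffles=10_000, seed=0):
    """Two-sided permutation test on the difference of medians.

    Returns (observed_difference, p_value), where the difference is
    median(group_a) - median(group_b).
    """
    rng = random.Random(seed)
    observed = median(group_a) - median(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_shuffles):
        # Reassign horses to the two groups at random and re-measure the gap.
        rng.shuffle(pooled)
        diff = median(pooled[:n_a]) - median(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return observed, (extreme + 1) / (n_shuffles + 1)

# Hypothetical differentials, invented for illustration only.
inside_posts = [-22.0, -15.5, -30.1, -4.2, 3.8, -18.9]
outside_posts = [-12.4, -9.7, 6.1, -20.3, -1.5, -14.8, 2.2]

diff, p_value = permutation_test_median(inside_posts, outside_posts)
print(f"median difference: {diff:.2f}, p = {p_value:.3f}")
```

The appeal of a permutation test here is that it makes no assumption about the shape of the differential distribution, which seems sensible for a handful of horses per group.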

So that sums up what I have for the 2017 BC- but I'm very happy to take requests if anyone is interested in examining other trends. Happily, I now have a year to perfect the system and we have plenty of previous Churchill BCs to examine. I'm also hoping to adapt this system for the Triple Crown, so expect to see it again in six months or so. In the meantime, I'd like to sincerely thank you all for taking the time to read and comment on this work!
