This post is a follow-up to this one. If you haven’t seen it yet, go read that one first.
Winning Times
The most common complaint in the comments on the Olympic Trials model I posted yesterday was that the times were too slow. Many commenters suggested that winning events at Trials would take much faster times than those listed. The model agrees with them. I mentioned this in passing at the bottom of the original post, but I thought I’d go into more detail.
The original post listed predicted times, but the model doesn’t actually predict an exact time for a swimmer. Instead, it returns a distribution of times, and the predicted time listed is the median of that distribution. This means that for any predicted time, the model thinks there is a 50% chance the swimmer will be faster and a 50% chance the swimmer will be slower, with times becoming less and less likely the further they get from the median. As a result, the predicted time required to win an event is always as fast as or faster than the predicted time of the top-ranked swimmer. I’ll repeat my previous explanation of this effect here:
In the women’s 400 IM, the model has Maya DiRado 1st in 4:33.84. It thinks she has a 50% chance of going faster than that time and a 50% chance of going slower. She only has a 34% chance to get 1st place. That means at least 16% of the time when she’s faster than the predicted time, it still won’t be good enough to win. This effect pushes the predicted winning time faster than the predicted time of the top ranked swimmer.
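To make the effect concrete, here is a minimal simulation sketch. It is not the model itself: it assumes each swimmer’s time is normally distributed around her predicted time, and the 2.7-second spread and the choice of four swimmers are assumptions made up for illustration. The medians are the predicted times from the women’s 400 IM list in this post.

```python
import random
import statistics

# Predicted (median) times in seconds, from the women's 400 IM list in this post.
medians = {
    "DiRado": 273.84,    # 4:33.84
    "Beisel": 274.04,    # 4:34.04
    "Leverenz": 277.12,  # 4:37.12
    "Ledecky": 277.71,   # 4:37.71
}
SPREAD = 2.7  # assumed standard deviation in seconds (not a model output)


def simulate(n_races=20000, seed=0):
    """Simulate races and compare the winner's time with the top swimmer's time."""
    rng = random.Random(seed)
    winning_times = []
    top_swimmer_times = []
    top_swimmer_wins = 0
    for _ in range(n_races):
        # Draw one time per swimmer for this simulated race.
        race = {name: rng.gauss(mu, SPREAD) for name, mu in medians.items()}
        winner = min(race, key=race.get)
        winning_times.append(race[winner])
        top_swimmer_times.append(race["DiRado"])
        if winner == "DiRado":
            top_swimmer_wins += 1
    return (
        statistics.median(winning_times),
        statistics.median(top_swimmer_times),
        top_swimmer_wins / n_races,
    )


win_median, dirado_median, dirado_win_rate = simulate()
print(f"median winning time:    {win_median:.2f} s")
print(f"DiRado's median time:   {dirado_median:.2f} s")
print(f"DiRado's share of wins: {dirado_win_rate:.0%}")
```

In every simulated race the winner is at least as fast as DiRado, so the median winning time can never come out slower than her median. It comes out faster because some of her good swims still lose to someone else’s better swim.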
Here are the model’s predicted 1st and 2nd place times:
Men:
Event | 1st | 2nd
50 Free | 21.33 | 21.55
100 Free | 47.72 | 48.08
200 Free | 1:44.78 | 1:45.60
400 Free | 3:43.24 | 3:44.79
1500 Free | 14:40.32 | 14:48.42
100 Back | 52.08 | 52.56
200 Back | 1:54.22 | 1:55.14
100 Breast | 58.97 | 59.37
200 Breast | 2:07.31 | 2:08.32
100 Fly | 50.48 | 50.94
200 Fly | 1:53.42 | 1:54.56
200 IM | 1:55.22 | 1:56.43
400 IM | 4:08.45 | 4:10.36
Women:
Event | 1st | 2nd
50 Free | 24.29 | 24.48
100 Free | 52.81 | 53.19
200 Free | 1:54.01 | 1:55.20
400 Free | 3:58.47 | 4:01.98
800 Free | 8:06.83 | 8:16.52
100 Back | 58.65 | 59.09
200 Back | 2:06.60 | 2:07.71
100 Breast | 1:05.25 | 1:05.80
200 Breast | 2:22.42 | 2:23.46
100 Fly | 56.65 | 57.24
200 Fly | 2:06.01 | 2:07.02
200 IM | 2:08.66 | 2:09.71
400 IM | 4:31.20 | 4:33.61
If anything, I think these times may be a bit fast (but that’s just my subjective opinion). The worst offender is the women’s 800 free, where the winning time drops from Katie Ledecky’s predicted 8:07.05 to 8:06.83. This means the model thinks there is a world where Ledecky goes under 8:07.05 and loses. That seems extremely unlikely (to be fair, the model thinks it’s unlikely too, but probably not unlikely enough). The model overpredicts this outcome because, unlike in a typical event, Ledecky’s competitors are at a level to reasonably make an Olympic final, whereas Ledecky is operating in her own universe. When a swimmer is that far in front of the field, it can just as easily indicate a weak field as an exceptionally strong front-runner. The model doesn’t account for this difference. Instead it looks at Breeja Larson’s 3.2% drop in the 100 breast at the 2012 Trials and thinks a drop like that is possible, if unlikely, for anyone (if Becca Mann dropped 3.2%, it’s an 8:05).
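The arithmetic behind that last parenthetical can be checked with a short sketch. The helper names are made up for illustration, and it assumes the 3.2% drop is applied to Mann’s 8:21.77 seed time from the list in this post.

```python
def parse_time(text):
    """Convert 'M:SS.ss' or 'SS.ss' into seconds."""
    minutes, _, seconds = text.rpartition(":")
    return (int(minutes) if minutes else 0) * 60 + float(seconds)


def format_time(total_seconds):
    """Convert seconds back into 'M:SS.ss'."""
    minutes, seconds = divmod(total_seconds, 60)
    return f"{int(minutes)}:{seconds:05.2f}"


DROP = 0.032  # Breeja Larson's 3.2% drop at the 2012 Trials

mann_seed = parse_time("8:21.77")           # assumed starting point: seed time
mann_dropped = mann_seed * (1 - DROP)
ledecky_predicted = parse_time("8:07.05")

print(format_time(mann_dropped))            # 8:05.71
print(mann_dropped < ledecky_predicted)     # True
```

So a Larson-sized drop from Mann would be enough to beat Ledecky’s predicted time, which is the kind of outcome the model treats as unlikely but possible.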
PRs
Another complaint from commenters was the lack of predicted PRs. Again, the chance to beat a seed time is included in the distribution of possible times for each swimmer. In 2012, 19% of women and 32% of men at Trials beat their seed times. Those numbers were higher for top seeds, but they weren’t higher than 50%. The model expects 41% of the top 24 men and 35% of the top 24 women to beat their seed times (and 44%/38% for predicted top 8 swimmers).
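As a rough illustration of how a beat-seed chance falls out of a distribution of times, here is a minimal sketch. It assumes a normal distribution centered on the predicted time, which is a simplification and not necessarily the shape the model uses, and the 0.24-second spread is an assumed value picked for the example. The seed and predicted times are Nathan Adrian’s 50 free numbers from the list below.

```python
from statistics import NormalDist


def chance_to_beat_seed(seed, predicted, spread):
    """P(time < seed) when times are Normal(predicted, spread)."""
    return NormalDist(mu=predicted, sigma=spread).cdf(seed)


# Nathan Adrian, 50 free: seed 21.37, predicted (median) 21.40.
ASSUMED_SPREAD = 0.24  # seconds; an assumption, not a model output
p = chance_to_beat_seed(seed=21.37, predicted=21.40, spread=ASSUMED_SPREAD)
print(f"{p:.0%}")
```

With that assumed spread the sketch gives about 45%, in line with the listed value. A swimmer whose seed time equals the predicted time gets exactly 50%, and a predicted time slower than the seed pushes the chance below 50%.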
Here’s the full list of chances to beat seed:
Men:
50 Free:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 1 | Adrian, Nathan | 21.37 | 21.40 | 45%
2 | 3 | Ervin, Anthony | 21.55 | 21.67 | 31%
3 | 2 | Dressel, Caeleb | 21.53 | 21.69 | 26%
4 | 4 | Schneider, Josh | 21.80 | 21.87 | 39%
5 | 5 | Jones, Cullen | 21.83 | 21.88 | 42%
6 | 6 | Chadwick, Michael | 22.03 | 22.10 | 39%
7 | 9 | Powers, Paul | 22.18 | 22.15 | 55%
8 | 11 | Copeland, William | 22.25 | 22.21 | 56%
100 Free:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 1 | Adrian, Nathan | 48.00 | 47.91 | 57%
2 | 2 | Phelps, Michael | 48.45 | 48.53 | 44%
3 | 5 | Schneider, Josh | 48.76 | 48.77 | 49%
4 | 4 | Dressel, Caeleb | 48.74 | 48.85 | 42%
5 | 7 | Chadwick, Michael | 48.87 | 48.88 | 49%
6 | 8 | Lochte, Ryan | 48.90 | 48.88 | 51%
7 | 3 | Ervin, Anthony | 48.71 | 48.89 | 37%
8 | 11 | Conger, Jack | 49.02 | 48.94 | 56%
200 Free:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 2 | Dwyer, Conor | 1:45.41 | 1:45.39 | 51%
2 | 1 | Lochte, Ryan | 1:45.36 | 1:45.80 | 36%
3 | 3 | Rooney, Maxime | 1:47.10 | 1:47.17 | 48%
4 | 4 | Grothe, Zane | 1:47.11 | 1:47.54 | 36%
5 | 9 | Weiss, Michael | 1:47.63 | 1:47.60 | 51%
6 | 7 | Haas, Townley | 1:47.55 | 1:47.61 | 48%
7 | 10 | Klueh, Michael | 1:47.73 | 1:47.65 | 53%
8 | 12 | Smith, Clark | 1:47.97 | 1:47.67 | 60%
400 Free:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 1 | Jaeger, Connor | 3:44.81 | 3:45.65 | 37%
2 | 3 | Dwyer, Conor | 3:46.09 | 3:45.77 | 55%
3 | 2 | Grothe, Zane | 3:45.98 | 3:46.46 | 43%
4 | 4 | McBroom, Michael | 3:46.69 | 3:46.50 | 53%
5 | 5 | Smith, Clark | 3:47.10 | 3:46.91 | 53%
6 | 6 | Haas, Townley | 3:48.69 | 3:49.45 | 38%
7 | 8 | Sweetser, True | 3:49.33 | 3:49.46 | 48%
8 | 7 | Shoults, Grant | 3:48.91 | 3:49.97 | 34%
1500 Free:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 1 | Jaeger, Connor | 14:41.20 | 14:44.40 | 37%
2 | 2 | Wilimovsky, Jordan | 14:53.12 | 14:50.53 | 60%
3 | 3 | McBroom, Michael | 14:56.17 | 14:54.96 | 55%
4 | 5 | Smith, Clark | 15:05.97 | 15:02.12 | 65%
5 | 4 | Ryan, Sean | 15:03.82 | 15:06.29 | 40%
6 | 6 | Gemmell, Andrew | 15:07.82 | 15:08.24 | 48%
7 | 7 | Sweetser, True | 15:10.73 | 15:12.30 | 44%
8 | 13 | Abruzzo, Andrew | 15:15.99 | 15:12.65 | 63%
100 Back:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 1 | Plummer, David | 52.51 | 52.46 | 53%
2 | 3 | Murphy, Ryan | 52.57 | 52.57 | 50%
3 | 2 | Grevers, Matt | 52.54 | 52.75 | 36%
4 | 4 | Pebley, Jacob | 53.57 | 53.52 | 53%
5 | 5 | Godsoe, Eugene | 53.96 | 54.11 | 40%
6 | 6 | Conger, Jack | 54.09 | 54.30 | 37%
7 | 7 | Kaliszak, Luke | 54.23 | 54.41 | 38%
8 | 10 | Mulcare, Patrick | 54.50 | 54.56 | 46%
200 Back:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 2 | Murphy, Ryan | 1:54.94 | 1:54.99 | 48%
2 | 1 | Clary, Tyler | 1:54.73 | 1:55.34 | 32%
3 | 3 | Pebley, Jacob | 1:56.29 | 1:56.02 | 58%
4 | 4 | Lochte, Ryan | 1:56.47 | 1:56.80 | 40%
5 | 5 | Lehane, Sean | 1:57.11 | 1:56.94 | 55%
6 | 6 | Grevers, Matt | 1:57.24 | 1:57.28 | 49%
7 | 7 | Mulcare, Patrick | 1:57.34 | 1:57.67 | 40%
8 | 9 | Owen, Robert | 1:57.96 | 1:57.74 | 57%
100 Breast:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 1 | Miller, Cody | 59.51 | 59.56 | 47%
2 | 2 | Fink, Nic | 59.52 | 59.67 | 41%
3 | 3 | Wilson, Andrew | 59.65 | 59.73 | 45%
4 | 4 | Cordes, Kevin | 59.70 | 59.95 | 36%
5 | 5 | Tierney, Sam | 1:00.15 | 1:00.20 | 47%
6 | 10 | Prenot, Josh | 1:00.66 | 1:00.40 | 65%
7 | 6 | McHugh, Brendan | 1:00.31 | 1:00.54 | 37%
8 | 11 | Andrew, Michael | 1:00.68 | 1:00.62 | 53%
200 Breast:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 2 | Prenot, Josh | 2:08.58 | 2:08.40 | 55%
2 | 1 | Cordes, Kevin | 2:07.86 | 2:08.46 | 34%
3 | 3 | Fink, Nic | 2:08.89 | 2:09.11 | 44%
4 | 4 | Miller, Cody | 2:09.08 | 2:09.80 | 31%
5 | 6 | Licon, Will | 2:10.02 | 2:10.36 | 41%
6 | 5 | Wilson, Andrew | 2:09.84 | 2:10.88 | 24%
7 | 7 | Johnson, BJ | 2:10.77 | 2:11.05 | 43%
8 | 9 | Whitley, Reece | 2:11.30 | 2:11.94 | 33%
100 Fly:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 1 | Phelps, Michael | 50.45 | 50.69 | 34%
2 | 2 | Shields, Tom | 51.03 | 51.12 | 44%
3 | 3 | Conger, Jack | 51.33 | 51.47 | 40%
4 | 5 | Lochte, Ryan | 51.55 | 51.74 | 37%
5 | 4 | Phillips, Tim | 51.49 | 51.75 | 33%
6 | 6 | Josa, Matthew | 51.68 | 51.89 | 36%
7 | 8 | Smith, Giles | 51.92 | 51.93 | 49%
8 | 9 | Nolan, David | 52.15 | 52.25 | 43%
200 Fly:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 1 | Phelps, Michael | 1:52.94 | 1:53.71 | 27%
2 | 2 | Conger, Jack | 1:54.54 | 1:55.34 | 27%
3 | 3 | Shields, Tom | 1:55.09 | 1:55.77 | 30%
4 | 4 | Clary, Tyler | 1:55.42 | 1:55.88 | 36%
5 | 6 | Kalisz, Chase | 1:56.50 | 1:56.51 | 50%
6 | 5 | Seliskar, Andrew | 1:55.92 | 1:56.58 | 31%
7 | 8 | Clark, Pace | 1:56.84 | 1:56.85 | 50%
8 | 7 | Whitaker, Kyle | 1:56.67 | 1:57.45 | 28%
200 IM:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 1 | Phelps, Michael | 1:54.75 | 1:55.59 | 26%
2 | 2 | Lochte, Ryan | 1:55.81 | 1:56.61 | 27%
3 | 3 | Dwyer, Conor | 1:57.41 | 1:57.85 | 37%
4 | 4 | Prenot, Josh | 1:58.38 | 1:58.46 | 48%
5 | 6 | Kalisz, Chase | 1:58.73 | 1:58.74 | 50%
6 | 5 | Licon, Will | 1:58.43 | 1:58.80 | 39%
7 | 11 | Nolan, David | 1:59.40 | 1:59.32 | 52%
8 | 10 | Bentz, Gunnar | 1:59.19 | 1:59.36 | 45%
400 IM:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 2 | Kalisz, Chase | 4:09.62 | 4:10.09 | 43%
2 | 1 | Clary, Tyler | 4:09.03 | 4:11.52 | 19%
3 | 4 | Lochte, Ryan | 4:12.66 | 4:12.62 | 51%
4 | 5 | Prenot, Josh | 4:13.15 | 4:13.02 | 52%
5 | 3 | Litherland, Jay | 4:12.43 | 4:13.06 | 41%
6 | 8 | Grieshop, Sean | 4:15.67 | 4:15.44 | 53%
7 | 6 | Bentz, Gunnar | 4:14.16 | 4:15.56 | 31%
8 | 7 | Weiss, Michael | 4:14.85 | 4:16.35 | 30%
Women:
50 Free:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 1 | Kennedy, Madison | 24.45 | 24.60 | 32%
2 | 2 | Manuel, Simone | 24.47 | 24.61 | 34%
3 | 6 | Weitzeil, Abbey | 24.72 | 24.71 | 51%
4 | 5 | Vollmer, Dana | 24.69 | 24.82 | 35%
5 | 3 | Martin, Ivy | 24.62 | 24.86 | 23%
6 | 4 | Coughlin, Natalie | 24.66 | 24.88 | 25%
7 | 10 | Worrell, Kelsi | 24.98 | 25.01 | 46%
8 | 8 | Weir, Amanda | 24.85 | 25.02 | 31%
100 Free:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 1 | Manuel, Simone | 53.25 | 53.41 | 41%
2 | 3 | Vollmer, Dana | 53.59 | 53.68 | 45%
3 | 5 | Weitzeil, Abbey | 53.77 | 53.80 | 48%
4 | 4 | Ledecky, Katie | 53.75 | 53.92 | 41%
5 | 2 | Franklin, Missy | 53.43 | 54.03 | 20%
6 | 8 | Neal, Lia | 54.01 | 54.03 | 49%
7 | 6 | Coughlin, Natalie | 53.85 | 54.24 | 30%
8 | 7 | Geer, Margo | 53.95 | 54.27 | 33%
200 Free:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 1 | Ledecky, Katie | 1:54.43 | 1:54.48 | 49%
2 | 2 | Franklin, Missy | 1:55.49 | 1:56.30 | 30%
3 | 3 | Schmitt, Allison | 1:56.23 | 1:56.48 | 44%
4 | 4 | Smith, Leah | 1:56.64 | 1:56.70 | 48%
5 | 5 | Margalis, Melanie | 1:57.33 | 1:57.41 | 48%
6 | 8 | DiRado, Maya | 1:57.70 | 1:58.04 | 41%
7 | 9 | Manuel, Simone | 1:57.90 | 1:58.11 | 45%
8 | 10 | Runge, Cierra | 1:57.97 | 1:58.20 | 44%
400 Free:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 1 | Ledecky, Katie | 3:58.37 | 3:58.88 | 44%
2 | 2 | Smith, Leah | 4:03.33 | 4:03.46 | 48%
3 | 3 | Runge, Cierra | 4:04.55 | 4:05.85 | 35%
4 | 6 | Vrooman, Lindsay | 4:07.16 | 4:07.28 | 49%
5 | 5 | Mann, Becca | 4:07.09 | 4:07.32 | 47%
6 | 4 | Schmitt, Allison | 4:06.88 | 4:07.55 | 42%
7 | 9 | Flickinger, Hali | 4:07.93 | 4:08.23 | 46%
8 | 7 | Beisel, Elizabeth | 4:07.46 | 4:08.34 | 40%
800 Free:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 1 | Ledecky, Katie | 8:06.68 | 8:07.05 | 48%
2 | 2 | Mann, Becca | 8:21.77 | 8:21.72 | 50%
3 | 4 | Smith, Leah | 8:24.74 | 8:23.74 | 56%
4 | 3 | Runge, Cierra | 8:24.69 | 8:25.85 | 43%
5 | 5 | Peacock, Stephanie | 8:25.89 | 8:26.02 | 49%
6 | 6 | Vrooman, Lindsay | 8:26.67 | 8:27.34 | 46%
7 | 7 | Schmidt, Sierra | 8:27.54 | 8:28.52 | 44%
8 | 13 | Ryan, Gillian | 8:31.97 | 8:30.12 | 61%
100 Back:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 3 | Smoliga, Olivia | 59.41 | 59.41 | 50%
2 | 1 | Coughlin, Natalie | 59.05 | 59.57 | 26%
3 | 2 | Franklin, Missy | 59.38 | 59.69 | 35%
4 | 6 | Stevens, Hannah | 59.67 | 59.77 | 45%
5 | 7 | Deloof, Ali | 1:00.10 | 1:00.01 | 54%
6 | 5 | Baker, Kathleen | 59.63 | 1:00.04 | 30%
7 | 4 | Adams, Claire | 59.58 | 1:00.06 | 27%
8 | 8 | Bootsma, Rachel | 1:00.25 | 1:00.38 | 44%
200 Back:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 1 | Franklin, Missy | 2:06.34 | 2:07.47 | 25%
2 | 2 | DiRado, Maya | 2:08.19 | 2:08.39 | 45%
3 | 3 | Beisel, Elizabeth | 2:08.33 | 2:09.61 | 23%
4 | 6 | Baker, Kathleen | 2:09.36 | 2:09.70 | 42%
5 | 5 | Pelton, Elizabeth | 2:09.36 | 2:10.29 | 30%
6 | 13 | Flickinger, Hali | 2:10.60 | 2:10.51 | 52%
7 | 9 | Voss, Erin | 2:10.12 | 2:10.68 | 37%
8 | 19 | Smiddy, Clara | 2:11.15 | 2:10.76 | 59%
100 Breast:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 1 | Meili, Katie | 1:05.64 | 1:05.98 | 35%
2 | 2 | King, Lilly | 1:05.73 | 1:06.00 | 38%
3 | 4 | Haase, Sarah | 1:06.31 | 1:06.54 | 40%
4 | 3 | Hannis, Molly | 1:06.16 | 1:06.56 | 33%
5 | 5 | Hardy, Jessica | 1:06.51 | 1:06.95 | 31%
6 | 6 | Lawrence, Micah | 1:06.51 | 1:07.03 | 28%
7 | 7 | Larson, Breeja | 1:06.73 | 1:07.05 | 36%
8 | 9 | Margalis, Melanie | 1:07.26 | 1:07.39 | 44%
200 Breast:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 1 | Lawrence, Micah | 2:22.04 | 2:24.33 | 12%
2 | 5 | King, Lilly | 2:24.47 | 2:24.93 | 41%
3 | 2 | Sogar, Laura | 2:23.54 | 2:25.10 | 21%
4 | 3 | Meili, Katie | 2:23.69 | 2:25.21 | 22%
5 | 4 | Larson, Breeja | 2:24.16 | 2:25.27 | 28%
6 | 6 | Margalis, Melanie | 2:24.68 | 2:25.48 | 34%
7 | 7 | Lazor, Annie | 2:24.96 | 2:25.55 | 38%
8 | 8 | Hannis, Molly | 2:25.26 | 2:26.06 | 34%
100 Fly:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 1 | Vollmer, Dana | 56.94 | 57.01 | 46%
2 | 2 | Worrell, Kelsi | 57.24 | 57.31 | 46%
3 | 3 | Stewart, Kendyl | 57.82 | 58.00 | 41%
4 | 5 | Donahue, Claire | 58.03 | 58.31 | 36%
5 | 6 | Lee, Felicia | 58.14 | 58.52 | 31%
6 | 4 | McLaughlin, Katie | 57.87 | 58.53 | 20%
7 | 8 | Merrell, Eva | 58.58 | 58.79 | 39%
8 | 10 | Moffitt, Hellen | 58.86 | 58.85 | 51%
200 Fly:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 1 | Adams, Cammile | 2:06.33 | 2:07.04 | 34%
2 | 4 | Flickinger, Hali | 2:07.59 | 2:08.18 | 37%
3 | 5 | Bayer, Cassidy | 2:08.03 | 2:08.62 | 37%
4 | 2 | McLaughlin, Katie | 2:06.95 | 2:08.82 | 14%
5 | 6 | Worrell, Kelsi | 2:08.61 | 2:09.03 | 40%
6 | 3 | DiRado, Maya | 2:07.42 | 2:09.04 | 17%
7 | 7 | Mills, Kate | 2:08.89 | 2:09.51 | 36%
8 | 10 | Saiz, Hannah | 2:09.83 | 2:09.96 | 47%
200 IM:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 1 | DiRado, Maya | 2:08.99 | 2:09.91 | 30%
2 | 2 | Margalis, Melanie | 2:10.20 | 2:10.80 | 37%
3 | 3 | Leverenz, Caitlin | 2:10.35 | 2:10.92 | 37%
4 | 4 | Eastin, Ella | 2:10.54 | 2:10.97 | 40%
5 | 5 | Cox, Madisyn | 2:10.75 | 2:11.45 | 35%
6 | 7 | Baker, Kathleen | 2:12.09 | 2:12.61 | 38%
7 | 9 | Henry, Sarah | 2:12.25 | 2:12.83 | 37%
8 | 6 | Small, Meghan | 2:11.26 | 2:12.90 | 18%
400 IM:
Predicted Place | Psych Place | Name | Seed Time | Predicted Time | Beat Seed
1 | 1 | DiRado, Maya | 4:31.71 | 4:33.84 | 28%
2 | 2 | Beisel, Elizabeth | 4:31.99 | 4:34.04 | 29%
3 | 3 | Leverenz, Caitlin | 4:35.46 | 4:37.12 | 33%
4 | 5 | Ledecky, Katie | 4:37.93 | 4:37.71 | 52%
5 | 4 | Mann, Becca | 4:37.04 | 4:37.73 | 43%
6 | 7 | Adams, Cammile | 4:38.97 | 4:39.26 | 47%
7 | 6 | Henry, Sarah | 4:38.88 | 4:40.00 | 38%
8 | 8 | Eastin, Ella | 4:40.70 | 4:40.31 | 54%
These odds aren’t perfect. It’s probably possible to improve some of them by manually adjusting the odds up or down for well-known swimmers based on training style, recent rest levels, and a hundred other known variables. The problem is that there would then be two different models, one for famous swimmers and one for less well-known swimmers, and the combined model would have a huge bias.
This is awesome work. Is there a way we can contact the author for more detailed questions? I recently started grad school in statistics, and modeling is definitely one of my interests.
Thanks for doing this, Andrew!
Apple
8 years ago
Love reading these – thanks for putting them together. I really hope the model is right about the fast times, and I hope they are duplicated or improved upon in Rio. It’ll be fun to watch!
But Nate Silver got it so totally wrong on your favorite politician Donald Trump. I wonder what you actually wanted to say about Andrew.
G.I.N.A.
8 years ago
Andrew. I calculated the number of places of the total (68 including relays) & estimated a full 50% of places would be taken by (proven) Australian performers from trials.
That’s about the same as Kazan where USA just tipped Australia out due to uncontested mixed relays. I think SWSWers would be bored to read the numbers but I could give them. Basically 10 men 16 women in individual & 7-8 in relays.
So basically you have to do a whole lot better than those to be numero uno. If not, then we can start to consider we are nearing the limits. If there are big jumps then there is more to go.…
Attila the Hunt
8 years ago
I think your 1st and 2nd place prediction model is the best I’ve seen and will be very close to the actual OT swims next week. Great work!
Eric
8 years ago
I hope that they don’t go too fast for trials or they may not go as fast at the actual Olympics…I like these predictions but time will tell their accuracy
Nice work – one really easy improvement to make to the model would be to include the swimmer’s age. There should be significant differences in who is likely to improve by age, with the patterns probably slightly different for each gender.
That’s what I thought too. I tried adding age in, but there was no meaningful improvement in the results.
Savannah
8 years ago
A 1:47.6 for Townley? After his 1:30 in short course and 1:47 last year? I can’t imagine he won’t drop any time in long course. I’d bet on him being top 4.
I love these articles. But then, I also hate them, because comments like this highlight how few people are doing any reading of anything rather than just scrolling to their favorite swimmer in the charts and whining.
When a comment is deleted, its un-deleted replies suddenly become orphans and they attach themselves to another completely unrelated comment. Hilarity ensues.
Reader, I agree with you. I love all these stats and data, and it is quite irritating (must be for the writer, too) that this happens quite often. It also happened in the first installment of this article.
Andrew Mering. The Nate Silver of swimming.
https://swimswam.com/2016-u-s-olympic-trials-previews-murphys-time-mens-200-back/#comment-426870
If one is a “P****k ” does one have to be born that way or can you just choose to be one ? Do you need surgery & can you use any change room?
These aren’t his real predictions, they are like a poll I believe. He made a 1:46.5 for Haas I think.