Suspicious Splitting In 800 Free Raises Questions At U.S. Open

by Barry Revzin

December 02nd, 2023 National, News, U.S. Open

2023 U.S. OPEN SWIMMING CHAMPIONSHIPS

In the first heat of the women’s 800 free at the U.S. Open, while most eyes were on Katie Ledecky busy doing Katie Ledecky things, Tennessee senior Aly Breslin was swimming in Lane 8. She didn’t end up hitting a best time, but it was the way she swam her race that caught our eye. Here are her splits, split into two columns for reasons that will become obvious shortly.

Split Comparison

100    1st 50 (odd)    2nd 50 (even)
1      31.21           31.94
2      33.16           32.69
3      33.87           32.71
4      33.35           32.76
5      33.49           32.66
6      33.46           33.04
7      33.84           33.42
8      33.97           32.58

If we omit the first and last 100, her six odd 50s were consistent, averaging 33.53. Her six even 50s were also fairly consistent, but those averaged 32.88, more than six-tenths of a second faster (and even that average is pulled up heavily by the last even 50 of 33.42, by far her slowest even 50).

By itself, this might simply be odd splitting. But it does at least raise the question: how unusual was Breslin’s splitting across all the men’s and women’s 800s?

Let’s take a look!

The way I’ve been analyzing the data (since the 2016 Olympics) is to take every distance swim (just the 800 and 1500), skip the first and last 100s, and look at every pair of consecutive 50m splits (which, for an 800m, would be 11 pairs).

For each pair, I look at the change in average speed for that pair. So for Breslin’s splits, we start with time differences of [-0.47, +1.18, -1.16, +0.64, …] and convert that to speed differences (in cm/s) of [+2.17, -5.33, +5.24, -2.93, …]. The speed difference helps normalize between swimmers of different speeds, and between men and women. I then multiply every other value by -1, ending with a final result, for Breslin’s swim, of [+2.17, +5.33, +5.24, +2.93, …].
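The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the analysis: the function and variable names are made up here, and it assumes a 50m lap (5,000 cm) and that the sign flip applies to the second, fourth, sixth, ... difference, which is what the worked numbers above imply. The splits are Breslin's twelve middle 50s from the table.

```python
# Hypothetical helper names; the article does not publish its code.
POOL_LENGTH_CM = 5000  # one 50m lap, in centimeters

# Breslin's 50m splits with the first and last 100 omitted (from the table).
BRESLIN_SPLITS = [
    33.16, 32.69, 33.87, 32.71, 33.35, 32.76,
    33.49, 32.66, 33.46, 33.04, 33.84, 33.42,
]


def speed_cm_per_s(split_seconds):
    """Average speed over one 50m lap, in cm/s."""
    return POOL_LENGTH_CM / split_seconds


def alternating_speed_diffs(splits):
    """Speed change between consecutive laps, with every other sign flipped.

    After the flip, each value measures the speed gained in the same
    direction of travel, so a consistent directional bias shows up as
    values that share one sign.
    """
    speeds = [speed_cm_per_s(t) for t in splits]
    diffs = [later - earlier for earlier, later in zip(speeds, speeds[1:])]
    return [d if i % 2 == 0 else -d for i, d in enumerate(diffs)]


adjusted = alternating_speed_diffs(BRESLIN_SPLITS)
print(len(adjusted))                        # 12 laps give 11 consecutive pairs
print([round(v, 2) for v in adjusted[:4]])  # compare with the values quoted above
```

Twelve laps produce eleven values per 800, which is where the 11 points per swim come from.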

If we take these 11 points for each swim, and bucket them by lane, we can get one boxplot per lane. The a priori expectations for the kind of distance swimmers who compete at meets like the U.S. Open (or bigger) would be that:

  • on average, distance swimmers swim the same speed in both directions (once you remove the opening and closing 100m, and not just for Bobby Finke)
  • on average, distance swimmers swim the same speeds independent of what lane they are in

When I say the same speed in both directions, this doesn’t necessarily mean that everyone even splits or positive splits or negative splits. There might be some drift in one direction or another throughout the race, but the kind of jagged splitting that Breslin did in this race should be atypical. Moreover, just because one person in Lane 8 has atypical splitting, surely that should be independent of other people in Lane 8 having the same splitting. Or, for instance, people in Lane 1 splitting the race in the exact opposite way.

Right?

Of course, the reason I frame these rhetorical questions in such a leading way is that this is what the data shows when you look at all the 800m swims at the U.S. Open. In total, there were 51 swims producing 561 data points, spread out over the eight lanes. The resulting box plots are:

The things to note here are that the lanes are decidedly not centered at zero, and indeed you can draw a steadily increasing line as you go across the pool. That regression line has a slope of 0.588 cm/s per lane (which is quite a bit), and the lane explains 38.6% of the variance of the splits. The regression is also extremely statistically significant, with a p-value of 3.5e-61.
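One way to arrive at a slope and a share of variance explained is an ordinary least-squares fit of the sign-adjusted speed differences against lane number. The sketch below assumes that approach; the article does not say which tool or model was actually used. The function name is hypothetical, and the toy data are invented purely to exercise the function. They are not meet results. The p-value is left out because it needs a t-distribution, which is easier to get from a statistics library.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared), where r_squared is the share
    of the variance in ys explained by the fitted line.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, 1 - ss_res / ss_tot


# Made-up toy data (NOT from any meet): two points per lane, scattered
# one unit above and below a line that rises 0.5 cm/s per lane.
toy_lanes = []
toy_values = []
for lane in range(1, 9):
    for noise in (-1.0, 1.0):
        toy_lanes.append(lane)
        toy_values.append(0.5 * (lane - 4.5) + noise)

slope, intercept, r_squared = fit_line(toy_lanes, toy_values)
print(slope, intercept, r_squared)
```

With real data, each swim would contribute its 11 values under its lane number, so the 51 swims here would give 561 (lane, value) pairs.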

For contrast, some disciplines use a p-value below 0.05 as the threshold for significance; this one has 59 more zeros in front of it. To give a sense of what 0.588 cm/s means, a 30-second lap is a speed of 166.67 cm/s. Going 0.588 cm/s faster would drop you to 29.89. But this is also per lane. So the model here suggests that if you were to swim a 30-second lap from Lane 4, you would actually swim a 30.32-second lap from Lane 1 and a 29.58-second lap from Lane 8. That’s quite some drift. It’s a large effect.
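The lap-time arithmetic in that paragraph can be reproduced directly. The sketch below assumes a 50m lap and treats the slope as a flat speed change per lane relative to Lane 4; the function name is made up for illustration.

```python
POOL_LENGTH_CM = 5000            # one 50m lap, in centimeters
SLOPE_CM_PER_S_PER_LANE = 0.588  # regression slope quoted in the article


def lap_time_after_speed_change(base_lap_s, delta_cm_per_s):
    """Lap time if the swimmer's average speed changes by delta_cm_per_s."""
    base_speed = POOL_LENGTH_CM / base_lap_s
    return POOL_LENGTH_CM / (base_speed + delta_cm_per_s)


# One lane's worth of extra speed on a 30-second lap.
one_lane_faster = lap_time_after_speed_change(30.0, SLOPE_CM_PER_S_PER_LANE)

# Lane 1 is three lanes below Lane 4; Lane 8 is four lanes above it.
lane_1 = lap_time_after_speed_change(30.0, -3 * SLOPE_CM_PER_S_PER_LANE)
lane_8 = lap_time_after_speed_change(30.0, 4 * SLOPE_CM_PER_S_PER_LANE)

print(round(one_lane_faster, 2), round(lane_1, 2), round(lane_8, 2))
```

Because time is the reciprocal of speed, the same speed change costs slightly more time in the slow direction than it saves in the fast one, which is why the Lane 1 and Lane 8 figures are not symmetric around 30 seconds.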

Now, the first question might be: Well, what about some other meets? Is this just some wacky analysis that flags everything?

Well, I’ve looked at those too. Here is what this year’s Worlds looked like:

Here, we have the luxury of a concluded meet, so we have both the 800 and the 1500, for a total of 2,868 data points. You can see that there’s still some drift, but it’s a lot flatter, and nearly all the boxes do at least cross zero. There is still a statistically significant effect here (p-value of 5.8e-83), but the slope is just 0.19 cm/s per lane and the lane explains just 12% of the variance.

And here is the U.S. Worlds Trials from earlier this year, which is more along the lines of what I always want this chart to look like:

Here, every box intersects zero, and many are centered essentially right at zero. There is still technically a statistically significant effect (p-value of 3.6e-7), just sloping in the other direction. But here the effect is so small (slope of 0.05 cm/s per lane) that it explains almost nothing (0.79% of the variance). The difference from Lane 1 to Lane 8 by this model, for a swimmer expected to swim a 30-second lap, would be just 0.07 seconds (compared to 0.82 in Greensboro).

The point of the comparison is that not all meets show the kind of drift we saw in the 800s in Greensboro, so there might be something here that merits a closer look. The 50s are done, and there aren’t enough data points to draw any kind of conclusion, but the 1500 is on Saturday. I guess we’ll have to wait and see if anything changes.


Brett
11 months ago

I’d like to see an analysis after the 1500. Whitlock didn’t have any problems in lane 8 during finals and he probably only weighs 120 pounds.

Admin
Reply to  Brett
11 months ago

Barry ran it, and there wasn’t an obvious effect – other than lane 1 and 8, which actually showed the opposite effect (though to a smaller degree).

So that means one of a couple of things:

1) Unlikely statistical coincidence (one way or the other).
2) They turned the pumps off after the analysis (which we’ve seen happen before).

Scott Griffith
11 months ago

Coaches noticed the same current situation in the middle of the pool at the knockoff US Open at Liberty Natatorium. Unfortunately the 50s ran upstream even though many coaches requested to swim downstream.

Josh Graham
11 months ago

I’ve been to pools all around the country, and lots have currents in the outside lanes. Was always pretty annoying. I noticed this about a decade ago at the GAC.

It’s a joke…
11 months ago

Clearly the swimmer is going downhill in one direction and uphill in the other.

Nick R.
11 months ago

This is fascinating. Now with this data, you should look at pool design. Try to come up with categories related to the way waves/current are mitigated by lane ropes and jets, side walls, gutters etc and see if there is a correlation between design type and drift effect. Also, I would wager stroke would be a factor. Breaststroke likely produces fewer waves because swimmers spend more time going over and under the surface.

Scott Knopf
11 months ago

Chip Wheelie Shoyat was also in lane 8 of the first heat of the men’s 800. He also had some wacky splits, similar to what you are talking about.

Doug Cornish
11 months ago

Compare split differentials in the 100’s. The split differentials of swimmers going out with the current and coming home against it will be about 1.5 to 2 seconds greater than those going out against and coming home with the current. Same thing happens at Bucknell.

Octavio Gupta
11 months ago

Tl;dr

It’s a current.