2017-03-13

The women’s D1 national championship meet starts on Wednesday. We’ve scored the psych sheet and examined how performances at the big meet change relative to the psych sheet. It’s now time to combine the two into a forecast of the meet.

I ran a Monte Carlo simulation of the NCAA women’s meet, minus diving (most top teams have 1 or 0 divers, except Minnesota and UCLA, who have 3). The exact procedure was:

1. Modify each swim on the psych sheet by a random percentage based on the team’s performance history. The percentage was drawn from a normal distribution with the mean and standard deviation of the team’s previous time changes at nationals (for example, Georgia mean: .1%, sd: .91%). If a team had fewer than 20 swims in the previous 7 years at nationals, the entire field’s mean of .45% and standard deviation of 1% were used.
2. Re-rank the times based on the time changes.
3. Score the meet and check the order of the teams.
4. Repeat 50,000 times.
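The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code behind the results in this post: the `entries` format, the function names, and the `params` dictionary are hypothetical, relays are ignored, and every event is scored with the 16-place individual table (20-17-16-15-14-13-12-11, then 9-7-6-5-4-3-2-1). A positive percentage means a swimmer added time relative to the psych sheet.

```python
import random
from collections import Counter, defaultdict

# Field-wide fallback from the post: mean .45%, sd 1% (values in percent).
FIELD_MEAN, FIELD_SD = 0.45, 1.0
# Assumption: 16-place individual scoring for every event; relays are ignored.
POINTS = [20, 17, 16, 15, 14, 13, 12, 11, 9, 7, 6, 5, 4, 3, 2, 1]


def simulate_once(entries, params, rng):
    """Run one simulated meet and return the teams ordered by points.

    entries: list of (event, team, seed_time_seconds) tuples (hypothetical format).
    params:  dict mapping team -> (mean_pct, sd_pct); teams that are missing
             fall back to the field-wide values.
    """
    scores = {team: 0 for _, team, _ in entries}
    by_event = defaultdict(list)
    for event, team, seed_time in entries:
        mean, sd = params.get(team, (FIELD_MEAN, FIELD_SD))
        pct_change = rng.gauss(mean, sd)  # positive = slower than the seed
        by_event[event].append((seed_time * (1 + pct_change / 100), team))
    for swims in by_event.values():
        swims.sort()  # re-rank by simulated time, fastest first
        for (_, team), points in zip(swims, POINTS):
            scores[team] += points
    return sorted(scores, key=scores.get, reverse=True)


def simulate(entries, params, n_runs=50_000, seed=0):
    """Repeat the meet n_runs times and count how often each team lands in each place."""
    rng = random.Random(seed)
    places = defaultdict(Counter)
    for _ in range(n_runs):
        order = simulate_once(entries, params, rng)
        for place, team in enumerate(order, start=1):
            places[team][place] += 1
    return places
```

With the psych sheet loaded into `entries`, `simulate(entries, params)[team][1] / 50_000` would estimate how often a team wins under these assumptions.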

Unsurprisingly, the top teams remain unquestioned under this procedure. Stanford won over 99.9% of the time. California was 2nd over 99.9% of the time. There were shakeups in the rest of the top 10. The model gives NC State, 4th on the psych sheet, almost no chance of a top 4 finish based on swimming points. Instead, 84% of the time the model has them finishing somewhere between 7th and 12th. Georgia, 6th on the psych sheet, ends up in the top 4 in 68% of simulations, and the top 5 in 91%. Texas, 5th on the psych sheet, finishes 5th or higher based on swimming points in only 16% of the model runs. Virginia, 8th on the psych sheet, is 7th or better in 72% of simulations. A table with more of the results is below.

This methodology isn’t perfect. There’s no diving. It doesn’t include a chance of relay DQs. It also assumes that past performance at nationals is predictive of future performance. This assumption appears reasonably valid based on the year-by-year time changes: teams’ performances are correlated from one year to the next. However, this simulation doesn’t include contingencies for teams drastically changing their past behavior. For example, NC State, #5 on the psych sheet, historically added an average of .72% at nationals. If they suddenly behave like Stanford and add only .01%, the model’s prediction of a <2% chance of a top 5 finish will probably be upended (I’m not saying anything about NC State. It’s just a random example). For most teams, I think the assumption of behavior consistent with their history at this meet is valid, but there will probably be a couple of teams who change their approach, have a historically novel result, and break out from their expected finish here.
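To get a feel for how much a shift in a team’s average time change matters, here is a small worked sketch for a single head-to-head matchup. It assumes each simulated time is the seed time scaled by a normally distributed percentage, as in the procedure above, so the difference between two simulated times is itself normal. The function name is hypothetical, and the standard deviations in the usage note are placeholders, since the post only gives NC State’s mean.

```python
from math import sqrt
from statistics import NormalDist


def prob_faster(seed_a, mean_a, sd_a, seed_b, mean_b, sd_b):
    """Probability that swimmer A's simulated time beats swimmer B's.

    Seeds are in seconds; means and standard deviations are percent time
    changes (positive = slower than the seed). At least one sd must be > 0.
    """
    # Simulated time = seed * (1 + X / 100) with X ~ Normal(mean, sd),
    # so (time_a - time_b) is normal with the mean and sd below.
    diff_mean = seed_a * (1 + mean_a / 100) - seed_b * (1 + mean_b / 100)
    diff_sd = sqrt((seed_a * sd_a / 100) ** 2 + (seed_b * sd_b / 100) ** 2)
    # A is faster when the difference is negative.
    return NormalDist(diff_mean, diff_sd).cdf(0.0)
```

For two swimmers seeded at the same time, comparing `prob_faster(60.0, 0.72, 1.0, 60.0, 0.45, 1.0)` with `prob_faster(60.0, 0.01, 1.0, 60.0, 0.45, 1.0)` shows how the matchup tilts when a team moves from adding .72% to adding .01%, with the 1% standard deviations being an assumption. Summed over every swim on a roster, shifts like this are what would move a team’s projected finish.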

That’s the fun part: seeing which teams break expectations and do things we didn’t (or shouldn’t) predict. This post is about setting a baseline of expectations. The story of the meet will be which expectations are defied or fulfilled.

Simulated Places

The spaces with 0% are perhaps better marked as <1%: those outcomes happened, but so rarely that they rounded to 0%. The blank spaces never happened in any simulation. The full table continues past 20th place and 24 teams, but I cut it off for readability and relevance.
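The distinction between a blank cell and a rounded-to-zero cell can be made explicit when formatting the counts. The helper below is a hypothetical sketch of that convention, using the <1% label suggested above rather than the 0% shown in the table.

```python
def format_cell(count, n_runs):
    """Format one table cell from a raw simulation count."""
    if count == 0:
        return ""  # never happened in any simulation: leave blank
    pct = round(100 * count / n_runs)
    # Happened at least once but rounds to zero: label it "<1%".
    return "<1%" if pct == 0 else f"{pct}%"
```

For example, an outcome seen in 10 of 50,000 runs would be shown as `<1%` instead of being indistinguishable from one that never occurred.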