
Tuesday, March 25, 2014

How to [pretend to] be a better coach using bad statistics


Here is a simple scenario from practice: Coach A uses the YOYOIRL1 test and Coach B uses the 30-15IFT (for more info see the recent paper by Martin Buchheit, which also prompted me to write this post) to gauge improvements in endurance.

Coach A: We have improved the distance covered in the YOYOIRL1 test from 1750m to 2250m in four weeks. That is a 500m improvement, or ~28%.

Coach B: We have improved the velocity reached in the 30-15IFT from 19km/h to 21km/h in four weeks. That is a 2km/h improvement, or ~10%.

If you present those numbers to someone who is not statistically educated, he/she might conclude the following:

  • Coach A did a better job, since the improvement is 28% compared to 10% of Coach B
  • YOYOIRL1 test is more sensitive to changes than 30-15IFT

As coaches, we need to report to managers, so which number would you prefer to report: 28% or 10%? Be honest here!

Unfortunately, we cannot conclude who did a better job (Coach A or Coach B), nor which test is more sensitive (YOYOIRL1 or 30-15IFT), from percent change data alone. A lot of managers and coaches don't get this. At least I didn't until recently.
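A minimal sketch of why, using Coach A's numbers and two made-up group SDs (the SD values are assumptions for illustration only, not measured data): the same 28% change can be very large or fairly modest relative to the spread of the group.

```r
# Coach A's reported means (from the scenario above)
pre_mean  <- 1750
post_mean <- 2250

# Percent change depends only on the two means
percent_change <- (post_mean - pre_mean) / pre_mean * 100

# Hypothetical group SDs (assumed values, for illustration only)
assumed_sd <- c(narrow_group = 150, wide_group = 600)

# Change expressed in SD units: same raw change, very different size
standardized_change <- (post_mean - pre_mean) / assumed_sd

print(round(percent_change, 1))
print(round(standardized_change, 2))
```

The percent change is identical in both cases, while the change in SD units differs by a factor of four.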

What we need is an effect size statistic, such as Cohen's d. For that we need to know the variability in the groups, expressed as SD (standard deviation). Let's simulate the data using typical SDs for the YOYOIRL1 and the 30-15IFT:

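# library() stops with an error if a package is missing, whereas require()
# only returns FALSE and lets the script fail later
library(ggplot2, quietly = TRUE)
library(reshape2, quietly = TRUE)
library(plyr, quietly = TRUE)
library(randomNames, quietly = TRUE)
library(xtable, quietly = TRUE)
library(ggthemes, quietly = TRUE)
library(gridExtra, quietly = TRUE)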

set.seed(1)
numberOfPlayers <- 150
playerNames <- randomNames(numberOfPlayers)

# Create YOYOIRL1 Pre- and Post- data using 300m as SD
YOYOIRL1.Pre <- rnorm(mean = 1750, sd = 300, n = numberOfPlayers)
YOYOIRL1.Post <- rnorm(mean = 2250, sd = 300, n = numberOfPlayers)

# We need to round YOYOIRL1 score to nearest 40m, since those are the
# increments of the scores
YOYOIRL1.Pre <- round_any(YOYOIRL1.Pre, 40)
YOYOIRL1.Post <- round_any(YOYOIRL1.Post, 40)

# Create 30-15IFT Pre- and Post- data using 1km/h as SD
v3015IFT.Pre <- rnorm(mean = 19, sd = 1, n = numberOfPlayers)
v3015IFT.Post <- rnorm(mean = 21, sd = 1, n = numberOfPlayers)

# We need to round 30-15IFT to nearest 0.5km/h, since those are the
# increments of the scores
v3015IFT.Pre <- round_any(v3015IFT.Pre, 0.5)
v3015IFT.Post <- round_any(v3015IFT.Post, 0.5)

# Put those test into data.frame
testDataWide <- data.frame(Athlete = playerNames, YOYOIRL1.Pre, YOYOIRL1.Post, 
    v3015IFT.Pre, v3015IFT.Post)

# And print first 15 athletes
print(xtable(head(testDataWide, 15), border = T), type = "html")
    Athlete                     YOYOIRL1.Pre  YOYOIRL1.Post  v3015IFT.Pre  v3015IFT.Post
 1  Shrestha, Ezell                  2000.00        2520.00         18.50          19.50
 2  Cha, Gequan                      1440.00        2080.00         20.50          20.00
 3  Brown, Hindav                    2360.00        2040.00         19.50          19.50
 4  Venegas Delarosa, Destinee       1640.00        1800.00         19.50          21.50
 5  Simon, Barrington                2240.00        1800.00         19.00          19.50
 6  Williams, Hyeju                  2200.00        2280.00         18.00          19.00
 7  Gutierrez, Sabrina               1760.00        2400.00         17.50          23.00
 8  Wilder, Johannah                 1920.00        1640.00         19.00          22.00
 9  Martin Dean, Jillian             1440.00        1960.00         21.00          22.00
10  Thomas, Neil                     1840.00        2400.00         19.00          20.50
11  Nosker, Andrew                   2080.00        2120.00         19.00          21.00
12  Blackford, Matthew               1760.00        2880.00         19.50          21.50
13  Mata, Rachel                     1600.00        2640.00         18.50          19.50
14  Cheng, Ryan                      1560.00        2440.00         17.50          21.00
15  True, Ashley                     1720.00        2240.00         21.50          21.00

To plot the data and do simple descriptive stats, we need to reshape the data from wide format to long format using the reshape2 package by Hadley Wickham:

# Reshape the data
testData <- melt(testDataWide, id.vars = "Athlete", variable.name = "Test", 
    value.name = "Score")

# And print first 30 rows
print(xtable(head(testData, 30), border = T), type = "html")
    Athlete                     Test            Score
 1  Shrestha, Ezell             YOYOIRL1.Pre  2000.00
 2  Cha, Gequan                 YOYOIRL1.Pre  1440.00
 3  Brown, Hindav               YOYOIRL1.Pre  2360.00
 4  Venegas Delarosa, Destinee  YOYOIRL1.Pre  1640.00
 5  Simon, Barrington           YOYOIRL1.Pre  2240.00
 6  Williams, Hyeju             YOYOIRL1.Pre  2200.00
 7  Gutierrez, Sabrina          YOYOIRL1.Pre  1760.00
 8  Wilder, Johannah            YOYOIRL1.Pre  1920.00
 9  Martin Dean, Jillian        YOYOIRL1.Pre  1440.00
10  Thomas, Neil                YOYOIRL1.Pre  1840.00
11  Nosker, Andrew              YOYOIRL1.Pre  2080.00
12  Blackford, Matthew          YOYOIRL1.Pre  1760.00
13  Mata, Rachel                YOYOIRL1.Pre  1600.00
14  Cheng, Ryan                 YOYOIRL1.Pre  1560.00
15  True, Ashley                YOYOIRL1.Pre  1720.00
16  Inouye, Connor              YOYOIRL1.Pre  2080.00
17  Tatum, Janice               YOYOIRL1.Pre  1600.00
18  Latour, Pearl               YOYOIRL1.Pre  1720.00
19  Tripathi, Juan              YOYOIRL1.Pre  1360.00
20  Moore, Michelle             YOYOIRL1.Pre  1880.00
21  O'Sullivan, Johanna         YOYOIRL1.Pre  2160.00
22  Sharp, Gregory              YOYOIRL1.Pre  2200.00
23  Blum, Jennifer              YOYOIRL1.Pre  2000.00
24  Doering, Darius             YOYOIRL1.Pre  1200.00
25  Sohn, Kendle                YOYOIRL1.Pre  1880.00
26  Horton, Grant               YOYOIRL1.Pre  1880.00
27  Waynewood, Nicholas         YOYOIRL1.Pre  1640.00
28  Pallen, Raymundo            YOYOIRL1.Pre  1800.00
29  Montoya, Simon              YOYOIRL1.Pre  1480.00
30  Clark, Bryce                YOYOIRL1.Pre  1960.00

From the tables above it is easy to see the difference between wide and long data formats.

Let's calculate simple stats using the plyr package from Hadley Wickham (yes, he is a sort of celebrity in the R community) and plot the scores using violin plots, which are great since they show the distribution of the scores:

# Subset YOYOIRL1 tests
ggYOYO <- ggplot(subset(testData, Test == "YOYOIRL1.Pre" | Test == "YOYOIRL1.Post"), 
    aes(x = Test, y = Score))

# Violin plot with the mean marked as a diamond (fun replaces the deprecated
# fun.y argument in ggplot2 >= 3.3.0)
ggYOYO <- ggYOYO + geom_violin(fill = "red", alpha = 0.5) + theme_few() + stat_summary(fun = mean, 
    geom = "point", fill = "white", shape = 23, size = 5)


# Subset 30-15IFT tests
ggIFT <- ggplot(subset(testData, Test == "v3015IFT.Pre" | Test == "v3015IFT.Post"), 
    aes(x = Test, y = Score))

ggIFT <- ggIFT + geom_violin(fill = "steelblue", alpha = 0.5) + theme_few() + 
    stat_summary(fun = mean, geom = "point", fill = "white", shape = 23, size = 5)


# Plot the graphs
grid.arrange(ggYOYO, ggIFT, ncol = 2)

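[Figure: violin plots of Pre and Post scores for YOYOIRL1 (left, red) and 30-15IFT (right, blue), with the means marked as white diamonds]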


# Calculate the summary table
testDataSummary <- ddply(testData, "Test", summarize, N = length(Score), Mean = mean(Score), 
    SD = sd(Score))
# Print the summary table
print(xtable(testDataSummary, border = T), type = "html")
   Test             N     Mean      SD
1  YOYOIRL1.Pre   150  1749.60  309.40
2  YOYOIRL1.Post  150  2246.13  318.41
3  v3015IFT.Pre   150    18.86    1.12
4  v3015IFT.Post  150    20.98    1.09

From the table above we can calculate the percent change.

YOYOIRL1.Change <- (testDataSummary$Mean[2] - testDataSummary$Mean[1])/testDataSummary$Mean[1] * 
    100
v3015IFT.Change <- (testDataSummary$Mean[4] - testDataSummary$Mean[3])/testDataSummary$Mean[3] * 
    100

print(xtable(data.frame(YOYOIRL1.Change, v3015IFT.Change), border = T), type = "html")
   YOYOIRL1.Change  v3015IFT.Change
1            28.38            11.22

But as mentioned at the beginning of the post, percent change is not the best way to express change or the sensitivity of the tests (although it is great for impressing managers or your superiors, or for claiming that your test is more sensitive).

What we need to do is calculate the effect size (ES). ES expresses the difference between the means in units of SD: the difference is divided by the SD (here the SD of the Pre- test, but the pooled SD can also be used).

YOYOIRL1.ES <- (testDataSummary$Mean[2] - testDataSummary$Mean[1])/testDataSummary$SD[1]
v3015IFT.ES <- (testDataSummary$Mean[4] - testDataSummary$Mean[3])/testDataSummary$SD[3]

print(xtable(data.frame(YOYOIRL1.ES, v3015IFT.ES), border = T), type = "html")
   YOYOIRL1.ES  v3015IFT.ES
1         1.60         1.88

From the data above we can conclude that the effect sizes are pretty similar, and that the 30-15IFT might be a bit more sensitive (or that Coach B did a slightly better job).
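The pooled-SD variant mentioned above could look roughly like this. This is only a sketch: `cohens_d_pooled` is a hypothetical helper (not part of any package used in this post), it treats Pre and Post as two groups of scores, and the example data are freshly simulated rather than taken from the tables above.

```r
# Hypothetical helper: Cohen's d using the pooled SD of two sets of scores
cohens_d_pooled <- function(pre, post) {
  n_pre  <- length(pre)
  n_post <- length(post)
  # Pooled SD: variances weighted by their degrees of freedom
  pooled_sd <- sqrt(((n_pre - 1) * var(pre) + (n_post - 1) * var(post)) /
                      (n_pre + n_post - 2))
  (mean(post) - mean(pre)) / pooled_sd
}

# Simulated 30-15IFT-like scores (assumed means and SDs, as in the simulation)
set.seed(42)
pre  <- rnorm(150, mean = 19, sd = 1)
post <- rnorm(150, mean = 21, sd = 1)

print(round(cohens_d_pooled(pre, post), 2))
```

When the Pre and Post SDs are similar, as they are in the summary table, the pooled version should give nearly the same answer as dividing by the Pre SD.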

Anyway, to summarize this blog post: start reporting ES alongside percent change. If someone claims big improvements in testing scores to show how great a coach he is, or how great his program is, ask to see the ES, or the distribution of the change scores or of the Pre- and Post- tests. Besides that, we also need to ask for the SWC (smallest worthwhile change) and TE (typical error), but more on that later.
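As a rough sketch of what looking at the distribution of change scores could involve, assuming we have paired Pre and Post scores for the same athletes (the numbers below are simulated and purely hypothetical):

```r
# Hypothetical paired scores for the same 20 athletes (assumed values)
set.seed(7)
pre  <- round(rnorm(20, mean = 19, sd = 1) * 2) / 2       # 0.5km/h increments
post <- pre + round(rnorm(20, mean = 2, sd = 1) * 2) / 2  # responses vary by athlete

# Individual change scores
change <- post - pre

# Summary of the change scores, including the share of athletes who improved
change_summary <- c(mean = mean(change), sd = sd(change),
                    min = min(change), max = max(change),
                    prop_improved = mean(change > 0))
print(round(change_summary, 2))

# Text view of the distribution: how many athletes showed each change
print(table(change))
```

A call such as `hist(change)` would show the same distribution graphically. The point is that a mean change can hide athletes who did not improve at all.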

Monday, March 17, 2014

Random Thoughts



I was in the UK for the last couple of days, and today I visited Leicester City FC for the Performance and Injury conference (#LCFC_PIC). It was great to be in the LCFC gym with similar-minded professionals and to listen to their troubles and solutions.

I am not planning to provide an overview of the seminar, besides mentioning the data by Jan Ekstrand (@JanEkstrand) on injuries and offering some of my random thoughts and rationale.

Player rotation as a way of reducing injuries


It was mentioned that bigger clubs (with a bigger player pool) could potentially use player rotation strategies to reduce injury risk in key players and save them for more important games during congested periods. This sounds like a good plan, but it is not without problems.

Here is one simple hypothetical example. The first four clubs go to the play-offs. There is one game left before the play-offs. Our team is first and decides to ‘rotate’ certain key players, to rest them for the play-offs. The game we are about to play is against the team ranked 5th, which is one point behind the 4th team and hence has a chance to qualify for the play-offs. At the same time, the 4th team is playing against a team from the bottom of the table that wants to avoid relegation, or at least the play-out.

If our team employs player rotation and plays with less than its best team, that doesn’t represent a ‘level playing field’ for the team ranked 4th. I am not sure how ‘moral’ this is towards other teams, especially the ones that still have a lot to fight for. Carl Valle mentioned a similar scenario happening in the NBA in an awesome article, Money ball Madness.

The take-home point is that these strategies should be discussed and policies should be made at the league level. I hope the above example shows why. Player rotation is simply more complex than it looks.


Coaches the cause of injury? 


Jan Ekstrand showed VERY impressive data on injury tendencies in clubs with the same manager/head coach. In other words, a certain injury history tends to follow coaches wherever they go. This data is not yet available, but UEFA has it and I think it is very interesting.

What is interesting is that there is a correlation between player availability and the prizes won by these coaches, regardless of the club.

Everybody is trying to bring injury problems to the attention of managers and to educate them (or to show them the economic and performance cost of an injured player and of a single day missed). Maybe the solution is to add injury history to their CVs, along with performance improvements and competitions won? Maybe the CEO and the board should track this as well; this data might be very revealing. At the same time, maybe once this is implemented the managers/head coaches will no longer have the last say in return-to-play protocols (“We need this guy for this game – it is a risk of re-injury we need to take. It is in the nature of our sport”). This simple tracking metric might change this culture overnight. Let’s hope that this data sees the light of day soon, but I doubt it.

Injuries and club performance


I wrote about this interesting correlation study before (click HERE), but it is worth repeating. Apparently there is a CORRELATION (not causation) between injuries and team performance: teams with fewer injuries (time/game loss and/or occurrence) showed a higher league ranking at the end of the season. It is a chicken-or-egg problem IMO, even if it seems like common sense.

One could say that to improve team ranking one of the important goals might be to reduce injuries. This is just common sense, but it CANNOT be concluded from a study like this. The other way around could be said as well: to reduce injuries start winning games.

It is well known that overall stress reduces the coping and adaptation of athletes. Hence a losing streak is a hell of a stress and can impair the players’ adaptability and recovery under their usual loads. The opposite might be true as well. If the team wins, everybody is a bit more optimistic, the hormonal state might be better, the body copes with stress better, and hence there are fewer injuries.

I might dig into some simple data once I get the chance. We have collected wellness questionnaires (not at a frequency that would allow confident inferences), and I might look at the differences in scores after a game won versus a game lost. We all know that wellness status might predict overtraining, illness and potentially injury (research to back up this bold claim, Mladen?), but if the overall climate affects wellness, that also means it affects coping with training loads, just as sleep, nutrition, or the coach/manager do, as shown by Jan Ekstrand.

Again, things are not as simple and cheesy as they sound; correlation doesn’t imply causation.


Ferrari in the traffic jam – or how to make use of physical match performance data to make erroneous conclusions


This is, I think, a great analogy to explain why physical match performance data might be misleading. Data such as total distance run, high-intensity running distance, or percent decrement in the last 20min of a game might not mean that the player is unable or unwilling to run, or, even worse, be proof that he is tired.

Suppose you are the proud owner of a Ferrari. One day you drive your Ferrari to work. The usual distance is 10km each way. On the way back, at around 16:30, you get stuck in a traffic jam, so it takes you 30min to get home from work. In mathematical terms that’s 20km/h on average. On better days it takes 15min tops, which is around 40km/h. So that day it took you double the time to cover the same distance. There must be something wrong with the Ferrari (since it cannot be ‘tired’), right?

This is pretty much the same logic we employ with match physical data. We cannot make any claims without knowing both the potential and the expression of that potential, or in other words the context (the tactical situation at hand). To really check if something is wrong with our Ferrari we would need to take it to the raceway, where there are no (or at least minimal) constraints, so it can express its maximum potential. If things are different there (everything else being equal), then and only then might this reveal something. Comparing the average speed from home to work at 12:00, 16:30 and 1:00 might just tell us about the constraints of the traffic and not much, if anything, about the car’s potential. It might take a Toyota, a Fiat, a Ferrari and a Formula 1 car the same time to cover the same distance during rush hour.
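The analogy can be put into a toy model, sketched below under loud assumptions: the top speeds and traffic limits are invented for illustration, and "observed speed is the lower of the car's potential and what the traffic allows" is my simplification, not a real traffic model.

```r
# Assumed top speeds in km/h (illustrative values only)
potential <- c(Fiat = 160, Toyota = 180, Ferrari = 320, Formula1 = 350)

# Assumed speed the traffic allows at different times of day, in km/h
traffic_limit <- c(rush_hour = 20, midday = 40, night = 200)

# Toy model: observed speed = min(potential, constraint)
# Rows are cars, columns are times of day
observed <- sapply(traffic_limit, function(limit) pmin(potential, limit))
print(observed)

# Time in minutes to cover the 10km commute at the observed speed
commute_min <- 10 / observed * 60
print(round(commute_min, 1))
```

In the rush-hour column every car shows the same speed, so the observed data say nothing about potential. Differences between cars only appear once the constraint is loosened, which is what the “raceway” is for.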

Anyway, in the physical preparation world, the “raceway” represents certain tests, which are not always ‘sport specific’ (read more HERE). We need to assess the potential in an environment that is at least controlled and reliable.


If you have any comments please leave them down below.