bridging theory and practice
UT GOV Methods Workshop, 2024-09-26
Low-powered studies are more likely to be unduly optimistic and fail to replicate [2, 3, 4]
“Power is for you, not for Reviewer Two” – C. Rainey [5]
When you get a null result, you want to be able to distinguish a true null from low power
We need justifiable [6] and cost-effective sample sizes to get research done
Example
One-sided z-test:
\(\color{red}{h_0: \mu = 2}\); \(\color{blue}{h_1: \mu > 2}\)
Consider the specific value of \(\color{blue}{\mu_1 = 4}\) in the alternative hypothesis.
* Sampling distribution: \(\bar X \sim N(\mu, \sigma / \sqrt n)\)
Stronger hypothesized effects
Lower population variance
More observations
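These three levers can be checked numerically. Below is a minimal sketch in base R, assuming the one-sided z-test of the running example (\(h_0: \mu = 2\), \(\alpha = .05\)). The helper `power_z` and the alternative values tried are illustrative assumptions, not part of the original slides.

```r
# Hypothetical helper: power of a one-sided z-test of h0: mu = mu0
# against h1: mu > mu0, evaluated at a specific alternative mu1
power_z <- function(mu1, sigma, n, mu0 = 2, alpha = 0.05) {
  se <- sigma / sqrt(n)                        # standard error of the sample mean
  pnorm((mu1 - mu0) / se - qnorm(1 - alpha))   # P(reject h0 | mu = mu1)
}

baseline        <- power_z(mu1 = 4, sigma = 5, n = 30)
stronger_effect <- power_z(mu1 = 5, sigma = 5, n = 30)  # larger hypothesized effect
lower_variance  <- power_z(mu1 = 4, sigma = 4, n = 30)  # smaller population sd
more_obs        <- power_z(mu1 = 4, sigma = 5, n = 60)  # larger sample

round(c(baseline = baseline, stronger_effect = stronger_effect,
        lower_variance = lower_variance, more_obs = more_obs), 3)
```

Changing one ingredient at a time keeps the comparison clean: each of the three variations should report higher power than the baseline.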
Our ingredients: \(\mu_1=4\), \(\alpha=.05\), \(\sigma=5\), \(n=30\).
[1] 0.7074796
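The value above can be reproduced step by step in base R. This is a sketch under the stated ingredients, and the variable names are assumptions chosen for illustration: first find the rejection threshold under \(h_0\), then ask how often the sample mean exceeds it when \(\mu = \mu_1\).

```r
# Ingredients of the example
mu0   <- 2     # mean under h0
mu1   <- 4     # specific mean under h1
sigma <- 5     # known population standard deviation
n     <- 30    # sample size
alpha <- 0.05  # significance level

se  <- sigma / sqrt(n)              # standard error of the sample mean
thr <- mu0 + qnorm(1 - alpha) * se  # rejection threshold under h0

# Power: probability that the sample mean exceeds thr when mu = mu1
power <- 1 - pnorm(thr, mean = mu1, sd = se)
power
```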
Mean power calculation for normal distribution with known variance
d = 0.4
n = 30
sig.level = 0.05
power = 0.7074796
alternative = greater
# rejection threshold under h0 (mu = 2), one-sided test with alpha = .05
thr <- 2 + qnorm(0.95) * 5 / sqrt(30)
# run one million simulations
k <- 1e06
v_rejections <- sapply(1:k, \(i){
  # sample values and take mean (under h1)
  sampled_mean <- rnorm(n = 30, mean = 4, sd = 5) |> mean()
  # check whether this sampled mean is above the rejection threshold
  as.integer(sampled_mean > thr)
})
# get proportion rejected
mean(v_rejections)
[1] 0.707352
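The simulated power differs slightly from the analytical value because it is itself an estimate. A rough sketch of its Monte Carlo standard error, assuming independent simulation draws and using the two values reported above (the variable names are illustrative):

```r
k       <- 1e06        # number of simulations
p_sim   <- 0.707352    # simulated power reported above
p_exact <- 0.7074796   # analytical power reported above

# Standard error of a simulated proportion (binomial approximation)
mc_se <- sqrt(p_sim * (1 - p_sim) / k)

# Gap between simulation and analytical result, in Monte Carlo SE units
z_gap <- (p_sim - p_exact) / mc_se

c(mc_se = mc_se, z_gap = z_gap)
```

A gap well within two Monte Carlo standard errors suggests the simulation and the analytical calculation agree up to simulation noise.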
Research design diagnosis based on 1000 simulations. Diagnosis completed in 6 secs. Diagnosand estimates with bootstrapped standard errors in parentheses (100 replicates).
Design       N Sims  Mean Estimand  Mean Estimate  Bias  SD Estimate  RMSE  Power  Coverage
declaration  1000    NA             3.97           NA    0.90         NA    0.71   NA
Takeaway: mechanically, calculating power is straightforward once you have the necessary ingredients
We can also get the required sample size for a given level of power (0.8? [11])
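For the one-sided z-test of the example, the required sample size has a closed form: solving \(\Phi(d\sqrt{n} - z_{1-\alpha}) = \text{power}\) for \(n\) gives \(n = \bigl((z_{1-\alpha} + z_{\text{power}}) / d\bigr)^2\), with \(d = (\mu_1 - \mu_0)/\sigma\). A minimal sketch in base R, assuming the same ingredients as before and a target power of 0.8 (the variable names are illustrative):

```r
mu0 <- 2; mu1 <- 4; sigma <- 5   # ingredients of the example
alpha        <- 0.05
target_power <- 0.8

d <- (mu1 - mu0) / sigma         # standardized effect size (0.4)

# Solve pnorm(d * sqrt(n) - qnorm(1 - alpha)) = target_power for n
n_exact    <- ((qnorm(1 - alpha) + qnorm(target_power)) / d)^2
n_required <- ceiling(n_exact)   # round up to a whole observation

# Power actually achieved at the rounded-up sample size
achieved <- pnorm(d * sqrt(n_required) - qnorm(1 - alpha))

c(n_required = n_required, achieved = achieved)
```

Under these assumptions the sketch gives a required sample size of 39, only modestly above the 30 observations that yielded power of about 0.71.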
The elephant in the room: how to set (3) and (4)?
For example, if we have a SE from an existing study, we could do \(\widehat{SE}_{\text{planned}} = \widehat{SE}_{\text{existing}} \times \sqrt{\frac{n_{\text{existing}}}{n_{\text{planned}}}}\).
If our SE comes from a pilot study, we might want to be more conservative: \(\widehat{SE}_{\text{planned}} = \widehat{SE}_{\text{pilot}} \times \sqrt{\frac{n_{\text{pilot}}}{n_{\text{planned}}}} \times \Biggl(\sqrt{\frac{1}{n_{\text{pilot}}}} + 1\Biggr)\).
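The two rescaling rules above translate directly into code. Below is a sketch in base R; the function names and the input numbers are hypothetical, chosen only so the arithmetic is easy to follow.

```r
# Predicted SE of the planned study, rescaled from an existing study
se_from_existing <- function(se_existing, n_existing, n_planned) {
  se_existing * sqrt(n_existing / n_planned)
}

# More conservative version for a (noisier) pilot study:
# inflate the rescaled SE by the factor (sqrt(1 / n_pilot) + 1)
se_from_pilot <- function(se_pilot, n_pilot, n_planned) {
  se_pilot * sqrt(n_pilot / n_planned) * (sqrt(1 / n_pilot) + 1)
}

# Hypothetical inputs: SE of 1 from 100 observations, planning for 400
se_a <- se_from_existing(se_existing = 1, n_existing = 100, n_planned = 400)
se_b <- se_from_pilot(se_pilot = 1, n_pilot = 100, n_planned = 400)

c(existing = se_a, pilot = se_b)
```

With these made-up inputs, quadrupling the sample halves the SE (0.5), and the pilot correction inflates that by 10% (0.55). The inflation factor shrinks toward 1 as the pilot gets larger.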
Using an estimate of the variance of the outcome usually requires stronger assumptions
When designing a study, you want to run a power analysis
Use analytical approaches for simple designs and simulation approaches for more complex designs
Power depends on sample size, significance level, population variance*, and effect size
Use (but suspect) rules of thumb. Embrace the trade-offs!
✉️ <andres.cruz@utexas.edu>
1. Arel-Bundock, Vincent, Ryan C. Briggs, Hristos Doucouliagos, Marco M. Aviña, and T.D. Stanley. 2023. “Quantitative Political Science Research Is Greatly Underpowered.” OSF Preprints. doi:10.31219/osf.io/7vy2f.
2. Gelman, Andrew, and John Carlin. “Beyond power calculations: Assessing type S (sign) and type M (magnitude) errors.” Perspectives on psychological science 9, no. 6 (2014): 641-651. https://doi.org/10.1177/1745691614551642
3. Button, Katherine S., John PA Ioannidis, Claire Mokrysz, Brian A. Nosek, Jonathan Flint, Emma SJ Robinson, and Marcus R. Munafò. “Power failure: Why small sample size undermines the reliability of neuroscience.” Nature Reviews Neuroscience 14, no. 5 (2013): 365-376. https://doi.org/10.1038/nrn3475
4. Tressoldi, Patrizio E. “Replication unreliability in psychology: elusive phenomena or “elusive” statistical power?.” Frontiers in Psychology 3 (2012): 218. https://doi.org/10.3389/fpsyg.2012.00218
5. Rainey, Carlisle. “Power, Part I: Power Is for You, Not for Reviewer Two.” Blog post. 2023. https://www.carlislerainey.com/blog/2023-05-22-power-1-for-you-not-reviewer-2/.
6. Lakens, Daniël. “Sample size justification.” Collabra: Psychology 8, no. 1 (2022): 33267. https://doi.org/10.1525/collabra.33267
7. Blair, Graeme, Alexander Coppock, and Macartan Humphreys. Research design in the social sciences: Declaration, diagnosis, and redesign. Princeton University Press, 2023. https://book.declaredesign.org/
8. Schuessler, Julian, and Markus Freitag. 2020. “Power Analysis for Conjoint Experiments.” SocArXiv. https://doi.org/10.31235/osf.io/9yuhp.
9. Stefanelli, Alberto, and Martin Lukac. 2020. “Subjects, Trials, and Levels: Statistical Power in Conjoint Experiments.” SocArXiv. https://doi.org/10.31235/osf.io/spkcy.
10. Blair, Graeme, Alexander Coppock, and Macartan Humphreys. Research design in the social sciences: Declaration, diagnosis, and redesign. Princeton University Press, 2023. https://book.declaredesign.org/library/experimental-descriptive.html#sec-ch17s3. Sec. 17.3: Conjoint experiments.
11. Lakens, Daniël. “Sample size justification.” Collabra: Psychology 8, no. 1 (2022): 33267. https://doi.org/10.1525/collabra.33267
12. Rainey, Carlisle. (2024). “Statistical Power from Pilot Data: Simulations to Illustrate.” Blog post. https://www.carlislerainey.com/blog/2024-06-03-pilot-power/
13. DeclareDesign Team. (2019). “Should a pilot study change your study design decisions?” Blog post. https://declaredesign.org/blog/posts/pilot-studies.html
14. Rainey, Carlisle. 2024. “Power Rules: Practical Statistical Power Calculations.” https://github.com/carlislerainey/power-rules/blob/main/power-rules.pdf
15. Rainey, Carlisle. 2024. “Power Rules: Practical Statistical Power Calculations.” https://github.com/carlislerainey/power-rules/blob/main/power-rules.pdf
16. Lakens, Daniël. “Sample size justification.” Collabra: Psychology 8, no. 1 (2022): 33267. https://doi.org/10.1525/collabra.33267
17. Stefanelli, Alberto, and Martin Lukac. 2020. “Subjects, Trials, and Levels: Statistical Power in Conjoint Experiments.” SocArXiv. https://doi.org/10.31235/osf.io/spkcy.
18. Leon, Andrew C., Lori L. Davis, and Helena C. Kraemer. “The role and interpretation of pilot studies in clinical research.” Journal of psychiatric research 45, no. 5 (2011): 626-629. https://doi.org/10.1016/j.jpsychires.2010.10.008
19. DeclareDesign Team. (2019). “Should a pilot study change your study design decisions?” Blog post. https://declaredesign.org/blog/posts/pilot-studies.html
20. Rainey, Carlisle. (2024). “Statistical Power from Pilot Data: Simulations to Illustrate.” Blog post. https://www.carlislerainey.com/blog/2024-06-03-pilot-power/
21. Schuessler, Julian, and Markus Freitag. 2020. “Power Analysis for Conjoint Experiments.” SocArXiv. https://doi.org/10.31235/osf.io/9yuhp. Created with their online tool: https://markusfreitag.shinyapps.io/cjpowr/.