I built a Python pipeline to stop copy-pasting p-values, automate assumption checks, and chat with raw data. Then I open-sourced it.
You’re on your fourth coffee, staring at a terminal that just finished running a normality test on your reaction time data. The p-value is 0.049. You already know what comes next: you Alt-Tab to a Word document that’s been open for three days, find the results table you’ve been filling in by hand, and type W = 0.96, p = .049.
Then you do it for the next variable. And the one after that. At some point you’ll write the methods section too.
There is a specific kind of dread that comes from knowing that the hard part of your research isn’t the science. It’s the plumbing.
Andrej Karpathy is automating the literature review. I wanted to automate the execution. I got tired of copy-pasting p-values into Word docs like an animal. So I built something.
statforge run --data study.csv --outcome reaction_time \
  --groups condition --style apa7
That one command runs the full statistical pipeline: it detects which test is appropriate, checks your assumptions, runs the analysis, computes effect sizes, formats everything to…
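To make those steps concrete, here is a rough standard-library sketch of three of them: picking a test from assumption checks, computing an effect size, and formatting a result the way you would otherwise type it by hand. It is not statforge's actual code. The function names (`choose_test`, `cohens_d`, `apa_p`, `apa_result`), the 0.05 cutoff, and the Welch / Mann-Whitney fallback rule are all assumptions made for illustration. The normality p-values are passed in directly; in practice they might come from something like `scipy.stats.shapiro`.

```python
from math import sqrt
from statistics import mean, stdev


def choose_test(normality_p_values, alpha=0.05):
    """Pick a two-group test from per-group normality p-values.

    Hypothetical rule: if no group rejects normality, use a parametric
    test; otherwise fall back to a rank-based one.
    """
    if all(p > alpha for p in normality_p_values):
        return "Welch's t-test"
    return "Mann-Whitney U"


def cohens_d(a, b):
    """Cohen's d for two independent groups, using the pooled SD."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def apa_p(p):
    """Format a p-value APA-style: no leading zero, floor at '< .001'."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")


def apa_result(name, statistic, p):
    """Format a test statistic and its p-value as one reportable string."""
    return f"{name} = {statistic:.2f}, {apa_p(p)}"


if __name__ == "__main__":
    # The normality result from the opening scene, formatted automatically.
    print(apa_result("W", 0.96, 0.049))  # W = 0.96, p = .049
    # One group rejects normality at alpha = 0.05, so the sketch goes rank-based.
    print(choose_test([0.049, 0.30]))    # Mann-Whitney U
```

The point of the sketch is the shape of the plumbing: each step returns a plain value that the next step can consume, so nothing has to be retyped between the terminal and the results table.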
