Hacker News

Bayesian analysis overcomes this particular problem, but of course doesn't have such an easy and commonly accepted "THIS IS SIGNIFICANT AND BY GOLLY I AM RIGHT!" signal. It's conceptually harder than NHST, doesn't lead to easy conclusions (unless your effect is really strong), and many researchers have never even heard of it. And they are the reviewers.

I got into a bit of a kerfuffle about a statistical test once. A critical review insisted I run an ANOVA, even though it was inappropriate for the data, which is why I had used a weaker test in the first place. Such is the status quo.

Work has been done to make Bayesian analysis as straightforward as NHST, but if it gets accepted with a threshold (comparable to the current p<0.05), it won't help.

Some articles now publish both analyses, which is nice. And there are tools to help you, e.g. https://jasp-stats.org/



> doesn't have such an easy and commonly accepted "THIS IS SIGNIFICANT AND BY GOLLY I AM RIGHT!" signal.

Part of the ongoing stats revolution is abandoning the binary significance test, because it's often misaligned with the fundamental business question at hand. If, instead, you have a probability distribution over hypotheses, you can make optimal business decisions based on standard utility theory, as in this worked example: https://philosophy.hku.hk/think/stat/naode1.php.
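To make that concrete, here's a minimal sketch of the expected-utility calculation. All the numbers (the posterior probabilities and the dollar payoffs) are invented for illustration, not taken from the linked page:

```python
# Hypothetical posterior over hypotheses, P(hypothesis | data):
posterior = {"sugar_helps": 0.70, "no_effect": 0.30}

# payoff[action][hypothesis]: made-up profit (in $k) for each outcome.
payoff = {
    "change_recipe": {"sugar_helps": 120, "no_effect": -40},
    "keep_recipe":   {"sugar_helps": 0,   "no_effect": 0},
}

def expected_utility(action):
    """Average the action's payoff over the posterior distribution."""
    return sum(posterior[h] * payoff[action][h] for h in posterior)

# Pick the action with the highest expected utility.
best = max(payoff, key=expected_utility)
print(best, expected_utility(best))  # change_recipe 72.0
```

No significance threshold anywhere: the 70% posterior feeds directly into the decision, and the cutoff at which you'd act falls out of the payoffs rather than out of an arbitrary 0.05.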

> It's conceptually harder than NHST

From a different angle: Experimenters fundamentally want to know p(hypothesis | data): Does adding 20g sugar to 1kg cement prevent 98% of the units from crushing under 1000kg load? Below are two answers. Which would people rather tell their boss?

1) Likely yes, given our experimental data. The totality of evidence shows that there's a 70% probability that failure rates are below the limit, but we're not yet ready to commit changes to our manufacturing. We can run more experiments to nail this down and make a concrete decision (heh).

2) Sorry boss, I have no clue. The p-value was 0.15, so we can't say anything. Unfortunately old-school stats can't easily add new data to an existing experiment, so we have to run new experiments, and then combine everything with a fancy new meta-analysis. Or we could call the first experiment a sneak peek but then we have to do multiplicity corrections. But even after we do all that, I can't really tell you the probability that sugar strengthens cement. NHST doesn't work that way.

From the stakeholder's perspective, #1 is conceptually lots easier.
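A figure like the 70% in answer #1 falls out of a conjugate Beta-Binomial model. Here's a minimal stdlib-only sketch, assuming hypothetical test data (0 crushed units out of 59 tested) and a flat Beta(1, 1) prior; these numbers were chosen purely so the posterior lands near 70%:

```python
from math import exp, lgamma, log

def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at x in (0, 1)."""
    ln_norm = lgamma(a + b) - lgamma(a) - lgamma(b)
    return exp(ln_norm + (a - 1) * log(x) + (b - 1) * log(1 - x))

def prob_rate_below(limit, failures, n, a0=1.0, b0=1.0, steps=10_000):
    """P(failure rate < limit | data) under a Beta(a0, b0) prior.

    By conjugacy the posterior is Beta(a0 + failures, b0 + n - failures);
    we integrate its density over [0, limit] with the midpoint rule.
    """
    a, b = a0 + failures, b0 + n - failures
    h = limit / steps
    return h * sum(beta_pdf((i + 0.5) * h, a, b) for i in range(steps))

# Hypothetical data: 0 crushed units out of 59, limit = 2% failure rate.
p = prob_rate_below(0.02, failures=0, n=59)
print(f"P(failure rate < 2% | data) = {p:.2f}")  # roughly 0.70
```

Conjugacy is also what makes the "run more experiments" step in answer #1 painless: new test units just increment the posterior's counts, with no multiplicity corrections needed.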



