Hacker News

Yeah, it always worries me that so many people see a small p-value like 10^-6 and say, "Wow, that one's definitely true."

But p=10^-6 doesn't mean, as commonly believed, that there's only a one-in-a-million chance that the proposed hypothesis is really false, nor does it even mean what many more-statistically-savvy people think it means, that if the proposed hypothesis were false, there would only be a one-in-a-million chance of observing test data as extreme as what was observed. No, what it really means is that – and here's the part most people miss – assuming that the researchers' model of the underlying data-generating process is correct, then, if the proposed hypothesis were false, there would be only a one-in-a-million chance of observing test data as extreme as what was observed.

Yes, as the p-value becomes smaller, it does indeed become easier to believe that the hypothesis of interest is true, assuming that the humans didn't screw up the model. But, in any complex work, I'm going to have a hard time believing, sans replication, that there's not a reasonable chance of humans screwing up.
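The model-screwup point can be made concrete with a simulation. Here's a minimal sketch (function names and parameters are my own, for illustration): data with a true mean of zero but strong autocorrelation, analyzed with a z-test that wrongly assumes i.i.d. samples. The test's model understates the variance of the sample mean, so "one-in-a-million" p-values show up routinely even though the null is true.

```python
import math
import random

random.seed(0)

def p_value_assuming_iid(xs):
    """Two-sided z-test for mean = 0, assuming i.i.d. samples."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    z = mean / math.sqrt(var / n)
    # two-sided tail probability of the standard normal
    return math.erfc(abs(z) / math.sqrt(2))

def ar1_noise(n, rho=0.95):
    """Strongly autocorrelated AR(1) noise whose true mean is 0."""
    x, out = 0.0, []
    for _ in range(n):
        x = rho * x + random.gauss(0, 1)
        out.append(x)
    return out

# The null hypothesis is true (the mean really is 0), but the i.i.d.
# assumption baked into the test is false, so tiny p-values are common.
tiny = sum(p_value_assuming_iid(ar1_noise(1000)) < 1e-6 for _ in range(200))
print(f"{tiny}/200 runs gave p < 1e-6 despite a true null")
```

The point isn't that anyone runs a z-test on raw AR(1) data; it's that the p-value is only as trustworthy as the model it's computed under.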

To me, then, p=10^-6 is the new p=10^-2.

EDIT: Replaced Unicode superscripts (10⁻⁶) with circumflex notation (10^-6) because the superscripts weren't showing up on my Nexus 7.



Yes, the calculation of a p-value is always done by assuming a specific model, though usually a less controversial one than what is proposed. But I wouldn't go so far as to demand smaller values. I would prefer we stop thresholding so much altogether, and instead operate with the understanding that what has or has not been found is a suggestion with evidence for it, rather than fact, for all but the best-understood processes.

It is a difficult task for many people, who have been taught facts for decades, to accept that objective knowledge is hard to come by. But everyone understands the value and properties of a crude model.


Sorry, I wasn't clear. I meant that when I see a p-value of 10^-6 in papers, I expect that there's at least a 1% chance that the humans screwed up the models somewhere, so I don't see it as more persuasive than 10^-2. That is, p-values stop buying extra credibility once they get smaller than a few percent, because the chance of human error dominates.

So I agree with you. If it were up to me, we'd all report evidence intensity in decibels, anyway.
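The back-of-envelope arithmetic above can be sketched like so (the function names and the 1% model-error rate are illustrative, not a formal posterior; the decibel scale for evidence follows Jaynes, 10·log10 of the odds):

```python
import math

def misleading_floor(p, model_error_rate=0.01):
    """Rough chance a reported result is misleading: either the test
    fluked (~p, given a true null and a correct model) or the humans
    screwed up the model (~model_error_rate)."""
    return 1 - (1 - p) * (1 - model_error_rate)

def decibels(odds):
    """Evidence in decibels: 10 * log10(odds)."""
    return 10 * math.log10(odds)

# Once a 1% chance of model error is admitted, shrinking p from 1e-2
# to 1e-6 buys only a few decibels of extra evidence.
for p in (1e-2, 1e-6):
    q = misleading_floor(p)
    print(f"p={p:.0e}: misleading with chance ~{q:.3f}, "
          f"~{decibels((1 - q) / q):.1f} dB of evidence")
```

Under these (made-up but plausible) numbers, p = 10^-2 and p = 10^-6 end up within a few decibels of each other, which is the parent's point.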



