Yes, I understand. Another statistical test. Those have become far more common since the advent of computers and statistical software. I think I made my very minor point poorly, because I'm not sure why you are pointing this out.

Well, I probably missed your original point. My point is that null hypothesis testing, which is still the workhorse methodology for hypothesis testing (Bayesian approaches notwithstanding), isn't Popperian - it is an attempt to falsify the null, not the alternative hypothesis.

You could argue that it's still conservative, or "risky" in Popper's sense, but not because you are trying to falsify your hypothesis - only because you are committed to "retaining the null" if you fail to falsify it.

And yet Popper was writing after Fisher et al - whose approach continued right through the establishment of the Popperian principle. My point is that mostly we don't do falsification tests of our theory-derived hypotheses (apart from validation tests, as Dave rather amusingly got diametrically wrong) - we do falsification tests of our null. And in practice, the nulls in two-tailed tests (generally regarded as "conservative") are almost always wrong. So we have the bizarre situation where we attempt to falsify a null we know is almost certainly false!

And we don't even do it by calculating the probability that it's false - we do it by calculating the probability of what we observe given a hypothesis we know to be false!
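To make that concrete, here's a minimal sketch of what the p value actually is - the numbers and the scenario are made up for illustration. It simulates the sampling distribution of a t statistic under the null and computes the fraction of null-generated statistics at least as extreme as the observed one. Everything in it is a statement about the data given the null, never about the probability that the null is true:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical observed sample: n = 50 draws whose true mean is 0.3,
# so the null (mean = 0) is, as usual, false.
n = 50
observed = rng.normal(0.3, 1.0, n)
t_obs = observed.mean() / (observed.std(ddof=1) / np.sqrt(n))

# Simulate the sampling distribution of t assuming H0 is true (mean = 0)
null_ts = []
for _ in range(20_000):
    sim = rng.normal(0.0, 1.0, n)
    null_ts.append(sim.mean() / (sim.std(ddof=1) / np.sqrt(n)))
null_ts = np.array(null_ts)

# Two-tailed p value: P(|t| >= |t_obs| given H0).
# Note the conditioning: data given hypothesis, not hypothesis given data.
p = np.mean(np.abs(null_ts) >= abs(t_obs))
print(f"t = {t_obs:.2f}, p = {p:.4f}")
```

The inversion of that conditional - reading p as P(H0 | data) - is exactly the fallacy the paragraph above describes.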

In practice it sorta works, but it's a hell of a kludge. We need something better.

In Popper's day, there were very few chi-square tests on 20k-data-point samples. The fact that the mathematics existed and its implications were already understood is quite immaterial. Without computers, such tests were really quite limited in application. And, frankly, now that we have computers, they are given philosophical weight beyond their actual limits.
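A quick illustration of why sample size matters here - the coin-flip counts below are invented for the example. A bias of 50.9% heads is practically meaningless, yet at n = 20,000 a chi-square test against the fair-coin null comfortably rejects it, which is the "null we know is false" problem in miniature:

```python
import numpy as np
from scipy import stats

# Hypothetical counts: a trivially small real bias (50.9% heads)
n = 20_000
observed = np.array([10_180, 9_820])   # heads, tails
expected = np.array([n / 2, n / 2])    # null: a perfectly fair coin

# Pearson chi-square goodness-of-fit test against the fair-coin null
chi2, p = stats.chisquare(observed, f_exp=expected)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")  # significant at alpha = 0.05
```

Run the same 50.9/49.1 split at n = 200 and the test can't see it at all; the "discovery" at n = 20,000 is purely a function of sample size, not of the effect being interesting.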

Null hypothesis testing is philosophically incoherent. But it works because most people regard the p value in Fisherian terms (as a proxy for how robust the result is) rather than Neyman-Pearson's (as a criterion by which to reject H0 in favour of H1), a trend that's become even more common now that computers spit out "exact" p values, rather than you looking up your test statistic in a table that just gives thresholds. But it's a really dumb proxy, and it leads to the perception that the p value is somehow the probability that the null is true. Which it isn't.
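The two readings can be put side by side in a few lines - the test statistic here is a made-up chi-square value with 1 degree of freedom, chosen just to show the contrast. The Neyman-Pearson user compares the statistic against a tabled critical value and gets a binary decision; the software-era Fisherian reads off an "exact" p and treats its size as a measure of strength:

```python
from scipy import stats

t = 5.30   # hypothetical chi-square statistic
df = 1

# Neyman-Pearson style: a binary decision against a tabled threshold.
# ppf(0.95, 1) is the alpha = 0.05 critical value you'd find in a table.
critical = stats.chi2.ppf(0.95, df)
reject = t > critical
print(f"critical value = {critical:.3f}, reject H0: {reject}")

# Fisherian style (what software reports): the "exact" tail probability,
# read informally as a measure of how robust the result is.
p = stats.chi2.sf(t, df)
print(f"exact p = {p:.4f}")
```

Both numbers are computed under the same false-null conditioning; the difference is only in how people use them, which is why neither licenses the "p is the probability the null is true" reading.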

That last is an opinion which is harder to support. What I mean is that the correlation/causation boundary is quite permeable and often, establishing the direction of causality is used as a shorthand for causality itself.

Well, that's a different issue. I'm not sure I agree that "the correlation/causation boundary is quite permeable". The key is whether you randomly allocated your predictor variable. If you did, I think it's reasonable to conclude that your predictor was the cause of your effect. But only proximally, of course. If it was a drug, say, it doesn't tell you the full causal pathway. It only goes back as far as the drug allocation method, and it's "permeable" in that blinding is rarely perfect. So if that's what you mean, I agree.

Pharmaceuticals are notoriously declared to work "because" of x, when the reality is that no one anywhere has fuck all idea what the drug is doing to affect the disease. All that is known is that it affects a pathway or system which is itself correlated with the disease, with a presumed causal direction.

Indeed. OK, in that case I'm with you.