Some people (like you) understand that reliability and validity aren't just "p < 0.05", but that's far from universal understanding. I've seen intelligent people accept and reject hypotheses with woefully inadequate evidence, and I've also seen wild hypotheses built on the backs of strong but meaningless correlations.
Dangerous beasts can be useful, but they must be treated with due care.
The books by Howard Wainer, Edward Tufte, and Bill Cleveland are good starting points; it also depends a lot on your particular interests. Andrew Gelman's blog [1] is very good if you're at all interested in non-experimental data and/or poli sci applications.
> Any stories of inadequate examples or meaningless correlations?
A customer's marketing group was tying visitor data to geodemographic data. They put together a database with tons of variables, went searching, and found a multiple regression with a Pearson coefficient of 0.8+ and a low p-value. Based on that discovery, they decided to rewrite their personas and started devising new tactics.
Fortunately, when they briefed the CEO, the CEO said that the dimensions in question (I honestly don't remember what they were) didn't make intuitive sense, and demanded more details before supporting such a major shift in tactics. More research was done, and this time somebody remembered that this was a product where the customers aren't the users, so they needed to be treated separately. And it turned out the original analysis (done without fancy analytics) was very close to correct.
If the CEO hadn't been engaged during that meeting, they would've thrown away good tactics on a simple mistake. The regression was "reliable" by most statistical measures, but it was noise.
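If you want to see how easy it is to stumble onto that kind of "reliable" noise, here's a toy sketch (entirely made-up data, nothing to do with the actual customer): generate a pile of random predictors, scan for the best-correlated one, and you'll routinely find impressive-looking fits with tiny p-values.

```python
import numpy as np
from scipy import stats

# Toy demo of search-driven false positives: every predictor is pure noise,
# yet scanning enough of them "finds" a correlation with a tiny p-value.
rng = np.random.default_rng(0)
n_obs, n_vars = 50, 500

y = rng.normal(size=n_obs)            # outcome: pure noise
X = rng.normal(size=(n_obs, n_vars))  # 500 candidate predictors: also pure noise

best_r, best_p, best_j = 0.0, 1.0, -1
for j in range(n_vars):
    r, p = stats.pearsonr(X[:, j], y)  # naive per-variable test, no correction
    if abs(r) > abs(best_r):
        best_r, best_p, best_j = r, p, j

print(f"best predictor: x{best_j}, r = {best_r:.2f}, p = {best_p:.4f}")
# Typically prints r around 0.4-0.5 with p < 0.001 -- "significant" noise.
# Let the search range over interactions and variable subsets and the fit
# looks even more convincing.
```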
A similar example holds for validity: I saw a team build wonderfully accurate promotion-response models, but they only measured to the first "conversion" instead of measuring lifetime value (LTV). After several months of the new campaign, it turned out the new customers churned much faster, so they weren't nearly as valuable as the original customers.
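To make that concrete, here's a back-of-the-envelope sketch with invented numbers (not the team's actual figures): a campaign that wins the first-conversion comparison can still lose badly once churn is folded into LTV.

```python
# Hypothetical numbers, just to illustrate why "first conversion" and LTV can disagree.
margin_per_month = 20.0  # contribution margin per active customer per month

def ltv(monthly_churn: float) -> float:
    # Simple geometric LTV: expected months retained = 1 / churn rate
    return margin_per_month / monthly_churn

old_campaign = {"conversions": 1_000, "churn": 0.05}  # fewer conversions, sticky customers
new_campaign = {"conversions": 1_400, "churn": 0.15}  # more conversions, higher churn

for name, c in [("old", old_campaign), ("new", new_campaign)]:
    total = c["conversions"] * ltv(c["churn"])
    print(f"{name}: {c['conversions']} conversions, "
          f"LTV ${ltv(c['churn']):,.0f}, total ${total:,.0f}")

# old: 1000 conversions, LTV $400, total $400,000
# new: 1400 conversions, LTV $133, total $186,667  <- "better" by conversions, worse by value
```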
> Care to elaborate on how to be more sure of reliability and validity?
I'm not a statistician or an actuary. I'm a guy who took four stat classes during undergrad. I know just enough to know that I don't know that much.
Disclaimer aside: my biggest rules of thumb are to make sure that you're measuring the thing you want to measure (not a substitute), to make sure the statistical methods you're using are appropriate for the data you're collecting, and to make sure you understand the segmentation of your market.
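On the segmentation point, here's one more toy sketch (again, hypothetical data): two segments that each show a mild positive relationship can produce a pooled correlation that points strongly the other way, which is roughly the trap the "customers aren't the users" story above fell into.

```python
import numpy as np
from scipy import stats

# Hypothetical illustration of why segmentation matters: each segment shows a
# mild positive relationship, but different baselines make the pooled
# correlation come out strongly negative (a Simpson's-paradox-style reversal).
rng = np.random.default_rng(1)

def segment(n, x_mean, y_mean):
    x = rng.normal(x_mean, 1.0, n)
    y = y_mean + 0.3 * (x - x_mean) + rng.normal(0, 0.5, n)  # slope +0.3 within segment
    return x, y

x_buyers, y_buyers = segment(200, x_mean=2.0, y_mean=8.0)  # e.g. the people who pay
x_users,  y_users  = segment(200, x_mean=8.0, y_mean=2.0)  # e.g. the people who use it

groups = {
    "buyers": (x_buyers, y_buyers),
    "users":  (x_users, y_users),
    "pooled": (np.concatenate([x_buyers, x_users]),
               np.concatenate([y_buyers, y_users])),
}
for label, (x, y) in groups.items():
    r, p = stats.pearsonr(x, y)
    print(f"{label:7s} r = {r:+.2f} (p = {p:.3g})")

# buyers and users each come out around r = +0.5; pooled comes out around r = -0.9.
# The pooled "insight" points in the opposite direction from either segment.
```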
So those are some pretty bad decisions coming from statistical analysis; I wonder if you think that those people (the marketing group in particular) would make good decisions generally? It seems like some people are hell-bent on making bad decisions regardless of the tools available to them.
But, yeah, you hand some people a spreadsheet with numbers in it and their critical thinking ability just evaporates.
As an aside, that's not what I meant by "reliable" earlier (and, to be really specific, I agree that low p-values do not ensure reliability even w/out the other problems introduced by that particular model search).