
> if you can find a correlation, it's at least worth a try to treat that as causation. It's much better than picking a random answer.

No, it isn't. Or if it is, it's barely better, not much better. See my lengthier reply above.

The reason why this is false is because in biology, there is an enormous correlation structure in which, to a good approximation, everything is correlated with everything. If you take a random gene, its expression will be significantly correlated or anticorrelated with well over half of all other genes. Probably over 80%, if I remember correctly. It depends on the number of samples: the more samples you have, the weaker a correlation can be and still come out statistically significant, so once you get into 10K+ samples it approaches 100%.
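To put a rough number on the sample-size effect, here is a minimal sketch of the smallest |r| that clears a two-sided p < 0.05 threshold for a given number of samples. It assumes the Fisher z approximation (roughly bivariate-normal data, no multiple-testing correction), and `min_significant_r` is just a name made up for illustration.

```python
import math
from statistics import NormalDist


def min_significant_r(n, alpha=0.05):
    """Smallest |r| that is significant at level alpha (two-sided) with n samples.

    Assumption: Fisher z approximation, i.e. atanh(r) * sqrt(n - 3) is
    approximately standard normal when the true correlation is zero.
    """
    if n <= 3:
        raise ValueError("need more than 3 samples")
    # Two-sided critical value of the standard normal (about 1.96 for alpha=0.05).
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Invert the Fisher transform to get back to the correlation scale.
    return math.tanh(z_crit / math.sqrt(n - 3))


# Print the threshold for a few sample sizes.
for n in (20, 100, 1_000, 10_000):
    print(n, round(min_significant_r(n), 3))
```

The threshold shrinks roughly like 1/sqrt(n), so at 10K+ samples a correlation of a few percent already counts as "significant". That is a statement about detectability, not about how strong or how causal the relationship is.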

In an area like aging, skin wrinkling is correlated with sarcopenia is correlated with atherosclerosis is correlated with number of senescent cells is correlated with all kinds of gene and metabolite expression levels, etc. You get the idea.

In genetics, a SNP will be correlated with hundreds of thousands of other SNPs.

I guess you can go so far as to say correlations can generate hypotheses. People certainly do this all the time. But "a correlation is tentative evidence of direct causation" is just wrong. Technically it is evidence; it's just that, in my experience in biology, it is such weak evidence as to be useless without other evidence.



"The reason why this is false is because in biology, there is an enormous correlation structure in which, to a good approximation, everything is correlated with everything."

Yes, but not all at the same strength; it is not the case that literally everything is correlated with everything else at 99.9%. When people say "everything is correlated with everything", they mean that correlations of 60%, 70%, and 80% show up in a large number of places; they do not mean that for every two possible processes the correlation is 99.9%. It's not practical to set up a large correlation net with that strong a correlation everywhere unless it really is all the same thing. (It might be mathematically possible, but it's not something you're going to encounter naturally.)

You're still better off choosing something that is very strongly correlated, because you have shaved off huge swathes of the possibility space that are much less likely to be directly involved. Yes, you still have a decent chance of being wrong, but you also have to remember that in the process of being wrong, you will gather more data. (Well... assuming that you actually listen to the data and don't hide behind some scientific dogma, but that's another discussion.) You're better off choosing something and probing than sitting there, agog at the net of correlations, paralyzed by the possibilities. Get in there and start shaving them off, and start with your best guesses, even if they're only your best guesses by a little bit.
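As a rough sketch of what "start with your best guesses" could look like in practice: rank the candidates by absolute Pearson correlation with the outcome and probe from the top of the list. Everything here is an assumption for illustration. The names `pearson_r` and `rank_candidates` and the toy numbers are made up, and a high rank is only a reason to test something first, not evidence that it is causal.

```python
import math


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def rank_candidates(outcome, candidates):
    """Return (name, r) pairs sorted by |r|, strongest first."""
    scored = [(name, pearson_r(values, outcome)) for name, values in candidates.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Toy, made-up measurements: one outcome and three hypothetical candidates.
outcome = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {
    "candidate_a": [1.1, 1.9, 3.2, 3.9, 5.1],  # tracks the outcome closely
    "candidate_b": [5.0, 3.5, 3.6, 2.0, 1.5],  # strongly anticorrelated
    "candidate_c": [2.0, 1.0, 2.5, 1.5, 2.2],  # only weakly related
}

for name, r in rank_candidates(outcome, candidates):
    print(f"{name}: r = {r:+.2f}")
```

Sorting on |r| rather than r keeps strong anticorrelations near the top, since those are just as worth probing as strong positive ones.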



