Exactly. This is the primary difference between observational studies and experimental studies (controlled experiments). Experimental studies control for the hypothesized mechanism as part of the experimental design; observational studies do not, and often cannot. Good data from controlled experiments is generally difficult, costly, and time-consuming to generate, which often does not mesh with the notion of "big data". I think we are running into this problem more and more as we discover that our data sets really are superficial: large collections of data that was easy to collect, rather than a representative sample of everything (let alone one gathered in a controlled manner). Good data isn't cheap.
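
To make the distinction concrete, here is a rough simulation sketch, not anything taken from a real data set. Everything in it is an assumption made up for illustration: the hidden factor `z`, the function name `simulate`, and the effect sizes. The treatment has no effect on the outcome at all. In the observational case, `z` influences both who ends up treated and the outcome. In the experimental case, treatment is assigned by a coin flip.

```python
import random
from statistics import mean


def simulate(n, randomized, seed=0):
    """Return the naive treated-minus-control difference in mean outcome.

    Hypothetical setup: a hidden factor z drives the outcome, and the
    treatment itself has no effect whatsoever.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)  # hidden confounder, never recorded
        if randomized:
            # Experiment: the researcher assigns treatment by coin flip.
            is_treated = rng.random() < 0.5
        else:
            # Observation: units with high z tend to end up treated.
            is_treated = z + rng.gauss(0, 1) > 0
        # The outcome depends only on z plus noise, not on treatment.
        y = 2.0 * z + rng.gauss(0, 1)
        (treated if is_treated else control).append(y)
    return mean(treated) - mean(control)


observational_gap = simulate(20_000, randomized=False)
experimental_gap = simulate(20_000, randomized=True)

print(f"observational gap: {observational_gap:.2f}")
print(f"experimental gap:  {experimental_gap:.2f}")
```

Under these assumptions the observational comparison should show a large gap between the groups even though the treatment does nothing, while the randomized comparison should sit close to zero. Collecting more observational rows does not shrink the first gap; it only estimates the wrong number more precisely.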