There's been a lot of talk here about the need for academic researchers to share their data. I'm interested what people's opinions are about this when it's corporations doing the researching.
The pressure to publish is so high; promotions et cetera all rely on it. Putting together large datasets can be costly and time-consuming, and the hope is often to get a few publications out of it. I think citations need to acknowledge datasets more, as do promotions; otherwise people will have large incentives to "silo" vertically and not share. A big issue that has no excuse is public access to journals though. It's amazing that the public pays for the research and then pays for it again and again in every library's subscription to the vampire publishing houses, who do nothing other than distribute a few hard copies that nobody reads and then gatekeep the PDFs (the editing and review is all done by academics, so it's just copy and typesetting that's commanding this huge price). Maths, physics and comp sci are some of the better disciplines (masters of LaTeX) for making things available online, but everywhere else it's horrible.
I'd personally like to see more support for people gathering more diverse datasets. If private companies feel it is in their interest to share their data with others and contribute to scientific knowledge, then this is wonderful. Obviously, if your result is to be believed, you will have to share data so that someone else can attempt to replicate your analysis or compare your results with other methods. It is nice to have benchmark datasets, but there is a danger that research on "object recognition" can become research on the "Caltech 101" dataset.
> There's been a lot of talk here about the need for academic researchers to share their data. I'm interested what people's opinions are about this when it's corporations doing the researching.
Is their research affecting public policy?
I only care about your research into the effect that skull shape has on criminal behavior when said research is used to argue for laws.
They should also mention the very nice recent flow of ideas from theoretical physics into the field of probabilistic inference:
Many algorithms that make the analysis of these data sets possible either originate from or are widely used in [particle] physics. Examples include message passing, mean-field methods, variational inference, Monte Carlo methods, and many others.
While I was earning my undergraduate degree in CS/Math, there was not a single course in the CS department that had anything to do with databases. I had to teach myself. The MIS department had one course that taught some rudimentary stuff using Access. This was 1996-2000. Has it changed since? By that I mean, is it still common to go to university for CS and not be offered any courses that teach the basics of RDBMS (or NoSQL)?
Dijkstra is often quoted (though I have no clue if he actually said it) with a similar expression: "Computer Science is no more about computers than astronomy is about telescopes."
Yes, as the others said, the study of different ways to compute.
For example, from the article, "The Eyebrowse project is motivated, in part, by the fact that as researchers, we have no idea how people actually use the web." That's not about computation, that's human-computer interaction, which I think of more as applied psychology melded with graphic design. While there are opportunities to advance the field of analyzing large data sets, that's still more in the realm of advanced statistics than computation.