Articles like this just encourage pointy-haired bosses who know just enough to be dangerous and annoying.
"We're shifting over to a MapReduce-based architecture, just like Google! Google's website is only an interface to a larger architecture. It all ties in to cloud computing and SOA and..."
I really doubt he does understand it. That's a terrible explanation of MapReduce, and it then makes MapReduce sound like secret Google tech that will take over the world. Parallel computing isn't new, and even MapReduce itself has an open source clone now. It's really surprising to find an article like that on HBR.
It's interesting how much noise there is around MapReduce. I'm working as CTO of a start-up with a very large dataset (terabytes) where we need to do quite complex queries across the dataset and very quickly.
Naturally we are taking a distributed approach to the problem since hardware is cheap and relatively easy to coordinate via software.
After extensive benchmarking of everything, we decided to go our own way: a mixture of some databases and raw file system access, with our own application both coordinating the cluster of machines and performing the queries themselves.
We looked closely at Hadoop and came to the conclusion that MapReduce was just the wrong architecture for our problem. Its performance was horrible.
The moral of this story is that MapReduce is interesting, but that doesn't prevent you from having to actually test algorithms and make an informed choice. Just because Google's using it doesn't mean it's a panacea.
I didn't get through the whole article, but "Google is building a new secret weapon that has more to do with the brain than search."? Google published their MapReduce paper in 2004... http://labs.google.com/papers/mapreduce.html
Latest version of Qt also implements MapReduce patterns (only for multi-core, not across the network). There's also good old MPI, which isn't terribly sexy, but gets the job done. I found this post rather interesting when I ran across it:
MapReduce is hardly secret, and it's not as if Yahoo!, Facebook and others aren't using it. Google might have a head start, but that's about it.
I wonder how long that head start will last with a collaborative Open Source project backed by some other big players (that would be Hadoop).
Then a few hours later my comment was edited: they removed a few sentences. (I don't remember exactly what I wrote, but I said that: 1) the author of this article should have shown his text to at least some CS student, 2) I didn't expect that Harvard Business Publishing would publish articles by such incompetent authors.)
Here we go. They could have deleted my comment, but instead they decided to edit my speech. Why don't they just rewrite everything I said?
I am interested in learning about MapReduce, but I have yet to find a very simple tutorial that explains what it does and demonstrates at a very simple level how it works.
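For what it's worth, here is a minimal sketch of the idea as a toy word count in plain Python. It runs in a single process, so it only shows the shape of the computation (map, group by key, reduce), not the distribution across machines that Hadoop or Google's system handles. The function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are made up for illustration and are not any framework's API.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each document could be mapped on a different machine;
    # here they are simply mapped one after another.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    # Each key could likewise be reduced independently of the others.
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    docs = ["the quick brown fox", "the lazy dog", "The fox"]
    print(word_count(docs))
```

The point of the pattern is that the map calls don't depend on each other, and neither do the reduce calls for different keys, so in principle both phases can be spread over many machines. Whether that actually pays off for a given workload is a separate question that still has to be benchmarked.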