Google offers researchers 1 billion computing core-hours

ChuckMcM · on April 8, 2011

A rancorous debate broke out inside Google once about the suitability of Google's infrastructure for solving 'real' problems. The people who felt it was inadequate pointed out that everyone who built "super" computers did so with lots of shared state and epic low latency bandwidth. But Google's computers were designed with 'shared nothing' in mind, the architecture [1] so clearly supports web search but is useless for scientific computing.

I and a few others were of the opinion that the scientific community didn't build the equivalent of LinPack or scientific simulation packages on 'Google like' architectures because they didn't have access to such architectures rather they had "Beowulf" clusters [2] which had been built to be more like Supercomputers. It wasn't that such problems couldn't be worked on shared nothing architectures, it was just that nobody was making any real progress along those lines.

This problem reads like a typical Google response to such an argument "Ok, if we made some hardware available for this sort of thing, would academics be willing to apply some serious thinking to it?" No doubt buried in the details somewhere there will be some language about Google owning or at least getting free perpetual access too any code or techniques that emerge out of this experiment.

You have to admit, if they could pull it off it would put a huge crimp in any supercomputer type system.

But perhaps more interestingly, after I left Google one of the things that caught my eye was some of the articles that have been written about what quantum computing might look like. And I realized that with enough cores sitting around you could imagine how stuff learned programming that for 'real' problems might inform how you would program a quantum computer.

[1] http://labs.google.com/papers/googlecluster-ieee.pdf [2] http://www.phy.duke.edu/~rgb/Beowulf/beowulf_book/beowulf_bo...

VladRussian · on April 8, 2011

>I and a few others were of the opinion that the scientific community didn't build the equivalent of LinPack or scientific simulation packages on 'Google like' architectures because they didn't have access to such architectures rather they had "Beowulf" clusters [2] which had been built to be more like Supercomputers. It wasn't that such problems couldn't be worked on shared nothing architectures, it was just that nobody was making any real progress along those lines.

Whats is the difference between 'Google like' and Beowulf or other similar Linux clusters for scientific calculations? A Beowulf is just a software which clusters together a bunch of cheap computers connected by whatever Ethernet is currently available for normal money (of course you can throw more money if you have it). On the other side, people have no problem running physics or bioinformatics calculations onto hundreds of Amazon nodes.

>it would put a huge crimp in any supercomputer type system.

thats what Beowulf and the likes already did 10 years ago. It is one of the reasons why "supercomputers" (SMP nodes connected by extremely fast backplanes in big cabinets or as you said "lots of shared state and epic low latency bandwidth") had the low rate of survival into 21st century. Of course there is still Top500 supercomputers - big rooms with a lot of racks and, frequently, very expensive/fast networks. Yet, if your distributed program significantly depends on the speed of the interconnect, ie. it have significant message passing component, it usually wouldn't scale effectively, ie. it may scale, yet with quickly diminishing return, Amdahl law style, even with extremely low latency / expensive interconnect.

exit · on April 8, 2011

> And I realized that with enough cores sitting around you could imagine how stuff learned programming that for 'real' problems might inform how you would program a quantum computer.

(i realized that) (stuff learned programming with that many cores) might (inform how you would program a quantum computer)

hugh3 · on April 8, 2011

if they could pull it off it would put a huge crimp in any supercomputer type system

Well, I wouldn't go that far. There are many different types of problems to be solved. Some of 'em need a thousand-node super-uber-bandwidth supercomputer if you're going to get them done in a reasonable amount of time, while others can be done just as well on a million noninteracting nodes. Certainly there are some folks with problems of the second type out there, and I'm sure they'll make great use of google's resources here. But there are other folks who really need to invert a 100,000 by 100,000 matrix, and there ain't no magic way to embarrassingly-parallelize that.

nyellin · on April 7, 2011

Read the ending; it sounds like a cloud computing offer:

In the future, we think this service could also be useful for businesses in various industries, like biotech, financial services, manufacturing and energy. If your business can benefit from hundreds of millions of core-hours to solve complex technical challenges and you want to discuss potential applications, please contact us.

hugh3 · on April 7, 2011

Interesting! I have a few ideas on some awesome things I could do with a billion CPU hours, though I'm not one hundred percent sure whether I can get a proposal together by the end of May.

The eligibility is a little unclear though... at one point it says "up to 10 distinguished researchers and postdoctoral scholars worldwide", while elsewhere it says "Awardees will participate through Google’s Visiting Faculty Program; faculty members need to have full-time status at an academic institution". So are postdocs invited to apply or not?

edit: Oh, the page could also use some information about memory per node.

TheEzEzz · on April 7, 2011

Postdocs are full-time faculty members. They just aren't permanent.

gammarator · on April 8, 2011

No, postdocs are not faculty. They may be classified as staff, as fellows, or even as students [1-2], but they aren't members of faculty senates.

[1] http://sciencecareers.sciencemag.org/career_magazine/previou... [2] http://sciencecareers.sciencemag.org/career_magazine/previou...

TheEzEzz · on April 8, 2011

At my current university all postdocs in my department carry the official title of Visiting Assistant Professor. They are listed under the tab 'Visiting Faculty' on the departmental website. At least in name they are faculty, and they are professors.

bbgm · on April 8, 2011

That's unique. Having many friends who were, or are still, postdocs, they were almost always staff, but not faculty in any sense (in the US at least)

bdb · on April 7, 2011

Really interesting that your submission has to be in the form of something that can run on Native Client. I wonder if they're using this to stress-test implementations of NC.

tlrobinson · on April 8, 2011

Or they're just using NativeClient as the sandboxing mechanism.

rwg · on April 8, 2011

It's almost certainly this, and I would love to see them release lightweight grid software that uses NaCl on compute hosts. The footprints of existing grid software stacks, both in terms of volume of software and sysadmin time to setup/maintain, are ridiculous.

phaedrus · on April 9, 2011

I and another computer science student spent an entire semester internship just trying to get Globus properly installed and configured in the CS lab for one of our professors. It (the cluster computing software infrastructure of Globus) was a ridiculous sprawling mess of poorly integrated components. I've gone through both Gentoo and Linux-From-Scratch and can honestly say it is easier to assemle the parts for an entire operating system than to get Globus working (at least that was the case in 2008). I could see a definite need for a lightweight solution with a simple install package, but I wasn't in the position to write one.

thefreshteapot · on April 8, 2011

I was looking at the facebook github page for hiphop php and I only just made the connection that the compiled php projects could potentially be ran via "nativeclient".

Equally, its interesting to read that you can register a nativeclient as a plugin for Chrome to be used in their speech api. Not for speech per say, it demonstrates that the connection and the code exists to make it easier to hook into other aspects.

"Extensions are free to use any available web technology to provide speech, including streaming audio from a server, HTML5 audio, Native Client"

https://code.google.com/chrome/extensions/dev/experimental.t...

cabacon · on April 8, 2011

You can also go for the INCITE or ALCC programs from the DOE: http://www.doeleadershipcomputing.org/

They award O(1 billion) hours to O(100) projects, and you don't need DOE funding to apply (or even be a US-based project). The catch is that your code has to be very scalable (to roughly 40k cores), and your science problems should be in line with the DOE mission.

entangld · on April 8, 2011

Has Google been putting out really cool stuff consistently over the last few years or is it just because I'm constantly on HN that I see this stuff?

Anyway, I'm really starting to like Google again.

juiceandjuice · on April 7, 2011

That's effectively 114200 cores devoted continuously for a year. Not bad. An experiment I work with uses ~600 cores over 85 hosts, although the average load is probably only around 20%, but there's times where it's pegged for a few months.

Retric · on April 7, 2011

That's effectively 114200 cores devoted continuously for a year.

Or 256 x Quadro 6000's @ 99.6% up time. Granted, Google can hand out the real CPU time not just a bank of GPU's. Still, I don't think it will be that long before someone pulls that "switch".

DarkShikari · on April 8, 2011

Quadros don't have 448 cores each. nVidia's "core" numbers count each portion of a vector processor (a "thread warp") as a core; by this definition, a single Core i7 core has 48 "cores".

juiceandjuice · on April 8, 2011

Given the programming constraints associated with GPGPUs, I'm betting it's real CPUs.

pama · on April 8, 2011

This is a truly amazing offering. I hope the applications in the area of biochemistry research will find creative ways to work around the two important limitations of google: its GFS filesystem and its latency.

gojomo · on April 8, 2011

The question I would like to research is: 1 billion core-hours on Google's infrastructure yields exactly how many bitcoins?

wladimir · on April 8, 2011

Well, if the Google infrastructure is so massive that it will dominate the bitcoin generation: 50 coins every 10 minutes.

Not worth the electricity I think. Let's talk real science instead of self-enrichtment for a change.

gojomo · on April 9, 2011

I'd heard the bitcoin system auto-adjusted for more 'mining'... but is its production rate really constant, no matter how large the arrival of new computing power under a single authority?

(My 'proposal' was a thought experiment; I'd definitely agree that given that much computing-power there are more beneficial and/or profitable things to do... among them running a market-dominating search engine.)