
We recently trained GPT-3 (SMALL) at work on our GPU cluster for fun, took 4 days across a couple dozen machines...

Millions of dollars in CAPEX and OPEX just for one model



I'm curious: did you get results comparable to OpenAI's? I know a few people tried to train GPT-2 themselves (before it was openly released) and their results were quite inferior.


You're saying your project cost millions of dollars, or the big boys' projects did?


If "4 days across a couple dozen machines" cost millions, something is very wrong.


Not if it was a couple dozen of these machines:

https://www.hardwarezone.com.sg/tech-news-nvidia-dgx-a100-su...
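
For scale, a rough back-of-envelope sketch (assuming "a couple dozen" means 24 machines and the widely reported ~$199k launch price of a DGX A100, neither of which is stated in the thread):

    # Hypothetical CAPEX estimate -- figures are assumptions, not from the thread.
    machines = 24                 # "a couple dozen"
    price_per_machine = 199_000   # USD, assumed DGX A100 launch price

    capex = machines * price_per_machine
    print(f"Hardware CAPEX: ~${capex:,.0f}")   # roughly $4.8M, i.e. "millions of dollars"

That only covers the hardware outlay, not networking, power, or hosting, so the "millions" claim is plausible for the cluster itself even if a single 4-day run is cheap.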


Running your own DC is quite expensive with GPU hardware. One DGX-2 is $400k and draws something like 24 kW.


> draws something like 24 kW.

That number is off. The DGX-2 consumes 10 kW at peak [0] and the DGX-2H consumes 12 kW at peak [1].

[0] https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Cent...

[1] https://www.nvidia.com/content/dam/en-zz/es_em/Solutions/Dat...
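
Using the corrected 10 kW peak figure, a quick sketch of the electricity cost of that 4-day run (machine count and electricity rate are assumptions, not from the thread):

    # Rough energy-cost estimate for the quoted run -- assumed inputs.
    machines = 24              # "a couple dozen"
    kw_per_machine = 10        # DGX-2 peak draw per the linked datasheet
    hours = 4 * 24             # "4 days"
    rate_usd_per_kwh = 0.10    # assumed industrial electricity rate

    energy_kwh = machines * kw_per_machine * hours
    cost = energy_kwh * rate_usd_per_kwh
    print(f"{energy_kwh:,.0f} kWh, roughly ${cost:,.0f} in electricity")
    # ~23,000 kWh, on the order of a few thousand dollars --
    # the millions are in the hardware, not the power bill for one run.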




