This is great and looks very easy to use! I'd expect it to have a huge impact gi...

jeffra · on Feb 10, 2020

> Is it getting a lot of internal use already (beyond the example we just heard about)?

We have hundreds of internal users of DeepSpeed using it to train production-ready models, many of which have already been shipped.

> Is it possible to do inference using a CPU and a lot of RAM using a model trained on multiple GPUs via DeepSpeed?

It is definitely possible to do inference on a CPU using a model trained on multiple GPUs via DeepSpeed. For models trained without model parallelism, this is straightforward. The tricky case is a model trained with model parallelism, which requires merging the checkpoints that correspond to different pieces of the model into a single one.
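To make the merge step concrete, here is a rough sketch of the idea, not DeepSpeed's actual code. It assumes each model-parallel rank saved a state dict with the same parameter names, and that you know which axis each parameter was split along. The names `merge_shards`, `split_axis`, `fc1.weight`, `fc2.weight`, and `norm.weight` are hypothetical, and tensors are plain nested lists so the example runs without dependencies. With real PyTorch checkpoints you would load each shard with `torch.load(path, map_location="cpu")` and concatenate with `torch.cat` instead.

```python
# Hypothetical sketch: merge model-parallel checkpoint shards into one state dict.
# Tensors are nested lists (rows of numbers) to keep the example dependency-free.

def concat_rows(shards):
    """Concatenate 2-D matrices along axis 0 (stack the rows)."""
    return [row for shard in shards for row in shard]

def concat_cols(shards):
    """Concatenate 2-D matrices along axis 1 (join matching rows)."""
    n_rows = len(shards[0])
    return [[x for shard in shards for x in shard[r]] for r in range(n_rows)]

def merge_shards(shard_state_dicts, split_axis):
    """Merge per-rank state dicts, given in rank order.

    split_axis maps a parameter name to 0 or 1 (the axis it was split along).
    Names missing from split_axis are assumed replicated on every rank.
    """
    merged = {}
    for name in shard_state_dicts[0]:
        parts = [sd[name] for sd in shard_state_dicts]
        axis = split_axis.get(name)
        if axis is None:
            merged[name] = parts[0]          # replicated: any rank's copy works
        elif axis == 0:
            merged[name] = concat_rows(parts)
        elif axis == 1:
            merged[name] = concat_cols(parts)
        else:
            raise ValueError(f"unsupported split axis {axis!r} for {name}")
    return merged

# Two ranks, each holding half of every split parameter (assumed layout).
rank0 = {"fc1.weight": [[1, 2], [3, 4]], "fc2.weight": [[1, 2], [3, 4]], "norm.weight": [[1, 1]]}
rank1 = {"fc1.weight": [[5, 6], [7, 8]], "fc2.weight": [[5, 6], [7, 8]], "norm.weight": [[1, 1]]}
split_axis = {"fc1.weight": 0, "fc2.weight": 1}  # norm.weight is replicated

merged = merge_shards([rank0, rank1], split_axis)
```

The merged dict can then be loaded into a single-device copy of the model for CPU inference. Which parameters are split, and along which axis, depends entirely on how the model-parallel code partitioned them.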

> Does it work with TPUs right out of the box? It looks like maybe not - if not, any plans to support them?

The ZeRO technology is compatible with TPUs or any other accelerator in a cluster setting, but we have not tested it with TPUs. Getting DeepSpeed to work with TPUs would likely require some small refactoring. We do not have any internal plans to support them yet, but we are of course completely open to contributions from the community.

> Can you use DeepSpeed to train using a lot of CPUs + ram rather than GPUs?

It is possible to use DeepSpeed to train on a lot of CPUs. The major limitation of this approach is that CPUs can be an order of magnitude slower than GPUs in terms of computational performance.

tixocloud · on Feb 11, 2020

Are you able to share the use cases for the production-ready models?