I actually can't wrap my head around this number, even though I have been working on and off with deep learning for a few years. The biggest models we've ever deployed in production still have fewer than 1B parameters, and the latency is already pretty hard to manage during rush hours. I have no idea how they deploy (multiple?) 1.8T-parameter models that serve tens of millions of users a day.