I'd thought of machine learning as a form of optimization. Things like support vector machines really were hill climbing toward some optimum point (a global one, even, since the objective is convex). But at a billion dimensions, you're doing something else entirely. I once went through Andrew Ng's old machine learning course on video, and he was definitely doing optimization.
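To make "optimization" concrete, here's a toy sketch of my own (nothing from the course, and the data, step size, and regularization constant are all made up): a linear SVM trained by sub-gradient descent on the regularized hinge loss. Classically you'd solve the dual QP instead, but the point is the same: a convex objective, descended to its single global minimum.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy 2-class data: two Gaussian blobs, labels in {-1, +1}.
    X = np.vstack([rng.normal(-1.5, 1.0, (50, 2)), rng.normal(+1.5, 1.0, (50, 2))])
    y = np.hstack([-np.ones(50), np.ones(50)])

    w, b = np.zeros(2), 0.0
    lam, lr, n = 0.01, 0.1, len(y)
    for step in range(500):
        margins = y * (X @ w + b)
        active = margins < 1                    # points inside or past the margin
        # Sub-gradient of (1/n) * sum(hinge) + (lam/2) * ||w||^2
        grad_w = lam * w - (y[active][:, None] * X[active]).sum(axis=0) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b

    loss = np.maximum(0.0, 1.0 - y * (X @ w + b)).mean() + 0.5 * lam * (w @ w)
    print("final hinge loss:", loss)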
The last time I actually had to do numerical optimization using gradients, I was trying to solve nonlinear differential equations for a physics engine, in a roughly 20-dimensional space of joint angles that was very "stiff": some dimensions might be many orders of magnitude steeper than others. It's like walking a narrow zigzagging mountain ridge without falling off.
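To make "stiff" concrete, here's a toy sketch (a made-up 2-D quadratic, not the actual joint-angle problem): when one direction's curvature is 1000x the other's, plain gradient descent has to pick a step small enough for the steep direction, so it zigzags back and forth across that direction while barely creeping along the shallow one. That's the ridge.

    import numpy as np

    # Ill-conditioned quadratic: f(x) = 0.5 * (1000*x0^2 + 1*x1^2).
    # The "steep" direction (x0) has 1000x the curvature of the "shallow" one (x1).
    scales = np.array([1000.0, 1.0])

    def grad(x):
        return scales * x

    x = np.array([1.0, 1.0])
    lr = 1.9 / scales.max()      # largest stable step is set by the steepest curvature
    for step in range(200):
        # x0 overshoots and flips sign every step (the zigzag) while shrinking;
        # x1 only decays by a factor of (1 - 0.0019) per step, so it barely moves.
        x = x - lr * grad(x)

    print(x)   # x0 is essentially zero; x1 is still around 0.68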
So deep learning is not at all like either of those. Hm.