Since they're using collaborative filtering, the matrix they're solving for is very sparse (i.e. it's an underdetermined system). So they're fitting a regression model by minimizing the regularized squared error. Since the vectors they're trying to model (x and y) are both unknown, the joint optimization problem is not convex, so there's no closed-form solution for both at once; the best you can do is converge to a (local) minimum.
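Concretely, if I'm reading the usual ALS setup right (treat the exact symbols as my paraphrase, not the paper's), the objective is something like

    \min_{x,\,y} \sum_{(u,i)} (p_{ui} - x_u^{\top} y_i)^2 + \lambda \Big( \sum_u \|x_u\|^2 + \sum_i \|y_i\|^2 \Big)

where x_u are the user vectors, y_i the item vectors, and lambda the regularization weight. Jointly it's non-convex, but with y held fixed it's an ordinary regularized least-squares problem in x (and vice versa), which is exactly what alternating least squares exploits.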
That's not actually true. The matrix they're talking about inverting here is just an 8x8 or 128x128 square matrix (in their graphed example), used to solve each alternating least squares subproblem in the paper you linked.
What you do is: for each user, sum up the vectors of the items they 'used' and the outer products of each of those item vectors with itself, then solve a straightforward linear system for that user's vector. Then you flip it around and, for each item, sum up the new user vectors and their outer products. And alternate.
And so yeah, you don't need to invert the matrix; you use a Cholesky decomposition or something similar, since the regularized system is guaranteed to be positive definite.
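Here's a minimal sketch of the user-side half-step in NumPy, assuming a plain binary 'used' matrix R and unweighted L2-regularized least squares (the paper's confidence-weighted variant layers per-observation weights on top of this, which I've left out):

    import numpy as np

    def update_users(R, Y, lam=0.1):
        # One ALS half-step: recompute every user vector with the item vectors Y fixed.
        n_users, f = R.shape[0], Y.shape[1]
        X = np.zeros((n_users, f))
        for u in range(n_users):
            used = np.nonzero(R[u])[0]         # indices of items this user 'used'
            if used.size == 0:
                continue
            Yu = Y[used]                        # (n_used, f) item vectors
            A = Yu.T @ Yu + lam * np.eye(f)     # sum of outer products + regularizer, f x f
            b = Yu.sum(axis=0)                  # sum of the 'used' item vectors
            # A is symmetric positive definite, so a Cholesky solve is enough;
            # no explicit matrix inverse is ever formed.
            L = np.linalg.cholesky(A)           # A = L L^T
            X[u] = np.linalg.solve(L.T, np.linalg.solve(L, b))
        return X

The item-side half-step is the mirror image (fix X, recompute Y), and you just keep alternating until it stops improving.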
Enhanced versions of Gaussian elimination (sparse direct solvers) are used for quite large matrices (with many millions of non-zero coefficients), but for even larger matrices, iterative algorithms can be used. These converge toward the desired solution rather than computing it directly the way Gaussian elimination does.
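As a toy illustration of that direct-vs-iterative distinction (nothing from the paper, just SciPy on a made-up sparse positive definite system):

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import spsolve, cg

    # A sparse, symmetric positive definite system A x = b (a simple
    # tridiagonal matrix stands in for a "large sparse" problem).
    n = 10_000
    main = sp.diags([4.0] * n)
    off = sp.diags([-1.0] * (n - 1), 1)
    A = (main + off + off.T).tocsc()
    b = np.ones(n)

    x_direct = spsolve(A, b)    # direct, elimination-style sparse solve
    x_iter, info = cg(A, b)     # conjugate gradient: iterates toward the answer
    print(info, np.max(np.abs(x_direct - x_iter)))   # info == 0 means CG converged

spsolve is the Gaussian-elimination style of solver; the CG call is the "converges to the solution" style.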
Huh? Isn't Gaussian elimination more straightforward?