They do mention that in the 2nd paragraph and repeatedly throughout the article as CF. I found the article to be a nice primer on the topic as it listed common approaches to the problem.
Lots of research here, but the most interesting part is their distributed graph approach. Scaling up CF is a challenging problem, and the current open source implementations have problems at really large scales.
I hope they open source their Giraph implementation, or at least part of it.
FWIW, I gave a talk about the Alternating Least Squares algorithm mentioned here (and linked in several comments) and how we implemented it at Spotify:
Since they're using collaborative filtering, the matrix they're solving for is very sparse (i.e. it's an underdetermined system). So they're fitting a nonlinear regression model by minimizing the regularized squared error. Since the vectors they're trying to model (x and y) are both unknown, the optimization problem is not convex, or in other words, it can't be solved exactly in closed form.
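For reference, the objective here is the standard regularized squared error from the ALS papers; in the usual notation (user vectors x_u, item vectors y_i, observed entries r_ui, which is my notation, not the article's) it's something like:

```latex
\min_{x_*,\, y_*} \;\; \sum_{(u,i)\ \mathrm{observed}} \left( r_{ui} - x_u^{\top} y_i \right)^2
\;+\; \lambda \left( \sum_u \lVert x_u \rVert^2 \;+\; \sum_i \lVert y_i \rVert^2 \right)
```

which is not jointly convex in the x's and y's, hence the alternating approach.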
That's not actually true. The matrix they're talking about inverting here is just a small square matrix, 8x8 or 128x128 in their graphed example, used to solve the alternating least squares subproblem in the paper you linked.
What you do is, for each user, sum up the vectors of the items they 'used' and the outer product of each of those item vectors with itself, and solve a straightforward linear algebra equation. Then you flip it around and, for each item, sum up the new user vectors and their outer products. And alternate.
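Concretely, if Y_u stacks the vectors of the items user u used and r_u holds the corresponding values (my notation), that "straightforward linear algebra equation" is the regularized normal equation

```latex
\left( Y_u^{\top} Y_u + \lambda I \right) x_u = Y_u^{\top} r_u
```

where Y_u^T Y_u is exactly the sum of item-vector outer products, and it's only k x k (the 8x8 or 128x128 above), which is why each solve is cheap.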
And so yeah, you don't need to invert the matrix; you use a Cholesky decomposition or something similar, since the regularized matrix is guaranteed to be positive definite.
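Here's a minimal NumPy/SciPy sketch of one half-sweep of that procedure (solving for every user vector with the item vectors held fixed), using a Cholesky solve instead of an explicit inverse. The dense-ratings layout and the variable names are just for illustration, not how you'd store things at large scale:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def update_users(R, Y, lam):
    """One ALS half-sweep: recompute all user vectors with item factors Y fixed.

    R   : (n_users, n_items) observed values, 0 where unobserved
    Y   : (n_items, k) item factor matrix
    lam : regularization strength
    """
    n_users, _ = R.shape
    k = Y.shape[1]
    X = np.zeros((n_users, k))
    for u in range(n_users):
        used = R[u] != 0                     # items this user interacted with
        Yu = Y[used]                         # their item vectors
        # k x k system: sum of outer products plus regularization
        A = Yu.T @ Yu + lam * np.eye(k)
        b = Yu.T @ R[u, used]
        # A is symmetric positive definite, so factor it instead of inverting
        c, lower = cho_factor(A)
        X[u] = cho_solve((c, lower), b)
    return X
```

The item half-sweep is the same thing with the roles of users and items swapped (use R transposed and the new X), repeated until the factors stop changing much. The implicit-feedback variant of ALS adds per-entry confidence weights but keeps the same k x k solve per user.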
Enhanced versions of Gaussian elimination are used for quite large matrices (with many millions of non-zero coefficients), but for even larger matrices, approximate iterative algorithms can be used. These converge toward the desired solution rather than computing it directly, as Gaussian elimination does.
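Conjugate gradient is one example of such an iterative method for large sparse symmetric positive-definite systems. A toy SciPy sketch (the system here is randomly generated purely for illustration):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

# Build an arbitrary sparse symmetric positive-definite system A x = b
n = 10_000
M = sp.random(n, n, density=1e-3, format="csr", random_state=0)
A = M @ M.T + sp.identity(n)   # M M^T is PSD; adding I makes it positive definite
b = np.ones(n)

# cg never factorizes A: it only needs matrix-vector products, and it
# iterates toward the solution instead of computing it directly.
x, info = cg(A, b)
print("converged" if info == 0 else f"stopped early, info={info}")
```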
I hoped for a minute that they shared their complete implementation; anyone aware of a recommendation system that can scale to millions of items, be updated as soon as new items come in (no full graph recalculation) and take multiple inputs (ratings, saved in library, etc)?
Have a look at Google's MinHash algorithm. While it's a probabilistic solution, you can run it as a MapReduce and at no point need to have the full data set in memory or on a single machine, so it does scale pretty well (rough sketch of the signature idea below).
EDIT: I see you changed your comment to include "no full graph recalculation". Incremental recos are possible to do with MinHash, but I think you can't easily solve decay of old data.
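A minimal Python sketch of the MinHash signature idea; the hash scheme, signature length, and toy data are my own choices, not anything from the Google paper:

```python
import hashlib

NUM_HASHES = 64  # more hash functions -> lower variance in the estimate

def minhash_signature(item_ids, num_hashes=NUM_HASHES):
    """Signature of a user's item set: for each seeded hash function,
    keep the minimum hash value over the user's items."""
    return tuple(
        min(int(hashlib.sha1(f"{seed}:{item}".encode()).hexdigest(), 16)
            for item in item_ids)
        for seed in range(num_hashes)
    )

def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching positions estimates the Jaccard similarity
    of the two underlying item sets."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

# Toy data: two users with overlapping item histories
user_a = minhash_signature({"a", "b", "c", "d"})
user_b = minhash_signature({"b", "c", "d", "e"})
print(estimated_jaccard(user_a, user_b))  # close to the true Jaccard, 3/5
```

Because each signature depends only on that user's own item set, the signature step parallelizes trivially across mappers; grouping users whose signatures collide (e.g. by bands of the signature) in the reduce phase gives you candidate neighbors without ever holding the full data set on one machine.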
- http://www.jmlr.org/papers/volume10/takacs09a/takacs09a.pdf
EDIT: Actually looks like Eq (15) from
- http://public.research.att.com/~volinsky/netflix/BellKorICDM...
Anyway there are lots of papers around on the topic.