It's a good idea, but it doesn't seem very common so far. This is what my NLP co...

It's a good idea, but it doesn't seem very common so far.

This is what my NLP company (Luminoso) does -- we train a domain-general model of word meanings on a lot of data, then do the last step on the probably-small amount of specific data you actually have.

Even customers who are knowledgeable about machine learning usually haven't heard of the idea before. They've been assuming that the only way to do NLP is to get millions of labeled examples. Or to get a thousand labeled examples and put them into the kind of off-the-shelf algorithm that needs millions of labeled examples, which of course goes poorly.