
It's a good idea, but it doesn't seem very common so far.

This is what my NLP company (Luminoso) does -- we train a domain-general model of word meanings on a lot of data, then do the last step on the probably-small amount of specific data you actually have.

Even customers who are knowledgeable about machine learning usually haven't heard of the idea before. They've been assuming that the only way to do NLP is to get millions of labeled examples. Or to get a thousand labeled examples and put them into the kind of off-the-shelf algorithm that needs millions of labeled examples, which of course goes poorly.
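
To make the two-step idea concrete, here is a toy sketch in Python. It is not Luminoso's actual pipeline; the tiny `PRETRAINED` table is a hand-made, hypothetical stand-in for word vectors learned on a large general corpus, and the helper names (`embed`, `train_classifier`, `predict`) are made up for illustration. The only part that gets trained is a small logistic-regression layer fitted on four labeled examples.

```python
import numpy as np

# Hypothetical stand-in for word vectors learned on a large general corpus.
# In practice these would come from a real pretrained embedding model.
PRETRAINED = {
    "great": np.array([0.9, 0.1]),
    "excellent": np.array([0.8, 0.2]),
    "wonderful": np.array([0.85, 0.15]),
    "awful": np.array([-0.9, 0.1]),
    "terrible": np.array([-0.8, 0.2]),
    "dreadful": np.array([-0.85, 0.15]),
    "service": np.array([0.0, 0.9]),
    "food": np.array([0.0, 0.8]),
}


def embed(text):
    """Average the frozen pretrained vectors of the known words in text."""
    vecs = [PRETRAINED[w] for w in text.lower().split() if w in PRETRAINED]
    if not vecs:
        return np.zeros(2)
    return np.mean(vecs, axis=0)


def train_classifier(texts, labels, lr=0.5, epochs=500):
    """Fit only the last step: logistic regression on top of the embeddings."""
    X = np.stack([embed(t) for t in texts])
    y = np.array(labels, dtype=float)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        grad = p - y                            # gradient of the log loss
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


def predict(text, w, b):
    """Return 1 for positive, 0 for negative."""
    return int(embed(text) @ w + b > 0)


# The "probably-small amount of specific data": four labeled examples.
train_texts = ["great food", "excellent service", "awful food", "terrible service"]
train_labels = [1, 1, 0, 0]
w, b = train_classifier(train_texts, train_labels)
```

The classifier never sees "wonderful" or "dreadful" during training, but `predict("wonderful service", w, b)` and `predict("dreadful food", w, b)` can still come out right, because those words sit near "great" and "awful" in the pretrained space. That is the whole trick: the general model carries the word knowledge, so the small labeled set only has to teach the task.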



My ML background is primarily undergraduate work, and even then it was extremely common.



