
Here is what I don't understand about deep NLP (please keep in mind that I have just begun exploring this field):

I am currently working on an algorithm that uses elementary text cues in combination with large data-table lookups to determine things like the relevant keywords of news articles scraped from various sites. I have given my results to hundreds of people independently to get feedback on their quality. Here is the current breakdown:

In 80% of the cases I get a perfect score.

In 10% of the cases I get an acceptable score.

In 10% of the cases the result needs improvement.

My questions here are:

1. If deep NLP can only provide us with the same level of efficiency/accuracy, then why the hell would we use it?

2. If deep NLP can provide us with more efficiency than what is stated above, then wouldn't it be safe to assume that it is UNREASONABLY efficient?

3. Why are most people using deep NLP, or ML in general, right off the bat? Theoretically, it would be far more interesting to construct a model where the result of statistical/linguistic parsing is fed to some sort of ML algorithm in order to tackle that 10% of bad cases.



I've worked in the NLP research area (mostly with statistical metrics), and I can safely say that 80% (precision) is the empirical threshold that most metrics are able to reach quite easily. The range between 80% and 90% starts to get difficult, and above that you have to do some tweaking to adapt to the specifics of your problem.
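
For reference, a minimal sketch of how precision is usually computed for extracted keywords against a human-annotated gold set (the keyword lists below are made up for illustration):

  # Hypothetical example: precision of extracted keywords against a
  # human-annotated gold set. The keyword lists are made up.
  def precision(extracted, gold):
      extracted, gold = set(extracted), set(gold)
      return len(extracted & gold) / len(extracted) if extracted else 0.0

  predicted = ["election", "senate", "budget", "weather"]
  annotated = ["election", "senate", "budget", "tax"]
  print(precision(predicted, annotated))  # 0.75: 3 of the 4 extracted keywords are correct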

With that said, your values do seem to be in line with what I consider to be easily reachable, so it kind of depends on how much work you need to do with the neural networks to extract those keywords. I'm not very knowledgeable about how NNs are applied to this field, but I'm assuming that a drawback of that approach is that it may resemble a black box, in the sense that it may be hard to tweak the internals.

I prefer statistical metrics because they seem simpler to derive. For instance, you can think of things like "a relevant keyword is usually related with (or closer to) other relevant keywords", and you can test that hypothesis just by counting distances between words. This is what I did in 2012 with quite good results; you can check the paper here: http://www.sciencedirect.com/science/article/pii/S1877050912...
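
A minimal sketch of that kind of proximity heuristic (not the metric from the paper; the tokenizer, candidate set, and window size are assumptions chosen for illustration):

  # Rough sketch: candidate keywords that appear close to other candidates
  # get a higher score. Window size and scoring are simplifying assumptions.
  from collections import defaultdict

  def proximity_scores(tokens, candidates, window=5):
      positions = defaultdict(list)
      for i, tok in enumerate(tokens):
          if tok in candidates:
              positions[tok].append(i)
      scores = {}
      for cand, cand_positions in positions.items():
          hits = 0
          for i in cand_positions:
              # count occurrences of *other* candidates within +/- window tokens
              for other, other_positions in positions.items():
                  if other != cand:
                      hits += sum(1 for j in other_positions if abs(i - j) <= window)
          scores[cand] = hits / len(cand_positions)
      return scores

  text = "the senate passed the budget after a long budget debate in the senate".split()
  print(proximity_scores(text, {"senate", "budget", "debate"}))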


> I prefer statistical metrics because they seem simpler to derive.

That's exactly how I view it as well. My goal for this project is to reach a 90% "perfect" score, and in that case ML doesn't even seem to be needed. Perhaps the gap between 90% and 95-100% is where ML can add value. But that in itself is what #3 is about in my original post.

Thank you for confirming my suspicions regarding the threshold though!


I believe that NNs provide the logical relations for you, so perhaps they can be trained and then inspected to understand the data better.


As I said, I don't know much about deep NNs, but from what I remember of neural networks from my college years, each "node" in the network only has weights associated with its inputs, which makes it a black box. In other words, it is not easy to "grab" a node from the network, check the weights associated with its inputs, and understand how they relate to the language.
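
To illustrate the point with a toy sketch (the model below is hypothetical and not tied to any particular keyword extractor), the parameters you can pull out of a trained network are just vectors of floats with no obvious linguistic meaning:

  # Toy illustration: the weights of one hidden "node" are just numbers;
  # nothing in them says "this unit detects proper nouns". Sizes are made up.
  import torch.nn as nn

  model = nn.Sequential(
      nn.Linear(300, 64),  # e.g. a 300-dim word feature vector in
      nn.ReLU(),
      nn.Linear(64, 1),    # a single relevance score out
  )

  node_0_weights = model[0].weight[0]  # the weights of one hidden node
  print(node_0_weights.shape)          # torch.Size([300])
  print(node_0_weights[:5])            # uninterpretable floats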


You don't have to right now. But if you were to change the problem a bit, you'd need to reinvent those "elementary text cues", right? With deep learning (or, more generally, representation learning) you can simply change the training data and reuse the rest of your algorithm. Jure Leskovec has a paper, node2vec, which describes this well:

> A typical solution involves hand-engineering domain-specific features based on expert knowledge. Even if one discounts the tedious effort required for feature engineering, such features are usually designed for specific tasks and do not generalize across different prediction tasks. An alternative approach is to learn feature representations by solving an optimization problem [4]. The challenge in feature learning is defining an objective function, which involves a trade-off in balancing computational efficiency and predictive accuracy.
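
As a concrete (if oversimplified) sketch of "change the training data and reuse the rest of your algorithm", assuming gensim is available; the corpus and candidate terms below are invented:

  # Sketch: the ranking code stays the same; only the corpus used to train
  # the representations changes. Corpus and candidates are made up.
  from gensim.models import Word2Vec

  def train_embeddings(corpus):
      # corpus: tokenized sentences from whatever domain you currently care about
      return Word2Vec(sentences=corpus, vector_size=100, window=5, min_count=1).wv

  def rank_candidates(candidates, topic_word, wv):
      # domain-independent: rank candidate keywords by similarity to a topic word
      return sorted(candidates, key=lambda w: wv.similarity(w, topic_word), reverse=True)

  news_corpus = [["senate", "passed", "the", "budget"],
                 ["budget", "debate", "in", "the", "senate"]]
  wv = train_embeddings(news_corpus)
  print(rank_candidates(["budget", "debate"], "senate", wv))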


What is the difference between feature engineering and defining an objective function?


It's the same kinda game in some way. Objective functions are often easily reused though.
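
A rough way to see the difference (both toy functions below are hypothetical): a hand-engineered feature bakes task-specific knowledge into the input representation, while an objective function only defines what "good" means and can often be reused across tasks unchanged:

  # Hand-engineered feature: task-specific knowledge baked in by hand.
  def capitalized_mid_sentence(token, position):
      return position > 0 and token[:1].isupper()

  # Objective function: a generic definition of "good", reusable across tasks.
  import math

  def binary_cross_entropy(p_predicted, y_true):
      eps = 1e-12  # avoid log(0)
      return -(y_true * math.log(p_predicted + eps)
               + (1 - y_true) * math.log(1 - p_predicted + eps))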


Can you elaborate on why you would consider "more efficiency than what is stated above" to be "UNREASONABLY efficient"?

Certainly, in many cases better accuracy can be both reasonably needed and reasonably possible (i.e. if humans can do it, then it's obviously possible).

One measure that is used, and is a bit similar (though with a major difference), is "inter-annotator agreement": you ask the same question to multiple people and note how often they agree. That is considered a reasonable ceiling; it is a measure of how objective/subjective the question is, i.e. of how often there really is a single "correct answer". For some problems that metric is near 100% and can reasonably be beaten by a good system, because the mismatches are caused by human mistakes rather than true disagreements; for others (e.g. some forms of emotion/sentiment/sarcasm analysis) 80% is unreasonably good, since the text doesn't contain enough information to decide for sure.
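
A minimal sketch of that ceiling for two annotators (the labels are invented; with more annotators you would typically use something like Fleiss' kappa):

  # Sketch: raw agreement and Cohen's kappa for two annotators labelling the
  # same items. The labels below are made up for illustration.
  from collections import Counter

  def agreement(a, b):
      return sum(x == y for x, y in zip(a, b)) / len(a)

  def cohens_kappa(a, b):
      n = len(a)
      p_o = agreement(a, b)
      # expected chance agreement from each annotator's label distribution
      ca, cb = Counter(a), Counter(b)
      p_e = sum((ca[label] / n) * (cb[label] / n) for label in set(a) | set(b))
      return 1.0 if p_e == 1 else (p_o - p_e) / (1 - p_e)

  ann1 = ["pos", "pos", "neg", "neg", "pos", "neg"]
  ann2 = ["pos", "neg", "neg", "neg", "pos", "pos"]
  print(agreement(ann1, ann2))     # ~0.67, a practical ceiling for this task
  print(cohens_kappa(ann1, ann2))  # chance-corrected agreement (~0.33)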

Also, an answer to (3) is that to get a state-of-the-art result (as opposed to a simple baseline) with non-DNN methods you need quite a complex system and lots of custom feature engineering. If you have (or get) one, that's not an issue, but if you're developing a system from scratch, a good DNN system needs less labor than a good "classic" system. For example, a major point in neural machine translation is that it not only gets better results, but that it can get them with a much simpler NLP pipeline. Where a "classic" system needs to integrate 10-30 additional separate modules (ML or manually crafted rules) to handle various types of special cases or feature analysis, much of that (though not all) can be learned by a deep neural network directly in end-to-end training; so if you go directly to a DNN you avoid the (huge!) work of implementing them manually.


When you are doing machine learning, you should always have a simple baseline and only use more complicated algorithms when they improve over that baseline.
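
For instance (a toy sketch; the tiny dataset and model choices are arbitrary assumptions), you would only adopt the heavier model if it actually beats the baseline on held-out data:

  # Sketch: a trivial majority-class baseline vs. a slightly fancier model;
  # only adopt the fancier one if it wins on held-out data. Data is made up.
  from collections import Counter
  from sklearn.feature_extraction.text import TfidfVectorizer
  from sklearn.linear_model import LogisticRegression
  from sklearn.model_selection import train_test_split
  from sklearn.pipeline import make_pipeline

  docs = ["the senate passed the budget", "great goal in the second half",
          "new tax bill in congress", "the striker scored twice"] * 10
  labels = [0, 1, 0, 1] * 10  # 0 = politics, 1 = sports
  X_tr, X_te, y_tr, y_te = train_test_split(docs, labels, random_state=0)

  majority = Counter(y_tr).most_common(1)[0][0]
  baseline_acc = sum(y == majority for y in y_te) / len(y_te)

  model = make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(X_tr, y_tr)
  model_acc = model.score(X_te, y_te)

  print(baseline_acc, model_acc)
  use_complex_model = model_acc > baseline_acc  # otherwise stick with the baseline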


That's my question #3 in a nutshell. And this seems to be a rather good strategy imo. However, the way everyone is talking about ML makes it seem like a silver bullet that solves all that is holy and sacred!


> 2. If deep NLP can provide us with more efficiency than what is stated above, then wouldn't it be safe to assume that it is UNREASONABLY efficient?

Why? Neural nets can already detect skin cancer as well as human dermatologists [1]. Why would you assume that your algorithm is the peak of efficiency and anything that performs better is "unreasonable"?

[1] https://news.ycombinator.com/item?id=13484372


   Neural nets can already detect skin cancer as well as human dermatologists
For what it is worth, that statement is way too strong for what the linked article and paper show.


I didn't say that I assume mine to be the peak. My intention was to point out that any efficiency reached beyond what is statistically possible (i.e. if you only rely on statistical metrics and parsing) could be considered unreasonably efficient. At least, there is an argument to be made for that.

My method should in no way, shape, or form be considered a "peak".


Regarding 3: that is what we do. Basically, we're an NLP company that evolved into something different, and we leverage the NLP parsing to do ML on enriched data.



