
Apologies, could you add yet more clarity? According to the blog post your results are:

Parser     Accuracy  Speed (w/s)  Language  LOC

Stanford   89.6%     19           Java      > 50,000 [1]

parser.py  89.8%     2,020        Python    ~500

Redshift   93.6%     2,580        Cython    ~4,000

Are these the labelled parsing results you are referring to? How many sents/sec? Are you using the same PTB data sets as Zhang and Nivre '11?



Those are unlabelled attachment score (UAS) results, but Redshift is quietly computing the dependency labels, while parser.py does not. Running Redshift in unlabelled mode gives very fast parse times, but about 1% less accuracy. The labels are really useful, both as features and as a way to divide up the problem space.
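To make the UAS/LAS distinction concrete, here is a minimal sketch of the two metrics, assuming parses are represented as `(head_index, dep_label)` pairs per token. This representation is illustrative only, not Redshift's or parser.py's actual API.

```python
# UAS vs LAS on toy (head_index, dep_label) annotations.
# Assumed data layout: one (head, label) pair per token.

def uas(gold, pred):
    """Unlabelled attachment score: fraction of tokens whose
    predicted head matches the gold head (labels ignored)."""
    correct = sum(g_head == p_head
                  for (g_head, _), (p_head, _) in zip(gold, pred))
    return correct / len(gold)

def las(gold, pred):
    """Labelled attachment score: head AND label must both match."""
    correct = sum(g == p for g, p in zip(gold, pred))
    return correct / len(gold)

# Toy 4-token sentence: one label error (token 3), one head error (token 4).
gold = [(2, "nsubj"), (0, "root"), (2, "dobj"), (3, "amod")]
pred = [(2, "nsubj"), (0, "root"), (2, "iobj"), (2, "amod")]

print(uas(gold, pred))  # 0.75 : only the head error counts
print(las(gold, pred))  # 0.5  : head error and label error both count
```

This shows why a labelled parser can report UAS directly: the labels it predicts are simply ignored when scoring heads.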

The data sets use the _Stanford_ labels, whereas the main results in Zhang and Nivre refer to MALT labels. Z&N do report a single Stanford-labels accuracy in their results: 93.5% UAS.

Sentences per second should be just over 100. I use k=8 with some extra features referring to words further down the stack, whereas Z&N use k=64. Right at the bottom of the post, you can find the commit SHA and the commands for the experiment.
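The beam size k is the main speed/accuracy knob in this style of parser: at each step the k best partial transition sequences are kept, so k=8 does roughly an eighth of the scoring work of k=64. A minimal generic sketch of beam-size-k decoding, assuming a toy interface where `transitions(state)` yields `(next_state, step_score)` pairs and `is_final(state)` tests completion (real parsers like Z&N's score transitions with learned feature weights):

```python
# Generic beam search over transition sequences, beam size k.
# The `transitions`/`is_final` interface is an assumption for
# illustration, not the API of Redshift or parser.py.

def beam_decode(start, transitions, is_final, k):
    beam = [(0.0, start)]  # (cumulative score, state)
    while not all(is_final(s) for _, s in beam):
        candidates = []
        for score, state in beam:
            if is_final(state):
                candidates.append((score, state))  # carry finished parses
                continue
            for nxt, step_score in transitions(state):
                candidates.append((score + step_score, nxt))
        # Prune: keep only the k highest-scoring partial sequences.
        beam = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]
    return max(beam, key=lambda c: c[0])

# Toy problem: build a 3-character string, 'a' scores 1.0, 'b' scores 0.5.
def transitions(state):
    return [(state + "a", 1.0), (state + "b", 0.5)]

def is_final(state):
    return len(state) == 3

score, best = beam_decode("", transitions, is_final, k=2)
print(best, score)  # "aaa" 3.0
```

With k=1 this degenerates to greedy decoding; larger k trades time for a better chance of recovering from locally bad transitions, which is why cutting k=64 down to k=8 costs some accuracy but buys a large speed-up.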


Hi, how does this parser compare to ClearNLP? It's supposed to also be super fast and very accurate.



