CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.
Adding the PTB Corrector as an option reduces the total validation errors in the PTB-to-dependencies conversion by about 250. Oddly, this comes from removing 280 syntax errors while adding 40 morpho errors for aux verbs. Presumably those are fixable. There is always more that can be done: 2622 errors remain when using the converter. https://github.com/UniversalDependencies/docs/issues/717
John Bauer committed
5e57eaba40897ee93b69ed3f11bda511f6b427d8
Parent: 1f11e3f