Commit Graph

  • cb9bacecdc Not sure where it's getting the incorrect path from... try to print that out John Bauer 2024-06-04 01:26:32 -07:00
  • ef31e6bb24 Add a test of the icepahc operations Stanza will use to prepare the Icelandic treebank John Bauer 2024-05-17 20:49:07 -07:00
  • bb4d17f5b9 Add a bit of error checking to RelabelNode to make an error a bit less inscrutable John Bauer 2024-05-17 20:34:58 -07:00
  • 375f24338c Update to 4.5.8 - no actual changes, just path changes John Bauer 2024-05-17 14:35:44 -07:00
  • 53a8f1b6aa Rehome the conll-2012 scorer. Not sure if the 2011 scorer can be recovered easily yet John Bauer 2024-05-06 08:57:18 -07:00
  • f736369561 Update links to 4.5.7 now that it is on Maven John Bauer 2024-04-29 23:55:26 -07:00
  • 54aaf2c41f Update a unittest output - DATE no longer being detected for some reason John Bauer 2024-04-28 00:43:07 -07:00
  • 246007929c Update to v4.5.7 - improved dependency converter, F1 scores for individual conparse trees v4.5.7 John Bauer 2024-04-18 19:20:27 -07:00
  • a76a854ce2 Find fronted 'said' ccomps, with a few negative detections to avoid other likely tree structures which are similar but aren't actually fronted ccomp - see comments John Bauer 2024-04-18 16:02:06 -07:00
  • c5ba427883 remove json-simple from pom Takeo Sawada 2024-04-13 16:34:26 +08:00
  • 5e57eaba40 Adding PTB Corrector as an option reduces the total validation errors in the PTB conversion to dependencies by about 250. Weirdly this is by removing 280 syntax errors and adding 40 morpho errors for aux verbs. Presumably those should be fixable. Of course, there is always more that can be done - there are now 2622 errors left when using the converter. https://github.com/UniversalDependencies/docs/issues/717 John Bauer 2024-03-20 08:47:51 -07:00
  • 1f11e3fb0a Add the ability to EnglishPTBTreebankCorrector to use it as a TreeTransformer with a single transformation, rather than on an entire treebank John Bauer 2024-03-20 01:21:28 -07:00
  • aef5e36f46 Not sure how this didn't blow up in the past 10 years... perhaps tsurgeon is more restrictive about not having the labels in the query now John Bauer 2024-03-20 00:41:49 -07:00
  • 25136768ee Don't CC underneath an MWE, since those are deliberately expected to be FIXED. Improves 'all but' and 'whether or not' https://github.com/UniversalDependencies/docs/issues/717 John Bauer 2024-03-14 23:53:31 -07:00
  • 72918d9537 Add a no-self-loop rule to the very general VP < VP < other verb AUX rule in UniversalEnglishGrammaticalRelations. This doesn't change any trees in the PTB Train section, but does help the tag updater not hit false positives for the AUX rules https://github.com/UniversalDependencies/docs/issues/717 John Bauer 2024-03-14 08:22:45 -07:00
  • f3288a8e1d Technically the UniversalPOSMapper needs to use the UniversalSemanticHeadFinder, in case any of its POS mapping rules use # John Bauer 2024-03-14 08:21:39 -07:00
  • 7eeca858ae Make the tsurgeon debug mode noisier John Bauer 2024-03-14 08:11:08 -07:00
  • 15bcfb3420 Gottem John Bauer 2024-03-14 07:38:00 -07:00
  • 396741e4bd Run a TreeTransformer that gets rid of the functional tags other than TMP on the NPs before using the POS conversions. This also greatly reduces the number of validation errors, especially thanks to the AUX rules now matching for NP-stuff whereas before it would not match those John Bauer 2024-03-14 02:41:39 -07:00
  • ce33462fa0 Add a DEBUG flag to the UniversalPOSMapper which outputs the automatically generated tregex/tsurgeons John Bauer 2024-03-14 02:14:59 -07:00
  • 30f2f8e7d9 Update the UniversalPOSMapper to use AUX for a large chunk of the dependencies by reusing the patterns from UniversalEnglishGrammaticalRelations to find those words. Currently it is finding more than it should, but the error rate is significantly lower than it is without this change John Bauer 2024-03-14 02:03:10 -07:00
  • 7f70ad8483 The root of a tregex expression now keeps track of what variables it knows about John Bauer 2024-03-14 00:17:37 -07:00
  • 79833b400f Split the context mappings into two separate arrays - this gives us a place to reuse the AUX mappings from UniversalEnglishGrammaticalRelations John Bauer 2024-03-13 23:14:37 -07:00
  • 75e8b886e5 Whitespace John Bauer 2024-03-13 23:10:47 -07:00
  • bc72bce35c Simplify all the 1-1 tag mappings in UniversalPOSMapper John Bauer 2024-03-13 22:32:01 -07:00
  • 6fa8d4d99b Refactor somewhat - move all the context-sensitive mappings to their own construction John Bauer 2024-03-13 21:58:50 -07:00
  • c53fa2dfb4 Initial version: turn the UniversalPOSMapper tsurgeon file into code. This will allow us to easily programmatically reuse the AUX rules, which currently do not match the UD standard for UPOS John Bauer 2024-03-13 21:06:27 -07:00
  • fd6800faad Tiny test of the UniversalPOSMapper John Bauer 2024-03-13 18:32:43 -07:00
  • 731fc8ed49 Updated the combiner to mark a couple known lemmas John Bauer 2024-03-13 17:30:17 -07:00
  • 850e5888c9 Add lemmas to a few of the MWTs that we combine for English. A few others are still TODO, such as the n't 'll etc suite John Bauer 2024-03-13 16:45:16 -07:00
  • 1dd746cfea Treat 'dinna' as an MWT in the converter John Bauer 2024-03-13 00:22:33 -07:00
  • 541d3ec886 Fix doc error John Bauer 2024-03-12 23:09:19 -07:00
  • 6501bafa54 Oops, the flat needs to include the possible NN tag as part of its target John Bauer 2024-02-23 19:28:45 -08:00
  • cb338cd57f Add a FLAT relation for phrases such as 'en masse' which aren't considered a FIXED or MWT expression. Addresses a tiny part of https://github.com/UniversalDependencies/docs/issues/717 John Bauer 2024-02-23 19:15:05 -08:00
  • bc4acf11d1 Can treat 'sort of' the same as 'kind of' when converting constituency trees to dependencies. https://github.com/UniversalDependencies/docs/issues/717 John Bauer 2024-02-23 00:14:27 -08:00
  • a68ac4bb21 Add a sent_id to the UD Converter, since the validation script used in UD tools seems to throw a fit over not having a sent_id John Bauer 2024-02-22 23:47:46 -08:00
  • 2725b06fa9 Add the f1 scores for each tree to the parser protobuf responses John Bauer 2024-02-18 12:11:44 -08:00
  • a80772d97f Will need to get all of the F1 scores for an entire treebank in order to pick silver trees using the variance John Bauer 2024-02-15 22:51:40 -08:00
  • 7180285a15 Add some doc to EvaluateTreebank John Bauer 2024-02-15 21:31:07 -08:00
  • 3499d27e61 Document the output of the release script from the last few releases John Bauer 2024-02-01 14:05:25 -08:00
  • aa9dd0f41f Update links from 4.5.5 to 4.5.6 John Bauer 2024-02-01 12:54:35 -08:00
  • 71bc2568d8 Update jar links for a new version v4.5.6 John Bauer 2024-01-31 20:18:46 -08:00
  • 8c7a2eeded Update pom & readme files for CoreNLP 4.5.6 John Bauer 2024-01-31 19:56:45 -08:00
  • 39866df123 Update mailing list notes and add a link to the Semgrex client on the Tregex page John Bauer 2024-01-29 22:46:45 -08:00
  • 6d17c2390b Fix (some of?) the crashes that occur if you delete the leftmost, rightmost, or rootmost nodes. I'm the guy who tests his code. You must be the other guy. Addresses https://github.com/stanfordnlp/CoreNLP/issues/1405 John Bauer 2024-01-11 16:30:38 -08:00
  • 76b024af5f Update all links from http to https John Bauer 2024-01-03 17:18:06 -08:00
  • 87563a3682 Minor code cleanup; no functional changes Christopher Manning 2024-01-03 16:17:03 -08:00
  • c61b9a9008 Trivial code cleanup. Christopher Manning 2024-01-03 15:43:52 -08:00
  • c33bcce963 Maybe Chrome won't nanny reject https links John Bauer 2024-01-02 21:56:41 -08:00
  • e0488055ea Add an option to make the srparser the default server model John Bauer 2023-12-11 00:03:01 -08:00
  • cfff2c348d Update maven links to reflect the latest release John Bauer 2023-12-10 22:42:29 -08:00
  • 8adcbfe67f A few lemmatizer updates: enroll and appall instead of enrol or appal, add de- as a verb prefix (presumably doesn't break any exceptions), add blog and xfer as other double letter exceptions John Bauer 2023-12-08 16:58:06 -08:00
  • 2dd08da9de Add cowrite as another word in the write family of lemmas John Bauer 2023-12-08 16:13:09 -08:00
  • 9b5bec8919 Special case for elder/eldest lemmas John Bauer 2023-12-05 08:16:52 -08:00
  • c26b25e118 Add an Ssurgeon operation which adds an (English only) lemma to text John Bauer 2023-12-04 14:55:04 -08:00
  • d302c6394b Minor javadoc fix John Bauer 2023-12-04 14:31:41 -08:00
  • 9103adb279 Update some incorrect doc John Bauer 2023-12-04 14:23:13 -08:00
  • ad37f2acfa Add #number as an allowed continuation after ABBREV3. Addresses https://github.com/stanfordnlp/CoreNLP/issues/1396 John Bauer 2023-11-16 13:19:48 -08:00
  • 2852da8b1e Add Yazidi as a demonym - https://en.wikipedia.org/wiki/Yazidis John Bauer 2023-11-28 14:44:28 -08:00
  • e5c9d44398 Remove another use of yield, this time from CountTrees John Bauer 2023-11-22 23:39:52 -08:00
  • b084233fd6 Proposed removal of new keyword yield from one file... see if it clears up that compile error John Bauer 2023-11-22 23:38:47 -08:00
  • 6a0c697939 fix typo John Bauer 2023-11-15 21:03:38 -08:00
  • e93dbb278e Clean up a bunch of the FAQ John Bauer 2023-11-15 20:10:14 -08:00
  • 4f3fc5fd00 Update through 'what other constraints' John Bauer 2023-11-15 13:53:46 -08:00
  • b47cf78fda Center columns in the scenegraph page John Bauer 2023-11-15 13:46:29 -08:00
  • 5b433a6ac4 Redo the formatting on the NER page John Bauer 2023-11-15 13:45:24 -08:00
  • d9f874c2af Update formatting of the classifier page John Bauer 2023-11-15 13:43:28 -08:00
  • 587419336d Center columns John Bauer 2023-11-15 13:40:53 -08:00
  • b556d88575 Redo formatting of the tregex release table John Bauer 2023-11-15 13:39:05 -08:00
  • e572f6fa85 Remove these really ugly NBSPs John Bauer 2023-11-15 13:30:20 -08:00
  • 03880af49f First attempt at non-breaking dash didn't work John Bauer 2023-11-15 13:27:54 -08:00
  • 9c1b6e540d Center a couple columns John Bauer 2023-11-15 13:19:42 -08:00
  • 449305f637 Updated dashes to non-breaking in the lexparser page John Bauer 2023-11-15 13:18:26 -08:00
  • fbf4b78816 Fix command line John Bauer 2023-11-15 13:15:58 -08:00
  • 8d26c5a649 Update download links for lex-parser history John Bauer 2023-11-15 13:15:09 -08:00
  • 1329caaf19 Fix rows John Bauer 2023-11-15 13:12:53 -08:00
  • 4f55ebbe2c Begin updating the core lex-parser page John Bauer 2023-11-15 13:10:48 -08:00
  • 53a1d63a5c Update link to segmenter on NER page John Bauer 2023-11-15 12:54:26 -08:00
  • e11041113c Fix links & small errors John Bauer 2023-11-15 07:44:58 -08:00
  • 947e8848f1 Update links and remove the 'contact' section from the Segmenter portion of the doc John Bauer 2023-11-15 07:44:39 -08:00
  • 2a76ed7956 Update links for NERDemo, the tagger, and where to file issues on the NER FAQ John Bauer 2023-11-11 08:42:59 -08:00
  • 92f163a773 Change tregex ppt links to point to /u/downloads John Bauer 2023-11-11 08:34:31 -08:00
  • 40ef99e900 Add doc for <-- etc semgrex relations John Bauer 2023-11-10 07:42:04 -08:00
  • b9f19a67c0 Add 'tis and 'twas to the MWT combiner John Bauer 2023-11-08 21:43:57 -08:00
  • 0de435b9b9 Add MWTs to English trees if given a flag in UniversalDependenciesConverter John Bauer 2023-11-08 20:45:55 -08:00
  • 0cdb71a654 The rest of the original NNDep page... can probably be merged with https://stanfordnlp.github.io/CoreNLP/pytorch-depparse.html John Bauer 2023-11-08 15:03:40 -08:00
  • 9d9983177a Add skeletal versions of the POS pages John Bauer 2023-11-08 14:57:05 -08:00
  • ee138b3f84 Remove unnecessary header John Bauer 2023-11-08 14:54:39 -08:00
  • 549a191b63 Improve formatting, give the Arabic parser faq a unique name John Bauer 2023-11-08 14:50:48 -08:00
  • f7c1435355 Add an initial checkin of the SRParser page conversion John Bauer 2023-11-08 14:36:40 -08:00
  • 86e09b0df1 Update User Guide heading John Bauer 2023-11-08 00:17:13 -08:00
  • 8a44dd2cfb Need a unique filename for the Arabic FAQ John Bauer 2023-11-08 00:16:25 -08:00
  • 9af3b0d26e Initial checkin of the Segmenter pages John Bauer 2023-11-07 23:58:58 -08:00
  • af38b77f3b Initial conversions of lexparser & a couple of its FAQ pages John Bauer 2023-11-07 23:55:50 -08:00
  • 82f52f3d2e Update some SceneGraph links John Bauer 2023-11-06 14:52:03 -08:00
  • 053d647e64 Update a couple more links in the CRF FAQ John Bauer 2023-11-06 14:45:19 -08:00
  • c79a350613 Update some FAQ links John Bauer 2023-11-06 14:38:05 -08:00
  • d238d401d5 Update download links John Bauer 2023-11-06 14:34:05 -08:00
  • 72949779b6 Add Ssurgeon to the keywords to search for on the tregex page John Bauer 2023-11-06 10:21:17 -08:00
  • df55661bef Formatting updates John Bauer 2023-11-04 20:50:13 -07:00