Commit Graph

  • f3321edd04 Revise documentation hankcs 2021-10-16 19:30:12 -04:00
  • 69a660ed9c Coreference resolution RESTful APIs released hankcs 2021-10-16 16:59:57 -04:00
  • f426ab2162 Remove data and batch_size arguments from component hankcs 2021-10-16 15:50:09 -04:00
  • 61cc753a82 Clean up code hankcs 2021-10-15 12:50:02 -04:00
  • 7e4d1c2795 Revise documentation hankcs 2021-10-12 10:37:07 -04:00
  • 66802dadd8 Remove batch_size in predict of MTL, use sampler_builder instead fix https://github.com/hankcs/HanLP/issues/1688 hankcs 2021-10-12 10:16:56 -04:00
  • ddac451b73 Revise documentation hankcs 2021-09-20 09:28:50 -04:00
  • 47a9e5c887 Fix bug in span_ranking_srl w/ gold predicates hankcs 2021-09-18 14:59:44 -04:00
  • 194940ab8b Merge remote-tracking branch 'origin/dev' into dev hankcs 2021-09-18 14:59:53 -04:00
  • a94d9d8945 Merge pull request #1684 from yzhangcs/dev hankcs 2021-09-18 00:42:33 -04:00
  • 49a4c989ea Correct the order of the returned P/R values Yu Zhang 2021-09-18 12:30:11 +08:00
  • f62adfa98e Merge pull request #1682 from yzhangcs/dev hankcs 2021-09-17 01:37:38 -04:00
  • 4102e07936 Correct the seeding behavior for seed=0 Yu Zhang 2021-09-17 11:04:14 +08:00
  • 7f94c4e022 Ensure python command is Python2 before running CoNLL scripts hankcs 2021-09-15 21:43:08 -04:00
  • fd432af13f Fix con issue on Windows fix https://github.com/hankcs/HanLP/issues/1679 hankcs 2021-09-14 13:44:01 -04:00
  • dd376ed264 Revise documentation hankcs 2021-09-04 23:27:31 -04:00
  • 3602e3553a Release ERNIE-Gram Chinese MTL model which excels in span tasks. hankcs 2021-09-04 15:07:14 -04:00
  • bb6675dbfc Support whether to calculate effective number of tokens when applying the batch_max_tokens hankcs 2021-09-04 15:07:04 -04:00
  • 04e579b841 Improve error handling hankcs 2021-09-02 15:01:58 -04:00
  • 6255c7b450 Fix redundant config issue: https://bbs.hankcs.com/t/topic/4038/10 hankcs 2021-09-02 15:01:45 -04:00
  • 560dfb5009 Revise documentation hankcs 2021-09-01 12:39:15 -04:00
  • c4bfa9ac44 Add a handy tokenize method for Java RESTful hankcs 2021-09-01 12:26:33 -04:00
  • 36459a2897 Use / on Windows instead of \\ hankcs 2021-08-28 15:36:29 -04:00
  • be7e5fa8d2 Move tf and torch utils to the right places hankcs 2021-08-28 15:26:33 -04:00
  • c1a647bc6b Revise documentation hankcs 2021-08-28 14:36:02 -04:00
  • 4befbd1605 Enable sdp as the first task hankcs 2021-08-28 13:41:37 -04:00
  • 363d0b0760 Do not allow any transition when parse with empty trie fix https://github.com/hankcs/AhoCorasickDoubleArrayTrie/issues/50 hankcs 2021-08-13 10:18:41 -04:00
  • 7b03824e39 Merge pull request #1674 from tiandiweizun/1.x hankcs 2021-08-24 11:18:11 -04:00
  • a9997d8d80 Harden the next method of LongestSearcher in DoubleArrayTrie: a null value in the passed-in TreeMap triggers a bug; the index or length field can be checked instead. tiandi 2021-08-24 17:18:28 +08:00
  • ee1effb3a9 Revise documentation hankcs 2021-08-22 14:55:30 -04:00
  • 8c1881ea15 Improve error log hankcs 2021-08-16 22:46:27 -04:00
  • 9ae1498417 Adjust the pinyin of 莎 to sha1,suo1 fix https://github.com/hankcs/HanLP/issues/1670 hankcs 2021-08-11 12:35:42 -04:00
  • 2aa169f192 Release a couple of new single task models hankcs 2021-08-07 16:16:38 -04:00
  • e2ba85dd27 Revise documentation hankcs 2021-08-01 14:16:04 -04:00
  • b494b1a2a0 Warn windows users that some training scripts are not supported. hankcs 2021-08-01 14:00:31 -04:00
  • 3374ab4767 Release semantic textual similarity API hankcs 2021-07-31 20:42:49 -04:00
  • b3b449cbc1 Revise documentation hankcs 2021-07-29 12:29:54 -04:00
  • d112862eb4 Don't use GPU when CUDA is not installed hankcs 2021-07-29 10:56:47 -04:00
  • e8a8ae77bb Fix pre-processing script for OntoNotes5 Chinese hankcs 2021-07-29 10:27:04 -04:00
  • b124b2d36a Warn windows users that some training scripts are not supported. hankcs 2021-07-28 21:40:32 -04:00
  • 5bad513029 Release text style transfer API hankcs 2021-07-28 20:14:07 -04:00
  • 8fa5a7f857 Provide mirrors for transformer models hankcs 2021-07-23 18:00:42 -04:00
  • a880b644e6 Revise documentation hankcs 2021-07-08 22:56:35 -04:00
  • cbf6ac0ad2 Fix HuggingFace tokenizer_config.json error on some Windows versions https://bbs.hankcs.com/t/topic/3878 hankcs 2021-07-06 13:01:17 -04:00
  • 7b4510dac8 Test on ubuntu-latest, macos-latest, windows-latest hankcs 2021-07-06 12:19:20 -04:00
  • 1d74fca6c2 Revise documentation hankcs 2021-06-30 13:32:22 -04:00
  • 6d65332faa Fix multi_label config in transformer_classifier_tf fix https://github.com/hankcs/HanLP/issues/1661 hankcs 2021-06-30 10:48:24 -04:00
  • eb7412caa5 Fix task scheduling when tasks='dep', skip_tasks='tok*' hankcs 2021-06-21 13:01:49 -04:00
  • babc1e402a Upgrade Portable in sync to v1.8.2 hankcs 2021-06-18 13:17:10 -04:00
  • 24ccd6ebbb Merge branch '1.x' into portable hankcs 2021-06-18 13:16:48 -04:00
  • 6b89f39d11 :checkered_flag: Routine maintenance and accuracy improvements; bump minor version by 1, release v1.8.2 v1.8.2 hankcs 2021-06-18 13:10:23 -04:00
  • 8ee039b68b Support disabling automatic refresh of the dictionary cache (CustomDictionaryAutoRefreshCache=false) fix https://github.com/hankcs/HanLP/issues/1655 hankcs 2021-06-17 22:10:28 -04:00
  • 1cfc1ec5e2 Pass verbose to to() hankcs 2021-06-12 00:25:24 -04:00
  • 61631b02ff Improve the HMM sampling function https://bbs.hankcs.com/t/topic/136/64?u=hankcs hankcs 2021-06-10 14:34:20 -04:00
  • c0d75697b2 Fix devices error hankcs 2021-06-07 21:18:09 -04:00
  • eceeb48073 Fix sliding window in transformer tagger hankcs 2021-06-09 02:51:13 -04:00
  • 3a99bc6d89 Adjust the formula; Viterbi segmentation accuracy improves from 94.49 to 94.69 https://bbs.hankcs.com/t/topic/136/61?u=hankcs hankcs 2021-06-08 19:35:52 -04:00
  • 99548e74c8 Fix the reload method of CoreDictionary hankcs 2021-06-07 11:36:08 -04:00
  • 2f51659dde Pass save_dir to build_model hankcs 2021-06-05 19:54:53 -04:00
  • 337b4d8985 Pass save_dir to on_config_ready hankcs 2021-06-05 19:38:51 -04:00
  • e491a6ff69 Use AutoTokenizer_ instead hankcs 2021-06-03 17:32:44 -04:00
  • e7b3fbcb27 Use sampler_builder during decoding for tok hankcs 2021-06-03 16:01:27 -04:00
  • 8b643ef601 Avoid re-downloading transformers for tok hankcs 2021-06-03 16:00:01 -04:00
  • e8f62eece6 Release a new tok model COARSE_ELECTRA_SMALL_ZH hankcs 2021-06-03 11:25:58 -04:00
  • e7d3c4dd90 Provide mirrors for xlm-roberta-base and bert-base-japanese-char hankcs 2021-06-02 22:07:59 -04:00
  • 9e45880dd4 Update the mul XLMR model to properly tokenize apostrophe hankcs 2021-06-02 21:41:25 -04:00
  • b95c6109ca Add a output_spans to the config of tokenizers hankcs 2021-06-02 20:10:51 -04:00
  • 41d8654bf3 Update the STS model hankcs 2021-05-29 01:25:04 -04:00
  • e7a0a27f0f Support HTML visualization in Jupyter notebooks hankcs 2021-05-26 16:03:06 -04:00
  • 309cf4a8fc Remove debugging codes hankcs 2021-05-25 13:35:46 -04:00
  • a3f9d02626 Revise the bigram model hankcs 2021-05-24 22:22:55 -04:00
  • 979a044820 Avoid re-downloading Electra model hankcs 2021-05-24 16:26:18 -04:00
  • 9bb2a0b170 Revise documentation hankcs 2021-05-24 16:09:20 -04:00
  • 23d0cab759 Support flat data for STS hankcs 2021-05-24 16:03:04 -04:00
  • 879c0e0bca A simple supervised STS baseline hankcs 2021-05-21 10:44:59 -04:00
  • f1c1c71a1e Fix edge case that input str is removed by hugging face tokenizers fix https://github.com/hankcs/HanLP/issues/1651#issuecomment-845737681 hankcs 2021-05-21 11:14:52 -04:00
  • 40e59ed6eb Use GitHub Actions for CI and deployment hankcs 2021-05-20 14:07:06 -04:00
  • 9a9de4dc00 Remove iwpt eval scripts hankcs 2021-05-20 14:01:03 -04:00
  • 02485842b7 Revise hint messages hankcs 2021-05-20 13:14:50 -04:00
  • ca33f6cbd4 Mirror transformers for faster access hankcs 2021-05-20 13:04:38 -04:00
  • 1bc33ac552 Fix subword tokenization on mojibake hankcs 2021-05-18 01:51:45 -04:00
  • 19fcda3dec Revise demo hankcs 2021-05-18 00:11:34 -04:00
  • 4fd9660328 Suppress warning hankcs 2021-05-18 00:02:53 -04:00
  • ea55bda4d9 Use GitHub Actions for CI hankcs 2021-05-20 13:58:53 -04:00
  • faea3fa9ad Release a Japanese joint model trained on NPCMJ/UD/Kyoto corpora with encoders including tok, pos, ner, dep, con, srl. hankcs 2021-05-17 22:39:01 -04:00
  • 3764f7c6ab Revise documentation hankcs 2021-05-17 00:12:46 -04:00
  • b1f3ac9ecf Save loading time for unittest hankcs 2021-05-16 21:45:13 -04:00
  • afc4e3f7e7 Recall characters removed by the BERT tokenizer, fix https://github.com/hankcs/HanLP/issues/1651 hankcs 2021-05-16 21:41:22 -04:00
  • 8eba90ff49 Revise documentation hankcs 2021-05-16 21:41:15 -04:00
  • 1632955d3f Revise the Simplified-Traditional Chinese mapping table hankcs 2021-05-14 14:46:14 -04:00
  • 770671cd0d Support the NPCMJ (NINJAL Parsed Corpus of Modern Japanese) hankcs 2021-05-12 23:24:22 -04:00
  • 759964d543 Report system info for convenience hankcs 2021-05-09 22:58:13 -04:00
  • b5a02247c2 Revise documentation hankcs 2021-05-05 22:13:49 -04:00
  • edc5185fed cprint supports file argument hankcs 2021-05-01 15:59:04 -04:00
  • 4eaf7eef5e Improve the merge rule for NER dict_whitelist hankcs 2021-04-29 11:30:23 -04:00
  • 7e36bc4369 Fix AttributeError: 'ProgbarLogger' object has no attribute 'params' for tf2.3.0 hankcs 2021-04-13 10:54:47 -04:00
  • 4335d55ab0 Add utility iobes_to_bilou hankcs 2021-04-06 13:21:01 -04:00
  • de84655c5a Improve visualization: if the root label is shorter than the level number, extend it to the same length hankcs 2021-04-06 13:31:19 -04:00
  • 528c75a041 Revise documentation hankcs 2021-03-31 18:05:54 -04:00
  • 1696479912 Correct the initial (声母) of lve4 to ve fix https://github.com/hankcs/HanLP/issues/1644 hankcs 2021-04-17 11:28:27 -04:00