Commit Graph

  • 89cce50f8c Update llama.cpp Andrei Betlen 2024-01-18 21:21:49 -05:00
  • b8fc1c7d83 feat: Add ability to load chat format from huggingface autotokenizer or tokenizer_config.json files. Andrei Betlen 2024-01-18 21:21:37 -05:00
  • 48c3b77e6f Offload KQV by default Andrei Betlen 2024-01-18 11:08:57 -05:00
  • 850416ae82 Merge branch 'main' into batch-processing batch-processing Andrei Betlen 2024-01-18 08:49:00 -05:00
  • 6f08021280 Cleanup pyproject Andrei Betlen 2024-01-17 09:48:46 -05:00
  • 6bfe98bd80 Integration of Jinja2 Templating (#875) Austin 2024-01-17 09:47:52 -05:00
  • 52adc23115 Update llama.cpp Andrei Betlen 2024-01-17 09:27:40 -05:00
  • 7b46bb5a78 Re-order classes in llama.py Andrei Betlen 2024-01-17 09:16:13 -05:00
  • cc4630e66f Move helper classes to _internals submodule Andrei Betlen 2024-01-17 09:14:00 -05:00
  • 3b92419132 Move cache classes to llama_cache submodule. Andrei Betlen 2024-01-17 09:09:12 -05:00
  • 6981597835 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-01-16 19:35:59 -05:00
  • d5dbb3f8de Update llama.cpp Andrei Betlen 2024-01-16 19:35:57 -05:00
  • 84380fe9a6 Add llamaindex integration to readme (#1092) Jerry Liu 2024-01-16 16:10:50 -08:00
  • 9c36688b33 fix(cli): allow passing n_ctx=0 to openAI API server args to use model n_ctx_train field per #1015 (#1093) Kyle Mistele 2024-01-16 17:54:06 -06:00
  • cfb7da98ed Support Accept text/event-stream in chat and completion endpoints, resolves #1083 (#1088) anil 2024-01-16 11:52:52 -06:00
  • e39778f8eb Update llama.cpp Andrei Betlen 2024-01-16 11:56:44 -05:00
  • e7ef07db96 Merge branch 'batch-processing' of github.com:abetlen/llama_cpp_python into batch-processing Andrei Betlen 2024-01-15 17:50:36 -05:00
  • 358593fc9e Merge branch 'main' into batch-processing Andrei Betlen 2024-01-15 17:50:26 -05:00
  • 4b11fa83c0 Bump version v0.2.29 Andrei Betlen 2024-01-15 12:54:51 -05:00
  • 84615adbc6 Add split_mode option. Closes #1085 Andrei Betlen 2024-01-15 12:49:20 -05:00
  • 76aafa6149 Implement GGUF metadata KV overrides (#1011) Phil H 2024-01-15 17:29:29 +00:00
  • 7eff42c239 Avoid "LookupError: unknown encoding: ascii" when open() called in a destructor (#1012) yieldthought 2024-01-15 16:52:10 +01:00
  • 1eaace8ea3 Fix low_level_api_chat_cpp example to match current API (#1086) anil 2024-01-15 09:46:35 -06:00
  • c689ccc728 Fix Pydantic model parsing (#1087) Mark Neumann 2024-01-15 07:45:57 -08:00
  • 5502ac8876 Update llama.cpp Andrei Betlen 2024-01-15 10:12:10 -05:00
  • 359ae73643 Update llama.cpp Andrei Betlen 2024-01-14 08:17:22 -05:00
  • 7c898d5684 Update llama.cpp Andrei Betlen 2024-01-13 22:37:49 -05:00
  • 7a1c2b5d2e Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into batch-processing Andrei Betlen 2024-01-12 02:18:21 -05:00
  • bb610b9428 Update llama.cpp Andrei Betlen 2024-01-11 22:51:12 -05:00
  • 7f4ba48ada Use sampling context Andrei Betlen 2024-01-10 08:29:54 -05:00
  • 456a601842 Merge branch 'main' into batch-processing Andrei Betlen 2024-01-10 03:19:11 -05:00
  • f0159663d9 Bump version v0.2.28 Andrei Betlen 2024-01-10 02:51:17 -05:00
  • df3be58d6c Add ability to pass in penalize_nl param (#1068) Stephen Hankinson 2024-01-10 03:46:27 -04:00
  • 2ddce7294e print_grammar to stderr (#1052) Joseph Turian 2024-01-10 02:46:03 -05:00
  • 431cb3ec81 Update llama.cpp Andrei Betlen 2024-01-09 15:32:39 -05:00
  • 1ae05c102b Update llama.cpp Andrei Betlen 2024-01-08 14:51:29 -05:00
  • 142a9e1bc3 Update llama.cpp Andrei Betlen 2024-01-05 16:20:50 -05:00
  • e1cd61ed91 Fix #1038 Andrei Betlen 2024-01-05 04:57:57 -05:00
  • b1e996219c Merge branch 'main' into batch-processing Andrei Betlen 2024-01-05 04:09:28 -05:00
  • 75d0527fd7 Bump version v0.2.27 Andrei Betlen 2024-01-04 18:30:12 -05:00
  • fffcd0181c Update llama.cpp Andrei Betlen 2024-01-04 18:26:00 -05:00
  • 907b9e9d42 Add Saiga chat format. (#1050) Fedor Moiseev 2024-01-05 06:12:58 +07:00
  • f766b70c9a Fix: Correct typo in README.md (#1058) Caleb Hoff 2024-01-04 17:12:32 -06:00
  • cf743ec5d3 Added ChatGLM chat format (#1059) xaviviro 2024-01-05 00:12:02 +01:00
  • eb9c7d4ed8 Update llama.cpp Andrei Betlen 2024-01-03 22:04:04 -05:00
  • 011c3630f5 Bump version v0.2.26 Andrei Betlen 2023-12-27 17:35:02 -05:00
  • 969ea6a2c0 Update llama.cpp Andrei Betlen 2023-12-27 17:33:26 -05:00
  • f952d45c2c Update llama.cpp Andrei Betlen 2023-12-24 01:34:36 -05:00
  • f6f157c06d Update bug report instructions for new build process. Andrei Betlen 2023-12-22 15:35:51 -05:00
  • 92284f32cb Add HIP_PATH to dll search directories for windows users. Andrei Betlen 2023-12-22 15:29:56 -05:00
  • 2b0d3f36fa set llama_max_devices using library function Andrei Betlen 2023-12-22 15:19:28 -05:00
  • d9a1d90fd7 Fix typo Andrei Betlen 2023-12-22 15:12:27 -05:00
  • 37556bf9c4 Bump version v0.2.25 Andrei Betlen 2023-12-22 14:55:58 -05:00
  • 6d8bc090f9 fix: inccorect bindings for kv override. Based on #1011 Andrei Betlen 2023-12-22 14:52:20 -05:00
  • f4be84c122 Fix typo Andrei Betlen 2023-12-22 14:40:44 -05:00
  • 9b3a5939f3 docs: Add multi-model link to readme Andrei Betlen 2023-12-22 14:40:13 -05:00
  • 522aecb868 docs: add server config docs Andrei Betlen 2023-12-22 14:37:24 -05:00
  • 6473796343 Update llama.cpp Andrei Betlen 2023-12-22 14:10:34 -05:00
  • 15ee2106f6 Merge branch 'main' of github.com:abetlen/llama_cpp_python into main Andrei Betlen 2023-12-22 14:05:26 -05:00
  • 4b01a873ef server: Support none defaulting to infinity for completions (#111) swg 2023-12-22 14:05:13 -05:00
  • 99ff175562 Check if completion_tokens is none in error handler. Andrei Betlen 2023-12-22 13:41:06 -05:00
  • 12b7f2f4e9 [Feat] Multi model support (#931) Dave 2023-12-22 11:51:25 +01:00
  • 4a85442c35 Update llama.cpp Andrei Betlen 2023-12-22 00:12:37 -05:00
  • 2f03fb0231 fix text_offset of multi-token characters (#1037) twaka 2023-12-22 14:03:29 +09:00
  • 33cc623346 Implement openai api compatible authentication (#1010) docmeth02 2023-12-21 19:44:49 +01:00
  • 788394c096 Update llama.cpp Andrei Betlen 2023-12-21 13:16:46 -05:00
  • ffceb772d1 Update llama.cpp Andrei Betlen 2023-12-19 17:05:40 -05:00
  • a05b4da80a fix: float32 is not JSON serializable when streaming logits. Andrei Betlen 2023-12-18 18:40:36 -05:00
  • fcbd177c95 Fix logits are not json serializable Andrei Betlen 2023-12-18 18:38:04 -05:00
  • a625412a74 Merge branch 'main' into batch-processing Andrei Betlen 2023-12-18 18:37:23 -05:00
  • abda047284 Update changelog Andrei Betlen 2023-12-18 18:16:17 -05:00
  • 7df6c32544 Fix type annotations Andrei Betlen 2023-12-18 18:14:53 -05:00
  • b703aad79e Fix type annotation Andrei Betlen 2023-12-18 18:13:37 -05:00
  • d0aedfcff6 Fix type annotation Andrei Betlen 2023-12-18 18:12:49 -05:00
  • 2993936b10 Fix ctypes definitions of llama_kv_cache_view_update and llama_kv_cache_view_free. (#1028) Eduard Christian Dumitrescu 2023-12-18 18:11:26 -05:00
  • 5e863d8a3b Bump version v0.2.24 Andrei Betlen 2023-12-18 16:09:18 -05:00
  • cfd698c75c Update low_level_api_llama_cpp.py to match current API (#1023) Jonathan Soma 2023-12-18 15:59:11 -05:00
  • 095c650006 Add offload_kqv option to llama and server Andrei Betlen 2023-12-18 15:36:09 -05:00
  • 472b344ae3 Remove unnused import Andrei Betlen 2023-12-18 15:32:40 -05:00
  • 2fc48c54be Update llama.cpp Andrei Betlen 2023-12-18 15:32:15 -05:00
  • 6b2e0e05b4 perf: Don't convert logprobs arrays to lists (#1021) kddubey 2023-12-18 11:28:12 -08:00
  • 62944df142 Bugfix: Remove f16_kv, add offload_kqv field (#1019) Brandon Roberts 2023-12-18 12:27:11 -07:00
  • 37da8e863a Update README.md functionary demo typo (#996) evelynmitchell 2023-12-16 17:00:30 -07:00
  • f1c631dc53 Bug fixed with n_ctx=0 (#1015) Daniele Morotti 2023-12-17 00:59:50 +01:00
  • 5a8944672f Fix logits_to_logprobs for 2-D and 3-D logits (#1002) kddubey 2023-12-16 15:59:26 -08:00
  • 534b1ea9b5 Update llama.cpp Andrei Betlen 2023-12-16 18:57:43 -05:00
  • cbce061ffd Bump version v0.2.23 Andrei Betlen 2023-12-13 21:52:29 -05:00
  • 8b4db732bd Add qwen chat format (#1005) yhfgyyf 2023-12-14 10:43:43 +08:00
  • 690c563b60 Merge branch 'main' of github.com:abetlen/llama_cpp_python into main Andrei Betlen 2023-12-13 21:43:19 -05:00
  • c0fc0a1e82 Update llama.cpp Andrei Betlen 2023-12-13 21:43:16 -05:00
  • 8e44a32075 Add support for running the server with SSL (#994) Radoslav Gerganov 2023-12-12 03:47:11 +02:00
  • ef22e478db Replace logits_to_logprobs implementation with numpy equivalent to llama.cpp (#991) Tanner Hobson 2023-12-11 20:46:27 -05:00
  • ac35f68e4d Fix UnsupportedOperation: fileno in suppress_stdout_stderr (#961) zocainViken 2023-12-12 02:44:51 +01:00
  • b938cccf05 Add Pygmalion chat format (#986) chiensen 2023-12-12 09:44:04 +08:00
  • 6bbeea07ae README.md multimodal params fix (#967) zocainViken 2023-12-12 02:41:38 +01:00
  • c1d92ce680 fix minor typo (#958) Aniket Maurya 2023-12-12 01:40:38 +00:00
  • 4335a9db13 Merge branch 'main' into batch-processing Andrei Betlen 2023-12-11 19:46:59 -05:00
  • e9bc4c4baf Fix docker build Andrei Betlen 2023-12-11 10:39:51 -05:00
  • c1e73e73a3 Bump version v0.2.22 Andrei Betlen 2023-12-11 10:26:42 -05:00
  • ec26f364cc Remove f16_kv Andrei Betlen 2023-12-11 10:25:37 -05:00
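Several commits above touch `logits_to_logprobs` (ef22e478db replaces the implementation with a numpy equivalent of llama.cpp's, and 5a8944672f fixes it for 2-D and 3-D logits). A minimal standalone sketch of the underlying idea, a numerically stable log-softmax over the last axis; the function name mirrors the library's, but this is an illustration, not the shipped code:

```python
import numpy as np

def logits_to_logprobs(logits: np.ndarray, axis: int = -1) -> np.ndarray:
    # Numerically stable log-softmax: shift by the max along `axis`
    # before exponentiating, so large logits don't overflow.
    maxs = np.amax(logits, axis=axis, keepdims=True)
    shifted = logits - maxs
    # log(sum(exp(shifted))) is the (shifted) normalizer.
    log_norm = np.log(np.sum(np.exp(shifted), axis=axis, keepdims=True))
    return shifted - log_norm
```

Because the reduction is along a single axis with `keepdims=True`, the same code handles 1-D, 2-D, and 3-D logit arrays (per-token, per-sequence, and batched), which is the shape generality 5a8944672f was about.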