Commit Graph

  • ea0faabae1 Update llama.cpp Andrei Betlen 2023-04-28 15:32:43 -04:00
  • b7d14efc8b Python weirdness Mug 2023-04-28 13:20:31 +02:00
  • eed61289b6 Don't detect off tokens, detect off detokenized utf8 Mug 2023-04-28 13:16:18 +02:00
  • 3a98747026 One day, I'll fix off-by-1 errors permanently too Mug 2023-04-28 12:54:28 +02:00
  • c39547a986 Detect multi-byte responses and wait Mug 2023-04-28 12:50:30 +02:00
  • 9339929f56 Update llama.cpp Andrei Betlen 2023-04-26 20:00:54 -04:00
  • 5f81400fcb Also ignore errors on input prompts Mug 2023-04-26 14:45:51 +02:00
  • 3c130f00ca Remove try catch from chat Mug 2023-04-26 14:38:53 +02:00
  • be2c961bc9 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python Mug 2023-04-26 14:38:09 +02:00
  • c4a8491d42 Fix decode errors permanently Mug 2023-04-26 14:37:06 +02:00
  • cbd26fdcc1 Update llama.cpp Andrei Betlen 2023-04-25 19:03:41 -04:00
  • 3cab3ef4cb Update n_batch for server Andrei Betlen 2023-04-25 09:11:32 -04:00
  • cc706fb944 Add ctx check and re-order __init__. Closes #112 Andrei Betlen 2023-04-25 09:00:53 -04:00
  • 996e31d861 Bump version v0.1.38 Andrei Betlen 2023-04-25 01:37:07 -04:00
  • 848c83dfd0 Add FORCE_CMAKE option Andrei Betlen 2023-04-25 01:36:37 -04:00
  • 9dddb3a607 Bump version v0.1.37 Andrei Betlen 2023-04-25 00:19:44 -04:00
  • d484c5634e Bugfix: Check cache keys as prefix to prompt tokens Andrei Betlen 2023-04-24 22:18:54 -04:00
  • b75fa96bf7 Update docs Andrei Betlen 2023-04-24 19:56:57 -04:00
  • cbe95bbb75 Add cache implementation using llama state Andrei Betlen 2023-04-24 19:54:41 -04:00
  • 2c359a28ff Merge branch 'main' of github.com:abetlen/llama_cpp_python into main Andrei Betlen 2023-04-24 17:51:27 -04:00
  • 197cf80601 Add save/load state api for Llama class Andrei Betlen 2023-04-24 17:51:25 -04:00
  • c4c332fc51 Update llama.cpp Andrei Betlen 2023-04-24 17:42:09 -04:00
  • 280a047dd6 Update llama.cpp Andrei Betlen 2023-04-24 15:52:24 -04:00
  • 86f8e5ad91 Refactor internal state for Llama class Andrei Betlen 2023-04-24 15:47:54 -04:00
  • f37456133a Merge pull request #108 from eiery/main Andrei 2023-04-24 13:48:09 -04:00
  • 02cf881317 Update llama.cpp Andrei Betlen 2023-04-24 09:30:10 -04:00
  • 8476b325f1 Change to bullseye Niek van der Maas 2023-04-24 09:54:38 +02:00
  • aa12d8a81f Update llama.py eiery 2023-04-23 20:56:40 -04:00
  • 7230599593 Disable mmap when applying lora weights. Closes #107 Andrei Betlen 2023-04-23 14:53:17 -04:00
  • e99caedbbd Update llama.cpp Andrei Betlen 2023-04-22 19:50:28 -04:00
  • 643b73e155 Bump version v0.1.36 Andrei Betlen 2023-04-21 19:38:54 -04:00
  • 1eb130a6b2 Update llama.cpp Andrei Betlen 2023-04-21 17:40:27 -04:00
  • ba3959eafd Update llama.cpp Andrei Betlen 2023-04-20 05:15:31 -04:00
  • 207adbdf13 Bump version v0.1.35 Andrei Betlen 2023-04-20 01:48:24 -04:00
  • 3d290623f5 Update llama.cpp Andrei Betlen 2023-04-20 01:08:15 -04:00
  • e4647c75ec Add use_mmap flag to server Andrei Betlen 2023-04-19 15:57:46 -04:00
  • 207ebbc8dc Update llama.cpp Andrei Betlen 2023-04-19 14:02:11 -04:00
  • 0df4d69c20 If lora base is not set avoid re-loading the model by passing NULL Andrei Betlen 2023-04-18 23:45:25 -04:00
  • 95c0dc134e Update type signature to allow for null pointer to be passed. Andrei Betlen 2023-04-18 23:44:46 -04:00
  • 453e517fd5 Add separate lora_base path for applying LoRA to quantized models using original unquantized model weights. Andrei Betlen 2023-04-18 10:20:46 -04:00
  • 32ca803bd8 Merge branch 'main' of github.com:abetlen/llama_cpp_python into main Andrei Betlen 2023-04-18 02:22:39 -04:00
  • b2d44aa633 Update llama.cpp Andrei Betlen 2023-04-18 02:22:35 -04:00
  • 4ce6670bbd Merge pull request #87 from SagsMug/main Andrei 2023-04-18 02:11:40 -04:00
  • eb7f278cc6 Add lora_path parameter to Llama model Andrei Betlen 2023-04-18 01:43:44 -04:00
  • 35abf89552 Add bindings for LoRA adapters. Closes #88 Andrei Betlen 2023-04-18 01:30:04 -04:00
  • 3f68e95097 Update llama.cpp Andrei Betlen 2023-04-18 01:29:27 -04:00
  • 1b73a15e62 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python Mug 2023-04-17 14:45:42 +02:00
  • 53d17ad003 Fixed end of text wrong type, and fix n_predict behaviour Mug 2023-04-17 14:45:28 +02:00
  • b2a24bddac Update docs Andrei Betlen 2023-04-15 22:31:14 -04:00
  • e38485a66d Bump version. v0.1.34 Andrei Betlen 2023-04-15 20:27:55 -04:00
  • 89856ef00d Bugfix: only eval new tokens Andrei Betlen 2023-04-15 17:32:53 -04:00
  • 6df27b2da0 Merge branch 'main' of github.com:abetlen/llama-cpp-python Niek van der Maas 2023-04-15 20:24:59 +02:00
  • 59b37bbbd2 Support openblas Niek van der Maas 2023-04-15 20:24:46 +02:00
  • 887f3b73ac Update llama.cpp Andrei Betlen 2023-04-15 12:16:05 -04:00
  • 92c077136d Add experimental cache Andrei Betlen 2023-04-15 12:03:09 -04:00
  • a6372a7ae5 Update stop sequences for chat Andrei Betlen 2023-04-15 12:02:48 -04:00
  • 83b2be6dc4 Update chat parameters Andrei Betlen 2023-04-15 11:58:43 -04:00
  • 62087514c6 Update chat prompt Andrei Betlen 2023-04-15 11:58:19 -04:00
  • 02f9fb82fb Bugfix Andrei Betlen 2023-04-15 11:39:52 -04:00
  • 3cd67c7bd7 Add type annotations Andrei Betlen 2023-04-15 11:39:21 -04:00
  • d7de0e8014 Bugfix Andrei Betlen 2023-04-15 00:08:04 -04:00
  • e90e122f2a Use clear Andrei Betlen 2023-04-14 23:33:18 -04:00
  • ac7068a469 Track generated tokens internally Andrei Betlen 2023-04-14 23:33:00 -04:00
  • 25b646c2fb Update llama.cpp Andrei Betlen 2023-04-14 23:32:05 -04:00
  • 6e298d8fca Set kv cache size to f16 by default Andrei Betlen 2023-04-14 22:21:19 -04:00
  • 9c8c2c37dc Update llama.cpp Andrei Betlen 2023-04-14 10:01:57 -04:00
  • 6c7cec0c65 Fix completion request Andrei Betlen 2023-04-14 10:01:15 -04:00
  • 6153baab2d Clean up logprobs implementation Andrei Betlen 2023-04-14 09:59:33 -04:00
  • 26cc4ee029 Fix signature for stop parameter Andrei Betlen 2023-04-14 09:59:08 -04:00
  • 7dc0838fff Bump version v0.1.33 Andrei Betlen 2023-04-13 00:35:05 -04:00
  • 6595ad84bf Add field to disable resetting between generations Andrei Betlen 2023-04-13 00:28:00 -04:00
  • 22fa5a621f Revert "Deprecate generate method" Andrei Betlen 2023-04-13 00:19:55 -04:00
  • 4f5f99ef2a Formatting Andrei Betlen 2023-04-12 22:40:12 -04:00
  • 0daf16defc Enable logprobs on completion endpoint Andrei Betlen 2023-04-12 19:08:11 -04:00
  • 19598ac4e8 Fix threading bug. Closes #62 Andrei Betlen 2023-04-12 19:07:53 -04:00
  • 005c78d26c Update llama.cpp Andrei Betlen 2023-04-12 14:29:00 -04:00
  • c854c2564b Don't serialize stateful parameters Andrei Betlen 2023-04-12 14:07:14 -04:00
  • 2f9b649005 Style fix Andrei Betlen 2023-04-12 14:06:22 -04:00
  • 6cf5876538 Deprecate generate method Andrei Betlen 2023-04-12 14:06:04 -04:00
  • b3805bb9cc Implement logprobs parameter for text completion. Closes #2 Andrei Betlen 2023-04-12 14:05:11 -04:00
  • 9ce8146231 More generic model name Niek van der Maas 2023-04-12 11:56:16 +02:00
  • c14201dc0f Add Dockerfile + build workflow Niek van der Maas 2023-04-12 11:53:39 +02:00
  • 2a60eb820f Update llama.cpp Andrei Betlen 2023-04-11 23:53:46 -04:00
  • 9f1e565594 Update llama.cpp Andrei Betlen 2023-04-11 11:59:03 -04:00
  • 213cc5c340 Remove async from function signature to avoid blocking the server Andrei Betlen 2023-04-11 11:54:31 -04:00
  • 3727ba4d9e Bump version v0.1.32 Andrei Betlen 2023-04-10 12:56:48 -04:00
  • 5247e32d9e Update llama.cpp Andrei Betlen 2023-04-10 12:56:23 -04:00
  • 90e1021154 Add unlimited max_tokens jm12138 2023-04-10 15:56:05 +00:00
  • ffb1e80251 Bump version v0.1.31 Andrei Betlen 2023-04-10 11:37:41 -04:00
  • a5554a2f02 Merge pull request #61 from jm12138/fix_windows_install Andrei 2023-04-10 11:35:04 -04:00
  • adfd9f681c Matched the other encode calls jm12138 2023-04-10 15:33:31 +00:00
  • 0460fdb9ce Merge pull request #28 from SagsMug/local-lib Andrei 2023-04-10 11:32:19 -04:00
  • 2559e5af9b Changed the environment variable name into "LLAMA_CPP_LIB" Mug 2023-04-10 17:27:17 +02:00
  • 63d8a3c688 Merge pull request #63 from SagsMug/main Andrei 2023-04-10 11:23:00 -04:00
  • ee71ce8ab7 Make windows users happy (hopefully) Mug 2023-04-10 17:12:25 +02:00
  • cf339c9b3c Better custom library debugging Mug 2023-04-10 17:06:58 +02:00
  • 4132293d2d Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into local-lib Mug 2023-04-10 17:00:42 +02:00
  • 76131d5bb8 Use environment variable for library override Mug 2023-04-10 17:00:35 +02:00
  • 3bb45f1658 More reasonable defaults Mug 2023-04-10 16:38:45 +02:00
  • 0cccb41a8f Added iterative search to prevent instructions from being echoed, add ignore eos, add no-mmap, fixed 1 character echo too much bug Mug 2023-04-10 16:35:38 +02:00