Commit Graph

  • 7f52335c50 feat: Update llama.cpp Andrei Betlen 2024-04-25 21:21:29 -04:00
  • 266abfc1a3 fix(ci): Fix metal tests as well Andrei Betlen 2024-04-25 03:09:46 -04:00
  • de37420fcf fix(ci): Fix python macos test runners issue Andrei Betlen 2024-04-25 03:08:32 -04:00
  • 2a9979fce1 feat: Update llama.cpp Andrei Betlen 2024-04-25 02:48:26 -04:00
  • c50d3300d2 chore: Bump version v0.2.64-metal v0.2.64-cu123 v0.2.64-cu122 v0.2.64-cu121 v0.2.64 Andrei Betlen 2024-04-23 02:53:20 -04:00
  • 611781f531 ci: Build arm64 wheels. Closes #1342 Andrei Betlen 2024-04-23 02:48:09 -04:00
  • 53ebcc8bb5 feat(server): Provide ability to dynamically allocate all threads if desired using -1 (#1364) Sean Bailey 2024-04-23 02:35:38 -04:00
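    The commit above lets the server treat `-1` as "use every available CPU". The resolution logic presumably amounts to mapping the sentinel value to the host's CPU count; a minimal sketch of that idea (the function name here is hypothetical, not the server's actual helper):

    ```python
    import multiprocessing

    def resolve_thread_count(n_threads: int) -> int:
        """Map the sentinel value -1 to all available CPUs; pass other values through."""
        if n_threads == -1:
            return multiprocessing.cpu_count()
        return n_threads
    ```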
  • 507c1da066 fix: Update scikit-build-core build dependency avoid bug in 0.9.1 (#1370) Geza Velkey 2024-04-23 08:34:15 +02:00
  • 8559e8ce88 feat: Add Llama-3 chat format (#1371) abk16 2024-04-23 06:33:29 +00:00
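    For context on the Llama-3 chat format added above: Llama 3 instruct models use a header/`<|eot_id|>` token scheme rather than the older `[INST]` markers. A hedged sketch of what rendering messages into that template looks like (this is an illustration of the prompt layout, not the library's registered formatter):

    ```python
    def format_llama3_prompt(messages: list[dict]) -> str:
        """Render chat messages in the Llama 3 instruct layout:
        <|start_header_id|>role<|end_header_id|>\n\ncontent<|eot_id|>,
        ending with an open assistant header for the model to complete."""
        prompt = "<|begin_of_text|>"
        for msg in messages:
            prompt += (
                f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
                f"{msg['content']}<|eot_id|>"
            )
        prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
        return prompt
    ```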
  • 617d536e1c feat: Update llama.cpp Andrei Betlen 2024-04-23 02:31:40 -04:00
  • d40a250ef3 feat: Use new llama_token_is_eog in create_completions Andrei Betlen 2024-04-22 00:35:47 -04:00
  • b21ba0e2ac Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-04-21 20:46:42 -04:00
  • 159cc4e5d9 feat: Update llama.cpp Andrei Betlen 2024-04-21 20:46:40 -04:00
  • 0281214863 chore: Bump version v0.2.63-metal v0.2.63-cu123 v0.2.63-cu122 v0.2.63-cu121 v0.2.63 Andrei Betlen 2024-04-20 00:09:37 -04:00
  • cc81afebf0 feat: Add stopping_criteria to ChatFormatter, allow stopping on arbitrary token ids, fixes llama3 instruct Andrei Betlen 2024-04-20 00:00:53 -04:00
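    Stopping on arbitrary token ids, as added above, matters for Llama 3 instruct because generation should halt on its end-of-turn token rather than only on the classic EOS. A minimal sketch of such a criterion as a callable over the tokens generated so far (the factory name and signature here are illustrative, not the library's `StoppingCriteria` API):

    ```python
    from typing import Callable, List, Optional, Set

    def make_token_stopper(stop_token_ids: List[int]) -> Callable[..., bool]:
        """Build a stopping criterion that fires when the most recently
        generated token is one of the given stop ids."""
        stop_set: Set[int] = set(stop_token_ids)

        def should_stop(tokens: List[int], logits: Optional[list] = None) -> bool:
            return bool(tokens) and tokens[-1] in stop_set

        return should_stop
    ```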
  • d17c1887a3 feat: Update llama.cpp Andrei Betlen 2024-04-19 23:58:16 -04:00
  • 893a27a736 chore: Bump version v0.2.62-metal v0.2.62-cu123 v0.2.62-cu122 v0.2.62-cu121 v0.2.62 Andrei Betlen 2024-04-18 01:43:39 -04:00
  • a128c80500 feat: Update llama.cpp Andrei Betlen 2024-04-18 01:39:45 -04:00
  • 4f42664955 feat: update grammar schema converter to match llama.cpp (#1353) Lucca Zenóbio 2024-04-18 02:36:25 -03:00
  • fa4bb0cf81 Revert "feat: Update json to grammar (#1350)" Andrei Betlen 2024-04-17 16:18:16 -04:00
  • 610a592f70 feat: Update json to grammar (#1350) Lucca Zenóbio 2024-04-17 11:10:21 -03:00
  • b73c73c0c6 feat: add disable_ping_events flag (#1257) khimaros 2024-04-17 14:08:19 +00:00
  • 4924455dec feat: Make saved state more compact on-disk (#1296) tc-wolf 2024-04-17 09:06:50 -05:00
  • 9842cbf99d feat: Update llama.cpp Andrei Betlen 2024-04-17 10:06:15 -04:00
  • c96b2daebf feat: Use all available CPUs for batch processing (#1345) ddh0 2024-04-17 09:04:33 -05:00
  • a420f9608b feat: Update llama.cpp Andrei Betlen 2024-04-14 19:14:09 -04:00
  • 90dceaba8a feat: Update llama.cpp Andrei Betlen 2024-04-14 11:35:57 -04:00
  • 2e9ffd28fd feat: Update llama.cpp Andrei Betlen 2024-04-12 21:09:12 -04:00
  • ef29235d45 chore: Bump version v0.2.61-metal v0.2.61-cu123 v0.2.61-cu122 v0.2.61-cu121 v0.2.61 Andrei Betlen 2024-04-10 03:44:46 -04:00
  • bb65b4d764 fix: pass correct type to chat handlers for chat completion logprobs Andrei Betlen 2024-04-10 03:41:55 -04:00
  • 060bfa64d5 feat: Add support for yaml based configs Andrei Betlen 2024-04-10 02:47:01 -04:00
  • 1347e1d050 feat: Add typechecking for ctypes structure attributes Andrei Betlen 2024-04-10 02:40:41 -04:00
  • 889d0e8981 feat: Update llama.cpp Andrei Betlen 2024-04-10 02:25:58 -04:00
  • 56071c956a feat: Update llama.cpp Andrei Betlen 2024-04-09 09:53:49 -04:00
  • 08b16afe11 chore: Bump version v0.2.60-metal v0.2.60-cu123 v0.2.60-cu122 v0.2.60-cu121 v0.2.60 Andrei Betlen 2024-04-06 01:53:38 -04:00
  • 7ca364c8bd feat: Update llama.cpp Andrei Betlen 2024-04-06 01:37:43 -04:00
  • b3bfea6dbf fix: Always embed metal library. Closes #1332 Andrei Betlen 2024-04-06 01:36:53 -04:00
  • f4092e6b46 feat: Update llama.cpp Andrei Betlen 2024-04-05 10:59:31 -04:00
  • 2760ef6156 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-04-05 10:51:54 -04:00
  • 1ae3abbcc3 fix: missing logprobs in response, incorrect response type for functionary, minor type issues. Closes #1328 Closes #1314 Andrei Betlen 2024-04-05 10:50:49 -04:00
  • 49bc66bfa2 fix: missing logprobs in response, incorrect response type for functionary, minor type issues. Closes #1328 #1314 Andrei Betlen 2024-04-05 10:50:49 -04:00
  • 9111b6e03a feat: Update llama.cpp Andrei Betlen 2024-04-05 09:21:02 -04:00
  • 7265a5dc0e fix(docs): incorrect tool_choice example (#1330) Sigbjørn Skjæret 2024-04-05 15:14:03 +02:00
  • 909ef66951 docs: Rename cuBLAS section to CUDA Andrei Betlen 2024-04-04 03:08:47 -04:00
  • 1db3b58fdc docs: Add docs explaining how to install pre-built wheels. Andrei Betlen 2024-04-04 02:57:06 -04:00
  • c50309e52a docs: LLAMA_CUBLAS -> LLAMA_CUDA Andrei Betlen 2024-04-04 02:49:19 -04:00
  • 612e78d322 fix(ci): use correct script name Andrei Betlen 2024-04-03 16:15:29 -04:00
  • 34081ddc5b chore: Bump version v0.2.59-metal v0.2.59-cu123 v0.2.59-cu122 v0.2.59-cu121 v0.2.59 Andrei Betlen 2024-04-03 15:38:27 -04:00
  • 368061c04a Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-04-03 15:35:30 -04:00
  • 5a5193636b feat: Update llama.cpp Andrei Betlen 2024-04-03 15:35:28 -04:00
  • 5a930ee9a1 feat: Binary wheels for CPU, CUDA (12.1 - 12.3), Metal (#1247) Andrei 2024-04-03 15:32:13 -04:00
  • b5374e9273 Merge branch 'main' into binary-wheels binary-wheels Andrei 2024-04-03 15:31:08 -04:00
  • 8649d7671b fix: segfault when logits_all=False. Closes #1319 Andrei Betlen 2024-04-03 15:30:31 -04:00
  • 6f72de1382 Update workflow name Andrei Betlen 2024-04-03 14:59:46 -04:00
  • 3fcfa8b13c Update generate index workflow Andrei Betlen 2024-04-03 14:58:55 -04:00
  • cdf7be7a44 Add workflows to build CUDA and Metal wheels Andrei Betlen 2024-04-03 02:10:07 -04:00
  • 79de49514a Merge branch 'main' into binary-wheels Andrei Betlen 2024-04-03 01:10:16 -04:00
  • f96de6d920 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-04-03 00:55:21 -04:00
  • e465157804 feat: Update llama.cpp Andrei Betlen 2024-04-03 00:55:19 -04:00
  • 62aad610e1 fix: last tokens passing to sample_repetition_penalties function (#1295) Yuri Mikhailov 2024-04-02 04:25:43 +09:00
  • 45bf5ae582 chore: Bump version v0.2.58 Andrei Betlen 2024-04-01 10:28:22 -04:00
  • a0f373e310 fix: Changed local API doc references to hosted (#1317) lawfordp2017 2024-04-01 08:21:00 -06:00
  • f165048a69 feat: add support for KV cache quantization options (#1307) Limour 2024-04-01 22:19:28 +08:00
  • aa9f1ae011 feat: Add logprobs support to chat completions (#1311) windspirit95 2024-04-01 02:30:13 +09:00
  • 1e60dba082 feat: Update llama.cpp Andrei Betlen 2024-03-29 13:34:23 -04:00
  • dcbe57fcf8 feat: Update llama.cpp Andrei Betlen 2024-03-29 12:45:27 -04:00
  • 125b2358c9 feat: Update llama.cpp Andrei Betlen 2024-03-28 12:06:46 -04:00
  • 901fe02461 feat: Update llama.cpp Andrei Betlen 2024-03-26 22:58:53 -04:00
  • b64fa4e2c0 feat: Update llama.cpp Andrei Betlen 2024-03-25 23:09:07 -04:00
  • a93b9149f8 feat: Update llama.cpp Andrei Betlen 2024-03-25 11:10:14 -04:00
  • 364678bde5 feat: Update llama.cpp Andrei Betlen 2024-03-24 12:27:49 -04:00
  • d11ccc3036 fix(server): minor type fixes Andrei Betlen 2024-03-23 17:14:15 -04:00
  • c1325dcdfb fix: tool_call missing first token. Andrei Betlen 2024-03-22 23:44:04 -04:00
  • e325a831f0 feat: Update llama.cpp Andrei Betlen 2024-03-22 23:43:29 -04:00
  • c89be28ef9 feat: Update llama.cpp Andrei Betlen 2024-03-20 20:50:47 -04:00
  • 3db03b7302 feat: Update llama.cpp Andrei Betlen 2024-03-20 13:27:43 -04:00
  • 740f3f3812 fix: set LLAMA_METAL_EMBED_LIBRARY=on on MacOS arm64 (#1289) bretello 2024-03-20 17:46:09 +01:00
  • f7decc9562 docs: Add chat examples to openapi ui Andrei Betlen 2024-03-19 10:52:53 -04:00
  • 60d8498f21 feat: Add tools/functions variables to Jinja2ChatFormatter, add function response formatting for all simple chat formats (#1273) Andrei 2024-03-19 04:55:57 -04:00
  • 18d7ce918f feat: Update llama.cpp Andrei Betlen 2024-03-19 04:40:24 -04:00
  • 7d4a5ec59f Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main v0.2.57 Andrei Betlen 2024-03-18 11:37:33 -04:00
  • bf64752535 chore: Bump version Andrei Betlen 2024-03-18 11:37:30 -04:00
  • 8a60c7bc8c fix: Fix and optimize functionary chat handler (#1282) Jeffrey Fong 2024-03-18 22:40:57 +08:00
  • 8d298b4750 feat: Update llama.cpp Andrei Betlen 2024-03-18 10:26:36 -04:00
  • 6eb25231e4 feat: Update llama.cpp Andrei Betlen 2024-03-15 12:58:45 -04:00
  • 20e6815252 fix: json mode Andrei Betlen 2024-03-15 12:58:34 -04:00
  • e9d1b8d7be fallback to get_embeddings_ith fix-embeddings-for-non-embedding-models Andrei Betlen 2024-03-14 12:02:24 -04:00
  • 1a9b8af2dd feat: Update llama.cpp Andrei Betlen 2024-03-14 11:46:48 -04:00
  • 4084aabe86 fix: set default pooling type to unspecified Andrei Betlen 2024-03-14 10:04:57 -04:00
  • d318cc8b83 fix: Set default pooling_type to mean, check for null pointer. Andrei Betlen 2024-03-14 09:17:41 -04:00
  • dd0ee56217 feat: Update llama.cpp Andrei Betlen 2024-03-13 15:57:35 -04:00
  • 08e910f7a7 feat: Update llama.cpp Andrei Betlen 2024-03-10 23:45:05 -04:00
  • a7281994d8 chore: Bump version v0.2.56 Andrei Betlen 2024-03-08 21:14:44 -05:00
  • 919fca9f2b Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-03-08 21:10:56 -05:00
  • d02a9cf16f Fixed json strings grammar by blacklisting character control set. Closes #1259 Andrei Betlen 2024-03-08 21:10:53 -05:00
  • c139f8b5d5 feat: Add endpoints for tokenize, detokenize and count tokens (#1136) Felipe Lorenz 2024-03-08 21:09:00 -05:00
  • 1f3156d4f2 fix: Check for existence of clip model path (#1264) Kevin Cao 2024-03-08 21:00:10 -05:00
  • 2811014bae feat: Switch embed to llama_get_embeddings_seq (#1263) Douglas Hanley 2024-03-08 19:59:35 -06:00
  • 40c6b54f68 feat: Update llama.cpp Andrei Betlen 2024-03-08 20:58:50 -05:00
  • 93dc56ace8 Update llama.cpp Andrei Betlen 2024-03-06 01:32:00 -05:00