Commit Graph

  • 296304b60b fix(server): Fix bug in FastAPI streaming response where dependency was released before request completes causing SEGFAULT Andrei Betlen 2024-07-02 02:49:20 -04:00
  • dc20e8c342 fix: Copy dependencies for windows Andrei Betlen 2024-07-01 23:28:19 -04:00
  • 73fe013a48 fix: Fix RPATH so it works on macos Andrei Betlen 2024-07-01 23:17:02 -04:00
  • e51f200f2c fix: Fix installation location for shared libraries Andrei Betlen 2024-07-01 23:11:49 -04:00
  • d5f6a15a9b fix: force $ORIGIN rpath for shared library files Andrei Betlen 2024-07-01 23:03:26 -04:00
  • 139774b8b0 fix: Update shared library rpath Andrei Betlen 2024-07-01 22:21:34 -04:00
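The run of rpath fixes above (139774b8b0 through 73fe013a48) is about making bundled shared libraries resolvable relative to the installed package rather than the build tree. A generic CMake sketch of the `$ORIGIN` approach — the variables are standard CMake, but the target name `llama` and layout are assumptions, not the project's actual build files:

```cmake
# Make installed binaries look up shared libraries next to themselves
# ($ORIGIN on Linux ELF, @loader_path on macOS Mach-O) instead of at
# absolute build-tree paths.
set(CMAKE_INSTALL_RPATH "$ORIGIN")
if(APPLE)
  set(CMAKE_INSTALL_RPATH "@loader_path")
endif()
# Embed the install rpath already at build time.
set(CMAKE_BUILD_WITH_INSTALL_RPATH ON)
# Ship the libraries in a lib/ subdirectory of the package.
install(TARGETS llama LIBRARY DESTINATION lib)
```

With `$ORIGIN` (or `@loader_path`) baked in, the wheel works wherever pip unpacks it, which is what the macOS and install-location fixes in this cluster are chasing.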
  • 92bad6e510 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-07-01 22:03:21 -04:00
  • c546c94b48 misc: Install shared libraries to lib subdirectory Andrei Betlen 2024-07-01 22:03:19 -04:00
  • 73ddf297be fix(ci): Fix the CUDA workflow (#1551) oobabooga 2024-07-01 22:31:25 -03:00
  • bf5e0bb4b1 fix(server): Update embeddings=False by default. Embeddings should be enabled by default for embedding models. Andrei Betlen 2024-07-01 21:29:13 -04:00
  • 117cbb2f53 feat: Update llama.cpp Andrei Betlen 2024-07-01 21:28:11 -04:00
  • 19e3a54f0a Merge branch 'main' into docker Olivier DEBAUCHE 2024-06-23 03:40:28 +02:00
  • 04959f1884 feat: Update llama_cpp.py bindings Andrei Betlen 2024-06-21 16:56:15 -04:00
  • 35c980eb2e chore(deps): bump pypa/cibuildwheel from 2.18.1 to 2.19.1 (#1527) dependabot[bot] 2024-06-21 12:10:43 -04:00
  • 398fe81547 chore(deps): bump docker/build-push-action from 5 to 6 (#1539) dependabot[bot] 2024-06-21 12:10:34 -04:00
  • 27d53589ff docs: Update readme examples to use newer Qwen2 model (#1544) Jon Craton 2024-06-21 12:10:15 -04:00
  • 5beec1a1fd feat: Update llama.cpp Andrei Betlen 2024-06-21 12:09:14 -04:00
  • d98a24a25b docs: Remove references to deprecated opencl backend. Closes #1512 Andrei Betlen 2024-06-20 10:50:40 -04:00
  • 6c331909ca chore: Bump version v0.2.79-metal v0.2.79 Andrei Betlen 2024-06-19 10:10:01 -04:00
  • 554fd08e7d feat: Update llama.cpp Andrei Betlen 2024-06-19 10:07:28 -04:00
  • 4c1d74c0ae fix: Make destructor to automatically call .close() method on Llama class. Andrei Betlen 2024-06-19 10:07:20 -04:00
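Commit 4c1d74c0ae (together with 320a5d7ea5 further down, which introduced `.close()`) makes the destructor delegate to an explicit, idempotent close method. A minimal self-contained sketch of that pattern — the `FakeModel` class is purely illustrative, not the actual `llama_cpp.Llama` implementation:

```python
import contextlib


class FakeModel:
    """Illustrates an idempotent close() that the destructor also calls."""

    def __init__(self):
        self.closed = False  # stands in for natively allocated resources

    def close(self):
        # Safe to call repeatedly; frees resources exactly once.
        if not self.closed:
            self.closed = True

    def __del__(self):
        # The destructor falls back to close(), so resources are freed
        # even when the caller forgets to close explicitly.
        self.close()


# Explicit, scoped cleanup via contextlib.closing:
with contextlib.closing(FakeModel()) as model:
    assert not model.closed
```

Routing `__del__` through `close()` keeps one cleanup path, so explicit calls, context-manager exit, and garbage collection all free the model the same way.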
  • f4491c4903 feat: Update llama.cpp Andrei Betlen 2024-06-17 11:56:03 -04:00
  • ed15d2e1a3 Update Dockerfile Olivier DEBAUCHE 2024-06-16 05:11:01 +02:00
  • 7c086bafc5 Update Dockerfile Olivier DEBAUCHE 2024-06-16 05:10:28 +02:00
  • 4db6bb5d31 Update build-docker.yaml Olivier DEBAUCHE 2024-06-14 16:07:26 +02:00
  • e74b6592ce Update Dockerfile Olivier DEBAUCHE 2024-06-14 15:43:05 +02:00
  • c4919f034c Update Dockerfile Olivier DEBAUCHE 2024-06-14 15:42:39 +02:00
  • 67a314f680 Update Dockerfile Olivier DEBAUCHE 2024-06-14 15:41:11 +02:00
  • 2bea4f3ff0 Update Dockerfile Olivier DEBAUCHE 2024-06-14 15:37:03 +02:00
  • acfd90a8dc Update Dockerfile Olivier DEBAUCHE 2024-06-14 15:36:32 +02:00
  • 299ad0dbfa Update Dockerfile Olivier DEBAUCHE 2024-06-14 15:36:13 +02:00
  • 7a1ca4ec0a Update Dockerfile Olivier DEBAUCHE 2024-06-14 15:18:44 +02:00
  • 8401c6f2d1 feat: Update llama.cpp Andrei Betlen 2024-06-13 11:31:31 -04:00
  • 9e396b3ebd feat: Update workflows and pre-built wheels (#1416) Olivier DEBAUCHE 2024-06-13 16:19:57 +02:00
  • 5af81634cb chore(deps): bump pypa/cibuildwheel from 2.18.1 to 2.19.0 (#1522) dependabot[bot] 2024-06-13 10:12:02 -04:00
  • 320a5d7ea5 feat: Add .close() method to Llama class to explicitly free model from memory (#1513) Junpei Kawamoto 2024-06-13 02:16:14 -06:00
  • dbcf64cf07 feat: Support SPM infill (#1492) Sigbjørn Skjæret 2024-06-13 09:45:24 +02:00
  • e342161371 feat: Update llama.cpp Andrei Betlen 2024-06-13 03:38:11 -04:00
  • 86a38ad4a0 chore: Bump version v0.2.78-metal v0.2.78 Andrei Betlen 2024-06-10 11:14:33 -04:00
  • 1615eb9e5b feat: Update llama.cpp Andrei Betlen 2024-06-10 11:05:45 -04:00
  • 83d6b26e6f feat: Update llama.cpp Andrei Betlen 2024-06-08 23:14:22 -04:00
  • 255e1b4495 feat: Update llama.cpp Andrei Betlen 2024-06-07 02:02:12 -04:00
  • d634efcdd9 feat: adding rpc_servers parameter to Llama class (#1477) v0.2.77-cu124 v0.2.77-cu123 v0.2.77-cu122 v0.2.77-cu121 nullname 2024-06-04 22:38:21 +08:00
  • 2b5438d71b Add rpc servers to server options dev-add-rpc Andrei Betlen 2024-06-04 10:37:20 -04:00
  • 1e42468a27 Only set rpc_servers when provided Andrei Betlen 2024-06-04 10:37:02 -04:00
  • 6e0642ca19 fix: fix logprobs when BOS is not present (#1471) Asghar Ghorbani 2024-06-04 16:18:38 +02:00
  • 027f7bc678 fix: Avoid duplicate special tokens in chat formats (#1439) Sigbjørn Skjæret 2024-06-04 16:15:41 +02:00
  • 71805353ef Use python warnings module remove-unwanted-bos Andrei Betlen 2024-06-04 10:13:15 -04:00
  • 9f14fd29ab Merge branch 'main' into remove-unwanted-bos Andrei Betlen 2024-06-04 10:09:29 -04:00
  • ff88fcbc20 update readme hongruichen 2024-05-29 12:18:51 +08:00
  • aeebfba860 Revert "enable llama rpc by default" hongruichen 2024-05-23 10:09:27 +08:00
  • 2f7f83e121 add rpc package hongruichen 2024-05-22 21:12:43 +08:00
  • fd7bcc951d convert string to byte hongruichen 2024-05-21 23:18:04 +08:00
  • 9e1d80f1d0 enable llama rpc by default hongruichen 2024-05-21 22:42:38 +08:00
  • e8b4f32da3 passthru rpc_servers params hongruichen 2024-05-21 22:41:43 +08:00
  • 951e39caf9 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main v0.2.77-metal v0.2.77 Andrei Betlen 2024-06-04 00:49:26 -04:00
  • c3ef41ba06 chore: Bump version Andrei Betlen 2024-06-04 00:49:24 -04:00
  • ae5682f500 fix: Disable Windows+CUDA workaround when compiling for HIPBLAS (#1493) Engininja2 2024-06-03 22:42:34 -06:00
  • cd3f1bb387 feat: Update llama.cpp Andrei Betlen 2024-06-04 00:35:47 -04:00
  • 6b018e00b1 misc: Improve llava error messages Andrei Betlen 2024-06-03 11:19:10 -04:00
  • a6457ba74b Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-06-01 18:10:13 -04:00
  • af3ed503e9 fix: Use numpy recarray for candidates data, fixes bug with temp < 0 Andrei Betlen 2024-06-01 18:09:24 -04:00
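Commit af3ed503e9 moves the candidates data into a numpy recarray so fields can be read and written in place without round-tripping through Python lists. A small generic illustration of recarray field access — the field names `id` and `logit` are made up for the example, not llama_cpp's actual layout:

```python
import numpy as np

# A structured array with named fields, viewed as a recarray so the
# fields are accessible as attributes and mutable in place.
candidates = np.zeros(
    4, dtype=[("id", np.int32), ("logit", np.float32)]
).view(np.recarray)
candidates.id = np.arange(4)              # vectorized write, no Python lists
candidates.logit = [0.1, 2.0, -1.0, 0.5]  # cast to float32 on assignment

# Pick the candidate id with the highest logit.
best = candidates.id[np.argmax(candidates.logit)]
```

Because each field is a contiguous typed view, operations like the `argmax` above stay in C, which is the performance point behind the fix.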
  • a6e5917ca4 move to another PR Sigbjørn Skjæret 2024-05-29 20:24:42 +02:00
  • 165b4dc6c1 fix: Fix typo in Llama3VisionAlphaChatHandler. Closes #1488 Andrei Betlen 2024-05-29 02:29:44 -04:00
  • 91d05aba46 fix: adjust kv_override member names to match llama.cpp Andrei Betlen 2024-05-29 02:28:58 -04:00
  • df45a4b3fe fix: fix string value kv_overrides. Closes #1487 Andrei Betlen 2024-05-29 02:02:22 -04:00
  • 10b7c50cd2 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-05-28 22:52:30 -04:00
  • 2907c26906 misc: Update debug build to keep all debug symbols for easier gdb debugging Andrei Betlen 2024-05-28 22:52:28 -04:00
  • c26004b1be feat: Update llama.cpp Andrei Betlen 2024-05-28 22:52:03 -04:00
  • c564007ff6 chore(deps): bump pypa/cibuildwheel from 2.18.0 to 2.18.1 (#1472) dependabot[bot] 2024-05-27 10:57:17 -04:00
  • 454c9bb1cb feat: Update llama.cpp Andrei Betlen 2024-05-27 10:51:57 -04:00
  • 2d89964147 docs: Fix table formatting Andrei Betlen 2024-05-24 11:55:41 -04:00
  • 9e8d7d55bd fix(docs): Fix link typo Andrei Betlen 2024-05-24 11:55:01 -04:00
  • ec43e8920f docs: Update multi-modal model section Andrei Betlen 2024-05-24 11:54:15 -04:00
  • a4c9ab885d chore: Bump version v0.2.76-metal v0.2.76-cu124 v0.2.76-cu123 v0.2.76-cu122 v0.2.76-cu121 v0.2.76 Andrei Betlen 2024-05-24 01:59:25 -04:00
  • 5cae1040e3 feat: Improve Llama.eval performance by avoiding list conversion (#1476) Linghan Zhong 2024-05-24 00:49:44 -05:00
  • 087cc0b036 feat: Update llama.cpp Andrei Betlen 2024-05-24 01:43:36 -04:00
  • b9a1e61f24 changed to a warning Sigbjørn Skjæret 2024-05-22 11:21:34 +02:00
  • 5a595f035a feat: Update llama.cpp Andrei Betlen 2024-05-22 02:40:31 -04:00
  • 3dbfec74e7 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-05-18 01:19:20 -04:00
  • d8a3b013c3 feat: Update llama.cpp Andrei Betlen 2024-05-18 01:19:19 -04:00
  • 03f171e810 example: LLM inference with Ray Serve (#1465) Radoslav Gerganov 2024-05-17 20:27:26 +03:00
  • b564d05806 chore: Bump version v0.2.75-metal v0.2.75-cu124 v0.2.75-cu123 v0.2.75-cu122 v0.2.75-cu121 v0.2.75 Andrei Betlen 2024-05-16 00:41:21 -04:00
  • d99a6ba607 fix: segfault for models without eos / bos tokens. Closes #1463 Andrei Betlen 2024-05-16 00:37:27 -04:00
  • e811a81066 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-05-15 23:59:18 -04:00
  • ca8e3c967d feat: Update llama.cpp Andrei Betlen 2024-05-15 23:59:17 -04:00
  • 5212fb08ae feat: add MinTokensLogitProcessor and min_tokens argument to server (#1333) twaka 2024-05-14 22:50:53 +09:00
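Commit 5212fb08ae adds a `MinTokensLogitProcessor` and a `min_tokens` server argument so generation cannot stop before a minimum number of tokens has been produced. A self-contained sketch of the underlying idea — the function name, `EOS_TOKEN` id, and parameters below are illustrative assumptions, not the actual server API:

```python
import numpy as np

EOS_TOKEN = 2  # assumed end-of-sequence token id for illustration


def suppress_eos(input_ids, scores, min_tokens):
    """Force the EOS logit to -inf until min_tokens tokens exist."""
    if len(input_ids) < min_tokens:
        scores = scores.copy()  # leave the caller's buffer untouched
        scores[EOS_TOKEN] = -np.inf
    return scores


scores = np.array([0.5, 1.0, 9.0, 0.2], dtype=np.float32)
masked = suppress_eos(input_ids=[101], scores=scores, min_tokens=8)
# EOS (index 2) can no longer win the argmax while under min_tokens.
```

Masking the EOS logit rather than post-filtering the sampled token keeps the rest of the distribution intact, so sampling proceeds normally among the non-terminal tokens.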
  • 389e09c2f5 misc: Remove unnecessary metadata lookups (#1448) Sigbjørn Skjæret 2024-05-14 15:44:09 +02:00
  • 4b54f79330 chore(deps): bump pypa/cibuildwheel from 2.17.0 to 2.18.0 (#1453) dependabot[bot] 2024-05-14 09:35:52 -04:00
  • 50f5c74ecf Update llama.cpp Andrei Betlen 2024-05-14 09:30:04 -04:00
  • 43ba1526c8 feat: Update llama.cpp Andrei Betlen 2024-05-13 09:39:08 -04:00
  • 3f8e17af63 fix(ci): Use version without extra platform tag in pep503 index Andrei Betlen 2024-05-12 11:45:55 -04:00
  • 3c19faa0d4 chore: Bump version v0.2.74-metal v0.2.74-cu124 v0.2.74-cu123 v0.2.74-cu122 v0.2.74-cu121 v0.2.74 Andrei Betlen 2024-05-12 10:32:52 -04:00
  • 3fe8e9a8f3 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main Andrei Betlen 2024-05-12 10:30:24 -04:00
  • 9dc5e20fb6 feat: Update llama.cpp Andrei Betlen 2024-05-12 10:30:23 -04:00
  • aef3b1c31a align test with new response Sigbjørn Skjæret 2024-05-11 10:51:34 +02:00
  • aa25cd3dbb typo-- Sigbjørn Skjæret 2024-05-11 10:44:28 +02:00
  • 2e26f2d4d1 just let tokenizer do the job Sigbjørn Skjæret 2024-05-11 10:42:37 +02:00
  • bb6cf4f913 proper bos/eos detection Sigbjørn Skjæret 2024-05-11 10:27:13 +02:00
  • 06cf25d1ed add some missing internals Sigbjørn Skjæret 2024-05-11 09:49:18 +02:00