# 🦙 Python Bindings for `llama.cpp`
[Documentation](https://abetlen.github.io/llama-cpp-python)
[Tests](https://github.com/abetlen/llama-cpp-python/actions/workflows/test.yaml)
[PyPI](https://pypi.org/project/llama-cpp-python/)
Simple Python bindings for **@ggerganov's** [`llama.cpp`](https://github.com/ggerganov/llama.cpp) library.
This package provides:
- Low-level access to the C API via a `ctypes` interface
- High-level Python API for text completion
- OpenAI-like API
- LangChain compatibility (see the sketch below)
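As a quick illustration of the LangChain side, the `LlamaCpp` wrapper from `langchain.llms` can drive these bindings directly. A minimal sketch, assuming `langchain` is installed and using a placeholder model path:

```python
from langchain.llms import LlamaCpp

# The model path is a placeholder; point it at any llama.cpp-compatible model.
llm = LlamaCpp(model_path="./models/7B/ggml-model.bin")
print(llm("Q: Name the planets in the solar system? A: "))
```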
Documentation is available at [https://abetlen.github.io/llama-cpp-python](https://abetlen.github.io/llama-cpp-python).
Detailed macOS Metal GPU install documentation is available at [docs/macos_install.md](docs/macos_install.md).
## Installation from PyPI (recommended)
Install from PyPI (requires a C compiler):
```bash
pip install llama-cpp-python
```
The above command will attempt to install the package and build `llama.cpp` from source.
This is the recommended installation method as it ensures that `llama.cpp` is built with the optimizations available on your system.
If you have previously installed `llama-cpp-python` through pip and want to upgrade your version or rebuild the package with different compiler options, please add the following flags to ensure that the package is rebuilt correctly:
```bash
pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir
```
Note: If you are using an Apple Silicon (M1) Mac, make sure you have installed a version of Python that supports the arm64 architecture. For example:
```bash
wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
bash Miniforge3-MacOSX-arm64.sh
```
Otherwise, the installation will build the x86 version of `llama.cpp`, which will be 10x slower on Apple Silicon (M1) Macs.
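To confirm that your interpreter itself targets arm64 before building, you can check the machine architecture Python reports:

```python
import platform

# Prints "arm64" on a native Apple Silicon build of Python; "x86_64" means
# the interpreter is running under Rosetta and the compiled llama.cpp
# library will target x86 as well.
print(platform.machine())
```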
### Installation with OpenBLAS / cuBLAS / CLBlast / Metal
`llama.cpp` supports multiple BLAS backends for faster processing.
Use the `FORCE_CMAKE=1` environment variable to force the use of `cmake` and install the pip package for the desired BLAS backend.
To install with OpenBLAS, pass the `-DLLAMA_OPENBLAS=on` CMake argument before installing:
```bash
CMAKE_ARGS="-DLLAMA_OPENBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
```
To install with cuBLAS, pass the `-DLLAMA_CUBLAS=on` CMake argument before installing:
```bash
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
```
To install with CLBlast, pass the `-DLLAMA_CLBLAST=on` CMake argument before installing:
```bash
CMAKE_ARGS="-DLLAMA_CLBLAST=on" FORCE_CMAKE=1 pip install llama-cpp-python
```
To install with Metal (MPS), pass the `-DLLAMA_METAL=on` CMake argument before installing:
```bash
CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install llama-cpp-python
```
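Building against cuBLAS or Metal enables GPU support, but layers are only offloaded when you ask for them at load time. A minimal sketch, assuming your installed version supports the `n_gpu_layers` parameter and using a placeholder model path:

```python
from llama_cpp import Llama

# Offload up to 32 transformer layers to the GPU; lower this number if
# the model does not fit in GPU memory.
llm = Llama(model_path="./models/7B/ggml-model.bin", n_gpu_layers=32)
```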
## High-level API
The high-level API provides a simple managed interface through the `Llama` class.
Below is a short example demonstrating how to use the high-level API to generate text:
```python
>>> from llama_cpp import Llama
>>> llm = Llama(model_path="./models/7B/ggml-model.bin")
>>> output = llm("Q: Name the planets in the solar system? A: ", max_tokens=32, stop=["Q:", "\n"], echo=True)
>>> print(output)
{
  "id": "cmpl-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "object": "text_completion",
  "created": 1679561337,
  "model": "./models/7B/ggml-model.bin",
  "choices": [
    {
      "text": "Q: Name the planets in the solar system? A: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, Neptune and Pluto.",
      "index": 0,
      "logprobs": None,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 28,
    "total_tokens": 42
  }
}
```
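The same call can stream tokens as they are generated by passing `stream=True`, which turns the result into an iterator of OpenAI-style completion chunks. A short sketch, reusing the `llm` object from the example above:

```python
>>> for chunk in llm("Q: Name the planets in the solar system? A: ", max_tokens=32, stream=True):
...     print(chunk["choices"][0]["text"], end="", flush=True)
```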
## Web Server
`llama-cpp-python` offers a web server which aims to act as a drop-in replacement for the OpenAI API.
This allows you to use `llama.cpp`-compatible models with any OpenAI-compatible client (language libraries, services, etc.).
To install the server package and get started:
```bash
pip install llama-cpp-python[server]
python3 -m llama_cpp.server --model models/7B/ggml-model.bin
```
Navigate to [http://localhost:8000/docs](http://localhost:8000/docs) to see the OpenAPI documentation.
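Since the server mimics the OpenAI REST interface, existing OpenAI client libraries can talk to it by overriding the base URL. A minimal sketch using the pre-1.0 `openai` Python package; the API key is a dummy value since the local server does not check it:

```python
import openai

openai.api_base = "http://localhost:8000/v1"  # point the client at the local server
openai.api_key = "sk-no-key-required"         # dummy; the local server ignores it

response = openai.Completion.create(
    model="local",  # effectively ignored by a single-model server
    prompt="Q: Name the planets in the solar system? A: ",
    max_tokens=32,
)
print(response["choices"][0]["text"])
```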
## Docker image
A Docker image is available on [GHCR](https://ghcr.io/abetlen/llama-cpp-python). To run the server:
```bash
docker run --rm -it -p 8000:8000 -v /path/to/models:/models -e MODEL=/models/ggml-model-name.bin ghcr.io/abetlen/llama-cpp-python:latest
```
## Low-level API
The low-level API is a direct [`ctypes`](https://docs.python.org/3/library/ctypes.html) binding to the C API provided by `llama.cpp`.
The entire low-level API can be found in [llama_cpp/llama_cpp.py](https://github.com/abetlen/llama-cpp-python/blob/master/llama_cpp/llama_cpp.py) and directly mirrors the C API in [llama.h](https://github.com/ggerganov/llama.cpp/blob/master/llama.h).
Below is a short example demonstrating how to use the low-level API to tokenize a prompt:
```python
>>> import llama_cpp
>>> import ctypes
>>> params = llama_cpp.llama_context_default_params()
# use bytes for char * params
>>> ctx = llama_cpp.llama_init_from_file(b"./models/7b/ggml-model.bin", params)
>>> max_tokens = params.n_ctx
# use ctypes arrays for array params
>>> tokens = (llama_cpp.llama_token * int(max_tokens))()
>>> n_tokens = llama_cpp.llama_tokenize(ctx, b"Q: Name the planets in the solar system? A: ", tokens, max_tokens, add_bos=llama_cpp.c_bool(True))
>>> llama_cpp.llama_free(ctx)
```
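To inspect what the tokenizer produced, each token id can be mapped back to its text piece while the context is still alive (i.e. before the `llama_free` call above). A short sketch, assuming your version exposes `llama_token_to_str`, which mirrors the C function of the same name and returns `bytes`:

```python
# Run this before llama_free(ctx): the context is needed to look up the vocabulary.
>>> for i in range(n_tokens):
...     print(llama_cpp.llama_token_to_str(ctx, tokens[i]).decode("utf-8"), end="")
```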
Check out the [examples folder](examples/low_level_api) for more examples of using the low-level API.
# Documentation
Documentation is available at [https://abetlen.github.io/llama-cpp-python](https://abetlen.github.io/llama-cpp-python).
If you find any issues with the documentation, please open an issue or submit a PR.
# Development
This package is under active development and I welcome any contributions.
To get started, clone the repository and install the package in development mode:
```bash
git clone --recurse-submodules git@github.com:abetlen/llama-cpp-python.git
cd llama-cpp-python

# Install with pip
pip install -e .

# if you want to use the fastapi / openapi server
pip install -e .[server]

# If you're a poetry user, installing will also include a virtual environment
poetry install --all-extras
. .venv/bin/activate

# Will need to be re-run any time vendor/llama.cpp is updated
python3 setup.py develop
```
# How does this compare to other Python bindings of `llama.cpp`?
I originally wrote this package for my own use with two goals in mind:
- Provide a simple process to install `llama.cpp` and access the full C API in `llama.h` from Python
- Provide a high-level Python API that can be used as a drop-in replacement for the OpenAI API so existing apps can be easily ported to use `llama.cpp`
Any contributions and changes to this package will be made with these goals in mind.
# License
This project is licensed under the terms of the MIT license.