Python bindings for llama.cpp
Add docstring for the n_gpu_layers argument and make -1 offload all layers
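
A minimal sketch of the behaviour the change describes, assuming that -1 is translated into a very large layer count before it reaches the backend. The helper name `resolve_n_gpu_layers` and the constant `INT32_MAX` are hypothetical and used only for illustration; they are not claimed to be the names or the exact logic used in the bindings.

```python
# Hypothetical illustration of the "-1 offloads all layers" convention.
# Neither name below is taken from the bindings themselves.

# Assumption: a count at least as large as any model's layer count
# is treated by the backend as "offload everything".
INT32_MAX = 0x7FFFFFFF


def resolve_n_gpu_layers(n_gpu_layers: int) -> int:
    """Return the layer count to pass to the backend.

    n_gpu_layers: number of layers to offload to the GPU.
        0 keeps every layer on the CPU, a positive value offloads
        that many layers, and -1 offloads all layers.
    """
    if n_gpu_layers == -1:
        return INT32_MAX
    return n_gpu_layers


if __name__ == "__main__":
    for requested in (0, 20, -1):
        print(requested, "->", resolve_n_gpu_layers(requested))
```

With the bindings installed, a caller would typically request full offload by passing `n_gpu_layers=-1` to the `Llama` constructor along with a model path of their own.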