add `mul_mat_q` parameter
This also fixes a crash when loading the 70B Llama 2 model on macOS with Metal and `n_gpu_layers=1`.
bretello committed
39978ccaf5b8ca85bc6b72d719e746ea305ad37f
Parent: 91bf8fa