vulkan: support solve_tri with larger N/K values (llama/17781)
Split N into chunks to fit into shared memory. If K > 128, use a larger workgroup with enough invocations. Add perf tests matching qwen3next.
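The chunking idea above can be illustrated with a rough CPU sketch. This is not the shader from this commit. It assumes a lower-triangular A of size N x N and a right-hand side B of size N x K, solved by forward substitution with one GPU invocation per column of B. The names solve_tri_chunked, chunk_rows, and pick_workgroup_size are hypothetical, and the 128 and 1024 limits are placeholder values.

```python
def solve_tri_chunked(A, B, chunk_rows):
    """Solve A X = B by forward substitution, taking rows of A in chunks.

    Assumption: A is lower triangular with a nonzero diagonal.
    chunk_rows stands in for how many rows fit in shared memory.
    """
    n = len(A)
    k = len(B[0])
    X = [[0.0] * k for _ in range(n)]
    for start in range(0, n, chunk_rows):
        end = min(start + chunk_rows, n)
        # Stand-in for staging one chunk of A rows into shared memory.
        a_chunk = [A[r][:] for r in range(start, end)]
        for r in range(start, end):
            row = a_chunk[r - start]
            # On a GPU, each column c would map to one invocation.
            for c in range(k):
                acc = B[r][c]
                for j in range(r):
                    acc -= row[j] * X[j][c]
                X[r][c] = acc / row[r]
    return X


def pick_workgroup_size(k, base=128, max_size=1024):
    """Hypothetical helper: grow the workgroup until it covers K columns."""
    size = base
    while size < k and size < max_size:
        size *= 2
    return size


if __name__ == "__main__":
    A = [[2.0, 0.0, 0.0], [1.0, 1.0, 0.0], [4.0, 2.0, 4.0]]
    B = [[2.0, 4.0], [3.0, 5.0], [12.0, 20.0]]
    print(solve_tri_chunked(A, B, chunk_rows=2))
    print(pick_workgroup_size(200))
```

The result does not depend on chunk_rows; only the amount of A held at once changes, which is the property that lets a larger N fit within a fixed shared memory budget.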
Jeff Bolz committed
875d8614733338c24d729de9f58df5d374f0f4db
Parent: 41cf229
Committed by Georgi Gerganov <ggerganov@gmail.com>
on 12/12/2025, 3:53:20 PM