Commit Graph

  • 76684141a5 ruby : fix dangling pointers, memory leak, and SEGV on parallel transcription (#3715) master KITAITI Makoto 2026-03-22 02:03:00 +09:00
  • 9386f23940 release : v1.8.4 v1.8.4 Georgi Gerganov 2026-03-19 10:40:13 +02:00
  • ef3463bb29 ci : update workflows Georgi Gerganov 2026-03-18 22:43:38 +02:00
  • 4bbce1e5b2 benches : update gg/benches-update Georgi Gerganov 2026-03-18 22:34:51 +02:00
  • f5b477ab09 sync : ggml Georgi Gerganov 2026-03-18 14:45:25 +02:00
  • b2be16208d ggml : bump version to 0.9.8 (ggml/1442) Georgi Gerganov 2026-03-16 20:15:14 +02:00
  • 945d3151d9 ggml : restore ggml_type_sizef() to avoid major version bump (ggml/1441) Georgi Gerganov 2026-03-16 20:09:25 +02:00
  • dc96116622 fix: VAD time mapping timestamp drift caused by overlap samples (#3711) lohopupa 2026-03-17 12:19:08 +06:00
  • 79218f51d0 go : handle EOF correctly in model download (#3671) Alan 2026-03-16 12:44:18 +01:00
  • 975b979834 py : replace deprecated openvino-dev with openvino>=2023.3.0 (#3678) Aiudadadadf 2026-03-16 12:41:54 +01:00
  • 21665eab4c examples : Allow max_len to be used for any output format (#3679) Gaël James 2026-03-16 12:33:56 +01:00
  • 136dc2eb12 server: return proper HTTP status codes for error responses (#3707) Igor Loskutov 2026-03-16 07:33:06 -04:00
  • 27fa20774a ggml : try fix arm build (#0) Georgi Gerganov 2026-03-16 09:11:13 +02:00
  • 2bc630f197 talk-llama : sync llama.cpp Georgi Gerganov 2026-03-16 07:16:46 +02:00
  • ab1252c19e sync : ggml Georgi Gerganov 2026-03-16 07:13:51 +02:00
  • d4bc312169 ggml : extend im2col f16 (ggml/1434) David366AI 2026-03-15 15:50:56 -04:00
  • 81ea958719 common : add nvfp4 (ggml/0) Georgi Gerganov 2026-03-15 19:56:19 +02:00
  • d7926e62d4 CUDA: limit number of FA stream-k CUDA blocks (llama/20586) Johannes Gäßler 2026-03-15 18:30:47 +01:00
  • 2fb6aea8ad ggml: avoid creating CUDA context during device init (llama/20595) Pascal 2026-03-15 17:42:56 +01:00
  • b327a321a2 ggml/hip: fix APU compatibility - soft error handling for hipMemAdviseSetCoarseGrain (llama/20536) MoonShadow 2026-03-16 00:23:58 +08:00
  • 6770239830 ggml : guard against sumq2 being 0 in IQ4_NL (llama/20460) Bartowski 2026-03-15 04:47:28 -04:00
  • 55c66106af cuda : add RDNA4-specific MMVQ parameter table for bs=1 decode (llama/19478) PikaPikachu 2026-03-15 15:33:39 +08:00
  • cd02195b8f vulkan: use graphics queue on AMD (llama/20551) Ruben Ortlam 2026-03-15 08:18:54 +01:00
  • b312018435 metal : add FA specialization for HSK = 320, HSV = 256 (llama/20549) Georgi Gerganov 2026-03-14 23:15:47 +02:00
  • 55f8cfdaed hexagon: Q4_0 and MXFP4 repack fixes (llama/20527) Max Krasnyansky 2026-03-14 11:09:08 -07:00
  • c5f9a49b51 add op gated_delta_net (llama/20455) Neo Zhang 2026-03-14 22:01:57 +08:00
  • 93d09fdb23 ggml : add native AVX512-FP16 support for F16 operations (llama/20529) Adrien Gallouët 2026-03-14 10:06:14 +01:00
  • 8ad5cb1e9d Use fp32 in cuBLAS V100 to avoid overflows, env variables to override cuBLAS compute type (llama/19959) Wallentri 2026-03-14 10:43:13 +03:00
  • 96b163e874 ggml : add OpenVINO backend (llama/15307) Zijun Yu 2026-03-14 13:56:55 +08:00
  • 46aad766f5 Fix data race in CUDA's "cpy" kernel (influences GGML's DUP, CONT operations). (llama/20507) Rail Chabdarov 2026-03-14 06:19:44 +01:00
  • a31600d8e3 opencl: fix l2_norm (llama/20480) lhez 2026-03-13 22:18:52 -07:00
  • c7abcd577b graph : remove redundant GDN state transposes (llama/20443) Georgi Gerganov 2026-03-13 22:12:54 +02:00
  • 5905e8708f ggml-cpu: add RVV vec dot kernels for quantization types (llama/18859) rehan-10xengineer 2026-03-13 20:36:04 +05:00
  • 9bfa81d262 ggml : fix typo gmml (llama/20512) Adrien Gallouët 2026-03-13 14:36:13 +01:00
  • f1f5f43d69 metal : fix l2 norm scale (llama/20493) Georgi Gerganov 2026-03-13 11:43:20 +02:00
  • 2ed6dc0222 llama : disable graph reuse with pipeline parallelism (llama/20463) Georgi Gerganov 2026-03-12 21:04:13 +02:00
  • 2450919665 vulkan: add GATED_DELTA_NET op support (llama/20334) ProgenyAlpha 2026-03-12 06:32:04 -04:00
  • 44c12c642e vulkan: fix SSM_CONV PP scaling with large ubatch sizes (llama/20379) ProgenyAlpha 2026-03-12 05:03:18 -04:00
  • 7e816a99d2 sync : ggml Georgi Gerganov 2026-03-16 07:13:14 +02:00
  • b48ffe28fc metal : avoid divisions in bin kernel (llama/20426) Georgi Gerganov 2026-03-16 07:12:50 +02:00
  • 7ccebd5264 sync : ggml Georgi Gerganov 2026-03-16 07:12:37 +02:00
  • 86e312d61d vulkan: fix l2_norm epsilon handling (llama/20350) Jeff Bolz 2026-03-12 00:39:41 -05:00
  • 6c5e3aac3e vulkan: fix OOB check in flash_attn_mask_opt (llama/20296) Jeff Bolz 2026-03-12 00:35:49 -05:00
  • 26ee4f7362 vulkan: Fix ErrorOutOfHostMemory on Intel GPU when loading large models with --no-mmap (llama/20059) Masato Nakasaka 2026-03-11 22:30:16 -07:00
  • d5772cf7b2 opencl: use larger workgroup size for get_rows (llama/20316) lhez 2026-03-11 22:03:27 -07:00
  • 193781cf0e opencl: add cumsum op (llama/18981) shaofeiqi 2026-03-11 22:03:07 -07:00
  • f5ba865378 hip: compile debug builds with -O2 on hip to avoid a compiler bug (llama/20392) uvos 2026-03-12 03:37:10 +01:00
  • 5267523829 ggml-webgpu: Add support for GGML_OP_REPEAT (llama/20230) Masashi Yoshimura 2026-03-12 06:40:36 +09:00
  • d73fe25267 llama : enable chunked fused GDN path (llama/20340) Georgi Gerganov 2026-03-11 22:46:40 +02:00
  • e4021d4071 ggml : add NVFP4 quantization type support (llama/19769) Richard Davison 2026-03-11 21:02:54 +01:00
  • 5d3a5447c8 llama : add support for Nemotron 3 Super (llama/20411) Daniel Bevenius 2026-03-11 19:27:53 +01:00
  • e2aa5c73f3 metal : fix capture_compute counter logic (llama/20410) Georgi Gerganov 2026-03-11 18:38:22 +02:00
  • 0e1e76f93b metal : fix q5_k mul_mv register spill (llama/20399) Georgi Gerganov 2026-03-11 16:25:27 +02:00
  • c2e384f21e metal : add env var to trigger graph capture (llama/20398) Georgi Gerganov 2026-03-11 16:25:10 +02:00
  • 8b335550cf ggml-cuda: gdn use shared mem for HIP (llama/20366) uvos 2026-03-11 06:06:19 +01:00
  • 7c9a16c565 cuda/hip: fix loop unrolling in ssm-conv (llama/20369) uvos 2026-03-11 06:04:32 +01:00
  • 286387ef0a fix op rope, add rope_back (llama/20293) Neo Zhang 2026-03-11 09:53:34 +08:00
  • 72c7a2532d fix for failed UT case: ACC, L2_NORM, UPSCALE, fused_glu, unary (llama/20283) Neo Zhang 2026-03-11 09:53:05 +08:00
  • 1e05b10d67 ggml : bump RPC version (llama/20330) Georgi Gerganov 2026-03-10 21:36:57 +02:00
  • fddedc5cbc ggml webgpu: faster normal quant and some k-quant matrix operations, better shader parameter handling (llama/20173) Reese Levine 2026-03-10 09:14:27 -07:00
  • dfa6858d02 kleidiai : support for concurrent sme and neon kernel execution (llama/20070) Charles Xu 2026-03-10 08:25:25 +01:00
  • bd64b8af4d ggml-cpu: add RVV repack GEMM and GEMV for quantization types (llama/19121) Taimur Ahmad 2026-03-10 11:49:52 +05:00
  • cabe3d95f4 metal: handle command buffer failures gracefully in synchronize (llama/20306) Julian Pscheid 2026-03-09 23:32:24 -07:00
  • ae21974f4f metal : extend mul_mv_ext to BF16, Q2_K, Q3_K (llama/20250) Paul Flynn 2026-03-09 10:48:12 -04:00
  • d19c65e9da metal : add upscale (llama/20284) Georgi Gerganov 2026-03-09 16:45:11 +02:00
  • 3984ae384d ggml-cuda: disable gdn for musa (llama/20278) Aman Gupta 2026-03-09 16:15:36 +08:00
  • 65dbf3c31a ggml-vulkan: add SGN operator, auto-generate Vulkan.csv and ops.md (llama/20219) Bertay Eren 2026-03-09 09:24:16 +03:00
  • 890c047e30 vulkan: skip zero size tensors in backend copies (llama/20233) Ruben Ortlam 2026-03-09 07:23:45 +01:00
  • f099ed27b8 cuda : display total and free VRAM capacity during device initialization (llama/20185) Michael Huang 2026-03-08 21:45:43 -07:00
  • 8d97f59639 ggml-vulkan: Add ELU op support (llama/20183) GiantPrince 2026-03-08 07:38:17 -04:00
  • 4b0653a792 vulkan: Fix data races in coopmat1 mul_mat(_id) (llama/20084) Jeff Bolz 2026-03-08 06:33:48 -05:00
  • 8a9b0ba1df support Flash Attention for fp32/fp16/Q4/Q5/Q8 (llama/20190) Neo Zhang 2026-03-08 12:00:07 +08:00
  • 49489bfbd1 ggml: add GATED_DELTA_NET op (llama/19504) Aman Gupta 2026-03-07 15:41:10 +08:00
  • 910034df28 opencl: add l2_norm (llama/20160) lhez 2026-03-06 18:03:05 -08:00
  • 6e063fae5a quants : Add memsets and other fixes for IQ quants (llama/19861) Bartowski 2026-03-06 16:06:56 -05:00
  • 78b3801d54 hexagon: add f32 ssm_conv op (llama/20122) Todor Boinovski 2026-03-06 09:59:26 -08:00
  • 247ec204d8 cpu: skip redundant ROPE cache updates (llama/20149) Max Krasnyansky 2026-03-06 08:32:40 -08:00
  • d658720fa5 ggml-cuda: add mem check for fusion (llama/19916) Aman Gupta 2026-03-07 00:05:43 +08:00
  • 5d9b73dc06 ggml: update comments for backends which have no memory to report (llama/20157) Aaron Teo 2026-03-06 23:24:38 +08:00
  • 548f2e5190 ggml-cpu: Fix gcc 15 ICE on ppc64le (ggml/20083) (llama/20130) shalinib-ibm 2026-03-06 20:52:39 +05:30
  • d2d235f467 CUDA: use shared mem for ssm_conv (llama/20128) Aman Gupta 2026-03-06 23:09:59 +08:00
  • 596b655dbd ggml-cpu: fix data race for debug asserts (llama/20148) Johannes Gäßler 2026-03-06 09:12:49 +01:00
  • 1d94b0be4f opencl: add neg, exp and diag (llama/20127) lhez 2026-03-05 21:16:39 -08:00
  • f56fb1be3b hexagon: add fp16 support for binary ops: add,sub,mul,div (llama/20139) YardenTal44 2026-03-06 04:29:13 +02:00
  • 51f397c1af CUDA: Improve performance via fewer synchronizations between tokens (llama/17795) Andreas Kieslinger 2026-03-05 12:53:21 +01:00
  • 67abc63e9d chore : correct typos [no ci] (llama/20041) Marcel Petrick 2026-03-05 08:50:21 +01:00
  • 2e79b85f66 hexagon: Flash Attention optimizations (dma, mpyacc, multi-row) and MatMul updates (llama/20118) Max Krasnyansky 2026-03-04 21:55:29 -08:00
  • 2c50962528 opencl: add SET, support i32 for CPY, minor refactor for cpy (llama/20101) lhez 2026-03-04 21:32:26 -08:00
  • 4834971a4f Fix wait logic for inflight jobs (llama/20096) Nikhil Jain 2026-03-04 11:54:55 -08:00
  • 8d78d40946 Add concat op to webgpu. (llama/20068) Masashi Yoshimura 2026-03-05 04:19:00 +09:00
  • 5d25427e58 ggml: fix ggml_is_contiguous_n for ne == 1 (llama/20092) Johannes Gäßler 2026-03-04 12:04:31 +01:00
  • b1b018dfd1 ggml : use a simple std::thread in AMX without OpenMP (llama/20074) Adrien Gallouët 2026-03-04 11:57:09 +01:00
  • 169d723fa0 kleidiai : add sme fp16 compute path for q4_0 gemm on aarch64 (llama/20043) Charles Xu 2026-03-03 10:40:26 +01:00
  • 3a96680718 opencl: add optimized q4_1 mm kernel for adreno (llama/19840) shaofeiqi 2026-03-02 19:49:41 -08:00
  • 3145384715 ggml webgpu: fix workgroup dispatch limit for large batch sizes (llama/19965) Abhijit Ramesh 2026-03-02 19:35:11 -08:00
  • 22034a5f6f ggml webgpu: Clean up per-thread parameter buffer pool and job submission logic (llama/19772) Nikhil Jain 2026-03-02 10:23:34 -08:00
  • de686fafad ggml-webgpu: Support non-contiguous src0 and overlapping src0/src1 in binary ops (llama/19850) Masashi Yoshimura 2026-03-03 00:59:53 +09:00
  • 923a292429 vulkan: tune MMVQ for Intel Windows (llama/19988) Ruben Ortlam 2026-03-02 15:58:25 +01:00
  • e2be9edd5a ggml-cpu: optimise s390x multiply extend instructions (llama/20032) Aaron Teo 2026-03-02 16:23:56 +08:00
  • 2a9649c420 vulkan: improve partial offloading performance on AMD (llama/19976) Ruben Ortlam 2026-03-01 17:32:14 +01:00