Commit Graph

  • ca3f6bbd3c cuda: cap grid.y at 65535 in non-contiguous dequantize/convert kernels (llama/19999) oobabooga 2026-03-01 02:40:22 -03:00
  • 699eaf3a10 CUDA: add CDNA3 MFMA support for flash attention MMA kernel (llama/19806) Jayant Lohia 2026-02-28 00:07:26 +05:30
  • b524b5a1f0 ggml-cpu: add repack for mxfp4 (llama/19738) Aman Gupta 2026-02-27 18:15:09 +08:00
  • 30c5194c96 ruby : null-check (#3689) KITAITI Makoto 2026-03-05 14:36:42 +09:00
  • 9453b4b9be gguf : sync (ggml/0) Georgi Gerganov 2026-02-27 12:24:59 +02:00
  • aaf8bdf3b8 scripts : sync gguf Georgi Gerganov 2026-02-27 12:24:33 +02:00
  • 84f8db71d8 talk-llama : sync llama.cpp Georgi Gerganov 2026-02-27 12:23:40 +02:00
  • 4734056067 sync : ggml Georgi Gerganov 2026-02-27 12:19:27 +02:00
  • 64f48603e6 replace the magic number 768 by max work group size to support iGPU (llama/19920) Neo Zhang 2026-02-27 09:26:07 +08:00
  • 9c1fd5cc6e ggml-zendnn: update code for latest ZenDNN API (llama/19923) Vishal Singh 2026-02-27 06:13:41 +05:30
  • 316d921c1a ggml : fix AMX and add batched support (llama/19925) Adrien Gallouët 2026-02-26 21:39:11 +01:00
  • e722ee1bf5 vulkan: fix fp16 Flash Attention on Windows AMD RDNA2 and below (llama/19921) Ruben Ortlam 2026-02-26 19:11:04 +01:00
  • f877e1b202 ggml-virtgpu: improve the reliability of the code (llama/19846) Kevin Pouget 2026-02-26 13:00:57 +01:00
  • 4cac408c60 support permuted, remove check s0/s10 (llama/19889) Neo Zhang 2026-02-26 10:27:20 +08:00
  • fb55b2654b vulkan: check for memory overlap before doing fusion (llama/19768) Jeff Bolz 2026-02-25 11:25:38 -06:00
  • 279be33a83 ggml/gguf : prevent integer overflows (llama/19856) Georgi Gerganov 2026-02-24 20:17:11 +02:00
  • 90800b5aa5 Vulkan Scalar Flash Attention Refactor (llama/19625) Ruben Ortlam 2026-02-24 08:35:48 +01:00
  • dcc877688d vulkan: fix coopmat1 without bf16 support (llama/19793) Jeff Bolz 2026-02-24 00:48:32 -06:00
  • 344eae3d22 vulkan: fix data race in mul_mat_id shader (llama/19790) Jeff Bolz 2026-02-24 00:43:12 -06:00
  • 53b571a47e hexagon: refactor all Ops to use local context struct (llama/19819) Max Krasnyansky 2026-02-23 16:32:14 -08:00
  • 06fbd9c5f2 ggml-cpu: arm64: q5_K repack gemm and gemv (and generic) implementations (dotprod) (llama/19356) Alberto Cabrera Pérez 2026-02-23 12:42:52 +00:00
  • 98915f889a Improve CUDA graph capture (llama/19754) Gaurav Garg 2026-02-21 15:09:36 +05:30
  • 0c10a15447 ggml-cpu: add RVV vec dot kernels for quantization types (llama/18784) Taimur Ahmad 2026-02-20 16:30:07 +05:00
  • 0158795ebc ggml-webgpu: Add unary op (SQR, SQRT, SIN, COS) support. (llama/19700) Masashi Yoshimura 2026-02-20 01:18:30 +09:00
  • 3f68f30907 vulkan: fix MMQ shader push constants and multi-dispatch (llama/19732) Ruben Ortlam 2026-02-19 14:59:16 +01:00
  • ade724fced CUDA: fix kernel selection logic for tile FA (llama/19686) Johannes Gäßler 2026-02-19 12:42:58 +01:00
  • cc9e5cf89d llamafile: powerpc: add FP16 MMA path for Q4/Q8 matmul (llama/19709) shalinib-ibm 2026-02-19 11:58:53 +05:30
  • 8b3a52ba87 ggml webgpu: Fix bug in dispatching large matrix-vector multiplication (llama/19535) Reese Levine 2026-02-18 16:06:29 -07:00
  • fc7a78f4d8 ggml webgpu: shader library organization (llama/19530) Reese Levine 2026-02-25 09:33:32 +02:00
  • f1da0a26f5 vulkan: split mul_mat into multiple dispatches to avoid overflow (llama/19509) Jeff Bolz 2026-02-18 01:47:10 -08:00
  • 51ce7de94c opencl: refactor expm1 and softplus (llama/19404) shaofeiqi 2026-02-17 14:47:18 -08:00
  • 6fadc749a9 opencl: optimize mean and sum_row kernels (llama/19614) shaofeiqi 2026-02-17 13:56:09 -08:00
  • 58855d08c2 ggml: ggml-cpu: force-no-lto-for-cpu-feats (llama/19609) Talha Can Havadar 2026-02-17 12:22:46 +01:00
  • cf4bd07028 cuda : enable CUDA graphs for MMID 1 <= BS <= 4 (llama/19645) Georgi Gerganov 2026-02-17 12:31:49 +02:00
  • 5ee5748722 ggml : make ggml_is_view as API (llama/19539) Judd 2026-02-16 23:43:34 +08:00
  • 5d9d72ec12 Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (llama/19591) Mario Limonciello 2026-02-16 07:46:08 -06:00
  • f8f7c1d891 ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (llama/19132) abhijain1204fujitsu 2026-02-16 12:08:43 +05:30
  • 02a9f660b8 cuda: optimize iq2xxs/iq2xs/iq3xxs dequantization (llama/19624) David Friehs 2026-02-15 18:08:42 +01:00
  • df2f8d3bc4 cmake : check if KleidiAI API has been fetched (llama/19640) Daniel Bevenius 2026-02-15 13:59:38 +01:00
  • 22f0861efc ggml : avoid UB in gemm ukernel (llama/19642) Georgi Gerganov 2026-02-15 14:56:35 +02:00
  • 7b5a1ebaa6 ggml-cpu: optimize ggml_vec_dot_bf16 for s390x (llama/19399) Aaron Teo 2026-02-15 18:20:35 +08:00
  • 76f769d06f ggml-cpu: FA add GEMM microkernel (llama/19422) Aman Gupta 2026-02-15 11:09:24 +05:30
  • 7ee772ab2b cmake : fix KleidiAI install target failure with EXCLUDE_FROM_ALL (llama/19581) SamareshSingh 2026-02-14 23:22:53 -06:00
  • 4bea3cd329 ggml : bump version to 0.9.7 (ggml/1425) Georgi Gerganov 2026-02-15 22:21:04 +02:00
  • cec1dd9d12 examples : update miniaudio library to 0.11.24 (#3672) Dmitry Atamanov 2026-02-27 15:15:15 +05:00
  • 21411d81ea docs : fix duplicate word typo in VAD section (#3670) Maxime Grenu 2026-02-19 16:18:42 +01:00
  • 364c77f4ca talk-llama : sync llama.cpp Georgi Gerganov 2026-02-15 19:43:28 +02:00
  • 83f2ed19e1 sync : ggml Georgi Gerganov 2026-02-15 19:42:09 +02:00
  • 4ac70ce791 models : optimize qwen3next graph (llama/19375) Georgi Gerganov 2026-02-14 12:57:36 +02:00
  • 226e8c041c ggml : fix GGML_DEBUG with OpenMP (llama/19599) Adrien Gallouët 2026-02-14 11:22:57 +01:00
  • fbdac5119c metal : fix ACC op (llama/19427) Georgi Gerganov 2026-02-14 09:54:03 +02:00
  • cc448def01 vulkan: support L2_NORM with contiguous rows (llama/19604) Jeff Bolz 2026-02-13 21:42:04 -08:00
  • 197e9ab6eb vulkan: support GGML_OP_SET (llama/19584) Jeff Bolz 2026-02-13 21:36:38 -08:00
  • fc6bbab817 vulkan: Add vendor id for Qualcomm drivers (llama/19569) Sophon 2026-02-14 13:29:17 +08:00
  • e6476d4c12 hexagon: further optimizations and refactoring for flash attention (llama/19583) Max Krasnyansky 2026-02-13 16:27:30 -08:00
  • ec57bf407c vulkan: restore -inf check in FA shaders (llama/19582) Jeff Bolz 2026-02-13 11:35:29 -08:00
  • e8a25654b2 Fix wrong memcpy length for block_interleave == 4 (llama/19575) Alberto Cabrera Pérez 2026-02-13 12:32:14 +00:00
  • 628b545b7e fix vulkan ggml_acc only works in 3d but not 4d (llama/19426) ymcki 2026-02-13 20:31:37 +08:00
  • 58e3d5a42d CUDA: loop over ne2*ne3 in case it overflows (llama/19538) Aman Gupta 2026-02-13 17:01:40 +05:30
  • 3eb4905af1 CUDA: Do not mutate cgraph for fused ADDs (llama/19566) Oliver Simons 2026-02-13 10:37:55 +01:00
  • 0e94faa19c metal : improve concurrency (llama/19555) Georgi Gerganov 2026-02-13 07:35:57 +02:00
  • c5325e50fc metal : support GGML_OP_SET (llama/19548) Georgi Gerganov 2026-02-13 07:34:52 +02:00
  • 195af60a8b hexagon: fix typo in vtcm_needs_release (llama/19545) Shupei Fan 2026-02-13 07:07:49 +08:00
  • 9f87eeccdf opencl: add basic support for q4_1 (llama/19534) lhez 2026-02-12 14:52:37 -08:00
  • d8e3e2ef08 metal : update sum_rows kernel to support float4 (llama/19524) Georgi Gerganov 2026-02-12 11:35:28 +02:00
  • 39b5f414a3 Add a workaround for compilation with ROCWMMA_FATTN and gfx9 (llama/19461) Mario Limonciello 2026-02-12 02:38:35 -06:00
  • 304205679c hexagon: further optimization and tuning of matmul and dot kernels (llama/19407) Max Krasnyansky 2026-02-11 23:04:27 -08:00
  • 0326fd37dd opencl: add general Q6_K mm and Q4_K mv (llama/19347) lhez 2026-02-11 10:33:13 -08:00
  • f3e78985be ggml : unary ops support non-cont src0 + metal F16 unary ops (llama/19511) Georgi Gerganov 2026-02-11 18:58:43 +02:00
  • 3ffa1fd84e metal : extend l2_norm support for non-cont src0 (llama/19502) Georgi Gerganov 2026-02-11 14:53:19 +02:00
  • 09587ceb12 hexagon: Add ARGSORT, DIV, SQR, SQRT, SUM_ROWS, GEGLU (llama/19406) Max Krasnyansky 2026-02-10 23:21:12 -08:00
  • 3504358056 ggml : extend bin bcast for permuted src1 (llama/19484) Georgi Gerganov 2026-02-11 07:52:00 +02:00
  • de949fb1db metal : consolidate unary ops (llama/19490) Georgi Gerganov 2026-02-11 07:51:12 +02:00
  • 57c620b4b1 CUDA : Update CCCL-tag for 3.2 to final release from RC (llama/19486) Oliver Simons 2026-02-10 22:31:19 +01:00
  • 562255fd77 Plug memory leaks and free resources on shutdown (llama/19315) Nikhil Jain 2026-02-10 08:04:00 -08:00
  • d77265c818 ggml-cpu: arm64: q6_K repack gemm and gemv (and generic) implementations (dotprod) (llama/19360) Alberto Cabrera Pérez 2026-02-10 10:47:45 +00:00
  • b0fe2e84fa ggml : use noexcept overload for is_regular_file in backend registration (llama/19452) k4ss4n 2026-02-10 10:57:48 +01:00
  • 2de2fc9270 CANN: Remove unnecessary wrapper for gml_backend_buft_is_cann (llama/18968) Raul Torres 2026-02-10 06:19:30 +00:00
  • 6a74f56212 CANN: implement quantized MUL_MAT_ID for MoE models (llama/19228) hipudding 2026-02-10 14:18:59 +08:00
  • a36210c836 cuda : extend GGML_OP_PAD to work with non-cont src0 (llama/19429) Georgi Gerganov 2026-02-10 08:07:16 +02:00
  • 808904277e CUDA: Fix non-contig rope (llama/19338) Oliver Simons 2026-02-08 14:12:51 +01:00
  • 764482c317 ci: add vulkan docker image (#3644) Nuno 2026-02-09 11:33:06 +01:00
  • 052066c4f7 chore: Update outdated GitHub Actions versions (#3646) Pádraic Slattery 2026-02-09 11:32:46 +01:00
  • 525be69a66 cmake: Drop obsolete build-time configuration of backends (#3649) Christian Kastner 2026-02-09 11:32:18 +01:00
  • eb27fa2252 server : fix hardcoded /inference path in default HTML page (#3639) Sid Mohan 2026-02-09 00:10:13 -08:00
  • 193f7cdaaf ci : try fix mirrors (#3655) Georgi Gerganov 2026-02-09 09:59:22 +02:00
  • 4b23ff249e talk-llama : sync llama.cpp Georgi Gerganov 2026-02-07 10:39:43 +02:00
  • b0e81c1a2e sync : ggml Georgi Gerganov 2026-02-07 10:38:22 +02:00
  • 55d7cb2e93 metal : consolidate bin kernels (llama/19390) Georgi Gerganov 2026-02-07 10:35:56 +02:00
  • a9a0a51fba metal : fix event synchronization in cpy_tensor_async (llama/19402) Georgi Gerganov 2026-02-07 07:37:15 +02:00
  • 1739af663a ggml-webgpu: JIT compile binary operators and handle binding overlaps (llama/19310) Abhijit Ramesh 2026-02-06 10:33:30 -08:00
  • f2f7320817 sycl: add F16 support for GGML_OP_CEIL (llama/19306) Nechama Krashinski 2026-02-06 17:13:44 +02:00
  • cea22b3075 vulkan: For coopmat2 FA, use fp16 accumulators for the final result (llama/19376) Jeff Bolz 2026-02-06 02:15:13 -06:00
  • c1b63354bb vulkan: make FA mask/softcap enables spec constants (llama/19309) Jeff Bolz 2026-02-06 01:49:58 -06:00
  • 776cf61857 metal : skip loading all-zero mask (llama/19337) Georgi Gerganov 2026-02-06 09:25:11 +02:00
  • 2a7d5490f1 cuda : cuda graphs now compare all node params (llama/19383) Georgi Gerganov 2026-02-06 07:55:06 +02:00
  • 34d332aca5 metal : adaptive CPU/GPU interleave based on number of nodes (llama/19369) Georgi Gerganov 2026-02-05 19:07:22 +02:00
  • a567c140a3 vulkan: Preprocess FA mask to detect all-neg-inf and all-zero. (llama/19281) Jeff Bolz 2026-02-05 09:26:38 -06:00
  • 0781df2518 metal : add diag (llama/19330) Georgi Gerganov 2026-02-05 10:08:45 +02:00
  • 932def3198 vulkan: fix GPU deduplication logic. (llama/19222) Oleksandr Kuvshynov 2026-02-05 03:06:59 -05:00