TAGS
Bump minimum unsloth version to 2026.3.16 in install scripts (#4663)

Update install.sh and install.ps1 to require unsloth>=2026.3.16, matching the latest PyPI release.
feat(studio): editable context length with Apply/Reset for GGUF settings (#4592)

* feat(studio): editable context length with Apply/Reset for GGUF model settings

  Previously the Context Length field was read-only and the backend hardcoded `-c 0`, ignoring custom values entirely. KV Cache Dtype also triggered an immediate model reload with no way to cancel.

  Backend:
  - llama_cpp.py: pass the actual n_ctx value to `-c` instead of always 0
  - models/inference.py: relax max_seq_length to 0..1048576 (0 = model default) so GGUF models with large context windows are supported

  Frontend:
  - chat-runtime-store: add customContextLength and loadedKvCacheDtype state fields for dirty tracking
  - chat-settings-sheet: make Context Length an editable number input, stop KV Cache Dtype from auto-reloading, show Apply/Reset buttons when either setting has been changed
  - use-chat-model-runtime: send customContextLength as max_seq_length in the load request, reset after successful load

* fix: preserve maxSeqLength for non-GGUF models in load request

  customContextLength ?? 0 sent max_seq_length=0 for non-GGUF models, breaking the finetuning/inference path that needs the slider value. Now uses a three-way branch:
  - customContextLength set: use it (user edited GGUF context)
  - GGUF without custom: 0 (model's native context)
  - Non-GGUF: maxSeqLength from the sampling slider

* fix: keep max_seq_length default at 4096 for non-GGUF callers

  Only relax the bounds (ge=0 for GGUF's "model default" mode, le=1048576 for large context windows). The default stays at 4096 so API callers that omit max_seq_length still get a sane value for non-GGUF models.
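The three-way branch above can be sketched as follows. This is a minimal Python analogue; the real code is TypeScript in use-chat-model-runtime, and the function and parameter names here are illustrative:

```python
def resolve_max_seq_length(custom_context_length, is_gguf, slider_value):
    """Pick the max_seq_length to send in the model load request.

    Sketch of the three-way branch described in the commit above:
    a user-edited GGUF context wins, an untouched GGUF model sends 0
    ("use the model's native context"), and non-GGUF models keep the
    finetuning/inference slider value.
    """
    if custom_context_length is not None:
        return custom_context_length  # user edited the GGUF context length
    if is_gguf:
        return 0                      # 0 = model's native context
    return slider_value               # non-GGUF: value from the sampling slider
```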
* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci

* fix(studio): rename trust remote code toggle and hide when no model selected

  - Rename "Trust remote code" to "Enable custom code"
  - Shorten subtitle to "Only enable if sure"
  - Hide the toggle when no model is loaded (already hidden for GGUFs)

* fix: restore ge=128 for max_seq_length validation

  Keep the minimum at 128 so the API rejects nonsensical values. GGUF path now sends the model's native context length (from ggufContextLength) instead of 0 when the user has not customized it. The upper bound stays at 1048576 for large-context GGUF models.

* feat(studio): replace Context Length input with slider

  Use a ParamSlider (512 to model's native context, step 512) instead of a small number input. Shows "Max" when at the model's native context length. Consistent with the other slider controls in the settings panel.

* feat(studio): add editable number input alongside Context Length slider

  The slider and number input stay synced -- dragging the slider updates the number, typing a number moves the slider. The input also accepts values beyond the slider range for power users who need custom context lengths larger than the model default.

* fix(studio): widen context length input and use 1024 step for slider

  Make the number input wider (100px) so large values like 262144 are fully visible. Change slider step from 512 to 1024 and min from 512 to 1024.

* fix(studio): context length number input increments by 1024

* fix(studio): cap context length input at model's native max

  Adds max attribute and clamps typed/incremented values so the context length cannot exceed the GGUF model's reported context window.

* fix(studio): point "What's new" link to changelog page

  Changed from /blog to /docs/new/changelog.
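The clamping behavior from the last few input commits might look like this. A hypothetical Python sketch (the real logic is TypeScript in chat-settings-sheet):

```python
def clamp_context_length(value, native_max, step=1024):
    """Clamp a typed context length to the slider's range.

    Hypothetical helper mirroring the commits above: values snap down to
    the 1024 step, never fall below one step, and are capped at the GGUF
    model's reported context window.
    """
    snapped = max(step, (int(value) // step) * step)
    return min(snapped, native_max)
```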
* fix(studio): preserve custom context length after Apply, remove stale subtitle

  - After a reload with a custom context length, keep the user's value in the UI instead of snapping back to the model's native max. ggufContextLength always reports the model's native metadata value regardless of what -c was passed, so we need to preserve customContextLength when it differs from native.
  - Remove "Reload to apply." from KV Cache Dtype subtitle since the Apply/Reset buttons now handle this.

* feat(studio): auto-enable Search and Code tools when model supports them

  Previously toolsEnabled and codeToolsEnabled stayed false after loading a model even if it reported supports_tools=true. Now both toggles are automatically enabled when the loaded model supports tool calling, matching the existing behavior for reasoning.

* fix(studio): auto-enable tools in autoLoadSmallestModel path

  The suggestion cards trigger autoLoadSmallestModel, which bypasses selectModel entirely. It was hardcoding toolsEnabled: false and codeToolsEnabled: false even when the model supports tool calling. Now both are set from the load response, matching the selectModel behavior. Also sets kvCacheDtype/loadedKvCacheDtype for dirty tracking consistency.

* fix(studio): re-read tool flags after auto-loading model

  The runtime state was captured once at the start of the chat adapter's run(), before autoLoadSmallestModel() executes. After auto-load enables tools in the store, the request was still built with the stale snapshot that had toolsEnabled=false. Now re-reads the store after auto-load so the first message includes tools.

* fix(studio): re-read entire runtime state after auto-load, not just tools

  The runtime snapshot (including params.checkpoint, model id, and all tool/reasoning flags) was captured once before auto-load. After autoLoadSmallestModel sets the checkpoint and enables tools, the request was still built with stale params (empty checkpoint, tools disabled). Now re-reads the full store state after auto-load so the first message has the correct model, tools, and reasoning flags.

* feat(studio): add Hugging Face token field in Preferences

  Adds a password input under Configuration > Preferences for users to enter their HF token. The token is persisted in localStorage and passed to all model validate/load/download calls, replacing the previously hardcoded null. This enables downloading gated and private models.

* fix(studio): use model native context for GGUF auto-load, show friendly errors

  The auto-load paths and selectModel for GGUF were sending max_seq_length=4096, which now actually limits the context window (since we fixed the backend to respect n_ctx). Changed to send 0 for GGUF, which means "use model's native context size". Also replaced generic "An internal error occurred" messages with user-friendly descriptions for known errors like context size exceeded and lost connections. LoadRequest validation changed to ge=0 to allow the GGUF "model default" signal. The frontend slider still enforces min=128 for non-GGUF models.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci

* fix(studio): filter out FP8 models from model search results

  Hide models matching *-FP8-* or *FP8-Dynamic* from both the recommended list and HF search results. These models are not yet supported in the inference UI.

---------

Co-authored-by: Daniel Han <danielhanchen@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
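The FP8 glob filter can be sketched as below. A minimal Python analogue; the real check lives in the studio frontend, and the helper name is illustrative:

```python
from fnmatch import fnmatchcase

# Glob patterns from the commit above; matching repos are hidden from the
# recommended list and HF search results.
FP8_PATTERNS = ("*-FP8-*", "*FP8-Dynamic*")

def filter_fp8_models(model_ids):
    """Drop FP8-quantized repos that the inference UI does not yet support."""
    return [m for m in model_ids
            if not any(fnmatchcase(m, pat) for pat in FP8_PATTERNS)]
```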
feat(studio): multi-file unstructured seed upload with better backend extraction (#4468)

* fix(recipe-studio): prevent fitView from zooming to wrong location on recipe load
* feat: add pymupdf/python-docx deps and unstructured uploads storage root
* feat: add POST /seed/upload-unstructured-file endpoint
* feat: add multi-file chunking with source_file column
* feat: update frontend types and API layer for multi-file upload
* feat: round-robin preview rows across source files

  Ensures every uploaded file is represented in the preview table by cycling through sources instead of just taking the first N rows.

* fix: disable OCR, fix auto-load timing, fix persistence on reload

  - Disable pymupdf4llm OCR with write_images=False, show_progress=False
  - Replace onAllUploaded callback with useEffect that detects uploading→done transition (avoids stale closure reading empty file IDs)
  - Fix importer to preserve file IDs from saved recipes instead of clearing (clearing only happens at share time via sanitizeSeedForShare)

* fix: harden unstructured upload with input validation and state fixes

  Validate block_id/file_id with an alphanumeric regex to prevent path traversal, use exact stem match for file deletion, add error handling for metadata writes and empty files, fix React stale closures and object mutations in the upload loop, and correct validation logic for unstructured seed resolved_paths.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci

* fix: address PR review - legacy path import, share sanitizer, sync effect

  Promote legacy source.path into resolved_paths for old unstructured recipes, clear source.paths in the share sanitizer to prevent leaking local filesystem paths, and gate the file sync effect to the dialog open transition so users can actually delete all uploaded files.
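The round-robin preview described above can be sketched as follows, assuming a mapping of source file to extracted rows (the function and argument names are hypothetical):

```python
from collections import deque

def round_robin_preview(rows_by_file, limit):
    """Interleave preview rows across source files.

    Cycle through the uploaded files so each one contributes rows to the
    preview table, instead of taking the first `limit` rows from whichever
    file happens to come first.
    """
    queues = deque(deque(rows) for rows in rows_by_file.values() if rows)
    preview = []
    while queues and len(preview) < limit:
        q = queues.popleft()
        preview.append(q.popleft())
        if q:                      # file still has rows: back of the cycle
            queues.append(q)
    return preview
```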
* fix: CSV column fix (BOM + whitespace + unnamed index re-save) for #4470
* fix: harden unstructured upload flow and polish dialog UX
* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
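The CSV column fix for #4470 covers three header problems. A hedged stdlib sketch of the idea (the helper name is hypothetical, and the real fix also re-saves the cleaned file):

```python
import csv
import io

def clean_csv_header(text):
    """Normalize a CSV header before re-saving: strip a UTF-8 BOM, trim
    whitespace around column names, and drop a pandas-style "Unnamed: N"
    index column left behind by an earlier round-trip save."""
    rows = list(csv.reader(io.StringIO(text.lstrip("\ufeff"))))
    rows[0] = [name.strip() for name in rows[0]]
    keep = [i for i, name in enumerate(rows[0])
            if name and not name.startswith("Unnamed:")]
    return [[row[i] for i in keep] for row in rows]
```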
fix: detect AMD/no-NVIDIA GPU early in Windows installer and guard unsloth.exe existence (#4478)

* fix(install.ps1): detect AMD/no-NVIDIA GPU early and guard unsloth.exe existence

  When a user has an AMD GPU (no nvidia-smi), uv's --torch-backend=auto resolves to CPU torch, which constrains the solver to unsloth==2024.8. That ancient release has no unsloth.exe CLI entry point, so the subsequent `& \ studio setup` call throws a confusing PowerShell 'module could not be loaded' CommandNotFoundException instead of a clear error.

  Two fixes:
  - Detect nvidia-smi early; if no NVIDIA GPU is found, print a clear error explaining that AMD/Intel GPUs are unsupported, and exit before wasting time installing the wrong package version.
  - Guard with Test-Path before invoking the CLI, so any future case where the entry point is missing produces a readable error instead of a cryptic PowerShell exception.

  Fixes: unsloth_studio\Scripts\unsloth.exe CommandNotFoundException on AMD GPU systems (Windows).

* fix(install.ps1): correct GPU support message - AMD is Linux-only via ROCm

* Slim down to just the unsloth.exe existence guard

  Remove the early NVIDIA GPU detection gate -- Studio supports Windows and Mac without a GPU (finetuning is simply disabled), so the GPU gate was blocking legitimate non-NVIDIA users from installing. Keep only the Test-Path guard on unsloth.exe before invoking it. This turns the confusing PowerShell CommandNotFoundException into a clear error message pointing at the likely cause (an older unsloth version resolved by the package solver that does not include the Studio CLI).

* Fix quickstart link in unsloth.exe guard message

---------

Co-authored-by: LeoBorcherding <LeoBorcherding@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
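The surviving Test-Path guard is language-agnostic: check that the entry point exists before invoking it, and fail with an actionable message otherwise. A Python analogue of the PowerShell guard (hypothetical helper name; the real code is in install.ps1):

```python
import os

def check_studio_cli(exe_path):
    """Return an error message if the Studio CLI entry point is missing,
    else None.

    Mirrors the install.ps1 Test-Path guard: an old unsloth resolved by
    the package solver has no unsloth.exe, so report a readable error
    instead of letting the shell raise a cryptic 'command not found'.
    """
    if not os.path.exists(exe_path):
        return (f"{exe_path} not found. The resolved unsloth version likely "
                "predates the Studio CLI; see the quickstart for supported setups.")
    return None  # safe to invoke, e.g. `unsloth.exe studio setup`
```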
Update README.md
Misc fixes (#4018)

* convert print to logger
* Print but cleaner
* Hide model on multiple devices
* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci

* fix typo
* Fix typo transfomers -> transformers, revert MoE message change
* Update MoE detection message to show num_experts and target_modules
* Fix llama-cli path in save info message
* target_parameters warning for moe
* fix should_convert_module for llm_int8_skip_modules
* Logging filters
* negation
* remove should_convert_module patch

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Daniel Hanchen <danielhanchen@users.noreply.github.com>
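The "Logging filters" commit above suggests suppressing known-noisy records rather than silencing a whole logger. A minimal sketch with the stdlib logging module; the class name and the NOISY_SUBSTRINGS values are illustrative, not the actual filters in the PR:

```python
import logging

class DropNoisyRecords(logging.Filter):
    """Suppress log records whose message contains a known-noisy substring,
    leaving every other record on the library logger untouched."""

    NOISY_SUBSTRINGS = ("flash attention", "torch.distributed")

    def filter(self, record):
        message = record.getMessage().lower()
        # Return False to drop the record, True to keep it.
        return not any(s in message for s in self.NOISY_SUBSTRINGS)

logger = logging.getLogger("unsloth")
logger.addFilter(DropNoisyRecords())
```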
FunctionGemma
Merge branch 'main' into nightly
Update fp8.py
Versioning
Versioning
Merge branch 'main' of https://github.com/unslothai/unsloth
Update chat_templates.py
Update pyproject.toml