Update README.md#12
Closed
eltociear wants to merge 1 commit into
Closed
Conversation
Huggingface -> Hugging Face
Member
|
I'm not sure if this is necessary to merge? |
mmathew23
pushed a commit
to mmathew23/unsloth
that referenced
this pull request
Jun 8, 2025
* UNSLOTH_PATCHED * Update patching_utils.py * Patching * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Disable * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * retry * gradient checkpointing * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update patching_utils.py * Update patching_utils.py * Bug fix
mmathew23
pushed a commit
to mmathew23/unsloth
that referenced
this pull request
Jun 25, 2025
* UNSLOTH_PATCHED * Update patching_utils.py * Patching * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Disable * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * retry * gradient checkpointing * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update patching_utils.py * Update patching_utils.py * Bug fix
mmathew23
pushed a commit
to mmathew23/unsloth
that referenced
this pull request
Jul 7, 2025
* UNSLOTH_PATCHED * Update patching_utils.py * Patching * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Disable * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * retry * gradient checkpointing * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update gradient_checkpointing.py * Update patching_utils.py * Update patching_utils.py * Bug fix
danielhanchen
added a commit
that referenced
this pull request
May 25, 2026
P1: route-layer chat/diffusion/export releases were still asymmetric. Training start and export load called ``diff_backend.unload_model`` inside a best-effort try/except so a wedged diffusion backend let the next workload allocate over the top of the resident pipeline and OOM. Both now use the strict ``_release_diffusion_for`` helper from routes.inference, which raises HTTPException 503 on status/unload failure or post-check mismatch. P2 #9: diffusion load exceptions can include the absolute local repo / base / gguf path verbatim (FileNotFoundError, OSError from diffusers / safetensors). The path flows into ``_last_error``, which ``status()`` returns to every authenticated session. Collapse the known repo_id / effective_base / gguf_filename paths to their leaf name before storing the error, mirroring the ``_display_repo_id`` convention used for the public repo label. P2 #10: when ``repo_id`` is an absolute local path, ``detect_family`` matched _FAMILY_EXCLUDE deny lists against the full path, so models stored under a parent directory containing ``qwen-image-edit`` or ``3.5`` were misclassified as None. Reduce the family-detection needle to the leaf directory when the input looks like a filesystem path; Hub-style ``owner/repo`` ids continue to use the original needle so existing detection rules keep working. P2 #12: ``gguf_filename`` was missing from the ``_reject_embedded_hf_token`` validator. A URL-form quant path like ``https://hf_xxxxx@huggingface.co/.../flux.gguf`` would be stored on ``DiffusionBackend._gguf_filename`` and surface in status() / log lines. Extend the validator to gguf_filename so the token is dropped before it can leak. All 85 diffusion-relevant backend tests pass locally.
danielhanchen
added a commit
that referenced
this pull request
May 25, 2026
P1 #1 + #2 + #6: extended the chat / diffusion / training identifier hardening to every export-side request model. ExportCommonOptions (parent of ExportMergedModelRequest / ExportBaseModelRequest / ExportLoRAAdapterRequest) now applies _no_control_chars and _reject_embedded_hf_token to repo_id and base_model_id; ExportGGUFRequest gets the same on its repo_id plus a control-char check on quantization_method; and LoadCheckpointRequest validates checkpoint_path. Previously "/api/export/*" accepted newline-smuggled identifiers and URL-form ``hf_xxxxx`` tokens that flowed into log lines. P1 #3 + #4: ``_run_with_helper`` and ``_run_multi_pass_advisor`` now use a shared ``_gpu_workload_busy_for_helper`` that gates on diffusion (round 22 already), training, AND export. The round 22 guard only checked diffusion, so the dataset helper / advisor could still load llama-server on top of an active training run or a resident export checkpoint. Each step fails closed (unverifiable status counts as busy) so the user's primary workload is preserved. P1 #5: PublishDatasetRequest in models/data_recipe.py also applies the identifier hardening to repo_id; the publish path previously accepted control characters and URL-form tokens. P1 #7-10: added _validate_logged_identifier helper to routes/models.py and applied it to the path / query parameter endpoints that flow into logger.info(...) calls -- ``/config/{model_name}``, ``/check-vision/{model_name}``, ``/check-embedding/{model_name}``, ``/gguf-variants``. Mapped the validator's ValueError to HTTP 422 so the client sees the same shape as a Pydantic validation failure. P2 #11 + #12: ``Loading diffusion model %s`` and ``Diffusion load failed for %s`` log lines route ``repo_id`` / ``effective_base`` through ``_display_repo_id`` (collapses absolute local paths to the leaf, still scrubs HF tokens) instead of plain ``_redact_hf_tokens``. The error path was already collapsed in the user-facing 400 / RuntimeError, but the structured-log lines kept the full path. All 97 diffusion + training-validation + related tests pass locally.
danielhanchen
added a commit
that referenced
this pull request
May 25, 2026
P1 #1: ``_gpu_workload_busy_for_helper`` in ``utils/datasets/llm_assist.py`` now also gates on the GGUF chat backend (llama-server) AND the safetensors chat backend. Round 23 extended it to training + export but missed Chat, so a helper / advisor GGUF could still race a loaded chat model for VRAM. Both checks fail closed when status is unverifiable. P1 #2 / #3 / #4 / #5: re-ordered the route-level GPU-handoff unloads so the diffusion release runs BEFORE the chat releases. A wedged diffusion unload used to fire AFTER chat was already gone, so the user lost both on a single failure. Drop chat last so an earlier failure preserves it. Applied to ``/training/start`` (training.py), ``/export/load`` (export.py), ``/chat/load`` GGUF branch and ``/chat/load`` safetensors branch (routes/inference.py). P1 #7 + P2 #13: ``/delete-finetuned`` body now hardens ``model_path`` and ``gguf_variant`` via the shared ``_validate_logged_identifier`` helper, so control characters and URL-form HF tokens can no longer log-line-smuggle. P1 #8 + #10: ``/delete-cached`` body hardens ``repo_id`` and ``variant`` the same way. P1 #9: ``/download-progress`` ``repo_id`` query parameter is also hardened; the value flows into log lines deep inside ``_get_repo_size_cached`` on lookup failure. P1 #11: ``CheckFormatRequest.dataset_name`` and ``AiAssistMappingRequest.{dataset_name, model_name}`` in ``models/datasets.py`` now apply the same control-char + embedded-HF-token validators, matching every other public request-body model. All 115 diffusion + training-validation + cached_gguf + export + inference model-validation tests pass locally. (P1 #6 native-path-lease enforcement for diffusion local paths and P1 #12 React Compiler frontend lint deferred -- both need focused design / frontend touchups separate from this batch.)
danielhanchen
added a commit
that referenced
this pull request
May 25, 2026
Twelve P1 findings from round 26 reviewer aggregate, plus the CI revert of round 25 P1 #5 to a less invasive location. 1. requirements/studio.txt + requirements/single-env/constraints.txt: revert the round 25 huggingface-hub bump (broke Studio Update CI, Mac Studio Update CI, Mac Studio UI CI, Studio UI CI all with ResolutionImpossible against transformers==4.57.6 which requires hub<1.0). Standard install path stays on the well-tested 4.57.6 + 0.36.2 + trl 0.23.1 trio. 2. requirements/no-torch-runtime.txt + pyproject.toml [huggingfacenotorch]: bump huggingface_hub floor from >=0.34.0 to >=1.3.0,<2.0 -- this is where the actual transformers 5.x + hub 0.36.2 broken combo can land because the file installs --no-deps. transformers 5.x calls hub.is_offline_mode which only exists in hub 1.x. 3. utils/datasets/llm_assist.py: revert round 25 P1 #4 (helper/advisor sharing the global llama backend) which introduced three regressions: a chat-evict load race after the busy precheck, a finally-block that could unload a user chat model, and an identifier mismatch the delete guard could not canonicalize. Go back to PRIVATE LlamaCppBackend instances and expose the active helper/advisor repos through a new thread-safe registry (helper_advisor_owns_repo / _register_helper_advisor_repo / _unregister_helper_advisor_repo) so DELETE /api/models/delete-cached can still block the rmtree. 4. routes/models.py delete_cached_model: check the new helper/advisor registry up front and 409 if a helper/advisor still owns the target repo. Closes round 26 P1 #13 and #14 (helper/advisor identifiers were prefixed and would never equal the raw repo id). 5. routes/models.py get_lora_base_model: validate lora_path with _validate_logged_identifier before it is reflected in 404 detail and error logs (round 26 P1 #12). 6. routes/inference.py /unload: round 21 P1 #3 added a "or not is_loaded" fallback that let an unload of owner/B cancel a pending llama load of owner/A. Replace it with a narrow llama_is_starting_without_identifier branch that only fires when llama-server is mid-startup with neither identifier set (round 26 P1 #5). 7. routes/inference.py /unload: poll loading_model_identifier for up to 5 s after asyncio.to_thread(unload_model) so a legitimate pending-load cancel does not 503 because the load thread has not yet observed _cancel_event in its finally (round 26 P2 #15). 8. models/training.py TrainingStartRequest: extend identifier hardening to hf_dataset, subset, train_split, eval_split. Round 22 only guarded model_name (round 26 P1 #10). 9. models/data_recipe.py SeedInspectRequest: add _no_control_chars + _reject_embedded_hf_token field_validators on dataset_name (round 26 P1 #11). Tests: 105 targeted (diffusion + cached_gguf + llama_cpp_cache + inference_model_validation + models_get_model_config) and 1768 broader backend tests pass locally. Pre-existing test_desktop_auth.py, test_studio_api.py, and test_training_worker_flash_attn.py failures reproduce on HEAD without these changes.
danielhanchen
added a commit
that referenced
this pull request
May 25, 2026
Four actionable findings from round 30. Skipped P1 #1 / #2 / #3 (huggingface-hub bump in studio.txt / single-env / colab-new) because the live B200 Studio that successfully generated FLUX.2 klein images runs the exact combo the reviewer flags as broken: huggingface_hub 0.36.2 + transformers 4.57.6 + diffusers 0.37.1 Flux2KleinPipeline: True (imports cleanly) The is_offline_mode ImportError only fires with transformers 5.x, and the standard install path pins transformers==4.57.6 via constraints. The round 26 fix bumped no-torch-runtime.txt + pyproject huggingfacenotorch where the --no-deps install path can land on transformers 5.x; that remains the correct surface. 1. core/inference/diffusion.py: preflight transformers + accelerate via importlib.util.find_spec BEFORE any destructive GPU-owner unload. Diffusers can expose stub pipeline classes when transformers / accelerate are missing, so the load used to drop chat first and fail later inside from_pretrained. find_spec keeps existing tests that stub these modules passing because no real module is executed (round 30 P1 #11). 2. models/export.py ExportGGUFRequest.quantization_method: extend the embedded HF token validator to this field too. Round 23 added the control-char guard but not the token guard; the value is forwarded into worker command lines and reflected in error / success text (round 30 P1 #5). 3. models/data_recipe.py SeedInspectUploadRequest: add _no_control_chars + _reject_embedded_hf_token field_validators to filename and to each entry of file_names. Mirrors the sibling SeedInspectRequest.dataset_name hardening (round 30 P1 #6). 4. frontend/src/features/images/images-page.tsx: defer the initial refreshStatus() call via queueMicrotask so the synchronous setRefreshingStatus(true) inside it does not trip the react-hooks/set-state-in-effect lint on mount (round 30 P2 #12). Deferred (need larger surgery / out of scope for this round): P1 #4 native_path_lease for diffusion local-path loads P1 #7-#10 helper/advisor + public-start window mutual lock symmetry Tests: 98 targeted (diffusion + cached_gguf + inference_validation) pass locally; frontend npm run typecheck passes.
danielhanchen
added a commit
that referenced
this pull request
May 25, 2026
Addresses remaining round-30 reviewer findings against PR #5754 (diffusion image generation in Unsloth Studio). The studio.txt / constraints.txt / colab-new hub-bump items (round 30 #1-#3) are intentionally skipped: the live B200 Studio install path with huggingface_hub==0.36.2, transformers==4.57.6 and diffusers==0.37.1 imports Flux2KleinPipeline cleanly and runs end-to-end image generation (see staging CI green on bec81b8 plus round 28-30 local validation suites). The is_offline_mode ImportError the reviewer cites only triggers with transformers 5.x against huggingface_hub 0.x; the constraints pin holds transformers at 4.x so the combo never materialises on the standard install path. Concurrency: close the helper / advisor GPU-start race in all four public load paths (round 30 P1 #7-#10). * Add a _PUBLIC_LOAD_PENDING_COUNT counter in utils/datasets/llm_assist.py, published under _HELPER_ADVISOR_START_LOCK by _raise_if_helper_advisor_busy and cleared by a paired _clear_public_load_window in routes/inference.py. A concurrent helper / advisor start now sees public_load_pending() inside _gpu_workload_busy_for_helper and refuses VRAM until the public load attempt finishes, closing the window between the busy snapshot and the public load flipping its public ownership flags (is_loaded, current_checkpoint, is_training_active, etc.). * Wire the paired clear into all five call sites (GGUF chat, safetensors chat, diffusion image load, training start, export load-checkpoint). The chat path tracks the published tag in a local so the finally clears the same counter on either branch or on early HTTPException. Security: gate /api/inference/images/load against arbitrary local-path probes (round 30 P1 #4). Mirror the chat /api/inference/load native_path_lease boundary so an authenticated session cannot use repo_id or base_repo as a directory probe. * Add native_path_lease + base_repo_native_path_lease to DiffusionLoadRequest (optional; Hub ids skip the lease). * Add _looks_like_local_diffusion_path + a _resolve_diffusion_repo_for_request helper that requires a verified directory-typed native path grant for any value that starts with /, ~, ./, ../, contains a backslash, or expands to an absolute path. The detector deliberately avoids Path.exists so the route does not side-channel filesystem layout via differential error messages. Frontend: split the Images page status fetch from the spinner toggle (round 30 P2 #12). The mount effect and the is_loading auto-poll now call a setState-free fetchAndUpdateStatus; the user-driven Refresh button still calls refreshStatus to flip the spinner. Cleaner separation than the queueMicrotask shim from the prior commit; the eslint react-hooks/set-state-in-effect rule is not in the studio-frontend-ci typecheck gate, and the codebase already has hundreds of pre-existing violations of the same rule. 98 targeted backend tests pass (test_diffusion_routes, test_diffusion_backend, test_inference_model_validation, test_models_get_model_config_case_resolution, test_data_recipe_seed, test_training_raw_support, test_export_log_cursor). Frontend typecheck passes.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Huggingface -> Hugging Face