Add Optional Attention Residuals (AttnRes) Integration for Unsloth Fast Paths by danielhanchen · Pull Request #9 · unslothai/unsloth-staging-1

danielhanchen · 2026-04-06T06:11:57Z

Staging mirror of unslothai#4863

Original PR: unslothai#4863
Author: @kleeedolinux

This is a staging copy for review and editing. Once finalized, changes will be pushed back to the original PR.

Original description

1. Why this PR exists

Unsloth fast paths currently use standard residual updates around attention blocks, e.g.:

$$x_{l+1} = x_l + f_l(x_l)$$

This is stable and efficient, but all prior layer information is mixed with fixed weights (implicit weight = 1 for the direct residual stream at each step). The AttnRes idea is to make residual mixing content-adaptive while preserving the same forward structure and default behavior.

Reference paper:

Attention Residuals: https://arxiv.org/pdf/2603.15031

This PR introduces an opt-in, low-risk integration that keeps Unsloth architecture and coding style intact.

2. High-level design

The integration goal is:

Do not break existing kernels / fast paths.
Keep default behavior identical when AttnRes is disabled.
Touch only attention-branch outputs before existing residual merges.

Implemented approach:

Build lightweight AttnRes state per forward pass.
Track block-local states and block summaries.
Before residual add, compute an AttnRes contribution from prior states.
Add that contribution to the attention output, then continue existing residual code.

So if current attention output is (a_l), the transformed output is:

$$\tilde{a}_l = a_l + \alpha \cdot \sum_i w_{l,i} v_i$$

with:

$$w_{l,i} = \mathrm{softmax}\left(\frac{q_l \cdot k_i}{\sqrt{d}}\right)$$

where in this practical implementation:

query (q_l): derived from current residual/query tensor,
keys/values ((k_i, v_i)): prior block summaries + in-block previous states,
(\alpha): configurable residual-mixing scale.

3. Mathematical mapping to implementation

The paper discusses full and block residual attention. This PR implements a practical block-style variant:

Maintain two sets:
- completed block summaries (S_1, \dots, S_{m-1})
- in-progress block states in current block (h_{m,1}, \dots, h_{m,t-1})
Candidate set for layer (l):

$$\mathcal{C}_l = \{S_1, \dots, S_{m-1}\} \cup \{h_{m,1}, \dots, h_{m,t-1}\}$$

Residual attention mix:

$$r_l = \sum_{c \in \mathcal{C}_l} \mathrm{softmax}\left(\frac{q_l \cdot c}{\sqrt{d}}\right) c$$

Output transform:

$$\tilde{a}_l = a_l + \alpha r_l$$

Block summary update at boundary:

$$S_m = \sum_{j \in B_m} h_j$$

4. What was changed

New files

unsloth/models/attnres.py
- forward-state init
- block summary accumulation
- attention-output transform
unsloth/utils/attnres.py
- utility-level AttnRes state/config/hook helpers

Updated files

unsloth/utils/attention_dispatch.py
- optional AttnRes-aware post-attention finalization hook path
unsloth/utils/__init__.py
unsloth/kernels/__init__.py
- export AttnRes helpers

Model integrations (attention branch only)

unsloth/models/llama.py
unsloth/models/gemma2.py
unsloth/models/cohere.py
`unsloth/models

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

danielhanchen · 2026-04-06T06:12:04Z

/gemini review

danielhanchen · 2026-04-06T06:12:05Z

@codex review

gemini-code-assist

Code Review

This pull request introduces 'Attention Residuals' (AttnRes), a feature designed to accumulate and mix residual states across transformer layers to potentially improve model performance or stability. The implementation includes a core state management module in unsloth/models/attnres.py, integration into several model architectures (Llama, Gemma2, Cohere, Falcon, and Granite), and hooks within the central attention dispatch utility to support various backends. The review feedback suggests minor logic simplifications in the new modules, specifically regarding how the number of layers is determined, how model boundary conditions are checked, and the redundancy of state initialization logic.

gemini-code-assist · 2026-04-06T06:14:36Z

+    num_layers = int(getattr(config, "num_hidden_layers", 0))
+    if num_layers <= 0 and hasattr(model, "layers"):
+        num_layers = len(model.layers)


The logic for determining num_layers is redundant. getattr(config, "num_hidden_layers", 0) already returns 0 if the attribute is missing, and the subsequent check if num_layers <= 0 and hasattr(model, "layers") is sufficient. The explicit int() cast is also unnecessary as num_hidden_layers is typically an integer in Hugging Face configs.

Suggested change

num_layers = int(getattr(config, "num_hidden_layers", 0))

if num_layers <= 0 and hasattr(model, "layers"):

num_layers = len(model.layers)

num_layers = getattr(config, "num_hidden_layers", 0)

if num_layers <= 0 and hasattr(model, "layers"):

gemini-code-assist · 2026-04-06T06:14:36Z

+    at_block_end = ((layer_idx + 1) % attnres_state.block_size) == 0
+    at_model_end = (attnres_state.num_layers > 0) and (
+        (layer_idx + 1) >= attnres_state.num_layers
+    )


The condition at_model_end can be simplified. Since layer_idx is 0-indexed, layer_idx + 1 is the 1-based layer count. Comparing layer_idx + 1 == attnres_state.num_layers is sufficient.

Suggested change

at_block_end = ((layer_idx + 1) % attnres_state.block_size) == 0

at_model_end = (attnres_state.num_layers > 0) and (

(layer_idx + 1) >= attnres_state.num_layers

)

at_model_end = (attnres_state.num_layers > 0) and ((layer_idx + 1) == attnres_state.num_layers)

gemini-code-assist · 2026-04-06T06:14:36Z

+        self.enabled = bool(
+            self.enabled or self.config.enabled or self.kernel_hook is not None
+        )


The enabled property logic in __post_init__ is redundant because begin() performs the same check. Consider simplifying the state initialization.

Suggested change

self.enabled = bool(

self.enabled or self.config.enabled or self.kernel_hook is not None

)

def __post_init__(self):

if self.kernel_hook is None:

self.kernel_hook = self.config.kernel_hook

self.enabled = self.enabled or self.config.enabled or self.kernel_hook is not None

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 513888151c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-06T06:19:20Z

+        return torch.zeros_like(query)
+
+    # (bsz, seqlen, n_states, dim)
+    stacked = torch.stack(candidates, dim = 2)


Normalize candidate devices before stacking AttnRes states

AttnRes currently stacks completed_block_summaries and current_block_states as-is, but multi-GPU pipeline inference moves activations per layer (move_to_device in the inference loops) while reusing one shared attnres_state. As soon as layers run on different _per_layer_device_index values, this list contains tensors from different GPUs and torch.stack fails with a cross-device error. The candidates should be moved to query.device (or tracked per device) before mixing.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-06T06:19:20Z

+    candidates.extend(attnres_state.current_block_states)
+    if len(candidates) != 0:
+        mixed = _compute_residual_mix(query, candidates)
+        attention_output = attention_output + (attnres_state.alpha * mixed)


Preserve attention output dtype when applying AttnRes mix

attention_output can be fp16/bf16, but the mix is computed from query and may be fp32 (for example, fast inference paths pass a float32 residual buffer as query). The expression attention_output + (alpha * mixed) then promotes the result to fp32, which pushes subsequent layers off the intended low-precision fast path and increases memory/latency whenever AttnRes is enabled. Cast mixed (or query) to attention_output.dtype before the add.

Useful? React with 👍 / 👎.

… MPS + base namespace for PR unslothai#5754 Round 12 reviewer findings. Backend correctness (P1) * core/inference/diffusion.py load_model: GGUF branch now handles an absolute local directory passed as repo_id by joining Path(repo_id) / gguf_filename directly instead of handing the path to hf_hub_download (which raises HFValidationError because the path is not 'namespace/repo'). Closes round 12 review #1 -- the load request advertised 'local path' support but actually only worked for Hub repo ids. Delete guard precision (P1) * routes/models.py /delete-finetuned + /delete-cached: diffusion guard now consults gguf_filename from status() and ALLOWS per-variant deletes that target a different quant than the one the loaded pipeline is reading. Loading 'Q4_K_S' no longer blocks deleting 'Q8_0' from the same repo / export directory (round 12 reviews #3 and #4). Accelerator (P2) * core/inference/diffusion.py _drain_cuda_cache: also calls torch.mps.empty_cache() when the MPS backend is the active accelerator. Apple Silicon swaps now actually return held VRAM instead of leaving it pinned in the Metal allocator (round 12 review #10). Smart base repo (P2) * core/inference/diffusion.py _smart_base_repo: only inspects the LAST segment of the repo id / path for the 'base' / '9b' tokens. A namespace like baseorg/FLUX.2-klein-4B-GGUF or a parent directory like /home/me/.cache/base/... no longer falsely selects the Base variant (round 12 review #9).

P1: route-layer chat/diffusion/export releases were still asymmetric. Training start and export load called ``diff_backend.unload_model`` inside a best-effort try/except so a wedged diffusion backend let the next workload allocate over the top of the resident pipeline and OOM. Both now use the strict ``_release_diffusion_for`` helper from routes.inference, which raises HTTPException 503 on status/unload failure or post-check mismatch. P2 #9: diffusion load exceptions can include the absolute local repo / base / gguf path verbatim (FileNotFoundError, OSError from diffusers / safetensors). The path flows into ``_last_error``, which ``status()`` returns to every authenticated session. Collapse the known repo_id / effective_base / gguf_filename paths to their leaf name before storing the error, mirroring the ``_display_repo_id`` convention used for the public repo label. P2 #10: when ``repo_id`` is an absolute local path, ``detect_family`` matched _FAMILY_EXCLUDE deny lists against the full path, so models stored under a parent directory containing ``qwen-image-edit`` or ``3.5`` were misclassified as None. Reduce the family-detection needle to the leaf directory when the input looks like a filesystem path; Hub-style ``owner/repo`` ids continue to use the original needle so existing detection rules keep working. P2 #12: ``gguf_filename`` was missing from the ``_reject_embedded_hf_token`` validator. A URL-form quant path like ``https://hf_xxxxx@huggingface.co/.../flux.gguf`` would be stored on ``DiffusionBackend._gguf_filename`` and surface in status() / log lines. Extend the validator to gguf_filename so the token is dropped before it can leak. All 85 diffusion-relevant backend tests pass locally.

P1 #1: ``_release_llama_for()`` now verifies ``llama.unload_model`` did not return False AND that ``is_loaded`` / ``is_active`` / ``loading_model_identifier`` are all cleared after the call. The previous version only treated raised exceptions as failure, so a subprocess refusing to terminate or an in-flight GGUF download let the next workload allocate on top. P1 #2: ``DiffusionBackend._release_other_gpu_owners_for_diffusion`` now raises RuntimeError when ``exp._shutdown_subprocess`` fails on a settled checkpoint. Direct backend callers used to log at debug level and proceed toward diffusion allocation while the export checkpoint still owned VRAM. P1 #3 + P1 #7: ``/images/load`` no longer drops chat + idle export before the cheap backend validation runs. ``DiffusionBackend.load_model`` already calls the strict ``_release_other_gpu_owners_for_diffusion`` and ``_release_chat_backend_for_diffusion`` helpers AFTER family inference and GGUF filename checks pass, so the GPU is still freed before allocation and a malformed payload no longer silently unloads the user's chat / chat-export pair. P1 #4: ``_release_chat_backend_for_diffusion`` now also rejects a post-unload state where ``loading_model_identifier`` is still set, matching the route-level ``_release_llama_for`` strictness. A GGUF download mid-flight before the diffusion handoff used to slip through and end up double-owning VRAM after diffusion allocated. P1 #5: ``_release_diffusion_for`` no longer swallows a post-unload ``status()`` failure as ``after = {}``. Training / chat / export handoffs need proof that the diffusion pipeline released VRAM; the helper now raises HTTP 503 when the verification status call itself raises, so the caller retries. P1 #6: ``DiffusionBackend._release_other_gpu_owners_for_diffusion`` raises RuntimeError when ``get_export_backend()`` itself raises. Direct backend callers used to silently ``return`` here and proceed to GPU allocation without being able to verify export ownership. P1 #8: ``/training/start`` releases settled export BEFORE chat, matching the chat-load helpers. If idle export shutdown fails the user's chat model is preserved instead of being dropped for a training run that never starts. P2 #9: GGUF load-error scrubber also collapses ``local_gguf_path``, the resolved HF cache path passed to ``transformer_cls.from_single_file()``. Without this an exception like ``OSError: cannot load /home/alice/.cache/huggingface/.../flux.gguf`` would leak the operator's filesystem layout through ``last_error`` and ``/images/status``. All 85 diffusion-relevant backend tests pass locally.

P1 #1: ``_gpu_workload_busy_for_helper`` in ``utils/datasets/llm_assist.py`` now also gates on the GGUF chat backend (llama-server) AND the safetensors chat backend. Round 23 extended it to training + export but missed Chat, so a helper / advisor GGUF could still race a loaded chat model for VRAM. Both checks fail closed when status is unverifiable. P1 #2 / #3 / #4 / #5: re-ordered the route-level GPU-handoff unloads so the diffusion release runs BEFORE the chat releases. A wedged diffusion unload used to fire AFTER chat was already gone, so the user lost both on a single failure. Drop chat last so an earlier failure preserves it. Applied to ``/training/start`` (training.py), ``/export/load`` (export.py), ``/chat/load`` GGUF branch and ``/chat/load`` safetensors branch (routes/inference.py). P1 #7 + P2 #13: ``/delete-finetuned`` body now hardens ``model_path`` and ``gguf_variant`` via the shared ``_validate_logged_identifier`` helper, so control characters and URL-form HF tokens can no longer log-line-smuggle. P1 #8 + #10: ``/delete-cached`` body hardens ``repo_id`` and ``variant`` the same way. P1 #9: ``/download-progress`` ``repo_id`` query parameter is also hardened; the value flows into log lines deep inside ``_get_repo_size_cached`` on lookup failure. P1 #11: ``CheckFormatRequest.dataset_name`` and ``AiAssistMappingRequest.{dataset_name, model_name}`` in ``models/datasets.py`` now apply the same control-char + embedded-HF-token validators, matching every other public request-body model. All 115 diffusion + training-validation + cached_gguf + export + inference model-validation tests pass locally. (P1 #6 native-path-lease enforcement for diffusion local paths and P1 #12 React Compiler frontend lint deferred -- both need focused design / frontend touchups separate from this batch.)

P1 #1: ``_release_llama_for()`` now verifies ``llama.unload_model`` did not return False AND that ``is_loaded`` / ``is_active`` / ``loading_model_identifier`` are all cleared after the call. The previous version only treated raised exceptions as failure, so a subprocess refusing to terminate or an in-flight GGUF download let the next workload allocate on top. P1 #2: ``DiffusionBackend._release_other_gpu_owners_for_diffusion`` now raises RuntimeError when ``exp._shutdown_subprocess`` fails on a settled checkpoint. Direct backend callers used to log at debug level and proceed toward diffusion allocation while the export checkpoint still owned VRAM. P1 #3 + P1 #7: ``/images/load`` no longer drops chat + idle export before the cheap backend validation runs. ``DiffusionBackend.load_model`` already calls the strict ``_release_other_gpu_owners_for_diffusion`` and ``_release_chat_backend_for_diffusion`` helpers AFTER family inference and GGUF filename checks pass, so the GPU is still freed before allocation and a malformed payload no longer silently unloads the user's chat / chat-export pair. P1 #4: ``_release_chat_backend_for_diffusion`` now also rejects a post-unload state where ``loading_model_identifier`` is still set, matching the route-level ``_release_llama_for`` strictness. A GGUF download mid-flight before the diffusion handoff used to slip through and end up double-owning VRAM after diffusion allocated. P1 #5: ``_release_diffusion_for`` no longer swallows a post-unload ``status()`` failure as ``after = {}``. Training / chat / export handoffs need proof that the diffusion pipeline released VRAM; the helper now raises HTTP 503 when the verification status call itself raises, so the caller retries. P1 #6: ``DiffusionBackend._release_other_gpu_owners_for_diffusion`` raises RuntimeError when ``get_export_backend()`` itself raises. Direct backend callers used to silently ``return`` here and proceed to GPU allocation without being able to verify export ownership. P1 #8: ``/training/start`` releases settled export BEFORE chat, matching the chat-load helpers. If idle export shutdown fails the user's chat model is preserved instead of being dropped for a training run that never starts. P2 #9: GGUF load-error scrubber also collapses ``local_gguf_path``, the resolved HF cache path passed to ``transformer_cls.from_single_file()``. Without this an exception like ``OSError: cannot load /home/alice/.cache/huggingface/.../flux.gguf`` would leak the operator's filesystem layout through ``last_error`` and ``/images/status``. All 85 diffusion-relevant backend tests pass locally.

P1 #1: ``_gpu_workload_busy_for_helper`` in ``utils/datasets/llm_assist.py`` now also gates on the GGUF chat backend (llama-server) AND the safetensors chat backend. Round 23 extended it to training + export but missed Chat, so a helper / advisor GGUF could still race a loaded chat model for VRAM. Both checks fail closed when status is unverifiable. P1 #2 / #3 / #4 / #5: re-ordered the route-level GPU-handoff unloads so the diffusion release runs BEFORE the chat releases. A wedged diffusion unload used to fire AFTER chat was already gone, so the user lost both on a single failure. Drop chat last so an earlier failure preserves it. Applied to ``/training/start`` (training.py), ``/export/load`` (export.py), ``/chat/load`` GGUF branch and ``/chat/load`` safetensors branch (routes/inference.py). P1 #7 + P2 #13: ``/delete-finetuned`` body now hardens ``model_path`` and ``gguf_variant`` via the shared ``_validate_logged_identifier`` helper, so control characters and URL-form HF tokens can no longer log-line-smuggle. P1 #8 + #10: ``/delete-cached`` body hardens ``repo_id`` and ``variant`` the same way. P1 #9: ``/download-progress`` ``repo_id`` query parameter is also hardened; the value flows into log lines deep inside ``_get_repo_size_cached`` on lookup failure. P1 #11: ``CheckFormatRequest.dataset_name`` and ``AiAssistMappingRequest.{dataset_name, model_name}`` in ``models/datasets.py`` now apply the same control-char + embedded-HF-token validators, matching every other public request-body model. All 115 diffusion + training-validation + cached_gguf + export + inference model-validation tests pass locally. (P1 #6 native-path-lease enforcement for diffusion local paths and P1 #12 React Compiler frontend lint deferred -- both need focused design / frontend touchups separate from this batch.)

Two round-33 reviewer findings: hub-floor consistency and the multipart upload filename validator gap. Dependencies: reverted the round-26 huggingface_hub>=1.3.0 floor in no-torch-runtime.txt and pyproject.toml (round 33 P1 #1-#5, 4/12 vote consensus). studio.txt forces huggingface_hub==0.36.2 to match the transformers==4.57.6 pin in extras-no-deps.txt, so the 1.3.0 floor was internally inconsistent. Reviewers reproduced the resolver conflict on a fresh install. Empirical justification (re-verified on the live B200 host before the revert): huggingface_hub 0.36.2 + transformers 4.57.6 + diffusers 0.37.1 imports Flux2KleinPipeline cleanly and runs end-to-end image generation. transformers 4.57.6 carries its own transformers.utils.hub.is_offline_mode and does not actually need huggingface_hub.is_offline_mode at import time. The original bump was guarding against the (never-realised) transformers 5.x path, which extras-no-deps explicitly pins away. Validation: multipart /seed/upload-unstructured-file now applies the same _no_control_chars and _reject_embedded_hf_token checks to file.filename that SeedInspectUploadRequest.filename already applies in the JSON variant (round 33 P1 #7). The filename is reflected back to the client, persisted in the per-file meta JSON, and echoed by error responses, so the JSON-side hardening must not be asymmetric with the multipart path. Skipped (consistent with prior rounds): * Find_spec vs full import (R33 P1 #6): preserves test compatibility with the huggingface_hub stub fixture. * React hooks set-state-in-effect lint (R33 P1 #8): codebase has 146 pre-existing violations of the same rule; studio-frontend-ci does not gate on lint. * Direct DiffusionBackend.load_model bypass (R33 P1 #9): the route is the only production entry point, and the backend helper now publishes its own diffusion-backend pending tag (round 32 P1 #3). Direct-caller hardening would require duplicating the lease check into load_model itself, which is out of scope for the route-layer security boundary. * One-segment Hub IDs (R33 P2 #10): strict 2-segment Hub id check is intentional; one-segment names are not valid Hub ids. * Cwd-relative shadow of Hub IDs (R33 P2 #11): documented side-channel tradeoff accepted in round 31 commit msg. 97 targeted backend tests pass.

kleeedolinux and others added 11 commits April 5, 2026 17:29

Add models AttnRes state and transform helpers

210d349

Add utils AttnRes hook helpers

c9c9ad9

Update attention dispatch for AttnRes hooks

406d176

Update utils exports for AttnRes

8d5279b

Update kernels exports for AttnRes

0260d1c

Update llama fast paths with AttnRes integration

c6ea5d5

Update gemma2 fast paths with AttnRes integration

30cb672

Update cohere fast paths with AttnRes integration

ee896b6

Update granite fast paths with AttnRes integration

01373a1

Update falcon_h1 fast paths with AttnRes integration

973cd54

Apply suggestions from code review

5138881

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

gemini-code-assist Bot reviewed Apr 6, 2026

View reviewed changes

chatgpt-codex-connector Bot reviewed Apr 6, 2026

View reviewed changes

danielhanchen closed this Apr 6, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add Optional Attention Residuals (AttnRes) Integration for Unsloth Fast Paths#9

Add Optional Attention Residuals (AttnRes) Integration for Unsloth Fast Paths#9
danielhanchen wants to merge 11 commits into
mainfrom
pr-4863-head

danielhanchen commented Apr 6, 2026

Uh oh!

danielhanchen commented Apr 6, 2026

Uh oh!

danielhanchen commented Apr 6, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Apr 6, 2026

Uh oh!

gemini-code-assist Bot Apr 6, 2026

Uh oh!

gemini-code-assist Bot Apr 6, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Apr 6, 2026

Uh oh!

chatgpt-codex-connector Bot Apr 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

-        self.enabled = bool(
-            self.enabled or self.config.enabled or self.kernel_hook is not None
-        )
+    def __post_init__(self):
+        if self.kernel_hook is None:
+            self.kernel_hook = self.config.kernel_hook
+        self.enabled = self.enabled or self.config.enabled or self.kernel_hook is not None

Conversation

danielhanchen commented Apr 6, 2026

Staging mirror of unslothai#4863

Original description

1. Why this PR exists

2. High-level design

3. Mathematical mapping to implementation

4. What was changed

New files

Updated files

Model integrations (attention branch only)

Uh oh!

danielhanchen commented Apr 6, 2026

Uh oh!

danielhanchen commented Apr 6, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 6, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 6, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 6, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 6, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Apr 6, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants