@MollySophia MollySophia commented Mar 16, 2025

@BlinkDL's explanation of RWKV v7:
RWKV-7 as a meta-in-context learner
There are also plenty of tests on trained models posted on his X account.

Currently available RWKV v7 model repos in HF format:

Base models:

https://huggingface.co/fla-hub/rwkv7-191M-world
https://huggingface.co/fla-hub/rwkv7-0.4B-world
https://huggingface.co/fla-hub/rwkv7-1.5B-world
https://huggingface.co/fla-hub/rwkv7-2.9B-world
https://huggingface.co/fla-hub/rwkv7-0.1B-g1 (Haven't added the option to enable its capability yet.)

Distilled models:

https://huggingface.co/RWKV-Red-Team/ARWKV-R1-1B5
https://huggingface.co/RWKV-Red-Team/ARWKV-R1-7B
https://huggingface.co/RWKV-Red-Team/ARWKV_7B_R1_16K

This PR contains:

  • GGML_OP_L2_NORM, which applies PyTorch-style L2 normalization along the rows. Tested with the CPU, CUDA, SYCL, Vulkan, and Metal backends.
  • GGML_OP_RWKV_WKV7, which is the core of the RWKV v7 architecture. Implemented the naive recurrent wkv7 kernel for CPU, CUDA, SYCL, Vulkan, and Metal.
  • Support inference of RWKV7 and ARWKV7 models.
  • Simple Metal kernel for the old WKV6.
  • Skip unused tokens in the last layer's FFN computation for RWKV models.
  • Fix inference with RWKV6Qwen2.
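
For readers unfamiliar with the two new ops, here is a rough Python sketch of what they compute. It is not the ggml code. It is a single-head, pure-Python illustration written under assumptions: the function names `l2_norm_rows` and `wkv7_recurrent` are made up for this example, `eps` defaults to the value PyTorch's `F.normalize` uses, `w` is assumed to already hold the per-channel decay factor, and the state layout (`state[i][j]` pairs value channel `i` with key channel `j`) is chosen only for readability.

```python
import math


def l2_norm_rows(rows, eps=1e-12):
    """Scale each row to unit L2 norm: x / max(||x||_2, eps).

    eps guards against division by zero for all-zero rows
    (assumed default, mirroring torch.nn.functional.normalize).
    """
    out = []
    for row in rows:
        norm = math.sqrt(sum(x * x for x in row))
        scale = 1.0 / max(norm, eps)
        out.append([x * scale for x in row])
    return out


def wkv7_recurrent(r, w, k, v, a, b, state):
    """Naive single-head WKV7 recurrence (illustrative sketch only).

    r, w, k, v, a, b: lists of T vectors, each of length N (head size).
    state: N x N list of lists, updated in place.
    Returns a list of T output vectors of length N.
    """
    n = len(state)
    outputs = []
    for t in range(len(r)):
        y = [0.0] * n
        for i in range(n):
            row = state[i]
            # sa is computed from the state *before* this step's update.
            sa = sum(a[t][j] * row[j] for j in range(n))
            for j in range(n):
                # decay the old state, add the new key/value outer product,
                # and add the a/b correction term
                row[j] = row[j] * w[t][j] + v[t][i] * k[t][j] + sa * b[t][j]
            # read out with the receptance vector
            y[i] = sum(row[j] * r[t][j] for j in range(n))
        outputs.append(y)
    return outputs
```

The recurrence walks the sequence one token at a time and keeps only the N x N state, which is why it is called naive: it does no chunking or parallel scan over the time dimension.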

TODO:

  • llama-parallel seems broken with all RWKV models. Will check what's wrong and try to fix it tomorrow. (Inference is fixed, but the output seems to be mixed between the parallel sequences. Haven't figured out what's wrong yet.)
  • Why is the MUSA build failing? (It seems there are some bugs in their vectorization code. Getting rid of a #pragma unroll in wkv.cu fixes the build.)

@github-actions github-actions bot added testing Everything test related Nvidia GPU Issues specific to Nvidia GPUs Vulkan Issues specific to the Vulkan backend python python script changes ggml changes relating to the ggml tensor library for machine learning SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language Apple Metal https://en.wikipedia.org/wiki/Metal_(API) labels Mar 16, 2025
Signed-off-by: Molly Sophia <[email protected]>
Signed-off-by: Molly Sophia <[email protected]>
@MollySophia MollySophia requested a review from ggerganov March 17, 2025 07:02
@Rbiessy Rbiessy left a comment

No concern with the SYCL changes, thanks

@MollySophia MollySophia merged commit 7dfad38 into ggml-org:master Mar 17, 2025
50 checks passed
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Mar 19, 2025
* ggml: Add op l2_norm

Signed-off-by: Molly Sophia <[email protected]>

* ggml: Add op rwkv_wkv7

Signed-off-by: Molly Sophia <[email protected]>

* llama: Add support for RWKV7 and ARWKV7 models

Signed-off-by: Molly Sophia <[email protected]>

* llama: fix inference with RWKV6Qwen2

Signed-off-by: Molly Sophia <[email protected]>

* llama: add more (a)rwkv7 variants in size

Signed-off-by: Molly Sophia <[email protected]>

* Apply code-format changes

Signed-off-by: Molly Sophia <[email protected]>

* fix MUSA build

Signed-off-by: Molly Sophia <[email protected]>

* llama: fix shape error with rwkv using llama-parallel

Signed-off-by: Molly Sophia <[email protected]>

---------

Signed-off-by: Molly Sophia <[email protected]>
@heredos heredos mentioned this pull request Mar 26, 2025