[Spyre-Next] Reworked rms_norm #873
Conversation
Signed-off-by: Thomas Ortner <boh@zurich.ibm.com>
|
👋 Hi! Thank you for contributing to vLLM support on Spyre. We also recommend installing prek and configuring it to check your code before every local commit.
|
bot:next-test
Signed-off-by: Thomas Ortner <boh@zurich.ibm.com>
|
How significant are the numerical differences observed, and which method is closer to a baseline non-Spyre implementation? The simplification seems good, but maybe the transpose().contiguous() serves a purpose; I've seen another case where a double transpose was needed for Spyre (REF)
lgtm code-wise - fair point from @tjohnson31415 above ^^
|
@tjohnson31415 yes, indeed. It is a fair point, and maybe @romitjain can comment here as well. The case that you reference is, I believe, specific to attention: a view operation was done beforehand, and the tensor then needed to be re-stickified via this trick. We also currently use it in our attention implementation, see https://github.com/jvlunteren/vllm-spyre/blob/80c7cc9e0c261059375780fc24cb3cd9861d3030/vllm_spyre_next/vllm_spyre_next/v1/attention/backends/spyre_attn.py#L698-L701. However, one does observe some numerical difference when comparing the old and the new approach. In particular, I created a small repro script, which fails at the 0.001 tolerance. I had an offline exchange with @romitjain and I think this was verified.
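For readers unfamiliar with the trick: the re-stickification mentioned above is essentially a forced layout round-trip. The sketch below is hypothetical and not the linked attention code; the function name `restickify` is illustrative.

```python
import torch

def restickify(t: torch.Tensor) -> torch.Tensor:
    # Double transpose with contiguous() in between: the values are
    # unchanged, but the data is rewritten into a fresh layout
    # (on Spyre hardware, this re-stickifies the tensor).
    return t.transpose(-1, -2).contiguous().transpose(-1, -2)

x = torch.randn(4, 8, 16)
y = x.view(4, 8, 16)   # stand-in for the preceding view operation
z = restickify(y)
print(torch.equal(x, z))  # True: values identical, only the layout changed
```

On CPU this is a no-op value-wise (only strides change), which is why the numerical difference discussed here only shows up on Spyre.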
|
Ah, if we are aligning with changes in torch-spyre, then I think a little numerical change is ok (when initially running in my dev env I was getting big changes, but something was misconfigured). I added a link to torch-spyre/torch-spyre#1236 to the PR description.
<!-- markdownlint-disable -->

## Description

Fix docstring inaccuracies, typos and typing.

Changes:
- cleans up docstrings after #873
- cleans up comment after #754
- typos and typing

## Test Plan

Documentation-only changes, no functional impact.

## Checklist

- [x] I have read the [contributing guidelines](https://docs.vllm.ai/projects/spyre/en/latest/contributing)
- [x] My code follows the project's code style (run `bash format.sh`)
- [ ] I have added tests for my changes (if applicable)
- [ ] I have updated the documentation (if applicable)
- [x] My commits include a `Signed-off-by:` line (DCO compliance)

---------

Signed-off-by: Yannick Schnider <Yannick.Schnider1@ibm.com>
Signed-off-by: Thomas Ortner <boh@zurich.ibm.com>
Co-authored-by: Thomas Ortner <boh@zurich.ibm.com>
Description
This PR removes the transpose + `.contiguous()` operation from rms_norm, bringing it even closer to the native upstream implementation. However, I currently observe small numerical differences, and I have prepared a repro script for that.
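To illustrate the shape of the change: the sketch below is not the actual vllm-spyre code (function names, shapes, and eps are illustrative), but it shows the layout round-trip being dropped from an otherwise standard RMSNorm.

```python
import torch

def rms_norm_old(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Previous approach (sketch): transpose + contiguous() forces a
    # layout round-trip (a re-stickification on Spyre) before normalizing.
    x = x.transpose(-1, -2).contiguous().transpose(-1, -2)
    variance = x.pow(2).mean(-1, keepdim=True)
    return x * torch.rsqrt(variance + eps) * weight

def rms_norm_new(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Simplified approach (sketch): same math, no layout manipulation,
    # matching the native upstream RMSNorm structure.
    variance = x.pow(2).mean(-1, keepdim=True)
    return x * torch.rsqrt(variance + eps) * weight

torch.manual_seed(0)
x = torch.randn(2, 8, 64)
w = torch.randn(64)
# On CPU both paths agree within tolerance; the small differences this
# PR discusses only appear on Spyre, where the layout feeds the kernel.
print(torch.allclose(rms_norm_old(x, w), rms_norm_new(x, w), atol=1e-3))
```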
cc @romitjain
Related Issues
Corresponding change in torch-spyre:
torch-spyre/torch-spyre#1236.
Test Plan
The change is not user-facing and all existing tests should pass.
Checklist
- [x] My code follows the project's code style (run `bash format.sh`)
- [x] My commits include a `Signed-off-by:` line (DCO compliance)