[Spyre-Next] 🎨 Fix docstring inaccuracies and typos #880
yannicks1 merged 17 commits into torch-spyre:main
Conversation
Signed-off-by: Yannick Schnider <Yannick.Schnider1@ibm.com>
Key differences from upstream:
- Uses transpose(-1, -2) for computation efficiency on Spyre
- Creates epsilon tensor via torch.ops.spyre.full() instead of scalar
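The transpose(-1, -2) point quoted above can be illustrated with a minimal sketch; the function name and shapes here are hypothetical, not taken from this PR's code:

```python
import torch

def spyre_friendly_matmul(x: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    # Hypothetical sketch: swapping the last two dims with transpose(-1, -2)
    # is a cheap stride/view change rather than a data copy, which is the
    # kind of efficiency the docstring alludes to.
    return x @ w.transpose(-1, -2)

x = torch.randn(2, 4, 8)
w = torch.randn(16, 8)   # stored as (out_features, in_features)
y = spyre_friendly_matmul(x, w)
print(y.shape)  # torch.Size([2, 4, 16])
```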
@bohnstingl I was wondering: is this still true? Is `full` a custom Spyre op, or do we use torch.full?
No, this is not true anymore. We can now just use torch.full with torch-spyre.
@@ -18,7 +18,6 @@
- Minimum batch size: 64 (due to spyre constraint, automatically padded)
- Device dtype: float16 (converted for CPU)
- Output dtype: bfloat16 (converted on CPU)
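The "minimum batch size: 64, automatically padded" behavior described in this hunk could look roughly like the following; the function name and zero-padding choice are assumptions for illustration:

```python
import torch

MIN_BATCH = 64  # Spyre constraint mentioned in the docstring

def pad_batch(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch: pad dim 0 up to MIN_BATCH with zero rows."""
    batch = x.shape[0]
    if batch >= MIN_BATCH:
        return x
    pad_rows = torch.zeros(MIN_BATCH - batch, *x.shape[1:], dtype=x.dtype)
    return torch.cat([x, pad_rows], dim=0)

padded = pad_batch(torch.randn(10, 32))
print(padded.shape)  # torch.Size([64, 32])
```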
@bohnstingl Is this always bfloat16 or is it just matching the input data type?
Actually, maybe we can rephrase these dtypes a bit in general?
The input dtype is defined by the model, or respectively by the user. The computation in our wrappers is then always carried out in torch.float16.
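The dtype convention described here (model/user-defined dtype in, torch.float16 compute, cast back on the way out) can be sketched as follows; the wrapper name and the placeholder kernel are hypothetical:

```python
import torch

def spyre_compute_wrapper(x: torch.Tensor) -> torch.Tensor:
    # Sketch of the convention discussed above: whatever dtype the model
    # or user supplies, compute internally in float16, then cast the
    # result back to the input dtype.
    in_dtype = x.dtype
    y = x.to(torch.float16) * 2.0  # placeholder for the real computation
    return y.to(in_dtype)

out = spyre_compute_wrapper(torch.ones(4, dtype=torch.bfloat16))
print(out.dtype)  # torch.bfloat16
```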
Do you have a suggestion? Something like replacing
`- Output dtype: bfloat16 (converted on CPU)`
with
`- Output dtype: model data type / user defined (converted on CPU)`
What about the bfloat16 mentions in lines 158 and 168?
👋 Hi! Thank you for contributing to vLLM support on Spyre. We also recommend installing prek and configuring it to check your code before every local commit.
bohnstingl
left a comment
Thank you very much @yannicks1 for opening this PR. I think it is very valuable to do these kinds of refactors every once in a while 😊
Signed-off-by: Thomas Ortner <boh@zurich.ibm.com>
@yannicks1, could you please rebase to latest main?
[Docstrings] Some additional docstring updates
Signed-off-by: Yannick Schnider <Yannick.Schnider1@ibm.com>
Signed-off-by: Thomas Ortner <boh@zurich.ibm.com>
…ocstrings Signed-off-by: Thomas Ortner <boh@zurich.ibm.com>
bohnstingl
left a comment
@yannicks1 I created a PR with small changes again
Docstrings
Signed-off-by: Thomas Ortner <boh@zurich.ibm.com>
Added debug info about CustomOps
Description
Fix docstring inaccuracies, typos and typing.
Changes:
Test Plan
Documentation-only changes, no functional impact.
Checklist
- Code formatted (bash format.sh)
- Signed-off-by: line included (DCO compliance)