Conversation
|
Add fp8<->float conversion with standard rounding. |
|
This PR adds:
|
aosewski
left a comment
There was a problem hiding this comment.
Looks very good. I just have few observations regarding overall code structure. I'm not familiar with those bit-wise algorithms so I'm not even touching it :P
|
Casting functions are still far from optimal, I'll work on their performance as we have a test case to track conversion performance. |
aosewski
left a comment
There was a problem hiding this comment.
Thanks for refactoring files! The only one thing which I don't feel is OK is the relation between type_convert.hpp and data_type.hpp headers. Please see comments,
|
@aosewski, thanks for your review, I've refactored the type_convert.hpp, could you review the change? |
| using f8_t = uint8_t; | ||
| using half_t = _Float16; |
There was a problem hiding this comment.
Oh I've overlooked this. Please move the fp8 type alias to data_type.hpp and remove the half_t type alias. As I said for such simple type aliases definitions I think we should keep all the data type definitions in data_type.hpp header. I imagine an exception to this rule only when we would define our own class for a specific data type.
aosewski
left a comment
There was a problem hiding this comment.
I've overlooked the place of f8_t type definition. Other than that looks good. I do not want to prolong the review process, thus I approve. But please move this type definition and everything should be fine then. Thanks!
|
@aosewski, looks like I've overridden your approval, could you approve one more time? |
[CK] Add render group to AITER and FA dockers MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ## Motivation The AITER and FA test dockers (`Dockerfile.aiter`, `Dockerfile.fa`) inherit from the `rocm/pytorch` base image. Recent updates to that base image dropped the `render` group from `/etc/group`, so every parallel test stage now fails on the test agents with: ``` docker: Error response from daemon: Unable to find group render: no matching entries in group file. ``` Jenkins resolves `--group-add render` against the **container's** `/etc/group`, not the host's, so even though the test agents have render in their `/etc/group` (GID 109), the container lookup fails. This pattern affects every recent develop build ([#673](http://micimaster.amd.com/blue/organizations/jenkins/rocm-libraries-folder%2FComposable%20Kernel/detail/develop/673), [#674](http://micimaster.amd.com/blue/organizations/jenkins/rocm-libraries-folder%2FComposable%20Kernel/detail/develop/674), [#686](http://micimaster.amd.com/blue/organizations/jenkins/rocm-libraries-folder%2FComposable%20Kernel/detail/develop/686), [#688](http://micimaster.amd.com/blue/organizations/jenkins/rocm-libraries-folder%2FComposable%20Kernel/detail/develop/688), [#699](http://micimaster.amd.com/blue/organizations/jenkins/rocm-libraries-folder%2FComposable%20Kernel/detail/develop/699), [#708](http://micimaster.amd.com/blue/organizations/jenkins/rocm-libraries-folder%2FComposable%20Kernel/detail/develop/708) — 6 days in a row), where AITER tests fail in seconds and the cascading failure aborts all downstream Build/FMHA/TILE_ENGINE stages. ## Technical Details Add `groupadd -f render` to both `Dockerfile.aiter` and `Dockerfile.fa`, mirroring what the main `Dockerfile` already does (`Dockerfile:96`) and what `Dockerfile.pytorch` does (`Dockerfile.pytorch:4`). The `-f` flag makes it idempotent — silently succeeds if the group already exists. This guarantees the `render` group is always present in the container, regardless of whether the base image happens to ship it. ## Test Plan Triggering AITER CI job: ## Test Result ## Submission Checklist - [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
Add a PRNG