
[MX | Triton] Create MX matmul op using new scaled_dot op in Triton #1084

Open
drisspg opened this issue Oct 15, 2024 · 1 comment
Assignees
Labels
enhancement New feature or request good first issue Good for newcomers mx

Comments

drisspg (Contributor) commented Oct 15, 2024

Summary

Triton recently added a scaled_dot op, which consumes A and B in f8/f6/f4 formats (packed into an int32 layout) together with u8m0 scales passed via an int8 datatype: https://github.com/triton-lang/triton/pull/4795/files#diff-1d96a0ed473569188c00d6e16c54dd7050e0a66040438ac630c889aef7cbbbe8R1544
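For intuition on the scale format: u8m0 (E8M0) is an exponent-only encoding, where each 8-bit value represents a power of two. A minimal decoding sketch in Python, assuming the conventional exponent bias of 127 (the function name is illustrative, not an API in Triton or torchao):

```python
def decode_u8m0(x: int) -> float:
    """Decode a u8m0 (E8M0) scale byte.

    The byte is an unsigned 8-bit exponent with bias 127,
    representing the power of two 2**(x - 127). No sign or
    mantissa bits exist in this format.
    """
    assert 0 <= x <= 255, "u8m0 is a single unsigned byte"
    return 2.0 ** (x - 127)

# A stored exponent of 127 decodes to a scale of 1.0;
# 130 decodes to 2**3 = 8.0; 124 decodes to 2**-3 = 0.125.
```

This is why the scales can travel through the kernel as a plain int8 tensor: only the exponent byte needs to be materialized, and the multiply by the decoded power of two happens inside the op.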

Steps

  1. Implement the new MX matmul in Triton, and add utilities so the op is only exposed when a new enough Triton version is installed
  2. Write unit tests verifying the correctness of the implementation against the existing upcast-and-matmul approach
  3. Update MXTensor's dispatch to use the new op (based on config) instead of the current path, which upcasts and runs the matmul in the original precision:
    # current fallback: upcast both operands, then run the aten op in high precision
    a = args[0]
    b = args[1]
    assert isinstance(a, MXTensor) and isinstance(b, MXTensor)
    a_hp = a.to_dtype(a._orig_dtype)
    b_hp = b.to_dtype(b._orig_dtype)
    res = aten_op(a_hp, b_hp)
  4. Create profile + memory traces
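The version gating in step 1 could be sketched as a small helper. This is a sketch under assumptions: the attribute name `dot_scaled` and the minimum version `3.2.0` are illustrative guesses, not confirmed values, and feature probing is preferred over version parsing where possible.

```python
import importlib.util


def has_triton_scaled_dot(min_version: str = "3.2.0") -> bool:
    """Return True if the installed Triton is new enough to expose the
    scaled-dot op. The probed attribute name and the minimum version
    default are assumptions for illustration only."""
    if importlib.util.find_spec("triton") is None:
        return False
    import triton
    import triton.language as tl

    # Feature detection: most robust if the attribute name is known.
    if hasattr(tl, "dot_scaled"):
        return True

    # Fallback: compare version tuples, dropping any local suffix
    # such as "+gitabc123".
    installed = tuple(int(p) for p in triton.__version__.split("+")[0].split(".")[:3])
    required = tuple(int(p) for p in min_version.split("."))
    return installed >= required
```

The op registration could then be guarded with `if has_triton_scaled_dot(): ...`, so older Triton installs silently fall back to the upcast path in step 3.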
@drisspg drisspg added enhancement New feature or request mx labels Oct 15, 2024
@drisspg drisspg changed the title [MX] Create MX linear using new scaled_dot op in Triton [MX | Triton] Create MX matmul op using new scaled_dot op in Triton Oct 15, 2024
@msaroufim msaroufim added the good first issue Good for newcomers label Oct 17, 2024
yanbing-j pushed a commit to yanbing-j/ao that referenced this issue Dec 9, 2024
@andrejonasson

For others interested in this issue: while this feature has been merged to triton/main, it does not yet appear in a released Triton version; the latest release, 3.1.0 (released October 14, 2024, https://pypi.org/project/triton/3.1.0/#history), does not include it.

@drisspg drisspg self-assigned this Jan 8, 2025