
[MX | Triton] Create MX matmul op using new scaled_dot op in Triton #1084

Open
drisspg opened this issue Oct 15, 2024 · 1 comment
Assignees
Labels
enhancement New feature or request good first issue Good for newcomers mx

Comments

drisspg (Contributor) commented Oct 15, 2024

Summary

Triton recently added a scaled_dot op, which consumes A and B in f8/f6/f4 formats (packed into an int32 layout) together with u8m0 scales passed via an int8 datatype: https://github.com/triton-lang/triton/pull/4795/files#diff-1d96a0ed473569188c00d6e16c54dd7050e0a66040438ac630c889aef7cbbbe8R1544
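For intuition on the scale format: u8m0 (E8M0) is an exponent-only encoding, where each 8-bit value represents a power of two. A minimal decoding sketch in Python, assuming the conventional exponent bias of 127 (the function name is illustrative, not an API in Triton or torchao):

```python
def decode_u8m0(x: int) -> float:
    """Decode a u8m0 (E8M0) scale byte.

    The byte is an unsigned 8-bit exponent with bias 127,
    representing the power of two 2**(x - 127). No sign or
    mantissa bits exist in this format.
    """
    assert 0 <= x <= 255, "u8m0 is a single unsigned byte"
    return 2.0 ** (x - 127)

# A stored exponent of 127 decodes to a scale of 1.0;
# 130 decodes to 2**3 = 8.0; 124 decodes to 2**-3 = 0.125.
```

This is why the scales can travel through the kernel as a plain int8 tensor: only the exponent byte needs to be materialized, and the multiply by the decoded power of two happens inside the op.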

Steps

  1. Implement the new MX matmul in Triton, and add utilities so the op is only exposed when a new enough Triton version is installed
  2. Write unit tests verifying the correctness of the implementation against the existing upcast-and-matmul approach
  3. Update MXTensor's dispatch to use the new op (based on config) instead of the current path, which upcasts and runs the matmul in the original precision:
    # current fallback: upcast both operands, then run the aten op in high precision
    a = args[0]
    b = args[1]
    assert isinstance(a, MXTensor) and isinstance(b, MXTensor)
    a_hp = a.to_dtype(a._orig_dtype)
    b_hp = b.to_dtype(b._orig_dtype)
    res = aten_op(a_hp, b_hp)
  4. Create profile + memory traces
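The version gating in step 1 could be sketched as a small helper. This is a sketch under assumptions: the attribute name `dot_scaled` and the minimum version `3.2.0` are illustrative guesses, not confirmed values, and feature probing is preferred over version parsing where possible.

```python
import importlib.util


def has_triton_scaled_dot(min_version: str = "3.2.0") -> bool:
    """Return True if the installed Triton is new enough to expose the
    scaled-dot op. The probed attribute name and the minimum version
    default are assumptions for illustration only."""
    if importlib.util.find_spec("triton") is None:
        return False
    import triton
    import triton.language as tl

    # Feature detection: most robust if the attribute name is known.
    if hasattr(tl, "dot_scaled"):
        return True

    # Fallback: compare version tuples, dropping any local suffix
    # such as "+gitabc123".
    installed = tuple(int(p) for p in triton.__version__.split("+")[0].split(".")[:3])
    required = tuple(int(p) for p in min_version.split("."))
    return installed >= required
```

The op registration could then be guarded with `if has_triton_scaled_dot(): ...`, so older Triton installs silently fall back to the upcast path in step 3.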
@drisspg drisspg added enhancement New feature or request mx labels Oct 15, 2024
@drisspg drisspg changed the title [MX] Create MX linear using new scaled_dot op in Triton [MX | Triton] Create MX matmul op using new scaled_dot op in Triton Oct 15, 2024
@msaroufim msaroufim added the good first issue Good for newcomers label Oct 17, 2024
yanbing-j pushed a commit to yanbing-j/ao that referenced this issue Dec 9, 2024
@andrejonasson

For others interested in this issue: while this feature has been merged to triton/main, it does not yet appear in a released Triton version; the latest release, 3.1.0 (released October 14, 2024, https://pypi.org/project/triton/3.1.0/#history), does not include it.

@drisspg drisspg self-assigned this Jan 8, 2025