Improve primitives for FP6 quant #248

Merged 90 commits on May 25, 2024
Commits
97924d7
add fp16_to_fp6 prototype
gau-nernst May 15, 2024
8bf081c
minor rename
gau-nernst May 15, 2024
558f4e4
Merge branch 'pytorch:main' into fp6_quant
gau-nernst May 16, 2024
314e9f6
fix rounding issue
gau-nernst May 16, 2024
030b956
Merge branch 'pytorch:main' into fp6_quant
gau-nernst May 16, 2024
79ce0db
update quant
gau-nernst May 16, 2024
45a92f3
add unpacked version
gau-nernst May 16, 2024
a8555e3
remove unnecessary comment
gau-nernst May 16, 2024
012176e
add CUDA version
gau-nernst May 16, 2024
d4b8681
add fp6 packed cpu
gau-nernst May 16, 2024
f0f3101
add CUDA for packed
gau-nernst May 16, 2024
f542eb1
some rename
gau-nernst May 16, 2024
40dc725
update name
gau-nernst May 16, 2024
3a98874
Merge branch 'main' into fp6_quant
gau-nernst May 17, 2024
eef2f95
add OpenMP
gau-nernst May 17, 2024
f61aa37
fix CUDA bug
gau-nernst May 17, 2024
1640bbf
add fp6->fp16
gau-nernst May 17, 2024
7a00b31
add FP6->FP32
gau-nernst May 17, 2024
b2fcc6c
move files around
gau-nernst May 17, 2024
d9ca476
rearrange stuff
gau-nernst May 17, 2024
ba89a0b
add more things
gau-nernst May 18, 2024
faf8682
Merge branch 'pytorch:main' into fp6_quant
gau-nernst May 18, 2024
e7b3135
update
gau-nernst May 18, 2024
8b3ac04
update. add comments
gau-nernst May 18, 2024
0635882
some rename. add some tests
gau-nernst May 18, 2024
4240692
add fp32->fp6 unpacked
gau-nernst May 18, 2024
7eb6fa8
fix
gau-nernst May 18, 2024
1c0e401
Merge branch 'main' into fp6_quant
gau-nernst May 18, 2024
26669b6
use template. add BF16
gau-nernst May 19, 2024
e09b61f
use template
gau-nernst May 19, 2024
887bac2
simplify API. add BF16 support via templates
gau-nernst May 19, 2024
4b5c99f
typo
gau-nernst May 19, 2024
39f9dce
enable OpenMP via compile flags
gau-nernst May 19, 2024
b681ae1
add memory access optimized version (though it is not faster..)
gau-nernst May 19, 2024
7c5fcd3
use fp32 mul impl for CUDA
gau-nernst May 19, 2024
82e4e60
add test case
gau-nernst May 19, 2024
fb18c73
typo. remove OpenMP since we cannot throw exception
gau-nernst May 19, 2024
a3c5e36
fix rounding for subnormal
gau-nernst May 19, 2024
27781e5
add to_fp6_value()
gau-nernst May 20, 2024
7d9dd34
simplify to_fp6_unpacked_cuda
gau-nernst May 20, 2024
7fb8c8b
simplify to_fp6_packed_cuda
gau-nernst May 20, 2024
965838c
clean up CPU impl
gau-nernst May 20, 2024
a64421e
add FP6->FP16/BF16
gau-nernst May 20, 2024
a4b7c7a
add dim check
gau-nernst May 20, 2024
3e4c1c1
add qtorch to dev req
gau-nernst May 20, 2024
632af93
handle exception with OpenMP
gau-nernst May 20, 2024
6c6fe83
handle exception in OpenMP
gau-nernst May 20, 2024
cb08b37
add tests
gau-nernst May 20, 2024
9f94030
more tests
gau-nernst May 20, 2024
7b7e823
simplify test
gau-nernst May 20, 2024
7c1ff7d
rename
gau-nernst May 20, 2024
0472b06
add back checks
gau-nernst May 20, 2024
0bda927
update docs
gau-nernst May 20, 2024
a21837c
add pure pytorch impl
gau-nernst May 20, 2024
6101869
add benchmark
gau-nernst May 20, 2024
81d7aeb
Merge branch 'main' into fp6_quant
msaroufim May 20, 2024
4c1da5f
Merge branch 'main' into fp6_quant
gau-nernst May 21, 2024
df7932b
update benchmark script
gau-nernst May 21, 2024
bdbd907
add triton kernel
gau-nernst May 21, 2024
42bf771
remove CUDA kernel
gau-nernst May 21, 2024
f178b01
move to_fp6 to dtypes/
gau-nernst May 21, 2024
ee2310c
add to_fp6 import
gau-nernst May 21, 2024
404f700
move tests
gau-nernst May 21, 2024
48fe45f
update benchmark script
gau-nernst May 21, 2024
5126f8f
add from_fp6
gau-nernst May 21, 2024
da767fa
migrate test
gau-nernst May 21, 2024
0b56ecf
add docs
gau-nernst May 21, 2024
110e888
add docs
gau-nernst May 21, 2024
3e2643c
add torch.compile test
gau-nernst May 21, 2024
71ecc45
Merge branch 'main' into fp6_quant
msaroufim May 21, 2024
750fbc6
polish docs
gau-nernst May 21, 2024
6a3f0c0
remove original weight dequant
gau-nernst May 21, 2024
f32d09f
remove weight dequant
gau-nernst May 21, 2024
8b5b81e
improve tests
gau-nernst May 22, 2024
a3cf93b
update names
gau-nernst May 22, 2024
3c636ff
rename
gau-nernst May 22, 2024
f672c70
update names
gau-nernst May 22, 2024
1a310e3
add notes about denormal numbers
gau-nernst May 23, 2024
c9ec255
update note
gau-nernst May 23, 2024
d1697e7
Merge branch 'main' into fp6_quant
gau-nernst May 25, 2024
8c86028
Merge branch 'main' into fp6_quant
gau-nernst May 25, 2024
d24dba8
fix merge problem
gau-nernst May 25, 2024
ce5dac1
fix merge conflict
gau-nernst May 25, 2024
922446d
add to_fp6 CPU C++ kernel
gau-nernst May 25, 2024
d287eb3
add from_fp6 cpu C++
gau-nernst May 25, 2024
ce7e09a
rename
gau-nernst May 25, 2024
22007a1
add some comments
gau-nernst May 25, 2024
f97421a
small cleanup
gau-nernst May 25, 2024
f727de0
always use uint32_t for bit manipulation
gau-nernst May 25, 2024
78e79ac
simplify test
gau-nernst May 25, 2024
Merge branch 'pytorch:main' into fp6_quant
gau-nernst authored May 16, 2024
commit 558f4e4c967cad0513142259dd765f074ccb20e9

This merge commit was added into this branch cleanly.

There are no new changes to show.