python3Packages.bitsandbytes: do not require cuda for non cuda build #424174

Closed
ferrine wants to merge 1 commit into NixOS:master from ferrine:bitsandbytes

Conversation

@ferrine
Contributor

@ferrine ferrine commented Jul 10, 2025

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • Nixpkgs 25.11 Release Notes (or backporting 25.05 Nixpkgs Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
  • NixOS 25.11 Release Notes (or backporting 25.05 NixOS Release notes)
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md, pkgs/README.md, maintainers/README.md and other contributing documentation in corresponding paths.

Add a 👍 reaction to pull requests you find important.

@nix-owners nix-owners bot requested a review from bcdarwin July 10, 2025 22:37
@nixpkgs-ci nixpkgs-ci bot added 6.topic: python, 10.rebuild-linux: 11-100, 10.rebuild-darwin: 1-10 labels Jul 10, 2025
@ferrine ferrine mentioned this pull request Jul 13, 2025
4 tasks
@ferrine
Contributor Author

ferrine commented Jul 15, 2025

nixpkgs-review result

Generated using nixpkgs-review.

Command: nixpkgs-review pr 424174
Commit: 313223a967f630dbe2962d86b3fb382ac631b03e


aarch64-darwin

❌ 12 packages failed to build:
  • python312Packages.mmcv
  • python312Packages.mmcv.dist
  • python312Packages.mmengine
  • python312Packages.mmengine.dist
  • python312Packages.torchao
  • python312Packages.torchao.dist
  • python313Packages.mmcv
  • python313Packages.mmcv.dist
  • python313Packages.mmengine
  • python313Packages.mmengine.dist
  • python313Packages.torchao
  • python313Packages.torchao.dist
✅ 4 packages built:
  • python312Packages.bitsandbytes
  • python312Packages.bitsandbytes.dist
  • python313Packages.bitsandbytes
  • python313Packages.bitsandbytes.dist

Error logs: `aarch64-darwin`
python312Packages.mmengine
  /private/tmp/nix-build-python3.12-mmengine-0.10.7.drv-0/source/mmengine/visualization/visualizer.py:831: UserWarning: Warning: The polygon is out of bounds, the drawn polygon may not be in the image
    warnings.warn(

tests/test_visualizer/test_visualizer.py::TestVisualizer::test_draw_featmap
/private/tmp/nix-build-python3.12-mmengine-0.10.7.drv-0/source/mmengine/visualization/visualizer.py:987: UserWarning: Since the spatial dimensions of overlaid_image: (3, 3) and featmap: torch.Size([4, 3]) are not same, the feature map will be interpolated. This may cause mismatch problems !
warnings.warn(

tests/test_visualizer/test_visualizer.py::TestVisualizer::test_init
/private/tmp/nix-build-python3.12-mmengine-0.10.7.drv-0/source/mmengine/utils/manager.py:113: UserWarning: <class 'mmengine.visualization.visualizer.Visualizer'> instance named of test_save_dir has been created, the method get_instance should not accept any other arguments
warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED tests/test_runner/test_amp.py::TestAmp::test_autocast - AssertionError: torch.bfloat16 != torch.float32
FAILED tests/test_runner/test_runner.py::TestRunner::test_test - ValueError: User specified autocast device_type must be cuda or cpu, but got mps
FAILED tests/test_runner/test_runner.py::TestRunner::test_val - ValueError: User specified autocast device_type must be cuda or cpu, but got mps
FAILED tests/test_utils/test_dl_utils/test_setup_env.py::test_setup_multi_processes - assert 28 == 4

 +  where 28 = <built-in function getNumThreads>()
 +    where <built-in function getNumThreads> = cv2.getNumThreads
    = 4 failed, 778 passed, 129 skipped, 96 deselected, 236 warnings in 216.00s (0:03:35) =
python312Packages.torchao
(3) install libomp via brew: `brew install libomp`;
(4) manually setup OpenMP and set the `OMP_PREFIX` environment variable to point to a path with `include/omp.h` under it.

Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"

To execute this test, run the following from the base repo dir:
python test/test_low_bit_optim.py TestOptim.test_optim_smoke_optim_name_AdamWFp8_bfloat16_device_cpu

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
FAILED test/test_low_bit_optim.py::TestOptim::test_optim_smoke_optim_name_AdamWFp8_bfloat16_device_mps - torch._inductor.exc.InductorError: KeyError: torch.float8_e4m3fn

Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"

To execute this test, run the following from the base repo dir:
python test/test_low_bit_optim.py TestOptim.test_optim_smoke_optim_name_AdamWFp8_bfloat16_device_mps

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
==== 64 failed, 566 passed, 2191 skipped, 187 warnings in 108.34s (0:01:48) ====

python313Packages.mmengine
  /private/tmp/nix-build-python3.13-mmengine-0.10.7.drv-0/source/mmengine/visualization/visualizer.py:831: UserWarning: Warning: The polygon is out of bounds, the drawn polygon may not be in the image
    warnings.warn(

tests/test_visualizer/test_visualizer.py::TestVisualizer::test_draw_featmap
/private/tmp/nix-build-python3.13-mmengine-0.10.7.drv-0/source/mmengine/visualization/visualizer.py:987: UserWarning: Since the spatial dimensions of overlaid_image: (3, 3) and featmap: torch.Size([4, 3]) are not same, the feature map will be interpolated. This may cause mismatch problems !
warnings.warn(

tests/test_visualizer/test_visualizer.py::TestVisualizer::test_init
/private/tmp/nix-build-python3.13-mmengine-0.10.7.drv-0/source/mmengine/utils/manager.py:113: UserWarning: <class 'mmengine.visualization.visualizer.Visualizer'> instance named of test_save_dir has been created, the method get_instance should not accept any other arguments
warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED tests/test_runner/test_amp.py::TestAmp::test_autocast - AssertionError: torch.bfloat16 != torch.float32
FAILED tests/test_runner/test_runner.py::TestRunner::test_test - ValueError: User specified autocast device_type must be cuda or cpu, but got mps
FAILED tests/test_runner/test_runner.py::TestRunner::test_val - ValueError: User specified autocast device_type must be cuda or cpu, but got mps
FAILED tests/test_utils/test_dl_utils/test_setup_env.py::test_setup_multi_processes - assert 28 == 4

 +  where 28 = <built-in function getNumThreads>()
 +    where <built-in function getNumThreads> = cv2.getNumThreads
    = 4 failed, 778 passed, 129 skipped, 96 deselected, 237 warnings in 191.38s (0:03:11) =
python313Packages.torchao
/private/tmp/nix-build-python3.13-ao-0.11.0.drv-0/torchinductor__nixbld10/pi/cpicxudqmdsjh5cm4klbtbrvy2cxwr7whxl3md2zzdjdf3orvfdf.h:11:10: fatal error: 'omp.h' file not found
   11 | #include <omp.h>
      |          ^~~~~~~
1 error generated.

OpenMP support not found. Please try one of the following solutions:
(1) Set the CXX environment variable to a compiler other than Apple clang++/g++ that has builtin OpenMP support;
(2) install OpenMP via conda: conda install llvm-openmp;
(3) install libomp via brew: brew install libomp;
(4) manually setup OpenMP and set the OMP_PREFIX environment variable to point to a path with include/omp.h under it.

Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"

To execute this test, run the following from the base repo dir:
python test/dtypes/test_floatx.py TestFloatxTensorCoreAQTTensorImpl.test_to_scaled_tc_floatx_compile_ebits_3_mbits_2_device_cpu

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
==== 64 failed, 566 passed, 2191 skipped, 187 warnings in 132.46s (0:02:12) ====

@ferrine
Contributor Author

ferrine commented Jul 15, 2025

nixpkgs-review result

Generated using nixpkgs-review.

Command: nixpkgs-review pr 424174
Commit: 313223a967f630dbe2962d86b3fb382ac631b03e


x86_64-linux

❌ 4 packages failed to build:
  • python312Packages.torchao
  • python312Packages.torchao.dist
  • python313Packages.mmcv
  • python313Packages.mmcv.dist
✅ 18 packages built:
  • python312Packages.bitsandbytes
  • python312Packages.bitsandbytes.dist
  • python312Packages.kserve
  • python312Packages.kserve.dist
  • python312Packages.mmcv
  • python312Packages.mmcv.dist
  • python312Packages.mmengine
  • python312Packages.mmengine.dist
  • python312Packages.unsloth
  • python312Packages.unsloth.dist
  • vllm (python312Packages.vllm)
  • vllm.dist (python312Packages.vllm.dist)
  • python313Packages.bitsandbytes
  • python313Packages.bitsandbytes.dist
  • python313Packages.mmengine
  • python313Packages.mmengine.dist
  • python313Packages.torchao
  • python313Packages.torchao.dist

Error logs: `x86_64-linux`
python312Packages.torchao
    warnings.warn(

test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn2d::test_qat_conv_bn_fusion_no_conv_bias
test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn2d::test_qat_conv_bn_fusion_no_conv_bias
test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn1d::test_qat_conv_bn_fusion_no_conv_bias
test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn1d::test_qat_conv_bn_fusion_no_conv_bias
/nix/store/cp3z5b1jdwa6mgfwjys8sx6vs65ygigx-python3.12-torch-2.7.1/lib/python3.12/site-packages/torch/fx/graph.py:1179: UserWarning: erase_node(batch_norm_6) on an already erased node
warnings.warn(f"erase_node({to_erase}) on an already erased node")

test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn2d::test_qat_conv_bn_fusion_no_conv_bias
test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn2d::test_qat_conv_bn_fusion_no_conv_bias
test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn1d::test_qat_conv_bn_fusion_no_conv_bias
test/quantization/pt2e/test_quantize_pt2e_qat.py::TestQuantizePT2EQAT_ConvBn1d::test_qat_conv_bn_fusion_no_conv_bias
/nix/store/cp3z5b1jdwa6mgfwjys8sx6vs65ygigx-python3.12-torch-2.7.1/lib/python3.12/site-packages/torch/fx/graph.py:1179: UserWarning: erase_node(batch_norm_7) on an already erased node
warnings.warn(f"erase_node({to_erase}) on an already erased node")

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED test/quantization/test_qat.py::TestQAT::test_qat_generic_fake_quantize - AssertionError: 0.7939453125 not greater than or equal to 0.8
==== 1 failed, 646 passed, 2148 skipped, 210 warnings in 737.01s (0:12:17) =====

python313Packages.mmcv
    ...<11 lines>...
        with_cuda=with_cuda,
        ^^^^^^^^^^^^^^^^^^^^
        with_sycl=with_sycl)
        ^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/sgyza8l7w67fvhqqz9xg2r4i3a9a0rxb-python3.13-torch-2.7.1/lib/python3.13/site-packages/torch/utils/cpp_extension.py", line 2159, in _write_ninja_file_and_compile_objects
    _run_ninja_build(
    ~~~~~~~~~~~~~~~~^
        build_directory,
        ^^^^^^^^^^^^^^^^
    ...<2 lines>...
        # that failed to build but there isn't a good way to get it here.
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        error_prefix='Error compiling objects for extension')
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/sgyza8l7w67fvhqqz9xg2r4i3a9a0rxb-python3.13-torch-2.7.1/lib/python3.13/site-packages/torch/utils/cpp_extension.py", line 2522, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

ERROR Backend subprocess exited when trying to invoke build_wheel

@wegank wegank added the 2.status: merge conflict label Aug 23, 2025
Member

@bcdarwin bcdarwin left a comment

Overall looks good although unfortunately needs a rebase - sorry about the delay.

(I don't have a strong opinion about the separation into two separate attrsets - seems Nixpkgs style is mostly to push the conditions down instead as with the previous code. This does result in a proliferation of essentially the same conditional but doesn't seem to be a big problem in practice, probably because individual packages are leaf nodes in Nixpkgs?)
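
For illustration, the two styles being compared might look roughly like the sketch below. It is not the PR's actual diff: `cudaSupport`, `cudaBuildInputs`, and the `pname` value are assumed names chosen for the example, and only the `COMPUTE_BACKEND` flag is taken from this thread.

```nix
{ lib, cudaSupport ? false, cudaBuildInputs ? [ ] }:

{
  # Style A: push the condition down into each attribute
  # (the prevailing Nixpkgs style mentioned above).
  pushedDown = {
    cmakeFlags = lib.optionals cudaSupport [
      (lib.cmakeFeature "COMPUTE_BACKEND" "cuda")
    ];
    buildInputs = lib.optionals cudaSupport cudaBuildInputs;
  };

  # Style B: keep a common attrset and merge in a CUDA-only one.
  merged = {
    pname = "example"; # hypothetical shared attribute
  }
  // lib.optionalAttrs cudaSupport {
    cmakeFlags = [ (lib.cmakeFeature "COMPUTE_BACKEND" "cuda") ];
    buildInputs = cudaBuildInputs;
  };
}
```

Note that `//` is a shallow merge, so in style B any attribute defined on both sides is replaced wholesale by the right-hand value rather than concatenated.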


cmakeFlags = [ (lib.cmakeFeature "COMPUTE_BACKEND" "cuda") ];

CUDA_HOME = "${cuda-native-redist}";
Member

Should be env.CUDA_HOME if possible (though I realize this is just preserving what was there before).
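
A minimal sketch of that suggestion, assuming the package is built with a `mkDerivation`-style builder that accepts an `env` attrset. The surrounding function and its `cuda-native-redist` argument are only scaffolding so the fragment evaluates on its own.

```nix
{ cuda-native-redist }:

{
  # Before: CUDA_HOME = "${cuda-native-redist}";  (top-level derivation attribute)
  # After: the environment variable is declared explicitly under `env`.
  env.CUDA_HOME = "${cuda-native-redist}";
}
```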

];

- doCheck = false; # tests require CUDA and also GPU access
+ doCheck = false;
Member

There should probably be a comment explaining why tests are disabled, so better to update than remove the previous one


CUDA_HOME = "${cuda-native-redist}";

NVCC_PREPEND_FLAGS = [
Member

Similarly consider updating to env.NVCC_PREPEND_FLAGS.
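
One detail worth keeping in mind when moving a list-valued variable under `env`: as far as I know, `env` values must be strings (or derivations, booleans, or integers), so a list of flags has to be joined explicitly. The sketch below assumes a hypothetical `nvccFlags` argument with a placeholder value, since the actual flag list is not shown in this thread.

```nix
{ lib, nvccFlags ? [ "-I/hypothetical/include" ] }:

{
  # `nvccFlags` is a placeholder for the real list of flags.
  # Joining with spaces yields the single string that `env` expects.
  env.NVCC_PREPEND_FLAGS = lib.concatStringsSep " " nvccFlags;
}
```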

@ferrine
Contributor Author

ferrine commented Aug 24, 2025

Yes, I also had a concern over the chosen approach and wanted to get into the feedback loop once #424179 is finished

@LunNova
Member

LunNova commented Aug 25, 2025

bitsandbytes has ROCm support upstream so there'll need to be a third set of attrs to merge in and it may get a bit repetitive using this structure.
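
As a rough sketch of how a third backend could be handled without a third merged attrset, the backend could be selected once and passed down. The `cudaSupport` and `rocmSupport` arguments are assumed names, and only the `"cuda"` value appears in this thread; `"hip"` and `"cpu"` are assumptions about what upstream's `COMPUTE_BACKEND` option accepts.

```nix
{ lib, cudaSupport ? false, rocmSupport ? false }:

let
  # Only "cuda" is confirmed by this thread; "hip" and "cpu" are assumed values.
  computeBackend =
    if cudaSupport then "cuda"
    else if rocmSupport then "hip"
    else "cpu";
in
{
  cmakeFlags = [ (lib.cmakeFeature "COMPUTE_BACKEND" computeBackend) ];
}
```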

Comment on lines +65 to 66
buildInputs = [ ];

Member

Suggested change
buildInputs = [ ];

If it is empty, we can just omit it, right?

@bcdarwin
Member

bcdarwin commented Sep 8, 2025

Due to #441275, this will now need to be rebased.

@ferrine
Contributor Author

ferrine commented Sep 12, 2025

That PR might solve the problem as well; I will check later, once the xgrammar PR is merged.

@prusnak
Member

prusnak commented Sep 14, 2025

I don't think this convoluted approach achieves anything more than has been already fixed by #441275

@ferrine
Contributor Author

ferrine commented Sep 16, 2025

Agree

@ferrine ferrine closed this Sep 16, 2025
@ferrine ferrine deleted the bitsandbytes branch September 16, 2025 07:35
Labels

  • 2.status: merge conflict (This PR has merge conflicts with the target branch)
  • 6.topic: python (Python is a high-level, general-purpose programming language.)
  • 10.rebuild-darwin: 1-10 (This PR causes between 1 and 10 packages to rebuild on Darwin.)
  • 10.rebuild-linux: 11-100 (This PR causes between 11 and 100 packages to rebuild on Linux.)

6 participants