Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
36 changes: 36 additions & 0 deletions pkgs/development/python-modules/gpuctypes/default.nix
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I bother you with adding optional cuda support? 🙃

I think you could just patch into https://github.com/tinygrad/gpuctypes/blob/77b94adaf8e27fa5ef7caa0bdd949bd528576d4b/gpuctypes/cuda.py#L146-L148

_libraries["libnvrtc.so"] = "${cudaPackages.cuda_nvrtc}/lib/libnvrtc.so"

and libcuda.so is "${addDriverRunpath.driverLink}/lib/libcuda.so" but you need to first test if it os.path.exists in case we're not running under NixOS

Copy link
Contributor

@SomeoneSerge SomeoneSerge Jan 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also would be nice to display a nixpkgs-specific warning about cudaSupport in the cudaSupport = false branch, because people might expect gpuctypes to just work but since they rely on the broken-by-design ctypes and find_library they can't

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd keep the CDLL bit but make it if "libnvrtc.so" not in ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Damn I'm not even sure what to suggest upstream. The thing is, CDLL("libcuda.so") is much less broken than CDLL(find_library("libcuda.so"))

Copy link

@jeremygrant jeremygrant Jan 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding this to a hacked up nixified version of tinygrad allows me to run on my gpu

(lib.optionalString cudaSupport ''
substituteInPlace gpuctypes/cuda.py --replace "ctypes.util.find_library('cuda')" "\"${addDriverRunpath.driverLink}/lib/libcuda.so\""
substituteInPlace gpuctypes/cuda.py --replace "ctypes.util.find_library('nvrtc')" "\"${cudaPackages.cuda_nvrtc}/lib/libnvrtc.so\""
'');

I haven't tested it too hard, but the following works as expected

CUDA=1 DEBUG=4 python3 -c "import tinygrad; print(tinygrad.Tensor([1,3,-5]).dot(tinygrad.Tensor([4,-2,-1])).realize().numpy())"
*** CUDA       1 copy        3,    CUDA <- CPU         arg   2 mem  0.00 GB tm     86.45us/     0.09ms (    0.00 GFLOPS,    0.00 GB/s)
*** CUDA       2 copy        3,    CUDA <- CPU         arg   2 mem  0.00 GB tm     26.82us/     0.11ms (    0.00 GFLOPS,    0.00 GB/s)
  0 ━┳ STORE MemBuffer(idx=0, dtype=dtypes.int, st=ShapeTracker(views=(View(shape=(1,), strides=(0,), offset=0, mask=None, contiguous=True),)))
  1  ┗━┳ SUM (1,)
  2    ┗━┳ MUL
  3      ┣━━ LOAD MemBuffer(idx=1, dtype=dtypes.int, st=ShapeTracker(views=(View(shape=(3,), strides=(1,), offset=0, mask=None, contiguous=True),)))
  4      ┗━━ LOAD MemBuffer(idx=2, dtype=dtypes.int, st=ShapeTracker(views=(View(shape=(3,), strides=(1,), offset=0, mask=None, contiguous=True),)))
#define INFINITY (__int_as_float(0x7f800000))
#define NAN (__int_as_float(0x7fffffff))
extern "C" __global__ void r_3(int* data0, const int* data1, const int* data2) {
  int acc0 = 0;
  int val0 = data1[0];
  int val1 = data1[1];
  int val2 = data1[2];
  int val3 = data2[0];
  int val4 = data2[1];
  int val5 = data2[2];
  float alu0 = (((float)(val2)*(float)(val5))+(((float)(val1)*(float)(val4))+(((float)(val0)*(float)(val3))+(float)(acc0))));
  acc0 = (int)(alu0);
  data0[0] = acc0;
}
*** CUDA       3 r_3                                   arg   3 mem  0.00 GB tm     17.60us/     0.13ms (    0.00 GFLOPS,    0.00 GB/s)
*** CPU        4 copy        1,     CPU <- CUDA        arg   2 mem  0.00 GB tm     36.58us/     0.17ms (    0.00 GFLOPS,    0.00 GB/s)
3
avg:     0.00 GFLOPS     0.00 GB/s           total:     4 kernels     0.00 GOPS     0.00 GB     0.17 ms

Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
{ lib
, buildPythonPackage
, fetchFromGitHub
, rocmPackages
, ocl-icd
}:
buildPythonPackage {
pname = "gpuctypes";
version = "unstable-2023-11-26";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we adopt 0.2.0?
There is nothing new in the diff with this commit.
tinygrad/gpuctypes@c969506
tinygrad/gpuctypes@c969506...3d402bb


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add pyproject = true to avoid legacy build.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did ^ and I also needed to include

nativeBuildInputs = [  setuptools ];

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@natsukium There is no pyproject.toml in the root of the repo, but only a setup.py. I'm not a python packaging expert, can you explain to me why the function argument is named pyproject and what the effect of enabling it is, when there is no pyproject.toml in the root of the repo?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said before, please read the document first.
#236913 (comment)

src = fetchFromGitHub {
repo = "gpuctypes";
owner = "tinygrad";
rev = "3d402bb3bd5536aeb09c55da02277c339b4af089";
hash = "sha256-bYSjCrVO6Bd/oGCLLp5AcPEe6fFUdnJwxCzRWK54AG8=";
};

postPatch = ''
# patch correct path to opencl
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment seems redundant since this is a common operation.
I would like to avoid inline comments to prevent unintended changes to the derivation due to comment modification.

substituteInPlace gpuctypes/opencl.py --replace "ctypes.util.find_library('OpenCL')" "'${ocl-icd}/lib/libOpenCL.so'"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opencl should also be available in darwin, we need to use darwin.apple_sdk.frameworks.OpenCL.

We can use stdenv.hostPlatform.extensions.sharedLibrary to get the extension of the shared object if necessary.


# patch correct path to hip
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

substituteInPlace gpuctypes/hip.py --replace "/opt/rocm/lib/libamdhip64.so" "${rocmPackages.clr}/lib/libamdhip64.so"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we wrap these in a rocmSupport check to avoid having to build rocm?

(lib.optionalString rocmSupport ''
substituteInPlace gpuctypes/hip.py --replace "/opt/rocm/lib/libamdhip64.so" "${rocmPackages.clr}/lib/libamdhip64.so"
substituteInPlace gpuctypes/hip.py --replace "/opt/rocm/lib/libhiprtc.so" "${rocmPackages.clr}/lib/libhiprtc.so"
'') +

substituteInPlace gpuctypes/hip.py --replace "/opt/rocm/lib/libhiprtc.so" "${rocmPackages.clr}/lib/libhiprtc.so"
'';

pythonImportsCheck = [ "gpuctypes" ];

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

meta = with lib; {
description = "Ctypes wrappers for HIP, CUDA, and OpenCL";
homepage = "https://github.com/tinygrad/gpuctypes";
license = licenses.mit;
maintainers = with maintainers; [ matthewcroughan wozeparrot ];
};

}
2 changes: 2 additions & 0 deletions pkgs/top-level/python-packages.nix
Original file line number Diff line number Diff line change
Expand Up @@ -4812,6 +4812,8 @@ self: super: with self; {

gpsoauth = callPackage ../development/python-modules/gpsoauth { };

gpuctypes = callPackage ../development/python-modules/gpuctypes { };

gpustat = callPackage ../development/python-modules/gpustat { };

gpxpy = callPackage ../development/python-modules/gpxpy { };
Expand Down