- 
                Notifications
    You must be signed in to change notification settings 
- Fork 269
Only enumerate ROCm-capable AMD GPUs #1500
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
497e20c    to
    7bf6748      
    Compare
  
    Discover AMD graphics devices using AMDKFD topology instead of enumerating the PCIe bus. This interface exposes a lot more information about potential devices, allowing RamaLama to filter out unsupported devices. Currently, devices older than GFX9 are filtered, as they are no longer supported by ROCm. Signed-off-by: Leorize <[email protected]>
7bf6748    to
    fab8765      
    Compare
  
    | At the moment, RamaLama will fallback to CPU inference for unsupported AMD GPUs since  | 
| Ideally this should be tested on AMD APUs systems. I have only verified this implementation with my dGPU. | 
| Not a big fan of inner functions, it encourages indenting for little benefit, can we just make them normal functions? | 
| Thanks for completing this, the logic seems to make sense, not that I tested | 
| As regards "I'm not sure how to detect if a Vulkan-capable accelerator is available to force passing:" We recently integrated this change: Maybe we want to default to  We could parse this just before we execute llama-server also potentially:  | 
Signed-off-by: Leorize <[email protected]>
Signed-off-by: Leorize <[email protected]>
Signed-off-by: Leorize <[email protected]>
| 
 I actually tested that, and it tries to do repacks for CPU. I'm not sure if that's a good or a bad thing. But either way I think that change should be done separately. 
 
 | 
| 
 Yeah... There's a patch from llama.cpp merged hours ago that might change this behaviour though. Did you try with that patch? 
 It's installed in all our container images with vulkan, so if it's done just before where llama-server is execute it should be fine in the containerized case. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR replaces PCIe-based enumeration with AMDKFD topology to detect ROCm-capable AMD GPUs, filters out unsupported older architectures, and accounts for VRAM across memory banks.
- Add ramalama.amdkfdmodule for parsing KFD properties and listing GPU nodes
- Update check_rocm_amd()to use KFD topology, filter gfx versions <90000, and sum VRAM from public/private heaps
- Introduce MIN_VRAM_BYTESconstant (1 GiB) for minimum VRAM threshold
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description | 
|---|---|
| ramalama/common.py | Import new module, add VRAM threshold constant, overhaul GPU scan | 
| ramalama/amdkfd.py | New utilities for reading KFD sysfs properties and listing GPUs | 
Comments suppressed due to low confidence (3)
ramalama/common.py:453
- [nitpick] The variable name npis ambiguous and commonly associated with NumPy; consider renaming tonode_pathor similar for clarity.
for i, (np, props) in enumerate(amdkfd.gpus()):
ramalama/common.py:470
- This assignment is outside the if mem_bytes > ...block, sogpu_numwill be set on every iteration rather than only when a larger VRAM GPU is found. It should be indented inside theif.
gpu_num = i
ramalama/amdkfd.py:12
- The new gpus()function andparse_props()logic lack tests; consider adding unit tests or mocks for sysfs reads to verify correct parsing and filtering.
def gpus():
| mem_banks_count = int(props['mem_banks_count']) | ||
| mem_bytes = 0 | ||
| for bank in range(mem_banks_count): | ||
| bank_props = amdkfd.parse_props(np + f'/mem_banks/{bank}/properties') | 
    
      
    
      Copilot
AI
    
    
    
      Jun 10, 2025 
    
  
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Path construction via string concatenation can be fragile; prefer os.path.join(np, 'mem_banks', str(bank), 'properties').
| bank_props = amdkfd.parse_props(np + f'/mem_banks/{bank}/properties') | |
| bank_props = amdkfd.parse_props(os.path.join(np, 'mem_banks', str(bank), 'properties')) | 
| # See /usr/include/linux/kfd_sysfs.h for possible heap types | ||
| # | ||
| # Count public and private framebuffer memory as VRAM | ||
| if bank_props['heap_type'] in [1, 2]: | 
    
      
    
      Copilot
AI
    
    
    
      Jun 10, 2025 
    
  
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Magic numbers 1 and 2 for heap types are unclear; consider defining named constants or an enum for readability and future maintenance.
| if bank_props['heap_type'] in [1, 2]: | |
| if bank_props['heap_type'] in [HEAP_TYPE_PUBLIC, HEAP_TYPE_PRIVATE]: | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is correct we should try and take this is another PR @alaviss . I have no idea what 1 and 2 mean.
Discover AMD graphics devices using AMDKFD topology instead of enumerating the PCIe bus. This interface exposes a lot more information about potential devices, allowing RamaLama to filter out unsupported devices.
Currently, devices older than GFX9 are filtered, as they are no longer supported by ROCm.
Ref: #1482
Summary by Sourcery
Use the AMDKFD sysfs topology interface to detect and select ROCm-capable AMD GPUs by architecture and VRAM capacity.
New Features: