[Issue]: hipMemSetGetAccess failing in hip-tests/catch/unit/virtualMemoryManagement #3763

Open
tjtanaa opened this issue Mar 13, 2025 · 1 comment

tjtanaa commented Mar 13, 2025

Problem Description

Description

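Two test cases in hip-tests/catch/unit/virtualMemoryManagement are failing: Unit_hipMemCreate_MapNonContiguousChunks (hipMemCreate.cc) and Unit_hipMemSetAccess_Vmm2UnifiedMemCpy (hipMemSetGetAccess.cc).

The ROCm and HIP versions are the ones shipped in the Docker image rocm/dev-ubuntu-22.04:6.3.1-complete.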

root:/app/hip-tests/build/catch_tests/unit/virtualMemoryManagement# ./VirtualMemoryManagementTest 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
VirtualMemoryManagementTest is a Catch v2.13.4 host application.
Run with -? for options

-------------------------------------------------------------------------------
Unit_hipMemCreate_MapNonContiguousChunks
-------------------------------------------------------------------------------
/app/hip-tests/catch/unit/virtualMemoryManagement/hipMemCreate.cc:272
...............................................................................

/app/hip-tests/catch/unit/virtualMemoryManagement/hipMemCreate.cc:323: FAILED:
  REQUIRE( true == std::equal(B_h.begin(), B_h.end(), C_h.data()) )
with expansion:
  true == false

Memory access fault by GPU node-9 (Agent handle: 0x2373990) on address 0x7f88a6dd5000. Reason: Unknown.
GPU core dump created: gpucore.27623
-------------------------------------------------------------------------------
Unit_hipMemSetAccess_Vmm2UnifiedMemCpy
-------------------------------------------------------------------------------
/app/hip-tests/catch/unit/virtualMemoryManagement/hipMemSetGetAccess.cc:552
...............................................................................

/app/hip-tests/catch/unit/virtualMemoryManagement/hipMemSetGetAccess.cc:552: FAILED:
  {Unknown expression after the reported line}
due to a fatal error condition:
  SIGABRT - Abort (abnormal termination) signal

===============================================================================
test cases:    24 |    22 passed | 2 failed
assertions: 73852 | 73850 passed | 2 failed

Aborted
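
Because the memory access fault is reported between the two failures, the SIGABRT in the second test case may be a side effect of the first failure rather than an independent bug. That is only a guess from the log ordering. One way to check is to run each failing case in its own process. The sketch below is not part of the original report: `failed_cases` is a hypothetical helper that pulls test case names out of a saved console log, assuming the Catch v2 console layout shown above (the name is the first line between two dashed rules, and headers are printed only for failing cases). Passing a test case name as a positional argument is standard Catch2 behavior.

```shell
#!/bin/bash
# Hypothetical helper (not from the report): print the name of every test
# case whose header appears in a Catch v2 console log read from stdin.
# Assumes the default reporter, which prints a header only for failing cases.
failed_cases() {
  awk '
    # A rule made only of dashes opens or closes a test case header.
    /^-+$/ { in_header = !in_header; first = in_header; next }
    # The first line inside a header is the test case name.
    in_header && first { if (!seen[$0]++) print; first = 0 }
  '
}

# Usage sketch (commented out so the snippet has no side effects):
#   ./VirtualMemoryManagementTest 2>&1 | tee vmm.log
#   failed_cases < vmm.log | while IFS= read -r name; do
#     ./VirtualMemoryManagementTest "$name"
#   done
```

If Unit_hipMemSetAccess_Vmm2UnifiedMemCpy passes when run alone, the abort in the combined run is more likely fallout from the earlier GPU fault.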

Operating System

Ubuntu 22.04.4 LTS (Jammy Jellyfish)

CPU

AMD EPYC 9654 96-Core Processor

GPU

AMD Instinct MI300X

ROCm Version

ROCm 6.3.1

ROCm Component

HIP

Steps to Reproduce

Steps:

  1. Launch Docker
#!/bin/bash
docker run -it \
   --network=host \
   --group-add=video \
   --ipc=host \
   --cap-add=SYS_PTRACE \
   --security-opt seccomp=unconfined \
   --device /dev/kfd \
   --device /dev/dri \
   rocm/dev-ubuntu-22.04:6.3.1-complete \
   bash
  2. Compile ROCm/hip-tests
export ROCM_BRANCH=rocm-6.3.x
git clone -b "$ROCM_BRANCH" https://github.com/ROCm/hip-tests.git
export HIPTESTS_DIR="$(readlink -f hip-tests)"
cd "$HIPTESTS_DIR"
mkdir -p build; cd build
cmake ../catch -DHIP_PLATFORM=amd -DHIP_PATH=$CLR_DIR/build/install
make build_tests
cd catch_tests/unit/virtualMemoryManagement/
./VirtualMemoryManagementTest 
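
The cmake line above reads `$CLR_DIR`, which none of the listed steps set. If it is empty, `-DHIP_PATH` expands to `/build/install`. The sketch below is an assumption about how to make that step robust, not what the reporter ran: `hip_path` is a hypothetical helper that prefers a CLR build tree when `CLR_DIR` points at one, and otherwise falls back to `ROCM_PATH` or `/opt/rocm`, the prefix implied by the `rocminfo` path in this report.

```shell
#!/bin/bash
# Hypothetical helper (not from the report): choose a value for -DHIP_PATH.
# Prefers $CLR_DIR/build/install when that directory exists; otherwise uses
# $ROCM_PATH, or /opt/rocm when ROCM_PATH is unset (assumed install prefix).
hip_path() {
  if [ -n "${CLR_DIR:-}" ] && [ -d "$CLR_DIR/build/install" ]; then
    printf '%s\n' "$CLR_DIR/build/install"
  else
    printf '%s\n' "${ROCM_PATH:-/opt/rocm}"
  fi
}

# Usage sketch (commented out so the snippet has no side effects):
#   cmake ../catch -DHIP_PLATFORM=amd -DHIP_PATH="$(hip_path)"
```

Recording which path was actually used would help others reproduce the failure against the same HIP runtime.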

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response

@ppanchad-amd

Hi @tjtanaa. An internal ticket has been created to investigate this issue. Thanks!
