
chatglm-ggml_q4_0.bin GGML_ASSERT ggml-metal.m:1453: false #184

Open
zwqjoy opened this issue Nov 10, 2023 · 7 comments

Comments

@zwqjoy

zwqjoy commented Nov 10, 2023

./build/bin/main -m ../GGUF_Models/chatglm-ggml_q4_0.bin -l 256 -p "你好"
GGML_ASSERT: /Users/apple/PycharmProjects/NLPProject/chatglm.cpp/third_party/ggml/src/ggml-metal.m:1453: false

@Weaxs

Weaxs commented Dec 18, 2023

Is the environment where this error occurs mac + Metal?

@XuYicong

XuYicong commented Dec 19, 2023

I ran into the same problem.
The log output is:
ggml_metal_graph_compute: command buffer 0 failed with status 5

M1 Pro, 16GB; this happens when running chatglm3-f16 with the MPS backend.

The system is Sonoma 14.2.

@Weaxs

Weaxs commented Dec 20, 2023

> I ran into the same problem. The log output is: ggml_metal_graph_compute: command buffer 0 failed with status 5
>
> M1 Pro, 16GB; this happens when running chatglm3-f16 with the MPS backend.
>
> The system is Sonoma 14.2.

Could you run uname -spm and share the output? Also, could you paste the cmake log?

@XuYicong

XuYicong commented Dec 20, 2023

> Could you run uname -spm and share the output? Also, could you paste the cmake log?

Darwin arm64 arm

cmake output in the terminal:

-- The CXX compiler identification is AppleClang 15.0.0.15000100
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMake Deprecation Warning at third_party/ggml/CMakeLists.txt:1 (cmake_minimum_required):
  Compatibility with CMake < 3.5 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.


-- The C compiler identification is AppleClang 15.0.0.15000100
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- CMAKE_SYSTEM_PROCESSOR: arm64
-- ARM detected
-- Accelerate framework found
CMake Warning (dev) at third_party/ggml/src/CMakeLists.txt:322 (install):
  Target ggml has RESOURCE files but no RESOURCE DESTINATION.
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Deprecation Warning at third_party/sentencepiece/CMakeLists.txt:15 (cmake_minimum_required):
  Compatibility with CMake < 3.5 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.


-- VERSION: 0.2.00
-- Configuring done (1.1s)
-- Generating done (0.0s)
-- Build files have been written to: /Users/xyct/llm/chatglm.cpp/build

Incidentally, the build produced this warning; I don't know whether it is related:

/Users/xyct/llm/chatglm.cpp/third_party/ggml/src/ggml.c:11895:17: warning: 'cblas_sgemm' is deprecated: first deprecated in macOS 13.3 - An updated CBLAS interface supporting ILP64 is available.  Please compile with -DACCELERATE_NEW_LAPACK to access the new headers and -DACCELERATE_LAPACK_ILP64 for ILP64 support. [-Wdeprecated-declarations]
                cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasTrans,
                ^
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/System/Library/Frameworks/vecLib.framework/Headers/cblas.h:610:6: note: 'cblas_sgemm' has been explicitly marked deprecated here
void cblas_sgemm(const enum CBLAS_ORDER __Order,
     ^

@Weaxs

Weaxs commented Dec 20, 2023

This warning shouldn't matter. Could you also share the exact command you ran and the runtime log?

@Weaxs

Weaxs commented Dec 20, 2023

The main reason I asked for the cmake log was to see the parameters used when compiling ggml; the log you posted doesn't include that part.

@XuYicong

XuYicong commented Dec 20, 2023

@Weaxs I deleted the build directory and re-ran cmake, and updated the post above with more complete output, but it still doesn't seem to include any ggml-related parameters.

The command and log are the same as the OP's, but I've already deleted the model, so re-running isn't convenient... disk space doesn't come cheap.

But I think the cause is simply running out of memory. When I run qwen-14B-Q4_K with llama.cpp, everything works at short context lengths, but setting the context to around 1000 produces exactly the same error. When I previously ran chatglm3-f16, there was a delay of a dozen or so seconds between pressing Enter and the GGML_ASSERT output, during which memory pressure climbed to the top and then dropped instantly; it was definitely running out of memory.


Update:

By modifying the code to print [ctx->command_buffers[i] error], I obtained the following error message:
Insufficient Memory (00000008:kIOGPUCommandBufferCallbackErrorOutOfMemory)
So the cause is indeed insufficient memory.

The modification: at the start of the ggml_metal_graph_compute function, change ctx->command_buffers[i] = [ctx->queue commandBuffer]; to

        MTLCommandBufferDescriptor* descriptor = [[MTLCommandBufferDescriptor alloc] init];
        descriptor.errorOptions = MTLCommandBufferErrorOptionEncoderExecutionStatus;
        ctx->command_buffers[i] = [ctx->queue commandBufferWithDescriptor:descriptor];
        [descriptor release];

and insert the following before the line that reports the error:

            NSError *error = [ctx->command_buffers[i] error];
            if (error && ([ctx->command_buffers[i] errorOptions] &
                          MTLCommandBufferErrorOptionEncoderExecutionStatus)) {
                GGML_METAL_LOG_INFO("%s", error.localizedDescription.UTF8String);
            }

This makes the error message visible.
Note: chatglm.cpp does not register an output callback for GGML_METAL_LOG_INFO, so you may need to change it to printf to see any output.
