ggerganov (Member) commented Nov 4, 2024

TODO:

  • fix build and tests

ggerganov and others added 18 commits November 4, 2024 10:50
… MobileVLM model. (llama/9763)

* ggml: Add POOL2D OP for GPU ACC to the Vulkan.

- The MobileVLM model now supports inference acceleration through GPU by utilizing the Vulkan backend.
- A GGML_OP_POOL_2D shader has been added. (Pooling)
- The encoding performance of the CLIP model improved from 2.8s on the CPU to 0.7s on the GPU.

Signed-off-by: Changyeon Kim <[email protected]>

* [fix] Correct the incorrect order of the parameters.

fix casting to int.

Signed-off-by: Changyeon Kim <[email protected]>

---------

Signed-off-by: Changyeon Kim <[email protected]>
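
For readers unfamiliar with the operation named in the commit above, here is a minimal CPU reference sketch of what a 2D pooling op computes. It is not the Vulkan shader from this commit and not ggml's implementation; `pool2d`, `PoolKind`, and the single-channel, square-kernel, no-padding layout are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical names for illustration only; not part of ggml's API.
enum class PoolKind { Avg, Max };

// Pools a single-channel h x w image stored row-major.
// k is the square kernel size, s the stride; no padding is applied.
std::vector<float> pool2d(const std::vector<float>& src, int h, int w,
                          int k, int s, PoolKind kind) {
    assert(k > 0 && s > 0 && h >= k && w >= k);
    assert(src.size() == static_cast<std::size_t>(h) * w);

    const int oh = (h - k) / s + 1;  // output height
    const int ow = (w - k) / s + 1;  // output width
    std::vector<float> dst(static_cast<std::size_t>(oh) * ow);

    for (int oy = 0; oy < oh; ++oy) {
        for (int ox = 0; ox < ow; ++ox) {
            // Max pooling starts from -inf, average pooling from zero.
            float acc = (kind == PoolKind::Max)
                            ? -std::numeric_limits<float>::infinity()
                            : 0.0f;
            for (int ky = 0; ky < k; ++ky) {
                for (int kx = 0; kx < k; ++kx) {
                    const float v = src[(oy * s + ky) * w + (ox * s + kx)];
                    acc = (kind == PoolKind::Max) ? std::max(acc, v) : acc + v;
                }
            }
            dst[oy * ow + ox] =
                (kind == PoolKind::Max) ? acc : acc / static_cast<float>(k * k);
        }
    }
    return dst;
}
```

Each output element depends only on its own input window, which is why the operation maps naturally onto one GPU invocation per output element.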
* ggml : RISC-V vector gemv for q4_0_8x8

* ggml : Added WIP rvv q4_0_8x8 gemm

* ggml : Added initial implementation of rvv gemm

* ggml : optimize gemm to avoid register spillover

* ggml : Fix GCC rvv load alignment issue

* ggml : Format gemm rvv code

* ggml : Fix a typo in RVV q4_0_8_8 GEMM
* ggml : fix gguf string leak when reading kv pairs fails

* ggml : avoid crashing with GGML_ABORT when the KV has an invalid type

* ggml : avoid crashing on failed memory allocations when loading a gguf file
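
The three fixes above share one pattern: report a read or allocation failure to the caller instead of aborting, and release whatever was already read. The sketch below illustrates that pattern under stated assumptions. It is not the gguf reader in ggml: `Cursor`, `read_str`, `read_kv_pairs`, the 32-bit host-endian length prefix, and the `g_live_strings` counter are all hypothetical and exist only to make the cleanup path visible.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <vector>

// All names here are hypothetical; this is not ggml's gguf code.
static int g_live_strings = 0;  // strings currently allocated, for leak checks

struct Cursor {
    const uint8_t* data;
    size_t size;
    size_t pos;
};

// Reads a length-prefixed string (u32 length, then bytes). Returns nullptr on
// truncated input or failed allocation instead of aborting.
static char* read_str(Cursor& c) {
    uint32_t len = 0;
    if (c.size - c.pos < sizeof(len)) return nullptr;
    std::memcpy(&len, c.data + c.pos, sizeof(len));
    c.pos += sizeof(len);
    if (c.size - c.pos < len) return nullptr;
    char* s = static_cast<char*>(std::malloc(static_cast<size_t>(len) + 1));
    if (!s) return nullptr;
    std::memcpy(s, c.data + c.pos, len);
    s[len] = '\0';
    c.pos += len;
    ++g_live_strings;
    return s;
}

static void free_str(char* s) {
    if (s) { std::free(s); --g_live_strings; }
}

struct KvPair { char* key; char* value; };

static void free_pairs(std::vector<KvPair>& kv) {
    for (KvPair& p : kv) { free_str(p.key); free_str(p.value); }
    kv.clear();
}

// Reads n key/value pairs. On any failure, frees the half-read pair and all
// earlier pairs, then returns false so the caller can handle the error.
static bool read_kv_pairs(Cursor& c, size_t n, std::vector<KvPair>& out) {
    for (size_t i = 0; i < n; ++i) {
        KvPair p{nullptr, nullptr};
        p.key = read_str(c);
        if (p.key) p.value = read_str(c);
        if (!p.key || !p.value) {
            free_str(p.key);
            free_pairs(out);
            return false;
        }
        out.push_back(p);
    }
    return true;
}

// Helper for building test input in the same (assumed) layout.
static void append_str(std::vector<uint8_t>& buf, const char* s) {
    const uint32_t len = static_cast<uint32_t>(std::strlen(s));
    const uint8_t* p = reinterpret_cast<const uint8_t*>(&len);
    buf.insert(buf.end(), p, p + sizeof(len));
    buf.insert(buf.end(), s, s + len);
}
```

Feeding `read_kv_pairs` a buffer whose last value is truncated should return `false` with `g_live_strings` back at zero, which is the property the leak fix is after.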
Get in line with the other backends by supporting the newer
backend/device registry interfaces.

Signed-off-by: Sergio Lopez <[email protected]>
This is a more or less direct translation from the Metal implementation
to GLSL.

Signed-off-by: Sergio Lopez <[email protected]>
* llama : fix buffer checks for mamba and rwkv

* llama : fix missing worst case flag during reserve

* cuda : fix supports_op for norm

* disable sched SET_CAUSE
* llama : add simple-chat example

---------

Co-authored-by: Xuan Son Nguyen <[email protected]>
* metal : minor fixup in FA kernel

ggml-ci

* metal : use the unrolled loop variable

* metal : remove unused var
ggerganov changed the title from "sync : llam.cpp" to "sync : llama.cpp" Nov 4, 2024
slaren (Member) commented Nov 4, 2024

test-opt should just be disabled until it is updated in #988; the opt interface has been removed, so it cannot be updated here.

It looks like other tests are failing too. I will update them.

slaren (Member) commented Nov 4, 2024

I disabled all tests and examples that depend on ggml_opt. They should be re-enabled or removed in #988.

ggerganov marked this pull request as ready for review November 4, 2024 17:37
ggerganov merged commit f3c1e6a into master Nov 4, 2024
4 checks passed
ggerganov deleted the sync branch November 4, 2024 17:42