@phillip-kravtsov
Contributor

@phillip-kravtsov phillip-kravtsov commented Sep 29, 2023

  • Adds Persimmon 8B which is, architecturally, a standard dense transformer with:
    • Q/K layernorm
    • Squared ReLU activations
    • partial RoPE
    • very large vocab size (most unused for text)

To support partial RoPE and squared ReLU, this PR adds concat and square kernels for Metal.
I've confirmed that the GGML and HF implementations agree, down to the tensor values in the last layer.
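
For readers unfamiliar with the two less common pieces, here is a minimal scalar sketch in plain C++ (not the GGML or Metal kernels from this PR). The function names, the `(i, i + n_rot/2)` pairing convention, and the frequency base of 10000 are assumptions made for illustration only. Squared ReLU is `max(x, 0)^2`; partial RoPE rotates only the first `n_rot` dimensions of a head and passes the rest through unchanged, which is where a concat step comes in.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Squared ReLU: max(x, 0)^2, applied element-wise.
inline float squared_relu(float x) {
    const float r = x > 0.0f ? x : 0.0f;
    return r * r;
}

// Partial RoPE on a single head vector (hypothetical helper).
// Only the first n_rot dimensions are rotated; the remaining dimensions are
// copied through unchanged, which is equivalent to rotating a slice and
// concatenating it with the untouched slice.
// Requires n_rot to be even and n_rot <= head.size().
// The pairing (i, i + n_rot/2) and the default base are assumptions.
std::vector<float> partial_rope(const std::vector<float>& head,
                                std::size_t n_rot,
                                int pos,
                                float base = 10000.0f) {
    std::vector<float> out(head);  // pass-through part is already in place
    const std::size_t half = n_rot / 2;
    for (std::size_t i = 0; i < half; ++i) {
        const float freq  = std::pow(base, -2.0f * static_cast<float>(i) / static_cast<float>(n_rot));
        const float angle = static_cast<float>(pos) * freq;
        const float c = std::cos(angle);
        const float s = std::sin(angle);
        const float x0 = head[i];
        const float x1 = head[i + half];
        out[i]        = x0 * c - x1 * s;
        out[i + half] = x0 * s + x1 * c;
    }
    return out;
}
```

At position 0 every rotation angle is zero, so `partial_rope` returns its input unchanged, which makes a convenient sanity check when comparing implementations.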

@ggerganov ggerganov added the "high priority" (Very important issue) and "model" (Model specific) labels Sep 30, 2023
@ggerganov
Member

Let's resolve the CI failures and merge.

@phillip-kravtsov phillip-kravtsov force-pushed the phillip-kravtsov/support-adept-persimmon-8b branch from 92acb44 to 5d259d3 Compare October 5, 2023 18:04
@ggerganov ggerganov merged commit 0e797c2 into ggml-org:master Oct 7, 2023
@slaren
Member

slaren commented Oct 7, 2023

The switches in llm_load_hparams and llama_build_graph are missing breaks, so it should be using the refact graph. Does this work currently?
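
To illustrate the kind of bug being described, here is a minimal, self-contained C++ sketch of switch fall-through. The `Arch` enum and the `pick_graph_*` functions are hypothetical stand-ins, not the actual llama.cpp code: without a `break`, the first case falls through into the next one and its result is overwritten.

```cpp
#include <string>

// Hypothetical architecture enum, for illustration only.
enum class Arch { Persimmon, Refact };

// Buggy version: the Persimmon case has no break, so execution continues
// into the Refact case and the earlier assignment is overwritten.
std::string pick_graph_buggy(Arch arch) {
    std::string graph = "none";
    switch (arch) {
        case Arch::Persimmon:
            graph = "persimmon";
            // no break here, so control continues into the next case
        case Arch::Refact:
            graph = "refact";
            break;
    }
    return graph;
}

// Fixed version: each case ends with a break.
std::string pick_graph_fixed(Arch arch) {
    std::string graph = "none";
    switch (arch) {
        case Arch::Persimmon:
            graph = "persimmon";
            break;
        case Arch::Refact:
            graph = "refact";
            break;
    }
    return graph;
}
```

With the buggy version, `pick_graph_buggy(Arch::Persimmon)` yields `"refact"`. GCC and Clang can flag this pattern with `-Wimplicit-fallthrough`.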

@ggerganov
Member

@phillip-kravtsov PTAL at @slaren's comment and fix as necessary

@KerfuffleV2
Contributor

I got tired of seeing the compiler warning and created #3535 (not sure if there are any other issues, haven't had a chance to test it yet).

@phillip-kravtsov
Contributor Author

Thanks for the fix @KerfuffleV2 -- that PR should be sufficient.

joelkuiper added a commit to vortext/llama.cpp that referenced this pull request Oct 12, 2023
…example

* 'master' of github.com:ggerganov/llama.cpp:
  py : change version of numpy requirement to 1.24.4 (ggml-org#3515)
  quantize : fail fast on write errors (ggml-org#3521)
  metal : support default.metallib load & reuse code for swift package (ggml-org#3522)
  llm : support Adept Persimmon 8B (ggml-org#3410)
  Fix for ggml-org#3454 (ggml-org#3455)
  readme : update models, cuda + ppl instructions (ggml-org#3510)
  server : docs fix default values and add n_probs (ggml-org#3506)