
main : don't print special tokens with --grammar #6923

Merged (15 commits) on May 25, 2024

Conversation

@jart (Contributor) commented Apr 26, 2024

The CLI interface was recently changed to print special control tokens like the `</s>` stop message one. This token shouldn't be printed if the grammar flag was passed, unless the grammar specifies it, because that breaks shell-scriptability.
@mofosyne added the `bugfix` (fixes an issue or bug) and `Review Complexity : Low` (trivial changes that most beginner devs can tackle, e.g. UI fix) labels on May 9, 2024
@HanClinto (Collaborator) commented:
> The CLI interface was recently changed to print special control tokens like the `</s>` stop message one.

Looks like this was #6807 ?

> This token shouldn't be printed if the grammar flag was passed

Looks like your PR accomplishes this part...

> unless the grammar specifies it,

... but is this condition included in your PR? It looks like the PR checks whether any grammar is present at all, and doesn't check to see if the grammar specifies special control tokens (?).

That said, I'm not entirely sure how a grammar would specify a special control token (I probably need to learn the tokenizer better...). If there's a mechanism or escape sequence to tag part of a grammar as a control token (rather than regular text), then I'm not aware of it. But now that you mention it, that sounds like the sort of thing we may want to add support for later (?).

> because that breaks shell-scriptability.

Do you have an example of the sort of script that fails without this PR?

Apologies if any of my questions seem ignorant or dense -- mainly trying to get up to speed with what you're talking about. I trust that what you wrote is important, and I would love it if you could help me understand it better.

Thank you very much!

@jart (Contributor, Author) commented May 17, 2024

See:

I like to ask LLMs yes/no questions in shell scripts. I use the grammar flag to force it to only print yes or no. If it instead prints "no<s>" or "no<|end-of-turn|>", then that breaks my shell script if statements.
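A minimal sketch of the failure mode (using hypothetical hard-coded strings rather than invoking llama.cpp itself): a trailing special token makes an otherwise grammar-constrained answer fail a plain string comparison.

```shell
#!/bin/sh
# Simulated model outputs: one clean, one with a trailing stop token appended
clean='no'
dirty='no</s>'

# A typical yes/no gate in a shell script
for answer in "$clean" "$dirty"; do
  if [ "$answer" = "no" ]; then
    echo "matched: [$answer]"
  else
    echo "broken:  [$answer]"
  fi
done
```

The second iteration prints the "broken" branch, because `no</s>` is not byte-identical to `no`.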

@HanClinto (Collaborator) commented:
That makes sense.

I don't use shell scripts with grammars, but I wonder if this functionality would be better added behind a command-line option to specifically render or hide special tokens? If I'm trying to debug a grammar-constrained generation, I would tend to want to display the special tokens rather than hide them.

How do you feel about separating this flag out into its own option?

@mofosyne (Collaborator) left a review comment:

Logic checks out. Intent is sensible

What's your thought about @HanClinto's idea of separating it out into its own flag? Regardless, it should be safe to make that a separate PR and merge this in anyway, as the default behavior of omitting special tokens with a grammar makes sense.


FYI: Android CI is a bit broken in the main branch, but I see a PR coming in soon to fix it. This PR doesn't touch Android anyway.

@mofosyne added the `merge ready` label (indicates that this may be ready to merge soon and is just holding out in case of objections) on May 18, 2024
@ggerganov (Owner) commented:

> I don't use shell scripts with grammars, but I wonder if this functionality would be better added behind a command-line option to specifically render or hide special tokens? If I'm trying to debug a grammar-constrained generation, I would tend to want to display the special tokens rather than hide them.
>
> How do you feel about separating this flag out into its own option?

I agree it's better to have this as a separate flag. It should be consolidated with the existing conversation flag: rename it and reuse it.

@mofosyne (Collaborator) commented:

A possible approach that @jart could use is in jart#3, addressing @HanClinto's idea of separating out the flag. I investigated @ggerganov's idea of consolidating with the existing conversation flag, but there is a significant enough difference in semantics that I could not merge them.

Feel free to adjust as needed, or ignore it if it adds too much complexity to this PR.

@jart (Contributor, Author) commented May 18, 2024

One thing you could do is print the special tokens out of band to file descriptor 3. Then if a shell script doesn't want them, it could either pass a flag to disable them, or simply say 3>/dev/null.
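A sketch of that idea in plain shell. The `emit` function below is a hypothetical stand-in for the CLI, not llama.cpp's actual behavior; the point is the file-descriptor plumbing on the caller's side.

```shell
#!/bin/sh
# Stand-in for the CLI: answer text goes to stdout, special tokens to fd 3
emit() {
  printf 'no'         # grammar-constrained answer -> stdout
  printf '</s>' >&3   # special token -> fd 3, out of band
}

# A shell script that only wants the answer discards fd 3
answer=$(emit 3>/dev/null)
echo "$answer"        # prints: no

# A debugging session can merge fd 3 back into stdout to see everything
emit 3>&1             # prints: no</s>
echo
```

With this layout, scripts get clean output by default redirection, and nothing is lost for anyone who wants to inspect the special tokens.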

@teleprint-me (Contributor) commented:

I haven't had a chance to look at the grammar yet, but I'm wondering what the logic is behind including the special tokens in the output?

@mofosyne (Collaborator) commented May 21, 2024

To test this, you can try the following:

```shell
cmake -B build -DCMAKE_BUILD_TYPE=Debug
cmake --build build

echo "== Expect Control Token To Shared Console 3>&1 =="
./build/bin/main --hf-repo TheBloke/phi-2-GGUF --hf-file phi-2.Q6_K.gguf --grammar 'root ::= "yes" | "no"' --temp 0 -c 0 --no-display-prompt --log-disable -p "<|user|>
Say yes
<|assistant|>" 2>/dev/null 3>&1
echo
echo "== Expect No Control Token To Console because 3>/dev/null =="
./build/bin/main --hf-repo TheBloke/phi-2-GGUF --hf-file phi-2.Q6_K.gguf --grammar 'root ::= "yes" | "no"' --temp 0 -c 0 --no-display-prompt --log-disable -p "<|user|>
Say yes
<|assistant|>" 2>/dev/null 3>/dev/null
echo
echo "== Expect No Control Token To Console because 3>&- =="
./build/bin/main --hf-repo TheBloke/phi-2-GGUF --hf-file phi-2.Q6_K.gguf --grammar 'root ::= "yes" | "no"' --temp 0 -c 0 --no-display-prompt --log-disable -p "<|user|>
Say yes
<|assistant|>" 2>/dev/null 3>&-
echo
echo "== Expect No Control Token To Console as we are still in grammar mode =="
./build/bin/main --hf-repo TheBloke/phi-2-GGUF --hf-file phi-2.Q6_K.gguf --grammar 'root ::= "yes" | "no"' --temp 0 -c 0 --no-display-prompt --log-disable -p "<|user|>
Say yes
<|assistant|>" 2>/dev/null
echo
echo "== Expect Control Token To Console as we are in normal completion mode =="
./build/bin/main --hf-repo TheBloke/phi-2-GGUF --hf-file phi-2.Q6_K.gguf --temp 0 -c 0 --no-display-prompt --log-disable -p "<|user|>
Hi
<|assistant|>" 2>/dev/null 3>&1
echo
```

@mofosyne (Collaborator) commented May 21, 2024

> I haven't had a chance to look at the grammar yet, but I'm wondering what the logic is behind including the special tokens in the output?

Might be handy for debugging at least.

Also, it has tokens showing the split between the user, the assistant, and end of text. That might be handy for integration if not using the APIs or libraries for some reason.

It does get me thinking whether it's possible to also separate the input special tokens so that the special tokens are fully out of band. That might help make it a little more secure.


Anyway, is everyone happy enough with the new changes?

Review comment on llama.h (outdated, resolved)
github-actions bot commented May 21, 2024

📈 llama.cpp server for bench-server-baseline on Standard_NC4as_T4_v3 for phi-2-q4_0: 535 iterations 🚀

  • Concurrent users: 8, duration: 10m
  • HTTP request : avg=8728.07ms p(95)=22198.78ms fails=, finish reason: stop=481 truncated=54
  • Prompt processing (pp): avg=102.85tk/s p(95)=415.52tk/s
  • Token generation (tg): avg=32.3tk/s p(95)=46.58tk/s
  • ggml-org/models/phi-2/ggml-model-q4_0.gguf parallel=8 ctx-size=16384 ngl=33 batch-size=2048 ubatch-size=256 pp=1024 pp+tg=2048 branch=grammar-token commit=e75c5ca4512cef4bdd7470e4e756bf3d0af60ff3

(Benchmark charts for prompt_tokens_seconds, predicted_tokens_seconds, kv_cache_usage_ratio, and requests_processing; raw chart data omitted.)

Review comment on examples/main/main.cpp (outdated, resolved)
@mofosyne mofosyne requested review from ggerganov, phymbert and ngxson and removed request for ggerganov May 25, 2024 07:05
@mofosyne (Collaborator) commented:

@ngxson thanks. Confident that full consensus has been reached now. Merging.

@mofosyne mofosyne merged commit 00c6390 into ggerganov:master May 25, 2024
71 checks passed
Labels: `bugfix`, `examples`, `merge ready`, `Review Complexity : Low`
7 participants