Merged
72 commits
30caac3
llama : the WPM vocabs use the CLS token as BOS (#10930)
ggerganov Dec 24, 2024
09fe2e7
server: allow filtering llama server response fields (#10940)
nvrxq Dec 24, 2024
2cd43f4
ggml : more perfo with llamafile tinyblas on x86_64 (#10714)
Djip007 Dec 24, 2024
9ba399d
server : add support for "encoding_format": "base64" to the */embeddi…
elk-cloner Dec 24, 2024
d283d02
examples, ggml : fix GCC compiler warnings (#10983)
peter277 Dec 26, 2024
d79d8f3
vulkan: multi-row k quants (#10846)
netrunnereve Dec 26, 2024
16cdce7
server : fix token duplication when streaming with stop strings (#10997)
z80maniac Dec 28, 2024
f865ea1
server: added more docs for response_fields field (#10995)
isaac-mcfadyen Dec 28, 2024
fdd2188
vulkan: Use push constant offset to handle misaligned descriptors (#1…
jeffbolznv Dec 29, 2024
a813bad
vulkan: im2col and matmul optimizations for stable diffusion (#10942)
jeffbolznv Dec 29, 2024
c250ecb
android : fix llama_batch free (#11014)
ag2s20150909 Dec 30, 2024
716bd6d
vulkan: optimize mul_mat for small values of N (#10991)
jeffbolznv Dec 30, 2024
6e1531a
common, examples, ggml : fix MSYS2 GCC compiler errors and warnings w…
peter277 Dec 31, 2024
bc7b1f8
convert : fix Llama-3_1-Nemotron-51B rope settings (#11008)
ymcki Dec 31, 2024
5896c65
server : add OAI compat for /v1/completions (#10974)
ngxson Dec 31, 2024
45095a6
server : clean up built-in template detection (#11026)
ngxson Dec 31, 2024
0827b2c
ggml : fixes for AVXVNNI instruction set with MSVC and Clang (#11027)
Srihari-mcw Dec 31, 2024
a45433b
readme : add llama-swap to infrastructure section (#11032)
mostlygeek Jan 2, 2025
0da5d86
server : allow using LoRA adapters per-request (#10994)
ngxson Jan 2, 2025
2f0ee84
server: bench: minor fixes (#10765)
phymbert Jan 2, 2025
f66f582
llama : refactor `src/llama.cpp` (#10902)
ggerganov Jan 3, 2025
e7da954
metal : avoid uint (#11019)
ggerganov Jan 3, 2025
4b0c638
common : disable KV cache shifting automatically for unsupported mode…
MollySophia Jan 3, 2025
c31fc8b
fix: Vulkan shader gen binary path (#11037)
giladgd Jan 4, 2025
db68c93
ggml : improve inputs log sched_print_assignments (ggml/1053)
danbev Dec 19, 2024
5e3b08d
ggml : do not install metal source when embed library (ggml/1054)
ggerganov Jan 4, 2025
78c6785
sync : ggml
ggerganov Jan 4, 2025
46be942
llama : add support for the cohere2 model architecture (#10900)
dranger003 Jan 4, 2025
f922a9c
[GGML][RPC] Support for models with non-512-aligned tensors over RPC.…
matt23654 Jan 4, 2025
9394bbd
llama : Add support for DeepSeek V3 (#11049)
fairydreaming Jan 4, 2025
b56f079
Vulkan: Add device-specific blacklist for coopmat for the AMD proprie…
0cc4m Jan 4, 2025
46e3556
CUDA: add BF16 support (#11093)
JohannesGaessler Jan 6, 2025
5047dd3
llama : use _impl suffix instead of _internal (#11060)
ggerganov Jan 6, 2025
727368c
llama : use LLAMA_TOKEN_NULL (#11062)
ggerganov Jan 6, 2025
ae2f606
mmap : fix fileno macro clash (#11076)
ggerganov Jan 6, 2025
3e6e7a6
tokenize : escape the prompt (#11058)
ggerganov Jan 6, 2025
47182dd
llama : update llama_model API names (#11063)
ggerganov Jan 6, 2025
6369f86
llama : rename missed batch params/vars to ubatch (#10059)
danbev Jan 6, 2025
96a1dc2
llama : prevent system info string accumulation across calls (#11101)
a-ghorbani Jan 6, 2025
09186fa
llama : remove check flash_attn with lora (#11104)
ngxson Jan 6, 2025
e6e7c75
server : fix extra BOS in infill endpoint (#11106)
ggerganov Jan 6, 2025
cb6d4b3
Merge branch 'master' into master_fix
mtmcp Jan 6, 2025
96be8c3
github : add cmd line field to bug report (#11090)
ngxson Jan 6, 2025
ecebbd2
llama : remove unused headers (#11109)
ggerganov Jan 6, 2025
dc7cef9
llama-run : fix context size (#11094)
ericcurtin Jan 6, 2025
c0d6f79
SYCL: Use get_multi_ptr instead of deprecated get_pointer in wkv6 (#1…
qnixsynapse Jan 7, 2025
a4dd490
rpc : code cleanup (#11107)
rgerganov Jan 7, 2025
a3d50bc
ggml-backend : only offload from host buffers (#11120)
slaren Jan 7, 2025
017cc5f
ggml-backend : only offload from host buffers (fix) (#11124)
slaren Jan 7, 2025
53ff6b9
GGUF: C++ refactor, backend support, misc fixes (#11030)
JohannesGaessler Jan 7, 2025
bec2183
fix: Vulkan shader gen binary path when Cross-compiling (#11096)
ag2s20150909 Jan 8, 2025
02f0430
Disable GL_KHR_cooperative_matrix Vulkan extension if not available. …
mbaudier Jan 8, 2025
0d52a69
ci : fix cmake option (#11125)
ggerganov Jan 8, 2025
8cef75c
llamafile : ppc64le MMA INT8 implementation (#10912)
amritahs-ibm Jan 8, 2025
a3c1232
arg : option to exclude arguments from specific examples (#11136)
ggerganov Jan 8, 2025
80ccf5d
ci : pin dependency to specific version (#11137)
ngxson Jan 8, 2025
c792dcf
ggml : allow loading backend with env variable (ggml/1059)
rgerganov Jan 5, 2025
99a3755
sync : ggml
ggerganov Jan 8, 2025
809393c
Merge branch 'master' into sparkle_master_fix
mtmcp Jan 8, 2025
c07d437
llama : avoid hardcoded QK_K (#11061)
ggerganov Jan 8, 2025
4d2b3d8
lora : improve compat with `mergekit-extract-lora` (#11131)
ngxson Jan 8, 2025
f7cd133
ci : use actions from ggml-org (#11140)
ngxson Jan 8, 2025
1bf839b
Enhance user input handling for llama-run (#11138)
ericcurtin Jan 8, 2025
8a1d9c2
gguf-py : move scripts directory (#11116)
VJHack Jan 8, 2025
8d59d91
fix: add missing msg in static_assert (#11143)
hydai Jan 8, 2025
b9f0aad
Merge branch 'ggerganov:master' into sparkle_master_fix
mtmcp Jan 8, 2025
b229ae2
llama-chat : add phi 4 template (#11148)
ngxson Jan 9, 2025
ba00c2d
media : remove old img [no ci]
ggerganov Jan 9, 2025
1fbdd87
model: Add support for PhiMoE arch (#11003)
phymbert Jan 9, 2025
7449fb0
server : add tooltips to settings and themes btn (#11154)
danbev Jan 9, 2025
8856312
doc: add cuda guide for fedora (#11135)
teihome Jan 9, 2025
e4de404
Ignore cross-compile root path when finding programs
mtmcp Jan 9, 2025
12 changes: 11 additions & 1 deletion .github/ISSUE_TEMPLATE/010-bug-compilation.yml
@@ -65,12 +65,22 @@ body:
If possible, please do a git bisect and identify the exact commit that introduced the bug.
validations:
required: false
- type: textarea
id: command
attributes:
label: Compile command
description: >
Please provide the exact command you used to compile llama.cpp. For example: `cmake -B ...`.
This will be automatically formatted into code, so no need for backticks.
render: shell
validations:
required: true
- type: textarea
id: logs
attributes:
label: Relevant log output
description: >
Please copy and paste any relevant log output, including the command that you entered and any generated text.
Please copy and paste any relevant log output, including any generated text.
This will be automatically formatted into code, so no need for backticks.
render: shell
validations:
12 changes: 11 additions & 1 deletion .github/ISSUE_TEMPLATE/019-bug-misc.yml
@@ -52,6 +52,16 @@ body:
- Other (Please specify in the next section)
validations:
required: false
- type: textarea
id: command
attributes:
label: Command line
description: >
Please provide the exact commands you entered, if applicable. For example: `llama-server -m ... -c ...`, `llama-cli -m ...`, etc.
This will be automatically formatted into code, so no need for backticks.
render: shell
validations:
required: false
- type: textarea
id: info
attributes:
@@ -74,7 +84,7 @@ body:
attributes:
label: Relevant log output
description: >
If applicable, please copy and paste any relevant log output, including the command that you entered and any generated text.
If applicable, please copy and paste any relevant log output, including any generated text.
This will be automatically formatted into code, so no need for backticks.
render: shell
validations:
30 changes: 14 additions & 16 deletions .github/workflows/build.yml
@@ -60,8 +60,7 @@ jobs:
-DLLAMA_CURL=ON \
-DGGML_METAL_USE_BF16=ON \
-DGGML_METAL_EMBED_LIBRARY=ON \
-DGGML_RPC=ON \
-DBUILD_SHARED_LIBS=OFF
-DGGML_RPC=ON
cmake --build . --config Release -j $(sysctl -n hw.logicalcpu)

- name: Test
@@ -123,8 +122,7 @@ jobs:
-DLLAMA_FATAL_WARNINGS=ON \
-DLLAMA_CURL=ON \
-DGGML_METAL=OFF \
-DGGML_RPC=ON \
-DBUILD_SHARED_LIBS=OFF
-DGGML_RPC=ON
cmake --build build --config Release -j $(sysctl -n hw.logicalcpu)

- name: Test
@@ -181,7 +179,7 @@ jobs:
run: |
mkdir build
cd build
cmake .. -DLLAMA_FATAL_WARNINGS=ON -DLLAMA_CURL=ON -DGGML_RPC=ON -DBUILD_SHARED_LIBS=OFF
cmake .. -DLLAMA_FATAL_WARNINGS=ON -DLLAMA_CURL=ON -DGGML_RPC=ON
cmake --build . --config Release -j $(nproc)

- name: Test
@@ -651,23 +649,23 @@ jobs:
matrix:
include:
- build: 'noavx-x64'
defines: '-DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_AVX=OFF -DGGML_AVX2=OFF -DGGML_FMA=OFF -DBUILD_SHARED_LIBS=ON'
defines: '-DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_AVX=OFF -DGGML_AVX2=OFF -DGGML_FMA=OFF'
- build: 'avx2-x64'
defines: '-DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DBUILD_SHARED_LIBS=ON'
defines: '-DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON'
- build: 'avx-x64'
defines: '-DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_AVX2=OFF -DBUILD_SHARED_LIBS=ON'
defines: '-DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_AVX2=OFF'
- build: 'avx512-x64'
defines: '-DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_AVX512=ON -DBUILD_SHARED_LIBS=ON'
defines: '-DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_AVX512=ON'
- build: 'openblas-x64'
defines: '-DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_BLAS=ON -DBUILD_SHARED_LIBS=ON -DGGML_BLAS_VENDOR=OpenBLAS -DBLAS_INCLUDE_DIRS="$env:RUNNER_TEMP/openblas/include" -DBLAS_LIBRARIES="$env:RUNNER_TEMP/openblas/lib/openblas.lib"'
defines: '-DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS -DBLAS_INCLUDE_DIRS="$env:RUNNER_TEMP/openblas/include" -DBLAS_LIBRARIES="$env:RUNNER_TEMP/openblas/lib/openblas.lib"'
- build: 'kompute-x64'
defines: '-DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_KOMPUTE=ON -DKOMPUTE_OPT_DISABLE_VULKAN_VERSION_CHECK=ON -DBUILD_SHARED_LIBS=ON'
defines: '-DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_KOMPUTE=ON -DKOMPUTE_OPT_DISABLE_VULKAN_VERSION_CHECK=ON'
- build: 'vulkan-x64'
defines: '-DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_VULKAN=ON -DBUILD_SHARED_LIBS=ON'
defines: '-DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_RPC=ON -DGGML_VULKAN=ON'
- build: 'llvm-arm64'
defines: '-G "Ninja Multi-Config" -D CMAKE_TOOLCHAIN_FILE=cmake/arm64-windows-llvm.cmake -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DBUILD_SHARED_LIBS=ON'
defines: '-G "Ninja Multi-Config" -D CMAKE_TOOLCHAIN_FILE=cmake/arm64-windows-llvm.cmake -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON'
- build: 'msvc-arm64'
defines: '-G "Ninja Multi-Config" -D CMAKE_TOOLCHAIN_FILE=cmake/arm64-windows-msvc.cmake -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DBUILD_SHARED_LIBS=ON'
defines: '-G "Ninja Multi-Config" -D CMAKE_TOOLCHAIN_FILE=cmake/arm64-windows-msvc.cmake -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON'
- build: 'llvm-arm64-opencl-adreno'
defines: '-G "Ninja Multi-Config" -D CMAKE_TOOLCHAIN_FILE=cmake/arm64-windows-llvm.cmake -DCMAKE_PREFIX_PATH="$env:RUNNER_TEMP/opencl-arm64-release" -DGGML_OPENCL=ON -DGGML_OPENCL_USE_ADRENO_KERNELS=ON'

@@ -914,7 +912,7 @@ jobs:
shell: cmd
run: |
call "C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Auxiliary\Build\vcvars64.bat"
cmake -S . -B build -G "Ninja Multi-Config" -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_CUDA=ON -DBUILD_SHARED_LIBS=ON -DGGML_RPC=ON
cmake -S . -B build -G "Ninja Multi-Config" -DGGML_NATIVE=OFF -DLLAMA_BUILD_SERVER=ON -DGGML_CUDA=ON -DGGML_RPC=ON
set /A NINJA_JOBS=%NUMBER_OF_PROCESSORS%-1
cmake --build build --config Release -j %NINJA_JOBS% -t ggml
cmake --build build --config Release
@@ -1239,7 +1237,7 @@ jobs:

- name: Create release
id: create_release
uses: anzz1/action-create-release@v1
uses: ggml-org/action-create-release@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
3 changes: 1 addition & 2 deletions .github/workflows/docker.yml
@@ -97,10 +97,9 @@ jobs:
GITHUB_BRANCH_NAME: ${{ github.head_ref || github.ref_name }}
GITHUB_REPOSITORY_OWNER: '${{ github.repository_owner }}'

# https://github.com/jlumbroso/free-disk-space/tree/54081f138730dfa15788a46383842cd2f914a1be#example
- name: Free Disk Space (Ubuntu)
if: ${{ matrix.config.free_disk_space == true }}
uses: jlumbroso/free-disk-space@main
uses: ggml-org/free-disk-space@v1.3.1
with:
# this might remove tools that are actually needed,
# if set to "true" but frees about 6 GB
4 changes: 3 additions & 1 deletion .github/workflows/editorconfig.yml
@@ -23,5 +23,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: editorconfig-checker/action-editorconfig-checker@main
- uses: editorconfig-checker/action-editorconfig-checker@v2
with:
version: v3.0.3
- run: editorconfig-checker
8 changes: 7 additions & 1 deletion CODEOWNERS
@@ -1,5 +1,11 @@
# collaborators can optionally add themselves here to indicate their availability for reviewing related PRs

/ci/ @ggerganov
/.devops/ @ngxson
/.devops/*.Dockerfile @ngxson
/examples/server/ @ngxson
/ggml/src/ggml-cuda/fattn* @JohannesGaessler
/ggml/src/ggml-cuda/mmq.* @JohannesGaessler
/ggml/src/ggml-cuda/mmv.* @JohannesGaessler
/ggml/src/ggml-cuda/mmvq.* @JohannesGaessler
/ggml/src/ggml-opt.cpp @JohannesGaessler
/ggml/src/gguf.cpp @JohannesGaessler
2 changes: 2 additions & 0 deletions README.md
@@ -69,6 +69,7 @@ Instructions for adding support for new models: [HOWTO-add-model.md](docs/develo
- [x] [Qwen models](https://huggingface.co/models?search=Qwen/Qwen)
- [x] [PLaMo-13B](https://github.com/ggerganov/llama.cpp/pull/3557)
- [x] [Phi models](https://huggingface.co/models?search=microsoft/phi)
- [x] [PhiMoE](https://github.com/ggerganov/llama.cpp/pull/11003)
- [x] [GPT-2](https://huggingface.co/gpt2)
- [x] [Orion 14B](https://github.com/ggerganov/llama.cpp/pull/5118)
- [x] [InternLM2](https://huggingface.co/models?search=internlm2)
@@ -201,6 +202,7 @@ Instructions for adding support for new models: [HOWTO-add-model.md](docs/develo
- [Paddler](https://github.com/distantmagic/paddler) - Stateful load balancer custom-tailored for llama.cpp
- [GPUStack](https://github.com/gpustack/gpustack) - Manage GPU clusters for running LLMs
- [llama_cpp_canister](https://github.com/onicai/llama_cpp_canister) - llama.cpp as a smart contract on the Internet Computer, using WebAssembly
- [llama-swap](https://github.com/mostlygeek/llama-swap) - transparent proxy that adds automatic model switching with llama-server

</details>

21 changes: 15 additions & 6 deletions common/arg.cpp
@@ -22,6 +22,11 @@ common_arg & common_arg::set_examples(std::initializer_list<enum llama_example>
return *this;
}

common_arg & common_arg::set_excludes(std::initializer_list<enum llama_example> excludes) {
this->excludes = std::move(excludes);
return *this;
}

common_arg & common_arg::set_env(const char * env) {
help = help + "\n(env: " + env + ")";
this->env = env;
@@ -37,6 +42,10 @@ bool common_arg::in_example(enum llama_example ex) {
return examples.find(ex) != examples.end();
}

bool common_arg::is_exclude(enum llama_example ex) {
return excludes.find(ex) != excludes.end();
}

bool common_arg::get_value_from_env(std::string & output) {
if (env == nullptr) return false;
char * value = std::getenv(env);
@@ -420,7 +429,7 @@ common_params_context common_params_parser_init(common_params & params, llama_ex
* - if both {LLAMA_EXAMPLE_COMMON, LLAMA_EXAMPLE_*,} are set, we will prioritize the LLAMA_EXAMPLE_* matching current example
*/
auto add_opt = [&](common_arg arg) {
if (arg.in_example(ex) || arg.in_example(LLAMA_EXAMPLE_COMMON)) {
if ((arg.in_example(ex) || arg.in_example(LLAMA_EXAMPLE_COMMON)) && !arg.is_exclude(ex)) {
ctx_arg.options.push_back(std::move(arg));
}
};
@@ -649,7 +658,7 @@ common_params_context common_params_parser_init(common_params & params, llama_ex
[](common_params & params, const std::string & value) {
params.prompt = value;
}
));
).set_excludes({LLAMA_EXAMPLE_SERVER}));
add_opt(common_arg(
{"--no-perf"},
string_format("disable internal libllama performance timings (default: %s)", params.no_perf ? "true" : "false"),
@@ -673,7 +682,7 @@ common_params_context common_params_parser_init(common_params & params, llama_ex
params.prompt.pop_back();
}
}
));
).set_excludes({LLAMA_EXAMPLE_SERVER}));
add_opt(common_arg(
{"--in-file"}, "FNAME",
"an input file (repeat to specify multiple files)",
@@ -700,7 +709,7 @@ common_params_context common_params_parser_init(common_params & params, llama_ex
params.prompt = ss.str();
fprintf(stderr, "Read %zu bytes from binary file %s\n", params.prompt.size(), value.c_str());
}
));
).set_excludes({LLAMA_EXAMPLE_SERVER}));
add_opt(common_arg(
{"-e", "--escape"},
string_format("process escapes sequences (\\n, \\r, \\t, \\', \\\", \\\\) (default: %s)", params.escape ? "true" : "false"),
@@ -1512,15 +1521,15 @@ common_params_context common_params_parser_init(common_params & params, llama_ex
{"--lora"}, "FNAME",
"path to LoRA adapter (can be repeated to use multiple adapters)",
[](common_params & params, const std::string & value) {
params.lora_adapters.push_back({ std::string(value), 1.0 });
params.lora_adapters.push_back({ std::string(value), 1.0, nullptr });
}
// we define this arg on both COMMON and EXPORT_LORA, so when showing help message of export-lora, it will be categorized as "example-specific" arg
).set_examples({LLAMA_EXAMPLE_COMMON, LLAMA_EXAMPLE_EXPORT_LORA}));
add_opt(common_arg(
{"--lora-scaled"}, "FNAME", "SCALE",
"path to LoRA adapter with user defined scaling (can be repeated to use multiple adapters)",
[](common_params & params, const std::string & fname, const std::string & scale) {
params.lora_adapters.push_back({ fname, std::stof(scale) });
params.lora_adapters.push_back({ fname, std::stof(scale), nullptr });
}
// we define this arg on both COMMON and EXPORT_LORA, so when showing help message of export-lora, it will be categorized as "example-specific" arg
).set_examples({LLAMA_EXAMPLE_COMMON, LLAMA_EXAMPLE_EXPORT_LORA}));
3 changes: 3 additions & 0 deletions common/arg.h
@@ -12,6 +12,7 @@

struct common_arg {
std::set<enum llama_example> examples = {LLAMA_EXAMPLE_COMMON};
std::set<enum llama_example> excludes = {};
std::vector<const char *> args;
const char * value_hint = nullptr; // help text or example for arg value
const char * value_hint_2 = nullptr; // for second arg value
@@ -53,9 +54,11 @@ struct common_arg {
) : args(args), value_hint(value_hint), value_hint_2(value_hint_2), help(help), handler_str_str(handler) {}

common_arg & set_examples(std::initializer_list<enum llama_example> examples);
common_arg & set_excludes(std::initializer_list<enum llama_example> excludes);
common_arg & set_env(const char * env);
common_arg & set_sparam();
bool in_example(enum llama_example ex);
bool is_exclude(enum llama_example ex);
bool get_value_from_env(std::string & output);
bool has_value_from_env();
std::string to_string();
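
For readers skimming the diff, the following is a minimal, self-contained sketch of the include/exclude filtering pattern these hunks introduce. `Example`, `Arg`, and the local `add_opt` lambda below are simplified, hypothetical stand-ins for `llama_example`, `common_arg`, and the parser logic in `common/arg.cpp`, not the actual llama.cpp definitions:

// Simplified illustration of the set_excludes()/is_exclude() filtering added in this PR.
// Example, Arg, and add_opt are hypothetical stand-ins, not the real llama.cpp types.
#include <cstdio>
#include <initializer_list>
#include <set>
#include <string>
#include <vector>

enum Example { EX_COMMON, EX_MAIN, EX_SERVER };

struct Arg {
    std::string name;
    std::set<Example> examples = {EX_COMMON};
    std::set<Example> excludes = {};

    explicit Arg(std::string n) : name(std::move(n)) {}
    Arg & set_examples(std::initializer_list<Example> ex) { examples = ex; return *this; }
    Arg & set_excludes(std::initializer_list<Example> ex) { excludes = ex; return *this; }
    bool in_example(Example ex) const { return examples.count(ex) > 0; }
    bool is_exclude(Example ex) const { return excludes.count(ex) > 0; }
};

int main() {
    const Example current = EX_SERVER;  // the example currently being parsed, e.g. llama-server
    std::vector<Arg> options;

    auto add_opt = [&](Arg arg) {
        // same condition as the patched add_opt lambda: common/example-specific args are
        // registered unless the current example is explicitly excluded
        if ((arg.in_example(current) || arg.in_example(EX_COMMON)) && !arg.is_exclude(current)) {
            options.push_back(std::move(arg));
        }
    };

    add_opt(Arg("--prompt").set_excludes({EX_SERVER}));  // dropped for the server example
    add_opt(Arg("--no-perf"));                           // kept: common and not excluded

    for (const auto & o : options) {
        std::printf("registered: %s\n", o.name.c_str());  // prints only --no-perf
    }
    return 0;
}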