Merged
Changes from 18 commits
36 commits
2a5c270
Enable external file and add datestamp
pudepiedj Sep 30, 2023
f71068f
Add name of external file at end
pudepiedj Sep 30, 2023
0dde56c
Upload ToK2024
pudepiedj Sep 30, 2023
9d6533b
Delete ToK2024.txt
pudepiedj Sep 30, 2023
3c2d677
Merge branch 'ggerganov:master' into load-parallel-prompt-file
pudepiedj Oct 2, 2023
3e41cba
Experiments with jeopardy
pudepiedj Oct 2, 2023
2fd71e2
Merge branch 'load-parallel-prompt-file' of https://github.com/pudepi…
pudepiedj Oct 2, 2023
d673691
Move ParallelQuestions to /proimpts and rename
pudepiedj Oct 2, 2023
e293ebd
Merge branch 'ggerganov:master' into load-parallel-prompt-file
pudepiedj Oct 2, 2023
51196a4
Interim commit
pudepiedj Oct 3, 2023
2e3dad3
Interim commit
pudepiedj Oct 3, 2023
b343833
Final revision
pudepiedj Oct 3, 2023
ce10861
Merge branch 'ggerganov:master' into load-parallel-prompt-file
pudepiedj Oct 3, 2023
af2fbb8
Merge branch 'ggerganov:master' into Update-load-parallel-prompt-file
pudepiedj Oct 3, 2023
fc1ba35
Merge remote-tracking branch 'origin/load-parallel-prompt-file' into …
pudepiedj Oct 3, 2023
bf8c4df
Merge branch 'Update-load-parallel-prompt-file' of https://github.com…
pudepiedj Oct 3, 2023
0286818
Remove trailing whitespace
pudepiedj Oct 3, 2023
18b342d
remove cmake_all.sh
pudepiedj Oct 3, 2023
5366375
Remove cmake_all.sh
pudepiedj Oct 4, 2023
bbfec95
Merge branch 'ggerganov:master' into load-parallel-prompt-file
pudepiedj Oct 4, 2023
2f0181b
Changed .gitignore
pudepiedj Oct 4, 2023
b805ec2
Merge branch 'load-parallel-prompt-file' of https://github.com/pudepi…
pudepiedj Oct 4, 2023
f75fe38
Improved reporting and new question files.
pudepiedj Oct 4, 2023
a02e042
Corrected typo
pudepiedj Oct 4, 2023
000c468
More LLM questions
pudepiedj Oct 4, 2023
f630096
Merge branch 'ggerganov:master' into load-parallel-prompt-file
pudepiedj Oct 4, 2023
b505cfb
Update LLM-questions.txt
pudepiedj Oct 4, 2023
8394762
Merge branch 'load-parallel-prompt-file' of https://github.com/pudepi…
pudepiedj Oct 4, 2023
e9aa6e9
Yet more LLM-questions
pudepiedj Oct 5, 2023
325fcb7
Remove jeopardy results file
pudepiedj Oct 5, 2023
db44b46
Merge branch 'ggerganov:master' into load-parallel-prompt-file
pudepiedj Oct 5, 2023
1c4c8cd
Merge branch 'ggerganov:master' into load-parallel-prompt-file
pudepiedj Oct 6, 2023
8b7d88a
Reinstate original jeopardy.sh
pudepiedj Oct 6, 2023
84b43bb
Merge branch 'load-parallel-prompt-file' of https://github.com/pudepi…
pudepiedj Oct 6, 2023
defffb6
Merge branch 'ggerganov:master' into load-parallel-prompt-file
pudepiedj Oct 6, 2023
4bded6e
Update examples/parallel/parallel.cpp
ggerganov Oct 6, 2023
3 changes: 3 additions & 0 deletions .gitignore
@@ -28,6 +28,9 @@ build*/
out/
tmp/

cmake-all.sh
cmake_all.sh

models/*
models-mnt

6 changes: 6 additions & 0 deletions cmake_all.sh
@@ -0,0 +1,6 @@
cd llama.cpp
rm -r build
cmake -B build
cd build
cmake --build . --config Release
cd ..
2 changes: 2 additions & 0 deletions common/common.cpp
@@ -167,6 +167,8 @@ bool gpt_params_parse(int argc, char ** argv, gpt_params & params) {
invalid_param = true;
break;
}
// store the external file name in params
params.prompt_file = argv[i];
std::copy(std::istreambuf_iterator<char>(file), std::istreambuf_iterator<char>(), back_inserter(params.prompt));
if (params.prompt.back() == '\n') {
params.prompt.pop_back();
1 change: 1 addition & 0 deletions common/common.h
@@ -79,6 +79,7 @@ struct gpt_params {
std::string model_draft = ""; // draft model for speculative decoding
std::string model_alias = "unknown"; // model alias
std::string prompt = "";
std::string prompt_file = ""; // store the external prompt file name
std::string path_prompt_cache = ""; // path to file for saving/loading prompt eval state
std::string input_prefix = ""; // string to prefix user inputs with
std::string input_suffix = ""; // string to suffix user inputs with
2 changes: 1 addition & 1 deletion examples/jeopardy/README.md
@@ -2,7 +2,7 @@

This is pretty much just a straight port of aigoopy/llm-jeopardy/ with an added graph viewer.

The jeopardy test can be used to compare the fact knowledge of different models and compare them to eachother. This is in contrast to some other tests, which test logical deduction, creativity, writing skills, etc.
The jeopardy test can be used to compare the fact knowledge of different models and compare them to each other. This is in contrast to some other tests, which test logical deduction, creativity, writing skills, etc.


Step 1: Open jeopardy.sh and modify the following:
9 changes: 5 additions & 4 deletions examples/jeopardy/jeopardy.sh
@@ -1,18 +1,19 @@
#!/bin/bash
set -e

MODEL=./models/ggml-vicuna-13b-1.1-q4_0.bin
MODEL_NAME=Vicuna
MODEL=./models/llama-2-7b/ggml-model-q4_0.gguf
MODEL_NAME=Llama2-7b-ggml-q4

# exec options
prefix="Human: " # Ex. Vicuna uses "Human: "
prefix="Question: " # Ex. Vicuna uses "Human: "
opts="--temp 0 -n 80" # additional flags
nl='
'
introduction="You will be playing a game of Jeopardy. Simply answer the question in the correct format (Ex. What is Paris, or Who is George Washington)."

# file options
question_file=./examples/jeopardy/questions.txt
# create the results file if it doesn't exist; otherwise just update its timestamp
touch ./examples/jeopardy/results/$MODEL_NAME.txt
output_file=./examples/jeopardy/results/$MODEL_NAME.txt

@@ -21,7 +22,7 @@ counter=1
echo 'Running'
while IFS= read -r question
do
exe_cmd="./main -p "\"$prefix$introduction$nl$prefix$question\"" "$opts" -m ""\"$MODEL\""" >> ""\"$output_file\""
exe_cmd="./build/bin/main -p "\"$nl$prefix$question\"" "$opts" -m ""\"$MODEL\""" >> ""\"$output_file\""
echo $counter
echo "Current Question: $question"
eval "$exe_cmd"
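The per-question loop that jeopardy.sh builds around `./build/bin/main` can be sketched with a stub command in its place; file names here are illustrative, and `echo` stands in for the model invocation:

```shell
#!/bin/bash
set -e

# illustrative stand-ins for the real question and results files
question_file=./demo-questions.txt
output_file=./demo-results.txt
printf 'What is Paris?\nWho is George Washington?\n' > "$question_file"

# create the results file if missing, as jeopardy.sh does with touch
touch "$output_file"
: > "$output_file"

counter=1
while IFS= read -r question
do
    # stub: echo takes the place of the ./build/bin/main call
    echo "A$counter: $question" >> "$output_file"
    counter=$((counter + 1))
done < "$question_file"

cat "$output_file"
```

`IFS= read -r` keeps leading whitespace and backslashes in each question intact, which matters when questions are passed verbatim into the prompt.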
657 changes: 657 additions & 0 deletions examples/jeopardy/results/Llama2-7b-ggml-q4.txt

Large diffs are not rendered by default.

52 changes: 47 additions & 5 deletions examples/parallel/parallel.cpp
@@ -10,6 +10,7 @@
#include <cstdio>
#include <string>
#include <vector>
#include <ctime>

// trim whitespace from the beginning and end of a string
static std::string trim(const std::string & str) {
@@ -70,6 +71,26 @@ struct client {
std::vector<llama_token> tokens_prev;
};

static void print_date_time() {
std::time_t current_time = std::time(nullptr);
std::tm* local_time = std::localtime(&current_time);
char buffer[80];
strftime(buffer, sizeof(buffer), "%Y-%m-%d %H:%M:%S", local_time);

printf("\n\033[35mrun parameters as at %s\033[0m\n", buffer);
}

// split the input string into a vector of substrings on the given delimiter
static std::vector<std::string> split_string(const std::string& input, char delimiter) {
std::vector<std::string> tokens;
std::istringstream stream(input);
std::string token;
while (std::getline(stream, token, delimiter)) {
tokens.push_back(token);
}
return tokens;
}

int main(int argc, char ** argv) {
srand(1234);

@@ -104,6 +125,23 @@ int main(int argc, char ** argv) {
params.logits_all = true;
std::tie(model, ctx) = llama_init_from_gpt_params(params);

// load the prompts from an external file if there are any
if (params.prompt.empty()) {
printf("\n\033[32mNo new questions so proceed with built-in defaults.\033[0m\n");
} else {
// Output each line of the input params.prompts vector and copy to k_prompts
int index = 0;
printf("\n\033[32mNow printing the external prompt file %s\033[0m\n\n", params.prompt_file.c_str());

std::vector<std::string> prompts = split_string(params.prompt, '\n');
for (const auto& prompt : prompts) {
k_prompts.resize(index + 1);
k_prompts[index] = prompt;
index++;
printf("%3d prompt: %s\n", index, prompt.c_str());
}
}

fprintf(stderr, "\n\n");
fflush(stderr);

@@ -233,7 +271,7 @@ int main(int argc, char ** argv) {
client.n_decoded = 0;
client.i_batch = batch.n_tokens - 1;

LOG_TEE("\033[1mClient %3d, seq %4d, started decoding ...\033[0m\n", client.id, client.seq_id);
LOG_TEE("\033[31mClient %3d, seq %4d, started decoding ...\033[0m\n", client.id, client.seq_id);

g_seq_id += 1;

@@ -336,8 +374,8 @@ int main(int argc, char ** argv) {

const auto t_main_end = ggml_time_us();

LOG_TEE("\033[1mClient %3d, seq %4d, prompt %4d t, response %4d t, time %5.2f s, speed %5.2f t/s, cache miss %d \033[0m \n\nInput: %s\nResponse: %s\n\n",
client.id, client.seq_id, client.n_prompt, client.n_decoded,
LOG_TEE("\033[31mClient %3d, seq %3d/%3d, prompt %4d t, response %4d t, time %5.2f s, speed %5.2f t/s, cache miss %d \033[0m \nInput: %s\n\033[35mResponse: %s\033[0m\n\n",
client.id, client.seq_id, n_seq, client.n_prompt, client.n_decoded,
(t_main_end - client.t_start_prompt) / 1e6,
(double) (client.n_prompt + client.n_decoded) / (t_main_end - client.t_start_prompt) * 1e6,
n_cache_miss,
@@ -357,13 +395,17 @@

const auto t_main_end = ggml_time_us();

LOG_TEE("\n\n");
print_date_time();

LOG_TEE("\n%s: n_parallel = %d, n_sequences = %d, cont_batching = %d, system tokens = %d\n", __func__, n_clients, n_seq, cont_batching, n_tokens_system);
printf("external prompt file (if any): %s\n\n", params.prompt_file.c_str());

LOG_TEE("Total prompt tokens: %6d, speed: %5.2f t/s\n", n_total_prompt, (double) (n_total_prompt ) / (t_main_end - t_main_start) * 1e6);
LOG_TEE("Total gen tokens: %6d, speed: %5.2f t/s\n", n_total_gen, (double) (n_total_gen ) / (t_main_end - t_main_start) * 1e6);
LOG_TEE("Total speed (AVG): %6s speed: %5.2f t/s\n", "", (double) (n_total_prompt + n_total_gen) / (t_main_end - t_main_start) * 1e6);
LOG_TEE("Cache misses: %6d\n", n_cache_miss);

LOG_TEE("\n\n");
LOG_TEE("\n");

llama_print_timings(ctx);

10 changes: 5 additions & 5 deletions llama.cpp
@@ -7761,14 +7761,14 @@ void llama_print_timings(struct llama_context * ctx) {
const llama_timings timings = llama_get_timings(ctx);

LLAMA_LOG_INFO("\n");
LLAMA_LOG_INFO("%s: load time = %8.2f ms\n", __func__, timings.t_load_ms);
LLAMA_LOG_INFO("%s: sample time = %8.2f ms / %5d runs (%8.2f ms per token, %8.2f tokens per second)\n",
LLAMA_LOG_INFO("%s: load time = %10.2f ms\n", __func__, timings.t_load_ms);
LLAMA_LOG_INFO("%s: sample time = %10.2f ms / %5d runs (%8.2f ms per token, %8.2f tokens per second)\n",
__func__, timings.t_sample_ms, timings.n_sample, timings.t_sample_ms / timings.n_sample, 1e3 / timings.t_sample_ms * timings.n_sample);
LLAMA_LOG_INFO("%s: prompt eval time = %8.2f ms / %5d tokens (%8.2f ms per token, %8.2f tokens per second)\n",
LLAMA_LOG_INFO("%s: prompt eval time = %10.2f ms / %5d tokens (%8.2f ms per token, %8.2f tokens per second)\n",
__func__, timings.t_p_eval_ms, timings.n_p_eval, timings.t_p_eval_ms / timings.n_p_eval, 1e3 / timings.t_p_eval_ms * timings.n_p_eval);
LLAMA_LOG_INFO("%s: eval time = %8.2f ms / %5d runs (%8.2f ms per token, %8.2f tokens per second)\n",
LLAMA_LOG_INFO("%s: eval time = %10.2f ms / %5d runs (%8.2f ms per token, %8.2f tokens per second)\n",
__func__, timings.t_eval_ms, timings.n_eval, timings.t_eval_ms / timings.n_eval, 1e3 / timings.t_eval_ms * timings.n_eval);
LLAMA_LOG_INFO("%s: total time = %8.2f ms\n", __func__, (timings.t_end_ms - timings.t_start_ms));
LLAMA_LOG_INFO("%s: total time = %10.2f ms\n", __func__, (timings.t_end_ms - timings.t_start_ms));
}

void llama_reset_timings(struct llama_context * ctx) {
32 changes: 32 additions & 0 deletions prompts/parallel-questions.txt
@@ -0,0 +1,32 @@
What do you know about Hobbits?
What is quantum field theory?
Why did the chicken cross the road?
Who is the president of the United States?
How do I run CMake on MacOS?
Do you agree that C++ is a really finicky language compared with Python3?
Is it a good idea to invest in technology?
Do you like Wagner's Ring?
Do you think this file input option is really neat?
What should we all do about climate change?
Is time-travel possible within the laws of current physics?
Is it like anything to be a bat?
Once the chicken has crossed the road, does it try to go back?
Who is the greatest of all musical composers?
What is art?
Is there life elsewhere in the universe?
What is intelligence?
What is the difference between knowledge and intelligence?
Will religion ever die?
Do we understand ourselves?
What is the best way to cook eggs?
If you cannot see things, on what basis do you evaluate them?
Explain the role of the np junction in photovoltaic cells?
Is professional sport a good or bad influence on human behaviour?
Is capital punishment immoral?
Should we care about other people?
Who are you?
Which sense would you surrender if you could?
Was Henry Ford a hero or a villain?
Do we need leaders?
What is nucleosynthesis?
Who is the greatest scientist of all time so far?