Implement backward passes for llama with small training llama from scratch example #1360

Merged 110 commits on May 13, 2023
Changes from 108 commits
73ac18d
implement 8 of 14 missing backward pass operations used by llama
xaedes May 1, 2023
b164343
implement 5 of 6 missing backward pass operations used by llama
xaedes May 1, 2023
b908007
norm & rms_norm can not be threaded:
xaedes Apr 24, 2023
36d8a05
remove already resolved TODO
xaedes Apr 24, 2023
488decf
implement backward pass of ggml_rope and ggml_rope_back
xaedes Apr 24, 2023
4e1f81d
implement backward pass for ggml_get_rows and for new operation ggml_…
xaedes Apr 24, 2023
0da2675
add test-grad0.c
xaedes Apr 25, 2023
20e3c1d
use GGML_PRINT_DEBUG for debug messages which will otherwise flood th…
xaedes Apr 24, 2023
9345f4c
test both gradients of mul_mat
xaedes Apr 24, 2023
9d6fc28
disable graph dot export as it floods console
xaedes Apr 25, 2023
6fb08b4
bug fixes for silu_back
xaedes Apr 25, 2023
671e592
successfully test silu backward
xaedes Apr 25, 2023
a367eb9
bug fix for scale backward pass
xaedes Apr 25, 2023
0197bcb
successfully test scale backward
xaedes Apr 25, 2023
bfe5072
improve performance of sum backward pass
xaedes Apr 25, 2023
b583136
improve performance of sqr backward pass
xaedes Apr 25, 2023
7571147
successfully test rope backward
xaedes Apr 25, 2023
0ea8201
bug fix for cpy backward pass
xaedes Apr 26, 2023
b2bd822
successfully test cpy backward
xaedes Apr 26, 2023
c483a7d
bug fix for reshape backward pass
xaedes Apr 26, 2023
ecf949b
successfully test reshape backward
xaedes Apr 26, 2023
54ab300
add test-opt.c
xaedes Apr 26, 2023
1a80e9a
correctly implement softmax backward pass using new operation ggml_diag
xaedes Apr 26, 2023
fea42be
successfully test soft_max backward
xaedes Apr 26, 2023
9310650
align shape annotations
xaedes Apr 26, 2023
38675e5
add shape annotations for llama
xaedes Apr 27, 2023
c1a8893
de-duplicate ggml_forward_dup code taking care of contiguous tensors …
xaedes Apr 27, 2023
83fa6b3
fix ggml_compute_forward_dup_same_cont for when nelements < nthreads
xaedes May 1, 2023
cecd6c7
bug fix for add_at forward
xaedes Apr 27, 2023
124fdca
successfully test view backward
xaedes Apr 28, 2023
410a47a
minor code format improvement
xaedes Apr 27, 2023
b9416d7
fix ggml_forward_add functions to work correctly with transposed tensors
xaedes Apr 28, 2023
339b2ad
fix ggml_forward_add1 functions to work correctly with transposed ten…
xaedes Apr 28, 2023
86b44a0
test-grad0.c : add print_elements to help with debugging
xaedes Apr 28, 2023
a7a8370
successfully test permute backward
xaedes Apr 28, 2023
b0555fc
some minor test-grad0 fixes
xaedes Apr 28, 2023
02d3fd0
fix sub, mul and div functions to work correctly with transposed tensors
xaedes Apr 28, 2023
3d21f26
implement ggml_cont backward pass
xaedes Apr 28, 2023
c601df9
successfully test transpose backward and permute for all permutations
xaedes Apr 28, 2023
1997152
test-grad0.c add TODO for view_2d and view_3d
xaedes Apr 28, 2023
d42531f
fix comments
xaedes Apr 28, 2023
19f5159
successfully test diag_mask_inf and diag_mask_zero backward
xaedes Apr 28, 2023
b9920e5
test-grad0 : fix test for div
xaedes Apr 28, 2023
3dbd649
fix diag_mask to work with non-inplace input
xaedes Apr 28, 2023
7281f60
move dup call into the actual add_at functions
xaedes Apr 28, 2023
96e773b
fix get rows backward pass
xaedes Apr 28, 2023
f0302fa
successfully test get_rows backward
xaedes Apr 28, 2023
8443638
fix view backward pass
xaedes Apr 30, 2023
b18b72d
successfully test backward pass of view_1d, view_2d and view_3d
xaedes Apr 30, 2023
84a4b39
fix backward pass for rms_norm
xaedes Apr 30, 2023
2ecc690
successfully test backward pass of rms_norm
xaedes Apr 30, 2023
2277053
add todos for llama backward pass
xaedes Apr 30, 2023
c4539ed
add operation ggml_sum_rows
xaedes Apr 30, 2023
ba62c79
add missing GGML_OP_SUM_ROWS
xaedes May 1, 2023
8b5b2f0
fix backward pass for repeat
xaedes Apr 30, 2023
72bcfb5
successfully test backward pass of repeat
xaedes Apr 30, 2023
1c4dc1e
update quantization types in switch-case of add_at and add1
xaedes May 1, 2023
8fde656
add baby-llama example training a very small llama model from scratch…
xaedes May 1, 2023
29a0f8b
fix softmax in baby-llama example
xaedes May 1, 2023
5f23052
switching from training with adam to lbfgs produces much better resul…
xaedes May 1, 2023
bc1c13b
train with two examples, creating new tensors each time..
xaedes May 1, 2023
83ee1cd
fix bug when using ggml_opt to optimize params in one context and use…
xaedes May 6, 2023
f1d51d1
train on multiple examples, generate & print tokens with trained mode…
xaedes May 6, 2023
b4c273f
add ggml_reshape_1d, ggml_reshape_4d and ggml_view_4d
xaedes May 6, 2023
8cf04fe
fix soft_max backward pass for input->ne[1] != 1
xaedes May 6, 2023
65d9f73
add ggml_log operation necessary for cross entropy loss
xaedes May 6, 2023
5724628
add test for ggml_log gradients
xaedes May 6, 2023
7a15a83
implement backward pass for ggml_sum_rows, necessary for cross entrop…
xaedes May 6, 2023
e6186d9
implement ggml_repeat support for rank > 2 tensors
xaedes May 6, 2023
80223d9
add test for ggml_sum_rows gradients
xaedes May 6, 2023
73fd66e
fix training get_example_targets
xaedes May 6, 2023
7a5dec2
add square_error_loss and cross_entropy_loss functions
xaedes May 6, 2023
226521a
optimize loss over multiple samples
xaedes May 6, 2023
48bcc4d
fix backward pass for add_at and change arguments to have same order …
xaedes May 6, 2023
47561de
add ggml_set(ctx, a, b) to set b in view of a and return modified a
xaedes May 6, 2023
956511b
fix kv_self gradients for training
xaedes May 6, 2023
561fbe0
replace inplace operations for training with copying operations to al…
xaedes May 6, 2023
e91b83b
add GGML_ASSERT to catch ggml_rope and back value errors
xaedes May 6, 2023
93201ab
add trainable lora-only model with all big matrices C split into A,B …
xaedes May 7, 2023
49d6daa
vastly improve training results
xaedes May 7, 2023
e0de09d
shorten code using a variable
xaedes May 7, 2023
4764842
change name of GGML_OP_ADD_AT to GGML_OP_ACC
xaedes May 7, 2023
ee565f3
Merge branch 'master' into train-example
xaedes May 7, 2023
e643fa1
smaller default values for baby llama model parameters
xaedes May 7, 2023
d20ba6f
update static assert of GGML_OP_COUNT
xaedes May 7, 2023
5d9fed7
remove shape annotations in llama_eval_internal
xaedes May 7, 2023
47ad186
revert disabling of threading for rms_norm and norm
xaedes May 7, 2023
9dd8e40
rename print functions in baby-llama example
xaedes May 7, 2023
660836f
fix call to ggml_set_name
xaedes May 7, 2023
7c8768f
add missing include for strcmp, etc
xaedes May 7, 2023
2936dd6
remove trailing whitespace
xaedes May 7, 2023
4997bc5
reduce number of test-grad0 iterations
xaedes May 7, 2023
f530106
remove busy loop that was used as sleep for slower sinus wave generation
xaedes May 7, 2023
1ecbece
disable slow tests grad0 and opt to avoid exceeding timeouts
xaedes May 8, 2023
dea9c93
c++ in baby-llama example
xaedes May 8, 2023
0d72207
c++ in baby-llama example
xaedes May 8, 2023
78af3e9
ggml : fix compiler warnings + cosmetic changes
ggerganov May 8, 2023
6cc42de
ggml : fix nullptr derefs in GGML_OP_CONT and GGML_OP_RESHAPE back
ggerganov May 8, 2023
cafbb78
swap arguments to vDSP_vdiv call
xaedes May 8, 2023
9c3fe4e
swap arguments to vDSP_vdiv call
xaedes May 8, 2023
6ca682b
ggml : swap vDSP_vsub args as per documentation
ggerganov May 8, 2023
3e3ed95
add parallel batched forward function for baby-llama training
xaedes May 11, 2023
581e5eb
cleanup code for batched training
xaedes May 11, 2023
b9ef08c
remove trailing whitespace
xaedes May 11, 2023
f977243
minor : fix compiler warnings + indentation style
ggerganov May 13, 2023
33034cf
ggml : fix null ptr deref in backward pass
ggerganov May 13, 2023
092913e
Merge remote-tracking branch 'origin/master' into HEAD
ggerganov May 13, 2023
95a487a
ggml : remove Q4_2 remnants
ggerganov May 13, 2023
ef3d42a
ggml : fix clang-tidy warnings
ggerganov May 13, 2023
dae6ba2
baby-llama : couple of clang-tidy warnings
ggerganov May 13, 2023
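Several of the commits above concern the softmax backward pass and its gradient test. The sketch below illustrates one common way to check such a backward pass: compare the analytic gradient against a central finite difference. It is a minimal standalone illustration, not code from this PR. It does not use the ggml API, and every name in it (`softmax`, `softmax_backward`, `loss`, `numeric_grad`, `max_grad_error`) is hypothetical. Whether `test-grad0.c` works exactly this way is not shown on this page.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable softmax over a single row (hypothetical helper, not ggml).
std::vector<double> softmax(const std::vector<double> & x) {
    assert(!x.empty());
    double max_x = x[0];
    for (double v : x) max_x = std::fmax(max_x, v);
    std::vector<double> y(x.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = std::exp(x[i] - max_x);
        sum += y[i];
    }
    for (double & v : y) v /= sum;
    return y;
}

// Analytic backward pass: dx_i = y_i * (dy_i - sum_j dy_j * y_j).
std::vector<double> softmax_backward(const std::vector<double> & y,
                                     const std::vector<double> & dy) {
    assert(y.size() == dy.size());
    double dot = 0.0;
    for (std::size_t i = 0; i < y.size(); ++i) dot += dy[i] * y[i];
    std::vector<double> dx(y.size());
    for (std::size_t i = 0; i < y.size(); ++i) dx[i] = y[i] * (dy[i] - dot);
    return dx;
}

// Scalar test loss f(x) = sum_i w_i * softmax(x)_i, so df/dy = w.
double loss(const std::vector<double> & x, const std::vector<double> & w) {
    const std::vector<double> y = softmax(x);
    double f = 0.0;
    for (std::size_t i = 0; i < y.size(); ++i) f += w[i] * y[i];
    return f;
}

// Central finite-difference estimate of df/dx.
std::vector<double> numeric_grad(std::vector<double> x,
                                 const std::vector<double> & w,
                                 double eps) {
    std::vector<double> g(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double x0 = x[i];
        x[i] = x0 + eps;
        const double f_plus = loss(x, w);
        x[i] = x0 - eps;
        const double f_minus = loss(x, w);
        x[i] = x0;  // restore before perturbing the next element
        g[i] = (f_plus - f_minus) / (2.0 * eps);
    }
    return g;
}

// Largest absolute difference between analytic and numeric gradients.
double max_grad_error(const std::vector<double> & x,
                      const std::vector<double> & w,
                      double eps = 1e-5) {
    const std::vector<double> analytic = softmax_backward(softmax(x), w);
    const std::vector<double> numeric  = numeric_grad(x, w, eps);
    double err = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        err = std::fmax(err, std::fabs(analytic[i] - numeric[i]));
    }
    return err;
}
```

Calling `max_grad_error(x, w)` with arbitrary inputs should return a value close to zero when the analytic formula is correct. A large value would point to the kind of bug that the "fix ... backward pass" commits address.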
1 change: 1 addition & 0 deletions examples/CMakeLists.txt
@@ -36,4 +36,5 @@ else()
     add_subdirectory(embedding)
     add_subdirectory(save-load-state)
     add_subdirectory(benchmark)
+    add_subdirectory(baby-llama)
 endif()
4 changes: 4 additions & 0 deletions examples/baby-llama/CMakeLists.txt
@@ -0,0 +1,4 @@
+set(TARGET baby-llama)
+add_executable(${TARGET} baby-llama.cpp)
+target_link_libraries(${TARGET} PRIVATE common llama ${CMAKE_THREAD_LIBS_INIT})
+target_compile_features(${TARGET} PRIVATE cxx_std_11)