Closed
Labels
bug, good first issue, help wanted, high priority
Description
The original paper and the reference implementation [1] use RMS norm. However, llama.cpp uses ggml_norm(), which looks like layer norm: it subtracts the mean before scaling, whereas RMS norm only divides by the root mean square.
The difference between the two may not be obvious in practice, because the mean of the activations is probably close to 0. However, we should follow the original design.
[1] https://github.com/facebookresearch/llama/blob/main/llama/model.py
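To make the difference concrete, here is a minimal pure-Python sketch of the two normalizations. It is not the ggml or reference code: the function names, the `eps` value, and the sample vectors are assumptions chosen for illustration, and the learned scale weight is left out.

```python
import math

def layer_norm(x, eps=1e-6):
    """Subtract the mean, then divide by the standard deviation (no learned scale/bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def rms_norm(x, eps=1e-6):
    """Divide by the root mean square; the mean is NOT subtracted."""
    mean_sq = sum(v * v for v in x) / len(x)
    return [v / math.sqrt(mean_sq + eps) for v in x]

def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two vectors."""
    return max(abs(p - q) for p, q in zip(a, b))

# Hypothetical activations: one vector with zero mean, one shifted by +3.
zero_mean = [1.0, -2.0, 0.5, 0.5]
shifted = [v + 3.0 for v in zero_mean]

# With zero mean the two norms coincide; with a non-zero mean they diverge.
diff_zero_mean = max_abs_diff(layer_norm(zero_mean), rms_norm(zero_mean))
diff_shifted = max_abs_diff(layer_norm(shifted), rms_norm(shifted))
print(diff_zero_mean, diff_shifted)
```

When the input mean is exactly 0 the variance equals the mean square, so both functions return the same values. Once the mean drifts away from 0, the outputs differ, which is why the choice of norm matters even if the effect is small in typical cases.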
schneiderfelipe