Find common ground between what Jan is doing and what's happening in ggml with Training #12

Open
janchk opened this issue Oct 21, 2024 · 2 comments
@janchk
Collaborator

janchk commented Oct 21, 2024

Let's start by looking at what's happening in https://github.com/ggerganov/ggml/tree/master/examples/mnist, including the flow of PRs and the kinds of discussions people are having there (e.g. ggerganov/ggml#982).

The work being done there is clearly a massive refactoring of an earlier attempt at llama.cpp finetuning, which was ripped out some time ago pending the conclusion of the ggml research above. The previous work is in ggerganov/llama.cpp#8669 and ggerganov/llama.cpp#2632.

It may make sense to look back at the last llama.cpp release that still had that functionality, just to appreciate the scope.

@janchk janchk assigned rvs and janchk Oct 21, 2024
@janchk
Collaborator Author

janchk commented Oct 21, 2024

Scope of work to be done:

  1. Implement a noisy multiplication operation analogous to GGML_OP_MUL_MAT
    https://github.com/ggerganov/ggml/blob/6dccc647264f5429df2624f36138f601e7ce23e5/src/ggml-backend.cpp#L1186 // Only for CPU without BLAS, etc.
    https://github.com/ggerganov/ggml/blob/6dccc647264f5429df2624f36138f601e7ce23e5/src/ggml.c#L5635
  2. Implement potential loss
    https://github.com/ggerganov/ggml/blob/6dccc647264f5429df2624f36138f601e7ce23e5/examples/mnist/mnist-common.cpp#L706
  3. Implement adjustable LR (hardcode?)
  4. Implement mixed noise scaler (hardcode?)
  5. Implement graph analyzer to gather weight noise scales (hardcode?)
  6. Implement distillation technique
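
To make item 1 a bit more concrete, here is a minimal standalone sketch of what a "noisy multiplication" could look like. It is not taken from ggml or from the noisy_mul branch. It assumes that "noisy" means additive zero-mean Gaussian noise on each output element of a plain row-major matrix product, and the names `noisy_mul_mat` and `noise_scale` are hypothetical. The real operation may define the noise differently (for example, scaled per weight tensor, as items 4 and 5 suggest).

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper, NOT a ggml API.
// Computes C = A * B + noise, with A (m x k) and B (k x n) stored row-major.
// Assumption: the noise is additive Gaussian with standard deviation
// noise_scale, drawn independently for every output element.
std::vector<float> noisy_mul_mat(const std::vector<float>& a,
                                 const std::vector<float>& b,
                                 std::size_t m, std::size_t k, std::size_t n,
                                 float noise_scale, std::mt19937& rng) {
    std::vector<float> c(m * n, 0.0f);
    // Sample from N(0, 1) and scale manually, so noise_scale == 0 is valid.
    std::normal_distribution<float> gauss(0.0f, 1.0f);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (std::size_t p = 0; p < k; ++p) {
                acc += a[i * k + p] * b[p * n + j];
            }
            // Skip sampling entirely when the noise is switched off.
            const float noise =
                noise_scale > 0.0f ? noise_scale * gauss(rng) : 0.0f;
            c[i * n + j] = acc + noise;
        }
    }
    return c;
}
```

With `noise_scale` set to 0 the function reduces to an ordinary matrix product, which gives a simple way to check a noisy implementation against the existing one before turning the noise on.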

@janchk janchk transferred this issue from aifoundry-org/ainekko Oct 21, 2024
@janchk
Collaborator Author

janchk commented Nov 4, 2024

https://github.com/aifoundry-org/ggml/tree/dev/noisy_mul
Implemented noisy multiplication.
Further development requires either an in-depth discussion with a llama.cpp contributor (or their help) or a large amount of time.

Projects
Status: For review

2 participants