
Cubic sampling w/ curve param #5551

Merged

oobabooga merged 40 commits into oobabooga:dev from kalomaze:curve-test on Mar 3, 2024

Conversation

@kalomaze
Contributor

@kalomaze kalomaze commented Feb 20, 2024

This builds upon the original Quadratic Sampling method with an additional parameter that I've labeled "smoothing_curve".


The idea is to enable smoothing_factor values even lower than ~0.25 to work well; we do this by applying a cubic transformation to compensate, which seems to make the falloff steeper.

Not ready to merge yet; it needs empirical testing from users. My hope is that you can fully avoid having to use truncation schemes and instead apply a fully "smooth" transformation to the distribution across different models.

The higher the smoothing_curve, the steeper the falloff (so it becomes harsher).

(video: 2024-02-19_15-42-06.mp4)

Brief tests on a 7b are showing that it does in fact help to make lower smoothing_factor values coherent in practice (so far at least).

A smoothing_curve of 1.0 is the "old" behavior and has no effect.
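For readers skimming the thread, the transformation can be sketched roughly as below. This is my reading of the PR, not the exact merged code: I'm assuming the curve blends a quadratic and a cubic penalty on each logit's distance from the max, with `k = (3 - smoothing_curve) / 2` as the quadratic weight (quoted later in the thread) and a complementary cubic weight chosen so that `smoothing_curve == 1.0` reduces to plain quadratic sampling. The function name is hypothetical.

```python
import torch

def cubic_smoothing(logits: torch.Tensor,
                    smoothing_factor: float,
                    smoothing_curve: float) -> torch.Tensor:
    """Sketch of cubic sampling: penalize each logit by a blend of the
    squared and cubed distance from the top logit (assumed form)."""
    max_logit = logits.max()
    diff = logits - max_logit              # <= 0 for every token
    k = (3 - smoothing_curve) / 2          # quadratic weight (0 at curve == 3)
    s = (smoothing_curve - 1) / 2          # cubic weight (0 at curve == 1)
    return (-(k * smoothing_factor * diff ** 2)
            + (s * smoothing_factor * diff ** 3)
            + max_logit)

logits = torch.tensor([2.0, 1.0, 0.0])
# curve == 1.0 -> the "old" quadratic behavior: -factor * diff^2 + max
print(cubic_smoothing(logits, smoothing_factor=0.25, smoothing_curve=1.0))
# tensor([2.0000, 1.7500, 1.0000])
```

Under this form, raising smoothing_curve grows the cubic term, which punishes far-from-top tokens harder — matching the "steeper falloff" described above.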

@BadisG
Contributor

BadisG commented Feb 21, 2024

When you go for 3 or higher on the smoothing_curve, it doesn't work anymore

@kalomaze
Contributor Author

kalomaze commented Mar 1, 2024

When you go for 3 or higher on the smoothing_curve, it doesn't work anymore

I can't reproduce this issue, with llama.cpp_HF at least.

@Myobu1

Myobu1 commented Mar 2, 2024

When you go for 3 or higher on the smoothing_curve, it doesn't work anymore

I can't reproduce this issue, through llama.cpp_HF at least.

I have the exact same issue on Ooba. With smoothing_curve set below 3, it works fine. The moment I go to 3 or above, I start getting RuntimeError: probability tensor contains either inf, nan or element < 0 and an inability to output replies after 5 attempts on Silly.

Edit: Apparently the issue occurs if you have any banned tokens.

@Ph0rk0z
Contributor

Ph0rk0z commented Mar 2, 2024

I get the NaN error as well, using exl2_HF. It surfaces in transformers (utils.py, line 2734):

next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)

2.99 is the highest I can go before this error, but so far I was able to get the factor down to 0.05 with that value.

@Ph0rk0z
Contributor

Ph0rk0z commented Mar 2, 2024

I've only got two braincells to knock together, but to me:

k = (3 - self.smoothing_curve) / 2

3 - 3 is 0, and that's 0/2, so the coefficient is 0 — and that zero is what ends up making NaN.

Btw: changing the 3 to 10 did fix it for me, up to a curve of 9.99 of course. I'm getting the best results with a factor of 0.20-0.2X and a curve of 1.04, since all I can do is eyeball the token distribution pictured in the PR video; a factor of 0.02 with a curve of 4.82-5.6 was another decent point. Otherwise this removes too many tokens and gets very deterministic.

@oobabooga
Owner

The NaN error was caused by performing operations on -inf (e.g. -inf^3). I solved it by keeping scores with a -inf value unchanged.
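A minimal sketch of what such a fix could look like, not the exact merged code: compute the transform as usual, then use torch.where to pass -inf scores through untouched so banned tokens never hit the 0 * inf = NaN path. The transform body here repeats my assumed reconstruction from earlier in the thread, and the function name is hypothetical.

```python
import torch

def smooth_keeping_inf(logits: torch.Tensor,
                       smoothing_factor: float,
                       smoothing_curve: float) -> torch.Tensor:
    max_logit = logits.max()
    diff = logits - max_logit
    k = (3 - smoothing_curve) / 2
    s = (smoothing_curve - 1) / 2
    transformed = (-(k * smoothing_factor * diff ** 2)
                   + (s * smoothing_factor * diff ** 3)
                   + max_logit)
    # Keep -inf scores unchanged instead of transforming them.
    return torch.where(logits == float('-inf'), logits, transformed)

scores = torch.tensor([2.0, 1.0, float('-inf')])  # last token banned
out = smooth_keeping_inf(scores, smoothing_factor=0.2, smoothing_curve=3.0)
print(out)  # the -inf survives and no NaN appears, even at curve == 3
```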

@oobabooga oobabooga changed the base branch from main to dev March 3, 2024 16:20
@oobabooga oobabooga merged commit cfb25c9 into oobabooga:dev Mar 3, 2024
@Ph0rk0z
Contributor

Ph0rk0z commented Mar 3, 2024

Will retest.

It works, but I'm back to square one on how to set it.

