Skip to content

Dynamic Temperature HF loader support#5174

Merged
oobabooga merged 47 commits into
oobabooga:devfrom
kalomaze:dynatemp-ooba-hf
Jan 7, 2024
Merged

Dynamic Temperature HF loader support#5174
oobabooga merged 47 commits into
oobabooga:devfrom
kalomaze:dynatemp-ooba-hf

Conversation

@kalomaze
Copy link
Copy Markdown
Contributor

@kalomaze kalomaze commented Jan 5, 2024

A rough WIP backporting my Dynamic Temperature sampling method [which has gained some mild traction again recently] to the HF loaders.

image
  • Currently it doesn't take minTemp and maxTemp as proper arguments, so it's only using 0.0 minTemp and 2.0 maxTemp for now ✅
  • It's also hardcoded to trigger only if the "dynatemp" UI variable is above 0.8. This should be a bool switch and the UI should also have minTemp + maxTemp configurable ✅
  • Right now it runs after the truncation samplers always, but it should respect the "temperature_last" argument and come either first or last depending on that option's value ✅
  • Max Possible Entropy measurement will have to be changed to exclude all tokens set to -inf probability when estimating 'vocab size' in cases where truncation is used, in order for the measurement to match koboldcpp's behavior ✅

EDIT: PR is ready now. It functions as a range instead of minTemp and maxTemp. Set to zero to disable

image

oobabooga and others added 18 commits December 14, 2023 22:39
- Currently doesn't take minTemp and maxTemp as proper arguments, so it's hardcoded to 0.0 minTemp and 2.0 maxTemp for now
- Atm it's hardcoded to only trigger if the "dynatemp" UI variable is above 0.8. This should be a bool and the UI should also have minTemp and maxTemp
- For obvious reasons, the regular temperature shouldn't apply when Dynamic Temp is on either
- Right now it runs after the truncation samplers always, but it should respect the "temperature_last" argument and come either first or last depending on that bool
@BadisG
Copy link
Copy Markdown
Contributor

BadisG commented Jan 5, 2024

Can you increase the max temp? For highly confident models like Mixtral you can go up to 5 without any issues.

@oobabooga
Copy link
Copy Markdown
Owner

Thank you for the PR. I think that a good solution would be to monkey patch the original TemperatureLogitsWarper in the transformers library, similar to how RepetitionPenaltyLogitsProcessor is monkeypatched to add support for repetition_penalty_range. Something like this:

--- a/modules/sampler_hijack.py
+++ b/modules/sampler_hijack.py
@@ -233,6 +233,7 @@ def get_logits_warper_patch(self, generation_config):
         temperature_idx = None
         for i in range(len(warpers)):
             if warpers[i].__class__.__name__ == 'TemperatureLogitsWarper':
+                warpers[i] = TemperatureLogitsWarperWithDynatemp(generation_config.temperature)
                 temperature_idx = i
                 break

Then TemperatureLogitsWarperWithDynatemp can be a modified version of
https://github.com/huggingface/transformers/blob/57e9c8321385dfd31bda33df144a4ac849206e06/src/transformers/generation/logits_process.py#L221

that defaults to regular temperature when the relevant parameters are not set.

@BadisG
Copy link
Copy Markdown
Contributor

BadisG commented Jan 5, 2024

I tried (with a fixed seed) different values of DynaTemp and I always got the same outputs, dunno if DynaTemp is working as intended there.

@kalomaze
Copy link
Copy Markdown
Contributor Author

kalomaze commented Jan 6, 2024

I tried (with a fixed seed) different values of DynaTemp and I always got the same outputs, dunno if DynaTemp is working as intended there.

As described in the main post, the UI value is irrelevant atm and was just there for testing.

There was also a proposal to turn Dynamic Temp into one value, a range.

So, let's say your regular temp is 1.0, and your DynaTemp range is 0.5, the minimum Temp would become Temp - 0.5 (0.5), and the maximum temp would be Temp + 0.5 (1.5). That way instead of "ignoring" your regular temp value, it simply augments it.

So if you wanted minTemp = 0 and maxTemp = 5 you would set the regular temp to 2.5 and the range to 2.5, and so on...

Thoughts? This makes sense to me, and makes it a simple value that is disabled when you turn it to zero.

@BadisG
Copy link
Copy Markdown
Contributor

BadisG commented Jan 6, 2024

Both solutions give the same results in the end, but I still prefer to choose the range myself, it's clearer for the user who can see exactly what the limits are, and it dissociates the normal temperature which shouldn't be involved in the dynamic temperature in my opinion.

@oobabooga
Copy link
Copy Markdown
Owner

oobabooga commented Jan 6, 2024

Agreed with BadisG, having explicit temperatures makes things more interpretable. Maybe have a dynamic_temperature boolean parameter (false by default) and a dynatemp_low parameter that sets the minimum value? Then the existing temperature parameter becomes the maximum value when dynamic_temperature is true.

The use case would be to tick dynamic_temperature in the UI, set dynatemp_low to something low like 0.1, and set the main temperature to a higher value like 1.5. It would be good to have a preset under presets/ with values that have been tested by other people to work well (I assume combined with temperature_last and min_p).

@kalomaze
Copy link
Copy Markdown
Contributor Author

kalomaze commented Jan 6, 2024

Agreed with BadisG, having explicit temperatures makes things more interpretable. Maybe have a dynamic_temperature boolean parameter (false by default) and a dynatemp_low parameter that sets the minimum value? Then the existing temperature parameter becomes the maximum value when dynamic_temperature is true.

The use case would be to tick dynamic_temperature in the UI, set dynatemp_low to something low like 0.1, and set the main temperature to a higher value like 1.5. It would be good to have a preset under presets/ with values that have been tested by other people to work well (I assume combined with temperature_last and min_p).

The reasoning for not doing this idea when I asked concedo was that people who want Dynamic Temp turned off would mistakenly believe that 0.0 dynatemp_low is turning it off.

I think it would be best to either dissociate the regular Temperature altogether as originally proposed in this PR, or set a single range value which can be set to 0.0 which would make dynamic temp not trigger, and therefore wouldn't require a bool.

The range value might be smarter because it's just one extra value that is set to 0 to disable the dynamicism.

@oobabooga
Copy link
Copy Markdown
Owner

The reasoning for not doing this idea when I asked concedo was that people who want Dynamic Temp turned off would mistakenly believe that 0.0 dynatemp_low is turning it off.

dynamic_temperature and dynatemp_low could be placed near each other in the UI, and a hint saying Only used when dynamic_temperature is checked could be added under dynatemp_low, so I don't see that as a problem.

@kalomaze
Copy link
Copy Markdown
Contributor Author

kalomaze commented Jan 6, 2024

The reasoning for not doing this idea when I asked concedo was that people who want Dynamic Temp turned off would mistakenly believe that 0.0 dynatemp_low is turning it off.

dynamic_temperature and dynatemp_low could be placed near each other in the UI, and a hint saying Only used when dynamic_temperature is checked could be added under dynatemp_low, so I don't see that as a problem.

I would rather just have minTemp and maxTemp there at that point and go all the way tbh. That way we avoid the monkeypatch

@kalomaze
Copy link
Copy Markdown
Contributor Author

kalomaze commented Jan 6, 2024

This is ready to merge. Only thing that might need changing is removal of the print statements that I had for debugging.

@kalomaze
Copy link
Copy Markdown
Contributor Author

kalomaze commented Jan 6, 2024

This might have a bug actually. The entropy calculation shouldn't change if the temperature value changes because the dynamic temp effect hasn't been applied by the time the print statement prints out the entropy.

But it's doing that on my end. I'm not sure what's wrong, I thought I ensured that the original temp function doesn't get ran, but I guess not. Trying to resolve.

@kalomaze
Copy link
Copy Markdown
Contributor Author

kalomaze commented Jan 6, 2024

It seems like I resolved the issue with the latest commit. Before, it was possible for both the regular Temperature and Dynamic Temperature to run, when Dynamic Temp is supposed to take the base value and modify it, not run the original function first.

Now, it forcibly removes the original Temperature function if DynaTemp is above 0, as intended.

SillyTavern's DynaTemp option only works for koboldcpp at the moment, so I think the only thing left is adjusting that for the API, everything seems to work when it comes to the ooba side of things.

If you spot any potential issues @oobabooga let me know, but it seems ready now.

@Ph0rk0z
Copy link
Copy Markdown
Contributor

Ph0rk0z commented Jan 6, 2024

FWIW it has still been working in exllamaV2 for me still: https://pastebin.com/h259DiUz

Eager to try now in other loaders. The debug stuff is interesting but definitely not something I like keeping on. Also vote for high and low values being exposed, even when they were in the TXT it is good to set upper bounds. I remember using it and getting really low temperatures and having to set a higher minimum.

@BadisG
Copy link
Copy Markdown
Contributor

BadisG commented Jan 7, 2024

I tried temperature 3 + dynamtemp 2 (to get a temp range between 1 and 5) but it doesn't work, all I get is this

Output generated in 0.33 seconds (0.00 tokens/s, 0 tokens, context 55, seed 2054783615)

@kalomaze
Copy link
Copy Markdown
Contributor Author

kalomaze commented Jan 7, 2024

I tried temperature 3 + dynamtemp 2 (to get a temp range between 1 and 5) but it doesn't work, all I get is this

Output generated in 0.33 seconds (0.00 tokens/s, 0 tokens, context 55, seed 2054783615)

I think there's a weird bug when the value is exactly a whole number and not a decimal point. Try 1.99 Dynatemp and 2.99 Temp.
I really don't know how or why this happens. @oobabooga any clue?

@oobabooga
Copy link
Copy Markdown
Owner

I have made the following changes:

  • Join temperature and dynamic temperature into a single sampler
  • Add an extension for using minimum/maximum temperature explictly:

print

I think there's a weird bug when the value is exactly a whole number and not a decimal point. Try 1.99 Dynatemp and 2.99 Temp.
I really don't know how or why this happens. @oobabooga any clue?

I have experienced this 0 tokens generated artifact when the temperature is too high. It may be a bug in the transformers library; if using an integer value is the issue, we could add 1e-6 to the temperature in these cases.

@oobabooga
Copy link
Copy Markdown
Owner

The 0.00 tokens/second bug was a silent exception that is now fixed:

    raise ValueError(except_msg)
ValueError: `temperature` (=1) has to be a strictly positive float, otherwise your next token scores will be invalid.

I also fixed a bug with getting the logits when a prefix-match happens in llamacpp_HF. It may have resolved #5186.

Everything seems to be working well now, so it should be good to merge.

@oobabooga oobabooga changed the base branch from main to dev January 7, 2024 13:34
@oobabooga oobabooga merged commit 48327cc into oobabooga:dev Jan 7, 2024
@BadisG
Copy link
Copy Markdown
Contributor

BadisG commented Jan 7, 2024

I tried to activate the "dynatemp_with_range" extension but it doesn't seem to work, I still have the dynatemp value only and nothing else.

Edit: Oh ok I see it on the markdown and the chat, why can't it be on the Parameters -> Generation tab instead? It's a bit confusing because the dynamtemp is still there and it's clashing with the other way of doing it.

The extension could simply have removed the dynamtemp value from the user interface and replaced it with MinT | MaxT instead.

I'm not a big fan of having one MinP - MaxP on the markdown (for the notebook) and one for the chat, I'd want to save those values into my own samplers preset like every other samplers.

@oobabooga
Copy link
Copy Markdown
Owner

Yeah, I agree that it's pretty annoying to work with a dynamic temperature range. It's better to be able to set the low and high value directly.

I have removed the extension and changed the main parameters to dynamic_temperature (on/off) and dynamic_temperature_low here: #5198. The high value is the regular temperature.

PoetOnTheRun pushed a commit to PoetOnTheRun/text-generation-webui that referenced this pull request Feb 22, 2024
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants