Dynamic Temperature HF loader support #5174
Merge dev branch
- Currently doesn't take minTemp and maxTemp as proper arguments, so it's hardcoded to 0.0 minTemp and 2.0 maxTemp for now
- Atm it's hardcoded to only trigger if the "dynatemp" UI variable is above 0.8. This should be a bool and the UI should also have minTemp and maxTemp
- For obvious reasons, the regular temperature shouldn't apply when Dynamic Temp is on either
- Right now it runs after the truncation samplers always, but it should respect the "temperature_last" argument and come either first or last depending on that bool
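The last two points can be sketched as a small ordering rule. This is only an illustration: the `order_samplers` helper and the sampler names are hypothetical placeholders, not code from this PR or from transformers. Only the `temperature_last` flag and the idea that Dynamic Temp replaces the regular temperature come from the list above.

```python
def order_samplers(truncation_samplers, dynatemp_enabled, temperature_last):
    """Return sampler names in the order they would be applied.

    Hypothetical helper: the names are placeholders, not real warper classes.
    """
    # Dynamic temperature replaces the regular temperature; the two never both run.
    temp_step = "dynamic_temperature" if dynatemp_enabled else "temperature"
    if temperature_last:
        # Temperature step comes after the truncation samplers.
        return list(truncation_samplers) + [temp_step]
    # Otherwise the temperature step comes first.
    return [temp_step] + list(truncation_samplers)


print(order_samplers(["top_k", "top_p"], dynatemp_enabled=True, temperature_last=True))
print(order_samplers(["top_k", "top_p"], dynatemp_enabled=False, temperature_last=False))
```

Keeping the choice of temperature step separate from its position means the "which temperature" and "first or last" decisions stay independent of each other.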
|
Can you increase the max temp? For highly confident models like Mixtral you can go up to 5 without any issues. |
|
Thank you for the PR. I think that a good solution would be to monkey patch the original
--- a/modules/sampler_hijack.py
+++ b/modules/sampler_hijack.py
@@ -233,6 +233,7 @@ def get_logits_warper_patch(self, generation_config):
     temperature_idx = None
     for i in range(len(warpers)):
         if warpers[i].__class__.__name__ == 'TemperatureLogitsWarper':
+            warpers[i] = TemperatureLogitsWarperWithDynatemp(generation_config.temperature)
             temperature_idx = i
             break
Then that defaults to regular temperature when the relevant parameters are not set. |
|
I tried (with a fixed seed) different values of DynaTemp and I always got the same outputs, dunno if DynaTemp is working as intended there. |
As described in the main post, the UI value is irrelevant atm and was just there for testing. There was also a proposal to turn Dynamic Temp into one value, a range. So, let's say your regular temp is 1.0, and your DynaTemp range is 0.5, the minimum Temp would become Temp - 0.5 (0.5), and the maximum temp would be Temp + 0.5 (1.5). That way instead of "ignoring" your regular temp value, it simply augments it. So if you wanted minTemp = 0 and maxTemp = 5 you would set the regular temp to 2.5 and the range to 2.5, and so on... Thoughts? This makes sense to me, and makes it a simple value that is disabled when you turn it to zero. |
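The arithmetic of the single-range proposal can be written out as a short sketch. The `dynatemp_bounds` name is hypothetical, and clamping the lower bound at 0.0 is an assumption: the proposal does not say what should happen when the range exceeds the base temperature.

```python
def dynatemp_bounds(temperature, dynatemp_range):
    """Convert a base temperature and a symmetric range into (min_temp, max_temp)."""
    if dynatemp_range <= 0:
        # A range of zero disables dynamic temperature: both bounds equal the base value.
        return temperature, temperature
    # Assumption: never let the lower bound go negative.
    min_temp = max(0.0, temperature - dynatemp_range)
    max_temp = temperature + dynatemp_range
    return min_temp, max_temp


print(dynatemp_bounds(1.0, 0.5))  # (0.5, 1.5)
print(dynatemp_bounds(2.5, 2.5))  # (0.0, 5.0)
```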
|
Both solutions give the same results in the end, but I still prefer to choose the range myself, it's clearer for the user who can see exactly what the limits are, and it dissociates the normal temperature which shouldn't be involved in the dynamic temperature in my opinion. |
|
Agreed with BadisG, having explicit temperatures makes things more interpretable. Maybe have a The use case would be to tick |
The reasoning for not doing this idea when I asked concedo was that people who want Dynamic Temp turned off would mistakenly believe that 0.0 dynatemp_low is turning it off. I think it would be best to either dissociate the regular Temperature altogether as originally proposed in this PR, or set a single range value which can be set to 0.0 which would make dynamic temp not trigger, and therefore wouldn't require a bool. The range value might be smarter because it's just one extra value that is set to 0 to disable the dynamicism. |
|
I would rather just have minTemp and maxTemp there at that point and go all the way tbh. That way we avoid the monkeypatch |
more work to be done elsewhere
Last commit also ensured the default value is zero for dynatemp
|
This is ready to merge. Only thing that might need changing is removal of the print statements that I had for debugging. |
|
This might have a bug actually. The entropy calculation shouldn't change if the temperature value changes, because the dynamic temp effect hasn't been applied by the time the print statement prints out the entropy. But it's doing that on my end. I'm not sure what's wrong; I thought I ensured that the original temp function doesn't get run, but I guess not. Trying to resolve. |
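To make the expectation concrete, here is a minimal pure-Python sketch of an entropy-driven temperature. The linear mapping from normalized entropy to a temperature between `min_temp` and `max_temp`, and all function names, are assumptions for illustration; this thread does not spell out the exact formula the PR uses. The point is that the entropy is computed from the raw logits, so it depends only on the logits and not on any temperature setting, unless another temperature step has already rescaled them.

```python
import math


def softmax(logits):
    # Subtract the max for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    # Shannon entropy in nats; zero-probability entries contribute nothing.
    return -sum(p * math.log(p) for p in probs if p > 0)


def dynamic_temperature(logits, min_temp, max_temp):
    """Assumed mapping: linear interpolation on normalized entropy."""
    probs = softmax(logits)  # raw logits, no temperature applied yet
    max_entropy = math.log(len(logits))  # entropy of a uniform distribution
    normalized = entropy(probs) / max_entropy if max_entropy > 0 else 0.0
    return min_temp + (max_temp - min_temp) * normalized


confident = [10.0, 0.0, 0.0, 0.0]  # one dominant token: low entropy
uncertain = [1.0, 1.0, 1.0, 1.0]   # uniform: maximum entropy
print(dynamic_temperature(confident, 0.5, 1.5))
print(dynamic_temperature(uncertain, 0.5, 1.5))
```

Under this assumed mapping, a confident distribution lands near `min_temp` and a uniform one lands at `max_temp`. If the printed entropy moves when only the temperature setting changes, the logits were already scaled before this step.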
|
It seems like I resolved the issue with the latest commit. Before, it was possible for both the regular Temperature and Dynamic Temperature to run, when Dynamic Temp is supposed to take the base value and modify it, not run the original function first. Now, it forcibly removes the original Temperature function if DynaTemp is above 0, as intended. SillyTavern's DynaTemp option only works for koboldcpp at the moment, so I think the only thing left is adjusting that for the API, everything seems to work when it comes to the ooba side of things. If you spot any potential issues @oobabooga let me know, but it seems ready now. |
|
FWIW it has still been working in exllamaV2 for me: https://pastebin.com/h259DiUz Eager to try it now in other loaders. The debug stuff is interesting but definitely not something I'd like to keep on. I also vote for exposing the high and low values; even when they were in the TXT, it was good to set upper bounds. I remember using it and getting really low temperatures and having to set a higher minimum. |
|
I tried temperature 3 + dynatemp 2 (to get a temp range between 1 and 5) but it doesn't work; all I get is this
I think there's a weird bug when the value is exactly a whole number rather than a decimal. Try 1.99 Dynatemp and 2.99 Temp. |
|
I have made the following changes:
I have experienced this 0 tokens generated artifact when the temperature is too high. It may be a bug in the transformers library; if using an integer value is the issue, we could add 1e-6 to the temperature in these cases. |
|
The 0.00 tokens/second bug was a silent exception that is now fixed: I also fixed a bug with getting the logits when a prefix-match happens in llamacpp_HF. It may have resolved #5186. Everything seems to be working well now, so it should be good to merge. |
|
I tried to activate the "dynatemp_with_range" extension but it doesn't seem to work; I still have only the dynatemp value and nothing else. Edit: Oh ok, I see it on the markdown and the chat. Why can't it be on the Parameters -> Generation tab instead? It's a bit confusing because the dynatemp value is still there and it clashes with the other way of doing it. The extension could simply have removed the dynatemp value from the user interface and replaced it with MinT | MaxT instead. I'm not a big fan of having one MinP - MaxP on the markdown (for the notebook) and one for the chat; I'd want to save those values into my own sampler preset like every other sampler. |
|
Yeah, I agree that it's pretty annoying to work with a dynamic temperature range. It's better to be able to set the low and high value directly. I have removed the extension and changed the main parameters to |
--------- Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>

A rough WIP backporting my Dynamic Temperature sampling method [which has gained some mild traction again recently] to the HF loaders.
EDIT: PR is ready now. It functions as a range instead of minTemp and maxTemp. Set it to zero to disable.