
Autogen always uses ChatGPT 4, it ignores the config list. #460

Closed
einfachai opened this issue Oct 28, 2023 · 22 comments

@einfachai

Hey there,
Is this a bug, or my fault? I cannot use GPT-3.5 with Autogen; it always uses GPT-4, even though GPT-4 is not configured in the OAI_CONFIG_LIST file.
My OAI_CONFIG_LIST looks like this:
[ { "model": "gpt-3.5-turbo-16k", "api_key": "1234567" } ]

I load the config file like this:
config_list = config_list_from_json(env_or_file="OAI_CONFIG_LIST")

When I type "set" into my terminal, there is no OAI_CONFIG_LIST environment variable that might override the config file.
I tried this again with a clean install of Autogen in a new project and I can't get it to work with ChatGPT 3.5.

Am I doing something wrong?

Thank you

@sonichi
Contributor

sonichi commented Oct 28, 2023

@maazsabahuddin

maazsabahuddin commented Oct 28, 2023

I tried the same, but it seems like autogen only works with gpt-4.

@einfachai
Author

@sonichi
Yes, this is how I use it.
@maazsabahuddin
https://microsoft.github.io/autogen/docs/FAQ/#agents-keep-thanking-each-other-when-using-gpt-35-turbo
Here they are talking about the use of gpt-3.5-turbo, so it must work somehow.

@afourney
Copy link
Member

Gpt-3.5-turbo definitely works and is supported. I test it all the time.

Do you see any warnings printed to the console when running your script? If it fails to load the config list, it will print a warning and fall back to GPT-4 by default.
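
One quick way to confirm the list actually loaded before building any agents (a minimal sketch; the print and assert are just diagnostics, not part of the autogen API):

import autogen

config_list = autogen.config_list_from_json(env_or_file="OAI_CONFIG_LIST")
print(config_list)  # should show exactly the entries from your file
assert config_list, "config list is empty -- autogen would fall back to its default model"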

@maazsabahuddin

@afourney I am trying to execute the following code:

from autogen import AssistantAgent, UserProxyAgent

assistant = AssistantAgent(
    "assistant",
    llm_config={
        "api_key": "MY-API-KEY",
        "model": "gpt-3.5-turbo",
    },
)

user_proxy = UserProxyAgent("user_proxy", code_execution_config={"work_dir": "coding"})

user_proxy.initiate_chat(assistant, message="Plot a chart of NVDA and TESLA stock price change YTD.")

It returns the following error.

openai.error.RateLimitError: You exceeded your current quota, please check your plan and billing details.

I have not added my payment details yet.

@einfachai
Author

@afourney
No, there is no console output complaining about a missing configuration or anything related.

@einfachai
Author

@afourney
You were right: when I delete the OpenAI API key from my .env file, I get an unexpected error from the autogen part.

Error: No API key provided. You can set your API key in code using 'openai.api_key = ', or you can set the environment variable OPENAI_API_KEY=). If your API key is stored in a file, you can point the openai module at it with 'openai.api_key_path = '. You can generate API keys in the OpenAI web interface. See https://platform.openai.com/account/api-keys for details

I changed the loading of the config to this:

config_list = autogen.config_list_from_json(
    env_or_file="OAI_CONFIG_LIST",
    filter_dict={
        "model": {"gpt-3.5-turbo-16k"},
    },
)

But I get the same error.

@sonichi
Contributor

sonichi commented Oct 28, 2023

Your original config_list looks good. But I suspect it is not provided to every AssistantAgent's llm_config.
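
For reference, a minimal sketch of that wiring (the agent name here is illustrative):

import autogen

config_list = autogen.config_list_from_json(env_or_file="OAI_CONFIG_LIST")

# every agent that should call the LLM gets the loaded list via llm_config
assistant = autogen.AssistantAgent(
    "assistant",
    llm_config={"config_list": config_list},
)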

@einfachai
Author

@sonichi
I double checked it: I provide it to every AssistantAgent; only the UserProxyAgents don't get it set. Do I need an llm_config for UserProxyAgents where human_input_mode is set to "NEVER"?

@chaturveda

I am getting this kind of error too:

openai.error.InvalidRequestError: The model gpt-4 does not exist or you do not have access to it. Learn more: https://help.openai.com/en/articles/7102672-how-can-i-access-gpt-4.

Full error message:

user_proxy (to assistant):

Plot a chart of NVDA and TESLA stock price change YTD.


[autogen.oai.completion: 10-29 11:52:31] {788} WARNING - Completion was provided with a config_list, but the list was empty. Adopting default OpenAI behavior, which reads from the 'model' parameter instead.
WARNING:root:The specified config_list file 'OAI_CONFIG_LIST' does not exist.
WARNING:autogen.oai.completion:Completion was provided with a config_list, but the list was empty. Adopting default OpenAI behavior, which reads from the 'model' parameter instead.
Traceback (most recent call last):
File "C:\Users\Lenovo\OneDrive\Desktop\autogen\main.py", line 7, in
user_proxy.initiate_chat(assistant, message="Plot a chart of NVDA and TESLA stock price change YTD.")
File "C:\Users\Lenovo\OneDrive\Desktop\autogen\venv\Lib\site-packages\autogen\agentchat\conversable_agent.py", line 531, in initiate_chat
self.send(self.generate_init_message(**context), recipient, silent=silent)
File "C:\Users\Lenovo\OneDrive\Desktop\autogen\venv\Lib\site-packages\autogen\agentchat\conversable_agent.py", line 334, in send
recipient.receive(message, self, request_reply, silent)
File "C:\Users\Lenovo\OneDrive\Desktop\autogen\venv\Lib\site-packages\autogen\agentchat\conversable_agent.py", line 462, in receive
reply = self.generate_reply(messages=self.chat_messages[sender], sender=sender)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Lenovo\OneDrive\Desktop\autogen\venv\Lib\site-packages\autogen\agentchat\conversable_agent.py", line 781, in generate_reply
final, reply = reply_func(self, messages=messages, sender=sender, config=reply_func_tuple["config"])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Lenovo\OneDrive\Desktop\autogen\venv\Lib\site-packages\autogen\agentchat\conversable_agent.py", line 606, in generate_oai_reply
response = oai.ChatCompletion.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Lenovo\OneDrive\Desktop\autogen\venv\Lib\site-packages\autogen\oai\completion.py", line 834, in create
return cls._get_response(params, raise_on_ratelimit_or_timeout=raise_on_ratelimit_or_timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Lenovo\OneDrive\Desktop\autogen\venv\Lib\site-packages\autogen\oai\completion.py", line 224, in _get_response
response = openai_completion.create(request_timeout=request_timeout, **config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Lenovo\OneDrive\Desktop\autogen\venv\Lib\site-packages\openai\api_resources\chat_completion.py", line 25, in create
return super().create(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Lenovo\OneDrive\Desktop\autogen\venv\Lib\site-packages\openai\api_resources\abstract\engine_api_resource.py", line 155, in create
response, _, api_key = requestor.request(
^^^^^^^^^^^^^^^^^^
File "C:\Users\Lenovo\OneDrive\Desktop\autogen\venv\Lib\site-packages\openai\api_requestor.py", line 299, in request
resp, got_stream = self._interpret_response(result, stream)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Lenovo\OneDrive\Desktop\autogen\venv\Lib\site-packages\openai\api_requestor.py", line 710, in _interpret_response
self._interpret_response_line(
File "C:\Users\Lenovo\OneDrive\Desktop\autogen\venv\Lib\site-packages\openai\api_requestor.py", line 775, in _interpret_response_line
raise self.handle_error_response(
openai.error.InvalidRequestError: The model gpt-4 does not exist or you do not have access to it. Learn more: https://help.openai.com/en/articles/7102672-how-can-i-access-gpt-4.

Process finished with exit code 1

Kindly help me with this.

@sonichi
Contributor

sonichi commented Oct 29, 2023

@sonichi I double checked it: I provide it to every AssistantAgent; only the UserProxyAgents don't get it set. Do I need an llm_config for UserProxyAgents where human_input_mode is set to "NEVER"?

It's not necessary. Do you know which AssistantAgent caused this error? And do you use GroupChatManager? That agent needs an llm_config.
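
For group chats, a minimal sketch (assuming user_proxy, assistant, and config_list are already defined as above):

import autogen

groupchat = autogen.GroupChat(agents=[user_proxy, assistant], messages=[], max_round=12)
# the manager also calls the model (e.g. to pick the next speaker),
# so it needs an llm_config of its own
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config={"config_list": config_list})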

@afourney
Member

afourney commented Oct 30, 2023

This warning in the trace above suggests to me that the OAI_CONFIG_LIST isn't being found:

"WARNING:root:The specified config_list file 'OAI_CONFIG_LIST' does not exist."

@pcdeadeasy
Contributor

Closing this issue per the stack trace.

@the-xentropy

the-xentropy commented Dec 2, 2023

I can confirm this is still happening. However, I believe I know the cause: it's technically working as intended (WAI), but it's potentially very problematic behaviour by the library that will likely cost users money.

Here's how I constructed my llm_config (I copied and pasted this from a guide):

import autogen
import openai

config_list = autogen.config_list_from_models(model_list=[str(x.id) for x in openai.models.list() if x.id.startswith("gpt-")])
llm_config={
        "model":"gpt-3.5-turbo",
        "cache_seed": 12312312,  # seed for caching and reproducibility
        "config_list": config_list,  # a list of OpenAI API configurations
        "temperature": 0,  # temperature for sampling
    },

This was followed by llm_config being passed to every single agent. After creating an agent, checking agent_instance.llm_config yields:

{'model': 'gpt-4'}

Did you notice what I did wrong? It took me a bit to notice. There's a trailing comma on the llm_config that I copied from somewhere. It seems that if llm_config is invalid for whatever reason, autogen silently reverts to gpt-4 as a default instead of refusing to continue. I racked up 10 bucks worth of unnecessary charges from this mistake, but I think it could be a lot worse if someone doesn't pay attention.
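
The failure mode is plain Python semantics and easy to reproduce in isolation (a minimal sketch, independent of autogen):

llm_config = {
    "model": "gpt-3.5-turbo",
},  # the trailing comma wraps the dict in a one-element tuple

print(type(llm_config))  # <class 'tuple'>, not <class 'dict'>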

I suspect that people running into this issue are setting OPENAI_API_KEY through environment variables like me, and then accidentally passing something that is silently rejected by the constructor, which then allows it to continue with the fallback model.

Looking a bit deeper into why multiple people might be making this mistake: a lot of the notebooks used for guidance (e.g. https://github.com/microsoft/autogen/blob/main/notebook/oai_openai_utils.ipynb) contain examples such as:

assistant_two = autogen.AssistantAgent(
    name="4-assistant",
    llm_config={
        "timeout": 600,
        "cache_seed": 42,
        "config_list": costly_config_list,
        "temperature": 0,
    },
)

I believe I copied my original llm_config snippet from a notebook just like this. If a user does the same and doesn't notice or think about the trailing comma, autogen will always revert to gpt-4, and if the OPENAI_API_KEY environment variable is set it will successfully generate text without any indication that something is wrong. What clued me in that I was using gpt-4 instead of gpt-3.5 was actually the very slow inference time, thanks to extensive previous experience using the API, but a novice cannot make that determination and is likely to rack up a large bill without knowing it.

Trying some variants of corrupt configs, it also appears that if you set a non-existent model, autogen will silently fall back to gpt-4 without notifying you. After correcting the above issue and verifying that it used gpt-3.5, I set the config to a non-existent model ('gpt-2.5-turbo', confirmed through agent.llm_config), and unfortunately it also continued to work without any warning that the fallback was happening. Since all the configs I can inspect from code still show the 'wrong' model, there is no easy way to detect that the fallback occurred short of tracing traffic or checking usage in the OpenAI billing panel.
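
Until something like that exists, one defensive workaround is to check the model name against the API before building any agents (a minimal sketch, using the same openai.models.list() call as the config snippet above):

import openai

available = {m.id for m in openai.models.list()}
wanted = "gpt-3.5-turbo"
if wanted not in available:
    raise ValueError(f"model {wanted!r} is not available on this account")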

Proposal:

I do not believe advising users not to use the OPENAI_API_KEY environment variable is the path forward here, simply because autogen may have to interoperate with other libraries that use it as the primary mechanism for setting the key, and may be deployed in containers or environments where it is set by default because openai is their primary dependency.

Just to get the ball rolling on a discussion, perhaps one of these options would be the best way forward:

  1. Flatly refuse to continue if an llm_config is rejected (I'd prefer this as a developer -- LLMs are not cheap for processing a lot of text); see the sketch after this list. Additionally, refuse to continue if the chosen model does not exist.
  2. Silently unwrap list or tuple types, to at least allow safe copying and pasting from the official documentation without a potentially expensive gotcha. Though, this still means any typo in the model name will silently revert you to the more expensive model without any warning. It might also mean that a parsing bug in the codebase (e.g. a regression -- it happens) that accidentally prevents model selection from succeeding would go undetected and opt every user into the more expensive model.
  3. Warn that a non-dict, or a dict with incorrect values, was passed as llm_config, and ask users to fix it. I don't think this is necessarily a good mitigation: it only helps under interactive use, and only if the consequences (i.e. monetary loss) are highly emphasized, and it doesn't prevent regressions in unmonitored deployments from racking up massive bills. There's also a risk that, since autogen generates a lot of text by default, it may escape notice for a while even in interactive use.
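
As a rough sketch of what option 1 could look like (validate_llm_config is a hypothetical helper, not existing autogen code; autogen also accepts False/None to disable the LLM, which the check preserves):

def validate_llm_config(llm_config):
    # allow the documented "disabled" values
    if llm_config is None or llm_config is False:
        return llm_config
    # anything else must be a plain dict; a trailing comma in user code
    # turns a dict literal into a tuple, which should be a hard error
    if not isinstance(llm_config, dict):
        raise TypeError(f"llm_config must be a dict, got {type(llm_config).__name__}")
    return llm_config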

Since Microsoft has a financial interest in OpenAI, I also believe there's some risk of reputational harm by silently opting people into a more expensive model choice. It would be easy for a tech writer without scruples to twist the narrative into Microsoft lining their own pockets.

@sonichi
Contributor

sonichi commented Dec 3, 2023

I vote for proposal 1. Can someone make a PR?

@austinhumes-valtech

Is there any update on this? I'm running into an issue where it is defaulting to gpt-4 even though I am specifying gpt-3.5-turbo-16k, and I don't see any issues with trailing commas as mentioned above. After a few queries in a RAG application, I am receiving:

Error code: 400 - {'error': {'message': "This model's maximum context length is 4097 tokens, however you requested 5153 tokens (4897 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.", 'type': 'invalid_request_error', 'param': None, 'code': None}}

which is confusing, since 4096 seems to be the limit for other 3.5 models, and both gpt-4 and gpt-3.5-turbo-16k should have much larger context windows.

@austinhumes-valtech


Update: I updated to the latest Autogen version (0.2.9) and am now seeing llm config output such as:

document_agent.llm_config:  {'timeout': 60, 'config_list': [{'model': 'gpt-3.5-turbo-1106', 'api_key': 'sk-abcdefg....'}], 'temperature': 0, 'functions': [{'name': 'query_pdf_documents', 'description': 'Answer questions based on the content of the local PDF files', 'parameters': {'type': 'object', 'properties': {'question': {'type': 'string', 'description': 'The question to ask of the local PDF files.'}}, 'required': ['question']}}]}

However, I'm still getting the same 4097 error message, even though gpt-3.5-turbo-1106 has a far higher limit. Any ideas @sonichi @pcdeadeasy? Thanks!

@the-xentropy

@austinhumes-valtech, I'm not a maintainer or contributor, but can you check the OpenAI billing page to see which model it's being charged as? It should tell you whether it's GPT-4 or GPT-3.5. It would be worrying if it was hot-swapping to GPT-4 in even more circumstances.

@austinhumes-valtech


@the-xentropy for today it looks like it is hitting both, which makes sense as I've attempted to use both. After some further investigation, I'm wondering if the PDF file I'm using is just too big. I have a RAG setup with some PDF files that I have vectorized in a local Chroma DB. If I ask really specific questions like "what was the revenue in 2021?" it works, but if I ask it to "summarize the report", that's when I get the error about the token limit. I'm revisiting https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them to see if I need to go about this differently, since it states:

Depending on the [model](https://beta.openai.com/docs/engines/gpt-3) used, requests can use up to 4097 tokens shared between prompt and completion. If your prompt is 4000 tokens, your completion can be 97 tokens at most. 

The limit is currently a technical limitation, but there are often creative ways to solve problems within the limit, e.g. condensing your prompt, breaking the text into smaller pieces, etc.
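
One way to see how large the assembled prompt really is before it goes out (a minimal sketch using tiktoken, which is independent of autogen; the model name just selects the tokenizer):

import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
prompt = "summarize the report"  # plus whatever retrieved chunks the RAG step appends
print(len(enc.encode(prompt)))  # compare this against the model's context window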

@afourney
Member

afourney commented Feb 2, 2024

It should not be defaulting to gpt-4 anymore. We removed that default weeks ago, I think.

@sonichi
Contributor

sonichi commented Feb 2, 2024

Closing this issue. @austinhumes-valtech could you open a new issue and provide a full trace and steps to reproduce? The error message is unexpected given your description.

@sonichi sonichi closed this as completed Feb 2, 2024
@the-xentropy

I was a bit confused about why this was closed at first, since Austin's message was a bit of an aside, but I'm guessing it was closed in favor of the pending PR at #847, so anyone else interested in this can track it there.
