o3 #1098
The problem is that
Hey @guiramos, for now we've introduced o models for the
@assafelovic Ok, that makes sense. Thank you.
@assafelovic 4o is more expensive than o3-mini (double the price), so even if it is overkill, it might still be a better deal?
That is a very good point. However, the way o3 handles/interprets the prompts is different, so it will have an impact on the code.
Yes, there is some impact. To make it work locally I had to remove the temperature settings and add the reasoning effort (I am using o3-mini for all LLMs: fast and smart at low effort, strategic at high). My hacky local patch:
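A rough sketch of that kind of change, assuming the get_llm call in gpt_researcher/utils/llm.py (shown in the diff further down); this is a hypothetical illustration, not the exact patch:

# Hypothetical sketch, not the exact patch: build the provider without
# temperature and pass reasoning_effort instead, mirroring the get_llm call
# in gpt_researcher/utils/llm.py.
provider = get_llm(
    llm_provider,
    model=model,              # e.g. "o3-mini"
    max_tokens=max_tokens,
    reasoning_effort="low",   # "low" for the fast/smart LLMs, "high" for strategic
    **(llm_kwargs or {}),
)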
and start with:
Edit: So some work would be necessary to add support for o3-mini while not breaking support for everything else.
Not all of us have access to o3, so this works just fine for my use case at the moment.
Looking back now, I could probably check whether the model name contains o3 and, if it does, change llm.py to omit the temperature and add the reasoning_effort field.
This is much cleaner and should support the existing models 🤔

% git diff --staged master -- gpt_researcher | cat
diff --git a/gpt_researcher/actions/query_processing.py b/gpt_researcher/actions/query_processing.py
index 3b08da54..062e7e7f 100644
--- a/gpt_researcher/actions/query_processing.py
+++ b/gpt_researcher/actions/query_processing.py
@@ -60,6 +60,7 @@ async def generate_sub_queries(
llm_provider=cfg.strategic_llm_provider,
max_tokens=None,
llm_kwargs=cfg.llm_kwargs,
+ reasoning_effort="high",
cost_callback=cost_callback,
)
except Exception as e:
diff --git a/gpt_researcher/utils/llm.py b/gpt_researcher/utils/llm.py
index 611e065a..bab610e2 100644
--- a/gpt_researcher/utils/llm.py
+++ b/gpt_researcher/utils/llm.py
@@ -28,7 +28,8 @@ async def create_chat_completion(
stream: Optional[bool] = False,
websocket: Any | None = None,
llm_kwargs: Dict[str, Any] | None = None,
- cost_callback: callable = None
+ cost_callback: callable = None,
+ reasoning_effort: Optional[str] = "low"
) -> str:
"""Create a chat completion using the OpenAI API
Args:
@@ -51,9 +52,18 @@ async def create_chat_completion(
f"Max tokens cannot be more than 16,000, but got {max_tokens}")
# Get the provider from supported providers
- provider = get_llm(llm_provider, model=model, temperature=temperature,
- max_tokens=max_tokens, **(llm_kwargs or {}))
-
+ kwargs = {
+ 'model': model,
+ 'max_tokens': max_tokens,
+ **(llm_kwargs or {})
+ }
+
+ if 'o3' in model:
+ kwargs['reasoning_effort'] = reasoning_effort
+ else:
+ kwargs['temperature'] = temperature
+
+ provider = get_llm(llm_provider, **kwargs)
response = ""
# create response
for _ in range(10): # maximum of 10 attempts
@@ -103,6 +113,7 @@ async def construct_subtopics(task: str, data: str, config, subtopics: list = []
model=config.smart_llm_model,
temperature=temperature,
max_tokens=config.smart_token_limit,
+ reasoning_effort="high",
**config.llm_kwargs,
)
model = provider.llm
You might want to make it possible for people to select the reasoning effort for each LLM, but this at least allows people to use o3 for now.
Edit: Interestingly, o3-mini is also much faster than 4o; for a simple research run it completed the task in a third of the time.
Edit 2: Never mind, this just failed on the detailed report inside the runnable sequence (construct_subtopics); the fix is easy, just replicate this logic in the subtopic generation.
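One possible shape for such a per-LLM setting, as a hedged sketch (these attribute and function names are hypothetical, not part of the project's config schema):

from dataclasses import dataclass

@dataclass
class ReasoningEffortConfig:
    # Hypothetical per-role reasoning-effort settings, one per LLM tier.
    fast_llm_reasoning_effort: str = "low"
    smart_llm_reasoning_effort: str = "low"
    strategic_llm_reasoning_effort: str = "high"

def effort_for(cfg: ReasoningEffortConfig, role: str) -> str:
    # role is "fast", "smart" or "strategic"; a call site such as
    # generate_sub_queries could then pass
    # reasoning_effort=effort_for(cfg, "strategic") instead of a hard-coded "high".
    return getattr(cfg, f"{role}_llm_reasoning_effort")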
Hello, how do I properly use this with o3?
I am still using a custom JSON file with all the parameters:
And then doing this:
researcher = GPTResearcher(query=query, report_type=REPORT_TYPE, config_path=f"{APP_PATH}/agent/resources/researcher.json")
researcher.cfg.total_words = total_words
researcher.cfg.smart_token_limit = total_words * 4
await researcher.conduct_research()
report = await researcher.write_report()
return report
Is there a better way to customize this? I've asked this question in the past about o1 usage but lost track of its evolution.
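For comparison, a hedged sketch of expressing the model choices through environment variables before constructing the researcher; the variable names are assumptions based on GPT Researcher's documented config keys and should be checked against the installed version (and o3-mini only works once the support discussed above is in place):

import os

# Assumed "provider:model" config keys; verify against your GPT Researcher version.
os.environ["FAST_LLM"] = "openai:o3-mini"
os.environ["SMART_LLM"] = "openai:o3-mini"
os.environ["STRATEGIC_LLM"] = "openai:o3-mini"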
#996
@assafelovic