
o3 #1098

Open
guiramos opened this issue Feb 3, 2025 · 9 comments


guiramos commented Feb 3, 2025

Hello, how do I properly use this with o3?

I am still using a custom JSON config with all the parameters:

{
  "RETRIEVER": "tavily",
  "EMBEDDING": "openai:text-embedding-3-small",
  "SIMILARITY_THRESHOLD": 0.6,
  "FAST_LLM": "openai:gpt-4o-mini",
  "SMART_LLM": "openai:gpt-4o-2024-08-06",
  "STRATEGIC_LLM": "openai:gpt-4o-2024-08-06",
  "FAST_TOKEN_LIMIT": 2000,
  "SMART_TOKEN_LIMIT": 4000,
  "BROWSE_CHUNK_MAX_LENGTH": 8192,
  "SUMMARY_TOKEN_LIMIT": 1000,
  "TEMPERATURE": 0.5,
  "LLM_TEMPERATURE": 0.55,
  "USER_AGENT": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36 Edg/119.0.0.0",
  "MAX_SEARCH_RESULTS_PER_QUERY": 5,
  "MEMORY_BACKEND": "local",
  "TOTAL_WORDS": 1200,
  "REPORT_FORMAT": "APA",
  "MAX_ITERATIONS": 4,
  "SCRAPER": "bs",
  "MAX_SUBTOPICS": 3,
  "DOC_PATH": "./my-docs"
}

And then doing this:

researcher = GPTResearcher(query=query, report_type=REPORT_TYPE, config_path=f"{APP_PATH}/agent/resources/researcher.json")
researcher.cfg.total_words = total_words
researcher.cfg.smart_token_limit = total_words * 4
await researcher.conduct_research()
report = await researcher.write_report()
return report

Is there a better way to customize this? I've asked this question in the past about o1 usage but lost track of its evolution.

#996

@assafelovic


guiramos commented Feb 4, 2025

The problem is that o3-mini does not accept the temperature argument and it's hardcoded in this block:

    try:
        response = await create_chat_completion(
            model=cfg.smart_llm_model,
            messages=[
                {"role": "system", "content": f"{auto_agent_instructions()}"},
                {"role": "user", "content": f"task: {query}"},
            ],
            temperature=0.15,
            llm_provider=cfg.smart_llm_provider,
            llm_kwargs=cfg.llm_kwargs,
            cost_callback=cost_callback,
        )

        agent_dict = json.loads(response)
        return agent_dict["server"], agent_dict["agent_role_prompt"]

    except Exception as e:
        print("⚠️ Error in reading JSON, attempting to repair JSON")
        return await handle_json_error(response)
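
(For anyone who wants to confirm the restriction outside gpt-researcher: a minimal sketch, assuming a recent openai Python SDK that already exposes reasoning_effort and an OPENAI_API_KEY in the environment. o-series reasoning models reject a custom temperature but accept reasoning_effort instead.)

# Minimal sketch (assumes a recent openai Python SDK >= 1.x and OPENAI_API_KEY set)
# showing that o3-mini rejects a custom temperature while accepting reasoning_effort.
from openai import OpenAI, BadRequestError

client = OpenAI()
messages = [{"role": "user", "content": "Say hi"}]

try:
    # Expected to raise: o-series reasoning models don't take a non-default temperature.
    client.chat.completions.create(model="o3-mini", messages=messages, temperature=0.15)
except BadRequestError as err:
    print("temperature rejected:", err)

# Expected to work: reasoning_effort is the tuning knob for o-series models.
resp = client.chat.completions.create(model="o3-mini", messages=messages, reasoning_effort="low")
print(resp.choices[0].message.content)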

assafelovic (Owner) commented

Hey @guiramos, for now we've introduced o models for the STRATEGIC_LLM only, which is in charge of planning the research. Tbh I think using them anywhere else is overkill for the task. Check here for more details about modifying the config: https://docs.gptr.dev/docs/gpt-researcher/gptr/config
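
For reference, a minimal sketch of that setup, assuming the environment-variable config keys from the docs above (the model names are just examples, not a recommendation):

import asyncio
import os

from gpt_researcher import GPTResearcher

# Point only the planning model at an o-series model; keep the others on 4o-class models.
os.environ["FAST_LLM"] = "openai:gpt-4o-mini"
os.environ["SMART_LLM"] = "openai:gpt-4o-2024-08-06"
os.environ["STRATEGIC_LLM"] = "openai:o3-mini"

async def run(query: str) -> str:
    researcher = GPTResearcher(query=query, report_type="research_report")
    await researcher.conduct_research()
    return await researcher.write_report()

# asyncio.run(run("example query"))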


guiramos commented Feb 4, 2025

@assafelovic ok, that makes sense. Thank you.

guiramos closed this as completed Feb 4, 2025

regismesquita commented Feb 12, 2025

@assafelovic 4o is more expensive than o3-mini (double the price); even if it is overkill, it might still be a better deal?

guiramos (Author) commented

That is a very good point.

However, the way o3 handles/interprets the prompts is different, so it will have an impact on the code.

guiramos reopened this Feb 12, 2025

regismesquita commented Feb 12, 2025

Yes, there is some impact. To make it work locally I had to remove the temperature settings and add the reasoning effort (I am using o3-mini for all LLMs; fast and smart use low effort and strategic uses high).

My hacky local patch:

 % git diff master -- gpt_researcher --patch | cat
diff --git a/gpt_researcher/actions/query_processing.py b/gpt_researcher/actions/query_processing.py
index 3b08da54..062e7e7f 100644
--- a/gpt_researcher/actions/query_processing.py
+++ b/gpt_researcher/actions/query_processing.py
@@ -60,6 +60,7 @@ async def generate_sub_queries(
             llm_provider=cfg.strategic_llm_provider,
             max_tokens=None,
             llm_kwargs=cfg.llm_kwargs,
+            reasoning_effort="high",
             cost_callback=cost_callback,
         )
     except Exception as e:
diff --git a/gpt_researcher/actions/report_generation.py b/gpt_researcher/actions/report_generation.py
index 98cb7f9d..d3acf23c 100644
--- a/gpt_researcher/actions/report_generation.py
+++ b/gpt_researcher/actions/report_generation.py
@@ -47,7 +47,7 @@ async def write_report_introduction(
                     language=config.language
                 )},
             ],
-            temperature=0.25,
+            #temperature=0.25,
             llm_provider=config.smart_llm_provider,
             stream=True,
             websocket=websocket,
@@ -92,7 +92,7 @@ async def write_conclusion(
                                                                        report_content=context,
                                                                        language=config.language)},
             ],
-            temperature=0.25,
+            #temperature=0.25,
             llm_provider=config.smart_llm_provider,
             stream=True,
             websocket=websocket,
@@ -135,7 +135,7 @@ async def summarize_url(
                 {"role": "system", "content": f"{role}"},
                 {"role": "user", "content": f"Summarize the following content from {url}:\n\n{content}"},
             ],
-            temperature=0.25,
+            #temperature=0.25,
             llm_provider=config.smart_llm_provider,
             stream=True,
             websocket=websocket,
@@ -180,7 +180,7 @@ async def generate_draft_section_titles(
                 {"role": "user", "content": generate_draft_titles_prompt(
                     current_subtopic, query, context)},
             ],
-            temperature=0.25,
+            #temperature=0.25,
             llm_provider=config.smart_llm_provider,
             stream=True,
             websocket=None,
@@ -242,7 +242,7 @@ async def generate_report(
                 {"role": "system", "content": f"{agent_role_prompt}"},
                 {"role": "user", "content": content},
             ],
-            temperature=0.35,
+            #temperature=0.35,
             llm_provider=cfg.smart_llm_provider,
             stream=True,
             websocket=websocket,
@@ -257,7 +257,7 @@ async def generate_report(
                 messages=[
                     {"role": "user", "content": f"{agent_role_prompt}\n\n{content}"},
                 ],
-                temperature=0.35,
+                #temperature=0.35,
                 llm_provider=cfg.smart_llm_provider,
                 stream=True,
                 websocket=websocket,
diff --git a/gpt_researcher/utils/llm.py b/gpt_researcher/utils/llm.py
index 611e065a..deac0db0 100644
--- a/gpt_researcher/utils/llm.py
+++ b/gpt_researcher/utils/llm.py
@@ -28,7 +28,8 @@ async def create_chat_completion(
         stream: Optional[bool] = False,
         websocket: Any | None = None,
         llm_kwargs: Dict[str, Any] | None = None,
-        cost_callback: callable = None
+        cost_callback: callable = None,
+        reasoning_effort: Optional[str] = "low"
 ) -> str:
     """Create a chat completion using the OpenAI API
     Args:
@@ -51,7 +52,8 @@ async def create_chat_completion(
             f"Max tokens cannot be more than 16,000, but got {max_tokens}")

     # Get the provider from supported providers
-    provider = get_llm(llm_provider, model=model, temperature=temperature,
+    provider = get_llm(llm_provider, model=model, #temperature=temperature,
+                       reasoning_effort=reasoning_effort,
                        max_tokens=max_tokens, **(llm_kwargs or {}))

     response = ""
@@ -101,8 +103,9 @@ async def construct_subtopics(task: str, data: str, config, subtopics: list = []
         provider = get_llm(
             config.smart_llm_provider,
             model=config.smart_llm_model,
-            temperature=temperature,
+            #temperature=temperature,
             max_tokens=config.smart_token_limit,
+            reasoning_effort="high",
             **config.llm_kwargs,
         )
         model = provider.llm

and start with:

FAST_LLM="openai:o3-mini" SMART_LLM="openai:o3-mini" STRATEGIC_LLM="openai:o3-mini"    python -m uvicorn main:app --host 0.0.0.0 --reload

Edit: So some work would be necessary to add support for o3-mini while not breaking support for everything else.

sassanix commented

Not all of us have access to o3, so this works just fine for my use case at the moment.

regismesquita (Contributor) commented

Looking back now, I could probably check if the model name contains o3 and, if it does, just change llm.py to omit the temperature and add the reasoning_effort field.


regismesquita commented Feb 12, 2025

This is much cleaner and should still support the currently existing models 🤔

 % git diff --staged master -- gpt_researcher | cat
diff --git a/gpt_researcher/actions/query_processing.py b/gpt_researcher/actions/query_processing.py
index 3b08da54..062e7e7f 100644
--- a/gpt_researcher/actions/query_processing.py
+++ b/gpt_researcher/actions/query_processing.py
@@ -60,6 +60,7 @@ async def generate_sub_queries(
             llm_provider=cfg.strategic_llm_provider,
             max_tokens=None,
             llm_kwargs=cfg.llm_kwargs,
+            reasoning_effort="high",
             cost_callback=cost_callback,
         )
     except Exception as e:
diff --git a/gpt_researcher/utils/llm.py b/gpt_researcher/utils/llm.py
index 611e065a..bab610e2 100644
--- a/gpt_researcher/utils/llm.py
+++ b/gpt_researcher/utils/llm.py
@@ -28,7 +28,8 @@ async def create_chat_completion(
         stream: Optional[bool] = False,
         websocket: Any | None = None,
         llm_kwargs: Dict[str, Any] | None = None,
-        cost_callback: callable = None
+        cost_callback: callable = None,
+        reasoning_effort: Optional[str] = "low"
 ) -> str:
     """Create a chat completion using the OpenAI API
     Args:
@@ -51,9 +52,18 @@ async def create_chat_completion(
             f"Max tokens cannot be more than 16,000, but got {max_tokens}")
 
     # Get the provider from supported providers
-    provider = get_llm(llm_provider, model=model, temperature=temperature,
-                       max_tokens=max_tokens, **(llm_kwargs or {}))
-
+    kwargs = {
+        'model': model,
+        'max_tokens': max_tokens,
+        **(llm_kwargs or {})
+    }
+
+    if 'o3' in model:
+        kwargs['reasoning_effort'] = reasoning_effort
+    else:
+        kwargs['temperature'] = temperature
+
+    provider = get_llm(llm_provider, **kwargs)
     response = ""
     # create response
     for _ in range(10):  # maximum of 10 attempts
@@ -103,6 +113,7 @@ async def construct_subtopics(task: str, data: str, config, subtopics: list = []
             model=config.smart_llm_model,
             temperature=temperature,
             max_tokens=config.smart_token_limit,
+            reasoning_effort="high",
             **config.llm_kwargs,
         )
         model = provider.llm

You might want to make it possible for people to select the reasoning effort for each LLM, but this at least allows people to use o3 for now.
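
One possible shape for that per-LLM setting, purely hypothetical (these env var names do not exist in gpt-researcher today; this only illustrates how a per-role value could be fed into create_chat_completion):

import os

# Hypothetical per-role setting, e.g. STRATEGIC_LLM_REASONING_EFFORT=high in the environment.
def reasoning_effort_for(role: str, default: str = "low") -> str:
    """role is 'fast', 'smart' or 'strategic'."""
    return os.getenv(f"{role.upper()}_LLM_REASONING_EFFORT", default)

# Usage idea inside the actions, instead of hardcoding "high"/"low":
# create_chat_completion(..., reasoning_effort=reasoning_effort_for("strategic"))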

Edit: Interestingly, o3-mini is also much faster than 4o; for a simple research run it completed the task in a third of the time.

Edit 2: Never mind, this just failed on the detailed report inside the runnable sequence (construct subtopics); the fix is easy, just replicate this logic in the subtopics generation step.
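
A rough sketch of that follow-up fix, mirroring the branch from the llm.py diff above inside construct_subtopics (untested; variable names follow the diff):

# Inside construct_subtopics, build the get_llm kwargs the same way as in
# create_chat_completion: o3-style models get reasoning_effort, others keep temperature.
kwargs = {
    "model": config.smart_llm_model,
    "max_tokens": config.smart_token_limit,
    **config.llm_kwargs,
}
if "o3" in config.smart_llm_model:
    kwargs["reasoning_effort"] = "high"
else:
    kwargs["temperature"] = temperature

provider = get_llm(config.smart_llm_provider, **kwargs)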
