
🔬 Open Deep Research - Enhanced with Gensee Search

Fork Notice: This is an enhanced version of LangChain's Open Deep Research, integrated with Gensee Search for improved search capabilities and reasoning. See LangChain's original repository to learn more about how the underlying agent is built.

🚀 What's Different in This Version

  • 🔍 Gensee Search Integration: Replaced Tavily with Gensee Search for enhanced search quality and AI application optimization
  • 🧠 Improved Reasoning: Enhanced agent prompts to encourage more thorough search and reasoning processes
  • 🛠️ Easy Integration: Demonstrates simple integration of Gensee's testing and optimization tools for GenAI applications

Learn more about Gensee's AI testing and optimization platform at gensee.ai
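To give a feel for what the search swap involves, here is a minimal sketch of a search wrapper. The endpoint URL, request parameters, and response fields are illustrative assumptions, not Gensee's actual API; consult the platform documentation and this repository's source for the real integration.

```python
# Hypothetical search wrapper -- the endpoint, parameter names, and response
# shape below are assumptions for illustration, NOT Gensee's actual API.
import os
import requests

def gensee_search(query: str, max_results: int = 5) -> list[dict]:
    """Send a search query and return a list of result dicts."""
    resp = requests.post(
        "https://platform.gensee.ai/api/search",  # assumed endpoint
        headers={"Authorization": f"Bearer {os.environ['GENSEE_API_KEY']}"},
        json={"query": query, "max_results": max_results},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("results", [])
```

Because the rest of the agent only consumes the returned result list, swapping search providers stays localized to a wrapper like this one.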

🚀 Quickstart

1. Clone the repository and activate a virtual environment:

```bash
git clone https://github.com/GenseeAI/open_deep_research.git
cd open_deep_research
uv venv --python=3.12
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
```
2. Install dependencies:

```bash
uv sync
# or
uv pip install -r pyproject.toml
```
3. Set up your .env file to customize the environment variables (model selection, search tools, and other configuration settings). Get FREE access to Gensee Search at https://platform.gensee.ai/

```bash
cp .env.example .env
# GENSEE_API_KEY=your_api_key_here
```
4. Launch the agent with the LangGraph server locally:

```bash
# Install dependencies and start the LangGraph server
uvx --refresh --from "langgraph-cli[inmem]" --with-editable . --python 3.11 langgraph dev --allow-blocking
```

This will open the LangGraph Studio UI in your browser.

- 🚀 API: http://127.0.0.1:2024
- 🎨 Studio UI: https://smith.langchain.com/studio/?baseUrl=http://127.0.0.1:2024
- 📚 API Docs: http://127.0.0.1:2024/docs

Ask a question in the messages input field and click Submit. You can select different configurations in the "Manage Assistants" tab, or drive the server programmatically, as in the sketch below.
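If you prefer the API over the Studio UI, the LangGraph Python SDK can stream a run against the local server. This is a minimal sketch; the graph name "Deep Researcher" is an assumption, so check langgraph.json in this repository for the actual graph id.

```python
# Stream a run against the local LangGraph dev server.
# The graph name "Deep Researcher" is assumed -- check langgraph.json.
import asyncio
from langgraph_sdk import get_client

async def main() -> None:
    client = get_client(url="http://127.0.0.1:2024")
    # Stateless run: thread_id=None means no persistent conversation thread.
    async for chunk in client.runs.stream(
        None,
        "Deep Researcher",  # graph id (assumed)
        input={"messages": [{"role": "user", "content": "What is quantum advantage?"}]},
        stream_mode="updates",
    ):
        print(chunk.event, chunk.data)

asyncio.run(main())
```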

⚙️ Configurations

See the fields in tests/run_evaluate.py to configure model usage and other agent behaviors.
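Configuration is passed through LangGraph's standard `configurable` dict. The field names below mirror the Summarization/Research/Compression columns in the results table but are assumptions; verify them against tests/run_evaluate.py before use.

```python
# Sketch of per-stage model configuration. Field names are assumptions
# drawn from the results table -- verify against tests/run_evaluate.py.
config = {
    "configurable": {
        "summarization_model": "openai:gpt-4.1-mini",
        "research_model": "openai:gpt-5",
        "compression_model": "openai:gpt-4.1",
        "search_api": "gensee",  # assumed key for the Gensee Search backend
    }
}
# Pass this as `config=` when invoking the graph, e.g.:
#   result = await graph.ainvoke({"messages": [...]}, config=config)
```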

📊 Evaluation

Open Deep Research is configured for evaluation with Deep Research Bench, a benchmark of 100 PhD-level research tasks (50 English, 50 Chinese) crafted by domain experts across 22 fields (e.g., Science & Tech, Business & Finance) to mirror real-world deep-research needs. The benchmark defines two evaluation metrics, but the leaderboard is based on the RACE score, which uses an LLM-as-a-judge (Gemini) to grade research reports against a golden set of expert-compiled reports across a set of criteria.

Usage

Warning: Running across the 100 examples can cost ~$20-$100 depending on the model selection.

```bash
# Run comprehensive evaluation on LangSmith datasets
python tests/run_evaluate.py
```

This will print a link to a LangSmith experiment; note its name (YOUR_EXPERIMENT_NAME below). Once the run finishes, extract the results to a JSONL file that can be submitted to Deep Research Bench:

```bash
python tests/extract_langsmith_data.py --project-name "YOUR_EXPERIMENT_NAME" --model-name "your-model-name" --dataset-name "deep_research_bench"
```

This creates tests/expt_results/deep_research_bench_<model-name>.jsonl in the required format. Move the generated JSONL file to a local clone of the Deep Research Bench repository and follow their Quick Start guide to submit it for evaluation.
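Before submitting, a quick sanity check of the extracted file can save a round trip. A minimal sketch; the field names it prints are whatever extract_langsmith_data.py emitted, so inspect a record rather than assuming a schema:

```python
# Sanity-check the extracted results file before submission.
# Adjust the path to match the --model-name you used above.
import json

path = "tests/expt_results/deep_research_bench_your-model-name.jsonl"
with open(path, encoding="utf-8") as f:
    records = [json.loads(line) for line in f if line.strip()]

print(f"{len(records)} records")  # expect 100 for the full benchmark
print(sorted(records[0].keys()))  # inspect the emitted fields
```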

Results

| Name | Summarization | Research | Compression | Total Cost | Total Tokens | RACE Score |
|------|---------------|----------|-------------|------------|--------------|------------|
| Gensee Search | openai:gpt-4.1-mini | openai:gpt-5 | openai:gpt-4.1 | $158.56 | 165,689,034 | 0.5079 |
| LangChain GPT-5 | openai:gpt-4.1-mini | openai:gpt-5 | openai:gpt-4.1 | | 204,640,896 | 0.4943 |
| LangChain Submission | openai:gpt-4.1-nano | openai:gpt-4.1 | openai:gpt-4.1 | $87.83 | 207,005,549 | 0.4344 |
