A tool to analyze and evaluate code review comments from different AI code review bots using LLMs.
- Fetches and analyzes Pull Request data from GitHub repositories
- Evaluates code review comments using Google's Gemini model
- Categorizes comments into:
  - Critical Bugs
  - Nitpicks
  - Other feedback
- Generates visual analysis and detailed reports
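
As a rough illustration of the categorization, a review comment could be modeled as below. This is a minimal sketch only; the real data models live in `models.py` and the class and field names here are assumptions, not the project's actual API.

```python
# Illustrative sketch of a categorized review comment (names are hypothetical).
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class CommentCategory(Enum):
    CRITICAL_BUG = "critical_bug"
    NITPICK = "nitpick"
    OTHER = "other"

@dataclass
class ReviewComment:
    bot_name: str                                # which AI review bot left the comment
    pr_number: int                               # pull request the comment belongs to
    body: str                                    # comment text sent to the LLM for evaluation
    category: Optional[CommentCategory] = None   # assigned by the Gemini-based analyzer
```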
- Clone the repository:
  git clone https://github.com/Entelligence-AI/code_review_evals.git
  cd code_review_evals
- Install dependencies:
  pip install -r requirements.txt
- Set up environment variables:
  cp .env.example .env
  # Edit .env and add your API keys
- Run the analysis:
  python main.py
Required environment variables in your .env file:
GITHUB_TOKEN=your_github_personal_access_token_here
GOOGLE_API_KEY=your_gemini_api_key_here
GITHUB_REPO=owner/repo # default: microsoft/typescript
NUM_PRS=5 # number of PRs to analyze
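
A minimal sketch of how these values can be read in Python, assuming python-dotenv is used to load the .env file (main.py may do this differently); the variable names match the keys above:

```python
# Load configuration from .env into the environment, then read it with defaults.
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current directory into os.environ

GITHUB_TOKEN = os.environ["GITHUB_TOKEN"]        # required
GOOGLE_API_KEY = os.environ["GOOGLE_API_KEY"]    # required
GITHUB_REPO = os.getenv("GITHUB_REPO", "microsoft/typescript")  # optional
NUM_PRS = int(os.getenv("NUM_PRS", "5"))         # optional
```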
To get the required API keys:
- GitHub Token: https://github.com/settings/tokens
  - Needs repo scope access
- Google API Key: https://makersuite.google.com/app/apikey
  - Enable Gemini API access
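
If you want to confirm the GitHub token is valid before a full run, a quick check against the GitHub REST API looks like this (not part of the tool itself; assumes the requests package is available):

```python
# Sanity-check the GitHub token by fetching the authenticated user.
import os
import requests

resp = requests.get(
    "https://api.github.com/user",
    headers={"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"},
    timeout=10,
)
print(resp.status_code, resp.json().get("login"))  # 200 plus your username means the token works
```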
The tool generates several outputs in the analysis_results directory:
- comment_distribution.png - Visual breakdown of comment categories
- bot_comparison.png - Comparison of performance across the different bots
- analysis_report.txt - Detailed metrics and analysis
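
After a run, you can inspect the results directly, for example (paths follow the output layout above):

```python
# Print the text report and confirm the charts were written.
from pathlib import Path

results = Path("analysis_results")
report = results / "analysis_report.txt"
if report.exists():
    print(report.read_text())
for chart in ("comment_distribution.png", "bot_comparison.png"):
    print(chart, "written:", (results / chart).exists())
```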
For interactive analysis, you can use the provided notebook:
jupyter notebook notebooks/code_review_analysis.ipynb
Project structure:
code_review_evals/
├── analyzers/ # Analysis modules for different LLMs
├── github/ # GitHub API interaction
├── utils/ # Utility functions
├── visualization/ # Visualization tools
├── models.py # Data models
├── prompts.py # LLM prompts
├── main.py # Main execution script
└── requirements.txt
- Fork the repository
- Create a feature branch
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.