Releases: confident-ai/deepeval
Lots of new features
Lots of new features this release:
- JudgementalGPT now allows for different languages - useful for our APAC and European friends
- RAGAS metrics now support all OpenAI models - useful for those running into context length issues
- LLMEvalMetric now returns a reasoning for its score
- deepeval test run now has hooks that are called on test run completion
- evaluate now displays retrieval_context for RAG evaluation
- The RAGAS metric now displays a metric breakdown for all its distinct metrics
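The score-plus-reasoning behavior that LLMEvalMetric now exposes could be sketched conceptually like this. Note this is a hypothetical illustration with made-up names (`MetricResult`, `judge_answer`) and a trivial token-overlap heuristic standing in for the LLM judge, not deepeval's actual API:

```python
# Hypothetical sketch, not deepeval's real classes: a metric result that
# carries a natural-language reason alongside the numeric score.
from dataclasses import dataclass


@dataclass
class MetricResult:
    score: float  # e.g. in [0.0, 1.0]
    reason: str   # justification for the score


def judge_answer(expected: str, actual: str) -> MetricResult:
    # Stand-in for an LLM judge: a simple token-overlap heuristic.
    expected_tokens = set(expected.lower().split())
    actual_tokens = set(actual.lower().split())
    hits = len(expected_tokens & actual_tokens)
    return MetricResult(
        score=round(hits / max(len(expected_tokens), 1), 2),
        reason=f"{hits} of {len(expected_tokens)} expected tokens found in the answer.",
    )


result = judge_answer("Paris is the capital of France",
                      "The capital of France is Paris")
print(result.score, result.reason)
```

The point is only the shape of the return value: a score you can threshold on, plus a reason you can surface in reports.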
Continuous Evaluation
Automatically integrated with Confident AI for continuous evaluation throughout the lifetime of your LLM (app):
- log evaluation results and analyze metric passes / fails
- compare and pick the optimal hyperparameters (e.g. prompt templates, chunk size, models used, etc.) based on evaluation results
- debug evaluation results via LLM traces
- manage evaluation test cases / datasets in one place
- track events to identify live LLM responses in production
- add production events to existing evaluation datasets to strengthen evals over time
Evaluate entire datasets
Mid-week bug fixes release with an extra feature:
- run_test now works
- new function evaluate: evaluates a list of test cases (dataset) on metrics you define, all without having to go through the CLI. More info here: https://docs.confident-ai.com/docs/evaluation-datasets#evaluate-your-dataset-without-pytest
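The evaluate-a-dataset-without-pytest pattern described above looks roughly like this. This is a self-contained conceptual sketch with hypothetical stand-in types (`TestCase`, `exact_match`, and a local `evaluate`), not deepeval's actual signatures; see the linked docs for the real API:

```python
# Conceptual sketch of the pattern: run every metric over every test case
# in a dataset, collecting pass/fail results, with no CLI or pytest involved.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TestCase:
    input: str
    actual_output: str
    expected_output: str


Metric = Callable[[TestCase], float]  # returns a score in [0, 1]


def exact_match(tc: TestCase) -> float:
    # Trivial example metric: strict string equality.
    return 1.0 if tc.actual_output.strip() == tc.expected_output.strip() else 0.0


def evaluate(test_cases: List[TestCase], metrics: List[Metric],
             threshold: float = 0.5) -> List[dict]:
    results = []
    for tc in test_cases:
        for metric in metrics:
            score = metric(tc)
            results.append({"input": tc.input, "metric": metric.__name__,
                            "score": score, "passed": score >= threshold})
    return results


dataset = [
    TestCase("2+2?", "4", "4"),
    TestCase("Capital of France?", "Lyon", "Paris"),
]
results = evaluate(dataset, [exact_match])
for r in results:
    print(r)
```

Calling a function directly on an in-memory list of test cases is what makes this usable from notebooks and scripts, not just from a test runner.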
Judgemental GPT
In this release, deepeval has added support for:
- JudgementalGPT, a dedicated LLM app developed by Confident AI to perform evaluations more robustly and accurately. JudgementalGPT provides a score and a reason for the score.
- Parallel testing: execute test cases in parallel and speed up evaluation by up to 100x.
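The general technique behind parallel testing can be sketched with a thread pool; deepeval's own internals and flags may differ, and the 0.05 s sleep here is just a stand-in for a slow LLM evaluation call:

```python
# Sketch: fan slow, independent test-case evaluations out across threads
# so total wall time approaches one case's latency instead of the sum.
from concurrent.futures import ThreadPoolExecutor
import time


def run_test_case(case_id: int):
    time.sleep(0.05)  # stand-in for a network-bound LLM evaluation call
    return case_id, True  # (case id, passed)


cases = list(range(8))

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(run_test_case, cases))
parallel_time = time.perf_counter() - start

print(len(results), "cases evaluated in", round(parallel_time, 3), "s")
```

Threads work here because the simulated work is I/O-bound (waiting on an API), which is exactly the profile of LLM evaluation calls.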
v0.20.17
new release
v0.20.16
new release
v0.20.15
new release
v0.20.14
prepare for release
v0.20.13
release v0.20.13