Releases: confident-ai/deepeval

Red teaming, safety testing, and improved synthesizer, conversational metrics, multi-modal metrics

31 Oct 23:01

In DeepEval 1.4.7, we're releasing:

Agentic Evaluation Metric, Custom Evaluation LLMs, and Async for Synthetic Data Generation

30 Jul 17:27

In DeepEval v0.21.74, we have:

Verbosity in Metrics, Hyperparameter Logging, Improved Synthetic Data Generation, Better Async Support

25 Jun 12:14

In DeepEval v0.21.62, we:

Synthetic Data, Caching, Benchmarks, and GEval improvement

31 Mar 18:30

For deepeval's latest release v0.21.15, we release:

Async Support for Prod

09 Mar 17:27

In deepeval v0.20.85:

Conversational Metrics and Synthetic Data Generation

04 Mar 18:04

In DeepEval's latest release, there is now:

Production Stability

25 Feb 11:18

For the newest release, deepeval is now stable for production use:

  • reduced package size
  • separated the functionality of pytest from the deepeval test run command
  • included a coverage score for summarization
  • fixed the contextual precision node error
  • released docs for better transparency into metrics calculation
  • allowed users to configure RAGAS metrics with custom embedding models: https://docs.confident-ai.com/docs/metrics-ragas#example
  • fixed bugs with checking for package updates

Hugging Face and LlamaIndex integration

14 Feb 06:05

For the latest release, DeepEval:

LLM-Evals now support all LangChain chat models

16 Jan 11:22
  • LLM-Evals (LLM-evaluated metrics) now support all of LangChain's chat models.
  • LLMTestCase now has execution_time and cost, useful for those looking to evaluate on these parameters.
  • minimum_score is now threshold, meaning you can create custom metrics that have either a "minimum" or a "maximum" threshold.
  • LLMEvalMetric is now GEval.
  • LlamaIndex tracing integration: https://docs.llamaindex.ai/en/stable/module_guides/observability/observability.html#deepeval
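
To illustrate the move from minimum_score to threshold, here is a minimal, self-contained sketch. It does not import deepeval, and none of the classes below are deepeval's API: SketchTestCase, MaxLatencyMetric, and MinWordCountMetric are hypothetical names invented for this example. The only details taken from the notes above are the execution_time and cost fields and the idea that a threshold can act as either a "maximum" or a "minimum".

```python
from dataclasses import dataclass


@dataclass
class SketchTestCase:
    """Hypothetical stand-in for a test case (not deepeval's LLMTestCase)."""
    input: str
    actual_output: str
    execution_time: float  # seconds taken to produce actual_output
    cost: float            # cost of the call, in whatever unit you track


class MaxLatencyMetric:
    """Hypothetical metric: threshold is a maximum (lower scores pass)."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.score = 0.0
        self.success = False

    def measure(self, case: SketchTestCase) -> float:
        self.score = case.execution_time
        # Pass when the measured time does not exceed the threshold.
        self.success = self.score <= self.threshold
        return self.score


class MinWordCountMetric:
    """Hypothetical metric: threshold is a minimum (higher scores pass)."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.score = 0.0
        self.success = False

    def measure(self, case: SketchTestCase) -> float:
        self.score = float(len(case.actual_output.split()))
        # Pass when the output has at least `threshold` words.
        self.success = self.score >= self.threshold
        return self.score


case = SketchTestCase(
    input="What is DeepEval?",
    actual_output="An LLM evaluation framework",
    execution_time=1.2,
    cost=0.0004,
)

latency = MaxLatencyMetric(threshold=2.0)
latency.measure(case)

length = MinWordCountMetric(threshold=3)
length.measure(case)

print(latency.success, length.success)
```

Both metrics share the same threshold attribute; only the direction of the comparison differs, which is what a single neutral name makes possible where minimum_score could not.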

All RAG Metrics now offer score reasoning, and a lot more.

28 Dec 11:50

In this release: