
Add validator module #61

Merged: 27 commits from the validator branch were merged into main on Mar 27, 2025

Conversation

@elisno elisno commented Mar 21, 2025

Tutorial for this PR:

https://github.com/cleanlab/cleanlab-studio-docs/pull/868

Related tutorial for TrustworthyRAG module:
https://github.com/cleanlab/cleanlab-studio-docs/pull/867

The idea is that users doing RAG who just want real-time detection of issues should use TrustworthyRAG (we will clarify this later on). Users doing RAG who want detection + flagging/logging + remediation of issues should use this tutorial (Codex). We can unify everything better later on.

What changed?

Added a module with a new Validator class that scores responses from RAG systems and detects & remediates bad responses.

A few notes:

  • BadResponseThresholds for this module is a Pydantic BaseModel, mainly to validate that the thresholds are in the 0-1 range (see the sketch after this list). I'm fine with just having this be a dictionary.
  • By default, we'll add explanations to the trustworthiness score from TrustworthyRAG, but only return the trustworthiness score and the response helpfulness score (the other default scores are not used for validation at the moment). The get_default_evaluations() function controls which Evals TrustworthyRAG uses by default. This get_default_evaluations() function is different from the one defined in cleanlab_tlm (IIRC it's called get_default_evals() there).
  • TrustworthyRAG will work fine when tlm_api_key is None, as long as CLEANLAB_TLM_API_KEY is set as an environment variable. So a minimal constructor would look like validator = Validator(codex_access_key="<your-access-key>"), with the rest of the configuration falling back to pre-defined defaults.
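
For reference, here is a minimal sketch of the Pydantic approach described in the first note above. The field names and defaults are assumptions for illustration, not taken from this PR:

from pydantic import BaseModel, Field

class BadResponseThresholds(BaseModel):
    # Scores below a threshold mark the corresponding metric as "bad".
    # The ge/le constraints enforce that each threshold stays in the 0-1 range.
    trustworthiness: float = Field(default=0.5, ge=0.0, le=1.0)
    response_helpfulness: float = Field(default=0.5, ge=0.0, le=1.0)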

Usage example:

from cleanlab_codex import Validator

validator = Validator(codex_access_key="<your-access-key>")

CONTEXT = """Simple Water Bottle - Amber (limited edition launched Jan 1st 2025)
A water bottle designed with a perfect blend of functionality and aesthetics in mind. Crafted from high-quality, durable plastic with a sleek honey-colored finish.
Price: $24.99 \nDimensions: 10 inches height x 4 inches width"""

results = validator.validate(
    query="How much water can the Simple Water Bottle hold?",
    context=CONTEXT,
    response="The Simple Water Bottle can hold 34 oz of Water",
)
results

prints out:

{
    "is_bad_response": True,
    "expert_answer": "32oz",
    "trustworthiness": {
        "log": {
            "explanation": "The proposed response states that the Simple Water Bottle can hold 34 oz of water. However, the context information provided does not specify the capacity of the water bottle. Without explicit details about the volume it can hold, the response cannot be verified as correct. The dimensions of the bottle (10 inches height x 4 inches width) do not directly indicate its volume capacity, and the assumption made in the response could be inaccurate. A more appropriate response would acknowledge the lack of specific information regarding the bottle's capacity and suggest that the user check the product specifications for accurate details. Therefore, the proposed response is not substantiated by the provided context. \nThis response is untrustworthy due to lack of consistency in possible responses from the model. Here's one inconsistent alternate response that the model considered (which may not be accurate either): \nThe Simple Water Bottle can hold approximately 70 fluid ounces of water."
        },
        "score": 0.18455227478066594,
        "is_bad": True,
    },
    "response_helpfulness": {
        "score": 0.9975124364465637,
        "is_bad": False,
    },
}
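
To illustrate the detect + remediate flow this module targets, here is a minimal sketch of how a RAG app might act on these results. The variable names and the fallback message are assumptions for illustration, not part of the API:

# query/context as in the example above; draft_response is the RAG system's initial answer.
results = validator.validate(query=query, context=context, response=draft_response)

if results["expert_answer"] is not None:
    # Remediation: serve the expert answer retrieved from the Codex Project.
    final_response = results["expert_answer"]
elif results["is_bad_response"]:
    # Flagged as bad but no expert answer yet; fall back to an abstention message (assumed wording).
    final_response = "I'm not sure based on the available information."
else:
    # Response passed validation; serve it as-is.
    final_response = draft_response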


Checklist

  • Did you link the GitHub issue?
  • Did you follow deployment steps or bump the version if needed?
  • Did you add/update tests? At least for some internal functions.
  • What QA did you do?
    • Tested...

@elisno elisno requested a review from jwmueller March 21, 2025 00:07

@elisno commented Mar 21, 2025

Current test coverage does not include src/cleanlab_codex/validator.py (which CI is complaining about). I plan to add tests after the initial round of reviews to avoid unnecessary rework if further changes are needed. Let me know if you prefer adding them earlier.

@jwmueller commented Mar 21, 2025

Instead of leaving this as your gist (https://gist.github.com/elisno/65dca2bebb20e1749afa753784bab920), please go ahead and PR the same gist as a tutorial notebook to https://github.com/cleanlab/cleanlab-studio-docs/.

For now, have the tutorial show:

  • running with default settings
  • running with advanced settings, where you set basically everything the user could possibly specify to non-default values (see the sketch below).
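
For instance, the advanced-settings cell could look roughly like this. The keyword names below are assumptions for illustration; check the released cleanlab_codex API for the exact parameters:

from cleanlab_codex import Validator

validator = Validator(
    codex_access_key="<your-access-key>",
    tlm_api_key="<your-tlm-api-key>",  # optional if CLEANLAB_TLM_API_KEY is set in the environment
    bad_response_thresholds={  # assumed keyword; scores below these thresholds mark a response as bad
        "trustworthiness": 0.7,
        "response_helpfulness": 0.3,
    },
)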

elisno added 4 commits March 21, 2025 14:10
Adds support for custom evaluation thresholds, introduces ThresholdedTrustworthyRAGScore type,
and improves validation error handling with better documentation.
@elisno elisno requested a review from axl1313 March 22, 2025 03:23
@anishathalye anishathalye self-requested a review March 23, 2025 17:03
@jwmueller jwmueller requested a review from aditya1503 March 25, 2025 00:23
@anishathalye left a comment

Overall LGTM. High-level comment mirrors my comment on the tutorial—how is a user supposed to understand whether they should use Validator, TrustworthyRAG, or TLM directly?

Left a bunch of smaller comments inline.

Inline comment thread on src/cleanlab_codex/validator.py, at this code:

    **scores,
    }

    def detect(

Member:

If a user just wants to detect bad responses, should they use TrustworthyRAG or Validator.detect()? How is a user supposed to understand how these two relate to each other?

Member:

The idea (which the docstring should be updated to reflect) was that Validator is just a version of TrustworthyRAG with different default evals & predetermined thresholds.

The practical impact of those thresholds is that they determine when we look things up in Codex (what gets logged in the Project for an SME to answer, and what gets answered by Codex instead of the RAG app). But that impact is primarily realized in Validator.validate(), not in Validator.detect().

So we could make detect() a private method? It's essentially just another version of the .validate() method that is not hooked up to any Codex Project (e.g., for testing detection configurations without impacting the Codex Project via logging).
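
A rough sketch of that relationship (hypothetical helper names, not the actual implementation): validate() is essentially detect() plus a Codex lookup that only fires when the thresholded scores flag the response as bad.

from typing import Callable, Optional, Tuple

def validate_sketch(
    detect: Callable[..., Tuple[dict, bool]],      # stand-in for Validator.detect()
    query_codex: Callable[[str], Optional[str]],   # stand-in for the Codex Project lookup/logging
    query: str,
    context: str,
    response: str,
) -> dict:
    # detect() only runs the TrustworthyRAG evals and applies the thresholds.
    scores, is_bad_response = detect(query=query, context=context, response=response)
    expert_answer: Optional[str] = None
    if is_bad_response:
        # Only here is the Codex Project touched: the query gets logged for SMEs,
        # and an expert answer is returned if one already exists.
        expert_answer = query_codex(query)
    return {"expert_answer": expert_answer, "is_bad_response": is_bad_response, **scores}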

Member:

That solution sounds fine to me—making it private, and updating the instructions to indicate that detect -> TrustworthyRAG, detect + remediate -> Validator.

@jwmueller commented Mar 26, 2025:

sgtm. @elisno, can you also add an optional flag to Validator.validate() that allows users to run the detection for testing purposes without interacting with Codex in any way? (No querying Codex at all, so testing runs aren't polluting the Codex Project.)

This flag could be something like testing_mode=False by default (try to think of a better name).

Member Author (@elisno):

On second thought, we should keep the detect() method public for threshold-tuning and testing purposes (without affecting Codex). I've updated the docstring to reflect this.

No need for another optional flag in Validator.validate().

Member:

Also include a screenshot of the tutorial showing that it's clearly explained when to use validate() vs. detect().

Member:

I pushed more docstring changes to clearly distinguish these, so please review those.

Member Author (@elisno):

> Also include a screenshot of the tutorial showing that it's clearly explained when to use validate() vs. detect().

https://github.com/cleanlab/cleanlab-studio-docs/pull/868#issuecomment-2756947611

@jwmueller commented Mar 27, 2025:

That screenshot does not explain the main reason to use detect(), which is to test/tune detection configurations like the evaluation score thresholds and TrustworthyRAG settings.

@anishathalye commented:

This doesn't necessarily have to block merging of this PR, but it would be great for us to dogfood Validator in migrating rag.app to use Codex as a backup.

@elisno elisno requested a review from jwmueller March 27, 2025 07:11
@elisno elisno merged commit da1515a into main Mar 27, 2025
11 checks passed
@jwmueller jwmueller deleted the validator branch March 27, 2025 22:01