Merge pull request #76 from cvs-health/release-branch/v0.3.0
Release PR: v0.3.0
dylanbouchard authored Dec 20, 2024
2 parents 9908818 + fb2cf18 commit c5d1429
Showing 37 changed files with 2,284 additions and 1,260 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -137,3 +137,6 @@ dmypy.json

#Example generated files
examples/evaluations/text_generation/final_metrics.txt
+
+#Data generated by langfair data_loader module
+langfair/data/*
12 changes: 6 additions & 6 deletions README.md
@@ -50,8 +50,8 @@ We can use `ResponseGenerator.generate_responses` to generate 25 responses for e
from langfair.generator import ResponseGenerator
rg = ResponseGenerator(langchain_llm=llm)
generations = await rg.generate_responses(prompts=prompts, count=25)
-responses = [str(r) for r in generations["data"]["response"]]
-duplicated_prompts = [str(r) for r in generations["data"]["prompt"]] # so prompts correspond to responses
+responses = generations["data"]["response"]
+duplicated_prompts = generations["data"]["prompt"] # so prompts correspond to responses
```
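The change above drops the `str(...)` casts, implying `generate_responses` now returns plain strings. A minimal sketch of the assumed return shape; only the two `data` keys are confirmed by the snippet:

```python
# Illustrative sketch of the assumed `generations` structure; only
# data['prompt'] and data['response'] are confirmed by the diff above.
generations = {
    'data': {
        'prompt': ['prompt 1', 'prompt 1'],          # each prompt repeated `count` times
        'response': ['response 1a', 'response 1b'],  # one string per generation
    },
}
```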

##### Compute toxicity metrics
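The toxicity snippet itself is collapsed in this view. A hedged sketch of the pattern, assuming the `ToxicityMetrics` import path and argument names (none of which are visible in this hunk):

```python
from langfair.metrics.toxicity import ToxicityMetrics

# Assumed API: the actual snippet is collapsed in the diff, so the class
# name, import path, and argument names are best-effort assumptions.
tm = ToxicityMetrics()
tox_result = tm.evaluate(prompts=duplicated_prompts, responses=responses)
tox_result['metrics']
```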
@@ -96,8 +96,8 @@ cg = CounterfactualGenerator(langchain_llm=llm)
cf_generations = await cg.generate_responses(
prompts=prompts, attribute='gender', count=25
)
-male_responses = [str(r) for r in cf_generations['data']['male_response']]
-female_responses = [str(r) for r in cf_generations['data']['female_response']]
+male_responses = cf_generations['data']['male_response']
+female_responses = cf_generations['data']['female_response']
```

Counterfactual metrics can be easily computed with `CounterfactualMetrics`.
@@ -109,7 +109,7 @@ cf_result = cm.evaluate(
texts2=female_responses,
attribute='gender'
)
-cf_result
+cf_result['metrics']
# # Output is below
# {'Cosine Similarity': 0.8318708,
# 'RougeL Similarity': 0.5195852482361165,
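The `cm.evaluate` hunk enters the diff mid-call: the construction of `cm` and the `texts1` argument are collapsed. A hedged sketch of the full call, assuming the import path and constructor:

```python
from langfair.metrics.counterfactual import CounterfactualMetrics

# The lines constructing `cm` and passing `texts1` are collapsed in the
# diff, so this import path and constructor are assumptions.
cm = CounterfactualMetrics()
cf_result = cm.evaluate(
    texts1=male_responses,
    texts2=female_responses,
    attribute='gender'
)
cf_result['metrics']
```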
@@ -127,7 +127,7 @@ auto_object = AutoEval(
# toxicity_device=device # uncomment if GPU is available
)
results = await auto_object.evaluate()
-results
+results['metrics']
# Output is below
# {'Toxicity': {'Toxic Fraction': 0.0004,
# 'Expected Maximum Toxicity': 0.013845130120171235,
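The `AutoEval` hunk above is heavily collapsed: only the constructor's opening line, the commented `toxicity_device` argument, and the `evaluate()` call are visible. A hedged sketch of the full pattern; the import path and the `prompts`/`langchain_llm` arguments are assumptions:

```python
from langfair.auto import AutoEval

# Only `toxicity_device` and `evaluate()` are visible in the diff above;
# the import path and the other constructor arguments are assumptions.
auto_object = AutoEval(
    prompts=prompts,
    langchain_llm=llm,
    # toxicity_device=device  # uncomment if GPU is available
)
results = await auto_object.evaluate()
results['metrics']
```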
@@ -1,5 +1,13 @@
{
"cells": [
+{
+"cell_type": "markdown",
+"id": "ead0356d",
+"metadata": {},
+"source": [
+"# Classification Metrics Demo"
+]
+},
{
"cell_type": "code",
"execution_count": 1,
@@ -10,7 +18,8 @@
"outputs": [],
"source": [
"import numpy as np\n",
-"from langfair.metrics.classification import ClassificationMetrics"
+"\n",
+"from langfair.metrics.classification import ClassificationMetrics\n"
]
},
{
@@ -171,9 +180,9 @@
"uri": "us-docker.pkg.dev/deeplearning-platform-release/gcr.io/workbench-notebooks:m125"
},
"kernelspec": {
-"display_name": "langfair",
+"display_name": "langfair-ZgpfWZGz-py3.9",
"language": "python",
-"name": "langfair"
+"name": "python3"
},
"language_info": {
"codemirror_mode": {
@@ -185,7 +194,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
-"version": "3.9.20"
+"version": "3.9.4"
}
},
"nbformat": 4,
@@ -1,5 +1,13 @@
{
"cells": [
+{
+"cell_type": "markdown",
+"id": "3c377f48",
+"metadata": {},
+"source": [
+"# Recommendation Metrics Demo"
+]
+},
{
"cell_type": "markdown",
"id": "eab56f24-c606-4b0c-a259-f125d5a7e227",