Commit 6afd14f

[LSC_EVAL] Test Scenario Generation
[DESC] [0] Generated test cases using cursor [1] Removed unnecessary test cases generated by cursor
1 parent ca9a863 commit 6afd14f

16 files changed, +6266 −0 lines changed

lsc_eval/pdm.lock

Lines changed: 3036 additions & 0 deletions
Some generated files are not rendered by default.

lsc_eval/pytest.ini

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
```ini
[pytest]
# Pytest configuration for LSC Evaluation Framework tests
# (pytest.ini requires the [pytest] section header; [tool:pytest] belongs in setup.cfg)

# Test discovery
testpaths = tests
python_files = test_*.py
python_classes = Test*
python_functions = test_*

# Output options
addopts =
    -v
    --tb=short
    --strict-config
    --color=yes
    --durations=10

# Markers for test categorization
markers =
    unit: Unit tests for individual components
    integration: Integration tests across components
    slow: Tests that take longer to run
    config: Tests related to configuration loading
    models: Tests for Pydantic models
    validation: Tests for data validation
    llm: Tests requiring LLM API calls (may be skipped in CI)
    output: Tests for output generation and formatting

# Minimum version
minversion = 6.0

# Test collection: skip build artifacts
# (collect_ignore is only honored in conftest.py, not in the ini file,
# so directories are excluded via norecursedirs instead)
norecursedirs = .* build dist *.egg

# Warnings
filterwarnings =
    ignore::DeprecationWarning
    ignore::PendingDeprecationWarning
    ignore::UserWarning:litellm.*
    ignore::UserWarning:ragas.*
    ignore::UserWarning:deepeval.*

# Coverage options (if pytest-cov is installed)
# addopts = --cov=lsc_eval --cov-report=html --cov-report=term-missing
```
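
For reference, here is a minimal sketch of a test module that uses the markers registered above. The test functions and the `normalize` helper are illustrative assumptions, not code from this commit:

```python
# Illustrative sketch: shows how the registered markers are applied to tests.
import pytest


@pytest.mark.unit
@pytest.mark.models
def test_score_normalization_example():
    # Hypothetical helper standing in for real framework code.
    def normalize(raw: float, max_score: float = 5.0) -> float:
        return max(0.0, min(raw / max_score, 1.0))

    assert normalize(4.0) == 0.8
    assert normalize(-1.0) == 0.0


@pytest.mark.llm
@pytest.mark.slow
def test_llm_backed_metric_example():
    # Deselected in CI via: pytest -m "not slow and not llm"
    pytest.skip("requires live LLM credentials")
```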

lsc_eval/tests/README.md

Lines changed: 197 additions & 0 deletions
@@ -0,0 +1,197 @@
# LSC Evaluation Framework - Test Suite

This directory contains comprehensive test cases for the LSC Evaluation Framework, generated based on the `system.yaml` configuration file.

## Test Structure

```
tests/
├── conftest.py                 # Pytest fixtures and configuration
├── test_runner.py              # Test runner script
├── README.md                   # This file
├── core/                       # Core functionality tests
│   ├── test_config_loader.py   # ConfigLoader class tests
│   ├── test_models.py          # Pydantic models tests
│   └── test_data_validator.py  # DataValidator class tests
├── llm_managers/               # LLM manager tests
│   └── test_llm_manager.py     # LLMManager class tests
├── metrics/                    # Metrics component tests
│   └── test_custom_metrics.py  # Custom metrics tests
└── output/                     # Output component tests
    └── test_utils.py           # Output utilities tests
```

## Test Categories

The tests are organized into several categories using pytest markers:

- **`unit`**: Unit tests for individual components
- **`integration`**: Integration tests across components
- **`config`**: Configuration loading and validation tests
- **`models`**: Pydantic model validation tests
- **`validation`**: Data validation tests
- **`output`**: Output generation and formatting tests
- **`slow`**: Tests that take longer to run
- **`llm`**: Tests requiring LLM API calls (may be skipped in CI)

## Running Tests

### Using the Test Runner Script

The easiest way to run tests is using the provided test runner:

```bash
# Run all tests
python tests/test_runner.py all

# Run specific test categories
python tests/test_runner.py unit
python tests/test_runner.py config
python tests/test_runner.py models
python tests/test_runner.py validation

# Run tests with coverage
python tests/test_runner.py coverage

# Run specific test file
python tests/test_runner.py file tests/core/test_models.py

# Run fast tests only (exclude slow tests)
python tests/test_runner.py fast
```

### Using pytest directly

You can also run tests directly with pytest:

```bash
# Run all tests
pytest tests/

# Run with verbose output
pytest -v tests/

# Run specific test file
pytest tests/core/test_config_loader.py

# Run tests with specific markers
pytest -m "config" tests/
pytest -m "not slow" tests/

# Run with coverage
pytest --cov=lsc_eval --cov-report=html tests/
```

## Test Configuration

### Environment Setup

Tests use fixtures to set up clean environments (sketched below):

- **`clean_environment`**: Clears environment variables before/after tests
- **`temp_dir`**: Provides temporary directory for test files
- **`sample_system_config`**: Provides sample system configuration
- **`sample_evaluation_data`**: Provides sample evaluation data
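
A minimal sketch of how such fixtures could be written in `conftest.py`; the fixture bodies, environment-variable names, and configuration keys are illustrative assumptions, not the actual fixtures from this commit:

```python
# Illustrative conftest.py sketch; real fixture bodies may differ.
import shutil
import tempfile

import pytest


@pytest.fixture
def clean_environment(monkeypatch):
    """Remove provider credentials so each test starts from a known state."""
    for var in ("OPENAI_API_KEY", "AZURE_API_KEY", "ANTHROPIC_API_KEY"):
        monkeypatch.delenv(var, raising=False)
    yield  # monkeypatch restores the original environment afterwards


@pytest.fixture
def temp_dir():
    """Provide a throwaway directory for generated test files."""
    path = tempfile.mkdtemp(prefix="lsc_eval_test_")
    yield path
    shutil.rmtree(path, ignore_errors=True)


@pytest.fixture
def sample_system_config():
    """Return a minimal stand-in for the system.yaml configuration."""
    return {
        "llm": {"provider": "openai", "model": "gpt-4o-mini"},
        "metrics": ["answer_relevancy"],
        "output": {"formats": ["json"]},
    }
```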

### Mock Data

Tests use realistic mock data based on the actual system.yaml configuration:

- **LLM Configuration**: OpenAI, Azure, Anthropic, Gemini, WatsonX, Ollama providers
- **Metrics**: Ragas, DeepEval, and Custom metrics as defined in system.yaml
- **Output Formats**: CSV, JSON, TXT formats with visualization options
- **Evaluation Data**: Multi-turn conversations with contexts and expected responses
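
As an illustration of the last item, a multi-turn evaluation record might look roughly like this; the field names are assumptions made for the sketch, not the framework's actual schema:

```python
# Hypothetical shape of one evaluation record; field names are illustrative.
sample_evaluation_data = {
    "conversation_id": "conv-001",
    "turns": [
        {
            "user_input": "What does the refund policy cover?",
            "response": "Refunds are available within 30 days of purchase.",
            "contexts": ["Refund policy: items may be returned within 30 days."],
            "expected_response": "Purchases can be refunded within 30 days.",
        },
        {
            "user_input": "Does that include digital goods?",
            "response": "Digital goods are excluded from refunds.",
            "contexts": ["Digital products are non-refundable."],
            "expected_response": "No, digital goods are not refundable.",
        },
    ],
}
```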

## Test Coverage

The test suite covers:

### Core Components
- **ConfigLoader**: System configuration loading, environment setup, logging configuration
- **Models**: Pydantic model validation for TurnData, EvaluationData, EvaluationResult
- **DataValidator**: Evaluation data validation, metric requirements checking

### LLM Managers
- **LLMManager**: Provider-specific configuration, environment validation, model name construction

### Metrics
- **CustomMetrics**: LLM-based evaluation, score parsing, prompt generation

### Output Components
- **Utils**: Statistics calculation, result aggregation, evaluation scoping

## Key Test Scenarios

### Configuration Testing
- Valid and invalid system configurations
- Environment variable setup and validation
- Logging configuration with different levels
- Metric mapping and validation

### Model Validation Testing
- Field validation for all Pydantic models
- Edge cases and boundary conditions
- Required field validation
- Data type validation
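
A short sketch of what such a model-validation test can look like; the `TurnData` fields below are invented for the example and may not match the real model:

```python
# Illustrative only: the real TurnData model's fields may differ.
from typing import List

import pytest
from pydantic import BaseModel, Field, ValidationError


class TurnData(BaseModel):
    """Stand-in model for the example."""
    user_input: str
    response: str
    contexts: List[str] = Field(default_factory=list)


@pytest.mark.models
def test_turn_data_accepts_valid_fields():
    turn = TurnData(user_input="Hi", response="Hello", contexts=["greeting"])
    assert turn.contexts == ["greeting"]


@pytest.mark.models
def test_turn_data_rejects_missing_required_field():
    with pytest.raises(ValidationError):
        TurnData(user_input="Hi")  # 'response' is required
```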

### Data Validation Testing
- Evaluation data structure validation
- Metric requirement checking
- Context and expected response validation
- Multi-conversation validation

### LLM Manager Testing
- Provider-specific environment validation
- Model name construction for different providers
- Error handling for missing credentials
- Configuration parsing

### Metrics Testing
- Custom metric evaluation
- LLM response parsing
- Score normalization
- Error handling for failed evaluations
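
To make the parsing and normalization items concrete, here is a small sketch of the kind of check such a test performs; `parse_llm_score` is a hypothetical helper, not a function from this commit:

```python
# Illustrative sketch; parse_llm_score stands in for the framework's real parser.
import re

import pytest


def parse_llm_score(raw_response: str, max_score: float = 5.0) -> float:
    """Extract the first number from an LLM reply and normalize it to 0..1."""
    match = re.search(r"\d+(?:\.\d+)?", raw_response)
    if match is None:
        raise ValueError(f"no score found in: {raw_response!r}")
    return min(float(match.group()) / max_score, 1.0)


@pytest.mark.unit
def test_parse_llm_score_normalizes_to_unit_interval():
    assert parse_llm_score("Score: 4 out of 5") == pytest.approx(0.8)


@pytest.mark.unit
def test_parse_llm_score_raises_on_unparseable_reply():
    with pytest.raises(ValueError):
        parse_llm_score("I cannot rate this response.")
```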

### Output Testing
- Statistics calculation
- Result aggregation by metric and conversation
- Score statistics computation
- Edge cases with empty or error results
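
A small sketch of the aggregation behaviour these tests exercise; the result layout and the `aggregate_by_metric` helper are assumptions for illustration only:

```python
# Illustrative only: aggregate_by_metric stands in for the real output utilities.
from collections import defaultdict
from statistics import mean

import pytest


def aggregate_by_metric(results):
    """Group scores by metric name and compute simple statistics, skipping errors."""
    grouped = defaultdict(list)
    for item in results:
        if item.get("error"):
            continue
        grouped[item["metric"]].append(item["score"])
    return {
        metric: {"mean": mean(scores), "min": min(scores), "max": max(scores)}
        for metric, scores in grouped.items()
    }


@pytest.mark.output
def test_aggregate_by_metric_skips_error_results():
    results = [
        {"metric": "relevancy", "score": 0.8},
        {"metric": "relevancy", "score": 0.6},
        {"metric": "faithfulness", "score": 1.0},
        {"metric": "faithfulness", "error": "LLM call failed"},
    ]
    stats = aggregate_by_metric(results)
    assert stats["relevancy"]["mean"] == pytest.approx(0.7)
    assert stats["faithfulness"] == {"mean": 1.0, "min": 1.0, "max": 1.0}
```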

## Running Tests in CI/CD

For continuous integration, you can:

```bash
# Run fast tests only (exclude slow/LLM tests)
pytest -m "not slow and not llm" tests/

# Run with XML output for CI systems
pytest --junitxml=test-results.xml tests/

# Run with coverage for code quality metrics
pytest --cov=lsc_eval --cov-report=xml --cov-report=term tests/
```

## Adding New Tests

When adding new functionality:

1. Create test files following the naming convention `test_*.py`
2. Use appropriate pytest markers to categorize tests
3. Follow the existing fixture patterns for setup/teardown
4. Include both positive and negative test cases
5. Test edge cases and error conditions
6. Update this README if adding new test categories

## Test Data

Test fixtures provide realistic data based on system.yaml:

- **Metrics**: All metrics defined in system.yaml with proper thresholds
- **Providers**: All LLM providers with required environment variables
- **Output Formats**: All output formats and visualization options
- **Evaluation Scenarios**: Multi-turn conversations with various metric combinations

This ensures tests accurately reflect the actual system configuration and usage patterns.

lsc_eval/tests/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
```python
"""Test package for LSC Evaluation Framework."""
```
