-
Notifications
You must be signed in to change notification settings - Fork 22
feat: support for adding names to eval runs #80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
mairin
wants to merge
7
commits into
lightspeed-core:main
Choose a base branch
from
mairin:feature/run-name-support
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
789be50
Add run_name field to EvaluationData model
mairin b7dc675
Add sanitization utility for run names
mairin 58a1270
Extract default run_name from YAML filename in DataValidator
mairin 37cc365
Add --run-name CLI argument with three-tier priority
mairin 40894f6
Use run_name in output filenames
mairin 8185f56
Add run_name to output artifacts metadata
mairin ade1fe7
Add comprehensive tests for run_name sanitization
mairin File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| """Core utility functions.""" | ||
|
|
||
| from lightspeed_evaluation.core.utils.sanitize import sanitize_run_name | ||
|
|
||
| __all__ = ["sanitize_run_name"] |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,49 @@ | ||
| """Utility functions for sanitizing user input.""" | ||
|
|
||
| import re | ||
|
|
||
| from lightspeed_evaluation.core.constants import MAX_RUN_NAME_LENGTH | ||
|
|
||
|
|
||
| def sanitize_run_name(run_name: str) -> str: | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Consider pathvalidate, the advantage would be that we don't need maintain+test this func. |
||
| """Sanitize run name for safe filesystem usage. | ||
| Replaces filesystem-unsafe characters with underscores, collapses | ||
| multiple spaces/underscores, and enforces max length. | ||
| Args: | ||
| run_name: Raw run name string to sanitize | ||
| Returns: | ||
| Sanitized run name safe for filesystem usage. Returns empty string | ||
| if input is empty or becomes empty after sanitization. | ||
| Examples: | ||
| >>> sanitize_run_name("test/run:123") | ||
| 'test_run_123' | ||
| >>> sanitize_run_name(" multiple spaces ") | ||
| 'multiple_spaces' | ||
| >>> sanitize_run_name("rh124: filesystem basics") | ||
| 'rh124_filesystem_basics' | ||
| """ | ||
| if not run_name: | ||
| return "" | ||
|
|
||
| # Strip leading/trailing whitespace | ||
| sanitized = run_name.strip() | ||
|
|
||
| # Replace invalid filesystem characters with underscores | ||
| # Invalid chars: / \ : * ? " ' ` < > | and control characters (0x00-0x1f) | ||
| sanitized = re.sub(r'[/\\:*?"\'`<>|\x00-\x1f]', "_", sanitized) | ||
|
|
||
| # Replace multiple spaces/underscores with single underscore | ||
| sanitized = re.sub(r"[\s_]+", "_", sanitized) | ||
|
|
||
| # Strip leading/trailing underscores that may have been created | ||
| sanitized = sanitized.strip("_") | ||
|
|
||
| # Enforce max length, strip trailing underscores if truncated | ||
| if len(sanitized) > MAX_RUN_NAME_LENGTH: | ||
| sanitized = sanitized[:MAX_RUN_NAME_LENGTH].rstrip("_") | ||
|
|
||
| return sanitized | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,114 @@ | ||
| """Tests for sanitization utilities.""" | ||
|
|
||
| import pytest | ||
|
|
||
| from lightspeed_evaluation.core.constants import MAX_RUN_NAME_LENGTH | ||
| from lightspeed_evaluation.core.utils import sanitize_run_name | ||
|
|
||
|
|
||
| class TestSanitizeRunName: | ||
| """Test cases for sanitize_run_name function.""" | ||
|
|
||
| def test_basic_alphanumeric(self): | ||
| """Test that basic alphanumeric strings pass through unchanged.""" | ||
| assert sanitize_run_name("test123") == "test123" | ||
| assert sanitize_run_name("rh124_filesystem_basics") == "rh124_filesystem_basics" | ||
|
|
||
| def test_empty_string(self): | ||
| """Test that empty string returns empty string.""" | ||
| assert sanitize_run_name("") == "" | ||
|
|
||
| def test_whitespace_trimming(self): | ||
| """Test that leading/trailing whitespace is removed.""" | ||
| assert sanitize_run_name(" test ") == "test" | ||
| assert sanitize_run_name("\ttest\n") == "test" | ||
|
|
||
| def test_filesystem_unsafe_characters(self): | ||
| """Test that filesystem-unsafe characters are replaced with underscores.""" | ||
| assert sanitize_run_name("test/run") == "test_run" | ||
| assert sanitize_run_name("test\\run") == "test_run" | ||
| assert sanitize_run_name("test:run") == "test_run" | ||
| assert sanitize_run_name("test*run") == "test_run" | ||
| assert sanitize_run_name("test?run") == "test_run" | ||
| assert sanitize_run_name('test"run') == "test_run" | ||
| assert sanitize_run_name("test'run") == "test_run" | ||
| assert sanitize_run_name("test`run") == "test_run" | ||
| assert sanitize_run_name("test<run") == "test_run" | ||
| assert sanitize_run_name("test>run") == "test_run" | ||
| assert sanitize_run_name("test|run") == "test_run" | ||
|
|
||
| def test_multiple_special_characters(self): | ||
| """Test strings with multiple special characters.""" | ||
| assert sanitize_run_name("test/run:123") == "test_run_123" | ||
| assert sanitize_run_name("rh124: filesystem basics") == "rh124_filesystem_basics" | ||
| assert sanitize_run_name("test's `command`") == "test_s_command" | ||
|
|
||
| def test_space_collapsing(self): | ||
| """Test that multiple spaces are collapsed to single underscore.""" | ||
| assert sanitize_run_name("multiple spaces") == "multiple_spaces" | ||
| assert sanitize_run_name("test run") == "test_run" | ||
|
|
||
| def test_underscore_collapsing(self): | ||
| """Test that multiple underscores are collapsed to single underscore.""" | ||
| assert sanitize_run_name("test___run") == "test_run" | ||
| assert sanitize_run_name("test_____run") == "test_run" | ||
|
|
||
| def test_mixed_whitespace_underscore_collapsing(self): | ||
| """Test that mixed spaces and underscores collapse properly.""" | ||
| assert sanitize_run_name("test _ _ run") == "test_run" | ||
| assert sanitize_run_name("test _ run") == "test_run" | ||
|
|
||
| def test_leading_trailing_underscores_stripped(self): | ||
| """Test that leading/trailing underscores created during sanitization are removed.""" | ||
| assert sanitize_run_name("/test/") == "test" | ||
| assert sanitize_run_name(":test:") == "test" | ||
| assert sanitize_run_name("_test_") == "test" | ||
|
|
||
| def test_max_length_enforcement(self): | ||
| """Test that strings exceeding max length are truncated.""" | ||
| long_string = "a" * (MAX_RUN_NAME_LENGTH + 50) | ||
| result = sanitize_run_name(long_string) | ||
| assert len(result) <= MAX_RUN_NAME_LENGTH | ||
| assert result == "a" * MAX_RUN_NAME_LENGTH | ||
|
|
||
| def test_max_length_with_trailing_underscores(self): | ||
| """Test that truncation removes trailing underscores.""" | ||
| # Create a string that when truncated would end with underscore | ||
| long_string = "a" * (MAX_RUN_NAME_LENGTH - 1) + "_" + "b" * 50 | ||
| result = sanitize_run_name(long_string) | ||
| assert len(result) <= MAX_RUN_NAME_LENGTH | ||
| assert not result.endswith("_") | ||
|
|
||
| def test_control_characters(self): | ||
| """Test that control characters are replaced.""" | ||
| assert sanitize_run_name("test\x00run") == "test_run" | ||
| assert sanitize_run_name("test\x1frun") == "test_run" | ||
|
|
||
| def test_unicode_characters_preserved(self): | ||
| """Test that Unicode characters (emojis, kanji, etc.) are preserved.""" | ||
| # Emojis | ||
| assert sanitize_run_name("test🚀run") == "test🚀run" | ||
| assert sanitize_run_name("📊evaluation") == "📊evaluation" | ||
|
|
||
| # Japanese kanji | ||
| assert sanitize_run_name("テスト実行") == "テスト実行" | ||
| assert sanitize_run_name("test_日本語_run") == "test_日本語_run" | ||
|
|
||
| # Chinese characters | ||
| assert sanitize_run_name("测试运行") == "测试运行" | ||
|
|
||
| # Mix of Unicode and ASCII | ||
| assert sanitize_run_name("test_🎯_goal") == "test_🎯_goal" | ||
|
|
||
| def test_unicode_with_unsafe_characters(self): | ||
| """Test Unicode strings with filesystem-unsafe characters.""" | ||
| assert sanitize_run_name("テスト/実行") == "テスト_実行" | ||
| assert sanitize_run_name("test🚀:run") == "test🚀_run" | ||
| assert sanitize_run_name("評価 💯 test") == "評価_💯_test" | ||
|
|
||
| def test_real_world_yaml_filenames(self): | ||
| """Test realistic YAML filename scenarios.""" | ||
| assert sanitize_run_name("rh124_lesson_01") == "rh124_lesson_01" | ||
| assert sanitize_run_name("filesystem-basics") == "filesystem-basics" | ||
| assert sanitize_run_name("Module 1: Introduction") == "Module_1_Introduction" | ||
| assert sanitize_run_name("test (copy)") == "test_(copy)" # Parentheses are valid |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sanitize YAML-provided run_name values.
The current logic sanitizes the CLI override and filename fallback but doesn't sanitize run_name values provided in the YAML file. This could allow filesystem-unsafe characters to pass through validation.
Apply this diff to ensure all run_name values are sanitized:
# Set run_name with priority: CLI override > YAML value > filename if run_name_override is not None: # CLI override takes highest priority data_dict["run_name"] = sanitize_run_name(run_name_override) elif "run_name" not in data_dict or data_dict["run_name"] is None: # Default to YAML filename if not provided yaml_filename = Path(data_path).stem data_dict["run_name"] = sanitize_run_name(yaml_filename) + else: + # Sanitize YAML-provided run_name + data_dict["run_name"] = sanitize_run_name(data_dict["run_name"])📝 Committable suggestion
🤖 Prompt for AI Agents