-
Notifications
You must be signed in to change notification settings - Fork 0
Feat/rapid 5915 multicompare #379
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors the order creation workflow to improve datapoint validation and introduces multi-ranking functionality. The main changes include renaming the criteria parameter to instruction in RankingWorkflow for consistency, extracting datapoint validation logic into a reusable DatapointsValidator class, and adding support for multi-ranking orders.
- Renamed
criteriatoinstructioninRankingWorkflowfor consistency with other workflows - Extracted datapoint validation and mapping logic into
DatapointsValidatorclass - Added
MultiRankingWorkflowandcreate_multi_ranking_ordermethod for grouped ranking support - Added
groupfield toDatapointmodel to support grouping datapoints
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
_ranking_workflow.py |
Renamed parameter from criteria to instruction for consistency |
_multi_ranking_workflow.py |
New workflow class for multi-ranking with grouped datapoints |
_base_workflow.py |
Extended return type union to include GroupedRankingWorkflowModel |
__init__.py |
Exported new MultiRankingWorkflow class |
rapidata_order_manager.py |
Refactored order creation to use DatapointsValidator, added create_multi_ranking_order method |
_rapidata_order_builder.py |
Changed __openapi_service from private to protected (_openapi_service) |
_datapoints_validator.py |
New utility class for validating and mapping datapoints |
_datapoint_uploader.py |
Added group field when uploading datapoints |
_datapoint.py |
Added optional group field to Datapoint model |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| from rapidata.api_client.models.grouped_ranking_workflow_model import ( | ||
| GroupedRankingWorkflowModel, | ||
| ) |
Copilot
AI
Nov 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The import of GroupedRankingWorkflowModel is duplicated on lines 6-8 and lines 17-19. Remove the duplicate import.
| group=group, | ||
| ) | ||
| for asset, context, media_context, sentence, private_note, group in cast( | ||
| "Iterable[tuple[str | list[str], str | None, str | None, str | None, str | None, str | None]]", # because itterator only supports 5 arguments with specific type casting |
Copilot
AI
Nov 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Corrected spelling of 'itterator' to 'iterator'.
| "Iterable[tuple[str | list[str], str | None, str | None, str | None, str | None, str | None]]", # because itterator only supports 5 arguments with specific type casting | |
| "Iterable[tuple[str | list[str], str | None, str | None, str | None, str | None, str | None]]", # because iterator only supports 5 arguments with specific type casting |
| @@ -0,0 +1,66 @@ | |||
| from itertools import zip_longest | |||
| from typing import Literal, cast, Iterable | |||
Copilot
AI
Nov 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import of 'Iterable' is not used.
| from typing import Literal, cast, Iterable | |
| from typing import Literal, cast |
| @@ -1,4 +1,4 @@ | |||
| from typing import Sequence, Optional, Literal | |||
| from typing import Sequence, Optional, Literal, cast, Iterable | |||
Copilot
AI
Nov 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import of 'Iterable' is not used.
Import of 'cast' is not used.
| from typing import Sequence, Optional, Literal, cast, Iterable | |
| from typing import Sequence, Optional, Literal |
|
Claude finished @LinoGiger's task —— View job Code Review CompleteI've reviewed the PR and found several issues that should be addressed. The refactoring introduces multi-ranking functionality and improves the codebase structure, but there are some important issues to fix. Issues Found1. Duplicate Import in
|
Code Review for PR #379: Feat/rapid 5915 multicompareSummaryThis PR adds a new MultiRankingWorkflow feature for grouped ranking operations, along with a refactored DatapointsValidator utility class. The changes introduce support for the Issues Found1. Critical Bug in
|
PR Review: Feat/rapid 5915 multicompareSummaryThis PR introduces multi-ranking workflow functionality and refactors datapoint validation logic. The changes add support for grouped ranking comparisons and extract common validation patterns into a reusable validator class. Code Quality & Best PracticesStrengths
Issues & SuggestionsCritical Issues1. Missing Type Hint ( random_comparisons_ratio, # Missing type hintShould be: 2. Incomplete Comment ( 3. Mutable Default Argument ( metadatas: list[Metadata] = [],This is a classic Python pitfall. Mutable default arguments are shared across all function calls. Should be: metadatas: list[Metadata] | None = None,
# Then in __init__:
self.metadatas = metadatas or []Code Quality Issues4. Overly Complex Type Cast ( return [
Datapoint(
asset=asset,
data_type=data_type,
context=context,
media_context=media_context,
sentence=sentence,
private_note=private_note,
group=group,
)
for asset, context, media_context, sentence, private_note, group in zip_longest(
datapoints,
contexts or [],
media_contexts or [],
sentences or [],
private_notes or [],
groups or [],
)
]The cast appears unnecessary and makes the code harder to understand. 5. Parameter Name Inconsistency ( Potential Bugs1. Group Assignment Logic ( for i, datapoint in enumerate(datapoints):
for d in datapoint:
datapoints_instances.append(
Datapoint(
asset=d,
data_type=data_type,
context=contexts[i] if contexts else None,
media_context=media_contexts[i] if media_contexts else None,
group=str(i),
)
)This code doesn't validate that 2. No Validation of Empty Groups 3. Private Note Validation Context ( Performance Considerations1. Redundant List Comprehensions datapoints_instances = [
Datapoint(
asset=d,
data_type=data_type,
context=contexts[i] if contexts else None,
media_context=media_contexts[i] if media_contexts else None,
group=str(i),
)
for i, datapoint_list in enumerate(datapoints)
for d in datapoint_list
]2. Memory Efficiency Security ConcernsNo major security issues identified. The changes are primarily internal refactoring and feature additions that:
Test CoverageMajor Concern: This PR adds 301 lines and removes 118 lines (net +183 lines) including:
However, no test files were modified or added. This is a significant gap. Recommended Tests:
Additional Observations
Recommendations PriorityMust Fix Before Merge:
Should Fix: Nice to Have: Overall, this is a solid refactoring that improves code organization and adds useful functionality. The main concerns are the lack of test coverage and a few code quality issues that should be addressed before merging. |
| from ._evaluation_workflow import EvaluationWorkflow | ||
| from ._timestamp_workflow import TimestampWorkflow | ||
| from ._ranking_workflow import RankingWorkflow | ||
| from ._multi_ranking_workflow import MultiRankingWorkflow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should try and consolidate the naming, with luca we once decided on GroupedRanking instead of Multi
Pull Request Review - PR #379: Feat/rapid 5915 multicompareSummaryThis PR introduces a new multi-ranking workflow feature that allows ranking multiple independent lists of datapoints. The changes refactor the existing ranking order API to support grouped rankings and introduce a new Code Quality and Best PracticesStrengths
Areas for Improvement
Potential Bugs and IssuesCritical Issues
Minor Issues
Performance Considerations
Security Concerns
Test CoverageMajor Concern: No Tests for New FeatureThe PR adds a significant feature but includes no tests for:
Strongly recommend adding:
Additional Recommendations
ConclusionWell-structured feature that follows existing architecture, but has critical gaps:
Recommendation: Request changes - add tests and validation before merging. Review generated with Claude Code |
PR Review - Multi-Ranking Feature (RAPID-5915)SummaryThis PR introduces a new multi-ranking feature that allows multiple independent ranking sets to be processed in a single order. The implementation changes the Code Quality and Best PracticesStrengths:
Issues:
Potential Bugs and Issues
Performance Considerations
Security Concerns
Test CoverageCritical Gap:
Recommendations:
Additional Notes
SummaryThis is a solid feature implementation with good code structure. However, there are some concerns: Must Address:
Should Address:
Overall Assessment: The code quality is good, but the lack of tests and breaking changes are concerning. I recommend adding comprehensive tests before merging and considering the impact on existing users. |
No description provided.