Redesign. Dataloaders instead of datasets. Adapters instead of postprocessing and implicit behavior. Containers and our own validation instead of pydantic. Refactor runner and config. #142
Conversation
Looks good! Maybe it would be good to generate some tests for this too?
Absolutely, this was just a quick implementation; I haven't even run it yet. I wanted to see what you think about this idea.
…ntiation of config, add enums, removed pydantic. Comment out postprocessing in system until it is refactored.
I realized that it was not a good idea: it adds complexity to explain to the user. I resorted to simply defining dataloaders (see the cifar10 YAML). Refactoring Lighter as a whole in this PR - WIP.
…dapters now. Changed "id" to "identifier". Turned Metrics into an nn.Module container to remove the need for PatchedModuleDict
…container issue of 'train' being reserved in nn.Module by switching to a dataclass and manual registration.
…. Add back the support for multiple config files.
…lity, return all data from the _step no matter what mode. Add loss, metric, step, and epoch to Data enum. Switch str, Enum to StrEnum. Tuner.scale_batch_size() does not work since we switched to dataloaders in config - it needs read/write access to the batch size, figuring out the fix.
…all mode specific hooks in a `_setup_mode_hooks()`.
Actionable comments posted: 1
🧹 Nitpick comments (22)
lighter/engine/config.py (2)
16-49
: Consider verifying overrides for nonexistent keys, and add more thorough docstrings.
The constructor appropriately loads and parses configurations, then applies overrides. However, it might be beneficial to log or handle cases where overridden keys are not present in the underlying config structure to avoid user confusion. Additional docstrings or debug logs could clarify which keys are being overridden.
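A minimal sketch of such a check (hypothetical helper; the real override mechanism may differ):

```python
import logging

logger = logging.getLogger(__name__)

def apply_overrides(config: dict, overrides: dict) -> None:
    # Warn when an override targets a key absent from the base config,
    # so typos surface instead of silently creating new entries.
    for key, value in overrides.items():
        if key not in config:
            logger.warning("Override key '%s' not present in base config; adding it.", key)
        config[key] = value
```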
62-86
: Recursively formatting errors is helpful but could be expensive for large schemas.
The approach is fine for moderate configurations. If your config or schema grows, consider short-circuiting after a threshold or summarizing repeated patterns to avoid overly verbose error messages.
lighter/utils/types/containers.py (2)
33-48
: `Metrics` dataclass ensures consistent usage of `MetricCollection`.
Be mindful of edge cases if the user passes a single metric that is incompatible with `MetricCollection`. Consider adding a small safety check or helpful error message for that scenario.
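One possible shape for that guard (a sketch, assuming torchmetrics):

```python
from torchmetrics import Metric, MetricCollection

def coerce_to_collection(metrics) -> MetricCollection | None:
    # Accept None, a single Metric, a list/tuple of Metrics, or an
    # existing MetricCollection; reject anything else with a clear error.
    if metrics is None or isinstance(metrics, MetricCollection):
        return metrics
    if isinstance(metrics, Metric):
        return MetricCollection([metrics])
    if isinstance(metrics, (list, tuple)) and all(isinstance(m, Metric) for m in metrics):
        return MetricCollection(list(metrics))
    raise ValueError(f"Expected torchmetrics Metric(s), got {type(metrics)!r}")
```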
58-94
: Train/Val/Test/Predict dataclasses cleanly organize mode-specific adapters.
This division is intuitive and simple for maintainers. As usage grows, keep an eye on potential duplication across modes; common logic might be extracted if it grows large.
lighter/engine/runner.py (2)
19-25
: `Runner` constructor is minimal but functional.
Deferring initialization until `run` is called seems logical. Consider logging or partial validation in `__init__` if you expect future expansions, especially around concurrency or advanced scenario handling.
48-72
: `_setup_stage` organizes stage-specific config and system/trainer initialization.
Extracting the project module path, system, and trainer from the config is a nice separation of concerns. Keep an eye on error handling for nonexistent or invalid `project` paths.
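For instance, a fail-fast check along these lines (a sketch; `import_project_module` is a hypothetical name):

```python
import importlib.util
from pathlib import Path

def import_project_module(project: str):
    # Validate the configured path before importing so a typo in the
    # config yields a clear error instead of an obscure import failure.
    init_file = Path(project) / "__init__.py"
    if not init_file.is_file():
        raise FileNotFoundError(f"Project package not found: {init_file}")
    spec = importlib.util.spec_from_file_location("project", init_file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```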
lighter/system.py (5)
2-5
: Update docstring references to match new dataloader approach
The docstring still mentions "datasets" even though the code now uses `dataloaders`. Let's maintain consistency by updating the description accordingly.
```diff
-This module defines the System class, which encapsulates the components of a deep learning system,
-including the model, optimizer, datasets, and more. It extends PyTorch Lightning's LightningModule.
+This module defines the System class, which encapsulates the components of a deep learning system,
+including the model, optimizer, dataloaders, and more. It extends PyTorch Lightning's LightningModule.
```
38-39
: Add documentation for the `adapters` parameter
There is a TODO for `adapters`, but no explanation is provided in the docstring. Consider adding a brief description for clarity on what adapters are and how they're used.
```diff
 ...
 adapters: dict[str, Callable] | None = None,
 ...
+"""
+adapters: A dictionary of adapters that define how batches, losses, and metrics
+are prepared for different modes (train, val, test, predict). Each entry corresponds
+to a mode and implements custom logic tailored for that mode.
+"""
```
172-184
: Handle missing batch size in `_log` method
The logic assumes that `self.dataloaders.<mode>.batch_size` is always defined. If a dataloader is custom without a defined `batch_size`, this may break. Consider providing a fallback.
```diff
 def _log(self, name: str, value: Any, on_step: bool = False, on_epoch: bool = False) -> None:
     batch_size = getattr(self.dataloaders, self.mode).batch_size
-    self.log(name, value, logger=True, batch_size=batch_size, on_step=on_step, on_epoch=on_epoch, sync_dist=on_epoch)
+    effective_batch_size = batch_size if batch_size is not None else 1
+    self.log(name, value, logger=True, batch_size=effective_batch_size, on_step=on_step, on_epoch=on_epoch, sync_dist=on_epoch)
```
227-248
: Consider standardizing or streamlining the dynamic hook assignments
Your `_setup_mode_hooks` approach dynamically binds `training_step`, `validation_step`, etc. While flexible, it can be less discoverable than a conventional override approach. Evaluate whether the dynamic approach aligns best with your maintainability goals or if a more explicit pattern suits your codebase better.
Line range hint 256-268
: Support multiple optimizer parameter groups for learning rate
Currently, setting or getting `learning_rate` raises an error if multiple parameter groups exist. For advanced users, you might consider returning or accepting a dictionary of learning rates instead, though this may complicate usage.
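A sketch of what the multi-group variant could look like (hypothetical helper):

```python
from torch.optim import Optimizer

def get_learning_rates(optimizer: Optimizer) -> dict[int, float]:
    # One learning rate per parameter group, instead of raising
    # when more than one group exists.
    return {i: group["lr"] for i, group in enumerate(optimizer.param_groups)}
```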
lighter/__init__.py (1)
12-16
: Consider adding exports to `__all__` or removing if unused
Static analysis tags these imports as unused. If they're intentionally exposed for convenience, add them to `__all__`; otherwise, remove them.
```diff
 __version__ = "0.0.3a11"
 __all__ = [
+    "Trainer",
+    "Config",
+    "Runner",
+    "System",
 ]
```
🧰 Tools
🪛 Ruff (0.8.2)
12-12: `pytorch_lightning.Trainer` imported but unused (F401)
14-14: `.engine.config.Config` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
15-15: `.engine.runner.Runner` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
16-16: `.system.System` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
lighter/utils/types/enums.py (1)
4-11
: Consider simplifying the docstring
While `StrEnum` is self-explanatory, it would be helpful to include a concise example of usage in the docstring to guide new contributors on how to define or use these enumerations.
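For example, the docstring could carry a snippet like this (a sketch of the `str, Enum` pattern, not the module's actual code):

```python
from enum import Enum

class StrEnum(str, Enum):
    """Enum whose members compare equal to their string values."""

class Mode(StrEnum):
    TRAIN = "train"
    VAL = "val"

assert Mode.TRAIN == "train"  # members behave like plain strings
```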
tests/unit/test_callbacks_utils.py (1)
Line range hint 5-22
: Commendable single-image reshape test
This test effectively validates reshaping of a single 3D image. As a good practice, consider adding checks for edge cases, such as an empty tensor or unexpected dimensions, to harden your test coverage.
🧰 Tools
🪛 Ruff (0.8.2)
1-1: `pytest` imported but unused. Remove unused import: `pytest` (F401)
lighter/engine/resolver.py (3)
5-8
: Docstring clarity
The class docstring concisely outlines the purpose of `Resolver`. Consider explicitly mentioning how this resolver interacts with adapters or dataloaders for a more comprehensive overview.
10-17
: Stage-to-mode mapping
This mapping design is straightforward. If you add or rename stages or modes, check for possible mismatches to prevent silent errors. Implementing a quick validation test or assertion in the constructor might reduce debugging time.
22-25
: ValueError check
Raising a `ValueError` for an invalid stage is good. Consider customizing the error message further by appending the actual input stage for clearer debugging.
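For example (a sketch with a stand-in `Stage` enum):

```python
from enum import Enum

class Stage(str, Enum):  # stand-in for lighter's Stage enum
    FIT = "fit"
    VALIDATE = "validate"
    TEST = "test"
    PREDICT = "predict"

def resolve(stage: str) -> Stage:
    try:
        return Stage(stage)
    except ValueError:
        valid = [s.value for s in Stage]
        raise ValueError(f"Invalid stage '{stage}'. Expected one of: {valid}") from None
```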
tests/unit/test_callbacks_writer_table.py (1)
9-10
: Remove unused import.
Static analysis flags `System` as unused. Consider removing it to clean up imports:
```diff
- from lighter.system import System
```
🧰 Tools
🪛 Ruff (0.8.2)
10-10: `lighter.system.System` imported but unused. Remove unused import: `lighter.system.System` (F401)
lighter/engine/schema.py (2)
1-3
: Introductory fields `_meta_` and `_requires_`.
Moving away from Pydantic to a dict scheme might reduce overhead, but be mindful of thorough validation logic in subsequent code.
17-17
: `trainer` schema.
A generic dict type is fine, but consider more validation if needed (e.g., a max-epochs type check).
projects/cifar10/experiments/example.yaml (1)
3-15
: Consider enabling GPU acceleration.
The trainer is configured to use CPU (`accelerator: cpu`). For better performance, consider enabling GPU acceleration by uncommenting the `devices` and `strategy` configurations.
lighter/callbacks/__init__.py (1)
1-3
: Add `__all__` declaration for exported symbols.
The static analysis indicates that these imports are unused. Since these are likely meant to be exported symbols, they should be declared in `__all__`. Add this at the beginning of the file:
```diff
+__all__ = ["Freezer", "FileWriter", "TableWriter"]
+
 from .freezer import Freezer
 from .writer.file import FileWriter
 from .writer.table import TableWriter
```
🧰 Tools
🪛 Ruff (0.8.2)
1-1: `.freezer.Freezer` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
2-2: `.writer.file.FileWriter` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
3-3: `.writer.table.TableWriter` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (43)
- .gitignore (1 hunks)
- README.md (1 hunks)
- docs/advanced/callbacks.md (1 hunks)
- docs/basics/config.md (10 hunks)
- docs/basics/projects.md (2 hunks)
- docs/basics/quickstart.md (4 hunks)
- docs/index.md (2 hunks)
- lighter/__init__.py (1 hunks)
- lighter/adapters.py (1 hunks)
- lighter/callbacks/__init__.py (1 hunks)
- lighter/callbacks/freezer.py (4 hunks)
- lighter/callbacks/utils.py (0 hunks)
- lighter/callbacks/writer/base.py (4 hunks)
- lighter/callbacks/writer/file.py (3 hunks)
- lighter/callbacks/writer/table.py (4 hunks)
- lighter/engine/config.py (1 hunks)
- lighter/engine/resolver.py (1 hunks)
- lighter/engine/runner.py (1 hunks)
- lighter/engine/schema.py (1 hunks)
- lighter/system.py (2 hunks)
- lighter/utils/collate.py (0 hunks)
- lighter/utils/dynamic_imports.py (1 hunks)
- lighter/utils/logging.py (2 hunks)
- lighter/utils/misc.py (1 hunks)
- lighter/utils/model.py (1 hunks)
- lighter/utils/patches.py (0 hunks)
- lighter/utils/types/containers.py (1 hunks)
- lighter/utils/types/enums.py (1 hunks)
- projects/cifar10/experiments/example.yaml (1 hunks)
- projects/cifar10/experiments/example_overrides.yaml (1 hunks)
- projects/cifar10/experiments/monai_bundle_prototype.yaml (0 hunks)
- pyproject.toml (1 hunks)
- tests/integration/test_cifar.py (1 hunks)
- tests/unit/test_callbacks_freezer.py (8 hunks)
- tests/unit/test_callbacks_utils.py (1 hunks)
- tests/unit/test_callbacks_writer_base.py (5 hunks)
- tests/unit/test_callbacks_writer_file.py (13 hunks)
- tests/unit/test_callbacks_writer_table.py (6 hunks)
- tests/unit/test_engine_runner.py (9 hunks)
- tests/unit/test_engine_schema.py (0 hunks)
- tests/unit/test_system.py (13 hunks)
- tests/unit/test_utils_data.py (1 hunks)
- tests/unit/test_utils_misc.py (1 hunks)
💤 Files with no reviewable changes (5)
- tests/unit/test_engine_schema.py
- lighter/utils/collate.py
- lighter/callbacks/utils.py
- lighter/utils/patches.py
- projects/cifar10/experiments/monai_bundle_prototype.yaml
✅ Files skipped from review due to trivial changes (5)
- docs/advanced/callbacks.md
- lighter/utils/dynamic_imports.py
- lighter/utils/model.py
- tests/unit/test_utils_data.py
- tests/unit/test_callbacks_freezer.py
🧰 Additional context used
🪛 Ruff (0.8.2)
lighter/__init__.py
12-12: `pytorch_lightning.Trainer` imported but unused (F401)
14-14: `.engine.config.Config` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
15-15: `.engine.runner.Runner` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
16-16: `.system.System` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
tests/unit/test_callbacks_writer_table.py
10-10: `lighter.system.System` imported but unused. Remove unused import: `lighter.system.System` (F401)
lighter/adapters.py
240-243: Combine `if` branches using logical `or` operator (SIM114)
lighter/callbacks/__init__.py
1-1: `.freezer.Freezer` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
2-2: `.writer.file.FileWriter` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
3-3: `.writer.table.TableWriter` imported but unused; consider removing, adding to `__all__`, or using a redundant alias (F401)
🪛 Markdownlint (0.37.0)
README.md
46-46: Unordered list indentation. Expected: 0; Actual: 1 (MD007, ul-indent)
docs/basics/config.md
84-84: Code block style. Expected: fenced; Actual: indented (MD046, code-block-style)
🔇 Additional comments (121)
lighter/utils/misc.py (2)
50-55
: Renaming `_callable` to `fn` for improved clarity.
No issues spotted. This rename makes the parameter name more descriptive.
61-61
: Retaining the same internal logic.
Even though the parameter name changed, the function's logic and returned result remain untouched.
tests/unit/test_utils_misc.py (1)
7-7
: Removing `apply_fns` from the import impacts dependent code.
With the function gone from `lighter.utils.misc`, ensure that no other code references `apply_fns`. As a follow-up, here is a quick script to check for references across the entire codebase:
✅ Verification successful
No remaining references to `apply_fns` found in the codebase
The removal of `apply_fns` from `lighter.utils.misc` is safe, as there are no remaining references to this function in the codebase.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Searching all references to `apply_fns` in the codebase.
rg -A 3 'apply_fns'
```
Length of output: 19
Script:
```bash
#!/bin/bash
# Search for any remaining references to apply_fns through various patterns
echo "=== Searching direct imports ==="
rg "from lighter.utils.misc import.*apply_fns"
echo -e "\n=== Searching module-level access ==="
rg "misc\.apply_fns"
echo -e "\n=== Searching in all Python files for any mention of apply_fns ==="
fd -e py -x grep -l "apply_fns" {} \;
echo -e "\n=== Checking AST for function calls ==="
ast-grep --pattern 'apply_fns($_)'
```
Length of output: 559
lighter/engine/config.py (3)
3-5
: Imports look appropriate for configuration parsing and validation.
No immediate issues found with these imports. Integration with `cerberus` and `ConfigParser` is clear and well-contained.
9-14
: Custom exception class is well-defined and meaningful.
Raising a `ConfigurationException` when validation fails is an effective approach to surface configuration errors in a user-friendly manner.
51-60
: Clarify usage between `get` and `get_parsed_content`.
The distinction between raw vs. parsed content is sensible, but it could be confusing for maintainers or new contributors. Consider adding docstring examples or references to possible use cases.
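Since the class builds on MONAI's `ConfigParser`, a short docstring example could make the distinction concrete (illustrative only):

```python
from monai.bundle import ConfigParser

parser = ConfigParser({"model": {"_target_": "torch.nn.Linear", "in_features": 4, "out_features": 2}})
raw = parser.get("model")                   # plain dict, "_target_" left unresolved
model = parser.get_parsed_content("model")  # instantiated torch.nn.Linear
```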
lighter/utils/types/containers.py (3)
10-31
: `nested` decorator is a clever way to instantiate nested dataclasses automatically.
This preserves readability in user-facing code and keeps your nested structures consistent. No issues noted.
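The idea, roughly (a sketch of the pattern, not the actual implementation; note the decorator must sit under `@dataclass` so the generated `__init__` picks up the hook):

```python
from dataclasses import dataclass, field, fields, is_dataclass

def nested(cls):
    # Add a __post_init__ that turns dict values of dataclass-typed
    # fields into instances of those dataclasses.
    def __post_init__(self):
        for f in fields(self):
            value = getattr(self, f.name)
            if is_dataclass(f.type) and isinstance(value, dict):
                setattr(self, f.name, f.type(**value))
    cls.__post_init__ = __post_init__
    return cls

@dataclass
class Train:
    batch: dict = field(default_factory=dict)

@dataclass
@nested
class Adapters:
    train: Train = field(default_factory=Train)

adapters = Adapters(train={"batch": {"input": 0}})
assert isinstance(adapters.train, Train)  # dict auto-instantiated
```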
50-57
: `DataLoaders` placeholders are clear, but test coverage is recommended.
Given their central role, it's crucial to confirm that these placeholders work as expected for different or atypical loading processes.
95-103
: `Adapters` class effectively unifies all modes under a nested configuration.
The composition approach is straightforward. No major concerns here; keep maintaining type hints and thorough testing to ensure correctness.
lighter/engine/runner.py (3)
26-39
: `run` method handles seeding, config parsing, and stage resolution in a linear flow.
This design is clear. Double-check that calling `run` multiple times re-initializes properly, especially regarding seeds or global states, in case you ever need repeated runs in the same process.
40-47
: `_run_stage` elegantly delegates to `Tuner` or `Trainer`.
The approach is straightforward. Just ensure robust error handling if stage names deviate from the known list.
81-111
: `cli` function offers a convenient entry point for multiple commands.
Mapping `fire.Fire` subcommands to internal methods is clean. No further comments; just ensure your docstrings align with each subcommand for clarity.
lighter/adapters.py (3)
6-42
: Well-structured transforms application
The `_TransformsAdapter` class cleanly handles multiple functions and applies them in sequence. No concerns regarding functionality or clarity.
153-157
: Proper validation of transform usage
Good use of `ValueError` to prevent misconfiguration (e.g., a transform supplied but no corresponding argument). This is an appropriate safeguard, ensuring consistent usage.
240-243
: Static analysis recommendation is likely not beneficial
A tool suggests combining these branches with `or`, but each branch handles a distinct type of accessor (index vs. key). Merging them could reduce clarity.
🧰 Tools
🪛 Ruff (0.8.2)
240-243: Combine `if` branches using logical `or` operator (SIM114)
tests/integration/test_cifar.py (1)
16-16
: Confirm the new config file path is accurate
It appears the configuration file reference has changed to `"./projects/cifar10/experiments/example.yaml"` for all three method calls (`fit`, `test`, and `predict`). Please ensure that:
- The new file path exists and is spelled correctly.
- The file contains all necessary configuration sections for each operation.
- The tests accurately reflect the project’s current settings.
Also applies to: 21-21, 26-26
lighter/utils/types/enums.py (4)
1-2
: Consider backward compatibility or third-party usage
The direct reference to Python's built-in `Enum` is fine. If your project is used by third-party code that relies on a different approach to enumerations, consider documenting possible constraints or usage patterns.
14-23
: Validate data labeling usage
Enums like `IDENTIFIER`, `INPUT`, `TARGET`, etc., clarify key data pipeline elements. Ensure these string constants are used consistently across the codebase (e.g., in custom adapters, dataloaders, or writer classes) to avoid "magic strings."
25-32
: Stages enumeration seems comprehensive
The enumeration for possible pipeline stages is well-structured. Double-check usage in your new resolver or adapters to ensure that changing or omitting any custom stage won't break downstream logic.
34-38
: Confirm Mode alignment with Stage
`Mode` enumerations and their usage appear consistent with the `Stage` definitions. Verify that any future expansions (for instance, an auxiliary mode for additional transformations) remain properly mapped in the resolver.
tests/unit/test_callbacks_utils.py (2)
4-4
: Ensure consistent imports across the codebase
The new import for `preprocess_image` looks good. Confirm that other modules referencing `preprocess_image` have also been updated to reflect its location in `lighter.callbacks.utils`.
Line range hint
23-42
: Comprehensive batch-3D reshape test
Testing a batch of 3D images is a great way to ensure correctness at scale. If the code allows for varied batch sizes, consider parameterizing the test to ensure consistent reshaping across different batch dimensions.
🧰 Tools
🪛 Ruff (0.8.2)
1-1: `pytest` imported but unused. Remove unused import: `pytest` (F401)
lighter/engine/resolver.py (2)
1-2
: Imports are aligned with the new architecture
Using `Config` from `lighter.engine.config` and the new enums from `lighter.utils.types.enums` reflects the revised structure. No issues found.
19-21
: Initializer
The constructor neatly stores the `Config`. You might add optional type checking or ensure that `config` is not `None` to avoid runtime errors in the subsequent method calls.
lighter/callbacks/writer/table.py (6)
2-2
: Docstring addition looks good.
This top-level docstring concisely describes the module’s purpose and is consistent with the new naming conventions.
14-15
: Imports align with the refactoring from LighterSystem to System.
These imports are consistent with the new structure and class names in the refactor.
18-18
: Renaming class to `TableWriter` is consistent with the new `BaseWriter` usage.
The inheritance from `BaseWriter` aligns well with the overall refactoring strategy.
37-45
: Param rename from `id` to `identifier` improves clarity.
Renaming the parameter to `identifier` and storing it in `csv_records` clarifies the record's purpose.
47-53
: Updated reference to `System` is consistent with class renaming.
Switching the `pl_module` parameter type from `LighterSystem` to `System` maintains consistency across the refactor.
66-69
: Sorting and setting index by `identifier` might cause collisions if duplicates exist.
While the logic appears correct for unique identifiers, having multiple rows with the same identifier will overwrite or merge indices in the DataFrame. Consider verifying uniqueness if that's a requirement.
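A quick guard before indexing could look like this (a sketch, not the writer's actual code):

```python
import pandas as pd

def to_indexed_frame(records: list[dict]) -> pd.DataFrame:
    # Reject duplicate identifiers before they silently collide as an index.
    df = pd.DataFrame(records)
    duplicates = df["identifier"][df["identifier"].duplicated()].unique()
    if len(duplicates) > 0:
        raise ValueError(f"Duplicate identifiers found: {list(duplicates)[:5]}")
    return df.sort_values("identifier").set_index("identifier")
```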
lighter/utils/logging.py (2)
1-2
: Top-level docstring provides clear overview of the logging module.
This keeps documentation maintainable and quickly orients new contributors to the logging utilities.
80-81
: Switch to `importlib.import_module` aids clarity and maintainability.
This is a preferable replacement for direct `__import__` calls and keeps the code more consistent.
tests/unit/test_callbacks_writer_base.py (6)
8-8
: Direct import of `BaseWriter` aligns with the new class naming convention.
This import change is consistent with the project-wide refactoring.
22-24
: `MockWriter` derivation from `BaseWriter` is a clear test approach.
Adhering to `BaseWriter` ensures tests remain aligned with the abstract writer interface.
40-46
: Parameter rename to `identifier` in the `write` method matches production code changes.
Maintaining consistent naming in test mocks reduces confusion and prevents mismatches during integration.
69-69
: Testing instantiation constraints of `BaseWriter`.
Raising `TypeError` on direct `BaseWriter` instantiation is the correct behavior for an abstract base class.
95-95
: Initializing `outputs` with `identifier` key clarifies usage expectations.
This ensures test alignment with the refactored writer logic.
101-101
: Verifying `identifier` assignment in `on_predict_batch_end`.
Asserting the predicted IDs remain tracked is a correct approach to ensure the callback logic.
lighter/callbacks/writer/file.py (6)
2-2
: Docstring clearly states `FileWriter`'s purpose.
This aids discoverability and understanding of its file-based writing approach.
16-16
: Importing `BaseWriter` aligns with the refactored writer hierarchy.
20-20
: `FileWriter` consistently inherits from `BaseWriter`.
Renaming from `LighterFileWriter` to `FileWriter` follows the new naming scheme.
51-57
: Renaming `id` to `identifier` ensures param clarity.
It better communicates the purpose for file naming.
60-60
: RuntimeError emphasizes `FileWriter` directory requirement.
This explicit error message helps users diagnose invalid paths quickly.
63-63
: Constructing the file path from `identifier` is a logical approach.
Appending the parameter to the directory path helps produce a clear naming structure for outputs.
lighter/callbacks/writer/base.py (7)
17-18
: Consider verifying usage vs. overhead of specialized imports.
These imports look correct and are actively used throughout this file. However, reconfirm that both `System` and `Data, Stage` provide genuine benefit here, and avoid extraneous imports if usage evolves.
21-21
: Good renaming for clarity.
Renaming the class to `BaseWriter` aligns well with its purpose and the rename conventions seen across the codebase.
Line range hint 57-67
: Method signature alignment looks good.
Switching from `id` to `identifier` is more descriptive and consistent with the docstring. The parameter's usage for naming or record-keeping is clearly documented.
70-79
: Practical stage check.
The early return for `stage != Stage.PREDICT` is sensible to bypass unnecessary setup steps in other phases. The docstring references `pl_module: System` correctly, matching the new naming.
106-113
: Consistent callback method signature.
Using `System` in place of the older `LighterSystem` is coherent with the rename efforts, aligning the type hints with the actual code usage.
120-129
: Global ID generation logic seems correct.
The approach ensures a unique global range in multi-process scenarios. Consider verifying that `outputs[Data.PRED]` always has the expected size to avoid potential edge cases with an empty batch.
132-133
: Ensure lengths match between predictions and identifiers.
When zipping `outputs[Data.PRED]` and `outputs[Data.IDENTIFIER]`, a mismatch would cause a runtime error. Confirm each batch's identifier list precisely matches its predictions list.
tests/unit/test_callbacks_writer_table.py (11)
28-28
: Clear test docstring.
“Test proper initialization of TableWriter” is concise and informative. No issues found.
34-34
: TableWriter instantiation is straightforward.
Good practice: verifying that the `path` is converted to a `Path` object.
40-40
: Descriptive test docstring.
This clarifies the custom writer functionality test well.
47-49
: Custom writer integration works.
Populating `csv_records` with the processed data is aligned with the writer's purpose.
55-59
: Multifunction docstring.
The multi-case coverage in these lines is thorough, ensuring the test scenario is well-documented.
71-76
: Handled multiple record formats.
Creating an array of different `identifier` types is a robust approach to testing varied data.
79-82
: Incremental assertion pattern.
Asserting intermediate results ensures swift detection of unexpected changes. The approach is well-structured.
Also applies to: 85-86
93-97
: Identifier column cast logic.
Casting to string before re-checking each record is a reliable approach to match actual vs. expected data.
108-108
: Well-labeled test scenario for multi-process usage.
The docstring clarifies distributed environment tests. This is a solid pattern to ensure data consistency under concurrency.
130-135
: Collective record gathering simulation.
Using mocked rank data accurately tests multi-process record merging. This is beneficial for distributed setups.
159-164
: Comprehensive data verification.
Casting identifiers and evaluating predictions ensures the final CSV is correct for every rank's records.
lighter/callbacks/freezer.py (7)
2-2
: Updated documentation refers to new class name.
Renaming to `Freezer` is consistent with the overall rename strategy. The docstring matches the new class name.
11-11
: Import usage validated.
`System` is indeed used to differentiate logic in `_set_model_requires_grad`. Keep the import.
15-15
: Class rename clarity.
Transition from `LighterFreezer` to `Freezer` improves naming uniformity.
60-66
: Train batch start logic.
Switching `pl_module` to `System` is consistent with refactoring. The docstring and parameter usage match.
73-73
: Consistent invocation for validation, test, predict.
Repeating `_on_batch_start` ensures a uniform freezing mechanism across all stages.
Also applies to: 78-78, 83-83
87-93
: Encapsulated freeze/unfreeze logic.
Centrally handling the freeze logic in `_on_batch_start` allows consistent behavior. No issues found.
113-113
: Sensible fallback to the underlying model.
Unwrapping `pl_module: System` into its `.model` field is consistent and ensures the correct parameters are frozen. Use caution for any advanced or nested model structures.
Also applies to: 121-122
tests/unit/test_callbacks_writer_file.py (14)
10-10
: Imported writer references match new naming.
The `FileWriter` reference is properly aligned with the rename. Import usage is valid.
14-17
: Self-documenting tests.
Mentioning the exact checks for path, writer, and directory creation helps future maintainers quickly identify what’s tested.
27-27
: FileWriter instantiation with “tensor”.
Straightforward approach for verifying correct writer function association.
35-35
: Clear test rationale.
Ensures coverage of essential tensor writing behavior, including file existence and data integrity upon load.
48-50
: Direct saving & re-loading.
This is a robust approach to confirm that the saved file can be correctly interpreted by torch later.
64-64
: Comprehensive image test docstring.
Explicitly stating shape, format, and dimension checks ensures clarity for new contributors.
77-79
: Saving as PNG.
Using the `"image"` writer for an RGB tensor is consistent. The approach is verified by subsequent shape checks.
94-94
: Video test coverage.
It’s good that multiple frames, dimension checks, and existence checks are included in the docstring.
106-108
: Video writer usage.
Properly verifying file presence after writing. The test ensures the output is indeed an `.mp4` file.
118-118
: Grayscale video scenario.
Highlighting the single-channel input format is valuable for testing robust dimension handling.
131-134
: Confirming grayscale to RGB conversion.
Ensures edge-case coverage. If future logic changes for single-channel data, this test will catch regressions.
144-144
: ITK image docstring.
Providing explicit type constraints (MONAI MetaTensor) is especially helpful to new readers.
157-157
: Distinguishing MONAI MetaTensor usage.
The test ensures that invalid formats raise an error, while valid MetaTensors are stored. This fosters clarity around custom data workflows.
Also applies to: 162-166
176-176
: Error handling tests.
Verifying runtime errors when file paths are mistaken for directories or when directories don't exist. Clear messages guide users effectively.
Also applies to: 189-191, 196-197
tests/unit/test_engine_runner.py (10)
7-7
: Rename from LighterSystem to System looks good.
This import aligns correctly with the updated class name and references.
13-18
: Docstring updated to reference the new class name.
The docstring and return statement now consistently refer to `System`, matching the rename.
74-74
: Docstring parameter reference updated.
Ensure the mention of `mock_system` aligns with standard naming conventions. Currently, the naming is consistent with usage.
123-123
: Docstring parameter reference updated for new `System`.
No issues noted; good consistency throughout the tests.
151-151
: Docstring for `mock_system` updated.
All references to `LighterSystem` are removed.
184-184
: Docstring clarifies ValueError condition for invalid System.
This helps maintain clarity for test expectations.
194-194
: Exception message updated to reference `System`.
The raised `ValueError` message is now accurate and consistent with the new class name.
203-203
: Docstring parameter reference for `mock_system` is now coherent.
No concerns here; it remains consistent with the rest of the file.
225-225
: Docstring parameter reference for `mock_system`.
All naming changes appear correct.
253-253
: Docstring parameter reference for `mock_system`.
No additional issues.
tests/unit/test_system.py (14)
12-12
: Import of `System` in place of `LighterSystem`.
This import aligns with the project-wide rename effort.
56-56
: Instantiating `System` with new naming and batch size.
This ensures consistency with the refactored usage. Looks good.
68-68
: Refactoring to `System` with a dict-based loss.
The logic for using a dictionary-based criterion is consistent with the approach in other tests.
85-85
: Refactoring to `System` with an invalid dict criterion.
The test properly handles the invalid scenario by raising a `ValueError`.
105-106
: Additional check on `identifier` in the batch.
Now that `identifier` is a recognized field, this error handling improves robustness.
181-181
: Test for step and epoch injection in `forward` with `System`.
Ensures advanced usage of partial arguments in the forward pass. Nicely handled.
204-204
: Minimal `System` instantiation.
An empty `datasets` dict triggers an error, which clarifies the need for at least a 'train' dataset.
221-221
: Criterion using only predictions (`System`).
The test covers an edge case where the target might be absent. Good coverage.
237-237
: Batch type validation updated for `System`.
This enforces stronger checks on the input batch, aligning with the new approach.
258-258
: Postprocessing dictionary for `System`.
The approach is consistent with the newly introduced adapters concept.
287-287
: Multiple optimizer parameter groups error.
Clear error message prevents ambiguous learning rate usage.
298-298
: No optimizer scenario for `System`.
The test logic ensures a `ValueError` clarifies the missing optimizer.
304-304
: Refactored metric usage with `System`.
Integration with PyTorch Lightning metrics indicates consistent usage.
29-29
: Changed key name to `identifier` in the dataset.
Be sure all references in code and tests are updated to `identifier` to avoid mismatches.
✅ Verification successful
All references to `identifier` are consistent across the codebase
The key name change has been properly implemented throughout the codebase. The key is defined as an enum, `Data.IDENTIFIER`, and is consistently used in all relevant files, including tests and callbacks. No remaining references to the old `id` key were found.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Verify that `id_` references are updated across the codebase
rg 'data\["id"\]' -A2 -B2
rg 'data\.get\("id' -A2 -B2
```
Length of output: 57
Script:
```bash
#!/bin/bash
# Search for both old and new key patterns with broader context
echo "=== Searching for 'id' key patterns ==="
rg -i "[\"\']id[\"\']" -A3 -B3
echo -e "\n=== Searching for 'identifier' key patterns ==="
rg -i "[\"\']identifier[\"\']" -A3 -B3
echo -e "\n=== Searching in Python files for dictionary access patterns ==="
rg "\.get\(['\"](?:id|identifier)['\"]|[\[\'](?:id|identifier)[\]\']" --type py -A3 -B3
```
Length of output: 873
Script:
```bash
#!/bin/bash
# Search for dictionary access patterns with simpler regex
echo "=== Searching for 'id' key in dictionary access ==="
rg '"id"' --type py -A2 -B2
rg "'id'" --type py -A2 -B2
echo -e "\n=== Searching for 'identifier' key in dictionary access ==="
rg '"identifier"' --type py -A2 -B2
rg "'identifier'" --type py -A2 -B2
echo -e "\n=== Searching for .get() patterns ==="
rg '\.get\(' --type py -A2 -B2
```
Length of output: 12375
projects/cifar10/experiments/example_overrides.yaml (1)
1-2
: Added trainer and system overrides in YAML.
Specifying a single epoch and a new optimizer target for CIFAR-10 is straightforward. Ensure usage aligns with any updated config schema.
lighter/engine/schema.py (3)
4-5
: `project` and `vars` fields.
No immediate concerns; ensure they match the actual usage in the code and documentation.
6-16
: `args` schema definitions.
Splitting subcommands like `fit`, `validate`, `test`, etc., under `args` is a good approach for clarity.
18-53
: `system` schema with nested fields.
This structure is significantly more flexible than Pydantic-based classes. Carefully confirm runtime validation to avoid silent misconfigurations.
✅ Verification successful
Schema validation is properly implemented and enforced
The configuration system uses Cerberus for strict runtime validation with proper error handling and detailed error messages. The implementation includes:
- Mandatory schema validation during config initialization
- Custom exception handling with formatted validation errors
- Comprehensive real-world examples demonstrating correct usage
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Confirm references to the new dictionary schema usage.
rg 'SCHEMA' -A 10 -B 10
```
Length of output: 2413
Script:
```bash
#!/bin/bash
# Check Config class implementation and tests
ast-grep --pattern 'def validate'
rg -g '*.py' 'def validate' -A 5 -B 5
# Look for test files
fd test_config.py
# Look for example configurations
fd -e yml -e yaml
```
Length of output: 923
Script:
```bash
#!/bin/bash
# Check test and example configurations
cat tests/configs/test1.yaml
cat projects/cifar10/experiments/example.yaml
# Look for Config class validation implementation
rg -g 'config.py' 'class Config' -A 50
```
Length of output: 7256
.gitignore (1)
153-153
: LGTM! Improved ignore patterns.
The changes enhance the ignore patterns by:
- Using `**/.DS_Store` to ignore `.DS_Store` files at any directory level
- Adding `.code-workspace` to ignore VS Code workspace files
- Adding `.scale_batch_size*` to ignore batch size scaling artifacts
Also applies to: 156-157
projects/cifar10/experiments/example.yaml (3)
31-51
: LGTM! Comprehensive metrics configuration.
Excellent setup of metrics with consistent configuration across train, val, and test phases. The use of macro averaging and proper task specification for multiclass classification is appropriate.
96-104
: LGTM! Clear adapter configuration.
The batch adapter configuration is well-structured with proper input and target accessors.
52-95
: Verify memory requirements for the batch size.
The batch size of 512 with CIFAR10 images might require significant memory. Consider implementing a batch size finder or memory estimation.
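A back-of-the-envelope input-memory check (a sketch; activations and gradients, which usually dominate, are not counted):

```python
def batch_input_mb(batch_size: int, shape=(3, 32, 32), bytes_per_elem: int = 4) -> float:
    # Raw float32 input-tensor memory for one batch of CIFAR-10 images.
    n_elements = batch_size
    for dim in shape:
        n_elements *= dim
    return n_elements * bytes_per_elem / 1024**2

print(batch_input_mb(512))  # ~6 MB of raw inputs; the model multiplies this several-fold
```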
pyproject.toml (1)
196-197
: Verify the necessity of redefining the 'input' built-in.
The addition of 'input' to `allowed-redefined-builtins` suggests that the code redefines the built-in `input` function. This could lead to confusion and potential issues.
✅ Verification successful
The redefinition of 'input' built-in is justified
The codebase consistently uses 'input' as a parameter name in data processing methods, following a clear pattern (input, target, pred) that enhances code readability. The redefinitions in adapters.py are part of data transformation logic where 'input' naturally describes the processed data.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Search for redefinitions of 'input'
echo "Searching for redefinitions of 'input'..."
rg -n "(?:^|\s+)input\s*=[^=]" --type py
echo "Searching for 'input' as parameter name..."
rg -n "def\s+\w+\s*\([^)]*\binput\b[^)]*\)" --type py
```
Length of output: 1415
docs/index.md (1)
31-31
: LGTM! Consistent naming updates.
Documentation updates correctly reflect the rename from `LighterSystem` to `System`, maintaining consistency with the codebase changes.
Also applies to: 110-110
docs/basics/projects.md (1)
18-18
: LGTM! Documentation alignment with new naming.
The documentation correctly reflects the system class renaming, maintaining consistency across the project.
Also applies to: 106-106
docs/basics/quickstart.md (1)
38-41
: 🛠️ Refactor suggestion
Documentation needs update to reflect architectural changes.
The System component description needs to be updated to align with the PR's architectural changes, particularly regarding the replacement of datasets with dataloaders.
README.md (1)
46-46
: LGTM!
The renaming from "LighterSystem" to "System" is consistent with the overall refactoring effort.
🧰 Tools
🪛 Markdownlint (0.37.0)
46-46: Unordered list indentation. Expected: 0; Actual: 1 (MD007, ul-indent)
docs/basics/config.md (1)
74-77
: 🛠️ Refactor suggestion
Update System description to reflect new architecture.
The description needs to be updated to reflect the architectural changes mentioned in the PR objectives, particularly:
- The replacement of datasets with dataloaders
- The use of adapters instead of postprocessing
- The removal of Pydantic in favor of containers and manual validation
Actionable comments posted: 3
🧹 Nitpick comments (1)
lighter/engine/schema.py (1)
24-24
: Add validation for optimizer configuration.
The optimizer dictionary should validate common parameters like learning rate and weight decay.
```diff
-"optimizer": {"type": "dict"},
+"optimizer": {
+    "type": "dict",
+    "schema": {
+        "_target_": {"type": "string", "required": True},
+        "lr": {"type": "float", "min": 0, "required": True},
+        "weight_decay": {"type": "float", "min": 0},
+        "momentum": {"type": "float", "min": 0}
+    }
+},
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- lighter/engine/config.py (1 hunks)
- lighter/engine/schema.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- lighter/engine/config.py
🧰 Additional context used
📓 Learnings (1)
lighter/engine/schema.py (1)
Learnt from: ibro45
PR: project-lighter/lighter#142
File: lighter/engine/schema.py:0-0
Timestamp: 2025-01-27T02:17:12.626Z
Learning: The adapters schema in Lighter requires:
- batch and logging keys for all modes (train, val, test, predict)
- criterion key only for train and val modes
- metrics key for train, val, and test modes but not for predict mode
🔇 Additional comments (3)
lighter/engine/schema.py (3)
17-17
: Add schema validation for trainer configuration.
The trainer configuration is marked as required but lacks schema validation for critical parameters.
35-43
: Documentation needs updating to reflect dataloader configuration.
44-81
: 🛠️ Refactor suggestion
Update adapters schema to enforce required fields.
Based on the retrieved learnings, the schema should enforce required fields for each mode:
- batch and logging keys for all modes
- criterion key only for train and val modes
- metrics key for train, val, and test modes
"adapters": { "type": "dict", "schema": { "train": { "type": "dict", "schema": { - "batch": {"type": "dict"}, - "criterion": {"type": "dict"}, - "metrics": {"type": "dict"}, - "logging": {"type": "dict"} + "batch": {"type": "dict", "required": True}, + "criterion": {"type": "dict", "required": True}, + "metrics": {"type": "dict", "required": True}, + "logging": {"type": "dict", "required": True} } }, "val": { "type": "dict", "schema": { - "batch": {"type": "dict"}, - "criterion": {"type": "dict"}, - "metrics": {"type": "dict"}, - "logging": {"type": "dict"} + "batch": {"type": "dict", "required": True}, + "criterion": {"type": "dict", "required": True}, + "metrics": {"type": "dict", "required": True}, + "logging": {"type": "dict", "required": True} } }, "test": { "type": "dict", "schema": { - "batch": {"type": "dict"}, - "metrics": {"type": "dict"}, - "logging": {"type": "dict"} + "batch": {"type": "dict", "required": True}, + "metrics": {"type": "dict", "required": True}, + "logging": {"type": "dict", "required": True} } }, "predict": { "type": "dict", "schema": { - "batch": {"type": "dict"}, - "logging": {"type": "dict"} + "batch": {"type": "dict", "required": True}, + "logging": {"type": "dict", "required": True} } } } }Likely invalid or redundant comment.
lighter/engine/schema.py (outdated)
```python
"args": {
    "type": "dict",
    "schema": {
        "fit": {"type": "dict"},
        "validate": {"type": "dict"},
        "test": {"type": "dict"},
        "predict": {"type": "dict"},
        "lr_find": {"type": "dict"},
        "scale_batch_size": {"type": "dict"}
    }
},
```
🛠️ Refactor suggestion
Add validation for args operation modes.
The args schema allows any dictionary for operation modes (fit, validate, test, etc.). Consider adding validation for common parameters:
- Batch size limits
- Number of devices/workers
- Other mode-specific parameters
"args": {
"type": "dict",
"schema": {
- "fit": {"type": "dict"},
+ "fit": {
+ "type": "dict",
+ "schema": {
+ "batch_size": {"type": "integer", "min": 1},
+ "num_workers": {"type": "integer", "min": 0},
+ "accelerator": {"type": "string", "allowed": ["cpu", "gpu", "tpu"]},
+ "devices": {"type": ["integer", "list"], "min": 1}
+ }
+ },
"validate": {"type": "dict"},
"test": {"type": "dict"},
"predict": {"type": "dict"},
"lr_find": {"type": "dict"},
"scale_batch_size": {"type": "dict"}
}
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
"args": { | |
"type": "dict", | |
"schema": { | |
"fit": {"type": "dict"}, | |
"validate": {"type": "dict"}, | |
"test": {"type": "dict"}, | |
"predict": {"type": "dict"}, | |
"lr_find": {"type": "dict"}, | |
"scale_batch_size": {"type": "dict"} | |
} | |
}, | |
"args": { | |
"type": "dict", | |
"schema": { | |
"fit": { | |
"type": "dict", | |
"schema": { | |
"batch_size": {"type": "integer", "min": 1}, | |
"num_workers": {"type": "integer", "min": 0}, | |
"accelerator": {"type": "string", "allowed": ["cpu", "gpu", "tpu"]}, | |
"devices": {"type": ["integer", "list"], "min": 1} | |
} | |
}, | |
"validate": {"type": "dict"}, | |
"test": {"type": "dict"}, | |
"predict": {"type": "dict"}, | |
"lr_find": {"type": "dict"}, | |
"scale_batch_size": {"type": "dict"} | |
} | |
}, |
@ibro45
I think over time, if we populate this schema, this would allow users to know the config system possibilities at a glance.
I'm not sure about it. I haven't used it, and I don't see it in the repo either. Maybe you'll need to look into the Makefile changes. Correction: I do see it now. It's based on Poetry, which is why it doesn't work. I vote to remove this unless you think it's useful.
Actionable comments posted: 0
🧹 Nitpick comments (6)
.pre-commit-config.yaml (1)
Line range hint
1-26
: Consider reinstating the pyupgrade hook.
The removal of the pyupgrade hook might impact Python version compatibility checks. Consider keeping it to ensure the code stays compatible with the supported Python versions.
Add this configuration to reinstate the pyupgrade hook:
```diff
 - repo: local
   hooks:
+    - id: pyupgrade
+      name: pyupgrade
+      entry: uvx run pyupgrade --py310-plus
+      types: [python]
+      language: system
```
CONTRIBUTING.md (3)
38-38
: Add missing comma after introductory phrase.
Add a comma after "After installation" for better readability.
```diff
-After installation you may execute code formatting:
+After installation, you may execute code formatting:
```
🧰 Tools
🪛 LanguageTool
[uncategorized] ~38-~38: A comma is probably missing here. (MISSING_COMMA_AFTER_INTRODUCTORY_PHRASE)
63-75
: Improve the submission checklist formatting.
The submission steps should use the infinitive form for consistency.
```diff
 Before submitting your code:
-1. Add any changes you want
-2. Add tests for the new changes
-3. Edit documentation if you have changed something significant
-4. Format all your code
+1. To add your changes
+2. To add tests for the new changes
+3. To edit documentation for significant changes
+4. To format all your code
    make codestyle
-5. Run all checks:
+5. To run all checks:
    make lint
```
🧰 Tools
🪛 LanguageTool
[grammar] ~65-~65: The verb "Add" needs to be in the to-infinitive form. (MISSING_TO_BEFORE_A_VERB)
92-92: Consider using requirements file for documentation dependencies.
Instead of installing documentation dependencies directly, consider using a `docs/requirements.txt` file for better maintainability.
1. Create `docs/requirements.txt`:
```
mkdocs-material
mkdocs-autorefs
mkdocstrings
mkdocs-gen-files
mkdocs-literate-nav
mkdocs-section-index
```
2. Update the installation command:
```diff
-uv pip install mkdocs-material mkdocs-autorefs mkdocstrings mkdocs-gen-files mkdocs-literate-nav mkdocs-section-index
+uv pip install -r docs/requirements.txt
```
README.md (2)
46-46
: Fix list indentation.
The unordered list item should not be indented.
```diff
- - [PyTorch Lightning](https://github.com/Lightning-AI/lightning) - Our [`System`](https://project-lighter.github.io/lighter/reference/system/) is based on the PyTorch Lightning [`LightningModule`](https://lightning.ai/docs/pytorch/stable/common/lightning_module.html) and implements all the necessary training logic for you. Couple it with the PyTorch Lightning [Trainer](https://lightning.ai/docs/pytorch/stable/common/trainer.html) and you're good to go.
+- [PyTorch Lightning](https://github.com/Lightning-AI/lightning) - Our [`System`](https://project-lighter.github.io/lighter/reference/system/) is based on the PyTorch Lightning [`LightningModule`](https://lightning.ai/docs/pytorch/stable/common/lightning_module.html) and implements all the necessary training logic for you. Couple it with the PyTorch Lightning [Trainer](https://lightning.ai/docs/pytorch/stable/common/trainer.html) and you're good to go.
```
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
46-46: Unordered list indentation. Expected: 0; Actual: 1 (MD007, ul-indent)
71-71
: Add a comment about uv installation.
Since the installation now uses uv, it would be helpful to add a comment about installing uv first.
```diff
 make setup
-make install # Install lighter via uv
+make install # Install lighter via uv (requires uv to be installed first)
 make pre-commit-install # Set up the pre-commit hook for code formatting
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- .github/workflows/ci.yml (1 hunks)
- .pre-commit-config.yaml (1 hunks)
- CONTRIBUTING.md (2 hunks)
- README.md (2 hunks)
🧰 Additional context used
🪛 LanguageTool
CONTRIBUTING.md
[uncategorized] ~38-~38: A comma is probably missing here. (MISSING_COMMA_AFTER_INTRODUCTORY_PHRASE)
[grammar] ~65-~65: The verb "Add" needs to be in the to-infinitive form. (MISSING_TO_BEFORE_A_VERB)
🪛 markdownlint-cli2 (0.17.2)
README.md
46-46: Unordered list indentation. Expected: 0; Actual: 1 (MD007, ul-indent)
🔇 Additional comments (2)
.pre-commit-config.yaml (1)
15-15
: Verify the availability of `uvx` in the development environment.
The switch from `poetry run` to `uvx run` requires that `uvx` is properly installed and available in the development environment.
Run this script to check if the development environment is properly configured:
Also applies to: 23-23
✅ Verification successful
`uvx` setup is properly documented in CONTRIBUTING.md
The switch from `poetry run` to `uvx run` is valid, as the project has clear setup instructions in CONTRIBUTING.md that guide developers to:
- Create a virtual environment with `uv venv`
- Sync dependencies with `uv sync`
- Install pre-commit hooks with `uvx pre-commit install`
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Description: Verify uvx installation and configuration
# Check if uvx is available
command -v uvx >/dev/null 2>&1 || { echo "uvx not found. Please install it first."; exit 1; }
# Check uvx version
uvx --version
# List installed packages to verify isort and black are available
uv pip list | grep -E "isort|black"
```
Length of output: 114
Script:
```bash
#!/bin/bash
# Search for uvx-related configuration and documentation
echo "=== Searching for uvx in documentation and configuration files ==="
rg -i "uvx|uv " -g "!*.pyc" -g "!*.git*" -g "!*.ipynb" -g "README*" -g "*.md" -g "*.toml" -g "*.yaml" -g "*.yml" -A 3
echo -e "\n=== Checking for uv/uvx in requirements files ==="
fd -e txt -e in requirements | xargs cat
echo -e "\n=== Checking CI configurations ==="
fd -e yml -e yaml -g ".github/workflows/*" | xargs cat
```
Length of output: 1709
.github/workflows/ci.yml (1)
32-32
: Verify the presence and generation of the uv.lock file.
The cache key now depends on `uv.lock` instead of `poetry.lock`. Ensure that:
- `uv.lock` is being properly generated during dependency installation
- The file is committed to the repository
Run this script to verify the lock file:
✅ Verification successful
Lock file verification successful
The `uv.lock` file exists and is properly tracked in the repository, making it suitable for use in the cache key.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Description: Check if uv.lock exists and is properly tracked
# Check if uv.lock exists
if [ ! -f "uv.lock" ]; then
  echo "uv.lock not found"
  exit 1
fi
# Check if uv.lock is tracked by git
if ! git ls-files --error-unmatch uv.lock >/dev/null 2>&1; then
  echo "uv.lock is not tracked by git"
  exit 1
fi
# Show last modification date
stat uv.lock
```
Length of output: 174
Script:
```bash
#!/bin/bash
# Search for uv.lock file and check git status
echo "=== Searching for uv.lock ==="
fd uv.lock
echo -e "\n=== Checking git tracked files ==="
git ls-files | grep -i "uv.lock"
echo -e "\n=== Checking pyproject.toml for uv configuration ==="
if [ -f "pyproject.toml" ]; then
  cat pyproject.toml
fi
echo -e "\n=== Checking .gitignore for uv.lock ==="
if [ -f ".gitignore" ]; then
  cat .gitignore | grep -i "uv.lock"
fi
```
Length of output: 4629
Actionable comments posted: 2
🧹 Nitpick comments (19)
tests/integration/test_overrides.yaml (1)
4-8
: LGTM! Consider DRY-ing up common values.
The new structure provides better organization and flexibility by allowing different configurations for train and val dataloaders. Since both dataloaders currently share the same values, you could consider extracting these common settings to reduce duplication.
Here's a suggestion to make it DRY-er:
```diff
-system#dataloaders#train#batch_size: 16
-system#dataloaders#train#num_workers: 2
-
-system#dataloaders#val#batch_size: 16
-system#dataloaders#val#num_workers: 2
+system#dataloaders#defaults: &defaults
+  batch_size: 16
+  num_workers: 2
+system#dataloaders#train:
+  <<: *defaults
+system#dataloaders#val:
+  <<: *defaults
```
(Note: the suggested snippet needs the `&defaults` anchor for the `*defaults` aliases to resolve.)
tests/integration/test_cifar.py (1)
27-36
: Update docstring to reflect Stage enum usage.
The implementation looks good, but the docstring should be updated to reflect that `stage` is now a `Stage` enum rather than a string. Apply this diff to fix the docstring:
```diff
 """
 Test the specified stage using the given configuration.

 Args:
-    stage: The stage to run (e.g., "fit", "test", "predict").
+    stage: The stage to run (e.g., Stage.FIT, Stage.TEST, Stage.PREDICT).
     config: Path to the configuration file.
 """
```
tests/unit/test_engine_config.py (3)
8-20
: Consider adding edge cases to the initialization test.
While the basic initialization test is good, consider adding edge cases such as:
- Empty dictionary
- Nested empty dictionaries
- Dictionary with None values
28-67
: Add test cases for partial validation.
The validation test covers complete validation well, but consider adding test cases for:
- Partial validation of specific sections
- Validation of optional fields
158-196
: Enhance error message validation.The error formatting tests could be more comprehensive. Consider adding test cases for the following (a sketch of the nested case follows this list):
- Deeply nested validation errors (>3 levels)
- Multiple errors at the same path
- Unicode characters in error messages
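For the deeply nested case, a minimal self-contained sketch — the `format_validation_errors` helper below is hypothetical and only mirrors the recursive flattening described in this PR, not the actual implementation:

```python
def format_validation_errors(errors, path=""):
    """Flatten a Cerberus-style nested error dict into 'path.to.key: message' strings."""
    messages = []
    for key, value in errors.items():
        key_path = f"{path}.{key}" if path else str(key)
        if isinstance(value, dict):
            messages.extend(format_validation_errors(value, key_path))
        else:
            # Cerberus wraps nested errors in lists of dicts or message strings
            for item in value:
                if isinstance(item, dict):
                    messages.extend(format_validation_errors(item, key_path))
                else:
                    messages.append(f"{key_path}: {item}")
    return messages


def test_deeply_nested_errors():
    errors = {"system": [{"dataloaders": [{"train": ["must be of dict type"]}]}]}
    assert format_validation_errors(errors) == ["system.dataloaders.train: must be of dict type"]
```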
tests/unit/test_engine_resolver.py (2)
10-38
: Enhance test configuration fixture.The test configuration could be more realistic by:
- Including actual model class paths instead of string placeholders
- Adding more complex nested configurations
- Including all possible configuration options
47-56
: Add edge cases for resolver initialization.Consider adding test cases for:
- Empty configuration
- Configuration with missing optional sections
- Configuration with only required fields
tests/unit/test_utils_misc.py (1)
143-181
: Enhance optimizer stats test coverage.The test for optimizers with betas is good but could be improved by:
- Testing with other optimizers that use betas (e.g., AdamW, Adamax)
- Adding edge cases:
- Zero learning rate
- Default betas
- Custom beta ranges
- Testing error handling for invalid beta values
Example test cases to add:
```python
def test_get_optimizer_stats_with_invalid_betas():
    model = torch.nn.Linear(10, 1)
    # Test with invalid beta range
    with pytest.raises(ValueError):
        optimizer = Adam(model.parameters(), lr=0.001, betas=(1.1, 0.999))
        get_optimizer_stats(optimizer)


def test_get_optimizer_stats_with_different_optimizers():
    model = torch.nn.Linear(10, 1)
    # Test AdamW
    optimizer = torch.optim.AdamW(model.parameters(), lr=0.001, betas=(0.9, 0.999))
    stats = get_optimizer_stats(optimizer)
    assert stats["optimizer/AdamW/momentum"] == 0.9
```
tests/unit/test_adapters.py (2)
37-37
: Replace lambda functions with regular functions.Lambda functions are used for transforms in multiple test cases. While they work, using regular functions would improve readability and follow Python's style guidelines.
Apply this diff to replace the lambda functions with regular functions:
```diff
-    transform = lambda x: x + 1
+    def transform(x):
+        return x + 1

-    transform1 = lambda x: x * 2
+    def transform1(x):
+        return x * 2

-    transform2 = lambda x: x - 1
+    def transform2(x):
+        return x - 1
```
Also applies to: 47-47, 48-48, 279-279, 289-289, 290-290
🧰 Tools
🪛 Ruff (0.8.2)
37-37: Do not assign a
lambda
expression, use adef
Rewrite
transform
as adef
(E731)
161-167
: Add assertions for callable access test.
The `test_callable_access` function could benefit from additional assertions to verify the behavior of the callable accessors. Add assertions to verify that the callable accessors are invoked correctly:

```diff
 adapter = BatchAdapter(input_accessor=lambda b: b["data"], target_accessor=lambda b: b["label"])
 batch = {"data": create_tensor(), "label": create_tensor()}
 input, target, identifier = adapter(batch)
 assert input.equal(batch["data"])
 assert target.equal(batch["label"])
 assert identifier is None
+# Add assertions to verify callable behavior
+assert isinstance(input, torch.Tensor), "Input accessor should return a Tensor"
+assert isinstance(target, torch.Tensor), "Target accessor should return a Tensor"
```
363-378
: Enhance test assertions for perfect match case.The test for perfect match could benefit from additional assertions to verify that all weights are loaded correctly.
Add assertions to verify the weights after loading:
```diff
 # Load state dict with perfect match
 adjust_prefix_and_load_state_dict(dummy_model, perfect_match_state_dict_file)
-# Note: We can't directly test the logger output, but the function should complete without raising exceptions
+# Load the state dict to compare weights
+state_dict = torch.load(perfect_match_state_dict_file)
+
+# Verify all weights are loaded correctly
+assert torch.equal(dummy_model.layer1.weight, state_dict["layer1.weight"])
+assert torch.equal(dummy_model.layer1.bias, state_dict["layer1.bias"])
+assert torch.equal(dummy_model.layer2.weight, state_dict["layer2.weight"])
+assert torch.equal(dummy_model.layer2.bias, state_dict["layer2.bias"])
```
380-400
: Clarify test behavior for no matching prefix.The test for no matching prefix could be more explicit about the expected behavior when no keys match the prefix mapping.
Add comments and assertions to clarify the behavior:
```diff
 # Load state dict with prefix mapping that won't match any keys
 ckpt_to_model_prefix = {"nonexistent_prefix": ""}
+# Store original weights for comparison
+original_state_dict = {
+    "layer1.weight": dummy_model.layer1.weight.clone(),
+    "layer1.bias": dummy_model.layer1.bias.clone(),
+    "layer2.weight": dummy_model.layer2.weight.clone(),
+    "layer2.bias": dummy_model.layer2.bias.clone(),
+}
+
 adjust_prefix_and_load_state_dict(dummy_model, path, ckpt_to_model_prefix=ckpt_to_model_prefix)
-# Verify weights match the original state dict
+# Verify weights are updated from the original state dict, not preserved
 assert torch.equal(dummy_model.layer1.weight, state_dict["layer1.weight"])
 assert torch.equal(dummy_model.layer1.bias, state_dict["layer1.bias"])
+assert not torch.equal(dummy_model.layer1.weight, original_state_dict["layer1.weight"])
+assert not torch.equal(dummy_model.layer1.bias, original_state_dict["layer1.bias"])
```
tests/unit/test_utils_logging.py (2)
28-28
: Add stacklevel to warning.The warning is raised without a stacklevel, which could make it harder to trace the source of the warning.
Add stacklevel to the warning:
```diff
-    warnings.warn("Test warning")
+    warnings.warn("Test warning", stacklevel=2)
```
🧰 Tools
🪛 Ruff (0.8.2)
28-28: No explicit
stacklevel
keyword argument found(B028)
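As a standalone illustration of why `stacklevel=2` matters — this uses only the standard `warnings` module, independent of loguru's interception:

```python
import warnings

def deprecated_helper():
    # stacklevel=2 attributes the warning to deprecated_helper()'s caller,
    # so the reported file/line point at user code rather than this function.
    warnings.warn("Test warning", UserWarning, stacklevel=2)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    deprecated_helper()

assert caught[0].category is UserWarning
assert str(caught[0].message) == "Test warning"
```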
29-31
: Verify warning message content.The test verifies that the warning was captured but doesn't check the content of the warning message.
Add assertions to verify the warning message:
```diff
 # Verify the warning was captured by loguru
 mock_logger.opt.assert_called_with(depth=2)
-mock_opt.warning.assert_called_once()
+mock_opt.warning.assert_called_once_with("Test warning")
```
tests/unit/test_callbacks_writer_base.py (2)
226-228
: Combine nested with statements.The nested with statements could be combined into a single statement for better readability.
Combine the with statements:
```diff
-    with patch.object(Path, "exists", return_value=False), patch("torch.distributed.get_rank", return_value=1):
-        with pytest.raises(RuntimeError, match="Rank 1 does not share storage with rank 0"):
-            writer.setup(trainer, pl_module, stage="predict")
+    with (
+        patch.object(Path, "exists", return_value=False),
+        patch("torch.distributed.get_rank", return_value=1),
+        pytest.raises(RuntimeError, match="Rank 1 does not share storage with rank 0"),
+    ):
+        writer.setup(trainer, pl_module, stage="predict")
```
🧰 Tools
🪛 Ruff (0.8.2)
226-227: Use a single
with
statement with multiple contexts instead of nestedwith
statements(SIM117)
181-204
: Enhance test assertions for existing path case.The test for existing path could benefit from additional assertions to verify the warning behavior.
Add assertions to verify warning behavior:
```diff
 with patch.object(Path, "exists", return_value=True):
     writer.setup(trainer, pl_module, stage="predict")
+    # Verify warning messages
     assert any("already exists" in msg for msg in warning_messages)
     assert any("existing predictions will be overwritten" in msg for msg in warning_messages)
+    # Verify warning count
+    assert len([msg for msg in warning_messages if "already exists" in msg]) == 1
+    assert len([msg for msg in warning_messages if "existing predictions will be overwritten" in msg]) == 1
+    # Verify writer is still set up correctly
+    assert writer._pred_counter == 0
+    assert writer.path == target_path
```
tests/unit/test_callbacks_writer_table.py (1)
10-10
: Remove unused import. The `System` import is not used in this file.

```diff
-from lighter.system import System
```
🧰 Tools
🪛 Ruff (0.8.2)
10-10:
lighter.system.System
imported but unusedRemove unused import:
lighter.system.System
(F401)
tests/unit/test_engine_runner.py (1)
7-7
: Remove unused import. The `Tuner` import from pytorch_lightning.tuner is not used in this file.

```diff
-from pytorch_lightning.tuner import Tuner
```
🧰 Tools
🪛 Ruff (0.8.2)
7-7:
pytorch_lightning.tuner.Tuner
imported but unusedRemove unused import:
pytorch_lightning.tuner.Tuner
(F401)
tests/unit/test_system.py (1)
3-3
: Remove unused import. The `patch` import from unittest.mock is not used in this file.

```diff
-from unittest.mock import patch
```
🧰 Tools
🪛 Ruff (0.8.2)
3-3:
unittest.mock.patch
imported but unusedRemove unused import:
unittest.mock.patch
(F401)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
- pyproject.toml (2 hunks)
- tests/integration/test_cifar.py (1 hunks)
- tests/integration/test_overrides.yaml (1 hunks)
- tests/unit/test_adapters.py (1 hunks)
- tests/unit/test_callbacks_freezer.py (8 hunks)
- tests/unit/test_callbacks_writer_base.py (7 hunks)
- tests/unit/test_callbacks_writer_file.py (8 hunks)
- tests/unit/test_callbacks_writer_table.py (7 hunks)
- tests/unit/test_engine_config.py (1 hunks)
- tests/unit/test_engine_resolver.py (1 hunks)
- tests/unit/test_engine_runner.py (1 hunks)
- tests/unit/test_system.py (1 hunks)
- tests/unit/test_utils_logging.py (1 hunks)
- tests/unit/test_utils_misc.py (2 hunks)
- tests/unit/test_utils_model.py (2 hunks)
- tests/unit/test_utils_patches.py (0 hunks)
- tests/unit/test_utils_types_containers.py (1 hunks)
💤 Files with no reviewable changes (1)
- tests/unit/test_utils_patches.py
🚧 Files skipped from review as they are similar to previous changes (2)
- pyproject.toml
- tests/unit/test_callbacks_writer_file.py
🧰 Additional context used
🪛 Ruff (0.8.2)
tests/unit/test_utils_logging.py
28-28: No explicit stacklevel
keyword argument found
(B028)
tests/unit/test_callbacks_writer_base.py
226-227: Use a single with
statement with multiple contexts instead of nested with
statements
(SIM117)
tests/unit/test_engine_runner.py
7-7: pytorch_lightning.tuner.Tuner
imported but unused
Remove unused import: pytorch_lightning.tuner.Tuner
(F401)
tests/unit/test_callbacks_writer_table.py
10-10: lighter.system.System
imported but unused
Remove unused import: lighter.system.System
(F401)
tests/unit/test_adapters.py
37-37: Do not assign a lambda
expression, use a def
Rewrite transform
as a def
(E731)
47-47: Do not assign a lambda
expression, use a def
Rewrite transform1
as a def
(E731)
48-48: Do not assign a lambda
expression, use a def
Rewrite transform2
as a def
(E731)
279-279: Do not assign a lambda
expression, use a def
Rewrite transform
as a def
(E731)
289-289: Do not assign a lambda
expression, use a def
Rewrite transform1
as a def
(E731)
290-290: Do not assign a lambda
expression, use a def
Rewrite transform2
as a def
(E731)
tests/unit/test_system.py
3-3: unittest.mock.patch
imported but unused
Remove unused import: unittest.mock.patch
(F401)
tests/unit/test_utils_types_containers.py
1-1: lighter.utils.types.containers.nested
imported but unused
Remove unused import: lighter.utils.types.containers.nested
(F401)
🪛 GitHub Actions: Lighter CI
tests/unit/test_engine_runner.py
[medium] 268-275: [B102:exec_used] Use of exec detected. This could potentially allow arbitrary code execution.
[medium] 280-287: [B102:exec_used] Use of exec detected. This could potentially allow arbitrary code execution.
🔇 Additional comments (6)
tests/integration/test_cifar.py (2)
4-4
: LGTM! Enhanced type safety with Runner class and Stage enum.
The change from importing `run` to using the `Runner` class and `Stage` enum improves type safety and follows better object-oriented principles.
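For readers unfamiliar with the pattern, a minimal sketch of what such an enum can look like — the member names are taken from the tests in this PR, while the exact base class and values are assumptions:

```python
from enum import Enum

class Stage(str, Enum):
    FIT = "fit"
    VALIDATE = "validate"
    TEST = "test"
    PREDICT = "predict"
    LR_FIND = "lr_find"
    SCALE_BATCH_SIZE = "scale_batch_size"

# str-backed members still compare equal to raw strings,
# so existing string-based call sites keep working:
assert Stage.FIT == "fit"
```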
10-24
: LGTM! Improved test parameterization with Stage enum.
The changes enhance type safety by using the `Stage` enum instead of strings, making the tests more maintainable and less prone to errors. The updated configuration path aligns with the PR's refactoring goals.
tests/unit/test_engine_config.py (1)
1-7
: LGTM! Well-structured test file organization.The imports and module docstring are clear and appropriate.
tests/unit/test_engine_runner.py (1)
268-275
: Consider safer alternatives to exec() for testing main guard.
The use of `exec()` for testing the `__main__` guard pattern could potentially allow arbitrary code execution. While this is a test file and the code being executed is controlled, consider using a safer alternative like mocking the module's `__name__` attribute directly. Here's a safer alternative approach:

```python
def test_main_guard():
    """Test the __main__ guard."""
    with patch("lighter.engine.runner.cli") as mock_cli:
        # Test when module is main
        with patch.dict("__main__.__dict__", {"__name__": "__main__"}):
            from lighter.engine.runner import __name__ as module_name
            if module_name == "__main__":
                cli()
            mock_cli.assert_called_once()

        # Test when module is not main
        mock_cli.reset_mock()
        with patch.dict("__main__.__dict__", {"__name__": "not_main"}):
            from lighter.engine.runner import __name__ as module_name
            if module_name == "__main__":
                cli()
            mock_cli.assert_not_called()
```
Also applies to: 280-287
🧰 Tools
🪛 GitHub Actions: Lighter CI
[medium] 268-275: [B102:exec_used] Use of exec detected. This could potentially allow arbitrary code execution.
tests/unit/test_callbacks_freezer.py (1)
Line range hint
1-570
: Well-structured and comprehensive test coverage!The test suite thoroughly covers various aspects of the Freezer callback, including:
- Initialization with different parameter combinations
- Basic functionality for freezing/unfreezing layers
- Edge cases with step and epoch limits
- Exception handling
- Complex scenarios with multiple layer combinations
tests/unit/test_system.py (1)
1-613
: Excellent test coverage aligning with PR objectives!The test suite comprehensively covers the System class functionality and aligns well with the PR's objectives of:
- Testing dataloader integration
- Verifying adapter functionality
- Validating the new container-based configuration approach
🧰 Tools
🪛 Ruff (0.8.2)
3-3:
unittest.mock.patch
imported but unusedRemove unused import:
unittest.mock.patch
(F401)
```python
    # Should remove optimizer and scheduler but keep criterion
    assert "optimizer" not in system_config
    assert "scheduler" not in system_config
    assert "criterion" in system_config

    # Should keep only validate args
    assert list(config.get("args").keys()) == ["validate"]


def test_test_stage_config(resolver):
    """Test configuration for test stage."""
    config = resolver.get_stage_config(Stage.TEST)
    system_config = config.get("system")

    # Should keep only test dataloader
    assert "test" in system_config["dataloaders"]
    assert "train" not in system_config["dataloaders"]
    assert "val" not in system_config["dataloaders"]
    assert "predict" not in system_config["dataloaders"]

    # Should keep only test metrics
    assert "test" in system_config["metrics"]
    assert "train" not in system_config["metrics"]
    assert "val" not in system_config["metrics"]

    # Should remove optimizer, scheduler, and criterion
    assert "optimizer" not in system_config
    assert "scheduler" not in system_config
    assert "criterion" not in system_config

    # Should keep only test args
    assert list(config.get("args").keys()) == ["test"]


def test_predict_stage_config(resolver):
    """Test configuration for predict stage."""
    config = resolver.get_stage_config(Stage.PREDICT)
    system_config = config.get("system")

    # Should keep only predict dataloader
    assert "predict" in system_config["dataloaders"]
    assert "train" not in system_config["dataloaders"]
    assert "val" not in system_config["dataloaders"]
    assert "test" not in system_config["dataloaders"]

    # Should remove all metrics
    assert "metrics" in system_config
    assert not system_config["metrics"]

    # Should remove optimizer, scheduler, and criterion
    assert "optimizer" not in system_config
    assert "scheduler" not in system_config
    assert "criterion" not in system_config

    # Should keep only predict args
    assert list(config.get("args").keys()) == ["predict"]


def test_lr_find_stage_config(resolver):
    """Test configuration for lr_find stage."""
    config = resolver.get_stage_config(Stage.LR_FIND)
    system_config = config.get("system")

    # Should keep train and val dataloaders
    assert "train" in system_config["dataloaders"]
    assert "val" in system_config["dataloaders"]
    assert "test" not in system_config["dataloaders"]
    assert "predict" not in system_config["dataloaders"]

    # Should keep train and val metrics
    assert "train" in system_config["metrics"]
    assert "val" in system_config["metrics"]
    assert "test" not in system_config["metrics"]

    # Should keep optimizer, scheduler, and criterion
    assert "optimizer" in system_config
    assert "scheduler" in system_config
    assert "criterion" in system_config


def test_scale_batch_size_stage_config(resolver):
    """Test configuration for scale_batch_size stage."""
    config = resolver.get_stage_config(Stage.SCALE_BATCH_SIZE)
    system_config = config.get("system")

    # Should keep train and val dataloaders
    assert "train" in system_config["dataloaders"]
    assert "val" in system_config["dataloaders"]
    assert "test" not in system_config["dataloaders"]
    assert "predict" not in system_config["dataloaders"]

    # Should keep train and val metrics
    assert "train" in system_config["metrics"]
    assert "val" in system_config["metrics"]
    assert "test" not in system_config["metrics"]

    # Should keep optimizer, scheduler, and criterion
    assert "optimizer" in system_config
    assert "scheduler" in system_config
    assert "criterion" in system_config
```
🛠️ Refactor suggestion
Reduce test duplication using parametrization.
The stage-specific tests follow similar patterns. Consider using pytest's parametrize to reduce code duplication:
@pytest.mark.parametrize("stage,expected_config", [
(Stage.FIT, {
"dataloaders": ["train", "val"],
"metrics": ["train", "val"],
"has_optimizer": True,
"has_scheduler": True,
"has_criterion": True,
"args": ["fit"]
}),
# Add other stages...
])
def test_stage_config(resolver, stage, expected_config):
config = resolver.get_stage_config(stage)
system_config = config.get("system")
for loader in expected_config["dataloaders"]:
assert loader in system_config["dataloaders"]
# Add other assertions...
Actionable comments posted: 0
🧹 Nitpick comments (3)
tests/unit/test_engine_runner.py (3)
35-51
: LGTM! Well-structured mock configuration.The mock configuration fixture provides a clear example of the expected configuration structure.
Consider adding a docstring that explains the structure and purpose of each configuration key to help other developers understand the expected format.
106-116
: Consider adding more configuration validation test cases.While the tests cover invalid system and trainer types well, consider adding test cases for:
- Invalid configuration structure
- Missing required fields
- Type validation for configuration values
This aligns with the PR's goal of moving from Pydantic to custom validation.
Also applies to: 119-129
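For illustration, the schema dialect used in lighter/engine/schema.py matches Cerberus, so a missing-required-field test could look like the sketch below; whether the project wires Cerberus in exactly this way is an assumption:

```python
from cerberus import Validator

schema = {
    "trainer": {"type": "dict", "required": True},
    "system": {"type": "dict", "required": True},
}
validator = Validator(schema, allow_unknown=True)

# A config without the required "trainer" section must fail validation
assert not validator.validate({"system": {}})
assert "trainer" in validator.errors  # -> {'trainer': ['required field']}
```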
227-261
: LGTM! Well-structured CLI command tests.The tests effectively verify the CLI interface and command handling.
Consider adding test cases for:
- CLI command help text
- Command argument validation
- Error handling for invalid configuration files
This would improve the user experience when using the CLI.
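For the invalid-configuration-file case, one possible shape — `Runner` and its `run` signature are taken from the tests in this PR, but the concrete exception raised for a missing config path is an assumption, so it is deliberately left broad here:

```python
import pytest

from lighter.engine.runner import Runner

def test_run_with_missing_config():
    runner = Runner()
    # The tests above pass a Stage enum member; a plain string is used here
    # only to keep the sketch import-light and may need replacing.
    with pytest.raises(Exception):
        runner.run("fit", config="does_not_exist.yaml")
```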
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/unit/test_engine_runner.py
(1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
tests/unit/test_engine_runner.py
7-7: pytorch_lightning.tuner.Tuner
imported but unused
Remove unused import: pytorch_lightning.tuner.Tuner
(F401)
🔇 Additional comments (2)
tests/unit/test_engine_runner.py (2)
60-66
: LGTM! Comprehensive initialization tests.The tests thoroughly verify that the Runner initializes with correct default values.
170-224
: LGTM! Comprehensive stage handling tests.The tests effectively cover:
- All normal stages (fit, validate, test, predict)
- Tuner stages (lr_find, scale_batch_size)
- Invalid stage error handling
This commit adds test coverage for:
1. Optional module imports with error handling
2. Nested dataclass initialization and behavior
3. Metrics and training stage dataclass default factories
@coderabbitai review

✅ Actions performed

Review triggered.
Actionable comments posted: 0
🧹 Nitpick comments (10)
tests/unit/test_utils_dynamic_imports.py (3)
118-130
: Enhance docstring to match existing documentation style.The test function would benefit from a more detailed docstring that follows the established pattern in the file.
Apply this diff to improve the docstring:
```diff
 def test_optional_import_success():
     """
-    Tests that an available module is imported successfully
-    and is stored in the 'imports' dictionary.
+    Test successful import and storage of an available module.
+
+    This test verifies that:
+    1. A built-in module (sys) can be imported successfully
+    2. The imported module is identical to the one from direct import
+    3. The module is properly stored in the OPTIONAL_IMPORTS.imports dictionary
+
+    Setup:
+    - Uses the built-in 'sys' module as a guaranteed available import
     """
```
133-145
: Enhance docstring to match existing documentation style.The test function would benefit from a more detailed docstring that follows the established pattern in the file.
Apply this diff to improve the docstring:
```diff
 def test_optional_import_already_imported():
     """
-    Tests that requesting the same module a second time
-    does not re-import it but just returns the stored instance.
+    Test caching behavior of optional imports.
+
+    This test verifies that:
+    1. The first import request properly imports the module
+    2. Subsequent requests return the cached instance
+    3. The same object reference is maintained across requests
+
+    Setup:
+    - Uses the built-in 'sys' module for consistent availability
     """
```
148-158
: Enhance docstring to match existing documentation style.The test function would benefit from a more detailed docstring that follows the established pattern in the file.
Apply this diff to improve the docstring:
```diff
 def test_optional_import_failure():
     """
-    Tests that attempting to import a non-existent module
-    raises an ImportError.
+    Test error handling for non-existent optional imports.
+
+    This test verifies that attempting to import a non-existent module:
+    1. Raises an ImportError
+    2. Includes the module name in the error message
+
+    Raises:
+        ImportError: Expected to be raised when importing non-existent module
     """
```
tests/unit/test_system.py (2)
3-3
: Remove unused import.The
patch
import fromunittest.mock
is not used in the code.-from unittest.mock import MagicMock, patch +from unittest.mock import MagicMock🧰 Tools
🪛 Ruff (0.8.2)
3-3:
unittest.mock.patch
imported but unusedRemove unused import:
unittest.mock.patch
(F401)
97-613
: Consider adding more edge cases to strengthen test coverage.While the existing tests are well-implemented and provide good coverage, consider adding the following test cases to make the test suite more robust:
- Test with empty dataloaders
- Test with invalid batch sizes
- Test with malformed data structures
- Test boundary conditions for learning rates
- Test error cases for metrics computation
Would you like me to help generate these additional test cases?
tests/integration/test_cifar.py (1)
28-34
: Update docstring to match the actual parameter type.The docstring describes the
stage
parameter as a string (e.g., "fit", "test", "predict"), but it's actually aStage
enum.

```diff
- stage: The stage to run (e.g., "fit", "test", "predict").
+ stage: The stage to run (e.g., Stage.FIT, Stage.TEST, Stage.PREDICT).
```
tests/unit/test_utils_types_containers.py (3)
1-1
: Remove unused import ‘field’.
dataclasses.field
is imported but never used in this file.Consider removing it with the following diff:
-from dataclasses import dataclass, field, is_dataclass +from dataclasses import dataclass, is_dataclass🧰 Tools
🪛 Ruff (0.8.2)
1-1:
dataclasses.field
imported but unusedRemove unused import:
dataclasses.field
(F401)
4-4
: Remove unused import ‘Metric’.
torchmetrics.Metric
is imported but never used in this file.Consider removing it with the following diff:
```diff
-from torchmetrics import Accuracy, Metric, MetricCollection
+from torchmetrics import Accuracy, MetricCollection
```
🧰 Tools
🪛 Ruff (0.8.2)
4-4:
torchmetrics.Metric
imported but unusedRemove unused import:
torchmetrics.Metric
(F401)
77-138
: Consolidate repeated test patterns for default factories.The tests for
Train
,Val
,Test
, andPredict
each verify identical default factories. Consider extracting common verification logic into a shared helper function to minimize duplication and promote maintainability.lighter/engine/schema.py (1)
1-17
: Add documentation for schema structure.Consider adding docstrings or comments to explain:
- Purpose and usage of
_meta_
and_requires_
fields- Expected structure of
vars
dictionary- Common parameters for each operation mode in
args
```diff
+# Configuration schema for Lighter project
+# _meta_: Metadata about the configuration (version, description, etc.)
+# _requires_: Dependencies or required configurations
+# project: Project name or identifier
+# vars: Variables that can be referenced throughout the configuration
+# args: Operation-specific arguments (fit, validate, test, etc.)
+# trainer: Training-specific configuration (required)
 SCHEMA = {
     "_meta_": {"type": "dict"},
     "_requires_": {"type": ["string", "list", "dict"]},
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
lighter/engine/schema.py
(1 hunks)tests/integration/test_cifar.py
(1 hunks)tests/unit/test_callbacks_freezer.py
(8 hunks)tests/unit/test_system.py
(1 hunks)tests/unit/test_utils_dynamic_imports.py
(1 hunks)tests/unit/test_utils_types_containers.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/unit/test_callbacks_freezer.py
🧰 Additional context used
📓 Learnings (1)
lighter/engine/schema.py (2)
Learnt from: ibro45
PR: project-lighter/lighter#142
File: lighter/engine/schema.py:44-81
Timestamp: 2025-01-30T19:31:55.980Z
Learning: Adapter fields in the configuration schema should not be marked as required, allowing for optional configuration of batch, criterion, metrics, and logging adapters across different modes.
Learnt from: ibro45
PR: project-lighter/lighter#142
File: lighter/engine/schema.py:0-0
Timestamp: 2025-01-27T02:17:12.626Z
Learning: The adapters schema in Lighter requires:
- batch and logging keys for all modes (train, val, test, predict)
- criterion key only for train and val modes
- metrics key for train, val, and test modes but not for predict mode
🪛 Ruff (0.8.2)
tests/unit/test_utils_types_containers.py
1-1: dataclasses.field
imported but unused
Remove unused import: dataclasses.field
(F401)
4-4: torchmetrics.Metric
imported but unused
Remove unused import: torchmetrics.Metric
(F401)
tests/unit/test_system.py
3-3: unittest.mock.patch
imported but unused
Remove unused import: unittest.mock.patch
(F401)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: build (3.12)
- GitHub Check: build (3.11)
- GitHub Check: build (3.10)
🔇 Additional comments (11)
tests/unit/test_system.py (2)
17-33
: LGTM! Well-structured test classes.The test classes (
DummyDataset
,SimpleModel
, andDictLossNoTotal
) are well-implemented and serve their testing purposes effectively.Also applies to: 41-49, 260-267
52-94
: LGTM! Well-designed test fixtures.The fixtures
dummy_dataloaders
andsimple_system
are well-documented, properly set up, and provide reusable test components following pytest best practices.tests/integration/test_cifar.py (3)
5-5
: LGTM! Good use of type-safe enums.The change from string literals to the
Stage
enum improves type safety and IDE support.
35-37
: LGTM! Clean implementation with good error handling.The test implementation is clear and provides helpful feedback on failure.
11-25
: Verify the new configuration file path.The test now uses
example.yaml
instead ofmonai_bundle_prototype.yaml
. Please ensure this file exists and contains the necessary configuration.✅ Verification successful
Configuration file verification successful
The new configuration file
example.yaml
exists and contains all necessary components for training, testing, and prediction stages. No references to the old configuration file remain in the codebase.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Description: Verify the existence and content of the new configuration file

# Check if the file exists
if [ -f "./projects/cifar10/experiments/example.yaml" ]; then
    echo "File exists. Checking content..."
    cat "./projects/cifar10/experiments/example.yaml"
else
    echo "ERROR: Configuration file not found!"
    exit 1
fi

# Check if there are any remaining references to the old file
rg "monai_bundle_prototype.yaml"
```
Length of output: 3256
tests/unit/test_utils_types_containers.py (2)
10-53
: Great job adding tests for the nested decorator.This addresses the previous request to implement tests for the
nested
decorator. The coverage is thorough and verifies initialization, default values, and error handling.
55-66
: Comprehensive validation of metric conversion. It's good that you verify single metrics are automatically converted to `MetricCollection` and remain `None` when appropriate in `Metrics`.
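As a standalone illustration of the conversion being tested — wrapping a single torchmetrics metric in a `MetricCollection` gives it the same dict-style interface as a group of metrics:

```python
import torch
from torchmetrics import Accuracy, MetricCollection

metric = Accuracy(task="multiclass", num_classes=10)
collection = MetricCollection([metric])

preds = torch.randint(0, 10, (4,))
target = torch.randint(0, 10, (4,))
# Results are keyed by metric class name, e.g. {'MulticlassAccuracy': tensor(...)}
print(collection(preds, target))
```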
27-34
: LGTM!The metrics schema correctly:
- Supports both list and dict configurations
- Includes train, val, and test modes
- Excludes predict mode as metrics aren't needed for predictions
35-43
: Reference the documentation task for dataloaders.The dataloader schema is correctly defined but needs documentation. A separate task exists to update
docs/basics/config.md
anddocs/basics/workflows.md
with dataloader configuration examples.
44-74
: LGTM!The adapters schema correctly implements the requirements:
- All adapter fields are optional as specified
- Mode-specific adapter keys are correctly defined:
- train/val: batch, criterion, metrics, logging
- test: batch, metrics, logging
- predict: batch, logging
18-26
: Verify system target class existence. The `_target_` field is required, but there is no validation that the target class actually exists. This could lead to runtime errors if the specified class doesn't exist.
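One lightweight way to close that gap, sketched here as a hypothetical `check_target` helper that is not part of this PR:

```python
import importlib

def check_target(target: str) -> bool:
    """Return True if a dotted _target_ path resolves to an importable attribute."""
    module_path, _, attr = target.rpartition(".")
    try:
        module = importlib.import_module(module_path)
    except ImportError:
        return False
    return hasattr(module, attr)

assert check_target("torch.nn.CrossEntropyLoss")
assert not check_target("torch.nn.DoesNotExist")
```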
This commit includes:
- Adding a comprehensive docstring to the schema module explaining its purpose and structure
- Minor import cleanup in test files
- Updating stage parameter documentation in test_cifar
Actionable comments posted: 1
🧹 Nitpick comments (11)
tests/unit/test_utils_types_containers.py (4)
24-53
: Consider adding more edge cases to the nested decorator tests.While the current test coverage is good, consider adding tests for:
- Deeply nested structures (>2 levels)
- Multiple nested fields in a single class
- Optional nested fields (using Optional[Type])
Example structure:
```python
@nested
@dataclass
class DeepInner:
    value: int

@nested
@dataclass
class MultipleInner:
    deep1: DeepInner
    deep2: Optional[DeepInner]
    value: str

@nested
@dataclass
class MultipleOuter:
    inner1: MultipleInner
    inner2: Optional[MultipleInner]
    name: str = "default"
```
55-75
: Enhance metrics conversion test coverage.Consider adding tests for:
- Multiple metrics in a collection
- Invalid metric types
Example test cases:
```python
def test_metrics_multiple_metrics():
    """Test conversion of multiple metrics to MetricCollection."""
    accuracy = Accuracy(task="binary")
    precision = Precision(task="binary")
    metrics_instance = Metrics(train=[accuracy, precision])
    assert isinstance(metrics_instance.train, MetricCollection)
    assert len(metrics_instance.train) == 2

def test_metrics_invalid_type():
    """Test that invalid metric types raise appropriate errors."""
    with pytest.raises(TypeError):
        Metrics(train="not a metric")
```
77-113
: Reduce code duplication in Train and Val tests.The test cases for Train and Val are nearly identical. Consider using a parametrized test to reduce duplication.
Example refactor:
@pytest.mark.parametrize("factory_class", [Train, Val]) def test_train_val_default_factory(factory_class): """Test that Train/Val dataclasses use correct default factories.""" instance = factory_class() assert isinstance(instance.batch, BatchAdapter) assert instance.batch.input_accessor == 0 assert instance.batch.target_accessor == 1 assert isinstance(instance.criterion, CriterionAdapter) assert instance.criterion.pred_argument == 0 assert instance.criterion.target_argument == 1 assert isinstance(instance.metrics, MetricsAdapter) assert instance.metrics.pred_argument == 0 assert instance.metrics.target_argument == 1 assert isinstance(instance.logging, LoggingAdapter)
130-138
: Enhance Predict adapter input_accessor testing.The current test only verifies that the input_accessor is callable and handles a simple string. Consider testing more complex batch structures.
Example test cases to add:
```python
def test_predict_batch_adapter_complex_inputs():
    """Test that Predict batch adapter handles various input types."""
    predict_instance = Predict()

    # Test with tuple
    assert predict_instance.batch.input_accessor((1, 2)) == (1, 2)

    # Test with dict
    batch_dict = {"input": "data", "meta": "info"}
    assert predict_instance.batch.input_accessor(batch_dict) == batch_dict

    # Test with custom object
    class CustomBatch:
        def __init__(self, data):
            self.data = data

    custom = CustomBatch("test")
    assert predict_instance.batch.input_accessor(custom) == custom
```
tests/integration/test_cifar.py (1)
28-37
: Consider adding more specific assertions.While the basic completion check is good, consider adding stage-specific assertions to verify the expected outcomes:
- For FIT: Check if model weights were updated
- For TEST: Verify metrics were computed
- For PREDICT: Ensure predictions were generated
Example enhancement:
```diff
 def test_trainer_stage(stage: Stage, config: str):
     """
     Test the specified stage using the given configuration.

     Args:
         stage: The stage to run (e.g., Stage.FIT, Stage.TEST, Stage.PREDICT).
         config: Path to the configuration file.
     """
     runner = Runner()
     runner.run(stage, config=f"{config},{test_overrides}")
     assert runner.trainer.state.finished, f"Stage {stage} did not finish successfully."
+    if stage == Stage.FIT:
+        assert runner.trainer.state.model != None, "Model should be updated after training"
+    elif stage == Stage.TEST:
+        assert hasattr(runner.trainer.state, 'metrics'), "Test metrics should be computed"
+    elif stage == Stage.PREDICT:
+        assert hasattr(runner.trainer.state, 'predictions'), "Predictions should be generated"
```
lighter/engine/schema.py (2)
2-13
: Consider enhancing documentation with examples.The documentation clearly explains the schema structure. Consider adding example configurations for each section to improve user understanding.
""" Defines the schema for configuration validation in the Lighter framework. The schema ensures user configurations are correctly structured and typed. It includes: - `_meta_`: Metadata as a dictionary. - `_requires_`: Runs first, primarily to be used for imports. - `project`: Project name as a string. - `vars`: Variables as a dictionary. - `args`: Arguments to pass to Trainer stage methods like `fit`, `validate`, `test`. - `trainer`: Trainer setup. - `system`: System setup, encapsulates model, criterion, optimizer, scheduler, inferer, metrics, dataloaders, and adapters. +Example: +```yaml +_meta_: + name: example-config +project: my-project +vars: + batch_size: 32 +args: + fit: + max_epochs: 10 +trainer: + accelerator: gpu + devices: 1 +system: + _target_: lighter.System + model: + _target_: torchvision.models.resnet18 +``` """
21-31
: Add validation for common args parameters.The args schema could benefit from validation of common parameters like max_epochs, batch_size, etc.
"args": { "type": "dict", "schema": { - "fit": {"type": "dict"}, + "fit": { + "type": "dict", + "schema": { + "max_epochs": {"type": "integer", "min": 1}, + "max_steps": {"type": "integer", "min": -1}, + "limit_train_batches": {"type": ["float", "integer"], "min": 0}, + "limit_val_batches": {"type": ["float", "integer"], "min": 0} + } + }, "validate": {"type": "dict"}, "test": {"type": "dict"}, "predict": {"type": "dict"}, "lr_find": {"type": "dict"}, "scale_batch_size": {"type": "dict"}, }, },tests/unit/test_system.py (4)
17-33
: Add type hints and improve documentation for helper classes.The helper classes would benefit from:
- Type hints for method parameters and return values
- More detailed docstrings explaining the purpose and behavior
- Examples in docstrings showing usage patterns
Apply these improvements:
```diff
 class DummyDataset(Dataset):
-    """Dataset returning (input_tensor, target_int)"""
+    """A dummy dataset for testing purposes.
+
+    Returns tuples of (input_tensor, target_int) when with_target=True,
+    or just input_tensor when with_target=False.
+
+    Args:
+        size (int): Number of samples in the dataset. Defaults to 8.
+        with_target (bool): Whether to include target values. Defaults to True.
+    """

-    def __init__(self, size=8, with_target=True):
+    def __init__(self, size: int = 8, with_target: bool = True) -> None:

-    def __getitem__(self, idx):
+    def __getitem__(self, idx: int) -> Union[Tuple[torch.Tensor, int], torch.Tensor]:

-    def __len__(self):
+    def __len__(self) -> int:
```
Similar improvements for `SimpleModel` and `DictLossNoTotal`.
Also applies to: 41-49, 260-267
52-94
: Enhance test fixtures with error handling and validation.The fixtures could be more robust by:
- Validating input parameters
- Handling edge cases
- Adding cleanup code
Apply these improvements:
```diff
 @pytest.fixture
-def dummy_dataloaders():
+def dummy_dataloaders(request):
     """Provides train/val/test/predict DataLoaders"""
+    # Add validation for batch size
+    batch_size = getattr(request, 'param', {}).get('batch_size', 2)
+    if batch_size < 1:
+        raise ValueError("batch_size must be positive")
+
     return {
-        "train": DataLoader(DummyDataset(size=8, with_target=True), batch_size=2),
+        "train": DataLoader(DummyDataset(size=8, with_target=True), batch_size=batch_size),
         # ... similar changes for other dataloaders
     }

 @pytest.fixture
-def simple_system(dummy_dataloaders):
+def simple_system(dummy_dataloaders, request):
     """Creates a System instance with a mock trainer and mocked log method"""
+    # Add cleanup
+    yield system
+    # Clean up any resources
+    system.trainer = None
+    system._log_stats = None
```
97-188
: Enhance test coverage with parameterized tests.Several test functions could be parameterized to cover more test cases efficiently:
- Learning rate tests
- Optimizer configuration tests
- Error handling tests
Example improvement:
+@pytest.mark.parametrize("learning_rate,expected", [ + (0.01, 0.01), + (0.001, 0.001), + (1e-5, 1e-5), +]) -def test_learning_rate_property(simple_system): +def test_learning_rate_property(simple_system, learning_rate, expected): """Check that learning_rate getter/setter works properly with a single param group.""" - initial_lr = simple_system.learning_rate - assert initial_lr == 0.01 + simple_system.learning_rate = learning_rate + assert simple_system.learning_rate == expectedAlso applies to: 247-316
572-613
: Enhance logging tests with more explicit assertions.The logging tests verify call counts but could be more thorough by checking the actual logged values and their formats.
Example improvement:
```diff
 def test_log_stats_with_logger(simple_system):
     """Test _log_stats with a logger."""
     simple_system._log_stats = System._log_stats.__get__(simple_system)
     simple_system.trainer.logger = MagicMock()
     simple_system.log = MagicMock()

     # Test single loss value
     simple_system.mode = Mode.TRAIN
     simple_system._log_stats(torch.tensor(1.0), None, 0)
-    # Twice (on step, on epoch) for the loss, twice for the SGD optimizer
-    assert simple_system.log.call_count == 4
+    # Verify exact calls with their arguments
+    expected_calls = [
+        call('train/loss', 1.0, on_step=True, on_epoch=True),
+        call('train/loss', 1.0, on_step=False, on_epoch=True),
+        call('train/lr', 0.01, on_step=True, on_epoch=False),
+        call('train/momentum', 0.0, on_step=True, on_epoch=False)
+    ]
+    assert simple_system.log.call_args_list == expected_calls
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
lighter/engine/schema.py
(1 hunks)tests/integration/test_cifar.py
(1 hunks)tests/unit/test_system.py
(1 hunks)tests/unit/test_utils_types_containers.py
(1 hunks)
🧰 Additional context used
📓 Learnings (1)
lighter/engine/schema.py (2)
Learnt from: ibro45
PR: project-lighter/lighter#142
File: lighter/engine/schema.py:44-81
Timestamp: 2025-01-30T19:31:55.980Z
Learning: Adapter fields in the configuration schema should not be marked as required, allowing for optional configuration of batch, criterion, metrics, and logging adapters across different modes.
Learnt from: ibro45
PR: project-lighter/lighter#142
File: lighter/engine/schema.py:0-0
Timestamp: 2025-01-27T02:17:12.626Z
Learning: The adapters schema in Lighter requires:
- batch and logging keys for all modes (train, val, test, predict)
- criterion key only for train and val modes
- metrics key for train, val, and test modes but not for predict mode
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: build (3.12)
🔇 Additional comments (5)
tests/unit/test_utils_types_containers.py (1)
1-22
: LGTM! Well-structured imports and class definitions.The imports are logically organized and the nested dataclass definitions are clean and follow best practices.
tests/integration/test_cifar.py (2)
5-5
: LGTM! Good improvement in type safety.The change from importing
run
to importingRunner
andStage
improves type safety by using an enum instead of strings for stages.
11-24
: Verify the configuration file changes.The test now uses
example.yaml
instead ofmonai_bundle_prototype.yaml
. Please ensure that:
- The new configuration file exists and contains all necessary settings
- Using the same configuration file for all stages (FIT, TEST, PREDICT) is intentional
✅ Verification successful
Configuration changes verified successfully
The new
example.yaml
configuration file exists and contains all necessary stage-specific settings. The reuse of configurations across stages (FIT, TEST, PREDICT) is intentional through YAML references.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Description: Verify the existence and content of the configuration files

# Check if the new config file exists
if [ -f "./projects/cifar10/experiments/example.yaml" ]; then
    echo "New config file exists"
    # Check if it contains necessary stage-specific configurations
    echo "Checking for stage-specific configurations:"
    grep -A 5 "fit\|test\|predict" "./projects/cifar10/experiments/example.yaml"
else
    echo "New config file not found"
fi

# Check if the old config file is still referenced elsewhere
echo "Checking for other references to the old config:"
rg "monai_bundle_prototype\.yaml"
```
Length of output: 884
lighter/engine/schema.py (2)
59-86
: LGTM! Adapters schema aligns with requirements.The adapters schema correctly:
- Makes adapter fields optional (not required)
- Includes mode-specific adapter configurations:
- batch and logging for all modes
- criterion for train and val only
- metrics for train, val, and test only
50-58
: 🛠️ Refactor suggestionAdd validation for PyTorch DataLoader parameters.
The dataloaders schema should validate common PyTorch DataLoader parameters.
"dataloaders": { "type": "dict", "schema": { - "train": {"type": "dict"}, + "train": { + "type": "dict", + "schema": { + "_target_": {"type": "string", "required": True}, + "batch_size": {"type": "integer", "min": 1}, + "shuffle": {"type": "boolean"}, + "num_workers": {"type": "integer", "min": 0}, + "pin_memory": {"type": "boolean"}, + "dataset": {"type": "dict", "required": True} + } + }, "val": {"type": "dict"}, "test": {"type": "dict"}, "predict": {"type": "dict"}, }, },Likely invalid or redundant comment.
```diff
-    # Test batch processing
-    batch = {"input": torch.randn(2, 3, 32, 32), "target": torch.randint(0, 10, (2,))}
-    _ = system._base_step(batch, 0, "train")
-    assert count_calls.calls == 1
-
-
-def test_multiple_param_groups():
-    model = DummyModel()
-    optimizer = Adam([{"params": model.net[0].parameters(), "lr": 0.001}, {"params": model.net[1].parameters(), "lr": 0.002}])
-
-    system = LighterSystem(model=model, batch_size=8, optimizer=optimizer, datasets={"train": DummyDataset()})
-
-    with pytest.raises(ValueError, match="multiple optimizer parameter groups"):
-        _ = system.learning_rate
-
-    with pytest.raises(ValueError, match="multiple optimizer parameter groups"):
-        system.learning_rate = 0.1
-
-
-def test_configure_optimizers_no_optimizer():
-    model = DummyModel()
-    system = LighterSystem(model=model, batch_size=8, criterion=nn.CrossEntropyLoss(), datasets={"train": DummyDataset()})
-
-    with pytest.raises(ValueError, match="Please specify 'system.optimizer' in the config."):
-        system.configure_optimizers()
-
-    model = DummyModel()
-    system = LighterSystem(
+    # Assert that only test hooks are overridden, other hooks remain as default
+    assert system.training_step == super(System, system).training_step
+    assert system.train_dataloader == super(System, system).train_dataloader
+    assert system.on_train_start == super(System, system).on_train_start
+    assert system.on_train_end == super(System, system).on_train_end
+
+    assert system.validation_step == super(System, system).validation_step
+    assert system.val_dataloader == super(System, system).val_dataloader
+    assert system.on_validation_start == super(System, system).on_validation_start
+    assert system.on_validation_end == super(System, system).on_validation_end
+
+    assert system.test_step != super(System, system).test_step
+    assert system.test_dataloader != super(System, system).test_dataloader
+    assert system.on_test_start != super(System, system).on_test_start
+    assert system.on_test_end != super(System, system).on_test_end
+
+    assert system.predict_step == super(System, system).predict_step
+    assert system.predict_dataloader == super(System, system).predict_dataloader
+    assert system.on_predict_start == super(System, system).on_predict_start
+    assert system.on_predict_end == super(System, system).on_predict_end
+
+    # Test case 5: Only predict dataloader is provided
+    model = SimpleModel()
+    optimizer = SGD(model.parameters(), lr=0.01)
+    system = System(
         model=model,
-        batch_size=8,
-        optimizer=Adam(model.parameters()),
-        datasets={"train": DummyDataset()},
-        metrics={"train": Accuracy(task="multiclass", num_classes=10)},
+        dataloaders={"predict": DataLoader(DummyDataset())},
+        optimizer=optimizer,
+        scheduler=None,
+        criterion=nn.CrossEntropyLoss(),
+        metrics=None,
+        adapters=None,
+        inferer=None,
     )
-
-    # Mock trainer with logger
-    system.trainer = Mock()
-    system.trainer.logger = Mock()
-
-    # Test logging loss
-    loss = torch.tensor(1.0)
-    metrics = {"accuracy": torch.tensor(0.8)}
-    system._log_stats(loss, metrics, "train", 0)
-
-    # Test dict loss logging
-    dict_loss = {"total": torch.tensor(1.0), "component": torch.tensor(0.5)}
-    system._log_stats(dict_loss, metrics, "train", 0)
+    # Assert that only predict hooks are overridden, other hooks remain as default
+    assert system.training_step == super(System, system).training_step
+    assert system.train_dataloader == super(System, system).train_dataloader
+    assert system.on_train_start == super(System, system).on_train_start
+    assert system.on_train_end == super(System, system).on_train_end
+
+    assert system.validation_step == super(System, system).validation_step
+    assert system.val_dataloader == super(System, system).val_dataloader
+    assert system.on_validation_start == super(System, system).on_validation_start
+    assert system.on_validation_end == super(System, system).on_validation_end
+
+    assert system.test_step == super(System, system).test_step
+    assert system.test_dataloader == super(System, system).test_dataloader
+    assert system.on_test_start == super(System, system).on_test_start
+    assert system.on_test_end == super(System, system).on_test_end
+
+    assert system.predict_step != super(System, system).predict_step
+    assert system.predict_dataloader != super(System, system).predict_dataloader
+    assert system.on_predict_start != super(System, system).on_predict_start
+    assert system.on_predict_end != super(System, system).on_predict_end
```
🛠️ Refactor suggestion
Refactor dynamic mode hooks tests to reduce code duplication.
The test cases for different dataloader combinations contain significant duplication. Consider refactoring using parameterization and helper functions.
Example refactoring:
```diff
+def create_system_with_dataloaders(dataloader_keys):
+    """Helper function to create system with specific dataloaders."""
+    model = SimpleModel()
+    optimizer = SGD(model.parameters(), lr=0.01)
+    dataloaders = {key: DataLoader(DummyDataset()) for key in dataloader_keys}
+    return System(
+        model=model,
+        dataloaders=dataloaders,
+        optimizer=optimizer,
+        scheduler=None,
+        criterion=nn.CrossEntropyLoss(),
+        metrics=None,
+        adapters=None,
+        inferer=None,
+    )
+
+@pytest.mark.parametrize("dataloader_keys,expected_overridden", [
+    (["train", "val", "test", "predict"], ["train", "val", "test", "predict"]),
+    (["train"], ["train"]),
+    (["val"], ["val"]),
+    (["test"], ["test"]),
+    (["predict"], ["predict"]),
+])
+def test_dynamic_mode_hooks(dataloader_keys, expected_overridden):
+    """Test the dynamic attachment of mode-specific hooks."""
+    system = create_system_with_dataloaders(dataloader_keys)
+
+    # Helper to check if hooks are overridden; getattr is required here,
+    # since attribute names cannot be written as f-strings on an object.
+    def check_hooks_override(mode):
+        is_overridden = mode in expected_overridden
+        base = super(System, system)
+        for attr in (f"{mode}_step", f"{mode}_dataloader", f"on_{mode}_start", f"on_{mode}_end"):
+            assert (getattr(system, attr) != getattr(base, attr)) == is_overridden
+
+    for mode in ["train", "val", "test", "predict"]:
+        check_hooks_override(mode)
```
Committable suggestion skipped: line range outside the PR's diff.
Description

- No more `datasets`, now there's simply `dataloaders`. Keys like `sampler` or `batch_size` removed from root `system` (see the sketch below).
- `lighter fit` won't instantiate test or predict dataloaders, etc. parts of the config.
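For illustration, a minimal sketch of the new shape — `System` and its per-mode `dataloaders` argument are taken from the tests in this PR, while the dataset here is a stand-in:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,)))

# Dataloaders are now built directly, per mode, instead of passing datasets
# plus a root-level batch_size/sampler to the system.
dataloaders = {
    "train": DataLoader(dataset, batch_size=16, num_workers=2),
    "val": DataLoader(dataset, batch_size=16, num_workers=2),
}
```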
Related Issue

- #3 - using `_disabled_` has issues, it is better to filter out the unused content from config after resolving it and before instantiating it
- #60
- #129
Type of Change

Checklist

- I have read the `CODE_OF_CONDUCT.md` document.
- I have read the `CONTRIBUTING.md` guide.
- I have run `make codestyle`.
Release Notes
New Features
Improvements
LighterSystem
toSystem
.Breaking Changes
LighterSystem
toSystem
.LighterFreezer
,LighterFileWriter
, andLighterTableWriter
.Documentation
Bug Fixes
Chores
uv
instead ofpoetry
.