UN-3011 [FEAT] Limit maximum file execution count with file history viewer #1676
Conversation
…iewer

- Add file history management UI with bulk operations
- Add execution count tracking and limits
- Add permissions for workflow-scoped access

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Summary by CodeRabbit

Release Notes

Walkthrough

Adds execution-count tracking and limits for file processing: new config, DB fields/indexes, atomic file-history creation with race handling, APIs/UI to view/manage histories, permission checks, worker logic to honor limits, and caching changes to only reuse COMPLETED results.

Changes
Sequence Diagram(s)

sequenceDiagram
participant FE as Frontend
participant API as Django API
participant Helper as FileHistoryHelper
participant DB as Postgres
participant Cache as Redis
FE->>API: POST /workflow/{id}/file-histories/ (create)
API->>Helper: create_file_history(workflow, file_hash, status, result, metadata, error, is_api)
Helper->>DB: SELECT FileHistory WHERE workflow & file_hash
alt exists
Helper->>DB: UPDATE execution_count = F('execution_count') + 1, set status/result/error
DB-->>Helper: updated record
Helper->>Cache: invalidate per-file cache
Helper-->>API: return updated FileHistory (execution_count++)
else not exists
Helper->>DB: INSERT FileHistory (execution_count=1)
alt IntegrityError (race)
Helper->>DB: SELECT & UPDATE execution_count = F('execution_count') + 1
Helper->>Cache: invalidate per-file cache
Helper-->>API: return existing+incremented FileHistory
else success
Helper-->>API: return new FileHistory (execution_count=1)
end
end
API-->>FE: Response {file_history_id, execution_count, ...}
sequenceDiagram
participant Worker as Worker
participant Pipeline as filter_pipeline
participant API as Django API
participant DB as Postgres
Worker->>Pipeline: evaluate file for processing
Pipeline->>API: request file history info (status, execution_count, max_execution_count)
API->>DB: fetch FileHistory and Workflow.max_file_execution_count
DB-->>API: return file history + workflow limit
API-->>Pipeline: return history data
alt has_exceeded_limit true OR execution_count >= max
Pipeline-->>Worker: SKIP (limit reached)
else status allows reprocess
Pipeline-->>Worker: ACCEPT for processing
else status indicates skip
Pipeline->>Pipeline: compare history_path vs current_path
alt same path
Pipeline-->>Worker: SKIP (already processed)
else
Pipeline-->>Worker: ACCEPT (new path)
end
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
workers/shared/processing/filter_pipeline.py (1)
388-445: Execution-count check may over-enforce limits (especially for API workflows) and is inconsistent with the individual path

The new logic in `_evaluate_batch_file_history`:

- Unconditionally skips when `execution_count >= max_execution_count`, before considering status or path.
- Does not use the backend's `has_exceeded_limit` flag, even though `FileHistorySerializer` now exposes it.
- Has no counterpart in `_evaluate_file_history`, so behavior differs between the batch (`get_files_history_batch`) and individual (`get_file_history`) flows.

Two concrete risks:

1. API workflows: on the backend, `FileHistory.has_exceeded_limit()` explicitly returns `False` for API deployments, so limits are not meant to be enforced for them. However, the worker's batch filter now enforces a hard cap purely from `execution_count` / `max_execution_count`, so API-type workflows could start being skipped once the numeric threshold is reached, contradicting backend semantics.
2. Inconsistent semantics between batch and fallback paths: if the batch API fails and `_process_file_history_individual` is used, `_evaluate_file_history` never applies the execution-count check, so the max-count limit "disappears" in that mode.

A more robust approach would be to prefer the backend's boolean flag when present, and only fall back to raw counts for backward compatibility:

```diff
-        execution_count = file_history.get("execution_count", 0)
-        max_execution_count = file_history.get("max_execution_count", 3)
+        has_exceeded_limit = file_history.get("has_exceeded_limit")
+        execution_count = file_history.get("execution_count", 0)
+        max_execution_count = file_history.get("max_execution_count")
         ...
-        if execution_count >= max_execution_count:
+        if has_exceeded_limit is True:
+            logger.warning(
+                f"FileHistoryFilter - {file_hash.file_name}: SKIP "
+                f"(backend reports has_exceeded_limit=True)"
+            )
+            return True
+        if (
+            has_exceeded_limit is None
+            and max_execution_count is not None
+            and execution_count >= max_execution_count
+        ):
             logger.warning(
                 f"FileHistoryFilter - {file_hash.file_name}: SKIP "
                 f"(max execution count reached: {execution_count}/{max_execution_count})"
             )
             return True
```

Also align `_evaluate_file_history` (individual path) to apply the same `has_exceeded_limit` flag once `get_file_history_internal` starts returning it, so both code paths enforce limits identically.

This keeps the worker honoring backend rules (including "no limit for API workflows") and avoids surprising discrepancies between batch and fallback behavior.
workers/shared/workflow/execution/service.py (1)
1448-1496: Reset `_last_execution_error` per execution to avoid stale ERROR statuses in file history.

The new logic for:

- deriving `file_status` / `error_message` from `processing_error` or `_last_execution_error`, and
- always creating file history entries (with correct COMPLETED/ERROR) via `_should_create_file_history`

is conceptually correct and matches the intended execution-count tracking.

However, `_last_execution_error` is only initialized in `__init__` and is never explicitly reset at the beginning of `execute_workflow_for_file`. If this service instance is reused across files, a previous failure can leave `_last_execution_error` populated and cause:

- `file_status` to be set to `ERROR` even when the current run succeeded, and
- `error_message` / the final `FinalOutputResult.error` to contain an old error.

That now directly affects stored `FileHistory.status` and `error`.

Consider explicitly clearing `_last_execution_error` at the start of `execute_workflow_for_file` (or just before starting a new workflow execution), for example:

```diff
     def execute_workflow_for_file(
         self,
         file_processing_context: FileProcessingContext,
         organization_id: str,
         workflow_id: str,
         execution_id: str,
         is_api: bool = False,
         use_file_history: bool = False,
         workflow_file_execution_id: str = None,
         workflow_logger=None,
     ) -> WorkflowExecutionResult:
-        """Execute workflow with clean, linear flow and comprehensive result propagation."""
+        """Execute workflow with clean, linear flow and comprehensive result propagation."""
+        # Reset last execution error for this run to avoid leaking state between files
+        self._last_execution_error = None
```

This keeps your new status/error propagation for file history accurate per run.

Also applies to: 1511-1530
🧹 Nitpick comments (13)
backend/workflow_manager/internal_views.py (2)
505-506: Consider downgrading the extra status log to debug to reduce noise

You now log status updates at line 473, line 505, and line 517. The new line 505 log is redundant at INFO level; consider switching it to DEBUG to avoid triple INFO logs per status change:

```diff
-logger.info(f"Updating status for execution {id} to {status_enum}")
+logger.debug(f"Updating status for execution {id} to {status_enum}")
```

2309-2352: File history batch/create responses now carry execution limits; update docstrings for accuracy

The additions to `FileHistoryBatchCheckView` and `FileHistoryCreateView` correctly:

- Fetch `workflow.get_max_execution_count()` once and attach it per record and at the top level.
- Expose `execution_count` and `status` alongside existing fields.
- Return `execution_count` from the create endpoint, which is useful for worker-side decisions.

The only follow-up is to update the docstrings/examples for these endpoints so they describe the enriched response shapes (including `execution_count`, `status`, `max_execution_count`) to keep internal API documentation accurate.

Also applies to: 2483-2494
frontend/src/components/pipelines-or-deployments/pipelines/Pipelines.jsx (1)
474-502: View File History action and modal wiring look good; consider small cleanups

Functionally this integration is solid: you guard on `selectedPorD?.workflow_id` before opening the modal and surface a clear error if it's missing, and the modal gets the right identifiers.

Two small polish suggestions:

- Drop or downgrade the `console.log("Opening File History for:", …)` in the action handler to avoid noisy logs in production.
- `FileHistoryModal`'s `workflowName` prop is currently fed `selectedPorD?.pipeline_name`, so the title "File History - …" will show the pipeline name, not the underlying workflow name. If the intent is to show the workflow label, consider passing `selectedPorD?.workflow_name` instead, or rename the prop to something like `title`/`pipelineName` for clarity.

Also applies to: 785-790
backend/workflow_manager/workflow_v2/views.py (1)
1073-1094: Avoid re-serializing the same `FileHistory` for multiple identifiers

Inside the `for result_identifier in matched_identifiers` loop you build `FileHistorySerializer(fh)` and `serializer.data` on every iteration, even though `fh` is the same for all identifiers. This is minor but unnecessary work.

You can serialize once per `fh` and reuse the dict:

```diff
-    for result_identifier in matched_identifiers:
-        is_completed_result = fh.is_completed()
-        serializer = FileHistorySerializer(fh)
-        file_history_data = serializer.data
+    is_completed_result = fh.is_completed()
+    serializer = FileHistorySerializer(fh)
+    file_history_data = serializer.data
+
+    for result_identifier in matched_identifiers:
         ...
         response_data[result_identifier] = {
             "found": True,
             "is_completed": is_completed_result,
             "file_history": file_history_data,
         }
```

This keeps the new logging/fields but avoids redundant serialization work.
backend/workflow_manager/workflow_v2/permissions.py (1)
20-45: Make the permission more robust to kwarg naming and unauthenticated users

`has_permission` currently assumes that all views using this permission expose the workflow id as `view.kwargs["workflow_id"]`. If any route uses a different kwarg (e.g. `pk` or `workflow_pk`), this will silently deny access for all requests.

Also, `user` is used without an explicit `is_authenticated` guard; if this permission is ever applied without an authentication gate, accessing `user.organization` could be fragile.

Consider:

- Either asserting/standardizing that all relevant routes use `workflow_id`, or reading from a small set of accepted keys:

```python
workflow_id = (
    view.kwargs.get("workflow_id")
    or view.kwargs.get("workflow_pk")
    or view.kwargs.get("pk")
)
if not workflow_id:
    return False
```

- Optionally short-circuiting unauthenticated users:

```python
user = request.user
if not getattr(user, "is_authenticated", False):
    return False
```

This keeps the caching behavior but makes the permission less sensitive to routing details.
backend/workflow_manager/workflow_v2/models/file_history.py (1)
23-45: Execution limit modelling looks consistent; minor optional tightening

The new `__str__`, `execution_count` field, and `has_exceeded_limit` logic line up well with the workflow-level `get_max_execution_count()` and the "no limit for API workflows" rule.

If you want to tighten the model further, consider:

- Using `PositiveIntegerField` (or at least `validators=[MinValueValidator(0)]`) for `execution_count` to prevent negative values.
- Documenting explicitly in the docstring whether `execution_count` means "completed runs so far" vs "including the current in-flight run", to keep helpers and workers aligned on when they increment.

Functionally, though, this change looks sound.

Also applies to: 65-68
backend/workflow_manager/workflow_v2/serializers.py (1)
133-162: Serializer methods are fine; watch for N+1 on `workflow` access

The added `max_execution_count` / `has_exceeded_limit` methods correctly defer to `Workflow.get_max_execution_count()` and `FileHistory.has_exceeded_limit()`.

One thing to double-check is that any views returning lists of `FileHistory` use `select_related("workflow")` so these per-row method calls don't introduce an N+1 query pattern when serializing many histories. In `file_history_batch_lookup_internal` you already do this; mirroring that in other entry points (a minimal sketch follows below) would keep performance predictable.
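A minimal sketch of the pattern in question, assuming a standard Django `FileHistory` model with a `workflow` foreign key (the function name here is illustrative, not from the PR):

```python
# Hypothetical illustration of avoiding N+1 when serializing FileHistory rows.
# Without select_related, every serializer call that touches fh.workflow
# (e.g. to compute max_execution_count) issues one extra query per row.
from workflow_manager.workflow_v2.models.file_history import FileHistory


def histories_for_workflow(workflow_id: str):
    # One JOIN up front instead of one query per serialized row.
    return (
        FileHistory.objects.filter(workflow_id=workflow_id)
        .select_related("workflow")
        .order_by("-created_at")
    )
```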
backend/workflow_manager/workflow_v2/file_history_helper.py (1)

271-359: Execution count increment + race handling look correct; consider tightening the try/except and logging.

The new flow for existing histories (atomic `F("execution_count") + 1`) and the IntegrityError race-resolution path both look solid for concurrent workers and align with the unique constraints on `FileHistory`.

Two non-blocking refinements:

- Narrow the `try` block around `FileHistory.objects.create(**create_data)` so that only the actual create is treated as a "race condition" source; this avoids accidentally classifying unrelated `IntegrityError`s as duplicate races.
- In the rare `existing_record is None` branch, prefer `logger.exception(...)` to retain the original traceback before re-raising, and optionally use `f"... Error: {e!s}"` instead of `str(e)` for cleaner formatting.

These are style/diagnostics improvements only; behavior is otherwise correct. A sketch of the overall pattern follows below.
frontend/src/components/pipelines-or-deployments/file-history-modal/FileHistoryModal.jsx (3)
68-137: Unify error handling for fetching histories with `useExceptionHandler` and drop console noise.

The fetch logic and query param construction look correct, but two small cleanups will make this more consistent with the rest of the app:

- Instead of manually unpacking `err.response.data` and building `errorMessage`, reuse `useExceptionHandler` so all errors (validation, subscription, network) are handled uniformly:

```diff
-    } catch (err) {
-      console.error("Error fetching file histories:", err);
-      console.error("Error details:", {
-        message: err?.message,
-        response: err?.response,
-        status: err?.response?.status,
-        data: err?.response?.data,
-      });
-
-      // Extract more detailed error message
-      const errorMessage =
-        err?.response?.data?.message ||
-        err?.response?.data?.error ||
-        err?.response?.data?.detail ||
-        err?.message ||
-        "Failed to fetch file histories";
-
-      setAlertDetails({
-        type: "error",
-        content: errorMessage,
-      });
-    } finally {
+    } catch (err) {
+      setAlertDetails(
+        handleException(err, "Failed to fetch file histories")
+      );
+    } finally {
```

- Remove or gate the `console.log` calls used for debugging (`"Fetching file histories:"`, `"File histories response:"`) so they don't spam production logs.

These are behavioral no-ops but improve consistency and cleanliness.

153-167: Filter reset comment is misleading; the logic relies on the explicit refetch, not an effect.

The reset handler works by clearing filter state and then explicitly refetching:

```js
setStatusFilter([]);
setExecutionCountMin(null);
setExecutionCountMax(null);
// Reset will trigger effect to fetch unfiltered data
setTimeout(() => {
  fetchFileHistories(1, pagination.pageSize);
}, 0);
```

The comment suggests that a React effect will fire on filter change, but `useEffect` currently only depends on `[open, workflowId]`. Consider updating the comment (or just removing the `setTimeout` and calling `fetchFileHistories` directly) to avoid confusion for future maintainers.

169-211: Batch delete behavior is correct but could be slightly more informative on partial failures.

The `handleDeleteSelected` implementation correctly:

- Short-circuits when nothing is selected.
- Deletes all selected IDs via `Promise.all`.
- Refreshes and clears selection, with a success alert.

If you want to harden it further, consider catching per-ID failures and reporting how many deletions actually succeeded vs. failed instead of treating the whole batch as a single success/failure. Not required for correctness, but it improves UX for large selections.
backend/workflow_manager/workflow_v2/file_history_views.py (2)
19-25: Consider adding type annotations for class attributes.

While the current implementation works correctly, adding `ClassVar` annotations would improve type safety and make the intent clearer.

Apply this diff to add type annotations:

```diff
+from typing import ClassVar
+
 class FileHistoryViewSet(viewsets.ReadOnlyModelViewSet):
     """ViewSet for file history operations with filtering support."""

-    serializer_class = FileHistorySerializer
-    lookup_field = "id"
-    permission_classes = [IsAuthenticated, IsWorkflowOwnerOrShared]
-    pagination_class = CustomPagination
+    serializer_class: ClassVar = FileHistorySerializer
+    lookup_field: ClassVar[str] = "id"
+    permission_classes: ClassVar = [IsAuthenticated, IsWorkflowOwnerOrShared]
+    pagination_class: ClassVar = CustomPagination
```

55-69: Consider using 204 No Content for DELETE operations.

The method returns 200 OK with a message body. While functional, RESTful conventions typically use 204 No Content for successful DELETE operations with no response body.

Apply this diff to use the standard REST response:

```diff
 return Response(
-    {"message": "File history deleted successfully"}, status=status.HTTP_200_OK
+    status=status.HTTP_204_NO_CONTENT
 )
```

Note: The static analysis warnings about unused `request` and `id` parameters are false positives. These parameters are required by DRF's routing, and `self.get_object()` uses them internally.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to Reviews > Disable Cache setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (19)
- backend/backend/settings/base.py (1 hunks)
- backend/sample.env (1 hunks)
- backend/workflow_manager/internal_views.py (5 hunks)
- backend/workflow_manager/workflow_v2/file_history_helper.py (5 hunks)
- backend/workflow_manager/workflow_v2/file_history_views.py (1 hunks)
- backend/workflow_manager/workflow_v2/migrations/0018_filehistory_execution_count_and_more.py (1 hunks)
- backend/workflow_manager/workflow_v2/models/file_history.py (2 hunks)
- backend/workflow_manager/workflow_v2/models/workflow.py (3 hunks)
- backend/workflow_manager/workflow_v2/permissions.py (1 hunks)
- backend/workflow_manager/workflow_v2/serializers.py (1 hunks)
- backend/workflow_manager/workflow_v2/urls/workflow.py (3 hunks)
- backend/workflow_manager/workflow_v2/views.py (1 hunks)
- frontend/src/components/pipelines-or-deployments/file-history-modal/FileHistoryModal.css (1 hunks)
- frontend/src/components/pipelines-or-deployments/file-history-modal/FileHistoryModal.jsx (1 hunks)
- frontend/src/components/pipelines-or-deployments/pipelines/Pipelines.jsx (5 hunks)
- frontend/src/components/workflows/workflow/workflow-service.js (1 hunks)
- unstract/core/src/unstract/core/data_models.py (2 hunks)
- workers/shared/processing/filter_pipeline.py (1 hunks)
- workers/shared/workflow/execution/service.py (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (12)
frontend/src/components/pipelines-or-deployments/file-history-modal/FileHistoryModal.jsx (1)
- frontend/src/hooks/useExceptionHandler.jsx (1): useExceptionHandler (4-127)

backend/workflow_manager/workflow_v2/views.py (1)
- backend/workflow_manager/workflow_v2/serializers.py (1): FileHistorySerializer (133-161)

backend/workflow_manager/workflow_v2/serializers.py (2)
- backend/workflow_manager/workflow_v2/models/file_history.py (1): has_exceeded_limit (27-44)
- backend/workflow_manager/workflow_v2/models/workflow.py (1): get_max_execution_count (107-126)

backend/workflow_manager/workflow_v2/file_history_helper.py (1)
- backend/workflow_manager/workflow_v2/models/file_history.py (2): FileHistory (14-133), update (115-133)

workers/shared/processing/filter_pipeline.py (1)
- unstract/core/src/unstract/core/data_models.py (1): ExecutionStatus (336-369)

backend/workflow_manager/workflow_v2/permissions.py (2)
- backend/workflow_manager/workflow_v2/models/workflow.py (1): Workflow (33-137)
- backend/workflow_manager/internal_views.py (6): get (628-688), get (1188-1298), get (1603-1656), get (1851-1948), get (2159-2242), get (2513-2616)

backend/workflow_manager/workflow_v2/models/file_history.py (2)
- backend/workflow_manager/workflow_v2/models/workflow.py (3): Workflow (33-137), WorkflowType (34-39), get_max_execution_count (107-126)
- backend/workflow_manager/workflow_v2/serializers.py (1): get_max_execution_count (141-150)

backend/workflow_manager/workflow_v2/models/workflow.py (1)
- backend/workflow_manager/workflow_v2/serializers.py (1): get_max_execution_count (141-150)

backend/workflow_manager/internal_views.py (2)
- backend/workflow_manager/workflow_v2/models/workflow.py (1): get_max_execution_count (107-126)
- backend/workflow_manager/workflow_v2/serializers.py (1): get_max_execution_count (141-150)

frontend/src/components/pipelines-or-deployments/pipelines/Pipelines.jsx (1)
- frontend/src/components/pipelines-or-deployments/file-history-modal/FileHistoryModal.jsx (1): FileHistoryModal (32-533)

backend/workflow_manager/workflow_v2/urls/workflow.py (1)
- backend/workflow_manager/workflow_v2/file_history_views.py (1): FileHistoryViewSet (19-135)

workers/shared/workflow/execution/service.py (2)
- unstract/core/src/unstract/core/data_models.py (1): ExecutionStatus (336-369)
- unstract/core/src/unstract/core/worker_models.py (3): has_error (302-304), has_error (760-762), has_error (820-822)
🪛 GitHub Check: SonarCloud Code Analysis
backend/workflow_manager/workflow_v2/models/workflow.py
[warning] 119-119: Complete the task associated to this "TODO" comment.
🪛 Ruff (0.14.5)
backend/workflow_manager/workflow_v2/file_history_views.py
24-24: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
55-55: Unused method argument: request
(ARG002)
55-55: Unused method argument: id
(ARG002)
122-122: Do not catch blind exception: Exception
(BLE001)
134-134: Do not catch blind exception: Exception
(BLE001)
backend/workflow_manager/workflow_v2/file_history_helper.py
319-319: Consider moving this statement to an else block
(TRY300)
327-327: Use explicit conversion flag
Replace with conversion flag
(RUF010)
355-358: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backend/workflow_manager/workflow_v2/migrations/0018_filehistory_execution_count_and_more.py
7-9: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
11-42: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (11)
backend/backend/settings/base.py (1)
203-204: Env-backed MAX_FILE_EXECUTION_COUNT constant is wired correctly

Reading `MAX_FILE_EXECUTION_COUNT` from the environment with an integer default of 3 is consistent with other settings and aligns with the new sample.env entry; no issues from a configuration or type perspective.

unstract/core/src/unstract/core/data_models.py (1)

1178-1184: FileHistoryData.execution_count wiring matches backend defaults

Adding `execution_count` with a default of `1` and mapping it in `from_dict` keeps this shared DTO in sync with the Django migration default and the helper logic that increments counts. Missing values will sensibly fall back to `1`, so this remains backward compatible (a small illustrative sketch follows below).

Also applies to: 1203-1212
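A minimal sketch of the DTO pattern described above, assuming a plain dataclass (the real `FileHistoryData` lives in `unstract/core/src/unstract/core/data_models.py`; every field here other than `execution_count` is illustrative):

```python
from dataclasses import dataclass


@dataclass
class FileHistoryData:
    file_name: str               # illustrative field
    status: str = ""             # illustrative field
    execution_count: int = 1     # matches the backend's default

    @classmethod
    def from_dict(cls, data: dict) -> "FileHistoryData":
        return cls(
            file_name=data.get("file_name", ""),
            status=data.get("status", ""),
            # Older backends may omit execution_count; fall back to 1 so the
            # DTO stays backward compatible with pre-limit responses.
            execution_count=data.get("execution_count", 1),
        )
```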
backend/sample.env (1)
220-223: Sample MAX_FILE_EXECUTION_COUNT env is consistent with backend configuration

The new `MAX_FILE_EXECUTION_COUNT` entry is clearly documented (default 3, ETL/TASK scope) and matches the Django setting name and default; nothing else needed.

backend/workflow_manager/workflow_v2/migrations/0018_filehistory_execution_count_and_more.py (1)

1-42: Migration correctly adds execution_count/max_file_execution_count and supporting indexes

The new fields and defaults line up with the model and shared DTO changes, and the indexes on `(workflow, status)` and `(workflow, execution_count)` are appropriate for the expected lookup patterns. This migration looks ready to ship as-is.

frontend/src/components/pipelines-or-deployments/file-history-modal/FileHistoryModal.css (1)
1-90: FileHistoryModal styles are well-scoped and responsive

CSS is correctly namespaced under `.file-history-modal`, with sensible spacing and responsive tweaks for mobile/tablet/desktop breakpoints; there are no apparent clashes or layout issues.

frontend/src/components/workflows/workflow/workflow-service.js (1)
71-100: File history service endpoints are consistent with backend routes and CSRF usage.

The three helpers correctly map to the new DRF endpoints, pass query params for listing, and send CSRF-protected DELETE/POST requests for mutating operations. Shapes (`params` vs `data`) also align with the described backend filters. No changes needed here.

backend/workflow_manager/workflow_v2/urls/workflow.py (1)
29-33: File history URL wiring matches the viewset and frontend usage.

The new `FileHistoryViewSet` bindings and nested routes (`workflow_id` + `id`) align with the viewset's `lookup_field="id"` and the frontend service paths. Naming and scoping look good; I don't see any routing or parameter issues.

Also applies to: 79-94
frontend/src/components/pipelines-or-deployments/file-history-modal/FileHistoryModal.jsx (2)
260-383: Table columns and execution limit display align with backend fields.

The table configuration correctly:

- Uses `id` as `rowKey`.
- Renders `execution_count` alongside `max_execution_count` and `has_exceeded_limit`, highlighting exceeded entries in red.
- Maps known statuses to AntD tag colors and falls back gracefully for unknown ones.
- Provides copy-to-clipboard for file paths and error messages with proper event stopping.

This matches the serializer contract (`execution_count`, `max_execution_count`, `has_exceeded_limit`, `modified_at`, `error`) and looks good.
384-532: Modal behavior and actions are coherent with the workflow service and UX.

The modal wiring (open/close, filters, bulk actions, responsive layout, scrollable table) works cohesively:

- Disables destructive actions when no rows are selected.
- Correctly calls `bulkClearFileHistories` with the same filters used for listing.
- Keeps pagination and selection in sync with refreshes.
- Uses PropTypes to document expected props.

No blocking issues here; behavior is consistent with the introduced backend APIs.
backend/workflow_manager/workflow_v2/models/workflow.py (2)
4-4: LGTM!

The import is necessary for accessing `settings.MAX_FILE_EXECUTION_COUNT` in the `get_max_execution_count()` method.

107-127: No action required.

`settings.MAX_FILE_EXECUTION_COUNT` is properly defined in `backend/backend/settings/base.py:204` with a sensible default value of 3. The method is safe and will not raise an `AttributeError` at runtime. (A hedged sketch of the fallback pattern follows below.)
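Based on the descriptions in this PR, the method presumably looks something like the following sketch — the workflow-level override and global-settings fallback come from the PR description, but the exact body is an assumption:

```python
from django.conf import settings
from django.db import models


class Workflow(models.Model):  # trimmed to the parts relevant here
    max_file_execution_count = models.IntegerField(null=True, blank=True)

    def get_max_execution_count(self) -> int:
        """Resolve the per-file execution limit.

        Hierarchy (per the PR description): workflow-level override first,
        then the global MAX_FILE_EXECUTION_COUNT setting (default 3).
        Org-level configuration is noted as a future expansion.
        """
        if self.max_file_execution_count is not None:
            return self.max_file_execution_count
        return settings.MAX_FILE_EXECUTION_COUNT
```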
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
workers/api-deployment/tasks.py (1)
1128-1133: Fix inconsistent return type when no file hashes are available

Here the function returns only `hash_values_of_files`, but the signature and all other exit paths return a `(files_to_process, cached_results)` tuple, and the caller unpacks two values. This will raise a `ValueError: not enough values to unpack` whenever this branch is hit (e.g., when no hashes are present but `use_file_history` is enabled).

Recommend returning an empty cached-results dict for consistency:

```diff
-    logger.info(
-        f"No file hashes available for history check in execution {execution_id}, processing all files"
-    )
-    return hash_values_of_files
+    logger.info(
+        f"No file hashes available for history check in execution {execution_id}, processing all files"
+    )
+    return hash_values_of_files, {}
```
🧹 Nitpick comments (7)
workers/api-deployment/tasks.py (1)

1168-1195: Status-gated caching logic is solid; consider small cleanups and confirm limit semantics

The new logic only treats histories with `status == ExecutionStatus.COMPLETED.value` as cacheable and routes ERROR/STOPPED/other statuses back into `files_to_process`, which aligns with the docstring ("successful results are returned as cached") and avoids reusing failed executions. That behavior looks correct.

Two follow-ups:

- `skipped_files` is populated but never used; either log a summary with it or remove it to avoid confusion.
- With the new global max-execution-count feature, please confirm that `check_file_history_batch` already encodes "limit exceeded" behavior appropriately (e.g., by not including those hashes in `processed_file_hashes` or by adjusting `status`/flags) so this function doesn't accidentally keep re-queuing files that should be hard-stopped by the limit.

backend/workflow_manager/workflow_v2/file_history_helper.py (1)

306-404: Race-safe increment path looks good; the TRY300 linter suggestion is optional

The new `create_file_history` flow correctly:

- Resolves `file_path` based on `is_api`.
- Reuses existing histories via `_increment_file_history` with `F("execution_count") + 1`.
- Handles uniqueness races by catching `IntegrityError`, refetching, and incrementing using the same atomic helper.

Behaviorally this gives you N increments under N concurrent creators without lost updates.

Ruff's TRY300 hint about using a `try`/`except`/`else` here is stylistic; if you want to silence it, you can move the "happy path" logging/return into an `else:` block, but that's not functionally required.

frontend/src/components/agency/agency/Agency.jsx (1)

45-46: File history modal integration looks consistent; consider guarding on a missing workflow id

The wiring for `FileHistoryModal` (state, menu item, click handler, and render) is consistent with the Pipelines integration and respects existing loading/clearing disables.

One small optional hardening: if there's any path where `details?.id` might be unset when the page becomes interactive, you could either:

- Disable the "View File History" menu item until `details?.id` is available, or
- Early-return in `handleMenuClick` when `!details?.id` before opening the modal.

If the store invariant guarantees `details.id` is always present post-load, this is fine as is.

Also applies to: 100-101, 1007-1030, 1333-1339
details.idis always present post‑load, this is fine as is.Also applies to: 100-101, 1007-1030, 1333-1339
workers/shared/processing/files/processor.py (1)
102-117: Only caching COMPLETED histories is a good safety gate; inner import can be removedLimiting cache reuse to
status == ExecutionStatus.COMPLETED.valueis a sensible tightening: it prevents serving partial, pending, or error results from history and cleanly falls back to fresh execution otherwise.Minor cleanup: this function re-imports
ExecutionStatuseven though it’s already imported at the top of the module. You can safely drop the in-function import and rely on the module-level one to avoid redundant imports:- status = file_history_data.get("status") - - from unstract.core.data_models import ExecutionStatus - - if status != ExecutionStatus.COMPLETED.value: + status = file_history_data.get("status") + if status != ExecutionStatus.COMPLETED.value: ...workers/shared/processing/filter_pipeline.py (1)
393-423: Execution-count limit integration is correct; note the fallback default coupling

The new `_evaluate_batch_file_history` logic looks consistent with the backend model:

- It prefers the backend-computed `has_exceeded_limit` flag, so workflow-type nuances (API vs ETL/TASK) stay centralized in the backend.
- When `has_exceeded_limit` is `None`, it falls back to a raw `execution_count >= max_execution_count` check, which is a sensible compatibility path.
- Status/path handling remains as before: only `EXECUTING`/`PENDING`/`COMPLETED` statuses participate in the "skip if same path" rule; other statuses (e.g., `ERROR`) allow reprocessing until the limit is hit.

Two minor considerations:

1. Fallback default of 3: the local default `max_execution_count = file_history.get("max_execution_count", 3)` is only used when older backends don't send `max_execution_count` or `has_exceeded_limit`, but it does implicitly couple worker behavior to the historical global default. If that default ever changes for legacy servers, remember to update this fallback to stay consistent.
2. Clarify intent in comments: the comments already mention backend precedence; you might add a brief note that completed runs are still skipped based on status/path, and that the limit primarily caps repeated non-terminal/error runs.

Functionally, the limit logic and logging look solid; a condensed sketch of the precedence follows below.

Also applies to: 428-459
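For clarity, here is a condensed sketch of the precedence the review describes, assuming the worker receives a per-file `file_history` dict from the batch API (variable names follow the review excerpts; the surrounding filter plumbing is omitted):

```python
def should_skip_for_limit(file_history: dict) -> bool:
    """Prefer the backend's verdict; fall back to raw counts for old servers."""
    has_exceeded_limit = file_history.get("has_exceeded_limit")
    if has_exceeded_limit is True:
        # Backend already applied workflow-type rules (API workflows exempt).
        return True
    if has_exceeded_limit is None:
        # Legacy backend sent no flag, so compare raw counts locally.
        execution_count = file_history.get("execution_count", 0)
        max_execution_count = file_history.get("max_execution_count", 3)
        return execution_count >= max_execution_count
    # Backend explicitly said the limit is not exceeded.
    return False
```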
frontend/src/components/pipelines-or-deployments/file-history-modal/FileHistoryModal.jsx (1)
169-178: Consider removing the `setTimeout` in favor of an immediate fetch.

The `setTimeout(..., 0)` on lines 175-177 defers the fetch to allow state updates to complete. Since `fetchFileHistories` reads from parameters passed as arguments (not from state directly), the setTimeout is unnecessary.

Apply this diff to simplify:

```diff
   setStatusFilter([]);
   setExecutionCountMin(null);
   setExecutionCountMax(null);
   setFilePathFilter("");
-  // Reset will trigger effect to fetch unfiltered data
-  setTimeout(() => {
-    fetchFileHistories(1, pagination.pageSize);
-  }, 0);
+  fetchFileHistories(1, pagination.pageSize);
 };
```
28-47: Consider adding exception chaining for better error context.Line 47 raises
ValidationErrorwithout chaining the original exception. Addingfrom errwould preserve the full error context for debugging.Apply this diff:
except (ValueError, TypeError): - raise ValidationError({param_name: "Must be a valid integer"}) + raise ValidationError({param_name: "Must be a valid integer"}) from NoneNote: Using
from Nonesuppresses the traceback since the ValueError/TypeError is expected and the ValidationError message is sufficient.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to Reviews > Disable Cache setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (11)
- backend/workflow_manager/workflow_v2/file_history_helper.py (5 hunks)
- backend/workflow_manager/workflow_v2/file_history_views.py (1 hunks)
- backend/workflow_manager/workflow_v2/migrations/0018_filehistory_execution_count_and_more.py (1 hunks)
- backend/workflow_manager/workflow_v2/models/file_history.py (4 hunks)
- backend/workflow_manager/workflow_v2/models/workflow.py (3 hunks)
- backend/workflow_manager/workflow_v2/views.py (1 hunks)
- frontend/src/components/agency/agency/Agency.jsx (5 hunks)
- frontend/src/components/pipelines-or-deployments/file-history-modal/FileHistoryModal.jsx (1 hunks)
- workers/api-deployment/tasks.py (3 hunks)
- workers/shared/processing/files/processor.py (1 hunks)
- workers/shared/processing/filter_pipeline.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- backend/workflow_manager/workflow_v2/views.py
- backend/workflow_manager/workflow_v2/models/workflow.py
🧰 Additional context used
🧬 Code graph analysis (7)
frontend/src/components/agency/agency/Agency.jsx (2)
- frontend/src/components/pipelines-or-deployments/pipelines/Pipelines.jsx (1): isClearingFileHistory (70-71)
- frontend/src/components/pipelines-or-deployments/file-history-modal/FileHistoryModal.jsx (1): FileHistoryModal (33-621)

backend/workflow_manager/workflow_v2/models/file_history.py (2)
- backend/workflow_manager/workflow_v2/models/workflow.py (3): Workflow (34-133), WorkflowType (35-40), get_max_execution_count (109-122)
- backend/workflow_manager/workflow_v2/serializers.py (1): get_max_execution_count (141-150)

workers/shared/processing/filter_pipeline.py (2)
- backend/workflow_manager/workflow_v2/models/file_history.py (1): has_exceeded_limit (28-45)
- unstract/core/src/unstract/core/data_models.py (1): ExecutionStatus (336-369)

workers/shared/processing/files/processor.py (2)
- backend/workflow_manager/workflow_v2/views.py (1): status (507-548)
- unstract/core/src/unstract/core/data_models.py (1): ExecutionStatus (336-369)

workers/api-deployment/tasks.py (1)
- unstract/core/src/unstract/core/data_models.py (1): ExecutionStatus (336-369)

backend/workflow_manager/workflow_v2/file_history_helper.py (2)
- backend/workflow_manager/workflow_v2/models/file_history.py (2): FileHistory (15-146), update (128-146)
- unstract/core/src/unstract/core/data_models.py (1): ExecutionStatus (336-369)

backend/workflow_manager/workflow_v2/file_history_views.py (5)
- backend/workflow_manager/workflow_v2/models/file_history.py (1): FileHistory (15-146)
- backend/workflow_manager/workflow_v2/permissions.py (1): IsWorkflowOwnerOrShared (7-45)
- backend/workflow_manager/workflow_v2/serializers.py (1): FileHistorySerializer (133-161)
- workers/shared/processing/filter_pipeline.py (1): _create_cache_key (107-120)
- backend/utils/cache_service.py (1): clear_cache_optimized (46-129)
🪛 Ruff (0.14.5)
backend/workflow_manager/workflow_v2/migrations/0018_filehistory_execution_count_and_more.py
10-12: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
14-59: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
backend/workflow_manager/workflow_v2/file_history_helper.py
368-368: Consider moving this statement to an else block
(TRY300)
backend/workflow_manager/workflow_v2/file_history_views.py
25-25: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
45-45: Consider moving this statement to an else block
(TRY300)
47-47: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
84-84: Unused method argument: request
(ARG002)
84-84: Unused method argument: id
(ARG002)
156-156: Do not catch blind exception: Exception
(BLE001)
168-168: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (7)
workers/api-deployment/tasks.py (2)
28-28: Importing ApiDeploymentResultStatus is appropriate for API-specific status mapping

The new import is correctly scoped and used only where API deployment results are constructed, keeping worker-level statuses aligned with core models.

527-548: Cached API results now surface correct status and error information

Mapping cached ExecutionStatus to ApiDeploymentResultStatus and propagating `error` plus `cached_status` in `metadata` fixes the previous "always-success/no-error" behavior for cached responses and makes the API output consistent with the underlying execution state. This block looks correct and matches the caching contract introduced in `_check_file_history_api`.

backend/workflow_manager/workflow_v2/migrations/0018_filehistory_execution_count_and_more.py (1)

1-59: Migration schema matches the model changes and indexing goals

The migration cleanly introduces `execution_count`, `max_file_execution_count`, and the expected workflow/file_history indexes (including the trigram GIN index). The operations align with the model snippet and should support the new query/limit semantics without issues.

RUF012 warnings about `dependencies`/`operations` being mutable class attributes can be safely ignored here; this is standard Django migration structure.

backend/workflow_manager/workflow_v2/models/file_history.py (3)

24-26: LGTM!

The string representation is helpful for debugging and includes the relevant fields.

66-69: No issues found.

The `execution_count` field is incremented using `F("execution_count") + 1` after processing, which aligns with the `default=1`. A newly created `FileHistory` record starts at 1 (first processing), and subsequent increments reflect additional processing attempts. The semantics are consistent.

3-3: The migration already properly handles the PostgreSQL extension.

Migration 0018 correctly includes `TrigramExtension()` as the first operation (before the index creation), and the `GinIndex` properly specifies `opclasses=["gin_trgm_ops"]`. No changes needed.

backend/workflow_manager/workflow_v2/file_history_views.py (1)

159-169: LGTM!

The broad exception handling (line 168) is appropriate for cache operations where failures should not block the main delete operation. The warning log ensures issues are tracked without interrupting the workflow.
…on_count_with_file_history_viewer
…th_file_history_viewer' of github.com:Zipstack/unstract into feat/UN-3011-FEAT_limit_maximum_file_execution_count_with_file_history_viewer
Actionable comments posted: 1
🧹 Nitpick comments (2)
backend/workflow_manager/workflow_v2/file_history_views.py (1)
29-48: Consider exception chaining for improved debugging.

The validation logic is correct. For slightly better debugging context, consider chaining the exception to preserve the original error traceback.

Apply this diff:

```diff
 except (ValueError, TypeError):
-    raise ValidationError({param_name: "Must be a valid integer"})
+    raise ValidationError({param_name: "Must be a valid integer"}) from None
```

Adding `from None` suppresses the underlying exception (which is intentional here since we're providing a clearer message), or use `from err` if you want to preserve the chain.

frontend/src/components/pipelines-or-deployments/file-history-modal/FileHistoryModal.jsx (1)

157-162: Consider useCallback for fetchFileHistories to satisfy exhaustive-deps.

The `useEffect` dependency array omits `fetchFileHistories` and `pagination.pageSize`. While functionally correct (you only want to refetch when the modal opens), this may trigger linter warnings.

Optionally wrap `fetchFileHistories` with `useCallback`:

```js
const fetchFileHistories = useCallback(async (page = 1, pageSize = 10) => {
  // ... existing implementation
}, [workflowId, statusFilter, executionCountMin, executionCountMax, filePathFilter]);
```

Then update the useEffect:

```js
useEffect(() => {
  if (open && workflowId) {
    fetchFileHistories(1, pagination.pageSize);
    setSelectedRowKeys([]);
  }
}, [open, workflowId, fetchFileHistories, pagination.pageSize]);
```

This eliminates the linter warning while preserving the current behavior.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to Reviews > Disable Cache setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (4)
- backend/workflow_manager/workflow_v2/file_history_helper.py (5 hunks)
- backend/workflow_manager/workflow_v2/file_history_views.py (1 hunks)
- frontend/src/components/pipelines-or-deployments/file-history-modal/FileHistoryModal.jsx (1 hunks)
- frontend/src/components/workflows/workflow/workflow-service.js (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
backend/workflow_manager/workflow_v2/file_history_helper.py (2)
- backend/workflow_manager/workflow_v2/models/file_history.py (1): FileHistory (15-146)
- unstract/core/src/unstract/core/data_models.py (1): ExecutionStatus (336-369)

frontend/src/components/workflows/workflow/workflow-service.js (5)
- frontend/src/components/agency/ds-settings-card/DsSettingsCard.jsx (2): options (27-27), axiosPrivate (53-53)
- frontend/src/components/agency/agency/Agency.jsx (1): axiosPrivate (69-69)
- frontend/src/components/pipelines-or-deployments/pipelines/Pipelines.jsx (1): axiosPrivate (68-68)
- frontend/src/components/deployments/api-deployment/ApiDeployment.jsx (1): axiosPrivate (62-62)
- frontend/src/components/pipelines-or-deployments/etl-task-deploy/EtlTaskDeploy.jsx (1): axiosPrivate (45-45)

backend/workflow_manager/workflow_v2/file_history_views.py (3)
- backend/workflow_manager/workflow_v2/models/file_history.py (1): FileHistory (15-146)
- backend/workflow_manager/workflow_v2/permissions.py (1): IsWorkflowOwnerOrShared (7-45)
- backend/workflow_manager/workflow_v2/serializers.py (1): FileHistorySerializer (133-161)
🪛 Ruff (0.14.6)
backend/workflow_manager/workflow_v2/file_history_helper.py
368-368: Consider moving this statement to an else block
(TRY300)
backend/workflow_manager/workflow_v2/file_history_views.py
26-26: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
46-46: Consider moving this statement to an else block
(TRY300)
48-48: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
85-85: Unused method argument: request
(ARG002)
85-85: Unused method argument: id
(ARG002)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (10)
backend/workflow_manager/workflow_v2/file_history_views.py (4)

1-18: LGTM!

Imports are appropriate and the `MAX_BULK_DELETE_LIMIT` constant is well-defined for protecting against excessive bulk deletions.

21-27: LGTM!

ViewSet configuration is correct. The permission classes ensure proper access control, and the use of `CustomPagination` provides consistent pagination behavior.

Note: The static analysis hint about a `ClassVar` annotation on line 26 is a false positive; Django REST Framework class attributes should not use `ClassVar`.

50-83: LGTM!

The queryset filtering is well-implemented with proper validation. The workflow caching optimization avoids redundant database queries, and the filters support flexible querying with appropriate validation.

85-96: LGTM!

Single deletion is correctly implemented with appropriate logging. The static analysis warnings about unused parameters are false positives—these parameters are part of the DRF method signature.

frontend/src/components/workflows/workflow/workflow-service.js (1)

71-112: LGTM! Clean API integration.

The four new methods follow the established patterns in this service. CSRF token handling is appropriate (GET omits it, DELETE and POST include it), and the URL construction is consistent with existing endpoints.
backend/workflow_manager/workflow_v2/file_history_helper.py (2)

241-294: LGTM! Atomic update implementation is solid.

The helpers are well-designed:

- `_safe_str` correctly handles `None` explicitly (avoiding the falsy-value issue — see the sketch below)
- `_truncate_hash` improves log readability
- `_increment_file_history` uses `F("execution_count") + 1` for an atomic increment and properly refreshes from the database to retrieve updated values
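To illustrate the falsy-value issue the first bullet refers to, a minimal sketch (the real helper lives in `file_history_helper.py`; this body is an assumption about its behavior):

```python
def _safe_str(value) -> str:
    # Explicit None check: the tempting `value or ""` would also wipe out
    # falsy-but-meaningful values such as 0 or False; only None becomes "".
    return "" if value is None else str(value)


assert _safe_str(None) == ""
assert _safe_str(0) == "0"          # `0 or ""` would have produced ""
assert _safe_str(False) == "False"  # likewise for False
```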
296-403: LGTM! Race condition handling is thorough.

The three-tier approach correctly handles concurrent file history creation:

1. Check for an existing record and increment if found.
2. Create a new record if none exists.
3. Handle `IntegrityError` by fetching and incrementing the record created by another worker.

The atomic `F()` expression in `_increment_file_history` ensures the execution count is safely incremented even under concurrent access.

Note: The static analysis hint (TRY300) about moving line 368 to an `else` block is a false positive—the current structure is idiomatic and correct.
elseblock is a false positive—the current structure is idiomatic and correct.frontend/src/components/pipelines-or-deployments/file-history-modal/FileHistoryModal.jsx (3)
204-237: LGTM! Bulk delete implementation is optimal.The current implementation uses a dedicated backend endpoint (
bulkDeleteFileHistoriesByIds) rather than multiple individual delete calls. This is superior to the previous approach suggested in past reviews—the backend can handle partial failures, transaction management, and atomicity better than client-side orchestration withPromise.allSettled.The 100-item limit (line 207) prevents overwhelming the backend.
240-316: LGTM! Smart two-phase bulk clear workflow.The implementation correctly:
- Fetches count with
page_size: 1to minimize data transfer (line 245)- Shows confirmation dialog with the exact count before deletion
- Constructs filter parameters consistently with
fetchFileHistories- Resets to page 1 after clearing to avoid displaying empty pages
327-630: LGTM! Well-designed responsive UI with good UX.Highlights:
- Copy-to-clipboard buttons for file paths and error messages improve usability
- Execution count display gracefully handles unlimited scenarios (
∞symbol)- Responsive column visibility (
responsive: ["md"]) optimizes mobile experience- Proper use of tooltips and ellipsis for long text
- Table scrolling and pagination configured appropriately
Test Results

Summary

Runner Tests - Full Report

SDK1 Tests - Full Report
Actionable comments posted: 0
🧹 Nitpick comments (7)
backend/workflow_manager/workflow_v2/file_history_views.py (3)

29-48: Validation helper is solid; optionally preserve the original exception context

`_validate_execution_count` correctly centralizes integer/non-negative validation and raises `ValidationError` with clear messages. To satisfy tools like Ruff (B904) and aid debugging, you could optionally chain the original exception:

```diff
-        except (ValueError, TypeError):
-            raise ValidationError({param_name: "Must be a valid integer"})
+        except (ValueError, TypeError) as exc:
+            raise ValidationError({param_name: "Must be a valid integer"}) from exc
```

This is purely a nicety; current behavior is functionally fine.

85-96: Clean up unused `request`/`id` parameters in `destroy`

`destroy` doesn't use its `request` or `id` arguments, which triggers linters (ARG002). If you want to quiet those without changing behavior, you can rename them to underscore-prefixed:

```diff
-    def destroy(self, request, workflow_id=None, id=None):
+    def destroy(self, _request, workflow_id=None, _id=None):
```

DRF will still route DELETE calls correctly.

98-178: Clear endpoint safeguards and filters are well-structured; consider reusing the cached workflow

The `clear` action now:

- Requires at least one of `ids`, `status`, `execution_count_min`/`max`, or `file_path` before proceeding.
- Enforces `MAX_BULK_DELETE_LIMIT` for ID-based deletes.
- Reuses `_validate_execution_count` for body filters.
- Restricts deletes to the specified `workflow`.

This addresses the earlier "delete everything with an empty body" risk nicely. One small optimization would be to reuse the workflow already cached on `request` (as in `get_queryset`) to avoid an extra DB lookup:

```diff
-    workflow = get_object_or_404(Workflow, id=workflow_id)
+    if hasattr(request, "_workflow_cache"):
+        workflow = request._workflow_cache
+    else:
+        workflow = get_object_or_404(Workflow, id=workflow_id)
```

Not required for correctness, just a minor efficiency/consistency tweak.

frontend/src/components/pipelines-or-deployments/file-history-modal/FileHistoryModal.jsx (4)

111-129: Consider removing or gating `console.*` logging in `fetchFileHistoriesWithFilters`

The debug `console.log("Fetching file histories", ...)` and detailed `console.error` logging are useful during development but can be noisy in production consoles. Since you already surface a user-facing alert, you could:

- Remove the `console.*` calls, or
- Wrap them in a development check (e.g., `if (process.env.NODE_ENV !== "production")`).

Functionality is fine either way; this is just cleanliness/observability hygiene.

Also applies to: 162-192

224-240: Resetting filters should probably also clear any existing row selection

`handleResetFilters` clears all filter state and reapplies an empty filter set but leaves `selectedRowKeys` untouched. That means after a reset, the "Delete Selected (N)" button can still reflect selections from the previous filter context, which may be surprising.

You could align reset behavior with apply/refresh by also clearing the selection:

```diff
 const handleResetFilters = () => {
   setStatusFilter([]);
   setExecutionCountMin(null);
   setExecutionCountMax(null);
   setFilePathFilter("");
+  setSelectedRowKeys([]);
   // Reset applied filters
```

Pure UX consistency; current behavior isn't broken.

247-297: Guard destructive actions when `workflowId` is missing, or make `workflowId` required

Handlers like `handleDeleteSingle`, `handleDeleteSelected`, `handlePrepareBulkClear`, and `performBulkClear` assume `workflowId` is defined and pass it straight into the API service. If this modal is ever rendered without a valid `workflowId` (it's currently optional in `propTypes`), these calls will fail in less obvious ways.

Two options:

- Mark `workflowId` as required in `propTypes`, or
- Add a simple guard at the top of each handler, similar to `fetchFileHistoriesWithFilters`:

```diff
 const handleDeleteSelected = async () => {
-  if (selectedRowKeys.length === 0) return;
+  if (!workflowId || selectedRowKeys.length === 0) return;
   …
 };
```

The same pattern can be applied to the other handlers. This makes the component more robust against misuse.

Also applies to: 300-390

263-287: Bulk delete / clear flows align well with backend constraints

The bulk delete by IDs honors the frontend `MAX_BULK_DELETE` of 100 and delegates to a single bulk API call, and the "Clear with Filters" workflow fetches a count first and only enables the button when filters are applied, matching the backend `clear` safeguard.

To keep things future-proof, you might add a comment noting that `MAX_BULK_DELETE` should stay in sync with the backend's `MAX_BULK_DELETE_LIMIT`, but otherwise the flows look solid.

Also applies to: 635-685
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to Reviews > Disable Cache setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (2)
- backend/workflow_manager/workflow_v2/file_history_views.py (1 hunks)
- frontend/src/components/pipelines-or-deployments/file-history-modal/FileHistoryModal.jsx (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.6)
backend/workflow_manager/workflow_v2/file_history_views.py
26-26: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
46-46: Consider moving this statement to an else block
(TRY300)
48-48: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
85-85: Unused method argument: request
(ARG002)
85-85: Unused method argument: id
(ARG002)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (2)
backend/workflow_manager/workflow_v2/file_history_views.py (1)
50-83: Queryset construction and filtering look correct and consistent with the API contract

The use of the cached workflow, the status IN filter, execution_count_min/max via `_validate_execution_count`, and `file_path__icontains` all look correct and align with the expected query parameters from the frontend. Ordering by `-created_at` is also a sensible default for a history view. No changes needed here; a condensed sketch of the filtering follows below.
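A condensed sketch of the filtering pattern described above, assuming a standard Django `QueryDict` of request parameters (parameter names match the review; the surrounding viewset is omitted):

```python
def filter_histories(queryset, params):
    """Apply the optional filters the endpoint accepts, per the review."""
    statuses = params.getlist("status")  # status IN (...)
    if statuses:
        queryset = queryset.filter(status__in=statuses)
    count_min = params.get("execution_count_min")
    if count_min is not None:
        queryset = queryset.filter(execution_count__gte=int(count_min))
    count_max = params.get("execution_count_max")
    if count_max is not None:
        queryset = queryset.filter(execution_count__lte=int(count_max))
    file_path = params.get("file_path")
    if file_path:
        queryset = queryset.filter(file_path__icontains=file_path)
    return queryset.order_by("-created_at")
```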
frontend/src/components/pipelines-or-deployments/file-history-modal/FileHistoryModal.jsx (1)

448-458: This concern is invalid due to backend validation constraints.

The `max_file_execution_count` field in the Workflow model has `MinValueValidator(1)` (backend/workflow_manager/workflow_v2/models/workflow.py, line 75), which prevents it from ever being set to 0. The `get_max_execution_count()` method always returns an integer >= 1 or falls back to `settings.MAX_FILE_EXECUTION_COUNT` (default: 3). Consequently, the value will never be falsy at the frontend, making the distinction between `||` and `??` moot. The current implementation is correct.

Likely an incorrect or invalid review comment.
chandrasekharan-zipstack
left a comment
LGTM but try to address the NIT comments
```python
        return None

    @staticmethod
    def _safe_str(value: Any) -> str:
```
NIT: Can be a utility method
```python
    try:
        int_value = int(value)
        if int_value < 0:
            raise ValidationError({param_name: "Must be a non-negative integer"})
```
NIT: Hope such errors get rendered / parsed correctly by the FE code. It seems different from other error messages
```python
    # Check status - only return COMPLETED results
    status = file_history_data.get("status")

    from unstract.core.data_models import ExecutionStatus
```
NIT: Move this to the top
@muhammad-ali-e make sure to raise a docs PR as well for this once it goes to staging
```python
    os.environ.get("MAX_PARALLEL_FILE_BATCHES_MAX_VALUE", 100)
)
# Maximum number of times a file can be executed in a workflow
MAX_FILE_EXECUTION_COUNT = int(os.environ.get("MAX_FILE_EXECUTION_COUNT", 3))
```
@muhammad-ali-e is this applicable for Any workflows or ETL alone?
Any workflow except API; in other words, only for ETL and Task workflows.
```python
        except Exception as e:
            logger.warning(f"FileHistoryFilter - Error checking status {status}: {e}")
            # Fallback to original is_completed logic if status checking fails
            if not is_completed:
                logger.info(
                    f"FileHistoryFilter - {file_hash.file_name}: ACCEPT "
                    f"(fallback: history exists but not completed, status={status})"
                )
                return False
```
@muhammad-ali-e why was this try/except removed? Is it safe?
ritwik-g
left a comment
@muhammad-ali-e please get the UI code reviewed by FE folks.
jagadeeswaran-zipstack
left a comment
Please move all inline CSS to external CSS files. Remove all the logs in the FE and use optional chaining wherever required.
```js
const copyToClipboard = async (text, label = "Text") => {
  try {
    await navigator.clipboard.writeText(text);
    message.success(`${label} copied to clipboard!`);
  } catch (err) {
    console.error("Failed to copy:", err);
    message.error("Failed to copy to clipboard");
  }
};
```
A copyToClipboard util function should already be available in the getStaticData.js file; please make use of that.
```js
  filters = null
) => {
  if (!workflowId) {
    console.error("FileHistoryModal: workflowId is missing");
```
Please use the useAlert() hook for alert messages. Don't use console.
```js
  params
);

console.log("File histories response:", response);
```
Remove all the logs in the FE. Please log in the backend if needed.
```js
console.error("Error fetching file histories:", err);
console.error("Error details:", {
  message: err?.message,
  response: err?.response,
  status: err?.response?.status,
  data: err?.response?.data,
});
```
```js
appliedFilters.executionCountMin !== null &&
  appliedFilters.executionCountMin !== undefined
```
Please use optional chaining wherever required.
```jsx
<Space size="small" style={{ width: "100%" }}>
  <Text ellipsis={{ tooltip: text }} style={{ flex: 1, maxWidth: 400 }}>
```
Move this styling into external CSS files. Inline CSS gets hard to manage.
What
Why
How
Backend (Django):
- `FileHistoryViewSet` with read-only endpoints for listing, retrieving, and deleting individual records, plus bulk clearing of file history records (a hedged routing sketch follows this list)
- `IsWorkflowOwnerOrShared` permission class to ensure only workflow owners and shared users can access file history
- `execution_count` field added to the `FileHistory` model and `max_file_execution_count` field to the `Workflow` model (prepared for future workflow-level limits)
- `FileHistory` model extended with execution count tracking and limit-checking logic
- `Workflow` model extended with a `get_max_execution_count()` method supporting a configuration hierarchy (prepared for future workflow > org > global expansion)
- `FileHistorySerializer` extended to include computed fields for max execution count and limit-exceeded status
- URL routing at `/workflows/<id>/file-histories/` for CRUD operations
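As referenced above, a minimal sketch of how the nested routes could be wired to the viewset, based on the endpoints named in this PR and its reviews (the exact paths, path converters, and any actions beyond `list`/`retrieve`/`destroy`/`clear` are assumptions):

```python
from django.urls import path

from workflow_manager.workflow_v2.file_history_views import FileHistoryViewSet

file_history_list = FileHistoryViewSet.as_view({"get": "list"})
file_history_detail = FileHistoryViewSet.as_view(
    {"get": "retrieve", "delete": "destroy"}
)
file_history_clear = FileHistoryViewSet.as_view({"post": "clear"})

urlpatterns = [
    # Nested under the owning workflow; lookup_field is "id" per the review.
    path("workflows/<uuid:workflow_id>/file-histories/", file_history_list),
    path("workflows/<uuid:workflow_id>/file-histories/<uuid:id>/", file_history_detail),
    path("workflows/<uuid:workflow_id>/file-histories/clear/", file_history_clear),
]
```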
Frontend (React):

- `FileHistoryModal` component with Ant Design table, pagination, and multi-select row selection
- `Pipelines` component with a "View File History" menu item
- API service methods in `workflow-service.js` for file history operations

Workers:
Future Expansion:
- `max_file_execution_count` field on the `Workflow` model for future workflow-level limit configuration

Can this PR break any existing features? If yes, please list possible items. If no, please explain why.
Database Migrations
- New migration: backend/workflow_manager/workflow_v2/migrations/0018_filehistory_execution_count_and_more.py
- Adds `execution_count` field to the `FileHistory` model (IntegerField, default=0)
- Adds `max_file_execution_count` field to the `Workflow` model (IntegerField, nullable, prepared for future workflow-level configuration)
- Adds indexes on `FileHistory.status` and `FileHistory.execution_count` for efficient filtering (a hedged sketch of the migration follows)
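A sketch of roughly what migration 0018 contains, assembled from the bullets above and the review comments — the dependency name, index names, and exact field options are placeholders, so check the actual file:

```python
from django.db import migrations, models


class Migration(migrations.Migration):
    dependencies = [
        ("workflow_v2", "0017_previous_migration"),  # placeholder dependency
    ]

    operations = [
        migrations.AddField(
            model_name="filehistory",
            name="execution_count",
            field=models.IntegerField(default=0),
        ),
        migrations.AddField(
            model_name="workflow",
            name="max_file_execution_count",
            field=models.IntegerField(null=True, blank=True),
        ),
        migrations.AddIndex(
            model_name="filehistory",
            index=models.Index(
                fields=["workflow", "status"], name="fh_workflow_status_idx"
            ),
        ),
        migrations.AddIndex(
            model_name="filehistory",
            index=models.Index(
                fields=["workflow", "execution_count"], name="fh_workflow_exec_idx"
            ),
        ),
    ]
```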
Env Config

- `MAX_FILE_EXECUTION_COUNT` (default: 3) — global default for the maximum number of times a file can be executed in ETL/Task workflows

Relevant Docs
Related Issues or PRs
Dependencies Versions
Notes on Testing
Backend Testing:
`max_file_execution_count` field exists but is not enforced yet (prepared for the future)
Integration Testing:
Screenshots
Checklist
I have read and understood the Contribution Guidelines.
🤖 Generated with Claude Code