Fix brittle OWASP event date parsing and add regression tests#3295
Fix brittle OWASP event date parsing and add regression tests#3295arkid15r merged 8 commits intoOWASP:mainfrom
Conversation
Summary by CodeRabbitRelease Notes
✏️ Tip: You can customize this high-level summary in your review settings. WalkthroughRefactors the Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
backend/apps/owasp/models/event.py (1)
97-100: Move thedatetimeimport to module level.Importing inside a function adds overhead on each call and is non-idiomatic. Since
datetimeis part of the standard library and already conditionally imported for type hints at line 11, move this to the top-level imports section.♻️ Suggested refactor
Add to the top-level imports (around line 11):
from datetime import date, datetimeThen remove line 99:
max_day_length = 2 try: - from datetime import datetime - # Normalize dashes (hyphen, en-dash, em-dash)backend/tests/apps/owasp/models/event_test.py (1)
31-56: Unit test mock may not exercise the expected code path.For the test case
("May 26-30, 2025", date(2025, 5, 26), date(2025, 5, 30)), theparse_datesimplementation returnsstart_date.replace(day=30)directly via line 120 inevent.py—parser.parseis never called. The mock setup on line 50 is bypassed, and the test passes only because the expected result matches whatstart_date.replace()returns.Consider either:
- Removing this case from the unit test (since integration test covers it)
- Adding a test case where
parser.parseis actually invoked (e.g., a month-crossover like "May 30 - June 2")
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
backend/apps/owasp/models/event.pybackend/tests/apps/owasp/models/event_test.py
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-12-31T05:17:39.659Z
Learnt from: kart-u
Repo: OWASP/Nest PR: 3101
File: backend/apps/common/extensions.py:92-98
Timestamp: 2025-12-31T05:17:39.659Z
Learning: In this codebase, import OperationType for GraphQL operations from the graphql-core package rather than from strawberry. Use 'from graphql import OperationType'. Strawberry re-exports via graphql-core internally, so relying on strawberry's API may be brittle. Apply this rule to all Python files that deal with GraphQL operation types; ensure imports come from graphql (graphql-core) and not from strawberry packages. This improves compatibility and avoids coupling to strawberry's internals.
Applied to files:
backend/apps/owasp/models/event.pybackend/tests/apps/owasp/models/event_test.py
📚 Learning: 2026-01-01T17:48:23.963Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2948
File: backend/apps/owasp/management/commands/owasp_generate_community_snapshot_video.py:41-47
Timestamp: 2026-01-01T17:48:23.963Z
Learning: In Django code, be aware that a QuerySet's boolean evaluation (e.g., if not queryset) runs a database query to determine emptiness. While it is technically valid to use the queryset in a boolean context, use queryset.exists() for existence checks to avoid unnecessary queries and improve performance. Applicable broadly to Python/Django files rather than just this specific path.
Applied to files:
backend/apps/owasp/models/event.pybackend/tests/apps/owasp/models/event_test.py
📚 Learning: 2026-01-01T18:57:05.007Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2948
File: backend/apps/owasp/video.py:189-215
Timestamp: 2026-01-01T18:57:05.007Z
Learning: In the OWASP backend area, maintain the established pattern: when dealing with sponsors, include all entries from Sponsor.objects.all() (including NOT_SPONSOR) and perform in-memory sorting using the same criteria/pattern used by the GraphQL sponsor query implemented in backend/apps/owasp/api/internal/queries/sponsor.py. Apply this behavior consistently to files in backend/apps/owasp (not just video.py), and ensure code paths that render sponsor lists follow this in-code sorting approach rather than pre-filtering NOT_SPONSOR entries before sorting.
Applied to files:
backend/apps/owasp/models/event.py
🧬 Code graph analysis (1)
backend/tests/apps/owasp/models/event_test.py (1)
backend/apps/owasp/models/event.py (1)
parse_dates(92-130)
🔇 Additional comments (11)
backend/apps/owasp/models/event.py (7)
5-5: LGTM!Adding
reimport at module level is the correct approach for the regex operations used inparse_dates.
101-130: Solid refactor of the date parsing logic.The implementation correctly handles:
- Dash normalization (hyphen, en-dash, em-dash)
- ISO date format guard
- Same-month ranges via day extraction
- Month/year crossovers via
parser.parsewithdefaultcontextThe exception handling appropriately catches parsing edge cases.
68-81: LGTM!Docstring simplification while preserving the query logic and behavior.
83-89: LGTM!Concise docstring appropriately describes the method.
132-150: LGTM!Docstring simplified; method logic unchanged.
151-170: LGTM!Docstring simplified; method correctly maps category and fields from dict data.
172-231: LGTM!Docstrings for
generate_geo_location,generate_suggested_location,generate_summary,get_context, andsaveare appropriately concise. Implementation logic remains unchanged.backend/tests/apps/owasp/models/event_test.py (4)
58-74: Excellent integration test coverage.The test cases comprehensively cover the PR objectives:
- ISO dates, same-month ranges, month crossovers, year crossovers
- Em-dash (
—) and en-dash (–) normalizationThis provides strong regression protection for the parsing logic.
76-102: LGTM!The test correctly verifies that
update_dataretrieves an existing event and returns it after processing.
104-129: LGTM!The test verifies
from_dictcorrectly assigns category and end_date from parsed data.
131-151: LGTM!Using
date(2025, 1, 1)directly forstart-dateon line 143 is correct sincefrom_dicthandles both string and date types for this field.
backend/apps/owasp/models/event.py
Outdated
| QuerySet: A queryset of upcoming Event instances ordered by start date. | ||
|
|
||
| """ | ||
| """Get upcoming events.""" |
There was a problem hiding this comment.
Any good reason for removing docstrings?
There was a problem hiding this comment.
Thanks for calling this out.
I shortened docstrings where they were repeating the method name and obvious return behavior, without adding extra context. The idea was to keep documentation focused on places where the behavior isn’t selfevident (like parse_dates).
Happy to restore the longer version here if consistency is preferred.
There was a problem hiding this comment.
It's not a longer version. It's a template we use across BE code base even though not all methods are compliant yet.
Please revert.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @backend/apps/owasp/models/event.py:
- Around line 136-147: The end-date parsing block using clean_dates and
parser.parse with default_dt can produce an end date earlier than start_date for
ranges that cross years (e.g., "December 30 - January 2"); after computing the
parsed end_date (from parser.parse or the day_val branch), detect if end_date <
start_date and, if so, adjust end_date by incrementing the year by 1 (e.g.,
replace year = end_date.year + 1 or add one year) before returning; update the
logic around the day_match branch and the default_dt parser.parse branch to
apply this year-rollover correction so the returned date is always on or after
start_date for cross-year ranges.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
backend/apps/owasp/models/event.py
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-12-31T05:17:39.659Z
Learnt from: kart-u
Repo: OWASP/Nest PR: 3101
File: backend/apps/common/extensions.py:92-98
Timestamp: 2025-12-31T05:17:39.659Z
Learning: In this codebase, import OperationType for GraphQL operations from the graphql-core package rather than from strawberry. Use 'from graphql import OperationType'. Strawberry re-exports via graphql-core internally, so relying on strawberry's API may be brittle. Apply this rule to all Python files that deal with GraphQL operation types; ensure imports come from graphql (graphql-core) and not from strawberry packages. This improves compatibility and avoids coupling to strawberry's internals.
Applied to files:
backend/apps/owasp/models/event.py
📚 Learning: 2026-01-01T17:48:23.963Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2948
File: backend/apps/owasp/management/commands/owasp_generate_community_snapshot_video.py:41-47
Timestamp: 2026-01-01T17:48:23.963Z
Learning: In Django code, be aware that a QuerySet's boolean evaluation (e.g., if not queryset) runs a database query to determine emptiness. While it is technically valid to use the queryset in a boolean context, use queryset.exists() for existence checks to avoid unnecessary queries and improve performance. Applicable broadly to Python/Django files rather than just this specific path.
Applied to files:
backend/apps/owasp/models/event.py
📚 Learning: 2026-01-01T18:57:05.007Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2948
File: backend/apps/owasp/video.py:189-215
Timestamp: 2026-01-01T18:57:05.007Z
Learning: In the OWASP backend area, maintain the established pattern: when dealing with sponsors, include all entries from Sponsor.objects.all() (including NOT_SPONSOR) and perform in-memory sorting using the same criteria/pattern used by the GraphQL sponsor query implemented in backend/apps/owasp/api/internal/queries/sponsor.py. Apply this behavior consistently to files in backend/apps/owasp (not just video.py), and ensure code paths that render sponsor lists follow this in-code sorting approach rather than pre-filtering NOT_SPONSOR entries before sorting.
Applied to files:
backend/apps/owasp/models/event.py
🔇 Additional comments (4)
backend/apps/owasp/models/event.py (4)
5-5: LGTM! Necessary imports for enhanced parsing.The
remodule enables dash normalization and pattern matching, whiledatetimeis needed to create a default datetime object fordateutil.parser.Also applies to: 13-14
74-74: LGTM! Docstring formatting improvement.Adding the colon after "Returns" aligns with standard docstring conventions.
109-152: Enhanced robustness with comprehensive normalization and error handling.The refactored parsing logic successfully addresses the brittleness issues:
- Dash normalization handles em-dash, en-dash, and hyphen variants
- ISO date detection prevents false range splitting
- Flexible range parsing with dateutil.parser and fallback logic
- Comprehensive exception handling for edge cases
The overall approach is sound and aligns with the PR objectives.
175-175: LGTM! Removed misleading comment.The inline comment "No start date." was removed appropriately, as
KeyErrorcan be raised for any missing required key in the data dictionary, not just the start date.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
backend/apps/owasp/models/event.py (2)
140-144: The comma heuristic may incorrectly ignore an explicit year.When
end_stris something like"28, 2025", the condition"," in end_strtriggers the early return using onlystart_date.replace(day=...), discarding the explicit year. Ifstart_dateis in 2024 but the dates string specifies 2025, the result would be incorrect.Consider falling through to
parser.parsefor strings with commas to let dateutil handle the full date:Suggested change
day_match = re.match(r"^(\d{1,2})", end_str) if day_match: day_val = day_match.group(1) - if len(end_str) <= max_day_length or "," in end_str: + if len(end_str) <= max_day_length: return start_date.replace(day=int(day_val))Alternatively, if the comma case is needed for a specific known format, add a comment explaining when this applies.
107-109: Consider updating or removing the TODO comment.The
parse_datesmethod has now been substantially refactored to address the brittleness mentioned in issue #3286. The TODO may be outdated or could be updated to reflect any remaining improvements.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
backend/apps/owasp/models/event.py
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-12-31T05:17:39.659Z
Learnt from: kart-u
Repo: OWASP/Nest PR: 3101
File: backend/apps/common/extensions.py:92-98
Timestamp: 2025-12-31T05:17:39.659Z
Learning: In this codebase, import OperationType for GraphQL operations from the graphql-core package rather than from strawberry. Use 'from graphql import OperationType'. Strawberry re-exports via graphql-core internally, so relying on strawberry's API may be brittle. Apply this rule to all Python files that deal with GraphQL operation types; ensure imports come from graphql (graphql-core) and not from strawberry packages. This improves compatibility and avoids coupling to strawberry's internals.
Applied to files:
backend/apps/owasp/models/event.py
📚 Learning: 2026-01-01T17:48:23.963Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2948
File: backend/apps/owasp/management/commands/owasp_generate_community_snapshot_video.py:41-47
Timestamp: 2026-01-01T17:48:23.963Z
Learning: In Django code, be aware that a QuerySet's boolean evaluation (e.g., if not queryset) runs a database query to determine emptiness. While it is technically valid to use the queryset in a boolean context, use queryset.exists() for existence checks to avoid unnecessary queries and improve performance. Applicable broadly to Python/Django files rather than just this specific path.
Applied to files:
backend/apps/owasp/models/event.py
📚 Learning: 2026-01-01T18:57:05.007Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2948
File: backend/apps/owasp/video.py:189-215
Timestamp: 2026-01-01T18:57:05.007Z
Learning: In the OWASP backend area, maintain the established pattern: when dealing with sponsors, include all entries from Sponsor.objects.all() (including NOT_SPONSOR) and perform in-memory sorting using the same criteria/pattern used by the GraphQL sponsor query implemented in backend/apps/owasp/api/internal/queries/sponsor.py. Apply this behavior consistently to files in backend/apps/owasp (not just video.py), and ensure code paths that render sponsor lists follow this in-code sorting approach rather than pre-filtering NOT_SPONSOR entries before sorting.
Applied to files:
backend/apps/owasp/models/event.py
🔇 Additional comments (6)
backend/apps/owasp/models/event.py (6)
5-5: LGTM!The
reimport supports the new regex operations, and movingdatetimeoutsideTYPE_CHECKINGis necessary sincedatetime.combine()is now called at runtime.Also applies to: 13-14
74-74: LGTM!Minor docstring style fix for consistency.
125-133: LGTM!The dash normalization correctly handles varied delimiters (hyphen, en-dash, em-dash), and the ISO date guard prevents false positives when splitting "2025-05-26" on hyphens. The
noqa: RUF001is appropriate since the Unicode dashes are intentional.
146-151: LGTM on year crossover handling.Using
start_dateas the default context forparser.parseand incrementing the year whenend_date < start_datecorrectly handles month/year boundary crossovers (e.g., "December 26 - January 2").
153-156: LGTM!The exception handling covers parsing failures gracefully. Including
OverflowErrorprotects against invalid day replacements, and the fallback toNoneis appropriate for a parsing utility.
177-180: LGTM!Removing the inline comment is a minor cleanup; the
KeyErrorhandling for missingstart-dateis clear from context.
This reverts commit d8d0df8.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
backend/apps/owasp/models/event.py (1)
140-151: Clarify the day-detection condition with an inline comment.The logic at lines 143-144 is subtle: it triggers the same-month path only when
end_strstarts with digits AND is either a bare day ("30") or a "day, year" format ("30, 2025"). This works because strings like "June 2, 2025" fail the initialday_matchsince they don't start with digits.Consider adding a brief inline comment to clarify why the comma check is safe here:
day_match = re.match(r"^(\d{1,2})", end_str) if day_match: day_value = day_match.group(1) + # Same-month range: bare day ("30") or "day, year" ("30, 2025"). + # Full month names (e.g., "June 2, 2025") fail day_match above. if len(end_str) <= max_day_length or "," in end_str: return start_date.replace(day=int(day_value))
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
backend/apps/owasp/models/event.pybackend/tests/apps/owasp/models/event_test.py
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-12-31T05:17:39.659Z
Learnt from: kart-u
Repo: OWASP/Nest PR: 3101
File: backend/apps/common/extensions.py:92-98
Timestamp: 2025-12-31T05:17:39.659Z
Learning: In this codebase, import OperationType for GraphQL operations from the graphql-core package rather than from strawberry. Use 'from graphql import OperationType'. Strawberry re-exports via graphql-core internally, so relying on strawberry's API may be brittle. Apply this rule to all Python files that deal with GraphQL operation types; ensure imports come from graphql (graphql-core) and not from strawberry packages. This improves compatibility and avoids coupling to strawberry's internals.
Applied to files:
backend/tests/apps/owasp/models/event_test.pybackend/apps/owasp/models/event.py
📚 Learning: 2026-01-01T17:48:23.963Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2948
File: backend/apps/owasp/management/commands/owasp_generate_community_snapshot_video.py:41-47
Timestamp: 2026-01-01T17:48:23.963Z
Learning: In Django code, be aware that a QuerySet's boolean evaluation (e.g., if not queryset) runs a database query to determine emptiness. While it is technically valid to use the queryset in a boolean context, use queryset.exists() for existence checks to avoid unnecessary queries and improve performance. Applicable broadly to Python/Django files rather than just this specific path.
Applied to files:
backend/tests/apps/owasp/models/event_test.pybackend/apps/owasp/models/event.py
📚 Learning: 2026-01-01T18:57:05.007Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2948
File: backend/apps/owasp/video.py:189-215
Timestamp: 2026-01-01T18:57:05.007Z
Learning: In the OWASP backend area, maintain the established pattern: when dealing with sponsors, include all entries from Sponsor.objects.all() (including NOT_SPONSOR) and perform in-memory sorting using the same criteria/pattern used by the GraphQL sponsor query implemented in backend/apps/owasp/api/internal/queries/sponsor.py. Apply this behavior consistently to files in backend/apps/owasp (not just video.py), and ensure code paths that render sponsor lists follow this in-code sorting approach rather than pre-filtering NOT_SPONSOR entries before sorting.
Applied to files:
backend/apps/owasp/models/event.py
🔇 Additional comments (10)
backend/apps/owasp/models/event.py (5)
5-5: LGTM - Imports appropriately added for the new parsing logic.The
remodule is needed for dash normalization, anddatetimeis correctly imported at runtime fordatetime.combine()usage.Also applies to: 13-14
123-126: LGTM - Dash normalization handles all expected variants.The regex correctly normalizes hyphen, en-dash, and em-dash characters as specified in the PR objectives. The
max_day_lengthconstant is appropriately named.
128-133: LGTM - ISO date guard prevents incorrect splitting.This correctly handles ISO-format dates like "2025-05-26" that would otherwise be incorrectly parsed as a date range due to the hyphen separators.
154-157: LGTM - Comprehensive exception handling.The broad exception catch appropriately handles parser failures, invalid day replacements (e.g., February 30), and edge cases with overflow values.
178-181: LGTM - KeyError handling unchanged.The exception handling behavior is preserved; removing the comment is a minor cleanup.
backend/tests/apps/owasp/models/event_test.py (5)
40-56: Unit test structure is sound, with a note on mock coverage.The unit test correctly verifies behavior, though for some inputs (e.g., "May 26-30, 2025"), the mock for
parser.parseis set up but never invoked because the implementation usesstart_date.replace()instead. The test still passes because the actual code returns the expected result.This is acceptable since the integration tests at lines 58-74 provide complementary coverage without mocking, ensuring both code paths are exercised.
58-74: Excellent integration test coverage for the new parsing logic.The test cases comprehensively cover the PR objectives:
- ISO-style dates (line 61)
- Year crossover handling (line 62)
- En-dash and em-dash variants (lines 66-67)
- Cross-month same-year ranges (line 68)
This ensures the refactored
parse_datesmethod handles real-world input variations correctly.
76-102: LGTM - Simplified test focuses on observable behavior.The test correctly verifies that
update_datareturns the expected event when the event already exists. Removing the detailed mock call assertions is a reasonable simplification that reduces test fragility.
104-129: LGTM - Focused assertions on key behaviors.The test appropriately verifies category mapping and end_date assignment, which are the non-trivial aspects of
from_dict. The straightforward field assignments don't require explicit assertions.
140-155: LGTM - Explicit test data improves clarity.Adding
"dates": ""explicitly (line 144) makes the test data self-contained rather than relying on implicit defaults fromdata.get("dates", "").



Proposed change
Resolves #3286
This PR addresses several edge cases in OWASP event date parsing that were causing failures due to slight variations in input format. The earlier logic depended on manual string splitting and assumed a very specific format (for example, a single dash type, fixed spacing, and same-month ranges). In practice, event dates often include em/en dashes, extra spaces, ISO-style dates, or ranges that cross months or years, which caused the parser to return incorrect results or
None.This change:
dateutil.parserwith the start date as context, allowing month and year crossoversThe behavior for previously valid inputs remains unchanged.
Checklist
make check-testlocally and all tests passed