feat: Added a validator for forbidden shape_distance #1896

Merged
jcpitre merged 3 commits into master from 1885-flex-forbidden_shape_dist_traveled on Oct 21, 2024

@jcpitre jcpitre commented Oct 17, 2024

Summary:
Closes #1885
Adds a validator that checks for stop_times.txt entries that have a location_id or location_group_id value and also a shape_dist_traveled value. Each such entry produces a ForbiddenShapeDistTraveledNotice.
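
As a rough illustration of the rule described above, here is a minimal, self-contained sketch. It is not the code from this pull request: `StopTimeRow`, `Notice`, and `ForbiddenShapeDistSketch` are hypothetical stand-ins for the validator's own entity and notice classes, and an empty field is modelled as `null`.

```java
import java.util.ArrayList;
import java.util.List;

final class ForbiddenShapeDistSketch {

  /** Hypothetical stand-in for one stop_times.txt row; null means the field is empty. */
  record StopTimeRow(
      int csvRowNumber, String locationId, String locationGroupId, Double shapeDistTraveled) {}

  /** Hypothetical stand-in for ForbiddenShapeDistTraveledNotice. */
  record Notice(int csvRowNumber, double shapeDistTraveled) {}

  private static boolean hasValue(String field) {
    return field != null && !field.isEmpty();
  }

  /** Emits one notice per row that has a location reference and also a shape distance. */
  static List<Notice> validate(List<StopTimeRow> rows) {
    List<Notice> notices = new ArrayList<>();
    for (StopTimeRow row : rows) {
      boolean hasLocation = hasValue(row.locationId()) || hasValue(row.locationGroupId());
      if (hasLocation && row.shapeDistTraveled() != null) {
        notices.add(new Notice(row.csvRowNumber(), row.shapeDistTraveled()));
      }
    }
    return notices;
  }

  public static void main(String[] args) {
    List<StopTimeRow> rows = List.of(
        new StopTimeRow(2, null, null, 1.5),        // no location reference: allowed
        new StopTimeRow(3, "zone_a", null, 2.0),    // location_id plus distance: flagged
        new StopTimeRow(4, null, "group_b", null)); // location group, no distance: allowed
    validate(rows).forEach(System.out::println);
  }
}
```

With the sample rows in `main`, only row 3 should be reported, since it is the only row that combines a location reference with a shape_dist_traveled value.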

Expected behavior:

Used http://data.trilliumtransit.com/gtfs/thurston-wa-us/thurston-wa-us--flex-v2.zip as a test dataset.

Report before the change:

[image: validation report before the change]

Report after the change:

[image: validation report after the change]


With this change, the test dataset produces 156 forbidden_shape_dist_traveled notices.
#1895 should take care of removing the 156 decreasing_or_equal_stop_time_distance notices.

Please make sure these boxes are checked before submitting your pull request - thanks!

  • Run the unit tests with gradle test to make sure you didn't break anything
  • Add or update any needed documentation to the repo
  • Format the title like "feat: [new feature short description]". Title must follow the Conventional Commit Specification (https://www.conventionalcommits.org/en/v1.0.0/).
  • Linked all relevant issues
  • Include screenshot(s) showing how this pull request works and fixes the issue(s)

@jcpitre jcpitre linked an issue Oct 17, 2024 that may be closed by this pull request
@jcpitre jcpitre changed the title Added a validator for the presence of shape_distance whith location_i… feat: Added a validator for the presence of shape_distance Oct 17, 2024
Comment on lines +59 to +61
return header.hasColumn(GtfsStopTime.SHAPE_DIST_TRAVELED_FIELD_NAME)
&& (header.hasColumn(GtfsStopTime.LOCATION_ID_FIELD_NAME)
|| header.hasColumn(GtfsStopTime.LOCATION_GROUP_ID_FIELD_NAME));
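
The condition quoted above reads as a column-level guard: the per-row check is only worth running when stop_times.txt has a shape_dist_traveled column together with at least one of the two location columns. The sketch below reproduces that logic with a plain `Set<String>` of column names. `HeaderGuardSketch` and `shouldValidate` are hypothetical names chosen for illustration, not the validator's API.

```java
import java.util.Set;

final class HeaderGuardSketch {
  static final String SHAPE_DIST_TRAVELED = "shape_dist_traveled";
  static final String LOCATION_ID = "location_id";
  static final String LOCATION_GROUP_ID = "location_group_id";

  /** Same boolean structure as the quoted condition, applied to a set of column names. */
  static boolean shouldValidate(Set<String> columns) {
    return columns.contains(SHAPE_DIST_TRAVELED)
        && (columns.contains(LOCATION_ID) || columns.contains(LOCATION_GROUP_ID));
  }

  public static void main(String[] args) {
    // No location column: the row-level check can be skipped.
    System.out.println(shouldValidate(Set.of("trip_id", "stop_id", "shape_dist_traveled")));
    // location_id and shape_dist_traveled both present: the check should run.
    System.out.println(shouldValidate(Set.of("trip_id", "location_id", "shape_dist_traveled")));
  }
}
```

A guard of this kind lets feeds without the relevant columns skip the row-by-row pass entirely.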

📝 Acceptance Test Report

📋 Summary

✅ The rule acceptance has passed for commit 653e196
Download the full acceptance test report here (report will disappear after 90 days).

📊 Notices Comparison

New Errors (0 out of 1602 datasets, ~0%) ✅

No changes were detected due to the code change.

Dropped Errors (0 out of 1602 datasets, ~0%) ✅

No changes were detected due to the code change.

New Warnings (0 out of 1602 datasets, ~0%) ✅

No changes were detected due to the code change.

Dropped Warnings (0 out of 1602 datasets, ~0%) ✅

No changes were detected due to the code change.

🛡️ Corruption Check

0 out of 1602 sources (~0 %) are corrupted.

⏱️ Performance Assessment

📈 Validation Time

Assess the performance in terms of seconds taken for the validation process.

| Time Metric | Dataset ID | Reference (s) | Latest (s) | Difference (s) |
|---|---|---|---|---|
| Average | -- | 4.00 | 4.05 | ⬆️+0.05 |
| Median | -- | 1.38 | 1.44 | ⬆️+0.06 |
| Standard Deviation | -- | 11.39 | 11.43 | ⬆️+0.04 |
| Minimum in Reference Reports | us-california-flex-v2-developer-test-feed-2-gtfs-1818 | 0.52 | 0.55 | ⬆️+0.03 |
| Maximum in Reference Reports | gb-unknown-uk-aggregate-feed-gtfs-2014 | 296.65 | 294.58 | ⬇️-2.06 |
| Minimum in Latest Reports | us-california-city-of-wasco-gtfs-1788 | 0.56 | 0.53 | ⬇️-0.03 |
| Maximum in Latest Reports | gb-unknown-uk-aggregate-feed-gtfs-2014 | 296.65 | 294.58 | ⬇️-2.06 |

📜 Memory Consumption

| Metric | Dataset ID | Reference | Latest | Difference |
|---|---|---|---|---|
| Average | -- | 473.10 MiB | 489.83 MiB | ⬆️+16.73 MiB |
| Median | -- | 248.00 MiB | 246.02 MiB | ⬇️-1.98 MiB |
| Standard Deviation | -- | 834.47 MiB | 891.97 MiB | ⬆️+57.50 MiB |
| Minimum in Reference Reports | us-oregon-hut-airport-shuttle-gtfs-635 | 34.05 MiB | 34.08 MiB | ⬆️+32.00 KiB |
| Maximum in Reference Reports | gb-unknown-uk-aggregate-feed-gtfs-2014 | 9.93 GiB | 10.06 GiB | ⬆️+135.76 MiB |
| Minimum in Latest Reports | us-california-flex-v2-developer-test-feed-2-gtfs-1818 | 34.06 MiB | 34.05 MiB | ⬇️-16.00 KiB |
| Maximum in Latest Reports | gb-unknown-uk-aggregate-feed-gtfs-2014 | 9.93 GiB | 10.06 GiB | ⬆️+135.76 MiB |


📝 Acceptance Test Report

📋 Summary

✅ The rule acceptance has passed for commit 4f009ba
Download the full acceptance test report here (report will disappear after 90 days).

📊 Notices Comparison

New Errors (0 out of 1602 datasets, ~0%) ✅

No changes were detected due to the code change.

Dropped Errors (0 out of 1602 datasets, ~0%) ✅

No changes were detected due to the code change.

New Warnings (0 out of 1602 datasets, ~0%) ✅

No changes were detected due to the code change.

Dropped Warnings (0 out of 1602 datasets, ~0%) ✅

No changes were detected due to the code change.

🛡️ Corruption Check

0 out of 1602 sources (~0 %) are corrupted.

⏱️ Performance Assessment

📈 Validation Time

Assess the performance in terms of seconds taken for the validation process.

| Time Metric | Dataset ID | Reference (s) | Latest (s) | Difference (s) |
|---|---|---|---|---|
| Average | -- | 3.99 | 4.07 | ⬆️+0.08 |
| Median | -- | 1.38 | 1.43 | ⬆️+0.05 |
| Standard Deviation | -- | 11.45 | 11.60 | ⬆️+0.14 |
| Minimum in Reference Reports | us-california-city-of-wasco-gtfs-1788 | 0.51 | 0.52 | ⬆️+0.01 |
| Maximum in Reference Reports | gb-unknown-uk-aggregate-feed-gtfs-2014 | 294.66 | 300.56 | ⬆️+5.91 |
| Minimum in Latest Reports | us-california-city-of-wasco-gtfs-1788 | 0.51 | 0.52 | ⬆️+0.01 |
| Maximum in Latest Reports | gb-unknown-uk-aggregate-feed-gtfs-2014 | 294.66 | 300.56 | ⬆️+5.91 |

📜 Memory Consumption

| Metric | Dataset ID | Reference | Latest | Difference |
|---|---|---|---|---|
| Average | -- | 515.60 MiB | 486.33 MiB | ⬇️-29.27 MiB |
| Median | -- | 245.94 MiB | 246.25 MiB | ⬆️+315.02 KiB |
| Standard Deviation | -- | 971.66 MiB | 869.81 MiB | ⬇️-101.86 MiB |
| Minimum in Reference Reports | us-massachusetts-massachusetts-area-express-max-gtfs-431 | 34.05 MiB | 34.06 MiB | ⬆️+16.00 KiB |
| Maximum in Reference Reports | gb-unknown-uk-aggregate-feed-gtfs-2014 | 10.00 GiB | 10.20 GiB | ⬆️+199.76 MiB |
| Minimum in Latest Reports | tr-kocaeli-metro-izmir-gtfs-1824 | 34.06 MiB | 34.06 MiB | ⬇️0 bytes |
| Maximum in Latest Reports | gb-unknown-uk-aggregate-feed-gtfs-2014 | 10.00 GiB | 10.20 GiB | ⬆️+199.76 MiB |

@jcpitre jcpitre changed the title feat: Added a validator for the presence of shape_distance feat: Added a validator for forbidden shape_distance Oct 18, 2024

@davidgamez davidgamez left a comment

LGTM!


📝 Acceptance Test Report

📋 Summary

✅ The rule acceptance has passed for commit 97741e7
Download the full acceptance test report here (report will disappear after 90 days).

📊 Notices Comparison

New Errors (0 out of 1602 datasets, ~0%) ✅

No changes were detected due to the code change.

Dropped Errors (0 out of 1602 datasets, ~0%) ✅

No changes were detected due to the code change.

New Warnings (0 out of 1602 datasets, ~0%) ✅

No changes were detected due to the code change.

Dropped Warnings (0 out of 1602 datasets, ~0%) ✅

No changes were detected due to the code change.

🛡️ Corruption Check

0 out of 1602 sources (~0 %) are corrupted.

⏱️ Performance Assessment

📈 Validation Time

Assess the performance in terms of seconds taken for the validation process.

| Time Metric | Dataset ID | Reference (s) | Latest (s) | Difference (s) |
|---|---|---|---|---|
| Average | -- | 4.02 | 4.11 | ⬆️+0.09 |
| Median | -- | 1.39 | 1.47 | ⬆️+0.08 |
| Standard Deviation | -- | 11.51 | 11.60 | ⬆️+0.09 |
| Minimum in Reference Reports | us-massachusetts-massachusetts-area-express-max-gtfs-431 | 0.53 | 0.63 | ⬆️+0.10 |
| Maximum in Reference Reports | gb-unknown-uk-aggregate-feed-gtfs-2014 | 302.65 | 302.82 | ⬆️+0.17 |
| Minimum in Latest Reports | us-california-city-of-wasco-gtfs-1788 | 0.56 | 0.54 | ⬇️-0.02 |
| Maximum in Latest Reports | gb-unknown-uk-aggregate-feed-gtfs-2014 | 302.65 | 302.82 | ⬆️+0.17 |

📜 Memory Consumption

| Metric | Dataset ID | Reference | Latest | Difference |
|---|---|---|---|---|
| Average | -- | 484.98 MiB | 500.40 MiB | ⬆️+15.42 MiB |
| Median | -- | 245.70 MiB | 246.00 MiB | ⬆️+306.62 KiB |
| Standard Deviation | -- | 875.66 MiB | 906.68 MiB | ⬆️+31.01 MiB |
| Minimum in Reference Reports | tr-kocaeli-metro-izmir-gtfs-1824 | 34.05 MiB | 34.06 MiB | ⬆️+16.00 KiB |
| Maximum in Reference Reports | gb-unknown-uk-aggregate-feed-gtfs-2014 | 9.75 GiB | 10.19 GiB | ⬆️+448.89 MiB |
| Minimum in Latest Reports | us-california-catalina-express-gtfs-299 | 34.06 MiB | 34.05 MiB | ⬇️-8.00 KiB |
| Maximum in Latest Reports | gb-unknown-uk-aggregate-feed-gtfs-2014 | 9.75 GiB | 10.19 GiB | ⬆️+448.89 MiB |

@jcpitre jcpitre merged commit b8012a5 into master Oct 21, 2024
335 checks passed
@jcpitre jcpitre deleted the 1885-flex-forbidden_shape_dist_traveled branch October 21, 2024 13:31
Linked issue: Flex: forbidden_shape_dist_traveled (#1885)