Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow partial reads and align with GTFS specifications for stop_lat, stop_long, and transfer_type #68

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

InterferencePattern
Copy link
Collaborator

@InterferencePattern InterferencePattern commented Dec 15, 2022

I'm not sure if you're interested in this change of behavior, but it may be valuable to read a GTFS file partially instead of failing due to a single malformed record.

Additionally, changed validators to match reference

  • Allows 4 and 5 as transfer_type
  • Allows None for stop_lat and stop_long

@InterferencePattern InterferencePattern changed the title Skip malformed records Skip malformed records and align with GTFS specifications for stop_lat, stop_long, and transfer_type Dec 15, 2022
@InterferencePattern InterferencePattern changed the title Skip malformed records and align with GTFS specifications for stop_lat, stop_long, and transfer_type Allow partial reads and align with GTFS specifications for stop_lat, stop_long, and transfer_type Dec 20, 2022
Copy link
Owner

@jarondl jarondl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for the contribution!

@@ -70,6 +70,8 @@ def in_range(self, key, value):
def _validate_float_range(float_min, float_max, *field_names):
@validates(*field_names)
def in_range(self, key, value):
if value is None:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this make all of the in_range validated fields nullable? What am I missing?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since some floats are nullable, this rule was previously too strict, but you're right- this solution may be overly broad.

I can look into adding a nullable argument so this check won't fail only for nullable columns.

Copy link
Collaborator Author

@InterferencePattern InterferencePattern Jan 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check this, I think I've made the necessary change to at least allow a check on nullable columns. I still need to make it conditional on the value of another column (but I'm new to sqlalchemy)

pygtfs/loader.py Outdated Show resolved Hide resolved
pygtfs/loader.py Show resolved Hide resolved
@InterferencePattern
Copy link
Collaborator Author

InterferencePattern commented Jan 6, 2023

Do you have any examples where you've successfully performed conditional checks? The checks I see only require one column at a time.

For stop_lat and stop_long:

Conditionally Required:

  • Required for locations which are stops (location_type=0), stations (location_type=1) or entrances/exits (location_type=2).
  • Optional for locations which are generic nodes (location_type=3) or boarding areas (location_type=4).

@jarondl
Copy link
Owner

jarondl commented Jan 26, 2023

No I don't have examples for that. I guess we can simply be lenient here.. I'm not a fan of standards with optional required fields.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants