Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feature] warning if more than one timezone is reported on agency file #705

Merged
merged 3 commits into from
Sep 21, 2020

Conversation

patochectp
Copy link
Member

@patochectp patochectp commented Sep 18, 2020

ref: ND-1005.

src/gtfs/read.rs Outdated
Comment on lines 580 to 593
for (current_agency, next_agency) in gtfs_agencies.iter().zip(gtfs_agencies.iter().skip(1)) {
if current_agency.timezone != next_agency.timezone {
warn!(
"different agency timezone: {} - {}",
current_agency.timezone, next_agency.timezone
);
break;
}
}
Copy link
Contributor

@woshilapin woshilapin Sep 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about something like this?

Suggested change
for (current_agency, next_agency) in gtfs_agencies.iter().zip(gtfs_agencies.iter().skip(1)) {
if current_agency.timezone != next_agency.timezone {
warn!(
"different agency timezone: {} - {}",
current_agency.timezone, next_agency.timezone
);
break;
}
}
let timezones: HashSet<_> = gtfs_agencies.iter().map(|agency| agency.timezone).collect();
if timezones.len() > 1 {
if current_agency.timezone != next_agency.timezone {
warn!(
"multiple agency timezones found: {:?}",
timezones,
);
break;
}
}

src/gtfs/read.rs Outdated
Comment on lines 1475 to 1487
testing_logger::validate(|captured_logs| {
assert_eq!(captured_logs.len(), 2);
assert_eq!(
captured_logs[1].body,
"different agency timezone: Europe/London - Europe/Paris"
);
assert_eq!(captured_logs[1].level, LogLevel::Warn);
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

@pbougue pbougue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add test with only one agency ?

src/gtfs/read.rs Outdated
@@ -576,6 +576,20 @@ where
{
let filename = "agency.txt";
let gtfs_agencies = read_objects::<_, Agency>(file_handler, filename)?;

for (current_agency, next_agency) in gtfs_agencies.iter().zip(gtfs_agencies.iter().skip(1)) {
Copy link
Contributor

@pbougue pbougue Sep 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just compare all others to the first one (that would be the only referent, and it would save one iterator) ?
It looks weird to me to use a different "referent" each time we compare (in the current version), but it works to output desired warning.

Do we want to point out the less-represented timezone ? Not sure it's super useful, and it would require more code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pbougue
It breaks at the first difference, and without memory allocation 🤔

Copy link
Contributor

@pbougue pbougue Sep 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify my comment after discussion, I suggest comparing only gtfs_agencies.first() with all the following.
Something along that (not tested at all, maybe store only referent timezone, not agency):

let referent_agency = gtfs_agencies.first();

for agency in gtfs_agencies.iter().skip(1) {
    if referent_agency.timezone != agency.timezone {
        [...]
    }
}

Anyway it's just a suggestion not a veto as current code will warn if and only if problems arise.

And indeed I didn't see the break (proposed solution could warn on all differences).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the number of iterations will be the same.
If we have a, b, ,c and d

my proposal will give :
a compare with b
b compare with c
c compare with d

your proposal will give :
a compare with b
a compare with c
a compare with d

Copy link
Contributor

@woshilapin woshilapin Sep 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like @pbougue's solution because:

  • it's one iterator instead of 2 iterators (1N complexity instead of 2N complexity)
  • it feels easier to understand what we're trying to do (the .zip() feel a little less natural)

But in any case, I'm fine with whatever we end up merging.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I capitulate because it is more readable

@patochectp patochectp force-pushed the different_agency_timezone branch from b79b3fb to c0f7eb7 Compare September 21, 2020 13:27
@patochectp patochectp merged commit 2acdd2b into hove-io:master Sep 21, 2020
@patochectp patochectp deleted the different_agency_timezone branch September 21, 2020 14:44
@ArnaudOggy ArnaudOggy mentioned this pull request Sep 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants