Specs gtfs2ntfs: reading calendars & common pre-processing #438

Merged
merged 3 commits into from
Nov 7, 2019

Conversation

papailio
Contributor

I added the specs on reading calendars and renamed the file data_prefix.md to common.md, which now gathers the parts common to all converters: for now, the specs on data prefix and datasets/contributors.
Eventually, the sanitizer will be part of this file too.

The objects `contributor` and `dataset` are required, containing at least the
corresponding identifier (and the name for `contributor`), otherwise the conversion
stops with an error. The object `feed_infos` is optional.
The files datasets.txt and contributors.txt provide additional information about the data source.
Member

same here

In general, I think we should write file names in bold or in some other distinct style.

@@ -178,6 +179,17 @@ specified, the conversion should stop immediately with an error.

(1) When several GTFS Routes with different `route_type`s are grouped together, the commercial_mode_id with the smallest priority should be used (as specified in chapter "Mapping of route_type with modes").

### Reading calendars.txt and calendar_dates.txt
Dates of service are transformed into explicit active service exceptions as if using a single NTFS file calendar_dates.txt. The resulting files might be different following an optimization operation applied at the end of the conversion, but the result should be functionally identical.
* In case both files calendar.txt and calendar_dates.txt are present in the input dataset, the days of the week of the specified services within the date range [`start_date` - `end_date`] are transformed into explicit active service dates, taking into account the dates when service exceptions occur. Note that the generated (`service_id`, `date`) pairs must be unique.
Contributor

Similar to what @datanel already mentioned above.

Suggested change
* In case both files calendar.txt and calendar_dates.txt are present in the input dataset, the days of the week of the specified services within the date range [`start_date` - `end_date`] are transformed into explicit active service dates, taking into account the dates when service exceptions occur. Note that the generated (`service_id`, `date`) pairs must be unique.
* In case both files `calendar.txt` and `calendar_dates.txt` are present in the input dataset, the days of the week of the specified services within the date range [`start_date` - `end_date`] are transformed into explicit active service dates, taking into account the dates when service exceptions occur. Note that the generated (`service_id`, `date`) pairs must be unique.

### Reading calendars.txt and calendar_dates.txt
Dates of service are transformed into explicit active service exceptions as if using a single NTFS file calendar_dates.txt. The resulting files might be different following an optimization operation applied at the end of the conversion, but the result should be functionally identical.
* In case both files calendar.txt and calendar_dates.txt are present in the input dataset, the days of the week of the specified services within the date range [`start_date` - `end_date`] are transformed into explicit active service dates, taking into account the dates when service exceptions occur. Note that the generated (`service_id`, `date`) pairs must be unique.
* In case the file calendar.txt is empty or not present in the input dataset, the active service dates are loaded as is.
Contributor

Suggested change
* In case the file calendar.txt is empty or not present in the input dataset, the active service dates are loaded as is.
* In case the file `calendar.txt` is empty or not present in the input dataset, the active service dates are loaded as is.
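
To make the two rules above concrete, here is a minimal Python sketch of the expansion. It is not the converter's actual implementation: the function names and the dict-per-row input shape are assumptions made for illustration, while the column names and the `exception_type` values (1 = service added, 2 = service removed) come from the GTFS reference.

```python
from datetime import date, timedelta

# GTFS calendar.txt weekday columns, in Python's weekday() order (Monday = 0).
WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def parse_gtfs_date(text):
    """Parse a GTFS date written as YYYYMMDD."""
    return date(int(text[:4]), int(text[4:6]), int(text[6:8]))


def expand_services(calendar_rows, calendar_date_rows):
    """Return {service_id: sorted list of active dates}.

    Hypothetical helper: each row is a dict keyed by GTFS column name,
    with every value kept as a string, as csv.DictReader would yield.
    """
    active = {}
    # Step 1: expand the weekday pattern over [start_date, end_date].
    for row in calendar_rows:
        dates = active.setdefault(row["service_id"], set())
        day = parse_gtfs_date(row["start_date"])
        end = parse_gtfs_date(row["end_date"])
        while day <= end:
            if row[WEEKDAYS[day.weekday()]] == "1":
                dates.add(day)
            day += timedelta(days=1)
    # Step 2: apply the exceptions. When calendar_rows is empty, this step
    # alone loads the active dates of calendar_dates.txt as is.
    for row in calendar_date_rows:
        dates = active.setdefault(row["service_id"], set())
        day = parse_gtfs_date(row["date"])
        if row["exception_type"] == "1":    # service added on that date
            dates.add(day)
        elif row["exception_type"] == "2":  # service removed on that date
            dates.discard(day)
    # Sets keep every (service_id, date) pair unique.
    return {sid: sorted(days) for sid, days in active.items()}
```

Storing the dates of each service in a set is one simple way to satisfy the uniqueness constraint on (`service_id`, `date`) pairs: an added exception that falls on an already active day is absorbed silently.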

A configuration file `config.json`, as it is shown below, is provided for each
converter and contains additional information about the data source as well as about
the upstream system that generated the data (if available). In particular, it
provides the necessary information for the required NTFS files datasets.txt and contributors.txt.
Contributor

Suggested change
provides the necessary information for the required NTFS files datasets.txt and contributors.txt.
provides the necessary information for:
- the required NTFS files datasets.txt and contributors.txt,
- some additional metadata to be inserted in the `feed_infos.txt` file.
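
As an illustration of the requirement that `contributor` and `dataset` be present while `feed_infos` stays optional, here is a small Python sketch of a configuration check. The key names `contributor_id`, `contributor_name` and `dataset_id`, the `ConfigError` class and the `load_config` function are all assumptions made for this example; the excerpt does not show the exact layout of `config.json`.

```python
import json


class ConfigError(ValueError):
    """Raised when a required part of the configuration is missing."""


# Hypothetical key names: the spec only says each object needs its
# identifier, plus the name for `contributor`.
REQUIRED = {
    "contributor": ["contributor_id", "contributor_name"],
    "dataset": ["dataset_id"],
}


def load_config(text):
    """Parse a config.json payload and stop with an error if it is incomplete."""
    config = json.loads(text)
    if not isinstance(config, dict):
        raise ConfigError("the configuration must be a JSON object")
    for obj, keys in REQUIRED.items():
        if not isinstance(config.get(obj), dict):
            raise ConfigError(f"missing required object `{obj}`")
        for key in keys:
            if not config[obj].get(key):
                raise ConfigError(f"missing required field `{obj}.{key}`")
    # `feed_infos` is optional: default to an empty object.
    config.setdefault("feed_infos", {})
    return config
```

Raising on the first missing field mirrors the "conversion stops with an error" behaviour described in the spec, under the assumed key names above.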

@@ -178,6 +179,17 @@ specified, the conversion should stop immediately with an error.

(1) When several GTFS Routes with different `route_type`s are grouped together, the commercial_mode_id with the smallest priority should be used (as specified in chapter "Mapping of route_type with modes").

### Reading calendars.txt and calendar_dates.txt
Dates of service are transformed into explicit active service exceptions as if using a single NTFS file calendar_dates.txt. The resulting files might be different following an optimization operation applied at the end of the conversion, but the result should be functionally identical.
Contributor

Suggested change
Dates of service are transformed into explicit active service exceptions as if using a single NTFS file calendar_dates.txt. The resulting files might be different following an optimization operation applied at the end of the conversion, but the result should be functionally identical.
GTFS services are transformed into lists of active dates as if using a single NTFS file calendar_dates.txt. The resulting NTFS files might be different following an optimization operation applied at the end of the conversion, but the result should be functionally identical.

Shouldn't this calendar optimization be described in the common.md file?

@mergify mergify bot dismissed woshilapin’s stale review October 30, 2019 12:59

Pull request has been modified.

@woshilapin
Contributor

This PR has been open for a while now. @datanel @prhod, since you both commented on this PR, are you both fine with merging it?

@datanel datanel force-pushed the doc_read_gtfs_calendars branch from 2b23714 to e85e429 Compare November 7, 2019 10:49
@mergify mergify bot merged commit 918183f into hove-io:master Nov 7, 2019