This document describes how the GTFS format is read into the Navitia Transit Model. To improve readability, the specification describes the transformation of a GTFS feed into an NTFS feed (a set of CSV files that mirrors the in-memory Navitia Transit Model).
If at any point during the conversion the GTFS feed does not conform to the GTFS specification, the conversion should stop immediately with an error, unless otherwise specified.
At the end of the conversion, a sanitizing operation is started on the final model. See common NTFS rules for more information.
As explained in common NTFS rules, a prefix is added to all identifiers during the conversion in order to guarantee uniqueness among object IDs. In the following chapters, identifiers may be prepended with this prefix using the pattern `<prefix>:<object_id>`. The use of this pattern is shown explicitly by the value ID in the Constraint column of the tables below.
To reinforce uniqueness, some objects may have an automatically generated sub-prefix in addition to their prefix.
The pattern is then `<prefix>:<sub_prefix>:<object_id>`.
Objects concerned by this sub-prefix in this connector are: calendars, trips, trip_properties, frequencies, comments, comment_links, geometries and equipments.
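
The two identifier patterns above can be sketched as follows. The function name `prefixed_id` and its parameters are hypothetical and only illustrate the pattern; they are not part of the converter.

```python
from typing import Optional


def prefixed_id(prefix: str, object_id: str, sub_prefix: Optional[str] = None) -> str:
    """Build `<prefix>:<object_id>` or `<prefix>:<sub_prefix>:<object_id>`."""
    if sub_prefix:
        return f"{prefix}:{sub_prefix}:{object_id}"
    return f"{prefix}:{object_id}"
```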
In addition, the NTFS format introduces 2 objects to enable the manipulation of several datasets: contributors and datasets. Those two objects are described in common NTFS rules.
Two parameters can be specified as CLI arguments of the converter to determine whether on-demand transport (ODT) data should be considered when reading the input GTFS (in particular, when reading the stop_times.txt file):
- a boolean parameter `--odt`, set to `false` by default, indicating whether the GTFS should be considered as containing ODT information
- a string parameter `--odt-comment "some message"`, setting the message associated with an ODT comment.

A third boolean CLI argument (`--read-as-line`) may affect the reading of the file routes.txt. If true, each GTFS "Route" generates a different "Line"; otherwise the routes are grouped by "agency_id" and "route_short_name" (or "route_long_name" if the short name is empty) and a "Line" is created for each group.
GTFS object | NTFS object(s) |
---|---|
agency | network and company |
route | line, route, physical_mode, commercial_mode |
trip | route and trip |
stop_time | stop_time |
transfer | transfer |
shape | geometry |
frequency | trip and stop_time |
The field "agency_id" may not be provided in the GTFS as it's an optional field.
- If there is only one agency, the "agency_id" is considered to be "1".
- If there are several agencies, the program will raise an exception as it won't be able to choose the right agency for the routes.
If 2 networks with the same ID are specified, the conversion should stop immediately with an error.
NTFS file | NTFS field | Constraint | GTFS file | GTFS field | Note |
---|---|---|---|---|---|
networks.txt | network_id | ID | agency.txt | agency_id | See above when not specified |
networks.txt | network_name | Required | agency.txt | agency_name | |
networks.txt | network_url | Optional | agency.txt | agency_url | |
networks.txt | network_timezone | Optional | agency.txt | agency_timezone | |
networks.txt | network_lang | Optional | agency.txt | agency_lang | |
networks.txt | network_phone | Optional | agency.txt | agency_phone | |
networks.txt | network_fare_url | Optional | agency.txt | agency_fare_url |
"Source" complementary code :
A complementary `object_code` is added to each network with the following properties:
- `object_type`: the fixed value `network`
- `object_id`: the value of the `network_id` field
- `object_system`: the fixed value `source`
- `object_code`: the unmodified value of `agency_id` (or `1` if the value is not provided, as stated above)

If 2 companies with the same ID are specified, the conversion should stop immediately with an error.
NTFS file | NTFS field | Constraint | GTFS file | GTFS field | Note |
---|---|---|---|---|---|
companies.txt | company_id | ID | agency.txt | agency_id | 1 if the value is not provided (same rule as networks) |
companies.txt | company_name | Required | agency.txt | agency_name | |
companies.txt | company_url | Optional | agency.txt | agency_url | |
companies.txt | company_phone | Optional | agency.txt | agency_phone |
"Source" complementary code :
A complementary `object_code` is added to each company with the following properties:
- `object_type`: the fixed value `company`
- `object_id`: the value of the `company_id` field
- `object_system`: the fixed value `source`
- `object_code`: the unmodified value of `agency_id` (or `1` if the value is not provided, as stated above)

Like the GTFS, the NTFS groups stop_points and stop_areas in one file: stops.txt. If two stop_points have the same ID, the conversion should stop immediately with an error. Likewise for the stop_areas.
NTFS file | NTFS field | Constraint | GTFS file | GTFS field | Note |
---|---|---|---|---|---|
stops.txt | stop_id | ID | stops.txt | stop_id | All slashes / will be removed |
stops.txt | stop_code | Optional | stops.txt | stop_code | Additionally, this GTFS property is stored as an associated code for this stop. See (2) for complementary properties. |
stops.txt | stop_name | Required | stops.txt | stop_name | |
stops.txt | stop_lat | Required | stops.txt | stop_lat | |
stops.txt | stop_lon | Required | stops.txt | stop_lon | |
stops.txt | location_type | Optional | stops.txt | location_type | The value is set to 0 if the input value is 0 or unspecified or invalid, 1 if the input value is 1 , 3 if the input value is 2 , 4 if the input value is 3 and 5 if the input value is 4 |
stops.txt | parent_station | Optional | stops.txt | parent_station | All slashes / are removed (1) |
stops.txt | stop_timezone | Optional | stops.txt | stop_timezone | |
stops.txt | fare_zone_id | Optional | stops.txt | zone_id | Only for stop_point (location_type = 0) |
comments.txt | comment_value | Optional | stops.txt | stop_desc | See (3) for additional properties |
equipments.txt | wheelchair_boarding | Optional | stops.txt | wheelchair_boarding | If value is not one of 0 , 1 or 2 , then set to 0 . See (4) for detailed info. |
(1) If the `parent_station` field of a stop_point (`location_type` = 0 or empty) is missing or empty, then a stop_area should be created, using the following properties:
- `stop_id`: the stop_id of the stop_point, with the following pattern: `Navitia:<stop_id of the stop_point>`
- `stop_name`: the stop_name of the stop_point
- `stop_lat`: the stop_lat of the stop_point
- `stop_lon`: the stop_lon of the stop_point
- `location_type`: fixed value "1" (to specify it's a stop_area)

The `parent_station` of the stop_point should then contain the generated `stop_area.id`.
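
As an illustration of rule (1), here is a minimal sketch that generates the missing stop_area for a stop_point represented as a plain dictionary of GTFS fields. The function name `ensure_parent_station` and the dictionary representation are assumptions made for this example only.

```python
from typing import Optional


def ensure_parent_station(stop_point: dict) -> Optional[dict]:
    """Return the generated stop_area for an orphan stop_point, or None."""
    # Only stop_points (location_type 0 or empty) are concerned.
    if stop_point.get("location_type", "") not in ("", "0"):
        return None
    # Nothing to do when a parent_station is already set.
    if stop_point.get("parent_station"):
        return None
    stop_area = {
        "stop_id": f"Navitia:{stop_point['stop_id']}",
        "stop_name": stop_point["stop_name"],
        "stop_lat": stop_point["stop_lat"],
        "stop_lon": stop_point["stop_lon"],
        "location_type": "1",  # marks the object as a stop_area
    }
    # The stop_point now references the generated stop_area.
    stop_point["parent_station"] = stop_area["stop_id"]
    return stop_area
```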
(2) The `stop_code` field is added as a complementary `object_code` with the following properties:
- `object_type`: `stop_point` or `stop_area`, according to the `location_type` value
- `object_id`: NTFS `stop_id`
- `object_system`: fixed value `gtfs_stop_code`
- `object_code`: value of the `stop_code` property

The `gtfs_stop_code` complementary `object_code` is kept here for backward compatibility reasons. It will be removed in the future.
(3) The `comment` object is a complex type with additional properties:
- `comment_id`: specify an identifier with the pattern `stop:<stop_id of GTFS>`
- `comment_type`: specify the fixed value "information"

(4) The `equipment` object is a complex type with additional properties:
- `equipment_id`: should be generated by the reader.
- `wheelchair_boarding`: possible values are the same in both GTFS and NTFS.

Be careful to only create necessary equipments and avoid duplicates.
"Source" complementary code :
A complementary `object_code` is added to each stop with the following properties:
- `object_type`: the fixed value `stop_point` or `stop_area` (depending on the object)
- `object_id`: the NTFS value of the `stop_id` field
- `object_system`: the fixed value `source`
- `object_code`: the unmodified GTFS value of `stop_id`

The standard values of the `route_type` field are directly mapped to the NTFS modes. Extended GTFS modes are read by categories mapping the most prominent mode. The priority is used to prioritize the use of a commercial mode when creating a Line grouping routes with different `route_type`s. This prioritization follows the Netex Specification in chapter 6.2.3 (and is also indicated in the NTFS Specification).
GTFS route_type | NTFS physical_mode ID (1) | NTFS commercial_mode ID (2) | NTFS commercial_mode name | Priority |
---|---|---|---|---|
0, 9XX | Tramway | Tramway | Tramway | 3 |
1, 4XX, 5XX, 6XX | Metro | Metro | Metro | 4 |
2, 1XX, 3XX | Train | Train | Train | 2 |
3, 7XX, 8XX | Bus | Bus | Bus | 8 |
4, 10XX, 12XX | Ferry | Ferry | Ferry | 1 |
5 | Funicular | CableCar | Cable car | 6 |
6, 13XX | SuspendedCableCar | SuspendedCableCar | Suspended cable car | 7 |
7, 14XX | Funicular | Funicular | Funicular | 5 |
2XX | Coach | Coach | Coach | 8 |
11XX | Air | Air | Airplane | 0 |
15XX | Taxi | Taxi | Taxi | 8 |
16XX, 17XX | Bus | UnknownMode | Unknown mode | 8 |
(1) The physical_mode ID is a mapping with a specific value as described in the NTFS format specification. This value must not be prefixed.

(2) The commercial_mode IDs are standardized when converting from GTFS. This value must not be prefixed.
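
The mapping table above can be read as a lookup on the standard value first, then on the hundreds category of an extended value. The sketch below transcribes the table; the function name `modes_for_route_type` is hypothetical, and raising an error for values absent from the table is an assumption, since the table does not say how they are handled.

```python
# Each entry: (physical_mode ID, commercial_mode ID, commercial_mode name, priority)
TRAMWAY = ("Tramway", "Tramway", "Tramway", 3)
METRO = ("Metro", "Metro", "Metro", 4)
TRAIN = ("Train", "Train", "Train", 2)
BUS = ("Bus", "Bus", "Bus", 8)
FERRY = ("Ferry", "Ferry", "Ferry", 1)
CABLE_CAR = ("Funicular", "CableCar", "Cable car", 6)
SUSPENDED = ("SuspendedCableCar", "SuspendedCableCar", "Suspended cable car", 7)
FUNICULAR = ("Funicular", "Funicular", "Funicular", 5)
COACH = ("Coach", "Coach", "Coach", 8)
AIR = ("Air", "Air", "Airplane", 0)
TAXI = ("Taxi", "Taxi", "Taxi", 8)
UNKNOWN = ("Bus", "UnknownMode", "Unknown mode", 8)

# Standard GTFS route_type values.
STANDARD = {0: TRAMWAY, 1: METRO, 2: TRAIN, 3: BUS,
            4: FERRY, 5: CABLE_CAR, 6: SUSPENDED, 7: FUNICULAR}

# Extended route_type values, keyed by their hundreds category (9 means 9XX).
EXTENDED = {1: TRAIN, 2: COACH, 3: TRAIN, 4: METRO, 5: METRO, 6: METRO,
            7: BUS, 8: BUS, 9: TRAMWAY, 10: FERRY, 11: AIR, 12: FERRY,
            13: SUSPENDED, 14: FUNICULAR, 15: TAXI, 16: UNKNOWN, 17: UNKNOWN}


def modes_for_route_type(route_type: int) -> tuple:
    """Return the mode tuple for a GTFS route_type, following the table."""
    if route_type in STANDARD:
        return STANDARD[route_type]
    if route_type >= 100 and route_type // 100 in EXTENDED:
        return EXTENDED[route_type // 100]
    # Assumption: values not listed in the table are rejected here.
    raise ValueError(f"route_type {route_type} is not covered by the table")
```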
All `physical_mode`s are enhanced with CO2 emission and fallback modes, following the documentation in common NTFS rules.
A Route is created for each direction of existing trips. If 2 routes with the same ID are specified, the conversion should stop immediately with an error. Warning : If the GTFS route has no trips, the Navitia Route should NOT be created and a warning should be logged.
NTFS file | NTFS field | Constraint | GTFS file | GTFS field | Note |
---|---|---|---|---|---|
routes.txt | route_id | ID | routes.txt | route_id | append a _R suffix for the Route grouping trips with direction_id = 1 (no suffix for 0 or undefined direction_id ) |
routes.txt | route_name | Required | routes.txt | route_long_name | (1) |
routes.txt | direction_type | Optional | | | (2) |
routes.txt | line_id | Required | | | corresponding line.id (see Line construction above) |
routes.txt | destination_id | Optional | | | This field contains a stop_area.id of the most frequent destination of the contained trips (i.e. the parent_station of the most frequent last stop of trips) |
comments.txt | comment_value | Optional | routes.txt | route_desc | The comment is generated only when the parameter read-as-line is deactivated. See (3) for additional properties |
(1) If only one route is created (only one direction in included trips), use `route_long_name` or, if empty, use `route_short_name`. In case of multiple routes created (multiple directions in included trips), see common NTFS rules for generating the `route_name`.

(2) The field `direction_type` contains `backward` when grouping GTFS Trips with `direction_id` = 1, `forward` otherwise.
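
The `_R` suffix rule from the table and rule (2) can be combined in one small helper. This is only a sketch: the function name is hypothetical, and the `direction_id` is assumed to be passed as the raw string read from trips.txt (or `None` when the column is missing).

```python
from typing import Optional, Tuple


def route_id_and_direction(gtfs_route_id: str, direction_id: Optional[str]) -> Tuple[str, str]:
    """Return the NTFS (route_id, direction_type) for a group of trips."""
    if direction_id == "1":
        return f"{gtfs_route_id}_R", "backward"
    # direction_id equal to 0, empty or undefined: no suffix, forward.
    return gtfs_route_id, "forward"
```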
(3) The `comment` object is a complex type with additional properties:
- `comment_id`: specify an identifier with the pattern `<prefix>:route:<route_id of GTFS>`
- `comment_type`: specify the fixed value "Information"

"Source" complementary code :
A complementary `object_code` is added to each route with the following properties:
- `object_type`: the fixed value `route`
- `object_id`: the NTFS value of the `route_id` field
- `object_system`: the fixed value `source`
- `object_code`: the unmodified GTFS value of `route_id`

A Navitia Line is created to group one or several Navitia Routes when they are created with the same GTFS `agency_id` and the same `route_short_name` (or `route_long_name` if the short name is empty). If 2 lines with the same ID are specified, the conversion should stop immediately with an error.
NTFS file | NTFS field | Constraint | GTFS file | GTFS field | Note |
---|---|---|---|---|---|
lines.txt | network_id | Required | | | This field should contain the network.id corresponding to the agency_id of the routes; if no agency_id is specified in the route, use the ID of the unique network; if no network or multiple networks are available, the conversion should stop immediately with an error |
lines.txt | line_id | ID | routes.txt | route_id | Use the smallest route_id of the grouped gtfs Route |
lines.txt | line_code | Optional | routes.txt | route_short_name | |
lines.txt | line_name | Required | routes.txt | | The Navitia route_name of the Route with the smallest route_id (as a string) is used. |
lines.txt | line_color | Optional | routes.txt | route_color | if several values are available, a warning is logged and the color of the smallest route_id is used; if color format is incorrect, the value is dropped |
lines.txt | line_text_color | Optional | routes.txt | route_text_color | same as line_color; if color format is incorrect, the value is dropped |
lines.txt | line_sort_order | Optional | routes.txt | route_sort_order | |
lines.txt | commercial_mode_id | Required | routes.txt | route_type | See "Mapping of route_type with modes" chapter (1). |
comments.txt | comment_value | Optional | routes.txt | route_desc | The comment is generated only when the parameter read-as-line is activated. See (2) for additional properties. |
(1) When several GTFS Routes with different `route_type`s are grouped together, the commercial_mode_id with the smallest priority should be used (as specified in chapter "Mapping of route_type with modes").

(2) The `comment` object is a complex type with additional properties:
- `comment_id`: specify an identifier with the pattern `<prefix>:line:<route_id of GTFS>`
- `comment_type`: specify the fixed value "Information"

A complementary `object_code` is added to each line with the following properties:
- `object_type`: the fixed value `line`
- `object_id`: the NTFS value of the `line_id` field
- `object_system`: the fixed value `source`
- `object_code`: the unmodified GTFS value of `route_id`
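
The Line construction rules of this chapter (grouping key and choice of the smallest `route_id`) can be sketched as below. GTFS routes are assumed to be plain dictionaries, and the function name `group_routes_into_lines` is hypothetical. The choice of the commercial mode by priority is left out for brevity.

```python
from collections import defaultdict


def group_routes_into_lines(routes, read_as_line=False):
    """Group GTFS routes into lines; returns one dict per line."""
    groups = defaultdict(list)
    for route in routes:
        if read_as_line:
            # One line per GTFS route.
            key = (route["route_id"],)
        else:
            # Group by agency and short name, falling back to the long name.
            name = route.get("route_short_name") or route.get("route_long_name", "")
            key = (route.get("agency_id", ""), name)
        groups[key].append(route)
    lines = []
    for grouped in groups.values():
        # route_id values are compared as strings.
        route_ids = sorted(r["route_id"] for r in grouped)
        lines.append({"line_id": route_ids[0], "route_ids": route_ids})
    return lines
```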
GTFS services are transformed into lists of active dates as if using a single NTFS file `calendar_dates.txt`. The resulting NTFS files might be different following an optimization operation applied at the end of the conversion, but the result should be functionally identical.
- In case both files `calendar.txt` and `calendar_dates.txt` are present in the input dataset, the days of the week of the specified services within the date range [`start_date` - `end_date`] are transformed into explicit active service dates, taking into account the dates when service exceptions occur. Note that the generated (`service_id`, `date`) pairs must be unique.
- In case the file `calendar.txt` is empty or not present in the input dataset, the active service dates are loaded as is.

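
A minimal sketch of this expansion for a single service is given below. The data layout (a dictionary for the calendar.txt row, a list of pairs for the calendar_dates.txt rows) and the function name `active_dates` are assumptions for the example. The `exception_type` values follow the GTFS specification (1 adds a date, 2 removes it).

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def active_dates(calendar, exceptions):
    """Expand one service into a sorted list of unique active dates.

    calendar: dict with start_date, end_date (date objects) and one 0/1 flag
              per weekday, or None when calendar.txt has no row for the service.
    exceptions: iterable of (date, exception_type) pairs from calendar_dates.txt.
    """
    dates = set()  # a set keeps each date at most once
    if calendar is not None:
        day = calendar["start_date"]
        while day <= calendar["end_date"]:
            if calendar[WEEKDAYS[day.weekday()]] == 1:
                dates.add(day)
            day += timedelta(days=1)
    for day, exception_type in exceptions:
        if exception_type == 1:
            dates.add(day)
        elif exception_type == 2:
            dates.discard(day)
    return sorted(dates)
```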
If 2 trips with the same ID are specified, the conversion should stop immediately with an error.
NTFS file | NTFS field | Constraint | GTFS file | GTFS field | Note |
---|---|---|---|---|---|
trips.txt | route_id | Required | trips.txt | route_id | cf. NTFS route_id definition above to specify the proper reference. |
trips.txt | service_id | Required | trips.txt | service_id | |
trips.txt | trip_id | Required | trips.txt | trip_id | |
trips.txt | trip_headsign | Optional | trips.txt | trip_short_name, or if empty trip_headsign | |
trips.txt | block_id | Optional | trips.txt | block_id | |
trips.txt | company_id | Required | routes.txt | agency_id | The company corresponding to the agency_id of the trip's route_id |
trips.txt | physical_mode_id | Required | | | use the route_type; see "Mapping of route_type with modes" chapter |
trips.txt | trip_property_id | Optional | trips.txt | (1) | |
trips.txt | dataset_id | Required | | | The dataset_id provided (cf. gtfs2ntfs.md) |
trips.txt | geometry_id | Optional | trips.txt | shape_id | All slashes / are removed |
(1) The `trip_property` object is a complex type with additional properties:
- `trip_property_id`: should be generated by the reader.
- `wheelchair_accessible`: possible values are the same in both GTFS and NTFS; if value is not one of `0`, `1` or `2`, then set to `0`.
- `bike_accepted`: corresponding to the GTFS `bikes_allowed` property. Possible values are the same in both GTFS and NTFS; if value is not one of `0`, `1` or `2`, then set to `0`.

Be careful to only create necessary `trip_properties` and avoid duplicates.
"Source" complementary code :
A complementary `object_code` is added to each vehicle journey with the following properties:
- `object_type`: the fixed value `trip`
- `object_id`: the value of the `trip_id` field
- `object_system`: the fixed value `source`
- `object_code`: the unmodified GTFS value of `trip_id`

NTFS file | NTFS field | Constraint | GTFS file | GTFS field | Note |
---|---|---|---|---|---|
stop_times.txt | trip_id | Required | stop_times.txt | trip_id | All slashes / are removed; if the corresponding trip doesn't exist, the conversion should stop immediately with an error |
stop_times.txt | arrival_time | Optional | stop_times.txt | arrival_time | If not specified, see (1) |
stop_times.txt | departure_time | Optional | stop_times.txt | departure_time | If not specified, see (1) |
stop_times.txt | stop_id | Required | stop_times.txt | stop_id | If the corresponding stop doesn't exist, the conversion should stop immediately with an error |
stop_times.txt | stop_sequence | Required | stop_times.txt | stop_sequence | |
stop_times.txt | stop_headsign | Optional | stop_times.txt | stop_headsign | |
stop_times.txt | pickup_type | Optional | stop_times.txt | pickup_type | If invalid unsigned integer, default to 0 . If 2 , see (3) for the generation of comments. |
stop_times.txt | drop_off_type | Optional | stop_times.txt | drop_off_type | If invalid unsigned integer, default to 0 . If 2 , see (3) for the generation of comments. |
stop_times.txt | stop_time_precision | Optional | stop_times.txt | timepoint | GTFS and NTFS values are inverted when no ODT information is considered. See (2). If invalid unsigned integer, default to 1 |
(1) GTFS `arrival_time` and `departure_time` should contain values.
- if both of them are empty:
  - if the stop_time is the first or the last of the trip, an error is returned
  - if not, the time should be interpolated (see below).
- if one of them is empty, a warning should be logged and the value of the other field should be copied to the empty one.

Interpolation: if a stop_time needs to be interpolated:
- collect the nearest preceding stop_time and the nearest following stop_time containing a valid time value
- apply a simple distribution for all the intermediate stop_times

For example:

GTFS passing time | NTFS interpolated time |
---|---|
9:00 | 9:00 |
- | 9:30 |
- | 10:00 |
10:30 | 10:30 |
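
The interpolation can be sketched as follows, with times expressed in seconds since midnight and `None` for missing values. The "simple distribution" is assumed here to be an even spacing between the two surrounding known times, which matches the example above; the function name `interpolate_times` is hypothetical.

```python
def interpolate_times(times):
    """Fill None entries by spreading them evenly between known neighbours."""
    if not times or times[0] is None or times[-1] is None:
        # The first and last stop_time of a trip must carry a time.
        raise ValueError("first and last stop_time must have a time")
    result = list(times)
    previous = 0  # index of the last stop_time with a known time
    for index in range(1, len(result)):
        if result[index] is None:
            continue
        gap = index - previous
        step = (result[index] - result[previous]) / gap
        for offset in range(1, gap):
            result[previous + offset] = round(result[previous] + step * offset)
        previous = index
    return result
```

With the example above, `interpolate_times([32400, None, None, 37800])` yields 9:30 (34200) and 10:00 (36000) for the two intermediate stop_times.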
(2) Depending on the value of the parameter `odt`, the GTFS `timepoint` conversion rules for NTFS `stop_time_precision` are:
- if `odt` is set to `false` or empty:
  - if `timepoint` is unspecified => `stop_time_precision` equals 0 (Exact)
  - if `timepoint` equals 1 => `stop_time_precision` equals 0 (Exact)
  - if `timepoint` equals 0 => `stop_time_precision` equals 1 (Approximate)
- if `odt` is set to `true`:
  - if `timepoint` is unspecified => `stop_time_precision` equals 0 (Exact)
  - if `timepoint` equals 1 => `stop_time_precision` equals 0 (Exact)
  - if `timepoint` equals 0 => `stop_time_precision` equals 2 (Estimated, the bus may not even pass through this point)

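
The rules in (2) reduce to a small function. This is a sketch with a hypothetical name; `timepoint` is assumed to be already parsed, with `None` standing for an unspecified value, and any value other than 0 is treated like 1.

```python
from typing import Optional


def stop_time_precision(timepoint: Optional[int], odt: bool = False) -> int:
    """Map a GTFS timepoint to an NTFS stop_time_precision."""
    if timepoint == 0:
        # Approximate without ODT, Estimated with ODT.
        return 2 if odt else 1
    # Unspecified or 1: Exact.
    return 0
```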
(3) A comment associated to the stop_time is created in the files comments.txt and comment_links.txt as follows:
NTFS file | NTFS field | Constraint | Value/Note |
---|---|---|---|
comments.txt | comment_id | Required | The value of stop_time_id is used as the concatenation of trip_id and stop_sequence separated by - . Note that this field is prefixed as explained in common NTFS rules. |
comments.txt | comment_type | Optional | on_demand_transport |
comments.txt | comment_name | Required | The message set for the parameter odt_comment . |
comment_links.txt | object_id | Required | The value of stop_time_id is used as the concatenation of trip_id and stop_sequence separated by - . Note that this field is prefixed as explained in common NTFS rules. |
comment_links.txt | object_type | Required | stop_time |
comment_links.txt | comment_id | Required | The value of stop_time_id is used as the concatenation of trip_id and stop_sequence separated by - . Note that, as this field references the comment in file comments.txt, it should be prefixed as explained in common NTFS rules. |
- If 2 transfers with the same ID are specified, the conversion should stop immediately with an error
- If a line of the file does not conform to the specification, then the line is ignored
NTFS file | NTFS field | Constraint | GTFS file | GTFS field | Note |
---|---|---|---|---|---|
transfers.txt | from_stop_id | Required | transfers.txt | from_stop_id | All slashes / are removed; if the stop_id doesn't exist in stops.txt , the transfer is ignored |
transfers.txt | to_stop_id | Required | transfers.txt | to_stop_id | All slashes / are removed; if the stop_id doesn't exist in stops.txt , the transfer is ignored |
transfers.txt | min_transfer_time | Optional | transfers.txt | see (1) | |
transfers.txt | real_min_transfer_time | Optional | transfers.txt | see (1) | |
transfers.txt | equipment_id | Optional | transfers.txt |
(1) NTFS `min_transfer_time` and `real_min_transfer_time` are calculated as follows. Note that if the GTFS `transfer_type` value is not one of `0`, `1`, `2` or `3`, then it is set to `0`.

GTFS transfer_type | NTFS min_transfer_time | NTFS real_min_transfer_time | Note |
---|---|---|---|
0 | time between 2 stop points | time between 2 stop points + 2 minutes | The time is calculated with the distance as the crow flies and a walking speed of 0.785 m/s. Speed value is lowered because effective transit is not straight |
1 | 0 | 0 | |
2 | GTFS min_transfer_time | GTFS min_transfer_time | Log a warning message if the min_transfer_time is empty |
3 | 86400 | 86400 | |
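
The table above can be sketched as follows. Several points are assumptions made for this example and not stated by the specification: the crow-fly distance is computed with the haversine formula and an Earth radius of 6,371 km, durations are truncated to whole seconds, and a `transfer_type` of 2 with an empty `min_transfer_time` falls back to the computed walking time. All function names are hypothetical.

```python
import math

WALKING_SPEED = 0.785      # m/s, value given in the table
EARTH_RADIUS = 6_371_000   # metres (assumption)


def crow_fly_distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS * math.asin(math.sqrt(a))


def transfer_times(transfer_type, from_coord, to_coord, gtfs_min_transfer_time=None):
    """Return (min_transfer_time, real_min_transfer_time) in seconds."""
    if transfer_type not in (0, 1, 2, 3):
        transfer_type = 0  # invalid values are treated as 0
    if transfer_type == 1:
        return 0, 0
    if transfer_type == 3:
        return 86400, 86400
    if transfer_type == 2 and gtfs_min_transfer_time is not None:
        return gtfs_min_transfer_time, gtfs_min_transfer_time
    # transfer_type 0 (or 2 without a time, by assumption): walking time.
    walking = int(crow_fly_distance(*from_coord, *to_coord) / WALKING_SPEED)
    return walking, walking + 120
```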
NTFS file | NTFS field | Constraint | GTFS file | GTFS field | Note |
---|---|---|---|---|---|
geometries.txt | geometry_id | ID | shapes.txt | shape_id | All slashes / are removed |
geometries.txt | geometry_wkt | Required | shapes.txt | shape_pt_lat, shape_pt_lon, shape_pt_sequence | A WKT LINESTRING geometry is created from the 3 input fields. |
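
As an illustration, the points of one shape can be turned into a WKT LINESTRING as sketched below. Points are assumed to be dictionaries keyed by the GTFS field names, and the function name `shape_to_wkt` is hypothetical. Coordinates are written in the usual WKT order (x then y, so longitude then latitude).

```python
def shape_to_wkt(points):
    """Build a WKT LINESTRING from the points of a single shape_id."""
    # Points are ordered by their sequence number before being joined.
    ordered = sorted(points, key=lambda p: p["shape_pt_sequence"])
    coords = ", ".join(f'{p["shape_pt_lon"]} {p["shape_pt_lat"]}' for p in ordered)
    return f"LINESTRING({coords})"
```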
Frequencies are transformed into explicit passing times by creating new trips that operate at regular intervals within the specified period. For each line of the GTFS frequencies.txt file, the referenced trip and its stop_times are used as a sample to create the new trips, whose stop_times are calculated based on the given headway.
A new trip is created, departing from the first stop every `headway_secs` seconds within the time period between `start_time` and `end_time`. Stop times of the referenced trip are used to calculate the time interval between two stop departures.
The departure time at the first stop of the last trip should not be later than the `end_time` value. In case both values for `start_time` and `end_time` are equal or `end_time` is smaller than `start_time`, the frequency is ignored (no new trip is created).
Note that the referenced trip (and its stop_times) is only used as a sample and is deleted in the resulting data. In case the referenced trip and/or its associated stop_times do not exist, the frequency is ignored (no new trip is created).
The identifier for each new trip is generated using the following pattern: `<trip_id>:<auto-incremented integer>`. Each new trip keeps the rest of the attributes of the sample trip. That is, all new trips are assigned to the same route as the route of the sample trip, have the same service_id, etc.
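
The expansion described above can be sketched as follows, with all times in seconds since midnight. The function name `expand_frequency`, the tuple-based representation of stop_times, the counter starting at 0 and the guard against a non-positive headway are assumptions made for this example.

```python
def expand_frequency(sample_trip_id, sample_stop_times, start_time, end_time, headway_secs):
    """Create explicit trips from one frequencies.txt row.

    sample_stop_times: list of (arrival, departure) pairs of the sample trip,
    in stop order. Returns a list of (new_trip_id, stop_times) pairs.
    """
    # Ignored frequency: empty or reversed period, or missing sample stop_times.
    if end_time <= start_time or not sample_stop_times or headway_secs <= 0:
        return []
    first_departure = sample_stop_times[0][1]
    trips = []
    departure = start_time
    counter = 0
    # The last trip may depart exactly at end_time, but not later.
    while departure <= end_time:
        shift = departure - first_departure
        stop_times = [(arr + shift, dep + shift) for arr, dep in sample_stop_times]
        trips.append((f"{sample_trip_id}:{counter}", stop_times))
        counter += 1
        departure += headway_secs
    return trips
```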
A complementary `object_code` is added to each new trip with the following properties:
- `object_type`: the fixed value `trip`
- `object_id`: the value of the `trip_id` field
- `object_system`: the fixed value `source`
- `object_code`: the unmodified initial GTFS value of `trip_id`