This is a set of guidelines for data publishers providing machine-readable lists of their feeds and for data aggregation platforms providing machine-readable lists of their feed contents to each other. This project is rooted in publishing and sharing lists of GTFS feeds for fixed-route public-transit networks. It's also applicable to real-time transit, bike-share, e-scooter, and other mobility datasets that take the form of "feeds" published at stable URLs. The goals are as follows:
- Publishers provide their own small registries: To provide data creators (e.g., transit agencies and data vendors) a means of posting a list of their public feeds online. The format should be lightweight (no server required to power an API). The registry should also be machine readable, making it simple for data aggregation platforms to automatically recognize and consume newly added feeds.
- Aggregator platforms share their registries: To provide data aggregation platforms (e.g., Transitland, OpenMobilityData, Navitia) a means of sharing their feed registries with each other. Each platform may have a particular focus in terms of the functionality it provides on top of its feed registry. By distributing feed lists among any and all platforms, open data is shared more widely and the burden of data curation is (hopefully) reduced for each platform.
- Related feeds are linked: Different feed types reference each other (e.g., GTFS-realtime references a static GTFS feed, an MDS e-scooter feed references a GBFS bike-share feed). This registry format will provide a lightweight means for data publishers and aggregator platforms to identify these linkages.
- Put it into practice and experiment: The more contributors to these guidelines, the better! Let's consider many options and discuss the pros and cons of each of the registry specifics. Let's also be pragmatic. Our goal at Transitland will be to implement this registry format both for incoming feed submissions (to complement the existing Transitland Feed Registry "add a feed" process) and for outputting lists of known feeds (the Datastore API feeds endpoint).
  - DMFR now powers the new Transitland Atlas, which is the source of truth for the Feed Registry of both Transitland v1 and Transitland v2.
DMFR stands on the shoulders of:
- GTFS linked_datasets.txt google/transit#93
- GBFS gbfs.json https://github.com/NABSA/gbfs/blob/master/gbfs.md#gbfsjson
- Transitland Feed Registry v1 https://github.com/transitland/transitland-feed-registry
- Transitland Feed Registry v2 http://transit.land/feed-registry/
- MDS providers.csv https://github.com/CityOfLosAngeles/mobility-data-specification/blob/dev/providers.csv
- GBFS systems.csv https://github.com/NABSA/gbfs/blob/master/systems.csv
Single static GTFS feed:
{
"feeds": [
{
"spec": "gtfs", // enum: ["gtfs", "gtfs-rt", "gbfs", "mds"]
"id": "XXXX", // IDs are internally unique, but not necessarily globally unique
"urls": { // "Transitland style URL" to support nested zip archives
"static_current": "",
"static_historic": [""],
"static_planned": [""]
},
"languages": ["en-US"], // IETF language tags, see https://tools.ietf.org/html/bcp47
"license": { // license covering the contents of the feed
"spdx_identifier": "", // see https://spdx.org/licenses/
"url": "",
"use_without_attribution": "yes", // enum: ["yes", "no", "unknown"]
"create_derived_product": "yes", // enum: ["yes", "no", "unknown"]
"redistribute": "yes", // enum: ["yes", "no", "unknown"]
"attribution_text": ""
}
}
],
"license_spdx_identifier": "CC0-1.0" // license covering the DMFR file itself; see https://spdx.org/licenses/
}
Single GTFS-realtime feed:
{
"feeds": [
{
"spec": "gtfs-rt", // enum: ["gtfs", "gtfs-rt", "gbfs", "mds"]
"id": "XXXX", // unique ID for this feed record; may be a Onestop ID or your own ID scheme
"urls": {
"realtime_vehicle_positions": "",
"realtime_trip_updates": "",
"realtime_alerts": ""
},
"languages": ["en-US"], // IETF language tags, see https://tools.ietf.org/html/bcp47
"license": {
"spdx_identifier": "", // see https://spdx.org/licenses/
"url": "",
"use_without_attribution": "yes", // enum: ["yes", "no", "unknown"]
"create_derived_product": "yes", // enum: ["yes", "no", "unknown"]
"redistribute": "yes", // enum: ["yes", "no", "unknown"]
"attribution_text": ""
}
},
{
"spec": "gtfs", // enum: ["gtfs", "gtfs-rt", "gbfs", "mds"]
"id": "XXXX", // unique ID for this feed record; may be a Onestop ID or your own ID scheme
// ...
}
],
"license_spdx_identifier": "CC0-1.0" // required to meet this spec
}
Group together multiple feeds using an operator:
{
"$schema": "https://dmfr.transit.land/json-schema/dmfr.schema-v0.3.0.json",
"feeds": [
{
"spec": "gtfs",
"id": "f-9q9-bart",
"urls": {
"static_current": "http://www.bart.gov/dev/schedules/google_transit.zip"
},
"license": {
"url": "http://www.bart.gov/schedules/developers/developer-license-agreement",
"use_without_attribution": "yes",
"create_derived_product": "unknown",
"redistribute": "yes"
},
"tags": {
"gtfs_data_exchange": "airbart"
}
},
{
"spec": "gtfs-rt",
"id": "f-bart~rt",
"urls": {
"realtime_alerts": "http://api.bart.gov/gtfsrt/alerts.aspx",
"realtime_trip_updates": "http://api.bart.gov/gtfsrt/tripupdate.aspx"
}
}
],
"license_spdx_identifier": "CDLA-Permissive-1.0",
"operators": [
{
"onestop_id": "o-9q9-bart",
"tags": {
"us_ntd_id": "90003",
"omd_provider_id": "bart",
"wikidata_id": "Q610120",
"twitter_general": "sfbart",
"twitter_service_alerts": "SFBARTalert"
},
"name": "Bay Area Rapid Transit",
"short_name": "BART",
"associated_feeds": [
{
"feed_onestop_id": "f-bart~rt"
},
{
"feed_onestop_id": "f-9q9-bart"
}
]
}
]
}
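As a rough illustration of how a consumer might follow the `associated_feeds` links in the example above, here is a minimal Python sketch. The function name `feeds_for_operator` is hypothetical and not part of DMFR; the sketch assumes the file has already been parsed into a dictionary and that references to feeds defined in other files should simply be skipped.

```python
from typing import Any


def feeds_for_operator(dmfr: dict[str, Any], operator_onestop_id: str) -> list[dict[str, Any]]:
    """Return the feed records that an operator lists under associated_feeds."""
    # Index the feeds in this file by their ID for quick lookup.
    feeds_by_id = {feed["id"]: feed for feed in dmfr.get("feeds", [])}
    for operator in dmfr.get("operators", []):
        if operator.get("onestop_id") != operator_onestop_id:
            continue
        wanted = [ref["feed_onestop_id"] for ref in operator.get("associated_feeds", [])]
        # Assumption: references to feeds not defined in this file are ignored.
        return [feeds_by_id[feed_id] for feed_id in wanted if feed_id in feeds_by_id]
    return []


# A trimmed-down version of the BART example above.
example = {
    "feeds": [
        {"spec": "gtfs", "id": "f-9q9-bart"},
        {"spec": "gtfs-rt", "id": "f-bart~rt"},
    ],
    "operators": [
        {
            "onestop_id": "o-9q9-bart",
            "associated_feeds": [
                {"feed_onestop_id": "f-bart~rt"},
                {"feed_onestop_id": "f-9q9-bart"},
            ],
        }
    ],
}

bart_feeds = feeds_for_operator(example, "o-9q9-bart")
print([feed["spec"] for feed in bart_feeds])  # prints ['gtfs-rt', 'gtfs']
```

The result preserves the order in which the operator lists its feeds, which may or may not matter to a given consumer.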
Feed IDs can be any strings that are unique within a given DMFR file. These feed IDs can be Onestop IDs, although that is not required by the DMFR spec. In the Transitland Atlas repository, DMFR files are required to use Onestop IDs.
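The uniqueness rule for feed IDs is easy to check mechanically. The following is a minimal sketch, assuming the DMFR file has been parsed into a dictionary; `duplicate_feed_ids` is a hypothetical helper name, not something defined by the spec.

```python
from collections import Counter


def duplicate_feed_ids(dmfr: dict) -> list[str]:
    """Return feed IDs that appear more than once in a single DMFR file."""
    counts = Counter(feed["id"] for feed in dmfr.get("feeds", []))
    return sorted(feed_id for feed_id, count in counts.items() if count > 1)


dmfr_with_clash = {
    "feeds": [
        {"spec": "gtfs", "id": "f-9q9-bart"},
        {"spec": "gtfs-rt", "id": "f-bart~rt"},
        {"spec": "gtfs", "id": "f-9q9-bart"},  # repeated on purpose
    ]
}
print(duplicate_feed_ids(dmfr_with_clash))  # prints ['f-9q9-bart']
```

An empty result means the file satisfies the rule; the check says nothing about uniqueness across files, which the spec does not require.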
For static feeds contained in a zip archive, ideally the feed files are all in the root directory of the archive. However, this is not always the case.
Transitland Feed Registry supports an extended URL format that can reference files nested within a subdirectory. The extended URL format can also reference a zip file nested within another zip file. For example, the following URL references google_bus.zip inside gtfs_public.zip:
https://github.com/septadev/GTFS/releases/download/v201810010/gtfs_public.zip#google_bus.zip
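Judging from the example URL, the nested path appears to be carried in the URL fragment (the part after `#`). Under that assumption, a consumer could separate the archive to download from the path inside it with a sketch like the one below. `split_extended_url` is a hypothetical helper name, and the exact parsing rules used by Transitland may differ.

```python
from urllib.parse import urldefrag


def split_extended_url(url: str) -> tuple[str, str]:
    """Split an extended URL into (archive URL, path inside the archive).

    Assumption: the nested path is the URL fragment. The inner path is an
    empty string when the feed files sit at the root of the archive.
    """
    archive_url, inner_path = urldefrag(url)
    return archive_url, inner_path


archive, inner = split_extended_url(
    "https://github.com/septadev/GTFS/releases/download/v201810010/gtfs_public.zip#google_bus.zip"
)
print(archive)  # prints https://github.com/septadev/GTFS/releases/download/v201810010/gtfs_public.zip
print(inner)    # prints google_bus.zip
```

The caller would then download `archive` and open `inner` within it, whether that is a subdirectory or a nested zip file.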
The license block is based on Transitland's approach to handling open data licenses in all their variety.
"license": {
"spdx_identifier": "", // see https://spdx.org/licenses/
"url": "",
"use_without_attribution": "yes", // enum: ["yes", "no", "unknown"]
"create_derived_product": "yes", // enum: ["yes", "no", "unknown"]
"redistribute": "yes", // enum: ["yes", "no", "unknown"]
"attribution_text": ""
}
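The three permission fields share the same `yes`/`no`/`unknown` enum, so a simple check can catch stray values. This is a minimal sketch with a hypothetical helper name, `license_problems`; it assumes every field in the license block may be omitted, as in the BART example, which is an assumption rather than something stated here.

```python
ALLOWED_VALUES = {"yes", "no", "unknown"}
PERMISSION_FIELDS = (
    "use_without_attribution",
    "create_derived_product",
    "redistribute",
)


def license_problems(license_block: dict) -> list[str]:
    """Describe permission fields whose value is outside the yes/no/unknown enum."""
    problems = []
    for field in PERMISSION_FIELDS:
        value = license_block.get(field)
        # Assumption: an omitted field is acceptable and is not reported.
        if value is not None and value not in ALLOWED_VALUES:
            problems.append(f"{field}: unexpected value {value!r}")
    return problems


print(license_problems({"redistribute": "yes", "create_derived_product": "unknown"}))  # prints []
print(license_problems({"redistribute": "maybe"}))  # prints ["redistribute: unexpected value 'maybe'"]
```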
Requiring authentication for public data feeds is typically not a good idea. However, it's reasonable to require an API key for GTFS-realtime endpoints and other feeds that involve active queries.
"authorization": {
"type": "", // enum: ["header", "basic_auth", "query_param"]
"param_name": "",
"info_url": ""
}
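How a consumer applies the `authorization` block is not spelled out here, so the following is only one plausible reading, sketched with Python's standard library. The helper name `build_request`, the convention that a `basic_auth` secret is a `username:password` string, and the example.com URL are all assumptions made for illustration.

```python
import base64
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import Request


def build_request(url: str, authorization: dict, secret: str) -> Request:
    """Build a request that carries the secret as the authorization block describes."""
    auth_type = authorization.get("type")
    if auth_type == "header":
        # Send the secret in the header named by param_name.
        return Request(url, headers={authorization["param_name"]: secret})
    if auth_type == "query_param":
        # Append the secret as a query-string parameter named by param_name.
        parts = urlsplit(url)
        query = parse_qsl(parts.query, keep_blank_values=True)
        query.append((authorization["param_name"], secret))
        return Request(urlunsplit(parts._replace(query=urlencode(query))))
    if auth_type == "basic_auth":
        # Assumption: the secret is "username:password"; param_name is unused.
        token = base64.b64encode(secret.encode("utf-8")).decode("ascii")
        return Request(url, headers={"Authorization": f"Basic {token}"})
    raise ValueError(f"unsupported authorization type: {auth_type!r}")


# Hypothetical endpoint and parameter name, for illustration only.
request = build_request(
    "https://example.com/gtfs-rt/alerts",
    {"type": "query_param", "param_name": "api_key"},
    "my-secret",
)
print(request.full_url)  # prints https://example.com/gtfs-rt/alerts?api_key=my-secret
```

The request object is only constructed, not sent, so the sketch can be tried without network access.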
Tags allow extra information to be added to feeds and operators. Keys and values must both be strings.
"operators": [
{
"onestop_id": "o-9q9-bart",
"tags": {
"us_ntd_id": "90003",
"omd_provider_id": "bart",
"wikidata_id": "Q610120",
"twitter_general": "sfbart",
"twitter_service_alerts": "SFBARTalert"
}
}
]
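Because JSON happily accepts numbers and booleans as values, the "strings only" rule for tags is worth checking after parsing. A minimal sketch, using the hypothetical helper name `non_string_tags`:

```python
def non_string_tags(tags: dict) -> list[str]:
    """Return the tag keys whose key or value is not a string."""
    return [
        str(key)
        for key, value in tags.items()
        if not (isinstance(key, str) and isinstance(value, str))
    ]


print(non_string_tags({"us_ntd_id": "90003", "wikidata_id": "Q610120"}))  # prints []
print(non_string_tags({"us_ntd_id": 90003}))  # prints ['us_ntd_id']
```

Note that an identifier such as `us_ntd_id` must be quoted in the DMFR file even though it looks numeric.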