3.2. Writing data: GTFS
Available as a jupyter notebook or wiki page.
GeNet can generate CSV or text files in a GTFS-like format from a Schedule object. It produces the following files:
calendar.txt
stop_times.txt
stops.txt
trips.txt
routes.txt
(or .csv depending on the method used)
When reading a GTFS feed, GeNet expects a date in YYYYMMDD format. It does not retain this information, so you are required to pass the GTFS day again when exporting to GTFS. A Schedule object does not contain many of the optional GTFS data points (a lot more is retained if the Schedule was created through GeNet). In particular, if you use MATSim files as input, your Schedule will mostly contain only the required fields. Other important remarks:
- The list of network link IDs associated with the transit routes (the path) and the relation of stops to network link IDs are not exported.
- agency.txt is not generated.
- service_id is generated from Service objects. If the Schedule object is generated from GTFS through GeNet, those service IDs are based on the route_id field in GTFS. This means the two fields in the output GTFS are the same. The transit route split of services (based on the ordered chain of stops) is lost when exporting to GTFS.
- You can pass your own mode_to_route_type dictionary that maps modes in the Schedule to the route_type codes you want. Otherwise it defaults to
{
"tram": 0, "subway": 1, "rail": 2, "bus": 3, "ferry": 4, "cablecar": 5, "gondola": 6, "funicular": 7
}
based on the standard GTFS route_type codes. Caveat: if you have read the Schedule from GTFS, the route_type codes are retained and will not be changed.
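As an illustration of the last remark, the sketch below builds a custom mapping by starting from the default dictionary quoted above and overriding selected modes. The helper `build_mode_to_route_type` is hypothetical and not part of GeNet, and the override values (extended route types 700 for bus and 100 for rail) are only example choices.

```python
# Default mapping quoted in the text above.
DEFAULT_MODE_TO_ROUTE_TYPE = {
    "tram": 0, "subway": 1, "rail": 2, "bus": 3,
    "ferry": 4, "cablecar": 5, "gondola": 6, "funicular": 7,
}


def build_mode_to_route_type(overrides=None):
    """Hypothetical helper (not a GeNet function): copy the defaults
    and apply any user-supplied overrides on top."""
    mapping = dict(DEFAULT_MODE_TO_ROUTE_TYPE)
    mapping.update(overrides or {})
    return mapping


# Example: use extended route types for bus and rail, keep the rest.
custom_mapping = build_mode_to_route_type({"bus": 700, "rail": 100})
```

A complete dictionary such as `custom_mapping` could then be supplied as the mode_to_route_type dictionary when exporting. How the argument is passed (keyword name and position) should be checked against the GeNet version in use.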
Here is the reference page for the schema of GTFS data.
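Because the GTFS day has to be supplied in YYYYMMDD format both when reading and when exporting, it can be worth checking the string before handing it to GeNet. The following is a minimal sketch using only the standard library. `is_valid_gtfs_day` is a hypothetical helper, not a GeNet function, and GeNet's own validation (if any) may behave differently.

```python
from datetime import datetime


def is_valid_gtfs_day(gtfs_day):
    """Return True if gtfs_day is a real calendar date written as YYYYMMDD."""
    # Require an 8-character string of digits, e.g. "20190603".
    if not (isinstance(gtfs_day, str) and len(gtfs_day) == 8 and gtfs_day.isdigit()):
        return False
    try:
        # strptime rejects impossible dates such as 30 February.
        datetime.strptime(gtfs_day, "%Y%m%d")
    except ValueError:
        return False
    return True
```

For example, `is_valid_gtfs_day("20190603")` accepts the day used in this section, while a hyphenated string such as `"2019-06-03"` is rejected.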
from genet import read_gtfs
s = read_gtfs("example_data/example_gtfs", "20190603")
s
2022-07-14 15:32:40,982 - Reading GTFS from example_data/example_gtfs
2022-07-14 15:32:40,983 - Reading the calendar for GTFS
2022-07-14 15:32:40,988 - Reading GTFS data into usable format
2022-07-14 15:32:40,989 - Reading stop times
2022-07-14 15:32:40,997 - Reading trips
2022-07-14 15:32:41,003 - Reading stops
2022-07-14 15:32:41,007 - Reading routes
<Schedule instance at 140268498459024: with 2 services>
You can export the Schedule straight to files:
s.write_to_csv("example_data/outputs/gtfs", "20190603")
2022-07-14 15:32:41,111 - Saving Schedule to GTFS csv in example_data/output_gtfs
2022-07-14 15:32:41,200 - Saving example_data/output_gtfs/stops.csv
2022-07-14 15:32:41,212 - Saving example_data/output_gtfs/routes.csv
2022-07-14 15:32:41,218 - Saving example_data/output_gtfs/trips.csv
2022-07-14 15:32:41,221 - Saving example_data/output_gtfs/stop_times.csv
2022-07-14 15:32:41,227 - Saving example_data/output_gtfs/calendar.csv
s.write_to_gtfs("example_data/outputs/gtfs", "20190603")
2022-07-14 15:32:41,242 - Saving Schedule to GTFS txt in example_data/output_gtfs
2022-07-14 15:32:41,303 - Saving example_data/output_gtfs/stops.txt
2022-07-14 15:32:41,305 - Saving example_data/output_gtfs/routes.txt
2022-07-14 15:32:41,308 - Saving example_data/output_gtfs/trips.txt
2022-07-14 15:32:41,311 - Saving example_data/output_gtfs/stop_times.txt
2022-07-14 15:32:41,313 - Saving example_data/output_gtfs/calendar.txt
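After an export, one way to confirm that all five tables listed at the top of this section were written is to check the output directory for them. The sketch below uses only the standard library. `missing_gtfs_files` is a hypothetical helper, not part of GeNet, and the temporary directory only stands in for a real output folder.

```python
from pathlib import Path
import tempfile

# The five tables this section says GeNet generates.
GTFS_TABLES = ("calendar", "stop_times", "stops", "trips", "routes")


def missing_gtfs_files(output_dir, extension="txt"):
    """Hypothetical helper: list the expected GTFS files that are
    absent from output_dir (use extension="csv" for CSV exports)."""
    output_dir = Path(output_dir)
    expected = [f"{table}.{extension}" for table in GTFS_TABLES]
    return [name for name in expected if not (output_dir / name).is_file()]


# Demonstration on a temporary folder holding only two of the files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("stops.txt", "routes.txt"):
        (Path(tmp) / name).write_text("")
    demo_missing = missing_gtfs_files(tmp)
```

An empty list from `missing_gtfs_files` would indicate that every expected table is present in the folder.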
Alternatively, generate pandas.DataFrame GTFS tables:
gtfs = s.to_gtfs("20190603")
gtfs.keys()
dict_keys(['stops', 'routes', 'trips', 'stop_times', 'calendar'])
gtfs["stops"].head()
| | stop_id | stop_name | stop_lat | stop_lon | stop_code | stop_desc | zone_id | stop_url | location_type | parent_station | stop_timezone | wheelchair_boarding | level_id | platform_code |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BSE | BSE | Bus Stop snap to edge | 51.522686 | -0.141362 | NaN | NaN | NaN | NaN | 0.0 | 210G433 | NaN | NaN | NaN | NaN |
| RSE | RSE | Rail Stop snap to edge | 51.519261 | -0.142159 | NaN | NaN | NaN | NaN | 0.0 | 210G431 | NaN | NaN | NaN | NaN |
| RSN | RSN | Rail Stop snap to node | 51.523134 | -0.141095 | NaN | NaN | NaN | NaN | 0.0 | 210G430 | NaN | NaN | NaN | NaN |
| BSN | BSN | Bus Stop snap to node | 51.521620 | -0.140053 | NaN | NaN | NaN | NaN | 0.0 | 210G432 | NaN | NaN | NaN | NaN |