4.1. Using the `Network` object: Accessing data¶

Available as a jupyter notebook or wiki page.

Let's read a sample MATSim network into GeNet's Network object.

In [1]:

import os

from genet import read_matsim

path_to_matsim_network = "example_data/pt2matsim_network"

network = os.path.join(path_to_matsim_network, "network.xml")
schedule = os.path.join(path_to_matsim_network, "schedule.xml")
vehicles = os.path.join(path_to_matsim_network, "vehicles.xml")
n = read_matsim(
    path_to_network=network, epsg="epsg:27700", path_to_schedule=schedule, path_to_vehicles=vehicles
)
# You don't need to read the vehicles file, but doing so ensures all vehicles
# in the schedule are of the expected type and the definition of the vehicle
# is preserved.

n.print()

Graph info: Name: 
Type: MultiDiGraph
Number of nodes: 1662
Number of edges: 3166
Average in degree:   1.9049
Average out degree:   1.9049 
Schedule info: Schedule:
Number of services: 9
Number of routes: 68
Number of stops: 118

Summary¶

The network summary report can be accessed using the summary_report method:

In [2]:

n.summary_report()

2022-08-25 14:51:20,883 - Creating a summary report

Out[2]:

{'network': {'network_graph_info': {'Number of network links': 1662,
   'Number of network nodes': 3166},
  'modes': {'Modes on network links': {'artificial', 'bus', 'car', 'pt'},
   'Number of links by mode': {'artificial': 3,
    'car': 3161,
    'pt': 153,
    'bus': 182}},
  'osm_highway_tags': {'Number of links by tag': {'living_street': 7,
    'tertiary_link': 2,
    'trunk': 213,
    'unclassified': 1027,
    'tertiary': 326,
    'residential': 758,
    'secondary_link': 2,
    'primary_link': 5,
    'service': 2,
    'primary': 619,
    'secondary': 185,
    'trunk_link': 17}}},
 'schedule': {'schedule_info': {'Number of services': 9,
   'Number of routes': 68,
   'Number of stops': 118},
  'modes': {'Modes in schedule': {'bus'},
   'Services by mode': {'bus': 9},
   'PT stops by mode': {'bus': 45}},
  'accessibility_tags': {'Stops with tag bikeAccessible': 0,
   'Unique values for bikeAccessible tag': set(),
   'Stops with tag carAccessible': 0,
   'Unique values for carAccessible tag': set()}}}

The data saved on the edges or nodes of the graph can be nested. There are a couple of convenient methods that summarise the schema of the data found on the nodes and links. If data=True, the output also shows up to 5 unique values stored in that location.

In [3]:

n.node_attribute_summary(data=True)

attribute
├── id: ['3085005043', '200047', '852019112', '107824', '14790693']
├── x: [528387.4250512555, 528391.4406755936, 528393.2742107178, 528396.6287644263, 528396.3513181042]
├── y: [181547.5850354673, 181552.72935927223, 181558.10532352765, 181559.970402835, 181562.0370527053]
├── lon: [-0.15178558709839862, -0.135349787087776, -0.122919287085967, -0.13766218709633904, -0.14629008709559344]
├── lat: [51.52643403323907, 51.51609983324067, 51.51595583324104, 51.5182034332405, 51.52410423323943]
└── s2_id: [5221390710015643649, 5221390314367946753, 5221366508477440003, 5221390682291777543, 5221390739236081673]

In [4]:

n.link_attribute_summary(data=False)

attribute
├── id
├── from
├── to
├── freespeed
├── capacity
├── permlanes
├── oneway
├── modes
├── s2_from
├── s2_to
├── attributes
│   ├── osm:way:access
│   ├── osm:way:highway
│   ├── osm:way:id
│   ├── osm:way:name
│   ├── osm:relation:route
│   ├── osm:way:lanes
│   ├── osm:way:oneway
│   ├── osm:way:tunnel
│   ├── osm:way:psv
│   ├── osm:way:vehicle
│   ├── osm:way:traffic_calming
│   ├── osm:way:junction
│   └── osm:way:service
└── length
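
To make the idea of a schema summary concrete, the sketch below walks a few nested dictionaries and records each key path together with up to five unique values. It is a plain-Python illustration and not GeNet's implementation; `summarise_schema`, `_walk` and the sample `links` are hypothetical names and data.

```python
def summarise_schema(records, max_values=5):
    """Collect nested key paths and up to `max_values` unique values per path."""
    schema = {}
    for record in records:
        _walk(record, (), schema, max_values)
    return schema


def _walk(data, path, schema, max_values):
    for key, value in data.items():
        key_path = path + (key,)
        seen = schema.setdefault(key_path, [])
        if isinstance(value, dict):
            # Nested data: record the branch and descend into it.
            _walk(value, key_path, schema, max_values)
        elif value not in seen and len(seen) < max_values:
            seen.append(value)


# Hypothetical link data, shaped like the dictionaries stored on links.
links = [
    {"id": "1", "freespeed": 4.17, "attributes": {"osm:way:highway": "primary"}},
    {
        "id": "2",
        "freespeed": 4.17,
        "attributes": {"osm:way:highway": "residential", "osm:way:lanes": "2"},
    },
]

schema = summarise_schema(links)
for key_path, values in schema.items():
    indent = "    " * (len(key_path) - 1)
    print(f"{indent}{key_path[-1]}: {values}")
```

Branches such as `attributes` end up with an empty value list, mirroring the way the tree above shows nested keys without values of their own.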

Once you have seen the general schema of the data stored on nodes and links, you may want to inspect or analyse all of the data stored in the network under a particular key. A GeNet network has two methods which generate a pandas.Series object holding the node or link data present at the specified key, indexed by the same index as the nodes or links.

In [5]:

s2_id = n.node_attribute_data_under_key("s2_id")

In [6]:

s2_id

Out[6]:

101982       5221390329378179879
101986       5221390328605860387
101990       5221390304444511271
101991       5221390303978897267
101992       5221390304897644929
                    ...         
983839058    5221390693831817171
99936        5221390297975475113
99937        5221390299484831045
99940        5221390294354743413
99943        5221390298004852605
Length: 1662, dtype: int64

In [7]:

n.link_attribute_data_under_key("freespeed").head()

Out[7]:

1       4.166667
10      4.166667
100     4.166667
1000    4.166667
1001    4.166667
dtype: float64

Or you can access nested data:

In [8]:

n.link_attribute_data_under_key({"attributes": "osm:way:lanes"}).head()

Out[8]:

1007    2
1008    2
1037    2
1038    2
1039    2
dtype: object
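
Note that the index above starts at link 1007 rather than link 1, which suggests that links without the nested key are left out of the result. The sketch below shows one way such a nested lookup could work on plain dictionaries. It is an illustration under that assumption, not GeNet's code; `get_nested`, `data_under_key` and the sample `links` are hypothetical.

```python
def get_nested(data, key):
    """Return the value at `key`, or None if it is absent.

    A dict key such as {"attributes": "osm:way:lanes"} describes a nested path.
    """
    if isinstance(key, dict):
        outer, inner = next(iter(key.items()))
        child = data.get(outer)
        if not isinstance(child, dict):
            return None
        return get_nested(child, inner)
    return data.get(key)


def data_under_key(records, key):
    """Map record ID to the value found at `key`, skipping records without it."""
    result = {}
    for record_id, data in records.items():
        value = get_nested(data, key)
        if value is not None:
            result[record_id] = value
    return result


# Hypothetical link data: only link "1007" carries the nested lanes tag.
links = {
    "1": {"freespeed": 4.166667},
    "1007": {"freespeed": 22.22, "attributes": {"osm:way:lanes": "2"}},
}

freespeed = data_under_key(links, "freespeed")
lanes = data_under_key(links, {"attributes": "osm:way:lanes"})
print(freespeed)
print(lanes)
```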

You can also build a pandas.DataFrame out of several keys.

In [9]:

n.link_attribute_data_under_keys(["freespeed", {"attributes": "osm:way:highway"}]).head()

Out[9]:

	freespeed	attributes::osm:way:highway
1	4.166667	unclassified
10	4.166667	unclassified
100	4.166667	unclassified
1000	4.166667	residential
1001	4.166667	residential

Extracting links of interest¶

The method below gathers the IDs of links which satisfy conditions at an arbitrary level of nesting. It also allows quite flexible conditions. Below we require that the link value at data['attributes']['osm:way:highway'] == 'primary', where data is the data dictionary stored on that link.

In [11]:

links = n.extract_links_on_edge_attributes(
    conditions={"attributes": {"osm:way:highway": "primary"}}
)

In [12]:

links[:5]

Out[12]:

['1007', '1008', '1023', '1024', '103']

In [13]:

len(links)

Out[13]:

Note that it is possible to set data in long format, specifying the Java class of the data stored, e.g.

{'id': '1007',
 'from': '4356572310',
 'to': '5811263955',
 'attributes': {'osm:way:highway': {'name': 'osm:way:highway',
   'class': 'java.lang.String',
   'text': 'primary'},
  'osm:way:id': {'name': 'osm:way:id',
   'class': 'java.lang.Long',
   'text': '589660342'},
  'osm:way:lanes': {'name': 'osm:way:lanes',
   'class': 'java.lang.String',
   'text': '2'},
  'osm:way:name': {'name': 'osm:way:name',
   'class': 'java.lang.String',
   'text': 'Shaftesbury Avenue'},
  'osm:way:oneway': {'name': 'osm:way:oneway',
   'class': 'java.lang.String',
   'text': 'yes'}},
 'length': 13.941905154249884}

This is useful if you want to force the data to be saved to the MATSim XML file with that specific data type.

In that case, to find primary highway links, you would instead set the following condition:

links = n.extract_links_on_edge_attributes(
    conditions= {'attributes': {'osm:way:highway': {'text': 'primary'}}},
)
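
The difference between the two condition shapes can be pictured with a small recursive matcher over nested dictionaries. This is a simplified sketch, not GeNet's implementation; `matches` and the sample links are hypothetical, and only exact-value conditions are handled here.

```python
def matches(data, condition):
    """Recursively compare a nested condition dict against a nested data dict."""
    for key, expected in condition.items():
        if key not in data:
            return False
        actual = data[key]
        if isinstance(expected, dict):
            # The condition goes one level deeper, so the data must too.
            if not isinstance(actual, dict) or not matches(actual, expected):
                return False
        elif actual != expected:
            return False
    return True


# The same tag stored in short and in long format (hypothetical samples).
short_link = {"attributes": {"osm:way:highway": "primary"}}
long_link = {
    "attributes": {
        "osm:way:highway": {
            "name": "osm:way:highway",
            "class": "java.lang.String",
            "text": "primary",
        }
    }
}

short_condition = {"attributes": {"osm:way:highway": "primary"}}
long_condition = {"attributes": {"osm:way:highway": {"text": "primary"}}}

print(matches(short_link, short_condition))  # True
print(matches(long_link, short_condition))   # False: the value is a dict, not 'primary'
print(matches(long_link, long_condition))    # True
```

The short-format condition fails on long-format data because the value stored at `osm:way:highway` is itself a dictionary, which is why the condition has to reach down to `text`.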

Below we require that the link value at data['attributes']['osm:way:highway'] is in ['primary', 'something else']. No link in the data has the tag 'something else', so the output is the same as before.

In [14]:

links = n.extract_links_on_edge_attributes(
    conditions={"attributes": {"osm:way:highway": ["primary", "something else"]}}
)

In [15]:

links[:5]

Out[15]:

['1007', '1008', '1023', '1024', '103']

In [16]:

len(links)

Out[16]:

We can also pass a list of conditions. In that case we need to specify how multiple conditions are combined, using the how argument:

how=all - all conditions need to be met
how=any - at least one condition needs to be met

The default is any. Note that any and all are passed unquoted in the examples, i.e. as Python's built-in functions rather than as strings.

In [17]:

links = n.extract_links_on_edge_attributes(
    conditions=[
        {"attributes": {"osm:way:highway": "primary"}},
        {"attributes": {"osm:way:highway": "something else"}},
    ],
    how=any,
)

In [18]:

links[:5]

Out[18]:

['1007', '1008', '1023', '1024', '103']

In [19]:

len(links)

Out[19]:

In [20]:

links = n.extract_links_on_edge_attributes(
    conditions=[
        {"attributes": {"osm:way:highway": "primary"}},
        {"attributes": {"osm:way:highway": "something else"}},
    ],
    how=all,
)

In [21]:

links[:5]

Out[21]:

[]

As expected, no links satisfy both data['attributes']['osm:way:highway'] == 'primary' and data['attributes']['osm:way:highway'] == 'something else'.
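
The effect of how can be reproduced with Python's built-in any and all applied to the per-condition results. The sketch below uses flat, hypothetical link data and helper names (`satisfies`, `extract_ids`); it illustrates the combination logic only and is not GeNet's implementation.

```python
def satisfies(data, condition):
    """True if every key in a single flat condition matches the data."""
    return all(data.get(key) == value for key, value in condition.items())


def extract_ids(links, conditions, how=any):
    """Return IDs of links whose per-condition results pass `how` (any or all)."""
    return [
        link_id
        for link_id, data in links.items()
        if how(satisfies(data, condition) for condition in conditions)
    ]


# Hypothetical, flattened link data.
links = {
    "1007": {"highway": "primary"},
    "1": {"highway": "residential"},
}
conditions = [{"highway": "primary"}, {"highway": "something else"}]

print(extract_ids(links, conditions, how=any))  # ['1007']
print(extract_ids(links, conditions, how=all))  # []
```

A single tag cannot equal two different values at once, so combining these two conditions with all can never select a link.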

Below, we give an example of subsetting on a numeric range. We find links where 0 <= 'freespeed' <= 20.

In [22]:

links = n.extract_links_on_edge_attributes(conditions={"freespeed": (0, 20)})

In [23]:

links[:5]

Out[23]:

['1', '10', '100', '1000', '1001']

In [24]:

len(links)

Out[24]:

Finally, we can define a function that will handle the condition for us. The function should take the value expected at the key in the data dictionary and return either True or False.

For example, the cell below is equivalent to our first example, data['attributes']['osm:way:highway'] == 'primary', but uses a function we defined ourselves to handle the condition.

In [25]:

def highway_primary(value):
    return value == "primary"

links = n.extract_links_on_edge_attributes(
    conditions={"attributes": {"osm:way:highway": highway_primary}}
)

In [26]:

links[:5]

Out[26]:

['1007', '1008', '1023', '1024', '103']

In [27]:

len(links)

Out[27]:

This allows for very flexible subsetting of the network based on data stored on the edges. The next example is similar to the numeric range above, but this time we only care about the upper bound and make it a strict inequality.

In [28]:

def below_20(value):
    return value < 20

links = n.extract_links_on_edge_attributes(conditions={"freespeed": below_20})

In [29]:

links[:5]

Out[29]:

['1', '10', '100', '1000', '1001']
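
The four kinds of condition value used so far (a single value, a list of values, a two-element tuple giving an inclusive range, and a function) can be summarised in one small dispatcher. This is a sketch of the behaviour described in this section, not GeNet's code; `value_satisfies` is a hypothetical helper.

```python
def value_satisfies(value, condition):
    """Check one stored value against one condition value."""
    if callable(condition):
        # User-defined function: it decides, returning True or False.
        return bool(condition(value))
    if isinstance(condition, tuple) and len(condition) == 2:
        # Numeric range with both bounds included.
        lower, upper = condition
        return lower <= value <= upper
    if isinstance(condition, list):
        # The value must be one of the listed options.
        return value in condition
    # Plain value: exact match.
    return value == condition


def below_20(value):
    return value < 20


print(value_satisfies("primary", "primary"))                      # True
print(value_satisfies("primary", ["primary", "something else"]))  # True
print(value_satisfies(20, (0, 20)))                               # True: bounds are inclusive
print(value_satisfies(20, below_20))                              # False: strict inequality
```

The last two lines show the practical difference between the (0, 20) range and the below_20 function: a freespeed of exactly 20 passes the first and fails the second.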

In [30]:

len(links)

Out[30]:

In [31]:

n.links_on_modal_condition("bus")[:5]

Out[31]:

['1021', '1023', '1024', '1079', '1105']

nodes_on_modal_condition will return nodes connected to the links satisfying the modal condition.

In [32]:

n.nodes_on_modal_condition(["car", "bus"])[:5]

Out[32]:

['852019112', '107824', '14790693', '21651810', '1166234800']
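
As a rough mental model, a modal condition selects links whose modes overlap with the requested modes, and the node variant collects the end nodes of those links. The sketch below assumes that a list of modes means "any of these modes", which is an assumption rather than something confirmed here; the function names and the sample data are hypothetical.

```python
def links_with_modes(links, modes):
    """IDs of links that allow at least one of the requested modes."""
    wanted = {modes} if isinstance(modes, str) else set(modes)
    return [link_id for link_id, data in links.items() if wanted & set(data["modes"])]


def nodes_with_modes(links, modes):
    """Nodes at either end of the links selected by `links_with_modes`."""
    nodes = []
    for link_id in links_with_modes(links, modes):
        for node in (links[link_id]["from"], links[link_id]["to"]):
            if node not in nodes:
                nodes.append(node)
    return nodes


# Hypothetical three-link network.
links = {
    "1": {"from": "A", "to": "B", "modes": {"car"}},
    "2": {"from": "B", "to": "C", "modes": {"car", "bus"}},
    "3": {"from": "C", "to": "D", "modes": {"pt"}},
}

print(links_with_modes(links, "bus"))           # ['2']
print(nodes_with_modes(links, ["car", "bus"]))  # ['A', 'B', 'C']
```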

Spatial convenience methods¶

For spatial extraction conditions you have a choice of:

shapely.geometry.Polygon or shapely.geometry.GeometryCollection of Polygons (in epsg:4326)
geojson file, can be generated with http://geojson.io/
S2 Geometry hex string which can be generated and copied from http://s2.sidewalklabs.com/regioncoverer

In [33]:

_ = n.to_geodataframe()
gdf_nodes, gdf_links = _["nodes"], _["links"]

In [34]:

region = "48761ad71,48761ad723,48761ad724c,48761ad73c,48761ad744,48761ad75d3,48761ad75d5,48761ad765,48761ad767,48761ad76c,48761ad774,48761ad779,48761ad77b,48761ad783,48761ad784c,48761ad7854,48761ad794,48761ad79c,48761ad7a4,48761ad7ac,48761ad7b1,48761ad7bc"
_nodes = n.nodes_on_spatial_condition(region)[:5]
len(_nodes)

Out[34]:

In [35]:

gdf_nodes.plot(), gdf_nodes[gdf_nodes["id"].isin(_nodes)].plot()

Out[35]:

(<matplotlib.axes._subplots.AxesSubplot at 0x7fd8786c1050>,
 <matplotlib.axes._subplots.AxesSubplot at 0x7fd87879d550>)


In [36]:

geojson = "example_data/Fitzrovia_polygon.geojson"

# here we select the links that intersect the area
_links = n.links_on_spatial_condition(geojson, how="intersect")
len(_links)

Out[36]:

In [37]:

gdf_links.plot(), gdf_links[gdf_links["id"].isin(_links)].plot()

Out[37]:

(<matplotlib.axes._subplots.AxesSubplot at 0x7fd8781b8490>,
 <matplotlib.axes._subplots.AxesSubplot at 0x7fd8790d0cd0>)

In [38]:

from shapely.geometry import Polygon

region = Polygon(
    [
        (-0.1487016677856445, 51.52556684350165),
        (-0.14063358306884766, 51.5255134425896),
        (-0.13865947723388672, 51.5228700191647),
        (-0.14093399047851562, 51.52006622056997),
        (-0.1492595672607422, 51.51974577545329),
        (-0.1508045196533203, 51.52276321095246),
        (-0.1487016677856445, 51.52556684350165),
    ]
)

_links = n.links_on_spatial_condition(region, how="within")
len(_links)

Out[38]:

In [39]:

gdf_links.plot(), gdf_links[gdf_links["id"].isin(_links)].plot()

Out[39]:

(<matplotlib.axes._subplots.AxesSubplot at 0x7fd878005510>,
 <matplotlib.axes._subplots.AxesSubplot at 0x7fd8780d5650>)

Using the `Schedule` object¶

A Schedule is a representation of public transit and is part of any genet.Network; it is initiated as empty. A Network can exist and still be valid with an empty Schedule. Earlier we read a MATSim transit schedule.

A Schedule is composed of a number of nested objects. Each Schedule has a number of Services. Each Service is made up of a number of Routes. A Route is defined by an ordered list of Stop objects. Every Service should, logically, have at least two Routes, one going in one direction and another going back. Each Route also holds information about its trips, their timing, and the arrival and departure offsets at the Stops.
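
The nesting described above can be pictured with a few dataclasses. These are hypothetical stand-ins written for illustration only; GeNet's own Schedule, Service, Route and Stop classes carry more data and behaviour, and the field names here are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Stop:
    id: str
    x: float
    y: float


@dataclass
class Route:
    id: str
    mode: str
    ordered_stops: List[Stop]
    # Trip timing: one departure per trip, plus per-stop offsets.
    trip_departure_times: List[str] = field(default_factory=list)
    arrival_offsets: List[str] = field(default_factory=list)
    departure_offsets: List[str] = field(default_factory=list)


@dataclass
class Service:
    id: str
    routes: List[Route]


@dataclass
class Schedule:
    services: List[Service] = field(default_factory=list)

    def routes(self):
        return [route for service in self.services for route in service.routes]

    def stop_ids(self):
        return {stop.id for route in self.routes() for stop in route.ordered_stops}


a, b, c = Stop("A", 0.0, 0.0), Stop("B", 1.0, 0.0), Stop("C", 2.0, 0.0)
offsets = ["00:00:00", "00:02:00", "00:04:00"]
outbound = Route("r_out", "bus", [a, b, c], ["07:00:00", "07:10:00"], offsets, offsets)
inbound = Route("r_back", "bus", [c, b, a], ["07:05:00"], offsets, offsets)
schedule = Schedule([Service("s1", [outbound, inbound])])

print(len(schedule.services), len(schedule.routes()), len(schedule.stop_ids()))
```

One service with an outbound and a return route over three shared stops gives the same kind of counts that the quick stats report for the real schedule.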

We can look at quick stats:

In [40]:

n.schedule.print()

Schedule:
Number of services: 9
Number of routes: 68
Number of stops: 118

Or we can plot the Schedule object. A Schedule on its own does not have information about the Network, even if it has references to it via network routes in the Route objects. Thus calling the plot method on a Schedule results in a plot of connections between stops for all Routes within all Services. To plot the network routes of the Schedule, we use the plot method of the Network object which holds that Schedule.

In [41]:

# n.schedule.plot()

Summary¶

Schedules can get large and complicated. GeNet includes methods similar to the ones presented for Network objects. This time, instead of inspecting data stored on the nodes and links of a graph, we summarise data held for Stops, Routes and Services in the Schedule.

In [42]:

n.schedule.stop_attribute_summary(data=False)

attribute
├── services
├── routes
├── id
├── x
├── y
├── epsg
├── name
├── lon
├── lat
├── s2_id
├── linkRefId
├── isBlocking
└── stopAreaId

In [43]:

n.schedule.route_attribute_summary(data=True)

attribute
├── route_short_name: ['N55', 'N5', '113', 'N20', '134']
├── mode: ['bus']
├── arrival_offsets: ['00:01:52', '00:02:18', '00:01:34', '00:03:48', '00:01:10']
├── departure_offsets: ['00:01:52', '00:02:18', '00:01:34', '00:03:48', '00:01:10']
├── route_long_name: ['']
├── id: ['VJea6046f64f85febf1854290fb8f76e921e3ac96b', 'VJf6055fdf9ef0dd6d0500b6c11adcfdd4d10655dc', 'VJ5b511605b1e07428c2e0a7d676d301c6c40dcca6', 'VJ85c23573d670bab5485618b0c5fddff3314efc89', 'VJ28a8a6a4ab02807a4fdfd199e5c2ca0622d34d0c']
├── trips
│   ├── trip_id: ['VJcc2e00b98a2837e18c555477c6e44ca2efe332e7_10:49:00', 'VJ0d5c884e960469ac2ced50a704e57d965da26018_17:20:56', 'VJc239057734e457e3ba45979b2d87a019b62742da_20:51:13', 'VJ5c2b1116530ef2e405c69e0bb12dfeaca4c08b24_16:54:00', 'VJe165350c77c2d832b595c5c02cf61a9291d87f88_19:13:00']
│   ├── trip_departure_time: ['01:24:00', '22:53:08', '13:59:00', '18:35:00', '16:56:56']
│   └── vehicle_id: ['veh_1757_bus', 'veh_885_bus', 'veh_1919_bus', 'veh_935_bus', 'veh_1935_bus']
├── route: ['87', '485', '1180', '2867', '3155']
├── await_departure: [True]
└── ordered_stops: ['490000252KA.link:1437', '490000235P.link:15', '490002124ZZ.link:1172', '490000091G.link:1242', '490000173RG.link:2614']

In [44]:

n.schedule.service_attribute_summary(data=True)

attribute
├── id: ['20274', '15234', '18915', '12430', '18853']
└── name: ['N55', 'N5', '113', 'N20', '134']

Again, similarly to Network objects, we can generate pandas.DataFrames for chosen attributes of Stops, Routes and Services. These dataframes are indexed by the index of the object you query, i.e. Stop ID, Route ID or Service ID. During instantiation of a Schedule object, Route and Service indices are checked and forced to be unique, reindexing them as necessary.

In [45]:

n.schedule.stop_attribute_data(keys=["lat", "lon", "name"]).head()

Out[45]:

	lat	lon	name
490000235X.link:834	51.516685	-0.128096	Tottenham Court Road Station (Stop X)
490000235YB.link:574	51.516098	-0.134044	Oxford Street Soho Street (Stop YB)
490014214HE.link:3154	51.515923	-0.135392	Wardour Street (Stop OM)
490010689KB.link:981	51.515472	-0.139893	Great Titchfield Street Oxford Circus Station...
490000235V.link:3140	51.516380	-0.131929	Tottenham Court Road Station (Stop V)

In [46]:

n.schedule.route_attribute_data(keys=["route_short_name", "mode"]).head()

Out[46]:

	route_short_name	mode
VJ375a660d47a2aa570aa20a8568012da8497ffecf	N55	bus
VJ812fad65e7fa418645b57b446f00cba573f2cdaf	N55	bus
VJ6c64ab7b477e201cae950efde5bd0cb4e2e8888e	N55	bus
VJea6046f64f85febf1854290fb8f76e921e3ac96b	94	bus
VJf6055fdf9ef0dd6d0500b6c11adcfdd4d10655dc	94	bus

In [47]:

n.schedule.service_attribute_data(keys="name", index_name="service_id").head()

Out[47]:

	name
service_id
20274	N55
12430	205
15234	134
18915	N5
18853	N8

Each trip in the schedule has a vehicle assigned to it. By default, each trip has a unique vehicle, but this can be changed by the user (have a look at the modification notebook). Each vehicle is linked to a type. Each schedule starts with vehicle types based on the config genet/configs/vehicles/vehicle_definitions.yml; the user can point to their own config file or set those values through the Schedule object.

In [48]:

n.schedule.vehicles["veh_2331_bus"]

Out[48]:

{'type': 'Bus'}

In [49]:

n.schedule.vehicle_types["Bus"]["capacity"]["standingRoom"]["persons"] = 5
n.schedule.vehicle_types["Bus"]

Out[49]:

{'capacity': {'seats': {'persons': '70'}, 'standingRoom': {'persons': 5}},
 'length': {'meter': '18.0'},
 'width': {'meter': '2.5'},
 'accessTime': {'secondsPerPerson': '0.5'},
 'egressTime': {'secondsPerPerson': '0.5'},
 'doorOperation': {'mode': 'serial'},
 'passengerCarEquivalents': {'pce': '2.8'}}

There exists a method to check that all vehicles are linked to a vehicle type which exists in the schedule.

In [50]:

n.schedule.validate_vehicle_definitions()

Out[50]:

True
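
Conceptually, this check asks whether every vehicle's type appears among the defined vehicle types. A minimal sketch of that idea on plain dictionaries, shaped like the vehicles and vehicle_types shown above, follows; `find_undefined_vehicle_types` is a hypothetical helper and not GeNet's implementation.

```python
def find_undefined_vehicle_types(vehicles, vehicle_types):
    """Map vehicle ID to its type for every vehicle whose type is not defined."""
    return {
        vehicle_id: data["type"]
        for vehicle_id, data in vehicles.items()
        if data["type"] not in vehicle_types
    }


# Hypothetical data in the same shape as the schedule's vehicle dictionaries.
vehicle_types = {"Bus": {"capacity": {"seats": {"persons": "70"}}}}
vehicles = {
    "veh_2331_bus": {"type": "Bus"},
    "veh_1_tram": {"type": "Tram"},
}

undefined = find_undefined_vehicle_types(vehicles, vehicle_types)
is_valid = not undefined
print(is_valid, undefined)
```

Returning the offending vehicles, rather than only a boolean, makes it easier to see which definitions are missing.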

trips_to_dataframe is a useful method for extracting all of the trips in the schedule, together with their departure times and associated vehicle IDs. Trip IDs need not be unique; route IDs provide a secondary index. Associated service IDs are also given for convenience. Another method, set_trips_dataframe, takes this dataframe and applies changes to all route trips based on the data in it. This means you can generate the DataFrame as shown below, manipulate trips (delete them, add new ones), change their departure times, or change their vehicle IDs so that vehicles are shared between different trips, perhaps following some temporal logic. As long as the dataframe keeps the same schema, you can use it to set new trips in the schedule. This will appear in the changelog as a route-level modify event. More information can be found in the Modifying Network notebook or wiki page.

In [51]:

n.schedule.trips_to_dataframe(gtfs_day="20210101").head()

Out[51]:

	route_id	mode	service_id	trip_id	trip_departure_time	vehicle_id
0	VJ375a660d47a2aa570aa20a8568012da8497ffecf	bus	20274	VJ2cdccea96e0e3e6a53a968bcb132941415d6d7c9_04:...	2021-01-01 04:53:00	veh_2331_bus
1	VJ375a660d47a2aa570aa20a8568012da8497ffecf	bus	20274	VJ375a660d47a2aa570aa20a8568012da8497ffecf_03:...	2021-01-01 03:53:00	veh_2332_bus
2	VJ375a660d47a2aa570aa20a8568012da8497ffecf	bus	20274	VJ3b9d77d2ef200b21c8048fea5eedc2d2788a1b94_01:...	2021-01-01 01:54:00	veh_2333_bus
3	VJ375a660d47a2aa570aa20a8568012da8497ffecf	bus	20274	VJ79974c386a39426e06783650a759828438432aa4_05:...	2021-01-01 05:23:00	veh_2334_bus
4	VJ375a660d47a2aa570aa20a8568012da8497ffecf	bus	20274	VJa09c394b71031216571d813a6266c83f2d30bf0a_04:...	2021-01-01 04:23:00	veh_2335_bus

Headways¶

You can generate a dataframe with headway information for all trips and services:

In [52]:

n.schedule.trips_headways().head()

Out[52]:

	route_id	mode	service_id	trip_id	trip_departure_time	vehicle_id	headway	headway_mins
0	VJ06420fdab0dfe5c8e7f2f9504df05cf6289cd7d3	bus	12430	VJ70cdcef7ccba9c599c70f89bdf8b10852e33bb04_11:...	1970-01-01 11:15:42	veh_409_bus	0 days 00:00:00	0.0
1	VJ06420fdab0dfe5c8e7f2f9504df05cf6289cd7d3	bus	12430	VJ126aa65811277b9774ae127ff819495441bc4e75_11:...	1970-01-01 11:24:42	veh_392_bus	0 days 00:09:00	9.0
2	VJ06420fdab0dfe5c8e7f2f9504df05cf6289cd7d3	bus	12430	VJ0d3b026c4060cd0325803e488a965a5ab91fd4c0_11:...	1970-01-01 11:32:42	veh_390_bus	0 days 00:08:00	8.0
3	VJ06420fdab0dfe5c8e7f2f9504df05cf6289cd7d3	bus	12430	VJ4155b3d5d916db07a50061ae1c15b24ecfc2f96f_11:...	1970-01-01 11:41:42	veh_401_bus	0 days 00:09:00	9.0
4	VJ06420fdab0dfe5c8e7f2f9504df05cf6289cd7d3	bus	12430	VJc9a308474ed72f769664413e686f3447613c5b3a_11:...	1970-01-01 11:49:42	veh_425_bus	0 days 00:08:00	8.0

You can also generate a dataframe with summary information about headways for each route in the schedule:

In [53]:

n.schedule.headway_stats().head()

Out[53]:

	service_id	route_id	mode	mean_headway_mins	std_headway_mins	max_headway_mins	trip_count
0	12430	VJ06420fdab0dfe5c8e7f2f9504df05cf6289cd7d3	bus	8.688889	1.378771	10.0	45.0
1	12430	VJ06cd41dcd58d947097df4a8f33234ef423210154	bus	115.333333	266.361909	659.0	6.0
2	12430	VJ0f3c08222de16c2e278be0a1bf0f9ea47370774e	bus	9.851064	8.032485	63.0	47.0
3	12430	VJ15419796737689e742962a625abcf3fd5b3d58b1	bus	22.928571	75.682049	409.0	28.0
4	12430	VJ235c8fca539cf931b3c673f9b056606384aff950	bus	24.433333	86.248512	481.0	30.0

In another notebook on modification, you can find information about generating new trips to replace the old using headway information. This is useful when creating scenario networks.
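
A headway is the time gap between consecutive departures on a route. The sketch below recomputes the headways for the five departures shown in the trips_headways output above, giving the first trip a headway of zero as that output does. It uses only the standard library; `headways_in_minutes` is a hypothetical helper, and how GeNet aggregates these values into its summary statistics is not shown here.

```python
from datetime import datetime, timedelta


def headways_in_minutes(departures):
    """Gaps between consecutive departures, in minutes; the first trip gets 0."""
    ordered = sorted(departures)
    if not ordered:
        return []
    gaps = [timedelta(0)]
    gaps += [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return [gap.total_seconds() / 60 for gap in gaps]


# Departure times copied from the trips_headways output above.
departures = [
    datetime(1970, 1, 1, 11, 15, 42),
    datetime(1970, 1, 1, 11, 24, 42),
    datetime(1970, 1, 1, 11, 32, 42),
    datetime(1970, 1, 1, 11, 41, 42),
    datetime(1970, 1, 1, 11, 49, 42),
]

headway_mins = headways_in_minutes(departures)
print(headway_mins)  # [0.0, 9.0, 8.0, 9.0, 8.0]
```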

Extracting Stops, Routes, Services of interest¶

There are times when we need to extract Service, Route or Stop IDs depending on some logic. Building conditions works exactly as it does for the links and nodes of a genet.Network, which was covered in detail above. Here we present some examples. There are separate methods for Service, Route and Stop objects that return the IDs of those objects which satisfy the conditions given by the user. Note that the attribute_summary methods presented above help in building these conditions.

In general¶

In [54]:

n.schedule.extract_service_ids_on_attributes(conditions={"name": "N55"})

Out[54]:

['20274']

In [55]:

n.schedule.extract_route_ids_on_attributes(
    conditions=[{"mode": "bus"}, {"route_short_name": "N55"}], how=all
)[:5]

Out[55]:

['VJ375a660d47a2aa570aa20a8568012da8497ffecf',
 'VJ812fad65e7fa418645b57b446f00cba573f2cdaf',
 'VJ6c64ab7b477e201cae950efde5bd0cb4e2e8888e']

In [56]:

def oxford_street_in_name(attribs):
    return "Oxford Street" in attribs


n.schedule.extract_stop_ids_on_attributes(conditions={"name": oxford_street_in_name})[:5]

Out[56]:

['490000235YB.link:574',
 '490000235P.link:15',
 '490000173W.link:1868',
 '490000235Z.link:15',
 '490000235Z']

There are several common kinds of extraction logic we might need: modal, spatial and temporal. Below we go through some convenience methods for those.

Below are convenience methods for extracting object IDs based on the modes they are related to. Note that only Route objects actually hold information about their mode of transport. When we extract Services of mode x, we pick services with at least one route of that mode. Similarly with Stops, we extract those used by routes of that mode.

In [57]:

n.schedule.services_on_modal_condition(modes="bus")[:5]

Out[57]:

['20274', '15234', '12430', '18853', '18915']

In [58]:

n.schedule.routes_on_modal_condition(modes=["bus", "rail"])[:5]

Out[58]:

['VJ375a660d47a2aa570aa20a8568012da8497ffecf',
 'VJ812fad65e7fa418645b57b446f00cba573f2cdaf',
 'VJ6c64ab7b477e201cae950efde5bd0cb4e2e8888e',
 'VJea6046f64f85febf1854290fb8f76e921e3ac96b',
 'VJf6055fdf9ef0dd6d0500b6c11adcfdd4d10655dc']

In [59]:

n.schedule.stops_on_modal_condition(modes="bus")[:5]

Out[59]:

['490000235X.link:834',
 '490000235YB.link:574',
 '490014214HE.link:3154',
 '490010689KB.link:981',
 '490000235V.link:3140']
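
Because only routes carry a mode, the service and stop variants have to go through the routes. The sketch below illustrates that derivation on plain dictionaries. All names and data are hypothetical, and treating a list of modes as "any of these modes" is an assumption.

```python
def routes_with_modes(route_modes, modes):
    """IDs of routes whose mode is one of the requested modes."""
    wanted = {modes} if isinstance(modes, str) else set(modes)
    return [route_id for route_id, mode in route_modes.items() if mode in wanted]


def services_with_modes(service_routes, route_modes, modes):
    """Services with at least one route of a requested mode."""
    selected = set(routes_with_modes(route_modes, modes))
    return [
        service_id
        for service_id, routes in service_routes.items()
        if selected.intersection(routes)
    ]


def stops_with_modes(route_stops, route_modes, modes):
    """Stops used by at least one route of a requested mode."""
    stops = []
    for route_id in routes_with_modes(route_modes, modes):
        for stop in route_stops[route_id]:
            if stop not in stops:
                stops.append(stop)
    return stops


# Hypothetical schedule: one bus service, one mixed service, one rail service.
route_modes = {"r1": "bus", "r2": "bus", "r3": "rail", "r4": "rail"}
service_routes = {"s_bus": ["r1"], "s_mixed": ["r2", "r3"], "s_rail": ["r4"]}
route_stops = {"r1": ["A", "B"], "r2": ["B", "C"], "r3": ["X"], "r4": ["Y"]}

print(services_with_modes(service_routes, route_modes, "bus"))  # ['s_bus', 's_mixed']
print(stops_with_modes(route_stops, route_modes, "bus"))        # ['A', 'B', 'C']
```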

Spatial¶

For spatial extraction conditions, similarly to the Network object, you have a choice of:

shapely.geometry.Polygon or shapely.geometry.GeometryCollection of Polygons (in epsg:4326)
geojson file, can be generated with http://geojson.io/
S2 Geometry hex string which can be generated and copied from http://s2.sidewalklabs.com/regioncoverer

Again, methods exist for Service, Route and Stop objects separately.

In [60]:

from shapely.geometry import Polygon

region = Polygon(
    [
        (-0.1487016677856445, 51.52556684350165),
        (-0.14063358306884766, 51.5255134425896),
        (-0.13865947723388672, 51.5228700191647),
        (-0.14093399047851562, 51.52006622056997),
        (-0.1492595672607422, 51.51974577545329),
        (-0.1508045196533203, 51.52276321095246),
        (-0.1487016677856445, 51.52556684350165),
    ]
)

n.schedule.services_on_spatial_condition(region)

Out[60]:

['12430']

There are two options for Service and Route objects. They can either intersect the area, meaning at least one of their Stops lies in the specified area, or be within the area.

In [61]:

geojson = "example_data/Fitzrovia_polygon.geojson"

# here the area is too small for any routes to be within it
n.schedule.routes_on_spatial_condition(geojson, how="within")

Out[61]:

[]

In [62]:

# a lot of routes intersect it however
n.schedule.routes_on_spatial_condition(geojson, how="intersect")[:5]

Out[62]:

['VJ06420fdab0dfe5c8e7f2f9504df05cf6289cd7d3',
 'VJeae6e634f8479e0b6712780d5728f0afca964e64',
 'VJ15419796737689e742962a625abcf3fd5b3d58b1',
 'VJf8e38a73359b6cf743d8e35ee64ef1f7b7914daa',
 'VJ06cd41dcd58d947097df4a8f33234ef423210154']
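
Once we know which stops fall inside an area, the two options reduce to an any versus all test over each route's stops. The sketch below assumes that "within" means every stop of the route lies in the area, which is an interpretation rather than a confirmed definition; `routes_on_area` and the sample data are hypothetical.

```python
def routes_on_area(route_stops, stops_in_area, how="intersect"):
    """Select routes by how their stops relate to the set of stops inside an area."""
    if how == "intersect":
        combine = any  # at least one stop inside the area
    elif how == "within":
        combine = all  # assumed meaning: every stop inside the area
    else:
        raise ValueError(f"Unsupported option: {how!r}")
    return [
        route_id
        for route_id, stops in route_stops.items()
        if combine(stop in stops_in_area for stop in stops)
    ]


# Hypothetical routes and the stops found inside some area.
route_stops = {"r1": ["A", "B"], "r2": ["B", "C"], "r3": ["D"]}
stops_in_area = {"A", "B"}

print(routes_on_area(route_stops, stops_in_area, how="intersect"))  # ['r1', 'r2']
print(routes_on_area(route_stops, stops_in_area, how="within"))     # ['r1']
```

A small area therefore tends to return many intersecting routes but few or no routes within it, as in the Fitzrovia example above.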

In [63]:

hex_region = "48761ad71,48761ad723,48761ad724c,48761ad73c,48761ad744,48761ad75d3,48761ad75d5,48761ad765,48761ad767,48761ad76c,48761ad774,48761ad779,48761ad77b,48761ad783,48761ad784c,48761ad7854,48761ad794,48761ad79c,48761ad7a4,48761ad7ac,48761ad7b1,48761ad7bc"
n.schedule.stops_on_spatial_condition(hex_region)

Out[63]:

['490000091G.link:1242',
 '490000091H.link:1912',
 '490000091F',
 '490000091E',
 '490000091G',
 '490000091H',
 '9400ZZLUGPS2',
 '490013600C']

Temporal¶

Temporal methods are still under construction. In the meantime, one useful method is `trips_with_stops_to_dataframe`, which generates a pandas.DataFrame of departure and arrival times between all consecutive stops for all trips.

In [64]:

n.schedule.trips_with_stops_to_dataframe(gtfs_day="20200101").head()

Out[64]:

	service_name	service_id	arrival_time	from_stop_name	to_stop	route_id	from_stop	departure_time	mode	to_stop_name	route_name	trip_id	vehicle_id
0	205	12430	2020-01-01 16:35:25	Euston Square (Stop P)	4900020147W.link:2634	VJ06420fdab0dfe5c8e7f2f9504df05cf6289cd7d3	490000078P.link:1383	2020-01-01 16:33:42	bus	University College Hosp Warren Street Stn (Sto...	205	VJ03f4f8905d6dc7868242f3fd29828ee9b366a906_16:...	veh_388_bus
1	205	12430	2020-01-01 16:37:08	University College Hosp Warren Street Stn (Sto...	490000252V.link:1182	VJ06420fdab0dfe5c8e7f2f9504df05cf6289cd7d3	4900020147W.link:2634	2020-01-01 16:35:25	bus	Warren Street Station (Stop V)	205	VJ03f4f8905d6dc7868242f3fd29828ee9b366a906_16:...	veh_388_bus
2	205	12430	2020-01-01 16:38:51	Warren Street Station (Stop V)	490000091G.link:1242	VJ06420fdab0dfe5c8e7f2f9504df05cf6289cd7d3	490000252V.link:1182	2020-01-01 16:37:08	bus	Great Portland Street (Stop G)	205	VJ03f4f8905d6dc7868242f3fd29828ee9b366a906_16:...	veh_388_bus
3	205	12430	2020-01-01 16:40:34	Great Portland Street (Stop G)	490000191B.link:305	VJ06420fdab0dfe5c8e7f2f9504df05cf6289cd7d3	490000091G.link:1242	2020-01-01 16:38:51	bus	Regent's Park (Stop B)	205	VJ03f4f8905d6dc7868242f3fd29828ee9b366a906_16:...	veh_388_bus
4	205	12430	2020-01-01 16:42:17	Regent's Park (Stop B)	490007807W.link:2922	VJ06420fdab0dfe5c8e7f2f9504df05cf6289cd7d3	490000191B.link:305	2020-01-01 16:40:34	bus	Harley Street (Stop L)	205	VJ03f4f8905d6dc7868242f3fd29828ee9b366a906_16:...	veh_388_bus
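
Because the result is a regular pandas.DataFrame, standard pandas operations apply to it. As a minimal sketch, the snippet below rebuilds the first three rows of the output above by hand (only a subset of the columns) and derives the travel time of each stop-to-stop leg. The `travel_time_s` column name is an assumption made for this example and is not produced by GeNet.

```python
import pandas as pd

# First three rows of the output above, restricted to the columns used here
df = pd.DataFrame(
    {
        "route_id": ["VJ06420fdab0dfe5c8e7f2f9504df05cf6289cd7d3"] * 3,
        "from_stop": [
            "490000078P.link:1383",
            "4900020147W.link:2634",
            "490000252V.link:1182",
        ],
        "to_stop": [
            "4900020147W.link:2634",
            "490000252V.link:1182",
            "490000091G.link:1242",
        ],
        "departure_time": pd.to_datetime(
            ["2020-01-01 16:33:42", "2020-01-01 16:35:25", "2020-01-01 16:37:08"]
        ),
        "arrival_time": pd.to_datetime(
            ["2020-01-01 16:35:25", "2020-01-01 16:37:08", "2020-01-01 16:38:51"]
        ),
    }
)

# Travel time of each leg in seconds (column name chosen for this example)
df["travel_time_s"] = (df["arrival_time"] - df["departure_time"]).dt.total_seconds()

# Mean leg travel time per route
mean_travel_time = df.groupby("route_id")["travel_time_s"].mean()
print(df[["from_stop", "to_stop", "travel_time_s"]])
print(mean_travel_time)
```

The same two lines of analysis should work on the full DataFrame returned by `trips_with_stops_to_dataframe`, provided the time columns are datetimes as they appear in the output above.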

Accessing `Stop`, `Route`, `Service` objects¶

Once you have extracted the IDs of interest, you can access the corresponding objects. You can also modify them; see the Modify Network notebook for usage examples.

Each Service is indexed and can be accessed by its ID. It also has a plot method.

In [65]:

n.schedule.service_ids()[:5]

Out[65]:

['20274', '12430', '15234', '18915', '18853']

In [66]:

service = n.schedule["12430"]
service.print()

Service ID: 12430
Name: 205
Number of routes: 12
Number of stops: 11

In [67]:

# service.plot()

Similarly, each Route is indexed and can be accessed by its ID. It also has a plot method.

In [68]:

n.schedule.route_ids()[:5]

Out[68]:

['VJ375a660d47a2aa570aa20a8568012da8497ffecf',
 'VJ812fad65e7fa418645b57b446f00cba573f2cdaf',
 'VJ6c64ab7b477e201cae950efde5bd0cb4e2e8888e',
 'VJea6046f64f85febf1854290fb8f76e921e3ac96b',
 'VJf6055fdf9ef0dd6d0500b6c11adcfdd4d10655dc']

In [69]:

route = n.schedule.route("VJ948e8caa0f08b9c6bf6330927893942c474b5100")
route.print()

Route ID: VJ948e8caa0f08b9c6bf6330927893942c474b5100
Name: 205
Number of stops: 5
Number of trips: 10

In [70]:

# route.plot()

Finally, each Stop is indexed too, and can be accessed by its ID.

In [71]:

stop = n.schedule.stop("490007807E.link:1154")
stop.print()

Stop ID: 490007807E.link:1154
Projection: epsg:27700
Lat, Lon: 51.52336503, -0.14951799
linkRefId: 1154
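
The extraction methods and the object access shown in this section combine naturally: a spatial query returns IDs, and indexing the Schedule with each ID returns the object. The helper below is a minimal sketch of that pattern. `services_in_region` is a hypothetical name, not part of GeNet, and it relies only on calls used earlier on this page (`services_on_spatial_condition` with a `how` argument, and `schedule[service_id]`).

```python
def services_in_region(schedule, region, how="intersect"):
    """Return the Service objects matching a spatial condition.

    Hypothetical convenience wrapper: `schedule` is expected to behave like
    a GeNet Schedule, and `region` is whatever the spatial methods accept
    (for example a path to a geojson file).
    """
    # Step 1: extract the IDs of interest
    service_ids = schedule.services_on_spatial_condition(region, how=how)
    # Step 2: access each object by its ID
    return [schedule[service_id] for service_id in service_ids]
```

With the network loaded at the top of this page, `services_in_region(n.schedule, "example_data/Fitzrovia_polygon.geojson")` would return objects on which `print()` or `plot()` can then be called.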
