6.2. Validating the Network
object: Google Directions API for speed calculation¶
This page goes through methods available for generating speeds based off of Google Directions requests based off of the network links. Available as a jupyter notebook or wiki page.
You can generate and send google directions API requests based on the network. The process will create a 'car' modal
subgraph and generate API requests for all edges in the subgraph. The number of requests is at most the number of edges
in the subgraph. The process simplifies edges using osmnx
library's method to extract a chains of nodes with no
intersections, in this way reducing the number of requests. If your graph is already simplified, the number of requests
will be equal to the number of edges.
import logging
import os
import random
import matplotlib.pyplot as plt
from shapely.geometry import LineString
import genet
from genet import google_directions, read_matsim
from genet.output.spatial import generate_geodataframes
from genet.utils.io import save_geodataframe
1. Creating requests¶
First of all, we need to read in the network for which the requests are being generated
path_to_matsim_network = "example_data/pt2matsim_network"
network = os.path.join(path_to_matsim_network, "network.xml")
schedule = os.path.join(path_to_matsim_network, "schedule.xml")
vehicles = os.path.join(path_to_matsim_network, "vehicles.xml")
network_epsg = "epsg:27700"
n = read_matsim(
path_to_network=network, epsg=network_epsg, path_to_schedule=schedule, path_to_vehicles=vehicles
)
# you don't need to read the vehicles file, but doing so ensures all vehicles in the schedule
# are of the expected type and the definition of the vehicle is preserved
n.print()
Graph info: Name: Type: MultiDiGraph Number of nodes: 1662 Number of edges: 3166 Average in degree: 1.9049 Average out degree: 1.9049 Schedule info: Schedule: Number of services: 9 Number of routes: 68 Number of stops: 118
Next we define the function to generate requests; it takes as input the network and the list of osm tags
Let's create requests to get speed information for the main roads, print the number of requests, and save requests in a json file.
# osm_tag = takes a list of osm tags to filter the network on, e.g. ['primary', 'secondary', 'tertiary']
osm_tag = ["secondary"]
requests = genet.utils.google_directions.generate_requests(n, osm_tag)
print(len(requests))
genet.utils.google_directions.dump_all_api_requests_to_json(
requests, "example_data/outputs/google_speed_data", "api_requests_send.json"
)
2022-07-14 16:11:30,771 - Generating Google Directions API requests for a non-simplified network. 2022-07-14 16:11:30,799 - Identified 45 edge endpoints 2022-07-14 16:11:30,800 - Identified 57 possible paths 2022-07-14 16:11:30,801 - Processing 57 paths 2022-07-14 16:11:30,805 - Saving Google Directions API requests to example_data/
52
52 requests were generated. The number of requests to be sent is important as it will influence the cost of Google Directions API. Current pricing can be found here: https://developers.google.com/maps/documentation/directions/usage-and-billing
It can be useful to visualise the requests before sending them, to confirm that they are as expected. Section 5 explains how to visualise requests you generate and results you receive from the API using Kepler.
2. Sending requests¶
2.1 Sending requests created in Section 1¶
To send requests to Google Direction API you need a key (read more here). After obtaining a key, you can either pass it to the elevant function directly
# Specify your own key
api_key = "YOUR API KEY"
# Read in the API requests generated in the previous section
path = "example_data/outputs/google_speed_data/api_requests_send.json"
api_requests = genet.utils.google_directions.read_api_requests(path)
Before we send the requests, we need to specify the time of departure:
None
to get duration based on road network and average time-independent traffic conditionsnow
for current time- Time in the future as an integer in seconds since midnight, January 1, 1970 UTC, i.e. unix time (this website is useful for converting to unix time https://www.epochconverter.com )
More details about departure_time
parameter can be found here: https://developers.google.com/maps/documentation/directions/get-directions#departure_time
If you set departure_time
parameter to now
or a time in the future, you can also specify the traffic_model
parameter:
best_guess
(default): indicates that the returned duration_in_traffic should be the best estimate of travel time given what is known about both historical traffic conditions and live traffic. Live traffic becomes more important the closer the departure_time is to now.pessimistic
: indicates that the returned duration_in_traffic should be longer than the actual travel time on most days, though occasional days with particularly bad traffic conditions may exceed this value.optimistic
: indicates that the returned duration_in_traffic should be shorter than the actual travel time on most days, though occasional days with particularly good traffic conditions may be faster than this value.
More details about traffic_model
parameter can be found here: https://developers.google.com/maps/documentation/directions/get-directions#traffic_model
This method will save derived results in the output directory provided, an example can be found here:
example_data/outputs/google_speed_data
.
It comprises of the google polyline of the response and speed derived from distance and time taken to travel as well as information that was generated in order to make the response such as the node IDs in the network for which this response holds, the path_nodes
which denote any extra nodes from the non-simplified chain of nodes/edges in the request, the polyline of the network path, encoded using the same polyline encoding as the Google request polyline; as well as spatial information about the origin and destination of the request and timestamp.
2.2 Generating and sending requests in one step¶
The steps described above give you flexibility in choosing the parts of the network for which you want to send the requests. If you are happy to send the requests for the whole network, you can skip sections 1 and 2.1 and go straight to the steps below (you still need to load the network first).
If you know the API key, you can specify it in the function call:
Or set it as an environmental variable called GOOGLE_DIR_API_KEY
, if using command line: $ export GOOGLE_DIR_API_KEY='key'
If you use AWS, you can also store the key in the Secrets Manager
(read more here)
authenticate to your AWS account and then pass the secret_name
and region_name
to the send_requests_for_network
method:
3. Processing the requests¶
3.1 Attaching the speed values from requests to the network¶
Once the request results have been received, you can read them in from the output directory you specified. For now, we will use an example dataset.
api_requests = google_directions.read_api_requests(
"example_data/example_google_speed_data/api_requests_received.json"
)
random.sample(sorted(api_requests.items()), 1)
[(('1614978621', '351788581'), {'path_nodes': ['1614978621', '455705622', '9475528', '2441993346', '21704017', '351788581'], 'path_polyline': 'kxlyH`jVD\\fBlD|DlFB@z@x@', 'origin': {'id': '1614978621', 'x': 530576.8274248724, 'y': 181393.9033903911, 'lon': -0.11953208708639783, 'lat': 51.5163844332409, 's2_id': 5221390732826561719}, 'destination': {'id': '351788581', 'x': 530408.2266209685, 'y': 181187.08346980734, 'lon': -0.12203698708857721, 'lat': 51.514564733241166, 's2_id': 5221366089140130631}, 'timestamp': 1637671582.669343, 'request_payload': {'geocoded_waypoints': [{'geocoder_status': 'OK', 'place_id': 'ChIJyf-1bDUbdkgRPXzY5S3KuqI', 'types': ['street_address']}, {'geocoder_status': 'OK', 'place_id': 'ChIJ_dwjTssEdkgRiFkpcNxtMhM', 'types': ['street_address']}], 'routes': [{'bounds': {'northeast': {'lat': 51.51638999999999, 'lng': -0.1195499}, 'southwest': {'lat': 51.5145685, 'lng': -0.122045}}, 'copyrights': 'Map data ©2021 Google', 'legs': [{'distance': {'text': '0.3 km', 'value': 273}, 'duration': {'text': '1 min', 'value': 68}, 'duration_in_traffic': {'text': '1 min', 'value': 70}, 'end_address': '78 Long Acre, London WC2E 9NG, UK', 'end_location': {'lat': 51.5145685, 'lng': -0.122045}, 'start_address': '60 Kingsway, London WC2B 6DS, UK', 'start_location': {'lat': 51.51638999999999, 'lng': -0.1195499}, 'steps': [{'distance': {'text': '0.3 km', 'value': 273}, 'duration': {'text': '1 min', 'value': 68}, 'end_location': {'lat': 51.5145685, 'lng': -0.122045}, 'html_instructions': 'Head <b>southwest</b> on <b>Great Queen St</b>/<wbr/><b>B402</b> toward <b>Kingsway</b>/<wbr/><b>A4200</b><div style="font-size:0.9em">Continue to follow B402</div><div style="font-size:0.9em">Leaving toll zone in 270 m at Drury Ln</div>', 'polyline': {'points': 'mxlyHdjVDX@DBBd@`ABFh@dAFLLX@@LVT^FJb@p@NTDHFHX`@BBDF@@FHB@@@@?@AHA@?B?@?B@@@@@DDh@h@BD'}, 'start_location': {'lat': 51.51638999999999, 'lng': -0.1195499}, 'travel_mode': 'DRIVING'}], 'traffic_speed_entry': [], 'via_waypoint': []}], 'overview_polyline': {'points': 'mxlyHdjVF^h@dAbAtBnB`Dd@n@NLPCHDr@t@'}, 'summary': 'B402', 'warnings': [], 'waypoint_order': []}], 'status': 'OK'}, 'parsed_response': {'google_speed': 3.9, 'google_polyline': 'mxlyHdjVF^h@dAbAtBnB`Dd@n@NLPCHDr@t@'}})]
Once you have results, you can attach them to the network. This will create a dictionary of non-simplified edges to which the response data applies.
google_edge_data = google_directions.map_results_to_edges(api_requests)
random.sample(sorted(google_edge_data.items()), 1)
[(('3085109046', '3085109045'), {'google_speed': 4.087248322147651, 'google_polyline': 'kunyHjcUt@rAb@n@f@j@tBxBVh@tAxDn@hEpBrNn@pE'})]
n.edge("9791490", "4698712638")
{0: {'id': '596', 'from': '9791490', 'to': '4698712638', 'freespeed': 4.166666666666667, 'capacity': 600.0, 'permlanes': 1.0, 'oneway': '1', 'modes': {'car'}, 's2_from': 5221390682074967269, 's2_to': 5221390682013665025, 'attributes': {'osm:way:access': 'no', 'osm:way:highway': 'unclassified', 'osm:way:id': 476247613.0, 'osm:way:name': 'Chitty Street'}, 'length': 33.76444553419279}}
If we're working with a network that may have multiple edges between the same pair of nodes, we can restrict the links to which the data will be applied by specifying a modal condition, so that at least only links allowing cars will inherit this data.
def modal_condition(value):
return "car" in value
n.apply_attributes_to_edges(google_edge_data, conditions={"modes": modal_condition})
2022-07-14 16:11:30,927 - Changed Edge attributes for 180 edges
This will result in two new data points in the relevant links: google_speed
and google_polyline
.
n.edge("9791490", "4698712638")
{0: {'id': '596', 'from': '9791490', 'to': '4698712638', 'freespeed': 4.166666666666667, 'capacity': 600.0, 'permlanes': 1.0, 'oneway': '1', 'modes': {'car'}, 's2_from': 5221390682074967269, 's2_to': 5221390682013665025, 'attributes': {'osm:way:access': 'no', 'osm:way:highway': 'unclassified', 'osm:way:id': 476247613.0, 'osm:way:name': 'Chitty Street'}, 'length': 33.76444553419279}}
Next, we can validate the difference between freespeed
and google_speed
.
def speed_difference(link_attribs):
return link_attribs["freespeed"] - link_attribs["google_speed"]
n.apply_function_to_links(speed_difference, "speed_difference")
2022-07-14 16:11:30,968 - 2986 out of 3166 links have not been affected by the function. Links affected: ['1020', '1065', '1066', '1078', '1086', '1087', '1088', '1089', '1090', '1091', '1092', '1123', '1175', '1176', '1178', '1184', '1185', '1186', '1199', '1200', '1201', '1202', '1238', '1240', '1244', '1251', '1252', '1253', '1254', '1256', '1257', '1258', '1259', '1289', '1290', '1317', '1318', '1319', '1320', '146', '147', '1474', '1476', '1477', '1530', '1531', '1532', '1533', '1534', '1535', '1536', '1537', '1638', '1641', '1645', '1646', '1647', '1648', '1649', '1650', '1651', '1652', '1653', '1654', '1655', '1656', '1736', '1760', '1761', '1799', '18', '1800', '1891', '1892', '19', '191', '1917', '1918', '192', '1953', '1954', '1955', '1973', '1992', '2041', '2042', '2043', '2044', '2053', '2054', '2119', '2171', '2178', '2179', '2180', '2181', '2182', '2183', '2184', '2185', '2225', '2247', '2248', '2249', '2275', '2348', '2361', '2362', '2369', '2370', '2371', '2373', '2378', '2381', '2382', '2468', '2469', '252', '2566', '2567', '2568', '2569', '2600', '262', '2673', '2674', '2745', '2839', '2881', '2897', '2898', '2955', '2956', '2979', '2980', '2981', '2982', '30', '3056', '3098', '3099', '31', '3171', '3172', '3183', '327', '33', '334', '3340', '3341', '3342', '3344', '3345', '34', '366', '367', '404', '405', '409', '414', '415', '440', '486', '487', '765', '798', '799', '802', '806', '810', '811', '812', '83', '877', '878', '880', '881', '930', '931', '948'] 2022-07-14 16:11:30,996 - Changed Link attributes for 180 links
You can also choose to set google speed as the freespeed
in the network. But be mindful if you use it for MATSim
simulations, freespeed
denotes the maximum speed a vehicle can travel on a certain link, Google Directions API data
with departure_time='now'
should be ran late at night/early morning ~4am local time to the network for any reliable results.
Otherwise you are adding traffic conditions to the network which should be simulated by demand (population) side of
the model rather than supply (network).
def set_google_speed(link_attribs):
if link_attribs["google_speed"] != 0:
return link_attribs["google_speed"]
else:
return link_attribs["freespeed"]
n.apply_function_to_links(set_google_speed, "freespeed")
2022-07-14 16:11:31,037 - 2986 out of 3166 links have not been affected by the function. Links affected: ['1020', '1065', '1066', '1078', '1086', '1087', '1088', '1089', '1090', '1091', '1092', '1123', '1175', '1176', '1178', '1184', '1185', '1186', '1199', '1200', '1201', '1202', '1238', '1240', '1244', '1251', '1252', '1253', '1254', '1256', '1257', '1258', '1259', '1289', '1290', '1317', '1318', '1319', '1320', '146', '147', '1474', '1476', '1477', '1530', '1531', '1532', '1533', '1534', '1535', '1536', '1537', '1638', '1641', '1645', '1646', '1647', '1648', '1649', '1650', '1651', '1652', '1653', '1654', '1655', '1656', '1736', '1760', '1761', '1799', '18', '1800', '1891', '1892', '19', '191', '1917', '1918', '192', '1953', '1954', '1955', '1973', '1992', '2041', '2042', '2043', '2044', '2053', '2054', '2119', '2171', '2178', '2179', '2180', '2181', '2182', '2183', '2184', '2185', '2225', '2247', '2248', '2249', '2275', '2348', '2361', '2362', '2369', '2370', '2371', '2373', '2378', '2381', '2382', '2468', '2469', '252', '2566', '2567', '2568', '2569', '2600', '262', '2673', '2674', '2745', '2839', '2881', '2897', '2898', '2955', '2956', '2979', '2980', '2981', '2982', '30', '3056', '3098', '3099', '31', '3171', '3172', '3183', '327', '33', '334', '3340', '3341', '3342', '3344', '3345', '34', '366', '367', '404', '405', '409', '414', '415', '440', '486', '487', '765', '798', '799', '802', '806', '810', '811', '812', '83', '877', '878', '880', '881', '930', '931', '948'] 2022-07-14 16:11:31,065 - Changed Link attributes for 180 links
n.edge("9791490", "4698712638")
{0: {'id': '596', 'from': '9791490', 'to': '4698712638', 'freespeed': 4.166666666666667, 'capacity': 600.0, 'permlanes': 1.0, 'oneway': '1', 'modes': {'car'}, 's2_from': 5221390682074967269, 's2_to': 5221390682013665025, 'attributes': {'osm:way:access': 'no', 'osm:way:highway': 'unclassified', 'osm:way:id': 476247613.0, 'osm:way:name': 'Chitty Street'}, 'length': 33.76444553419279}}
3.2 Validating google speed values¶
Once you have attached the google speed values to the network, you may want to do some validation, to check if the values make sense and if there are any missing values.
To do that, we first need to convert the network to a geodataframe.
# If you only sent the requests for parts of the network with a certain OSM tag,
# you should pass the list of those tags to the function graph_to_gdf()
def graph_to_gdf(network, osm_tag=all):
subgraph_t = network.subgraph_on_link_conditions(
conditions=[{"attributes": {"osm:way:highway": osm_tag}}, {"modes": "car"}],
how=all,
mixed_dtypes=True,
)
# convert subgraph to geodataframe
gdf_dict = generate_geodataframes(subgraph_t)
gdf = gdf_dict["links"]
# fill in missing points (due to network structure when filtering by osm_tag)
gdf.loc[gdf["geometry"].isna(), "geometry"] = gdf.loc[gdf["geometry"].isna()].apply(
lambda x: line_geometry(x["from"], x["to"]), axis=1
)
# convert to epsg:4326 to allow visualisation in Kepler in section 4
gdf = gdf.to_crs("epsg:4326")
return gdf
# For filling in missing points
def line_geometry(u, v):
from_node = n.node(u)
to_node = n.node(v)
return LineString(
[(float(from_node["x"]), float(from_node["y"])), (float(to_node["x"]), float(to_node["y"]))]
)
gdf = graph_to_gdf(n)
with_gs = gdf[gdf["google_speed"].notna()]
google_speed_list = with_gs["google_speed"].to_list()
zeros = sum(i <= 0 for i in google_speed_list)
if zeros > 0:
google_speed_list = google_speed_list.remove(0)
minimum = min(google_speed_list)
maximum = max(google_speed_list)
average = sum(google_speed_list) / len(google_speed_list)
summary = (
"Average value of google_speed is "
+ str(average)
+ " meters/seconds (="
+ str(average * 3.6)
+ " km/hour), "
"maximum value of google_speed is "
+ str(maximum)
+ " m/s (="
+ str(maximum * 3.6)
+ " km/h), minimum (non-zero) "
"value of google_speed is "
+ str(minimum)
+ " m/s (="
+ str(minimum * 3.6)
+ " km/h); there are "
+ str(zeros)
+ " links "
"with google_speed value equal to 0 m/s"
)
logging.info(summary)
2022-07-14 16:11:32,349 - Average value of google_speed is 3.815765080269415 meters/seconds (=13.736754288969895 km/hour), maximum value of google_speed is 8.0 m/s (=28.8 km/h), minimum (non-zero) value of google_speed is 2.0416666666666665 m/s (=7.35 km/h); there are 0 links with google_speed value equal to 0 m/s
4. Visualising google speeds¶
There various tools available to visualise a genet network, and these are described in detail in the notebook 7. Visualising Network
. Here, we will have a quick look at how to visualise the google speeds in particular.
First, we can do a quick visualisation using GeoPandas tools themselves, by plotting the parts of the network which have a google_speed
value.
fig, ax = plt.subplots(1, 1, figsize=(12, 8))
gdf.plot(column="google_speed", ax=ax, legend=True, cmap="hot", linewidth=2.5)
<matplotlib.axes._subplots.AxesSubplot at 0x7f7a55d1ded0>
Since the visualisation above is quite simplistic, you may want to use Kepler instead. To do so, you just need to save the geodataframe in geojson format, and upload the file to Kepler: https://kepler.gl/demo
logging.info("saving network links with valid google speed values to geojson")
save_geodataframe(
with_gs, "api_requests_viz", "example_data/outputs/google_speed_data/", filetype="geojson"
)
2022-07-14 16:11:32,789 - saving network links with valid google speed values to geojson
Once the geojson file is uploaded to Kepler, click on the button next to Stroke Colour
field (shown by green arrow in the image below). Then, in the field Stroke Color Based On
choose google_speed
from the drop down menu. You can also click on the table icon in the top right corner to display the legend.
You can save this map in html format by clicking on the 'Share' button in the top left corner and selecting 'Export Map'.