PAM Read-Modify-Write¶
This notebook introduces the basic read-modify-write use case of PAM:
- Read: Load activity plans from existing data (either tabular or MATSim)
- Modify: Use the PAM API to modify the activity plans
- Write: Write activity plans back to disk in the chosen format
For this example, we use policies to make our modifications. But you might also try the following:
- spatial sampling
- location modelling
- rescheduling
- adding noise
- simulating aging or the passing of time
- and so on...
import os
from collections import defaultdict
import geopandas as gp
import pandas as pd
from matplotlib import pyplot as plt
from pam import policy, read
from pam.policy import apply_policies
%matplotlib inline
Load Data¶
Here we load simple travel diary data of London commuters. This is a very simple 0.1% sample of data about work and education commutes from the 2011 census. Because we're sharing this data, we've aggregated locations to borough level and randomized personal attributes; so, don't get too excited about the results.
The data is available in the data/example_data sub-directory. All data paths in this example are relative to the notebook directory in the PAM repository.
trips = pd.read_csv(
    os.path.join("data", "example_data", "example_travel_diaries.csv"), index_col="uid"
)
attributes = pd.read_csv(
    os.path.join("data", "example_data", "example_attributes.csv"), index_col="pid"
)
trips.head(10)
uid | pid | hid | seq | hzone | ozone | dzone | purp | mode | tst | tet | freq |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | census_0 | census_0 | 0 | Harrow | Harrow | Camden | work | pt | 444 | 473 | 1000 |
1 | census_0 | census_0 | 1 | Harrow | Camden | Harrow | work | pt | 890 | 919 | 1000 |
2 | census_1 | census_1 | 0 | Greenwich | Greenwich | Tower Hamlets | work | pt | 507 | 528 | 1000 |
3 | census_1 | census_1 | 1 | Greenwich | Tower Hamlets | Greenwich | work | pt | 1065 | 1086 | 1000 |
4 | census_2 | census_2 | 0 | Croydon | Croydon | Croydon | work | pt | 422 | 425 | 1000 |
5 | census_2 | census_2 | 1 | Croydon | Croydon | Croydon | work | pt | 917 | 920 | 1000 |
6 | census_3 | census_3 | 0 | Haringey | Haringey | Redbridge | work | pt | 428 | 447 | 1000 |
7 | census_3 | census_3 | 1 | Haringey | Redbridge | Haringey | work | pt | 1007 | 1026 | 1000 |
8 | census_4 | census_4 | 0 | Hounslow | Hounslow | Westminster,City of London | work | car | 483 | 516 | 1000 |
9 | census_4 | census_4 | 1 | Hounslow | Westminster,City of London | Hounslow | work | car | 1017 | 1050 | 1000 |
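The tst and tet columns hold each trip's start and end time. The sketch below shows one way to read them, assuming (the table itself does not say so) that they are minutes after midnight. `minutes_to_clock` is a hypothetical helper for illustration, not a PAM function.

```python
def minutes_to_clock(minutes: int) -> str:
    """Format minutes after midnight as HH:MM (hypothetical helper, not part of PAM)."""
    hours, mins = divmod(minutes, 60)
    return f"{hours:02d}:{mins:02d}"


# tst/tet of the first two rows in the table above (census_0, Harrow <-> Camden).
for tst, tet in [(444, 473), (890, 919)]:
    print(minutes_to_clock(tst), "->", minutes_to_clock(tet), f"({tet - tst} min)")
```

Under that assumption, census_0 leaves home at 07:24 and starts the return trip at 14:50, each leg taking 29 minutes.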
Read¶
First we load the example travel diary data into activity plans. This data represents a 2011 baseline population of London commuters.
population = read.load_travel_diary(trips, attributes, trip_freq_as_person_freq=True)
Using tour based purpose parser (recommended)
Adding pid->hh mapping to persons_attributes from trips.
Adding home locations to persons attributes using trips attributes.
Using freq of 'None' for all trips.
Let's check out an example Activity Plan and Attributes:
household = population.households["census_12"]
person = household.people["census_12"]
person.print()
Person: census_12
{'gender': 'female', 'job': 'education', 'occ': 'white', 'inc': 'high', 'hzone': 'Croydon'}
0: Activity(act:home, location:Croydon, time:00:00:00 --> 07:06:00, duration:7:06:00)
1: Leg(mode:pt, area:Croydon --> Tower Hamlets, time:07:06:00 --> 07:45:00, duration:0:39:00)
2: Activity(act:education, location:Tower Hamlets, time:07:45:00 --> 15:54:00, duration:8:09:00)
3: Leg(mode:pt, area:Tower Hamlets --> Croydon, time:15:54:00 --> 16:33:00, duration:0:39:00)
4: Activity(act:home, location:Croydon, time:16:33:00 --> 00:00:00, duration:7:27:00)
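The printed plan alternates activities and legs, starting and ending at home. To make that structure concrete, here is a simplified sketch of how a day's trips could be folded into such a sequence. It is not PAM's parser: `Component` and `trips_to_plan` are hypothetical names, times are assumed to be minutes after midnight, and the day is assumed to start at home and end with a trip back home.

```python
from dataclasses import dataclass

DAY_END = 24 * 60  # assumption: times are minutes after midnight


@dataclass
class Component:
    kind: str   # "activity" or "leg"
    label: str  # activity type, or travel mode for a leg
    area: str   # zone, or "origin --> destination" for a leg
    start: int
    end: int


def trips_to_plan(home_zone, trips):
    """Fold an ordered list of trip dicts into alternating activities and legs."""
    plan = []
    act, area, clock = "home", home_zone, 0
    for i, trip in enumerate(trips):
        # The activity that precedes this trip ends when the trip starts.
        plan.append(Component("activity", act, area, clock, trip["tst"]))
        leg_area = f"{trip['ozone']} --> {trip['dzone']}"
        plan.append(Component("leg", trip["mode"], leg_area, trip["tst"], trip["tet"]))
        # Simplification: the final trip of the day is assumed to return home.
        is_last = i == len(trips) - 1
        act = "home" if is_last else trip["purp"]
        area, clock = trip["dzone"], trip["tet"]
    plan.append(Component("activity", act, area, clock, DAY_END))
    return plan


# The two census_0 trips from the table in the Load Data section.
census_0 = [
    {"ozone": "Harrow", "dzone": "Camden", "purp": "work", "mode": "pt", "tst": 444, "tet": 473},
    {"ozone": "Camden", "dzone": "Harrow", "purp": "work", "mode": "pt", "tst": 890, "tet": 919},
]
plan = trips_to_plan("Harrow", census_0)
for component in plan:
    print(component)
```

Two trips yield five components: home, leg, work, leg, home. Real diaries need more care than this (for example trips that do not end at home), which is what the purpose parser mentioned in the load output is for.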
Before we modify any activities, we create a simple function to extract some example statistics. We include this as a simple demo, but would love to add more.
Note that activity plans allow us to consider detailed joint segmentations, such as socio-economic, spatial, temporal, modal, activity sequence and so on.
def print_simple_stats(population):
    """Print some simple population statistics."""
    time_at_home = 0
    travel_time = 0
    low_income_central_trips = 0
    high_income_central_trips = 0
    for hh in population.households.values():
        for person in hh.people.values():
            freq = person.freq
            for p in person.plan:
                if p.act == "travel":
                    duration = p.duration.seconds * freq / 3600
                    travel_time += duration
                    if p.end_location.area == "Westminster,City of London":
                        if person.attributes["inc"] == "low":
                            low_income_central_trips += freq
                        elif person.attributes["inc"] == "high":
                            high_income_central_trips += freq
                else:  # activity
                    if p.act == "home":
                        duration = p.duration.seconds * freq / 3600
                        time_at_home += duration
    print(f"Population total time at home: {time_at_home/1000000:.2f} million hours")
    print(f"Population total travel time: {travel_time/1000000:.2f} million hours")
    print(f"Low income trips to Central London: {low_income_central_trips} trips")
    print(f"High income trips to Central London: {high_income_central_trips} trips")
print_simple_stats(population)
Population total time at home: 0.76 million hours
Population total travel time: 0.03 million hours
Low income trips to Central London: 3000 trips
High income trips to Central London: 4000 trips
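Each duration in `print_simple_stats` is converted to hours and weighted by the person's `freq`. Here is a small worked example of that arithmetic using only the standard library. It takes the 7:06:00 home activity from the census_12 plan and an illustrative weight of 1000, the freq value shown in the trips table. It also shows a general property of `datetime.timedelta`: `.seconds` ignores whole days, so `total_seconds()` is the safer choice wherever a duration could reach 24 hours.

```python
from datetime import timedelta

freq = 1000  # illustrative weight, as in the freq column of the trips table

home = timedelta(hours=7, minutes=6)
weighted_hours = home.seconds * freq / 3600
print(weighted_hours)  # 7100.0 person-hours

# .seconds only holds the sub-day remainder; total_seconds() covers the full span.
full_day = timedelta(hours=24)
print(full_day.seconds, full_day.total_seconds())  # 0 86400.0
```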
def plot_simple_stats(population):
    """Plot some simple population statistics."""
    geoms = gp.read_file(os.path.join("data", "example_data", "geometry.geojson"))
    departures = defaultdict(int)
    arrivals = defaultdict(int)
    for _hid, hh in population.households.items():
        for _pid, person in hh.people.items():
            freq = person.freq
            for p in person.plan:
                if p.act == "travel":
                    departures[p.start_location.area] += freq
                    arrivals[p.end_location.area] += freq
    geoms["departures"] = geoms.NAME.map(departures)
    geoms["arrivals"] = geoms.NAME.map(arrivals)
    fig, ax = plt.subplots(1, 2, figsize=(16, 6))
    for i, name in enumerate(["departures", "arrivals"]):
        ax[i].title.set_text(name)
        geoms.plot(name, ax=ax[i])
        ax[i].axis("off")
plot_simple_stats(population)
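The tallying step inside `plot_simple_stats` is easier to follow in isolation. The sketch below swaps plan legs for plain `(origin, destination, freq)` tuples taken from the first six rows of the trips table; the tuples are a stand-in for illustration, not a PAM data structure.

```python
from collections import defaultdict

# (origin, destination, freq) for the first six rows of the trips table
legs = [
    ("Harrow", "Camden", 1000),
    ("Camden", "Harrow", 1000),
    ("Greenwich", "Tower Hamlets", 1000),
    ("Tower Hamlets", "Greenwich", 1000),
    ("Croydon", "Croydon", 1000),
    ("Croydon", "Croydon", 1000),
]

departures = defaultdict(int)  # missing zones start at 0
arrivals = defaultdict(int)
for origin, destination, freq in legs:
    departures[origin] += freq
    arrivals[destination] += freq

print(dict(departures))
print(dict(arrivals))
```

Every leg adds its weight once to a departure zone and once to an arrival zone, so the two totals always match.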
Modify¶
Our 2011 baseline London population of commuters seems sensible: according to the statistics printed above, they spend about 0.76 million hours at home and 0.03 million hours travelling.
But what if we want to try and build some more up-to-date scenarios?
We consider two scenarios from a combination of policies:
Scenario A - Do Minimum:
- A household will be quarantined with p=0.025 (for example due to a positive virus test within the household)
- A person will stay at home (self-isolating) with p=0.1 (for example due to being a vulnerable person)
Scenario B - Lockdown:
- As above, plus education and work activities will be removed and plans adjusted with p=0.9 (for example because schools and workplaces are closed)
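To get a feel for how the Scenario A probabilities combine, here is a rough sketch. It assumes the two policies are sampled independently and looks at a single-person household; it is not PAM's sampling code. Under those assumptions, the chance that a person is affected by at least one policy is 1 - (1 - 0.025)(1 - 0.1) = 0.1225, which a quick seeded simulation should approximate.

```python
import random

P_HOUSEHOLD = 0.025  # household quarantined
P_PERSON = 0.1       # person stays at home

# Probability that at least one of two independent policies applies.
p_affected = 1 - (1 - P_HOUSEHOLD) * (1 - P_PERSON)
print(f"analytic share affected: {p_affected:.4f}")

# Monte Carlo check with a fixed seed, one single-person household per draw.
rng = random.Random(0)
n = 100_000
hits = 0
for _ in range(n):
    quarantined = rng.random() < P_HOUSEHOLD
    stays_home = rng.random() < P_PERSON
    if quarantined or stays_home:
        hits += 1
simulated = hits / n
print(f"simulated share affected: {simulated:.4f}")
```

In larger households the household-level policy hits several people at once, so the person-level share stays the same but outcomes within a household become correlated.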
policy1 = policy.HouseholdQuarantined(probability=0.025)
policy2 = policy.PersonStayAtHome(probability=0.1)
policy3 = policy.RemoveHouseholdActivities(["education", "work"], probability=0.9)
do_minimum = apply_policies(population, [policy1, policy2])
lockdown = apply_policies(population, [policy1, policy2, policy3])
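Both scenarios are built from the same `population`, which only gives comparable results if applying policies leaves the baseline untouched. The sketch below illustrates that copy-then-modify pattern with a plain dictionary. `apply_all` and `stay_home` are hypothetical stand-ins, and whether `apply_policies` copies its input internally is an assumption worth checking against the PAM documentation.

```python
from copy import deepcopy

# Toy stand-in for a population: person id -> sequence of activity types.
baseline = {
    "census_0": ["home", "work", "home"],
    "census_1": ["home", "education", "home"],
}


def stay_home(plans):
    """Hypothetical modifier: replace every plan with a single home activity."""
    for pid in plans:
        plans[pid] = ["home"]


def apply_all(population, modifiers):
    """Apply modifiers to a deep copy so the input is left unchanged."""
    modified = deepcopy(population)
    for modify in modifiers:
        modify(modified)
    return modified


scenario = apply_all(baseline, [stay_home])
print(baseline["census_0"])  # ['home', 'work', 'home']
print(scenario["census_0"])  # ['home']
```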
print_simple_stats(do_minimum)
plot_simple_stats(do_minimum)
Population total time at home: 0.67 million hours
Population total travel time: 0.02 million hours
Low income trips to Central London: 3000 trips
High income trips to Central London: 4000 trips
print_simple_stats(lockdown)
plot_simple_stats(lockdown)
Population total time at home: 0.03 million hours
Population total travel time: 0.00 million hours
Low income trips to Central London: 1000 trips
High income trips to Central London: 0 trips
Write¶
Assuming we are happy with our modified activity sequences, we can write them to disk in our desired format. For this example, we haven't prepared the population for MATSim, so we write to disk as travel plans/diaries:
do_minimum.to_csv(os.path.join("tmp", "do_min"))
lockdown.to_csv(os.path.join("tmp", "lockdown"))
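The paths above point into a tmp directory. If it might not exist yet, it can be created up front with the standard library. This is a sketch that assumes each path is treated as an output directory; it does not rely on whether `to_csv` creates missing directories itself. A throwaway temporary directory stands in for the notebook directory so the snippet can be run anywhere.

```python
import os
import tempfile

base = tempfile.mkdtemp()  # stand-in for the notebook directory

created = []
for scenario_name in ["do_min", "lockdown"]:
    out_dir = os.path.join(base, "tmp", scenario_name)
    os.makedirs(out_dir, exist_ok=True)  # no error if the directory already exists
    created.append(out_dir)

print(created)
```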