Demo - MATSim Population for West London¶

This notebook demonstrates a complex example workflow for creating a sample population for an area in West London. It creates agent plans for people and households using a random process.

Aim¶

Automatically create a larger, more realistic sample population for an area of West London that we call Londinium. The sample population includes a variety of activities, personal attributes and modes, and can be used as input to a MATSim transport simulation.

Steps:

Import geographic data for Londinium;
Sample facilities from OpenStreetMap data;
Generate activities with a home-based tour model, giving agents different personal attributes, activities and trips;
Visualize and validate the data by plotting the activity plans, trip distances and trip durations of the population;
Export intermediate CSV tables of the population.

In [1]:

import os

import geopandas as gp
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pam.activity import Activity, Leg
from pam.core import Household, Person, Population
from pam.plot.stats import plot_activity_times
from pam.read import load_travel_diary
from pam.report.benchmarks import distance_counts, duration_counts
from pam.samplers import facility
from pam.utils import minutes_to_datetime as mtdt
from pam.variables import END_OF_DAY
from pam.write import to_csv, write_matsim, write_od_matrices

%matplotlib inline

Import geographic data of Londinium¶

In [2]:

# Paths to the geographic data for the West London area
network_bb_path = os.path.join("data", "network_bounding_box.geojson")
lsoas_path = os.path.join("data", "lsoas")  # lsoas: lower layer super output areas

We start by plotting the Londinium boundary.

In [3]:

# Read the file and plot the boundary
boundary = gp.read_file(network_bb_path)

# Transform to epsg:27700
boundary = boundary.to_crs("epsg:27700")
boundary.plot()

Out[3]:

<Axes: >

Next, we plot the Londinium outline shown above over a map of London to see exactly where it is located.

In [4]:

# Plot the boundary area over the LSOAs
lsoas = gp.read_file(lsoas_path)
lsoas.crs = "EPSG:27700"
print(lsoas.crs)
lsoas = lsoas.set_index("LSOA_CODE")

fig, ax = plt.subplots(figsize=(10, 10))
lsoas.plot(ax=ax)
boundary.plot(ax=ax, color="red")

EPSG:27700

Out[4]:

<Axes: >

Finally, we will plot Londinium with LSOA boundaries included.

In [5]:

# Clip the LSOAs to the boundary with a geopandas overlay
lsoas_clipped = gp.overlay(lsoas, boundary, how="intersection")
lsoas_clipped.plot()

Out[5]:

<Axes: >

In [6]:

lsoas_clipped.head()

Out[6]:

	LSOA_NAME	MSOA_CODE	MSOA_NAME	STWARDCODE	STWARDNAME	LA_CODE	LA_NAME	geometry
0	Hammersmith and Fulham 010A	E02000381	Hammersmith and Fulham 010	00ANGA	Addison	00AN	Hammersmith and Fulham	POLYGON ((523932.247 179242.842, 523959.439 17...
1	Hammersmith and Fulham 010B	E02000381	Hammersmith and Fulham 010	00ANGA	Addison	00AN	Hammersmith and Fulham	POLYGON ((524171.272 179363.077, 524212.654 17...
2	Hammersmith and Fulham 012A	E02000383	Hammersmith and Fulham 012	00ANGC	Avonmore and Brook Green	00AN	Hammersmith and Fulham	POLYGON ((524167.660 178997.302, 524060.845 17...
3	Hammersmith and Fulham 012B	E02000383	Hammersmith and Fulham 012	00ANGC	Avonmore and Brook Green	00AN	Hammersmith and Fulham	POLYGON ((523774.000 178714.003, 523831.847 17...
4	Hammersmith and Fulham 012C	E02000383	Hammersmith and Fulham 012	00ANGC	Avonmore and Brook Green	00AN	Hammersmith and Fulham	MULTIPOLYGON (((524422.688 178825.081, 524379....

Facility sampler¶

In [7]:

facilities_path = "data/londinium_facilities_sample.geojson"
facilities = gp.read_file(facilities_path)
facilities = facilities.rename({"activities": "activity"}, axis=1)
facilities.crs = "EPSG:27700"
facilities.head()

Out[7]:

	activity	area	distance_to_nearest_education	distance_to_nearest_medical	distance_to_nearest_shop	distance_to_nearest_transit	floor_area	id	levels	units	geometry
0	home	574	617.965594	516.743962	77.712882	466.059745	1148.0	1084822608	2.0	1.0	POINT (524877.659 179721.080)
1	home	66	143.055807	115.674294	125.537224	286.017738	198.0	368319574	3.0	1.0	POINT (527830.357 174758.729)
2	home	103	54.946075	214.532285	41.572871	93.975944	412.0	1640220880	4.0	1.0	POINT (526060.994 178970.515)
3	home	192	164.455318	216.217139	111.674214	180.452314	768.0	1741392588	4.0	1.0	POINT (526698.625 178513.841)
4	home	123	173.648285	249.190465	188.276309	139.258340	246.0	984446626	2.0	1.0	POINT (526369.238 179166.396)

We start by plotting two of the facility types: educational and medical facilities.

In [8]:

education = facilities[facilities["activity"] == "education"]
medical = facilities[facilities["activity"] == "medical"]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 12))

boundary.plot(ax=ax1, color="steelblue")
education.plot(ax=ax1, color="orange", label="Educational facilities")
ax1.legend()

boundary.plot(ax=ax2, color="steelblue")
medical.plot(ax=ax2, color="red", label="Medical facilities")
ax2.legend()

Out[8]:

<matplotlib.legend.Legend at 0x1525c5000>

In [9]:

lsoas_clipped.crs = "EPSG:27700"
len(lsoas_clipped)

Out[9]:

In [10]:

lsoas_clipped = lsoas_clipped.set_index("LSOA_NAME")

In [11]:

# build the sampler
facility_sampler = facility.FacilitySampler(
    facilities=facilities, zones=lsoas_clipped, build_xml=True, fail=False, random_default=True
)

Joining facilities data to zones, this may take a while.

Building sampler, this may take a while.
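
To illustrate what `fail=False` and `random_default=True` appear to do (judging from the "Using random sample for zone" messages printed later in this notebook), here is a minimal, dependency-free sketch. `ToyFacilitySampler` and its bounding-box zones are hypothetical simplifications and not PAM's implementation.

```python
import random


class ToyFacilitySampler:
    """Hypothetical stand-in for a facility sampler (not PAM's implementation)."""

    def __init__(self, facilities, zone_bounds, fail=False, random_default=True, seed=None):
        # facilities: {(zone, activity): [(x, y), ...]}
        # zone_bounds: {zone: (min_x, min_y, max_x, max_y)}, a simplification of zone polygons
        self.facilities = facilities
        self.zone_bounds = zone_bounds
        self.fail = fail
        self.random_default = random_default
        self.rng = random.Random(seed)

    def sample(self, zone, activity):
        candidates = self.facilities.get((zone, activity), [])
        if candidates:
            return self.rng.choice(candidates)
        if self.fail:
            raise KeyError(f"No {activity} facilities in zone {zone}")
        if self.random_default:
            # Fall back to a random point inside the zone's bounding box.
            min_x, min_y, max_x, max_y = self.zone_bounds[zone]
            return (self.rng.uniform(min_x, max_x), self.rng.uniform(min_y, max_y))
        return None


sampler = ToyFacilitySampler(
    facilities={("zone_a", "home"): [(10.0, 20.0), (12.0, 22.0)]},
    zone_bounds={"zone_a": (0.0, 0.0, 100.0, 100.0)},
    seed=0,
)
known = sampler.sample("zone_a", "home")  # drawn from the listed facilities
fallback = sampler.sample("zone_a", "gym")  # no gym listed, so a random point is used
```

With `fail=True` the second call would raise instead of falling back, which can be useful when missing facilities should be treated as a data error.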

Activity generation model¶

In [12]:

# Create random area sample

def random_area_sampler():
    indexes = list(lsoas_clipped.index)
    return np.random.choice(indexes)

random_area_sampler()  # test

Out[12]:

'Hammersmith and Fulham 016C'
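
`random_area_sampler` draws from NumPy's global random state, so each run of the notebook can return a different zone. A minimal sketch of a reproducible variant follows. It uses the standard library's `random.Random` to stay dependency-free, and `make_area_sampler` and the short list of zone names are illustrative only.

```python
import random


def make_area_sampler(zone_names, seed=None):
    """Return a function that picks a random zone name from a private, seeded generator."""
    zones = list(zone_names)
    rng = random.Random(seed)

    def sample_area():
        return rng.choice(zones)

    return sample_area


zones = ["Hammersmith and Fulham 016C", "Lambeth 010A", "Wandsworth 004D"]
sampler_a = make_area_sampler(zones, seed=42)
sampler_b = make_area_sampler(zones, seed=42)

# Two samplers built with the same seed produce the same sequence of zones.
draws_a = [sampler_a() for _ in range(5)]
draws_b = [sampler_b() for _ in range(5)]
```

With NumPy, a comparable approach would be to create a generator with `np.random.default_rng(seed)` and call its `choice` method.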

Each agent follows a simple home-based tour within 24 hours.
We create different activity types (work, leisure, education, shopping, etc.) and different transport modes (car, bus, subway, etc.).
Each activity and each trip is assigned a random duration.
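
The timing logic of such a tour can be sketched without PAM. The hypothetical `sketch_tour` below mirrors the ranges used in this notebook's generator (departure between 06:00 and 08:39, one to four activities, legs of 10 to 89 minutes) but stores plain tuples rather than PAM `Activity` and `Leg` objects, and it omits locations.

```python
import random


def sketch_tour(rng):
    """Build one home-based tour as a list of (kind, label, start_min, end_min) tuples."""
    main_activities = ["leisure", "work", "shop", "medical", "education", "park", "pub", "gym"]
    sub_activities = ["shop", "medical", "pub", "gym"]
    modes = ["car", "bus", "ferry", "rail", "subway", "bike", "walk"]

    # random.randint includes both endpoints, unlike np.random.randint, so these
    # ranges mirror np.random.randint(6, 8) and np.random.randint(0, 100).
    leave_time = rng.randint(6, 7) * 60 + rng.randint(0, 99)
    work_duration = rng.randint(3, 5) * 60
    other_duration = rng.randint(1, 2) * 60

    plan = [("activity", "home", 0, leave_time)]
    for k in range(rng.randint(1, 4)):
        arrive = leave_time + rng.randint(10, 89)  # travel time in minutes
        act = rng.choice(main_activities if k < 2 else sub_activities)
        depart = arrive + (work_duration if act == "work" else other_duration)
        plan.append(("leg", rng.choice(modes), leave_time, arrive))
        plan.append(("activity", act, arrive, depart))
        leave_time = depart

    arrive_home = leave_time + rng.randint(10, 89)
    plan.append(("leg", rng.choice(modes), leave_time, arrive_home))
    plan.append(("activity", "home", arrive_home, 24 * 60))
    return plan


plan = sketch_tour(random.Random(1))
for kind, label, start, end in plan:
    print(f"{kind:8s} {label:10s} {start // 60:02d}:{start % 60:02d} -> {end // 60:02d}:{end % 60:02d}")
```

In the worst case (for example two 5-hour work activities plus long legs) a tour in this sketch can run past midnight. The sketch does not guard against that.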

In [13]:

# Map each LSOA name (the index) to its MSOA code and to its local authority name
mapping_dict = dict(zip(lsoas_clipped.index, lsoas_clipped.MSOA_CODE))
mapping_dict1 = dict(zip(lsoas_clipped.index, lsoas_clipped.LA_NAME))

In [14]:

# Generate agents in the West London area


def generate_agents(no_of_agents):
    """
    Randomly create agents with simple home-based tours.
    Each tour starts from home, visits a random number of activities using random transport modes,
    and finally ends at home.

    """
    population = Population()  # Initialise an empty population

    # Create simple personal attributes
    income = ["low", "medium", "high"]
    gender = ["male", "female"]
    sort_age = [
        "0 to 4",
        "5 to 10",
        "11 to 15",
        "16 to 20",
        "21 to 25",
        "26 to 29",
        "30 to 39",
        "40 to 49",
        "50 to 59",
        "60 to 64",
        "65 to 69",
        "70 to 74",
        "75 to 79",
        "80 to 84",
        "85  and over",
    ]

    # Create mode and activities
    transport = ["car", "bus", "ferry", "rail", "subway", "bike", "walk"]
    # NOTE: gym and park are kept in the list below despite a known osmox problem with them
    activity = [
        "leisure",
        "work",
        "shop",
        "medical",
        "education",
        "park",
        "pub",
        "gym",
    ]  # Primary activity
    sub_activity = [
        "shop",
        "medical",
        "pub",
        "gym",
    ]  # People usually spend less time on sub activity

    # Add activity plan for each person
    for i in range(no_of_agents):
        # Create different agents and household
        agent_id = f"agent_{i}"
        hh_id = f"hh_{i}"
        hh = Household(hh_id, freq=1)

        # Adding Activities and Legs alternately to different agents
        # Activity 1 - home
        leaves_home = (np.random.randint(6, 8) * 60) + np.random.randint(0, 100)  # minutes
        location1 = random_area_sampler()
        location1_loc = facility_sampler.sample(location1, "home")
        lsoa_name = mapping_dict.get(location1)  # note: this holds the zone's MSOA code
        lad_name = mapping_dict1.get(location1)

        agent = Person(
            agent_id,
            freq=1,
            attributes={
                "subpopulation": np.random.choice(income) + " income",
                "gender": np.random.choice(gender),
                "age": np.random.choice(sort_age),
                "household_zone": location1,
                "household_LSOA": lsoa_name,
                "household_LAD": lad_name,
            },
        )

        hh.add(agent)
        population.add(hh)

        # Trip duration
        trip_duration_main_activity = np.random.randint(3, 6) * 60
        trip_duration_sub_activity = np.random.randint(1, 3) * 60

        agent.add(
            Activity(
                seq=1,
                act="home",
                area=location1,
                loc=location1_loc,
                start_time=mtdt(0),
                end_time=mtdt(leaves_home),
            )
        )

        # Initiated parameters
        location_prev = location1
        location_prev_loc = location1_loc
        leave_time = leaves_home

        # Add random numbers of activities
        no_of_activities = np.random.randint(1, 5)

        for i in range(no_of_activities):
            arrives_primary = leave_time + np.random.randint(10, 90)  # minutes

            # Activity 2.
            if i < 2:  # Start with main activity
                random_act = np.random.choice(activity)
            else:
                random_act = np.random.choice(sub_activity)

            if random_act == "work":
                leaves_primary = arrives_primary + trip_duration_main_activity
            else:
                leaves_primary = arrives_primary + trip_duration_sub_activity

            # Outbound leg
            location_next = random_area_sampler()
            location_next_loc = facility_sampler.sample(location_next, random_act)

            agent.add(
                Leg(
                    seq=i + 1,
                    mode=np.random.choice(transport),
                    start_area=location_prev,
                    start_loc=location_prev_loc,
                    end_area=location_next,
                    end_loc=location_next_loc,
                    start_time=mtdt(leave_time),
                    end_time=mtdt(arrives_primary),
                )
            )

            agent.add(
                Activity(
                    seq=i + 2,
                    act=random_act,
                    area=location_next,
                    loc=location_next_loc,
                    start_time=mtdt(arrives_primary),
                    end_time=mtdt(leaves_primary),
                )
            )

            # Update parameters
            leave_time = leaves_primary
            location_prev = location_next
            location_prev_loc = location_next_loc

        # Inbound leg
        arrives_home = leave_time + np.random.randint(10, 90)  # minutes
        agent.add(
            Leg(
                seq=no_of_activities + 1,
                mode=np.random.choice(transport),
                start_area=location_next,
                start_loc=location_next_loc,
                end_area=location1,
                end_loc=location1_loc,
                start_time=mtdt(leave_time),
                end_time=mtdt(arrives_home),
            )
        )

        # Activity
        agent.add(
            Activity(
                seq=no_of_activities + 2,
                act="home",
                area=location1,
                loc=location1_loc,
                start_time=mtdt(arrives_home),
                end_time=END_OF_DAY,
            )
        )

    return population

In [15]:

# Create 20 agents and check the population statistics
population = generate_agents(20)
print(population.stats)

Using random sample for zone:Wandsworth 014B:home

Using random sample for zone:Westminster 023A:education

Using random sample for zone:Lambeth 012E:pub

Using random sample for zone:Lambeth 008E:home

Using random sample for zone:Wandsworth 010B:education

Using random sample for zone:Lambeth 017B:gym

Using random sample for zone:Richmond upon Thames 003A:gym

Using random sample for zone:Lambeth 001B:park

Using random sample for zone:Lambeth 007A:shop

Using random sample for zone:Westminster 019D:gym

Using random sample for zone:Lambeth 010A:education

Using random sample for zone:Wandsworth 009A:leisure

Using random sample for zone:Lambeth 016D:home

Using random sample for zone:Wandsworth 007B:gym

Using random sample for zone:Hammersmith and Fulham 021F:park

Using random sample for zone:Hammersmith and Fulham 019C:pub

Using random sample for zone:Wandsworth 012D:pub

Using random sample for zone:Wandsworth 001B:work

Using random sample for zone:Wandsworth 014E:shop

Using random sample for zone:Kensington and Chelsea 017B:education

Using random sample for zone:Hammersmith and Fulham 025B:park

Using random sample for zone:Kensington and Chelsea 020A:medical

Using random sample for zone:Kensington and Chelsea 021D:pub

Using random sample for zone:Kensington and Chelsea 008B:leisure

Using random sample for zone:Hammersmith and Fulham 019B:leisure

Using random sample for zone:Richmond upon Thames 001C:medical

Using random sample for zone:Hammersmith and Fulham 010B:work

Using random sample for zone:Westminster 019B:leisure

Using random sample for zone:Hammersmith and Fulham 012B:education

Using random sample for zone:Lambeth 015C:medical

Using random sample for zone:Lambeth 017D:medical

Using random sample for zone:Hammersmith and Fulham 014D:pub

Using random sample for zone:Lambeth 011D:work

Using random sample for zone:Lambeth 010C:park

Using random sample for zone:Westminster 020B:gym

Using random sample for zone:Wandsworth 004D:education

Using random sample for zone:Wandsworth 017B:medical

Using random sample for zone:Wandsworth 004D:medical

Using random sample for zone:Wandsworth 002E:education

Using random sample for zone:Hammersmith and Fulham 018A:pub

Using random sample for zone:Kensington and Chelsea 013D:gym

Using random sample for zone:Westminster 020C:education

Using random sample for zone:Wandsworth 004D:gym

Using random sample for zone:Hammersmith and Fulham 014C:shop

Using random sample for zone:Wandsworth 019D:gym

{'num_households': 20, 'num_people': 20, 'num_activities': 88, 'num_legs': 68}
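
The counts above can be sanity-checked with a little arithmetic. The check assumes, as in the generator above, that every plan alternates activities and legs and both starts and ends with a home activity, so each person has exactly one more activity than legs.

```python
stats = {"num_households": 20, "num_people": 20, "num_activities": 88, "num_legs": 68}

# Each plan is activity, leg, activity, ..., activity,
# so activities exceed legs by one per person.
expected_activities = stats["num_legs"] + stats["num_people"]
print(expected_activities == stats["num_activities"])  # True

# Average number of out-of-home activities per agent:
# total activities minus the two home activities per person.
out_of_home = stats["num_activities"] - 2 * stats["num_people"]
print(out_of_home / stats["num_people"])  # 2.4
```

An average of 2.4 is close to the 2.5 expected when the number of activities is drawn uniformly from 1 to 4.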

In [16]:

population.random_person().print()

Person: agent_5
{'subpopulation': 'high income', 'gender': 'male', 'age': '50 to 59', 'household_zone': 'Hammersmith and Fulham 019C', 'household_LSOA': 'E02000390', 'household_LAD': 'Hammersmith and Fulham'}
0:	Activity(act:home, location:POINT (524343.4362620147 177108.19360233188), time:00:00:00 --> 06:02:00, duration:6:02:00)
1:	Leg(mode:walk, area:POINT (524343.4362620147 177108.19360233188) --> POINT (527050.9400479832 175587.71339122538), time:06:02:00 --> 06:37:00, duration:0:35:00)
2:	Activity(act:gym, location:POINT (527050.9400479832 175587.71339122538), time:06:37:00 --> 07:37:00, duration:1:00:00)
3:	Leg(mode:rail, area:POINT (527050.9400479832 175587.71339122538) --> POINT (524343.4362620147 177108.19360233188), time:07:37:00 --> 07:57:00, duration:0:20:00)
4:	Activity(act:home, location:POINT (524343.4362620147 177108.19360233188), time:07:57:00 --> 00:00:00, duration:16:03:00)

In [17]:

population.random_person().attributes

Out[17]:

{'subpopulation': 'medium income',
 'gender': 'male',
 'age': '0 to 4',
 'household_zone': 'Kensington and Chelsea 018B',
 'household_LSOA': 'E02000594',
 'household_LAD': 'Kensington and Chelsea'}

Data Visualization and validation¶

In [18]:

# Validate the generated population
population.validate()
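
`population.validate()` produces no output in this notebook. As an illustration of the kind of checks a plan validator might apply (an assumption, not a description of PAM's internals), the hypothetical `check_plan` below tests that a plan starts and ends with an activity, alternates activities and legs, and has no gaps in time.

```python
def check_plan(plan):
    """Validate a plan given as a list of (kind, start_min, end_min) tuples."""
    if not plan or plan[0][0] != "activity" or plan[-1][0] != "activity":
        raise ValueError("a plan must start and end with an activity")
    for (kind_a, _, end_a), (kind_b, start_b, _) in zip(plan, plan[1:]):
        if kind_a == kind_b:
            raise ValueError("activities and legs must alternate")
        if end_a != start_b:
            raise ValueError("components must follow each other without gaps")
    return True


# agent_5's day from the printout above, in minutes after midnight
good_plan = [
    ("activity", 0, 362),  # home until 06:02
    ("leg", 362, 397),  # walk, 06:02 to 06:37
    ("activity", 397, 457),  # gym until 07:37
    ("leg", 457, 477),  # rail, 07:37 to 07:57
    ("activity", 477, 1440),  # home until the end of the day
]
print(check_plan(good_plan))
```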

In [19]:

# Print a random person's activity plan
population.random_person().print()

Person: agent_17
{'subpopulation': 'low income', 'gender': 'male', 'age': '40 to 49', 'household_zone': 'Hammersmith and Fulham 014C', 'household_LSOA': 'E02000385', 'household_LAD': 'Hammersmith and Fulham'}
0:	Activity(act:home, location:POINT (524524.4295456774 177814.43974830746), time:00:00:00 --> 07:12:00, duration:7:12:00)
1:	Leg(mode:rail, area:POINT (524524.4295456774 177814.43974830746) --> POINT (529142.9020819605 176914.57844477845), time:07:12:00 --> 08:04:00, duration:0:52:00)
2:	Activity(act:education, location:POINT (529142.9020819605 176914.57844477845), time:08:04:00 --> 10:04:00, duration:2:00:00)
3:	Leg(mode:bus, area:POINT (529142.9020819605 176914.57844477845) --> POINT (530746.7903905836 178006.64461880838), time:10:04:00 --> 11:26:00, duration:1:22:00)
4:	Activity(act:shop, location:POINT (530746.7903905836 178006.64461880838), time:11:26:00 --> 13:26:00, duration:2:00:00)
5:	Leg(mode:subway, area:POINT (530746.7903905836 178006.64461880838) --> POINT (523535.5622872363 177581.13204180764), time:13:26:00 --> 14:17:00, duration:0:51:00)
6:	Activity(act:pub, location:POINT (523535.5622872363 177581.13204180764), time:14:17:00 --> 16:17:00, duration:2:00:00)
7:	Leg(mode:bus, area:POINT (523535.5622872363 177581.13204180764) --> POINT (525599.5944321514 178654.82806090862), time:16:17:00 --> 17:43:00, duration:1:26:00)
8:	Activity(act:gym, location:POINT (525599.5944321514 178654.82806090862), time:17:43:00 --> 19:43:00, duration:2:00:00)
9:	Leg(mode:car, area:POINT (525599.5944321514 178654.82806090862) --> POINT (524524.4295456774 177814.43974830746), time:19:43:00 --> 20:04:00, duration:0:21:00)
10:	Activity(act:home, location:POINT (524524.4295456774 177814.43974830746), time:20:04:00 --> 00:00:00, duration:3:56:00)

Plot the activities as 24-hour diary schedules for 5 randomly chosen agents.

In [20]:

for _i in range(5):
    p = population.random_person()
    p.plot()

Plot the frequency with which each of the activity types happens throughout the 24-hour period.

In [21]:

fig = plot_activity_times(population)

In [22]:

# Check the duration of trips
durations = duration_counts(population)
durations

/Users/bryn.pickering/Repos/arup-group/pam/src/pam/report/benchmarks.py:128: FutureWarning: The provided callable <built-in function sum> is currently using SeriesGroupBy.sum. In a future version of pandas, the provided callable will be used directly. To keep current behavior pass the string "sum" instead.
  df = df.groupby(dimensions, observed=False)[data_fields].agg(aggfunc).fillna(0)

Out[22]:

	duration	trips
0	0 to 5 min	0
1	5 to 10 min	0
2	10 to 15 min	2
3	15 to 30 min	14
4	30 to 45 min	12
5	45 to 60 min	16
6	60 to 90 min	24
7	90 to 120 min	0
8	120+ min	0
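
The binning behind a table like this can be sketched with the standard library. The bin labels are taken from the table above. Treating the left edge of each bin as inclusive is an assumption, and `duration_bin` and `count_durations` are hypothetical helpers rather than PAM functions.

```python
from bisect import bisect_right

# Bin edges in minutes, matching the labels in the table above.
EDGES = [0, 5, 10, 15, 30, 45, 60, 90, 120]
LABELS = [
    "0 to 5 min", "5 to 10 min", "10 to 15 min", "15 to 30 min", "30 to 45 min",
    "45 to 60 min", "60 to 90 min", "90 to 120 min", "120+ min",
]


def duration_bin(minutes):
    """Return the label of the bin containing a non-negative duration in minutes."""
    return LABELS[bisect_right(EDGES, minutes) - 1]


def count_durations(durations):
    """Count how many durations fall into each bin."""
    counts = {label: 0 for label in LABELS}
    for minutes in durations:
        counts[duration_bin(minutes)] += 1
    return counts


# Leg durations of agent_17 from the printout above: 52, 82, 51, 86 and 21 minutes.
counts = count_durations([52, 82, 51, 86, 21])
```

Because the generator draws every leg duration from 10 to 89 minutes, the bins below 10 minutes and from 90 minutes upward stay empty, as in the table.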

Now plot a bar chart of the trip durations.

In [23]:

plt.barh(durations["duration"], durations["trips"])
plt.xlabel("Counts")
plt.ylabel("Duration for trips")
plt.title("Duration for different trips")
plt.ylim(ymax="90 to 120 min")

Out[23]:

(-0.8400000000000001, 7.0)

In [24]:

# Check the distance of trips
distances = distance_counts(population)
distances

/Users/bryn.pickering/Repos/arup-group/pam/src/pam/report/benchmarks.py:128: FutureWarning: The provided callable <built-in function sum> is currently using SeriesGroupBy.sum. In a future version of pandas, the provided callable will be used directly. To keep current behavior pass the string "sum" instead.
  df = df.groupby(dimensions, observed=False)[data_fields].agg(aggfunc).fillna(0)

Out[24]:

	distance	trips
0	0 to 1 km	3
1	1 to 5 km	53
2	5 to 10 km	12
3	10 to 25 km	0
4	25 to 50 km	0
5	50 to 100 km	0
6	100 to 200 km	0
7	200+ km	0
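
As a worked example of a single trip distance, the sketch below computes the straight-line distance for one leg. Using Euclidean distance in the projected CRS is an assumption about how the benchmark measures distance, and `euclidean_km` is a hypothetical helper.

```python
import math


def euclidean_km(origin, destination):
    """Straight-line distance in km between two (x, y) points given in metres."""
    return math.dist(origin, destination) / 1000


# agent_5's walk leg from the printout earlier in this notebook (EPSG:27700 coordinates).
home = (524343.4362620147, 177108.19360233188)
gym = (527050.9400479832, 175587.71339122538)
distance_km = euclidean_km(home, gym)
print(f"{distance_km:.2f} km")
```

The result is roughly 3.1 km, which would fall in the "1 to 5 km" bin, the most common bin in the table above.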

Next we plot the distribution of trip distances.

In [25]:

plt.barh(distances["distance"], distances["trips"])
plt.xlabel("Counts")
plt.ylabel("distance, km")
plt.title("distance for different trips")
plt.ylim(ymax="25 to 50 km")

Out[25]:

(-0.79, 4.0)

Read/write data¶

Export intermediate CSV tables of population¶

In [26]:

to_csv(population, dir="outputs", crs="epsg:27700")

Plot the distribution of activities by type

In [27]:

df_activity = pd.read_csv(os.path.join("outputs", "activities.csv"))
totals = df_activity.activity.value_counts()
plt.barh(totals.index, totals)
plt.title("activities count")

Out[27]:

Text(0.5, 1.0, 'activities count')

In [28]:

write_od_matrices(population, path="outputs")
od_matrices = pd.read_csv(
    os.path.join("outputs", "total_od.csv")
)  # TODO: make this method consistent with the others, i.e. return a DataFrame
od_matrices["total origins"] = od_matrices.drop("Origin", axis=1).sum(axis=1)
od_matrices

Out[28]:

	Origin	Hammersmith and Fulham 010B	Hammersmith and Fulham 012B	Hammersmith and Fulham 014C	Hammersmith and Fulham 014D	Hammersmith and Fulham 018A	Hammersmith and Fulham 019B	Hammersmith and Fulham 019C	Hammersmith and Fulham 021F	Hammersmith and Fulham 025B	...	Wandsworth 017B	Wandsworth 019D	Westminster 019B	Westminster 019D	Westminster 020B	Westminster 020C	Westminster 023A	Westminster 024A	Westminster 024B	total origins
0	Hammersmith and Fulham 010B	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
1	Hammersmith and Fulham 012B	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
2	Hammersmith and Fulham 014C	0	0	0	0	0	0	0	0	0	...	0	1	0	0	0	0	0	0	0	2
3	Hammersmith and Fulham 014D	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
4	Hammersmith and Fulham 018A	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
5	Hammersmith and Fulham 019B	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
6	Hammersmith and Fulham 019C	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	2
7	Hammersmith and Fulham 021F	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
8	Hammersmith and Fulham 025B	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
9	Kensington and Chelsea 008B	0	0	0	0	0	1	0	0	0	...	0	0	0	0	0	0	0	0	0	1
10	Kensington and Chelsea 009A	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	1	0	0	0	1
11	Kensington and Chelsea 010A	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
12	Kensington and Chelsea 010B	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
13	Kensington and Chelsea 013D	0	0	1	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
14	Kensington and Chelsea 016B	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
15	Kensington and Chelsea 016E	0	0	0	0	0	0	0	1	0	...	0	0	0	0	0	0	0	0	0	1
16	Kensington and Chelsea 017B	0	0	0	0	0	0	0	0	1	...	0	0	0	0	0	0	0	0	0	1
17	Kensington and Chelsea 018A	0	0	0	0	0	0	0	0	0	...	0	0	1	0	0	0	0	0	0	1
18	Kensington and Chelsea 018B	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
19	Kensington and Chelsea 020A	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
20	Kensington and Chelsea 021B	1	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
21	Kensington and Chelsea 021D	0	0	1	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	1	2
22	Lambeth 001B	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
23	Lambeth 004A	0	0	0	0	1	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
24	Lambeth 007A	0	0	0	0	0	0	0	0	0	...	0	0	0	1	0	0	0	0	0	1
25	Lambeth 008E	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
26	Lambeth 010A	0	0	0	0	0	0	1	0	0	...	0	0	0	0	0	0	0	0	0	3
27	Lambeth 010C	0	0	0	0	0	0	0	0	0	...	0	0	0	0	1	0	0	0	0	2
28	Lambeth 011D	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
29	Lambeth 012E	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
30	Lambeth 015C	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
31	Lambeth 015D	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
32	Lambeth 016D	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
33	Lambeth 017B	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
34	Lambeth 017D	0	0	0	1	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
35	Richmond upon Thames 001C	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
36	Richmond upon Thames 003A	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
37	Wandsworth 001B	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
38	Wandsworth 002E	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	2
39	Wandsworth 004B	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
40	Wandsworth 004D	0	0	0	0	0	0	0	0	0	...	1	0	0	0	0	0	0	1	0	3
41	Wandsworth 006B	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	1	1
42	Wandsworth 007B	0	0	0	0	0	0	1	0	0	...	0	0	0	0	0	0	0	0	0	1
43	Wandsworth 007C	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
44	Wandsworth 009A	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
45	Wandsworth 010B	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
46	Wandsworth 012D	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
47	Wandsworth 014B	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	1	0	0	1
48	Wandsworth 014E	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
49	Wandsworth 017B	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
50	Wandsworth 019D	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
51	Westminster 019B	0	1	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
52	Westminster 019D	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
53	Westminster 020B	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
54	Westminster 020C	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
55	Westminster 023A	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
56	Westminster 024A	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	1
57	Westminster 024B	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	2

58 rows × 60 columns

Plot the number of trips originating from each LSOA

In [29]:

lsoas_clipped = lsoas_clipped.reset_index()
origins_heat_map = lsoas_clipped.join(od_matrices["total origins"])

fig, ax = plt.subplots(figsize=(18, 10))
origins_heat_map.plot("total origins", legend=True, ax=ax)
ax.set_title("Total Origins")

Out[29]:

Text(0.5, 1.0, 'Total Origins')
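
Note that `DataFrame.join` aligns rows on index labels, so the cell above pairs the n-th zone geometry with the n-th row of `od_matrices`. That is only correct if both tables list the zones in the same order. A minimal sketch of a name-based alternative follows, using plain pandas frames as toy stand-ins. The column name `zone_name` and the sample values are assumptions made for illustration; they are not taken from the notebook's data.

```python
import pandas as pd

# Toy stand-in for the zones table; `zone_name` is a hypothetical column name.
zones = pd.DataFrame({"zone_name": ["Zone C", "Zone A", "Zone B"]})

# Toy stand-in for the OD summary, keyed by origin zone name.
od_summary = pd.DataFrame(
    {"Origin": ["Zone A", "Zone B"], "total origins": [2, 1]}
)

# Merge on the zone name instead of relying on row position.
# A left merge keeps zones that have no trips; their counts are filled with 0.
origins_by_zone = zones.merge(
    od_summary[["Origin", "total origins"]],
    how="left",
    left_on="zone_name",
    right_on="Origin",
).drop(columns="Origin")
origins_by_zone["total origins"] = (
    origins_by_zone["total origins"].fillna(0).astype(int)
)

print(origins_by_zone)
```

Merging on the name makes the result independent of row order and makes zones without any origin trips explicit, at the cost of needing a shared name column in both tables.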

Reload Tabular Data¶

We now load the CSV files that we previously wrote to disk. This replicates a simple synthesis process of the kind we might typically use with travel diary survey data.

In [30]:

people = pd.read_csv(os.path.join("outputs", "people.csv")).set_index("pid")
hhs = pd.read_csv(os.path.join("outputs", "households.csv")).set_index("hid")
trips = pd.read_csv(os.path.join("outputs", "legs.csv")).drop(["Unnamed: 0"], axis=1)

trips = trips.rename(columns={"origin activity": "oact", "destination activity": "dact"})
trips.head()

Out[30]:

	pid	hid	freq	ozone	dzone	purp	oact	dact	mode	seq	tst	tet	duration
0	agent_0	hh_0	NaN	Wandsworth 014B	Westminster 023A	NaN	home	education	rail	1	1900-01-01 08:13:00	1900-01-01 09:26:00	1:13:00
1	agent_0	hh_0	NaN	Westminster 023A	Lambeth 012E	NaN	education	pub	bike	2	1900-01-01 10:26:00	1900-01-01 11:48:00	1:22:00
2	agent_0	hh_0	NaN	Lambeth 012E	Wandsworth 014B	NaN	pub	home	rail	3	1900-01-01 12:48:00	1900-01-01 14:01:00	1:13:00
3	agent_1	hh_1	NaN	Lambeth 008E	Wandsworth 010B	NaN	home	education	ferry	1	1900-01-01 07:34:00	1900-01-01 08:37:00	1:03:00
4	agent_1	hh_1	NaN	Wandsworth 010B	Lambeth 017B	NaN	education	gym	bus	2	1900-01-01 10:37:00	1900-01-01 11:29:00	0:52:00
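
Before rebuilding a population from the reloaded table, it can be worth checking that the time columns are internally consistent. The sketch below is one possible check, not part of the original workflow: it copies two rows from the output above into a small frame called `trips_sample` (a name introduced here) and confirms that `duration` equals `tet - tst`.

```python
import pandas as pd

# Two rows copied from the `trips.head()` output above.
trips_sample = pd.DataFrame(
    {
        "pid": ["agent_0", "agent_0"],
        "tst": ["1900-01-01 08:13:00", "1900-01-01 10:26:00"],
        "tet": ["1900-01-01 09:26:00", "1900-01-01 11:48:00"],
        "duration": ["1:13:00", "1:22:00"],
    }
)

# Without date parsing options, read_csv leaves these columns as strings,
# so they are parsed here before comparing.
start = pd.to_datetime(trips_sample["tst"])
end = pd.to_datetime(trips_sample["tet"])
duration = pd.to_timedelta(trips_sample["duration"])

# Every leg should last exactly as long as its start and end times imply.
is_consistent = (end - start) == duration
print(is_consistent.all())
```

The same three parsing calls could be applied to the full `trips` table, assuming its columns are formatted like the rows shown.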

In [31]:

population_reloaded = load_travel_diary(trips=trips, persons_attributes=people, hhs_attributes=hhs)

Using from-to activity parser using 'oact' and 'dact' columns

Plot the activities as 24-hour diary schedules

In [32]:

population["hh_0"]["agent_0"].plot()

In [33]:

population_reloaded["hh_0"]["agent_0"].plot()

In [34]:

population == population_reloaded

Out[34]:

False

The populations are not equal because the CSV files do not preserve the coordinates that we previously sampled. We could sample locations again for the reloaded population, but it would still differ from the original, because a new coordinate is sampled for each location.
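
The effect can be illustrated with a toy example that does not use PAM's classes. `ToyActivity` and `sample_point` below are hypothetical stand-ins, introduced only to show why two plans with the same activities and zones still compare as unequal once each carries its own randomly sampled point.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyActivity:
    """Hypothetical stand-in for an activity: a type, a zone and a sampled point."""

    act: str
    zone: str
    point: tuple


def sample_point(rng):
    # Stand-in for facility sampling: draw a random coordinate for a zone.
    return (rng.random(), rng.random())


rng = random.Random(0)
original = ToyActivity("home", "Wandsworth 014B", sample_point(rng))
# The CSV round trip loses the point, so a fresh one has to be drawn.
reloaded = ToyActivity("home", "Wandsworth 014B", sample_point(rng))

# The full comparison includes the sampled point.
print(original == reloaded)
# Comparing only the activity type and zone ignores the sampled point.
print((original.act, original.zone) == (reloaded.act, reloaded.zone))
```

If matching coordinates matter, they would need to be written out alongside the tabular data rather than resampled after reloading.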

Write output to MATSim xml¶

In [35]:

write_matsim(population=population, plans_path=os.path.join("outputs", "population.xml"))