Pedestrian Growth Trends on Arlington's Trails
Building off my previous work on Arlington's Bikeometer data and mapping points with Folium, I decided to contribute something to the upcoming Walk Hack Night II. While I'm not able to attend this event, I still want to contribute to the discussion with some data visualizations about local walking trends.
Many of the Eco-Counters installed on Arlington's trails can count both bicycle and pedestrian traffic. I decided to look through the various counters and find ones that had data for the past three years. My plan was then to calculate annual average daily traffic for the pedestrian traffic at each counter, and then map the annualized growth rate for that metric for each location.
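As a rough sketch of the growth metric, the annualized rate can be treated as a compound annual growth rate between the first and last yearly averages. The function name `annualized_growth_rate` and the sample values below are hypothetical and are not taken from the counter data.

```python
def annualized_growth_rate(first_value, last_value, periods):
    """Compound annual growth rate, in percent, over `periods` yearly intervals."""
    if first_value <= 0 or periods <= 0:
        raise ValueError("first_value and periods must be positive")
    return ((last_value / first_value) ** (1 / periods) - 1) * 100


# Hypothetical averages: 100 walkers per day in 2014 and 121 per day in 2016.
# 2014 to 2016 spans two yearly intervals, so periods=2.
rate = annualized_growth_rate(100, 121, periods=2)
print(round(rate, 2))  # prints 10.0
```

Three years of data give only two year-over-year intervals, which is why the exponent is 1/2 rather than 1/3.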
I used the following Python modules for this project.
import pandas as pd
import requests
from xml.etree import ElementTree
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import folium
Then I grabbed the counter ID numbers for the eleven stations that provided consistent data from January 1, 2014 to December 31, 2016. Those counter IDs are listed below. When I browsed the data, I discovered some very large pedestrian counts that were obvious outliers. Those outliers were probably the result of sensor errors. Upon further investigation, I noticed some of these outliers occurred during the evening hours, which should actually have pedestrian counts close to zero. Something strange seems to happen to these counters at night on occasion.
counterids = ['23', '3', '4', '27', '26', '7', '9', '11', '25', '12', '1']
pedestrian_growth = []
The code below fetches the daily pedestrian count data for each counter in the list above. I dealt with outliers by dropping counts above the 95th percentile, along with any zero counts. I then summed the inbound and outbound pedestrian counts by day. With those daily sums, I calculated the average daily count for each month, and then calculated the yearly means from those monthly averages. Finally, I obtained the annualized rate of growth in average annual daily pedestrian traffic from 2014 to 2016. These statistics are reported below the following Python code.
for cid in counterids:
    # Request daily pedestrian counts (mode=P, interval=d) for this counter.
    GetCountURL = ('http://webservices.commuterpage.com/counters.cfc?wsdl&method=GetCountInDateRange&startDate=1/1/2014&endDate=12/31/2016&direction=&mode=P&interval=d&counterid='
                   + cid)
    xmldata = requests.get(GetCountURL)
    tree = ElementTree.fromstring(xmldata.text)
    date = []
    count = []
    direction = []
    for t in tree.findall('count'):
        date.append(t.attrib['date'])
        count.append(t.attrib['count'])
        direction.append(t.attrib['direction'])
    dfpeds = pd.DataFrame({'date': date, 'count': count, 'direction': direction})
    dfpeds['date'] = pd.to_datetime(dfpeds['date'])
    dfpeds['count'] = dfpeds['count'].astype(int)
    # Drop counts above the 95th percentile and any zero counts.
    below_cutoff = dfpeds['count'] <= dfpeds['count'].quantile(0.95)
    nonzero = dfpeds['count'] != 0
    dfcleaned = dfpeds[below_cutoff & nonzero]
    # Histogram of the cleaned counts as a quick visual check.
    fig = plt.figure()
    fig.suptitle('Counter ID: ' + cid, fontsize=16)
    dfcleaned['count'].plot(kind='hist', bins=20)
    # Sum inbound and outbound counts by day (numeric column only).
    df_bydate = dfcleaned.groupby('date')[['count']].sum()
    df_bydate['month'] = df_bydate.index.month
    df_bydate['year'] = df_bydate.index.year
    # Average the daily sums by month, then average the months by year.
    bymonth = df_bydate.groupby(['year', 'month']).mean()
    byyear = bymonth['count'].groupby(level='year').mean()
    # Annualized growth from 2014 to 2016, i.e. over two yearly intervals.
    percent_change = ((byyear.iloc[2] / byyear.iloc[0]) ** (1 / 2) - 1) * 100
    pedestrian_growth.append(round(percent_change, 2))
    textid = cid + ": " + str(round(percent_change, 2)) + "% annualized growth rate"
    print('Counter ID: #' + cid)
    print(round(byyear, 0))
    print(textid)
    print('------------------------')
df_counterstats = pd.DataFrame({'Growth Rate': pedestrian_growth}, index=counterids)
df_counterstats
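To make the cleaning and averaging steps easier to check without calling the web service, here is a minimal sketch that runs the same logic on synthetic data. The function name `yearly_average_daily_counts`, the `toy` DataFrame, and all of its values are hypothetical; only the column names `date`, `count`, and `direction` follow the code above.

```python
import numpy as np
import pandas as pd


def yearly_average_daily_counts(df):
    """Yearly means of monthly mean daily totals, after dropping outliers."""
    # Keep rows at or below the 95th percentile and drop zero counts.
    keep = (df['count'] <= df['count'].quantile(0.95)) & (df['count'] != 0)
    cleaned = df[keep]
    # Sum the remaining directional rows into one total per day.
    daily = cleaned.groupby('date')['count'].sum()
    # Average daily totals within each month, then average the months per year.
    years = daily.index.year.rename('year')
    months = daily.index.month.rename('month')
    monthly = daily.groupby([years, months]).mean()
    return monthly.groupby(level='year').mean()


# Synthetic data: 10 walkers per direction per day in 2014, 20 in 2015.
dates = pd.date_range('2014-01-01', '2015-12-31', freq='D')
toy = pd.DataFrame({
    'date': dates.repeat(2),
    'direction': ['I', 'O'] * len(dates),
    'count': np.repeat(np.where(dates.year == 2014, 10, 20), 2),
})

# Simulate one glitch day: a huge inbound spike and a zero outbound count.
glitch_day = toy['date'] == pd.Timestamp('2014-03-10')
toy.loc[glitch_day & (toy['direction'] == 'I'), 'count'] = 5000
toy.loc[glitch_day & (toy['direction'] == 'O'), 'count'] = 0

result = yearly_average_daily_counts(toy)
print(result)
```

Because the yearly figure is a mean of monthly means, each month carries equal weight no matter how many days survive the filter. If only one direction is dropped on a given day, that day's total is understated rather than removed.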
Next, I fetched the counter names and locations with the following code and joined them to the growth rates in the DataFrame above.
GetAllCountersUrl = "http://webservices.commuterpage.com/counters.cfc?wsdl&method=GetAllCounters"
# Save the response to a local file, then parse that file.
xmldata = requests.get(GetAllCountersUrl)
with open('xml_getallcounters.xml', 'w') as xmlfile:
    xmlfile.write(xmldata.text)
xml_data = 'xml_getallcounters.xml'
tree = ElementTree.parse(xml_data)
# counter_id rather than id, so the built-in id() is not shadowed.
counter_id = []
name = []
latitude = []
longitude = []
region = []
for c in tree.findall('counter'):
    counter_id.append(c.attrib['id'])
    name.append(c.find('name').text)
    # Store coordinates as floats so they can be passed straight to the map.
    latitude.append(float(c.find('latitude').text))
    longitude.append(float(c.find('longitude').text))
    region.append(c.find('region/name').text)
df_counters = pd.DataFrame(
    {'Name': name,
     'latitude': latitude,
     'longitude': longitude,
     'region': region
    }, index=counter_id)
df_counters.head()
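The parsing loop above implies a particular XML layout: `counter` elements with an `id` attribute and `name`, `latitude`, `longitude`, and `region/name` children. The sketch below parses a made-up sample with that layout straight from a string, without the intermediate file. The sample XML, its values, and the helper name `counters_to_frame` are all assumptions for illustration, not the service's actual response.

```python
from xml.etree import ElementTree

import pandas as pd

# Hypothetical sample that mimics the structure the parsing code expects.
SAMPLE_XML = """
<counters>
  <counter id="100">
    <name>Example Trail North</name>
    <latitude>38.8800</latitude>
    <longitude>-77.1000</longitude>
    <region><name>Arlington</name></region>
  </counter>
  <counter id="101">
    <name>Example Trail South</name>
    <latitude>38.8500</latitude>
    <longitude>-77.0500</longitude>
    <region><name>Arlington</name></region>
  </counter>
</counters>
"""


def counters_to_frame(xml_text):
    """Build a DataFrame indexed by counter ID from counter XML text."""
    root = ElementTree.fromstring(xml_text.strip())
    rows = {}
    for counter in root.findall('counter'):
        rows[counter.attrib['id']] = {
            'Name': counter.find('name').text,
            # Coordinates arrive as text; convert them to floats.
            'latitude': float(counter.find('latitude').text),
            'longitude': float(counter.find('longitude').text),
            'region': counter.find('region/name').text,
        }
    return pd.DataFrame.from_dict(rows, orient='index')


frame = counters_to_frame(SAMPLE_XML)
print(frame)
```

The index holds the counter IDs as strings, which matches the string IDs in `counterids` and lets the later inner join line up on the index.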
df_growth_points = pd.concat([df_counters, df_counterstats], axis=1, join='inner')
df_growth_points['radius'] = (df_growth_points['Growth Rate'] * 5) + 120
def markercolors(counter):
    if counter['Growth Rate'] < -5:
        return 'DarkRed'
    elif counter['Growth Rate'] < -1:
        return 'Red'
    elif counter['Growth Rate'] < 1:
        return 'Yellow'
    elif counter['Growth Rate'] < 5:
        return 'Green'
    else:
        return 'DarkGreen'
df_growth_points["color"] = df_growth_points.apply(markercolors, axis=1)
df_growth_points
In the data table above, you'll notice I also added a radius and a color. Both are based on the annualized growth rate in average daily pedestrian traffic. I need these attributes to create a point map that displays points of varying size and color based on those growth rates. The large green circles indicate increases in the average daily number of walkers since 2014, and the small dark red circles indicate the largest declines in the number of daily pedestrians since 2014.
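For anyone who wants to adjust the thresholds, the same color bands could also be expressed with `pd.cut`, and the radius is just a linear rescaling of the growth rate. This is an alternative sketch, not the code used for the table above, and the growth rates in `rates` are hypothetical values chosen to hit each band.

```python
import numpy as np
import pandas as pd

# Hypothetical growth rates (percent per year), one per color band.
rates = pd.Series([-7.5, -3.0, 0.2, 2.4, 9.1], index=['a', 'b', 'c', 'd', 'e'])

# Left-closed bins reproduce the "< threshold" tests: [-5, -1) is Red, etc.
bins = [-np.inf, -5, -1, 1, 5, np.inf]
labels = ['DarkRed', 'Red', 'Yellow', 'Green', 'DarkGreen']
colors = pd.cut(rates, bins=bins, labels=labels, right=False).astype(str)

# Same scaling as above: each percentage point changes the radius by 5 units.
radius = rates * 5 + 120

print(pd.DataFrame({'Growth Rate': rates, 'radius': radius, 'color': colors}))
```

With `right=False`, a rate of exactly -5 falls in the Red band and a rate of exactly 5 falls in the DarkGreen band, which matches the chain of `<` comparisons.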
# Named pedmap so the built-in map() is not shadowed.
pedmap = folium.Map(location=[38.87, -77.1], tiles='CartoDB positron', zoom_start=13)
# Add one circle per counter, sized and colored by its growth rate.
for _, row in df_growth_points.iterrows():
    folium.CircleMarker(
        location=[float(row['latitude']), float(row['longitude'])],
        radius=row['radius'],
        popup=row['Name'] + ': ' + str(row['Growth Rate']) + '%',
        color=row['color'],
        fill_color=row['color'],
    ).add_to(pedmap)
pedmap
You can click on each circle above to get the location name and the growth rate for that point. My quick take is that there has been an increase in walkers on the Custis Trail. Over the same time period, the Columbia Pike and Mount Vernon points indicate decreases in pedestrian traffic counts. Planners may want to look closely at conditions around the MVT Airport South counter to see whether any improvements for pedestrians are needed in that area.
Please share your thoughts and ideas in the comment section below.