By collecting anonymous cellphone location data from nearly two million Bostonians, MIT and Ford were able to produce, almost instantly, the kind of urban mobility model that typically costs millions of dollars and takes years to build.
The big data experiment holds the promise of more accurate and timely data about urban mobility patterns that can be used to quickly determine whether particular attempts to address local transportation needs are working.
In making decisions about infrastructure development and resource allocation, city planners rely on models of how people move through their cities -- on foot, in cars and by public transportation. Those models are largely based on socio-demographic information from costly, time-consuming manual surveys, which have small sample sizes and are infrequently updated. Cities might go more than a decade between surveys.
"In the U.S., every metropolitan area has a...metropolitan planning organization [MPO], and their main job is to use travel surveys to derive the travel demand model, which is their baseline for predicting and forecasting travel demand to build infrastructure," said Shan Jiang, a postdoctoral student in MIT's Human Mobility and Networks Lab. "So our method and model could be the next generation of tools for the planners to plan for the next generation of infrastructure."
The paper, titled "TimeGeo: modeling urban mobility without travel surveys," describes how the researchers used call detail records (CDRs) managed by mobile phone service providers. The CDRs, which are used for billing purposes, contain data in the form of geolocated traces of users across the globe.
The researchers collected a CDR data set of 1.92 million anonymous mobile phone users for a period of six weeks in the Greater Boston area. As a control, they also examined a donated set of mobile phone traces that a graduate student in the same region collected with a smartphone application over the course of 14 months.
By applying a big data algorithm to the CDR data, the researchers were able to quickly assemble the kind of model of urban mobility patterns that typically takes years to build.
The Boston MPO's practices are fairly typical of major cities. Boston conducted one urban mobility survey in 1994 and another in 2010. Its current mobility model, however, still uses data from 1994 because it has taken the intervening six years simply to sort through all the information collected in 2010. Only now has the work of organizing that newer data into a predictive model begun, the researchers explained.
To validate the results of their research, the scientists from MIT and Ford's Palo Alto Research and Innovation Center compared their model with the one currently used by Boston's MPO. "The two models accorded very well," the researchers said in a paper published in the latest issue of the Proceedings of the National Academy of Sciences.
"Mobile phones are the prevalent communication tools of the twenty-first century, with the worldwide coverage up to 96% of the population," the researchers said. "Mobile phone data have been useful so far to improve our knowledge on human mobility at unprecedented scale, informing us about the frequency and the number of visited locations over long term observations, daily mobility networks of individuals, and the distribution of trip distances."
Because people use their phones only sporadically, the resulting samples are sparse and tend to be biased: they do not capture each individual's complete journeys in space and time. Even so, the researchers were able to infer certain mobility patterns.
For example, their algorithm assumes the location from which a user departs in the morning and to which he or she returns at night is home. It also infers that the location of the longest recurring stays during weekday daytime hours is the user's workplace.
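The two rules above can be illustrated with a minimal Python sketch. It is not the researchers' implementation: the stay format, the function names, the 9-to-17 daytime window and the two-day "recurring" threshold are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime

def infer_home(stays):
    """Home = the location that most often begins or ends a user's day.

    stays: list of (location_id, start, end) tuples (hypothetical format).
    """
    by_day = defaultdict(list)
    for loc, start, _end in stays:
        by_day[start.date()].append((start, loc))
    votes = Counter()
    for day_stays in by_day.values():
        day_stays.sort()
        votes[day_stays[0][1]] += 1   # where the user departs from in the morning
        votes[day_stays[-1][1]] += 1  # where the user returns to at night
    return votes.most_common(1)[0][0] if votes else None

def infer_work(stays, home, day_start=9, day_end=17, min_days=2):
    """Work = the non-home location with the most weekday daytime stay time.

    The daytime window and min_days threshold are assumed values.
    """
    seconds = Counter()
    days = defaultdict(set)
    for loc, start, end in stays:
        if loc == home or start.weekday() >= 5:  # skip home and weekends
            continue
        # Clip the stay to the assumed daytime window on its start date.
        win_start = start.replace(hour=day_start, minute=0, second=0, microsecond=0)
        win_end = start.replace(hour=day_end, minute=0, second=0, microsecond=0)
        overlap = (min(end, win_end) - max(start, win_start)).total_seconds()
        if overlap > 0:
            seconds[loc] += overlap
            days[loc].add(start.date())
    recurring = {loc: s for loc, s in seconds.items() if len(days[loc]) >= min_days}
    return max(recurring, key=recurring.get) if recurring else None

# Three made-up weekdays (6-8 June 2016) for one user.
stays = []
for day in (6, 7, 8):
    stays += [
        ("home", datetime(2016, 6, day, 0, 0), datetime(2016, 6, day, 8, 0)),
        ("office", datetime(2016, 6, day, 9, 0), datetime(2016, 6, day, 17, 0)),
        ("gym", datetime(2016, 6, day, 18, 0), datetime(2016, 6, day, 19, 0)),
        ("home", datetime(2016, 6, day, 20, 0), datetime(2016, 6, day, 23, 0)),
    ]

home = infer_home(stays)
work = infer_work(stays, home)
```

In this made-up sample, the gym never qualifies as a workplace because its stays fall outside the assumed daytime window.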
The algorithm assumes most people's workdays are in accordance with national averages, so if a user makes phone calls from work only between the hours of 12 p.m. and 2 p.m., the system does not interpret that as evidence of a two-hour workday, unless that interpretation is corroborated by other data, such as regular calls from home at 11:30 a.m. and 2:30 p.m.
Any locations other than work and home are treated alike. From the available data, the system builds a probabilistic mobility model for each user, breaking every day of the week into 10-minute increments. For each increment, the model indicates the likelihood of a location change, possible destinations and the amount of time likely to be spent at each destination. The system then generalizes those probabilities across communities, on the basis of census data, and deduces cumulative traffic flows from the resulting probability map.
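To make the 10-minute structure concrete, here is a rough Python sketch of a per-user weekly model. It is a simplified illustration, not the TimeGeo algorithm itself: the observation format and all names are hypothetical, and the stay-duration component and the census-based generalization are left out.

```python
from collections import Counter, defaultdict

SLOT_MINUTES = 10
SLOTS_PER_DAY = 24 * 60 // SLOT_MINUTES  # 144 slots per day, 1,008 per week

def slot_index(weekday, hour, minute):
    """Map a time of week (weekday 0 = Monday) to its 10-minute slot."""
    return weekday * SLOTS_PER_DAY + (hour * 60 + minute) // SLOT_MINUTES

def build_model(observations):
    """Estimate, per slot, the probability of moving and where to.

    observations: iterable of (weekday, hour, minute, origin, destination);
    destination == origin means the user stayed put (assumed encoding).
    """
    seen = Counter()
    moved = Counter()
    dests = defaultdict(Counter)
    for weekday, hour, minute, origin, destination in observations:
        s = slot_index(weekday, hour, minute)
        seen[s] += 1
        if destination != origin:
            moved[s] += 1
            dests[s][destination] += 1
    model = {}
    for s in seen:
        total_moves = moved[s]
        model[s] = {
            "p_move": total_moves / seen[s],
            # Empty when no move was ever observed in this slot.
            "destinations": {d: n / total_moves for d, n in dests[s].items()},
        }
    return model

# Four made-up Monday-morning observations around 8:05 a.m.
observations = [
    (0, 8, 5, "home", "work"),
    (0, 8, 7, "home", "work"),
    (0, 8, 2, "home", "work"),
    (0, 8, 9, "home", "home"),
]
model = build_model(observations)
monday_0800 = model[slot_index(0, 8, 0)]
```

With this sample, the user leaves home in three of the four observations that fall in the Monday 8:00 slot, so the estimated probability of a location change there is 0.75.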
"We are able to identify home locations for 1.44 million users which is 75% of our initial user base," the researchers said. "Next, we filter users who have more than 50 total stays and at least 10 home stays in the observation period."
Active cellphone users were labeled either as commuters (133,448 individuals who make journey-to-work trips) or as non-commuters (43,606 individuals who make no such trips).
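The filtering and labeling steps might look something like the following Python sketch. The thresholds come from the researchers' description; the record layout and names are hypothetical, and treating any observed journey-to-work trip as the mark of a commuter is an assumption.

```python
from dataclasses import dataclass

@dataclass
class UserSummary:
    user_id: str
    total_stays: int
    home_stays: int
    work_trips: int  # observed journey-to-work trips (hypothetical field)

def is_active(user):
    """More than 50 total stays and at least 10 home stays."""
    return user.total_stays > 50 and user.home_stays >= 10

def label_users(users):
    """Label each active user; inactive users are dropped."""
    labels = {}
    for user in users:
        if is_active(user):
            labels[user.user_id] = "commuter" if user.work_trips > 0 else "non-commuter"
    return labels

# Made-up sample records.
users = [
    UserSummary("a", total_stays=120, home_stays=40, work_trips=25),
    UserSummary("b", total_stays=80, home_stays=30, work_trips=0),
    UserSummary("c", total_stays=50, home_stays=30, work_trips=10),  # exactly 50 stays: filtered out
    UserSummary("d", total_stays=90, home_stays=9, work_trips=12),   # too few home stays
]
labels = label_users(users)
```

Applied to the real data, the two labels reported by the researchers add up to 177,054 active users.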
"Our ability to correctly model urban daily activities for traffic control, energy consumption and urban planning have critical impacts on people's quality of life and the everyday functioning of our cities," the researchers said. "To inform policy making of important projects such as planning a new metro line and managing the traffic demand during big events, or to prepare for emergencies, we need reliable models of urban travel demand. These are models with high resolution that simulate individual mobility for an entire region."