Scientists used millions of travelers' Flickr photos to quickly calculate global tourism. Is analyzing online data the next step in measuring our society?
Our summer vacation photos are a record of happy memories made with friends and families, proof of where we’ve been and what we’ve done during our precious time off. Thanks to a team of scientists at the Data Science Lab at Warwick Business School and The Alan Turing Institute, these photos might prove to be more than just happy memories: they could also offer rapid, low-cost insights into how people travel around the globe.
New research from Professor Tobias Preis, Dr. Federico Botta, and Professor Suzy Moat analyzed data from 69 million publicly shared photos uploaded to the platform Flickr to estimate global tourism statistics for the G7 countries: Canada, France, Germany, Italy, Japan, the UK and the US.
This map depicts the locations in which 35 million photographs were taken and uploaded to the photo-sharing platform Flickr in 2014.
While countries often rely on potentially time-consuming surveys at airports and accommodation to determine where their visitors are from, the new findings show that rapidly available, low-cost data from photos uploaded to the internet might be able to provide similar measurements.
Tobias Preis, Professor of Behavioral Science and Finance and co-director of the Data Science Lab at Warwick Business School
“We analyzed data on where and when 69 million Flickr photos had been taken over a period of two years,” explained Preis. “To work out where visitors to the UK were from, we identified photos that had been taken in the UK, and then looked at where the photographer had taken photos over the past 12 months.
“We had to make a big simplifying assumption about where people were between photos: namely, that they’d stayed in the same country since the last photo,” he continued. “However, even with this big simplification, we found that estimates of the number of travelers from different countries generated from the online photo data were correlated with the official tourism statistics. The same holds for the other G7 countries.”
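The approach Preis describes can be sketched in a few lines of Python. This is only an illustrative sketch, not the researchers' actual code: the function names, the input layout (a list of timestamp and country pairs per user), and the rule that a photographer's "home" is the country where they apparently spent the most time in the preceding 12 months are all assumptions made for this example.

```python
from collections import Counter, defaultdict
from datetime import timedelta


def infer_home_country(photos, visit_time, window_days=365):
    """Guess one user's home country (hypothetical rule for this sketch).

    photos: list of (timestamp, country) tuples for one user, sorted by time.
    Returns the country with the most inferred time in the window before
    visit_time, or None if there is no earlier photo to go on.
    """
    start = visit_time - timedelta(days=window_days)
    window = [(t, c) for t, c in photos if start <= t <= visit_time]
    time_in = defaultdict(timedelta)
    # Simplifying assumption from the article: between two photos, the
    # user is taken to have stayed in the country of the earlier photo.
    for (t0, c0), (t1, _) in zip(window, window[1:]):
        time_in[c0] += t1 - t0
    if not time_in:
        return None
    return max(time_in, key=time_in.get)


def visitors_by_origin(photos_by_user, destination):
    """Count users photographed in `destination`, grouped by inferred home."""
    counts = Counter()
    for photos in photos_by_user.values():
        photos = sorted(photos)
        visits = [t for t, c in photos if c == destination]
        if not visits:
            continue
        # Use the first photo taken in the destination as the visit time.
        home = infer_home_country(photos, visits[0])
        if home is not None and home != destination:
            counts[home] += 1
    return counts
```

Counts produced this way for each origin country could then be compared with official arrival figures, for example by computing a correlation coefficient, which is the kind of check the researchers report.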
The G7 countries currently use various methods to calculate tourist numbers, including collecting data at airports, hotels, and other tourist accommodations. However, common to all these methods is a publication delay, which ranges from months to years.
Suzy Moat, Professor of Behavioral Science and co-director of the Data Science Lab at Warwick Business School
Online data, by contrast, is available almost instantly and might provide far quicker estimates of tourist flows between countries. By analyzing 69.2 million photos uploaded to Flickr in 2013 and 2014, the researchers were able to infer the travel patterns of nearly half a million people.
When the researchers compared the photo-based estimates with official data released by the G7 countries, they found a strong correlation between the two. They caution, however, that considerable further work will be needed before such insights can be used in the production of official statistics.
“Policymakers need quick, accurate information to make good decisions, especially in times of great change,” said Moat. “Traditional approaches to measuring the state of our society can be time-consuming as well as expensive, and so the possibility of generating rapid, low-cost indicators from online data is very exciting.
Federico Botta, Research Fellow in the Data Science Lab at Warwick Business School
“However, a robust framework would be needed to utilize these insights in practice, to deal with challenges such as bias in the online data, whether access to the online data could be relied upon, and how to calculate and communicate how certain we can be about the estimates we make.”