Stefaan Verhulst is the chief research and development officer, and Andrew Young is the knowledge director at the Governance Laboratory at New York University. Together, they have written a recent report on the role that social media data can play in the nonprofit sector. I asked them a series of questions about it.
HF — Your report suggests that social media data can help NGOs [nongovernmental organizations] and nonprofit organizations understand the situations that they face. How is this data valuable?
SV & AY — When collected, analyzed and used responsibly, social media data can add value in five ways:
First, it can provide information on the situation on the ground, in real time. For example, Facebook’s Disaster Maps initiative seeks to provide organizations such as UNICEF, the Red Cross and the World Food Program with information on people’s location (at an aggregate level), how they move from one point to another, and whether they have used the platform’s Safety Check to mark themselves as safe following a disaster. This provides information on where people are, what evacuation routes they are following and how they are doing, helping humanitarian organizations and public sector entities in myriad ways.
Second, it can combine widely dispersed data sets with social media data to create new knowledge, and ensure that those responsible for solving problems have the most useful information at hand. The Yelp Dataset Challenge, for example, provides public access to Yelp’s crowdsourced review and ratings data through a prize-backed challenge. Yelp offers cash rewards to winning teams of university students who submit original research using the data in innovative ways.
Third, social media data can help model service delivery in a targeted, evidence-based manner. For instance, Waze, the most widely used crowdsourced traffic and navigation platform, partnered with cities and government agencies to share publicly available incident and road closure data through its free Connected Citizens Program.
Fourth, social media data can power prediction, helping institutions respond to problems before they occur. Researchers are, for example, trying to help government drug regulators identify adverse drug reactions (ADRs) through social media, rather than only during the clinical trial period.
HF — You discuss “data collaboratives” — partnerships in which actors from different sectors exchange information for public benefit. What examples do we have of successful data collaboratives and how have they worked?
SV & AY — There are many types and flavors of data collaboratives and there are inspiring examples in each category. These include:
Orange’s Data for Development Challenge in Africa: Orange Telecom hosted a challenge that provided researchers with anonymized, aggregated Call Detail Record (CDR) data to solve development problems, such as transportation, health and agriculture. Winners included research on the use of mobile phone data for electrification planning, analyzing how mobile phone access affects millet prices, and how waterborne parasites might spread through human movement.
Yelp offers their globally crowdsourced user data on restaurants to students and researchers in their Yelp Dataset Challenge, which runs for four months, providing cash prizes and support for conference travel. The company challenges participants to discover insights and innovations in topics such as photo classification algorithms, natural language processing and sentiment analysis, change points and events, graph mining, urban planning and more.
The California Data Collaborative automates the collection, analysis and secure storage of data on metered water use from participating city and state government agencies. This information allows for the creation of a more accurate data set that details how much, when and where water was used by California residents.
Zillow, an online listing service for single family residences, condominiums, and co-op homes, has developed a tool by pooling information from credit bureau TransUnion, the U.S. Census Bureau, the Freddie Mac Primary Mortgage Market Survey, and the Bureau of Labor Statistics’ Employment Cost Index with its own data collected from buyers, sellers and renters who use its website. Zillow provides a “Zestimate” home price index, which, together with other data on home values, historical values, rentals, forecasts, and geographic affordability, provides a more comprehensive picture of the housing market in North America.
The JP Morgan Chase Institute taps into JP Morgan’s proprietary data, experience and market access to create analyses and convene stakeholders. For their 2016 Online Platform Economy Report, the JP Morgan Chase Institute used anonymized account data from October 2012 to June 2016 from samples of more than 240,000 Chase customers who received income from 42 different platforms, such as TaskRabbit, Airbnb or Uber. This report detailed the burgeoning online economy to better inform policy and public response to the field.
And finally, at the GovLab, with funding from Data2X, we’re in the process of developing a Data Collaborative on Gender and Urban Mobility together with UNICEF, the ISI Foundation, the Universidad del Desarrollo/Telefónica R&D Center and DigitalGlobe, focusing on how data held by the private sector could provide greater insight into the mobility challenges experienced by women and girls in Santiago, Chile (and in other global megacities).