The JPMorgan Chase Institute is a global think tank dedicated to delivering data-rich analyses and expert insights for the public good. Drawing on JPMorgan Chase’s unique proprietary data, expertise and market access, the Institute aims to help decision makers use better facts, real-time data and thoughtful analysis to make smarter decisions to advance global prosperity.
What is changing about the way data are collected?
Robert Groves: We are at a turning point. The paradigms of the 20th century aren’t surviving. The most basic one is the sample survey, which is a wonderful device when it works well, and is the foundation of social and economic data. But that device is fraying at the edges. Participation in surveys is going down, and the cost of original data collection is inflating at an exponential rate. At the same time, we see the growth of data from other sources, all of them digital in some fashion. The real puzzle of this century, I believe, is how to navigate a world that will blend those together. I don’t think we will abandon the old paradigm. We have to morph it in the presence of these new data.
Sarah Rosen Wartell: The exciting thing is that the new data can help us compensate for the limitations of the existing data sources, which are not always as timely or granular as we would like. With existing data sources, there was great virtue in consistency, but this also meant that you couldn’t always ask the question that was top of mind at a given moment. Some of the new data sources can help us answer those questions.
Diana Farrell: More and more of these new data sets are what you might characterize as naturally occurring data. They have nothing to do with what we’re trying to observe, except they are a window into it.
Robert Groves: Yes, I sometimes refer to these new data as “organic data” because they are not designed for any particular purpose other than as an auxiliary monitoring device of some process, and not data that were designed to answer any particular question. Our real challenge is how to locate the data that might be relevant.
Diana Farrell: Right. And they may not answer all of the questions we’re trying to answer, so I think the complementarity of these two is really important, because we need both aspects—survey-based and “organic data.”
“The challenge of so much of our policy is that we tackle these problems in a silo,” says Sarah Rosen Wartell, President of the Urban Institute.
What promises do big data offer for promoting economic inclusion?
Sarah Rosen Wartell: Big data can help us get a better understanding of economic inequality. This is so important right now. The research tells us that a poor kid who grows up in Atlanta has far less of a chance of having good life outcomes than a poor kid who grows up similarly in Salt Lake City. What’s happening in those cities? What’s happening in those economies? What’s happening in those schools? Now data can tell us about these different life experiences and where we can shape them to create more opportunity.
Diana Farrell: One of the biggest promises of these new data sets has to do with the questions you’re raising, Sarah. We’re not going to answer the questions about why someone in one city thrives while another person in another city doesn’t until we get pretty granular. We need to have control groups that allow us to compare one to the other, and to really isolate the things that we’re looking at. We already see that we can do a lot more of that with big data. So I think that we can introduce a new level of discipline to the questions of inequality and equity that vex policymakers, to help them really isolate the thing they’re trying to fix.
Sarah Rosen Wartell: The challenge of so much of our policy is that we tackle these problems in a silo. Juvenile justice, workforce development and child care are all different sets of data. Big data give us the opportunity to combine some of that. The data that we’re talking about here are not just in governmental records or in the private sector. They’re also in the nonprofit sector. They may not even be big data. They may just be middle-sized data pulled together, creating architectures that allow people to pool their information with someone else’s to understand a person’s life.
What are the limitations of big data in this area?
Diana Farrell: There are challenges to getting data that are fully representative. There’s no question that our data, because they are bank data, have a limited view of the unbanked and underbanked. We know that. Politics play a role in getting representation right, in the sense that it matters a lot who’s counted and who’s not on any number of different issues.
Sarah Rosen Wartell: Yes, there are people who don’t have access to the same types of transactions and services as others, and often data on these transactions and services are collected, analyzed and used to generate insights. If you draw policy conclusions from data that unfairly represent parts of the population, by either over-inclusion or under-inclusion, you’re going to make policy that doesn’t fully reflect the population’s needs.
We’re sort of in the “Wild West” of data, which I think we will eventually sort into a more structured world. But there’s a real risk of people drawing conclusions from information that could exacerbate some of the inequalities in our society. There need to be people who understand the inherent biases in our data sets.
Robert Groves: I’m with you entirely, given the critical fact that big data don’t cover certain subpopulations. We have to be very purposeful in having contrast groups that are related to the missing cases so that we can query the big data on whether there’s any evidence of variation on the missing dimension, such as income. If we are rigorous on that, then that’s a step forward. Every time there’s a big data report, there should also be an independent critic to discuss the pluses and minuses. We’re missing that kind of watchdog.
Diana Farrell: I would add that the more that we can understand the heterogeneity of outcomes and ask what this tells us about the impact that any given intervention will have on the least well-off—or the least represented—the better we’ll be able to address the issues of income inequality.
“We can introduce a new level of discipline to the questions of inequality and equity that vex policymakers, to help them really isolate the thing they’re trying to fix,” says Diana Farrell, Founding President and CEO of the JPMorgan Chase Institute.
How can big data become a catalyst for the private sector to better serve more of the population?
Sarah Rosen Wartell: We can take advantage of big data mining to market to people, but we also can take advantage of big data mining to educate people. This can be valuable when they’re going to make a decision whether or not to swipe a credit card, or to buy a house that is more expensive than they can afford. So it’s not just the policymakers that can use these data to inform. We can use these data to create tools that can help the private sector do a better job of serving more of the population.
Diana Farrell: One of the most promising things that resulted from the Institute’s income and consumption volatility report was that JPMorgan Chase separately funded a prize for helping people manage their liquidity issues. It was encouraging that these applicants for the prize were driving real behavioral change. For example, in one app people get a notice that tells them they are below their minimum balance, so that they can hold back and change their behavior. So I think you’re right, Sarah. It’s not just about policy. It’s about business products and services, and individual behavior as well.
Robert Groves: That may be the definition of the new social trust model for the 21st century. In the last century, the model was to participate in the survey and the whole country would benefit, but you individually got little or nothing out of it. But now, we may have a model where big data actually contribute to improving the lives of individuals who are supplying the data.