Big Ideas for Big Data on the Impact of Research Funding: Julia Lane and the Opportunity of IRIS

State-level spending flows from one federally-funded research university

If you want to know the impact of research funding, says science and innovation policy guru Julia Lane (NYU/Wagner and Center for Urban Science and Progress), you need a system that puts people, and not publications, at the center. And if you want to produce this knowledge at the speed and scale of its emergence, she argues, you need to be aware of the challenges and opportunities that lie in using big data in social science. In her April 1 seminar on social science and big data research, Dr. Lane described the IRIS project as one approach that has had to address both of these challenges in tracing the effects of federally-funded research initiatives into the economy and society. Dr. Lane’s seminar, sponsored by ISSR and CSSI, was an invitation for social scientists to take charge of the rising tide of questions about the impact of research funding, and to direct it in the scientific tradition – with clear research questions, a conceptual framework, and sound methodological approaches.

As Lane describes it, IRIS (Institute for Research on Innovation & Science) is a collaborative effort to gather and analyze key data on the flow of resources and ideas in federally-funded research. Together, its 24 current members represent 25% of all R&D spending by the federal government; its membership is expected to grow soon by another 59 members, capturing the flow of 80% of all federal research dollars. Developed by a team of six researchers that includes Dr. Lane and is led today by Dr. Jason Owen-Smith (U.Michigan/Sociology), IRIS boasts powerful mechanisms to ensure real-time data renewal, quality and confidentiality. With this integrated map of people and activity, IRIS is building the potential to answer a range of vital questions in science policy: How do investments translate into new knowledge? Who gets supported to do what, and how do their efforts fare? What economic effects ripple out from research teams, through their subcontracts and the flow of involved people through subsequent projects and teams?

At the heart of the IRIS initiative is a clearly articulated theory of change. “Publications,” argues Dr. Lane, “are not a behavioral unit of analysis. People and relationships are. The greatest technology transfer going is happening through people, not publications.” Lane pointed to the highly visible examples of Larry Page and Sergey Brin, neither of whom was principal investigator or first author on any federally-funded research project, but both of whom were paid as Fellows through such research projects and leveraged the knowledge they obtained there into their high-impact venture: Google. With this insight about the movement of ideas through people – not papers – in mind, Lane and her collaborators have built an integrated big-data infrastructure driven by the careful use of existing, “found” data: the abstracts of scientific research grants posted on agency websites, the financial codes assigned to any transaction resulting from these grants, and the broader data on the movements of fund recipients available through secure and anonymized links to IRS and Social Security databases. As Lane put it: “Follow the people, then you’re following the knowledge.”

Julia Lane should know. She has led the National Science Foundation’s Science of Science and Innovation Policy program, which is intended to provide a scientific basis to advance understanding of the drivers and consequences of scientific innovation and collaboration. IRIS represents the fruition of a long effort to build a coherent and dynamic data architecture for research on investments in scientific research. Its core features emerged in 2013 as UMETRICS (Universities Measuring the Impacts of Research on Innovation, Competitiveness and Science), building from the earlier STAR METRICS effort that Dr. Lane led on behalf of the White House Office of Science and Technology Policy to assess the impacts of research funding disbursed through the American Recovery and Reinvestment Act (ARRA). Private funding from the Alfred P. Sloan and Ewing Marion Kauffman Foundations allowed the Committee on Institutional Cooperation to launch the 2-year UMETRICS pilot with data from 11 universities, now being extended and institutionalized as IRIS.

As the IRIS network grows, Lane urges university scholars and administrators to envision its potential to spark (and benefit from) their collaboration across disciplines. IRIS offers an elegant and relevant treasure trove of information about the flow of ideas and resources in the scientific enterprise. But to fulfill its potential as a source for breakthrough science, as well as effective management, this powerful tool must be steered in response to key questions: What are we measuring? What is the underlying construct? How are we measuring and what are we missing? How can we draw inferences? How can we protect human subjects? The resolution of these questions will rely on the theoretical and methodological expertise of social scientists, the substantive knowledge about each research domain provided by subject experts, and the technical and systems expertise offered in the computational and systems sciences. For those in the room on April 1 and for those responding to the pressure for universities to provide evidence of their impacts in society, participation in IRIS offers a chance to clarify our own institutional concepts and assumptions, and to engage proactively and critically in the science of science policy.