In 2011 the National Science Foundation (NSF) began requiring Data Management Plans (DMPs), which mandated that researchers share, preserve, and provide access to their data. Following the NSF’s example, many other federal agencies and non-federal science organizations now require DMPs. “The promise of the data management policy is that if you develop these management plans it will ensure access to datasets, which helps address the transparency and reproducibility problems in scientific studies,” said Dr. Amelia Acker, Director of the Critical Data Studies Lab at The University of Texas at Austin School of Information. “After nearly a decade of this policy mandate being in place, it’s now integral to the ways we create science and preserve knowledge.”
But has this massive effort to comply with DMPs—necessitating significant resource investments from scientists, universities, and information professionals—actually improved or changed the processes and data practices of science? “No one has asked scientists how they manage their data as a result of this policy mandate or investigated how the DMP policy impacts the questions that scientists ask,” Acker said. “The DMP is supposed to ensure the data lives on after the funded research ends, but because we don’t always cite or follow data being generated and managed, we don’t really understand how the data is used after it’s been preserved.”
The “data” in the “Data Management Plan” policy reflects a broad understanding of what can constitute scientific data, with an emphasis on access and future use after the funded project has ended. Scholars are asked to preserve multiple research artifacts, including code, experimental context, methodological information, and sometimes specimens or other supporting research products, to adequately preserve and provide access to research data. Acker and collaborator Dr. Megan Finn from the University of Washington were awarded a $461,085 collaborative grant from the NSF ($287,031 for UT) to investigate how the DMP requirement policy, domain-specific management plans, and data archiving have evolved to confront the growth of digital data from scientific research, in a project entitled “Collaborative Research Data Afterlives: The long-term impact of NSF Data Management Plans on data archiving and sharing for increased access.”
“The most exciting part of the project,” Acker said, “is that we’re the first research project to gather empirical evidence to see if the policy makes a difference in data archives and access.” A key gap in understanding data preservation, scientific reproducibility, and data management plans across scientific domains is knowing exactly how data lives on beyond a project, and how issues of access or barriers to sharing may (or may not) correspond with the underlying structures, expectations, and requirements of data management plans. During the multi-phase project, Acker and Finn will analyze data types and data retention practices across different domains and directorates of the NSF. Using surveys, interviews, and dataset tracing, they will develop case studies of data management, then translate their study results into data governance guidance and best practices for researchers, funders, and data management professionals.
“We’re really excited to begin this research examining the impact of the data management policies on federally funded science. Now that the policy is in its tenth year, we’ll be able to examine, trace, and evaluate how data has been preserved and reused. We expect to find major social and institutional changes in knowledge infrastructures where scientific data is created, debated, archived and accessed,” Acker said.
The collaborators will hire a postdoctoral research scholar to join the team in 2021. To learn more about the Critical Data Studies Lab, visit: data.ischool.utexas.edu.