Populating The Data Ark: An attempt to obtain and preserve data from the most highly-cited psychology and psychiatry articles

Abstract

The vast majority of scientific articles published to date have not been accompanied by publication of the underlying research data. This state of affairs precludes the routine re-use and re-analysis of research data, undermining the efficiency of the scientific enterprise and compromising the credibility of claims that cannot be independently verified. It may be especially important to make data available for the most influential studies that have provided a foundation for subsequent research and theory development. Therefore, we launched an initiative, the Data Ark, to examine whether we could retrospectively enhance the preservation and accessibility of important scientific data. Here we report the outcome of our efforts to retrieve, preserve, and liberate data from 111 of the most highly-cited articles published in psychology and psychiatry in 2006–2011 (n = 48) and 2014–2016 (n = 63). Most data sets were not made available (76/111, 68%, 95% CI [60, 77]), some were made available only with restrictions (20/111, 18%, 95% CI [10, 27]), and few were made available in a completely unrestricted form (15/111, 14%, 95% CI [5, 22]). Where extant data sharing systems were in place, they usually (17/22, 77%, 95% CI [54, 91]) did not allow unrestricted access. Authors reported several barriers to data sharing, including issues related to data ownership and ethical concerns. The Data Ark initiative could help preserve and liberate important scientific data, surface barriers to data sharing, and advance community discussions on data stewardship.
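
As a quick arithmetic check, the percentages reported above can be recomputed from the counts: 76, 20, and 15 of 111 articles, and 17 of 22 data sharing systems. The following is a minimal sketch; the variable and function names are illustrative assumptions and are not taken from the article. The confidence intervals are not reproduced here because the abstract does not state how they were computed.

```python
# Counts as reported in the abstract; all names here are illustrative.
TOTAL_ARTICLES = 111

availability = {
    "not available": 76,
    "available with restrictions": 20,
    "available without restrictions": 15,
}


def percent(count, total):
    """Return count/total as a percentage rounded to the nearest integer."""
    return round(100 * count / total)


# The three categories should account for every sampled article.
assert sum(availability.values()) == TOTAL_ARTICLES

shares = {label: percent(n, TOTAL_ARTICLES) for label, n in availability.items()}

# Existing data sharing systems that did not allow unrestricted access.
restricted_systems = percent(17, 22)

for label, share in shares.items():
    print(f"{label}: {share}%")
print(f"sharing systems without unrestricted access: {restricted_systems}%")
```

The rounded values agree with the 68%, 18%, 14%, and 77% figures given in the abstract.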

Publication
PLOS ONE
Date