From Dark Data to a Global, Accessible Digital Resource Documenting Life on Earth
For more than 300 years, biologists have documented research by preserving samples, known as voucher specimens, in biological collections. These specimens are the direct evidence for recognition, description, and publication of the millions of species known to science. The basic data and information within collections worldwide underwrite our knowledge about biological diversity, the history of life on earth, molecular and cellular biology, and organismic and ecological systems. Downstream applications in biomedical research, agriculture, and management of genetic and natural resources also directly use or indirectly benefit from collection knowledge bases. However, access to the physical specimens and the data associated with them has traditionally been available only to specialists, based on their credentials and academic background. Our national and international infrastructure of biological collections is a treasure trove of data, but these are dark data, much of which is still hidden away in the physical archives.
This lecture will address how advances in technology are changing how collections are conceived, maintained, secured, and made accessible, increasing their relevance to science and society. Recent impetus for change began with the recognition that collections are not dark, hidden archives but rich, expansive, big data resources. It is estimated that 2.5 billion specimen objects are curated in biological collections. Every single object has, at a minimum, data about its identity, origin, and provenance. By making basic collections data more available through digitization initiatives, scientists will be able not only to better study and understand the collections and the items in them, but also to better model how landscapes and environments have changed in the past, how they are changing now, and how they will change in the future.
Technology has enabled the use of scientific collections in novel, unanticipated ways, driving further innovation and opening new opportunities for research using collections. For example, ancient DNA methods enable researchers to sequence extinct species from specimens in collections. Non-invasive CT scanning provides a means to visualize the brain cavity of a fossil animal specimen. With X-ray fluorescence spectroscopy, herbarium specimens can be scanned for hyper-accumulation of minerals, and these observations can be linked to environmental sensing data. Furthermore, the concept of biological collections discussed in this lecture goes beyond preserved items to include living stocks, cultures, and cryo-facilities. These repositories make living material, tissues, and genetic resources available for the study of model and non-model organisms that are essential for research on far-ranging topics.
The lecture will present examples of how long-term investments in collections have paid off, along with the challenges of supporting and managing the infrastructure critical to their maintenance, growth, and effective utilization.
About the Speaker
Reed Beaman is a Program Director at the National Science Foundation (NSF) with primary responsibilities for the Collections in Support of Biological Research and Advancing Digitization of Biodiversity Collections programs. Previously at NSF he was responsible for a variety of programs in biology, including Next Generation Networks for Neuroscience; Advances in Biological Informatics; Dimensions of Biodiversity; and Critical Techniques, Technologies and Methodologies for Advancing Foundations and Applications of Big Data Sciences and Engineering.
Reed's research interests have focused on Southeast Asia, particularly on Mount Kinabalu, a biodiversity hotspot on the island of Borneo. His dissertation work involved the description of eight new plant species and landscape-level biogeographic analysis using remote sensing imagery and geographic information systems. More recently, he has engaged with researchers in Asia as the Biodiversity Expedition Lead for the Pacific Rim Application and Grid Middleware Assembly (PRAGMA) network, a community of practice that facilitates cyberinfrastructure experimentation on an international scale.
Reed was a Postdoctoral Fellow in Biological Informatics sponsored by the Royal Botanic Gardens Sydney and the University of Kansas, during which he developed software tools for automating the geo-referencing of specimen data. He continued work on digitization methods as Associate Director for Informatics at the Yale Peabody Museum and as Curator of Informatics at the Florida Museum of Natural History prior to serving at the NSF.
Reed earned a BS in Botany at the University of Michigan and a PhD in Botany at the University of Florida.