Polar Data: Key challenges faced by an evolving paradigm

By Ann Balasubramaniam

Background

As governments, businesses, and the Canadian public develop interest in our rapidly changing polar regions, increased and timely access to polar data becomes essential to ensure that information is available for decision making. Comprehensive sets of data are needed for examining trends over time, anticipating changes, and identifying existing or emerging issues caused by environmental, political, or socio-economic stressors. Because a broad spectrum of individuals and groups can benefit from access to baseline polar data (e.g., policy makers, industry, researchers, and northern communities or organizations), there is a growing impetus to create infrastructure to better manage polar data, ensure interoperability, and disseminate the data according to open-data and open-access policies. Despite recent technological advances in online sharing and discovery of data, the policies, protocols, and infrastructure for holding and publicly sharing polar data, which facilitate its use, interpretation, and verification, are still being strengthened and fine-tuned.

The many research projects of International Polar Year 2007-2008 (IPY) generated a wealth of data about our polar regions from researchers around the globe. To preserve this legacy, web-based data portals were established or expanded to accept and publish metadata (written descriptions of data sets), store data sets in standard formats, and provide a usable interface for access to data1. For example, the Polar Data Catalogue (PDC, https://www.polardata.ca), a Canadian web-based polar data repository, works in partnership with federal agencies and programs (Fisheries and Oceans Canada, Environment Canada, Natural Resources Canada, and Aboriginal Affairs and Northern Development Canada) and networks (Circumpolar Biodiversity Monitoring Programme, le Centre d’études nordiques à l’Université Laval, and the Global Cryosphere Watch of the World Meteorological Organization) to facilitate access to polar data, publish metadata, manage large volumes of data, and post data sets for public use2. Although such data repositories exist and the circumpolar scientific community has for many years been at the forefront of data sharing and management, many issues still hinder the usability and success of polar data portals and access to polar data.

Challenges:

  1. Data portal silos and the need for interoperability

     There are over a dozen polar data portals operating in Canada and dozens more in other countries. Funded and managed independently, they were developed individually to fulfill important niches within their specialized research networks. Many portals exist as independent data centres with little or no exchange with others. In 2015 the PDC consulted over 200 individuals, including researchers, northern residents, data managers, and government scientists, and found that most respondents wanted to improve access to polar data by facilitating interoperability between data portals4. Data portals need to link their metadata records so that users can search all available data regardless of which data portal they access originally. To facilitate this level of coordination between data portals, standard metadata and data formats must be used, data sharing policies must be developed and implemented, and additional resources must be dedicated to support user training, staff time, and infrastructure1,2,3.

  2. Standardized data policies and the need for partnerships

     Best practices for data management ensure that before data are submitted to a portal, they undergo quality assurance measures and conform to a data format that the portal can ingest. Adoption of international standards that dictate the format of data and metadata promotes easier access to data and information2. Achieving interoperability between data portals requires partnerships, formal agreements on formats for metadata and data archives, and procedures for releasing data. To date, PDC has developed partnerships with more than fifteen polar data portals within Canada and internationally and is testing synchronization of metadata formats and metadata record sharing4. Its partners include the Northwest Territories Discovery Portal, the Yukon Research Centre, the Arctic Data Centre of the Norwegian Meteorological Institute, the British Antarctic Survey, and the US National Snow and Ice Data Center2. PDC has also shared metadata with Environment Canada’s emerging data repository4. These efforts to link existing portals and build a sharing network are important steps towards creating a platform for making polar data truly accessible.

  3. Data ownership paradigms: who “owns” publicly funded data

     Before polar data can be truly open and shared, discussions are needed on data licensing agreements (to protect human subjects and avoid instances where the release of data could infringe on rights) and researchers’ rights to hold data records5. Because much polar research is funded by government grants and public partnerships, there is an expectation that openly providing data will be a natural and ethical step forward for polar researchers1. However, data sharing is often complicated by intellectual property rights and a researcher’s need to demonstrate productivity by publishing in peer-reviewed journals. Many scientists feel they should have the right to “hold” data exclusively until publication in order to ensure their long hours of work are acknowledged5. To deal with this issue, data portals are offering to: i) give deposited data a stable digital identifier, which allows data users to formally acknowledge data creators when data are re-used in publications; and ii) team up with data journals that specialize in peer-reviewed data publication in order to produce a formal digital object identifier (DOI) for the data set6. The effectiveness of efforts to make data citable and persuade scientists to deposit data in repositories will depend on how data-based citations are incorporated into the measure of a researcher’s productivity5,6. Deposition of data by researchers would also be facilitated by funding agency requirements that promote data sharing, clear protocols on distribution of proprietary data, and clear data sharing policies4,5. In the polar research community, data sets are increasingly viewed as legacies as valuable as peer-reviewed journal articles. Discussions are therefore needed to resolve rights to data, the value of data publication, and the value of data citations when data are re-used.
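The first challenge above, linking metadata records so that a search covers every portal, can be illustrated with a minimal Python sketch. It assumes two hypothetical portals that describe their data sets with different field names, maps both onto one small shared schema, and searches the combined index. The portal names, field names, and records are invented for illustration and do not reflect the actual schema of the PDC or any other portal; real portals would more likely adopt an agreed international metadata standard than ad hoc mappings like these.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataRecord:
    """A minimal shared metadata schema (hypothetical fields)."""
    portal: str
    identifier: str
    title: str
    keywords: tuple


def from_portal_a(raw: dict) -> MetadataRecord:
    # Hypothetical portal A stores keywords as one semicolon-separated string.
    return MetadataRecord(
        portal="portal-a",
        identifier=raw["id"],
        title=raw["title"],
        keywords=tuple(k.strip().lower() for k in raw["keywords"].split(";")),
    )


def from_portal_b(raw: dict) -> MetadataRecord:
    # Hypothetical portal B uses different field names and a keyword list.
    return MetadataRecord(
        portal="portal-b",
        identifier=raw["record_id"],
        title=raw["name"],
        keywords=tuple(k.strip().lower() for k in raw["tags"]),
    )


def search(records, term: str):
    """Return records whose title or keywords contain the term (case-insensitive)."""
    term = term.lower()
    return [
        r for r in records
        if term in r.title.lower() or any(term in k for k in r.keywords)
    ]


# Invented example records, for illustration only.
portal_a_raw = [
    {"id": "A-001", "title": "Sea ice thickness survey", "keywords": "Sea ice; Arctic Ocean"},
]
portal_b_raw = [
    {"record_id": "B-17", "name": "Permafrost temperature profiles", "tags": ["permafrost", "ground temperature"]},
    {"record_id": "B-42", "name": "Coastal erosion transects", "tags": ["sea ice", "erosion"]},
]

# One combined index lets a single query reach records from both portals.
shared_index = [from_portal_a(r) for r in portal_a_raw] + [from_portal_b(r) for r in portal_b_raw]
hits = search(shared_index, "sea ice")
print([(r.portal, r.identifier) for r in hits])
```

In this sketch a single query for "sea ice" returns matching records from both portals, whichever portal the user started from. The per-portal mapping functions stand in for the formal agreements on metadata formats described under the second challenge.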

Polar data, like other data, are facing a technology shift that is changing the point of access, previously limited to journal publications, theses, and isolated networked hard drives. The Internet allows polar data managers to create web-based access platforms, and this exciting prospect is challenging the existing paradigms on data management and sharing. Solutions to the challenges above, and to others, are needed to allow data archiving to catch up to available technology and launch polar data usage into the public domain. There will be much dialogue on these issues at the upcoming International Polar Data Forum II: International Collaboration for Advancing Polar Data Access and Preservation (http://www.polar-data-forum.org). To be held 27-29 October 2015 in Waterloo, Ontario, Canada, the Forum will bring together an international community of polar data managers, researchers, students and early career polar scientists, northerners, and government agency representatives to accelerate progress in polar data management by establishing clear actions to address target issues4. With the Government of Canada’s current directive on open data, which aims for data to be “open by default” in order to meet the expectations of citizens and industry, discussions on how to better manage polar data are particularly relevant. If provided with sufficient support, Canadian polar data portals such as the PDC have the opportunity to develop the infrastructure necessary to manage open data.

References:

1) Polar Data Activities in Global Data Systems Communiqué: Recommendations and Observations Arising From the ‘International Polar Data Forum.’ [Accessed on August 20, 2015 from: https://www.icsu-wds.org/events/files/international-polar-data-forum-communique.pdf]
2) Friddell J.E., LeDrew E.F. and Vincent W.F. (2014). The Polar Data Catalogue: Best Practices for Sharing and Archiving Canada’s Polar Data. Data Science Journal. 13: 1-7.
3) Friddell J.E., LeDrew E.F. and Vincent W.F. (2014). The Polar Data Catalogue: Data Management for Polar and Cryospheric Sciences. 70th Eastern Snow Conference. Huntsville, Ontario.
4) Friddell J.E. (2015). Personal communication.
5) Parsons M., Godøy Ø., LeDrew E., Bruin T., Danis B., Tomlinson S. and Carlson D. (2011). A conceptual framework for managing very diverse data for complex, interdisciplinary science. Journal of Information Science. 37(6): 555-569.
6) Kratz J.E. and Strasser C. (2015). Researcher Perspectives on Publication and Peer Review of Data.

Page details

2015-11-17