Peer review of research data: a “light” analytical approach to assessment of feasibility of research data quality check-up

The aim of the presentation is to establish whether peer review of research data is feasible, or even necessary. This bold introduction may suggest that the presentation will provide a definitive answer to all questions related to research data peer review, but it does not. Peer review of research data is not a new concept, but it is still developing. The problem of research data management in general originates from the analog era and continues in the digital era, in which the quantity of research data is increasing fast: “It is doubling in size every two years, and by 2020 the digital universe – the data we create and copy annually – will reach 44 zettabytes, or 44 trillion gigabytes” [1]. Today, when research data are gradually becoming a very powerful tool in research and scientific publishing, data users (readers and peer reviewers) still rarely have the access and resources necessary to verify a manuscript’s results by analysing the research data [2]. Like scientific papers, research data can also be published. Data publication is the process of making data available on the internet for as long as possible [3]. To remain available in the long term, research data should be collected, described / annotated, stored and assessed / verified / peer reviewed. Research data can be collected and stored in digital repositories alone or in combination with other data or scientific material. Peer review of research data serves to increase trust in scientific data and results and enables datasets to be evaluated and certified for quality [3]. It “allows the validity of data and results to be judged for quality by a research community before dissemination” [4] (The National Academies Press, 2009). Assessment / verification / peer review of research data is as difficult as it is important, because research data originate from different branches of science and are created by different scientific methods, instruments and people in different environments.
Grootveld and van Egmond [5] published a pilot study on peer review of research data and focused on the following aspects of the data in their evaluation: data quality; quality of the documentation; completeness of the data; consistency of the data set (if applicable); structure of the data set (if applicable); and usefulness of the file formats. Malhotra and Marder [6] reviewed the main aspects of the research data peer review process and stressed that the data are important to the quality of the research article. Compared with the peer review of research articles, the peer review of data requires different skills and approaches. Because of a number of problems, the peer review of data “is often inconsistent or outright absent, and that is hurting every field where data is not subject to the same rigorous standards as the manuscript itself” [2]. Peer reviewers need access to the research data to be able to verify that the methods described in the manuscript do in fact produce the results presented [2]. This verification is still uncommon in today’s scientific practice and is not as streamlined as we would like it to be; data are still rarely assessed or verified independently. According to Callaghan [7], research data peer review could be split into different types, each asking different questions: 1. Editorial review – “Does the dataset have a permanent identifier?” “Is the dataset stored in a trusted repository?” “Are the access conditions clearly laid out and following journal guidance?”; 2. Technical review – “Is the data in an appropriate, community standard, format?” “Are tools and services provided to facilitate visualisation and manipulation of the data?” “Is there enough metadata so that non specialist users can understand what the data is and how it was collected?”; and 3. Scientific review – “Is the metadata provided accurate?” “Are there suspicious values in the data that shouldn’t be (e.g. are there negative values for rain rate)?” “Is the dataset fit for the purpose it was collected for?” “Is the dataset suitable for any other uses?” Based on the available information resources on this topic, we can say that much remains to be done in developing research data peer review. It is a new and dynamic area which will demand cooperation between many stakeholders in science.
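Some of the questions in the three review types quoted from Callaghan [7] lend themselves to simple automated pre-checks that could support, but not replace, a human reviewer. The following is only a minimal sketch under stated assumptions: the `Dataset` record, its field names, the list of accepted formats and the example identifier are all hypothetical and are not taken from any of the cited sources. It covers one question per review type: presence of a permanent identifier (editorial), use of an accepted file format (technical) and negative rain-rate values (scientific).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence


@dataclass
class Dataset:
    """Hypothetical minimal dataset record, invented for this illustration."""
    identifier: str    # permanent identifier such as a DOI; empty if none
    file_format: str   # e.g. "csv"
    rain_rate_mm_per_h: List[float] = field(default_factory=list)


def editorial_review(ds: Dataset) -> List[str]:
    """Editorial question: does the dataset have a permanent identifier?"""
    if not ds.identifier.strip():
        return ["no permanent identifier"]
    return []


def technical_review(
    ds: Dataset,
    accepted_formats: Sequence[str] = ("csv", "netcdf"),  # assumed list
) -> List[str]:
    """Technical question: is the data in an accepted format?"""
    if ds.file_format.lower() not in accepted_formats:
        return [f"format '{ds.file_format}' is not in the accepted list"]
    return []


def scientific_review(ds: Dataset) -> List[str]:
    """Scientific question: are there suspicious (negative) rain rates?"""
    return [
        f"negative rain rate at index {i}: {value}"
        for i, value in enumerate(ds.rain_rate_mm_per_h)
        if value < 0
    ]


def review(ds: Dataset) -> Dict[str, List[str]]:
    """Run all three pre-checks and group the issues by review type."""
    return {
        "editorial": editorial_review(ds),
        "technical": technical_review(ds),
        "scientific": scientific_review(ds),
    }


if __name__ == "__main__":
    # Made-up example record: no identifier, unusual format, one bad value.
    sample = Dataset(identifier="", file_format="xls",
                     rain_rate_mm_per_h=[0.0, 2.5, -1.0])
    for review_type, issues in review(sample).items():
        print(review_type, issues)
```

Checks of this kind only flag candidates for attention; questions such as whether a dataset is fit for the purpose it was collected for still require the judgment of a reviewer from the relevant field.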

Location:
Date: September 21, 2018
Time: 14:05 - 14:20
Radovan Vrana, University of Zagreb
Creative Commons Attribution 4.0 International License