WP3: Data archaeology and rescue | Emodnet Biology

WP3: Data archaeology and rescue

Lead: HCMR - Partners: OGS, NIMRD, ICES, VLIZ

+ organizations interested


A highly heterogeneous landscape of marine biodiversity datasets exists on paper and in electronic form. These datasets were gathered in the course of various research activities and are not part of the large thematic or national monitoring databases. They refer to different components of marine ecosystems, such as meio- and macrobenthos, infauna, epifauna, plankton and even microbes, and are currently stored as logbooks, reports, conference and workshop documents, and scientific articles in peer-reviewed journals. They may also take the form of simple, unorganized spreadsheets on the hard disks or other electronic storage media of individual scientists, marine institutes, research centers, academic departments, ministries, port authorities, etc. Today, data that are not stored on a remote server (e.g. in the cloud) are considered to be at permanent risk of being lost to future use. All media, from hard copies to old and unused floppy or hard disks, run the risk of permanent destruction because they are likely to fail in the future. When the scientists in charge retire, this possibility becomes a certainty, since very few research institutes and academic departments implement a sound, long-term data management policy. Worse, most of these datasets are not associated with proper metadata, which makes their re-use almost impossible without the assistance of the data owner (holder/steward/custodian). Yet it is these historical datasets that may give us the opportunity to reconstruct what was out there in the past and provide an invaluable baseline against which to measure natural and anthropogenic change. Moreover, these data can never be collected again, simply because it is impossible to be in a certain place at a certain time more than once and collect data for the same purpose.
Therefore, without these data our capability to compare the past with the present and the future is lost forever.

The overall objective of this work package is therefore to fill the spatial and temporal gaps in species occurrences and to make the rescued historical data available through the EMODnet portal, using the same common methodologies and making these data interoperable with the large biological data holdings identified in WP2. This requires implementing data archaeology and rescue, together with a long-term strategy to ensure the continuous flow of such data into the EMODnet platform. This Data Archaeology and Rescue WP will build on previous experience, gained primarily in the EMODnet projects. The specific objectives of the WP are:

  • To review the strategy, activity and best practices implemented over the previous phases of the EMODnet projects and develop a simple and productive workflow to maximize efficiency by capitalizing on experience gained.
  • To continue the identification of historical data that are at risk and implement a plan for their archaeology and rescue.
  • To run a framework of small grants for their digitization, standardization and quality control.
  • To implement a mechanism for networking the supporting community to ensure a continuous inflow of datasets in the future, with an emphasis on the benefits offered to this community.

The focus of the WP will be the Mediterranean and Black Seas, but it will also expand to other European regional seas where appropriate.

Methodology & activities


1. Developing an efficient workflow

2. Testing and implementing the modified procedure

3. Selection of the datasets to be digitized during the project

4. Digitization: from pilot to industrial production

5. Engaging the community


Output (Deliverables)

D3.1: Scientific document presenting the data archaeology and rescue strategy of the project (M03).

D3.2: Report on the digitization of 3 datasets under the modified procedure (M06).

D3.3: Update of the list of the 76 datasets along with a list of selected datasets for digitization (M08).

D3.4: Policy report on biodiversity data management sent to research organizations (M14).

D3.5: General report on data entry (M24), with an individual report for each dataset (D4.5.1, etc.) as available, including a list of data papers in preparation, submitted, and published.