Research data management

What does Research Data Management stand for?

The TU Darmstadt Guidelines on Digital Research Data define Research Data Management (RDM) as “all scientific activities that involve digital data [during research projects]. This includes all working stages from planning to production to using and editing research data. Finally, also permanent archiving of data or deleting them is part of RDM.” The guidelines also mention that “RDM explicitly includes documentation of the production of research data according to the specific discipline's standards. It also means secure storage, accurate editing and, if applicable, appropriate publishing of research data.”

RDM covers all methods and techniques that can be used to secure research data and ensure their long-term availability. Research data management therefore begins as early as the preparation of your research project. For illustration, have a look at this well-established data curation life cycle:

Data curation life cycle

The curation life cycle illustrates the sequential stages that data pass through in most research projects:

1) Your plan marks the starting point of this cycle. At this point, you may refer to existing data.

2) The next step is to collect or process new digital raw data, also called primary data (or to integrate existing research data into your own project).

3) By analyzing and processing raw and input data, you generate the data that are relevant for the research project and, if applicable, for the publication of its results.

4) First, all data have to be saved as a temporary backup.

5) Before archiving, the relevant data have to be selected and described. This includes deleting intermediate stages that are no longer required. Here, it is important to generate useful metadata, document all stages of the process and preserve the software that was used.

6) Successful archiving often means converting your data into permanent file formats. Permanent, reliable archiving can only work within an infrastructure that is also permanent. You do not want to store your data in an infrastructure that is tied to a short-term project. That is why it is best to use central infrastructures for digital long-term preservation.

7) Specific data can and should be made freely available, or distributed on demand, in order to make your research results more accessible. Here, it is important to assign persistent identifiers and to register the data in suitable databases.

8) The last stage of the data curation life cycle is the re-use or processing of data in other research projects. Re-use by the scientific community can only be ensured by carefully completing the previous stages.
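
As an illustration of stage 5 (describing data before archiving), the following is a minimal Python sketch of how a small metadata record with a checksum could be written next to a data file. It is not part of the TU Darmstadt guidelines. The field names, the `.metadata.json` sidecar convention and the helper functions `sha256_of`, `describe_file` and `write_sidecar` are assumptions made for this example; a real project should follow the metadata standard of its discipline or of the chosen repository.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe_file(path: Path, title: str, creator: str) -> dict:
    """Build a minimal metadata record (hypothetical field names)."""
    return {
        "title": title,
        "creator": creator,
        "file_name": path.name,
        "size_bytes": path.stat().st_size,
        # The checksum lets you verify later that a copy is unchanged.
        "sha256": sha256_of(path),
        "described_at": datetime.now(timezone.utc).isoformat(),
    }


def write_sidecar(path: Path, record: dict) -> Path:
    """Store the record as JSON next to the data file (assumed convention)."""
    sidecar = path.with_name(path.name + ".metadata.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar
```

Calling `write_sidecar(data_file, describe_file(data_file, "My measurements", "Jane Doe"))` would create a file such as `measurements.csv.metadata.json` beside the data. Recomputing the checksum after a backup or transfer and comparing it with the stored value is one simple way to check that the archived copy is intact.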