Share and publish data
Data remains a valuable resource, even after the end of the project for which it was collected and used. Sharing data enables the future you and other researchers to open up new lines of enquiry without the duplication of effort involved in collecting data again. This page provides guidance on why and how to share data as well as choosing the right data sharing strategy.
Why share data
Sweden, like other members of the European Union and many countries worldwide, is in the middle of making science more open, transparent, and inclusive. Open science is about extending the principles of openness to the whole research cycle and making scientific information openly accessible to society to promote innovation, knowledge transfer, and awareness. The two most important pillars for the development of open science are open access to scientific publications and open access to research data. Open access to research data means that data collected and/or generated as part of a research or environmental monitoring and assessment activity shall be made openly and freely available online. This shall be done in accordance with current legislation (as per the EU’s underlying principle “as open as possible, as closed as necessary”) and allow for optimal reuse of data. In this context, see SLU’s policy for scientific publishing, which encourages SLU personnel to make research data openly available.
As part of the open science movement, national funding agencies as well as journals and publishers increasingly mandate making data openly available. Both the Swedish Research Council (VR) and Formas (Swedish Research Council for Sustainable Development) have developed policies on research data, requiring open access to research data financed via public funds. Likewise, a rising number of journals and publishers have formulated research data sharing policies that range from supporting and encouraging, through expecting, to in fact requiring data sharing (see PLOS ONE, PNAS, Frontiers, American Chemical Society, or SpringerNature for examples of data sharing policies from journals and publishers).
However, there are a number of reasons to share data beyond the growing mandates from governments, funding agencies, journals, and publishers. Sharing data can
highlight and advance your research,
Sharing data means that your peers will be able to discover your work more easily. Greater visibility and impact of your research may lead to greater recognition of your scholarly work. It may, furthermore, increase your profile as a researcher, by ensuring credit is given to data as an output in its own right. Finally, it may lead to new collaborations and partnerships as shared data makes other researchers more aware of your own research.
demonstrate your research integrity,
By sharing data you allow others to verify and validate your work.
progress science in general,
Shared data opens up new avenues of research, encourages scientific enquiry and debate, and promotes innovation. It allows data to be independently validated and tested, and scientific findings to be verified. It also allows for validation and improvement of research and applied scientific methodologies. In addition, newly arising collaborations may increase the potential of reusing data in completely new ways. Finally, shared data reduces the cost of duplicating data collection.
be a resource for education and training, and
maximise transparency and accountability.
Ways of sharing data
Sharing data can mean a number of things. Here, data sharing is defined as the practice of making data available to others. Historically, data has been shared ‘upon request from the author’. Nowadays, data may still be shared upon request, yet there are more reliable and secure ways of sharing it. Data can be made available to others either informally or formally. Informal data sharing means making data available by means of self-dissemination. Formal data sharing, or data publishing, means making data publicly available and thus discoverable by depositing data and/or metadata at a data repository, publishing a data article in a data journal, or publishing an article in a journal together with data as supplemental material.
Self-dissemination (informal data sharing)
Self-dissemination means making data available via peer-to-peer sharing (e.g., ‘upon request’) or via a project or personal website. Websites, be they project or personal, may offer storage and dissemination but are less sustainable and provide less long-term preservation. In addition, managing the data may be costly, and controlling who uses the data and how may be difficult.
Depositing data and/or metadata at a data repository (formal data publishing)
Publishing data via a data repository can include one of two things: either depositing both data and metadata or only metadata at a data repository. In any case, depositing data and/or metadata at a data repository allows for it to be discoverable.
Repositories can provide controlled access to sensitive data and create a catalogue record for the data, making it more discoverable. They take responsibility for handling data reuse queries, licensing, dissemination, and promotion of data on behalf of the data depositor. Some data repositories also manage data safely for long-term use and protect it from format obsolescence, loss, deterioration, or irreversible damage. Using a repository involves depositing data and/or metadata in a digital database, which can be discipline/domain-specific, institutional, or generalist. Finally, repositories generally provide a persistent identifier when data and/or metadata is deposited.
Publishing a data article (formal data publishing)
Data articles focus on data collected, generated and/or reused throughout a research or environmental monitoring and assessment activity, allowing you to describe tools, methods and processes; that is, to fully characterise the data. The actual data and/or metadata, however, is often deposited at a data repository (see above).
Data journals seek to promote scientific accreditation and reuse, improve transparency of scientific methods and results, support good data management practices, and provide an accessible and permanent route to the data. It is important to note that data articles are peer-reviewed and citable.
Publishing an article together with supplementary data (formal data sharing or publishing)
Until recently, the usual way of sharing data was as supplementary material to a peer-reviewed article published in a scientific journal. This, though, often meant that only some of the data was shared, and that journals and publishers might keep that data behind a subscription wall and/or claim copyright over it. Moreover, such data may not be in a user-friendly format or suitable for computer processing.
Nowadays, a growing number of journals and publishers support and encourage, expect, or, in fact, require all data underlying a publication to be shared by publishing data and/or metadata through a data repository (see for instance PLOS ONE, PNAS, Frontiers, American Chemical Society, or SpringerNature).
Linking data to a published article
Research publications are most useful when supported by their underlying data. Journals and publishers are ever more commonly requiring that data directly underpinning a publication is made publicly available (either shared or indeed published). Authors are to this effect asked to state where the data can be found and under what conditions it can be accessed (and, if not, why). Such statements in publications are known as data access statements (or data availability statements). The University of Bath Library provides detailed information on data access statements, including examples thereof.
However, to link data to your publication most effectively you need a persistent identifier, such as a Digital Object Identifier (DOI). In order to get such an identifier (which can then be included in your publication’s data access statement), you will need to deposit the data and/or metadata at a data repository (i.e., you will need to publish the data). See below for how to obtain a persistent identifier.
Choosing where and how to share data is a crucial matter. Your choice may mean a large and far-reaching impact or little to no reuse. Also, your preference may depend on the existing practices in your discipline or funder and journal/publisher requirements.
How to share data
As seen above, data can be shared in many different ways. Whichever way is chosen, it is best practice to properly prepare the data prior to sharing. Once the data itself is ready to be shared, there are a number of issues that need to be considered and addressed.
How to prepare data prior to sharing
Sharing data should be easy if you have followed the previous steps carefully. When sharing data you will need tidy files (see Collect, organise, and store data), in the right format (see Collect, organise, and store data and Preserve data) and with appropriate documentation and metadata (see Collect, organise, and store data, Process and analyse data, and Preserve data). This ensures that data can be discovered, accessed, understood, and reused in the future.
What to consider when sharing data
A number of legal, ethical, and commercial issues need to be anticipated and addressed prior to sharing data. It is important to stress that it may not be possible to publish data openly if one of the following happens to be the case:
material classified according to the Public Access to Information and Secrecy Act (SFS 2009:400),
personal or sensitive information (such information needs to be handled in accordance with data protection, freedom of information, and secrecy legislation; see Collect, organise, and store data for more general information and Process and analyse data for measures to desensitise data prior to publishing openly),
material copyrighted by somebody else, where the copyright prohibits further publishing,
trade secrets or sensitive financial information.
It is important to note that should one of the above apply, you may still publish the data, though not openly (e.g., by publishing the metadata at a data repository with information on how to get hold of the actual data). Should none of the above be the case, or should existing issues have been resolved (e.g., via consent), it is time to think about intellectual property rights, licensing, and persistent identifiers.
As a rule, material (such as data) produced as part of research as well as environmental monitoring and assessment activities carried out at SLU is considered an official record/document, and access to it must be guaranteed on request unless there is a confidentiality provision according to the Public Access to Information and Secrecy Act (SFS 2009:400). Such a request can, if adhering to current legislation, be met by either allowing the enquirer to view/obtain the data on site or providing a copy (depending on the circumstances). Should you be uncertain about allowing or denying such a request, you can contact SLU’s Legal Affairs unit. More on the principle of public access to information can be found at Collect, organise, and store data.
Intellectual property rights (e.g., copyright)
Intellectual property (IP) allows a person to own their creation in the same way as something physical can be owned. This gives the rights holder control over the exploitation of their work, such as the right to copy and adapt it, the right to rent or lend it, the right to communicate it to the public, and the right to license and distribute it. In Sweden, for something to fall under copyright protection, it has to be the “result of personal intellectual creativity” and of “original and unique” character. In general, data belongs to the public domain and is as such not copyrightable. Research data can, though, in some cases fall under copyright protection if it contains copyrightable work (e.g., literary text, computer code, or works of art). For large compilations of research data the database right may be applicable. The database right is comparable to but distinct from copyright, and its purpose is to recognise the investment made in compiling a database. It does not, however, involve the “creative” aspect required in copyright. In copyright, the rights holder is the creator (e.g., an author), while in the database right the rights holder is the organisation that made the financial investment to enable the compilation (e.g., a university). Please contact SLU’s DCU (Data Curation Unit) for further information and/or support in this regard.
To enable data reuse, the data may need to be appropriately licensed. From a copyright perspective, a licence clearly specifies what others can do with the data you share. It states the conditions under which reuse is allowed: whether you require attribution (i.e., citation), what type of licence applies when building upon the data shared, whether work resulting from a transformation of the data shared or building upon it can be shared further, and whether commercial use is permitted.
To make data available to the widest audience possible and allow for the widest range of uses possible, it is recommended to choose a standard and open licence. Commonly used standard and open licences for data include Creative Commons and Open Data Commons licences. The Swedish National Data Service (SND) has put together further information regarding licensing specific to the Swedish context (information available only in Swedish). Note that licensing data is one of the principles ensuring FAIR data.
Identifiers for data or information are essential in all computer-based systems. Computer applications apply them for identifying datasets, for searching and retrieval, and for linking or connecting data. Thus, identifiers are persistent links to content, allowing data to be findable and citable. Such unique and persistent identifiers (PIDs) ensure that an object is discoverable at all times. Several types of PIDs exist, such as DOI (Digital Object Identifier), Handle, URN, Ark, PURL, etc. It does not really matter what kind of PID you use, though DOI is currently the most widespread and most integrated in automatic citation counting algorithms. In order to obtain a PID for the data you intend to share, you will need to publish the data and/or its metadata at a data repository (see below).
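As a small illustration of how a PID works as a link, the sketch below turns a DOI into its resolvable web address using the public resolver at https://doi.org/. It is a minimal sketch rather than a complete validator: the function name doi_to_url, the loose syntax pattern, and the example DOI 10.1234/example-dataset are hypothetical and were chosen only for illustration.

```python
import re
from urllib.parse import quote

# Loose, assumed pattern for DOI-like strings: "10.", a numeric prefix,
# a slash, and a non-empty suffix. Real DOIs may be more varied.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi: str) -> str:
    """Return the https://doi.org/ link for a bare or prefixed DOI."""
    cleaned = doi.strip()
    # Accept common ways of writing a DOI and reduce them to the bare form.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"not a DOI-like string: {doi!r}")
    # Percent-encode characters that are not safe in a URL path.
    return "https://doi.org/" + quote(cleaned, safe="/")


# Hypothetical DOI used purely as an example.
print(doi_to_url("doi:10.1234/example-dataset"))
```

Writing the DOI as a full link in a data access statement lets readers and citation services reach the dataset directly, even if the repository later changes the dataset's landing page address.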
Choosing a repository
SND’s national research data catalogue
SLU is part of a consortium of Swedish universities making up the Swedish National Data Service (SND) – a national infrastructure for open access to research data. Thus, SLU’s DCU (Data Curation Unit) recommends publishing data and/or metadata via SND’s national research data catalogue. Also, DCU can help you identify other suitable data repositories and assist you in preparing data for publication elsewhere.
SND’s data catalogue is a general digital data repository that accepts data and/or metadata from any subject area. Moreover, it meets the data policy requirements of most funders and journals regarding making data openly and freely available (should SND’s catalogue not be on a journal’s list of approved repositories, it is usually possible to contact the journal and ask them to consider adding it to their list). In addition, every dataset and/or metadata record that is deposited and made accessible at SND’s catalogue is provided with a persistent identifier (a DOI in this case). Lastly, published datasets as well as metadata are discoverable through SND’s catalogue itself as well as other services such as DataCite or Web of Science.
Submission of data and/or metadata is done online via a submission form (login required) that may be either general or tailored towards a specific subject area (earth and environmental sciences, language resources, social sciences, medical and health sciences, and archaeology and history). Once submitted, data and/or metadata undergoes a review prior to publication. When SLU personnel submit data and/or metadata, members of SLU’s DCU carry out that review. This review is done in close communication with the depositor to ensure that data and/or metadata is as complete and FAIR (Findable, Accessible, Interoperable, Reusable) as possible.
In addition to SND’s national research data catalogue, a large number of data repositories exist. When choosing a repository, it is recommended
to check your funder’s and journal/publisher’s recommendations and requirements for sharing data,
to prefer repositories that apply machine-readable metadata and use a known metadata standard, and
to prefer repositories that assign a persistent identifier (PID) to the data and/or metadata.
Remember to include your affiliation in the metadata when depositing data and/or metadata at a repository of your choice. Follow the instructions issued by SLU on how to correctly state your SLU affiliation when publishing.
Visit Discover, reuse, and cite data for a number of websites that may help you in identifying an appropriate repository as well as for a list of potential repositories you may choose among. In addition, Harvard University’s Dataverse Project has undertaken a comparative review of various data repositories with regard to features, usage and governance, which can provide further help with choosing an adequate data repository.
Please note that while you can make data available in open data repositories, research data must still be archived on SLU’s behalf. Visit Archive and preserve data for more information in this respect.
The Swedish National Data Service
The Swedish National Data Service (SND) is a national collaboration to support the accessibility, preservation, and re-use of research data and related materials. The SND research data catalogue can be used for publication of research data in any subject area or for the discovery and re-use of published datasets.
The SND Consortium
SND is run by a consortium consisting of nine universities: Swedish University of Agricultural Sciences, Karolinska Institutet, Lund University, Stockholm University, Umeå University, Uppsala University, Chalmers, KTH Royal Institute of Technology and the University of Gothenburg, which is the host university of SND. The research infrastructure is funded primarily by the Swedish Research Council and the consortium members. The consortium makes all strategic decisions regarding the work conducted by SND and is responsible for the technical solutions for the data repository and the data discovery service.
The SND Network
More than 30 higher education institutions and public research institutes are members of the SND network, striving to create a national infrastructure for open access to research data. The SND network members are committed to establishing local support functions for research data, so-called Data Access Units. At SLU, the Data Curation Unit (DCU) functions as Data Access Unit and supports researchers in the areas of data management, preservation, and publication.
The SND Research Data Catalogue and Repository
Research data can be described in the research data catalogue and made available through the SND website. Data descriptions submitted to the SND data catalogue by researchers at SLU undergo quality review by the staff at DCU prior to publication. The SND repository is CoreTrustSeal certified and data are published under the principle “as open as possible, as closed as necessary”. SND is a member of DataCite and provides published datasets with a DOI (persistent identifier). Datasets can be discovered in the SND research data catalogue as well as through other web search services, e.g. Web of Science.
The consortium universities contribute expertise through so-called domain specialists with experience and knowledge from different research fields and research data management. The domain specialists function as a bridge between groups of researchers and the research community at large, and between the data repository and the local units for research data support. SLU contributes expertise in climate and environmental data to SND.
Domain Specialists in Climate- and Environmental Data at SLU
The domain specialists at SLU monitor research data developments in the climate and environmental domain. They advise on domain-specific matters such as controlled vocabularies and metadata standards and are involved in information and education initiatives. They also participate in national and international collaborations, and work towards an increased understanding, use, and publication of open research data.
Domain Specialist, Climate and Environment
Works with data management and data quality within the Data Curation Unit and the Unit for Data Management Guidance and Development. Background in environmental databases and coordination of terrestrial and limnological field research infrastructures.
Ida Taberman, business developer
Ida.Taberman@slu.se, +46 90 786 8216
Domain Specialist, Climate and Environment
Biologist and information specialist. Works at the SLU library and Data Curation Unit with research support, including support regarding research data management and publication. Has previously worked in the fields of microbial ecology and biodiversity research, as well as with curation and publication of meteorological and oceanographical monitoring data.
Ylva Toljander, librarian
firstname.lastname@example.org, +4672 227 09 48
The Tilda Project
As a university, SLU is responsible for keeping research data available for the research community and the public, today as well as in the future, as part of the SLU archive of public records. In addition, SLU is commissioned by the Swedish government to continuously provide authorities, industries and international organisations with information about the condition of our environment.
A pre-study, “Nystart Tilda”, has been conducted as a result of the review of the Tilda project. It was conducted during 2019 and early 2020. The project team consisted of representatives for the SLU University Library, the Unit for Data Management Guidance and Development, Documentation and Legal Affairs (all members of the Data Curation Unit) and IT. The steering group includes the Heads of the Vice-Chancellor’s Office, the University Library and the IT Department as well as the Pro Vice-Chancellor for Environmental Monitoring and Assessment. The Head of the University Administration is the project owner.
Why is SLU doing this?
Implementing coordinated solutions for archiving and publishing research and EMA data will facilitate:
- the increased reuse of research and EMA data from SLU
- fulfilling our responsibility as a government authority regarding management of public records and making sure data are available today and in the future
- meeting needs and requirements regarding open access to research data and supporting the transition to open science
- making SLU’s research and environmental monitoring and assessment more visible
- assisting researchers and data producers with
  - infrastructure for data management from the start of the project, meeting archiving and publishing requirements
  - support for managing data according to the FAIR principles (Findable, Accessible, Interoperable, Reusable)
  - registering metadata and data from research and environmental monitoring and assessment in one place
  - spreading and exposing research results, e.g. through citable datasets that credit the researcher and through links between published articles and their supporting datasets
Data management support