Manage your data - everyone!

Publishers

What are the actual requirements for journals in Physics and Chemistry in terms of data sharing? Overview of learned societies and general publishers.

On the problem of reproducibility, Tsuyoshi Miyakawa, editor-in-chief of Molecular Brain, reports having rejected 40 articles out of 181 in two years because of a lack of data supporting the conclusions. (1)

In 2019, a study entitled “Effect of impact factor and discipline on journal data sharing policies” reviewed the policies for data sharing in 447 journals coming from different disciplines related to medical science. Only 12 of them, i.e. 2.7%, had a strict policy making data sharing a condition for publication.

Two factors can influence the adoption of stricter standards for data sharing. The first is the impact factor: journals with a high impact factor are more likely to insist on data reproducibility and peer review (2). The second is discipline. Out of 150 journals examined in a 2015 survey conducted by COPE, 30% of biomedical journals had a strongly proactive data sharing policy, compared with 10% of Physics and Chemistry journals, for example.
Physics and Chemistry journal publishers show great disparities in their requirements and ambitions for the publication and verification of data. See below for a summary of the current practices of learned societies and general publishers.

LEARNED SOCIETIES

American Chemical Society

Since February 2020, ACS has set up a platform to facilitate the deposit of NMR (Nuclear Magnetic Resonance) data according to FAIR principles. To date, only two organic chemistry journals, Journal of Organic Chemistry and Organic Letters, have participated in this pilot project by inviting authors to use the service to deposit their FID files. At this stage, data are not being stored on ACS servers, though this is one of the project’s objectives.
For each of the 68 journals published by the ACS, instructions are available in the Publications Center. General recommendations on NMR data are published here. Crystallographic data must be verified using the CheckCIF tool. A list of file formats accepted for “supporting information” is also available.
For more detail about each journal's policy, go to the drop-down menu and read the information regarding “supporting information” and “data requirements”. Prerequisites vary greatly from one journal to another:

  • The Journal of the American Chemical Society provides an eight-page description in its guide for authors.
  • The Journal of Organic Chemistry also provides a rather thorough user manual. It explains that: “The publishers insist that the presentation of yields over 95%, of isomeric ratios above 200:1 and enantiomeric excess of more than 99% are not considered realistic without detailed explanations.”
  • The Journal of Proteome Research requires authors to deposit raw data and associated metadata in repositories such as ProteomeXchange and to give access details in the manuscript (URL and even passwords if necessary). “Data remains confidential while the manuscript is under review, but will be opened once it is published.”
  • The journal ACS Nano provides author instructions with data requirements. Authors must provide solid proof for both the identity and the purity of new substances studied. Expected specifications are detailed for the different types of data.
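
As an illustration of the Journal of Organic Chemistry thresholds quoted in the list above, the following minimal Python sketch flags reported values that would call for a detailed explanation. Only the three numeric thresholds come from the quotation; the record structure, field names and function name are hypothetical and are not part of any ACS tool.

```python
from dataclasses import dataclass
from typing import List, Optional

# Thresholds taken from the Journal of Organic Chemistry quotation.
MAX_YIELD_PERCENT = 95.0
MAX_ISOMERIC_RATIO = 200.0  # the N in an N:1 ratio
MAX_EE_PERCENT = 99.0


@dataclass
class ReportedResult:
    """Hypothetical record of the values reported for one compound."""
    name: str
    yield_percent: Optional[float] = None
    isomeric_ratio: Optional[float] = None
    ee_percent: Optional[float] = None


def values_needing_explanation(result: ReportedResult) -> List[str]:
    """Return the names of reported values that exceed a threshold."""
    flags = []
    if result.yield_percent is not None and result.yield_percent > MAX_YIELD_PERCENT:
        flags.append("yield")
    if result.isomeric_ratio is not None and result.isomeric_ratio > MAX_ISOMERIC_RATIO:
        flags.append("isomeric ratio")
    if result.ee_percent is not None and result.ee_percent > MAX_EE_PERCENT:
        flags.append("enantiomeric excess")
    return flags
```

A check of this kind could be run by an author before submission; it only flags values and does not judge whether the accompanying explanation is adequate.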

Royal Society of Chemistry

The RSC displays a clear and detailed policy: “During the submission of a manuscript, authors must provide all the data necessary for understanding and verifying the research presented in the article.”
Instructions may differ depending on the type of data: X-ray crystallographic data must be deposited in a dedicated repository; chemical compounds and spectra can be included in the “supporting information”; so-called “additional” data, which is not compulsory but is potentially useful for future research, can be deposited in a discipline-based or general repository.

Spectral data must be supplied in JCAMP-DX format. Guidelines facilitate substance description through a dozen criteria (yield, melting point, spectra, refractive index, etc.). Crystallographic data must, of course, be in CIF format and accompanied by a checkCIF report.

The 44 journals published by the RSC can add their own recommendations to these general directives, depending on the field covered. For example, the journal ChemComm encourages authors to provide a complete description of the conditions in which magnetic measurements of samples were taken (type of capsule used, etc.). The journal Energy & Environmental Science requires detailed information on the calibration protocol for solar converters.

American Institute of Physics

AIP publishes 32 journals, as well as proceedings and books. Authors are advised to deposit their data sets in publicly accessible repositories or to present them in the main manuscript so that readers have access to all data sets used to support the conclusions. AIP’s policy for research data is available here.
When the article is submitted, the author must provide a “declaration of data availability” (see: Data Availability Statement Templates).
The journals all follow the general recommendations of the AIP and do not adopt their own standards, unlike, for example, ACS journals.

Institute of Physics

The IOP publishes 96 journals and conference series. The IOP participates in the Committee on Publication Ethics (COPE).
General policy: “IOP Publishing supports the principle of transparency and open data. The goal is reproducibility for which data, codes, and research materials supporting research articles must be available.” The policy is available here. Two aspects are discussed in detail: Standard data policy and Data availability policy.
All of the IOP journals follow these recommendations. It is recommended to consult possible specific journal policies in the section “About the journal” on their website.
Supplementary materials: supplementary material files must not exceed 10 MB. It is recommended to deposit larger files in a data repository.
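
The 10 MB rule above can be checked before submission. The sketch below is only an illustration: it assumes that 10 MB means 10,000,000 bytes (the policy text does not say whether decimal or binary megabytes are meant), and the function names are hypothetical.

```python
from pathlib import Path

# Assumption: "10 MB" is read as 10 * 10^6 bytes.
SIZE_LIMIT_BYTES = 10 * 1000 * 1000


def file_sizes(folder):
    """Map each regular file in `folder` to its size in bytes."""
    return {p.name: p.stat().st_size for p in Path(folder).iterdir() if p.is_file()}


def split_by_size(sizes, limit=SIZE_LIMIT_BYTES):
    """Separate files that fit as supplementary material from larger ones.

    Returns two sorted lists of names: files within the limit, and files
    that would go to a data repository instead.
    """
    supplementary = sorted(name for name, size in sizes.items() if size <= limit)
    repository = sorted(name for name, size in sizes.items() if size > limit)
    return supplementary, repository
```

For example, `split_by_size(file_sizes("submission"))` would list which files in a hypothetical `submission` folder can be attached directly and which should be deposited in a repository.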

Conditions for availability: IOP recommends publication in a discipline-based repository, but accepts general repositories. The IOP cites the following repositories: Figshare, Dryad, Harvard Dataverse, Zenodo. Note that IOP Publishing has developed a dedicated repository on the Figshare platform, used by the journals Environmental Research Letters, IOP SciNotes, JPhys Complexity, and Machine Learning: Science and Technology.

According to the policy of these journals, it is compulsory to make a data availability statement.

An IOP Publishing journal published on behalf of another society or organization is authorized to establish its own research data policy. This is the case, for example, for the American Physical Society, which provides a data guide.

American Physical Society

APS publishes 15 journals including Physical Review A, B, C, D, E, Physical Review Letters, Reviews of Modern Physics, and Physical Review Applied. The APS policy for data distribution covers only “Supplemental Material” archived by the publisher. It can include “multimedia files, raw or analyzed data tables, parameters used in or produced by calculations and computer codes as well as information about how the research was conducted (sample preparation, derivatives, etc.).”
All files related to an article are stored in a single deposit and are assigned a URL for additional documents. The URL appears in the article’s list of references.

Society of Photo-Optical Instrumentation Engineers

SPIE, a learned society publishing 11 journals specialized in the study of light, states in its ethical provisions (SPIE Guidelines for Ethical Publishing) that “results of research must be preserved in a form for analysis and examination.”
The organization does not publish large data sets at this stage. Authors can, however, refer to large data sets in their manuscripts with a link to the repository. The Advanced Photonics journal specifies that the total size of files attached as supplementary material must not exceed 100 MB.

American Geophysical Union

The American Geophysical Union publishes 22 journals (via Wiley), including Geochemistry, Geophysics, Geosystems; Reviews of Geophysics; and JGR: Solid Earth.
This organization adopted a data publication policy as early as 1993. In 2015, the AGU declared: “Scientific earth and space data should be widely accessible in multiple formats and that long-term preservation of data is the full responsibility of the scientists and institutions that sponsor them.” The AGU stipulates that “any data necessary for understanding, assessing, reproducing and developing published research must be made available and accessible whenever possible”. The AGU journals follow FAIR data principles (see the guidelines for enabling FAIR data published by COPDESS, the Coalition for Publishing Data in the Earth and Space Sciences).

Authors are advised to identify and archive data related to their article in a repository widely used in the community, to preserve data for at least 5 years or to provide access using a clear process. If these conditions are not respected, the AGU reserves the right to refuse publication.
Authors are given comprehensive instructions, with template examples for indicating the availability of data, types of recommended repositories, etc.

European Geosciences Union

EGU, which publishes 19 open access journals such as Geoscientific Model Development and Atmospheric Chemistry and Physics (ACP), was a signatory to the Berlin Declaration in 2003 and, in March 2020, endorsed the Open Access 2020 initiative. EGU has specified several issues concerning big data and earth sciences.

GENERAL PUBLISHERS

Like discipline-based publishers, major groups such as Elsevier, Springer Nature and Taylor & Francis have established open research data policies.

Springer Nature

Springer Nature (which publishes Nature Physics, Nature Chemistry, Nature Materials, Nature Photonics, La Rivista del Nuovo Cimento) classifies its journals into four categories, depending on the level of data sharing policy. Only categories 3 and 4 demand communication of data, with peer review for the latter. Most of the chemistry journals are in category 2, meaning that authors can provide a data availability statement and are encouraged to deposit data in a repository or as supplementary information.
Note: Nature Physics and Nature Chemistry are classified in category 3. In this case, all relevant data, including raw data, must be freely available. The data availability statement is mandatory.
The publisher also provides a list of recommended repositories for each discipline.

Elsevier

“Our policy aims to encourage rather than force authors to share their research data”, the publisher states. Among potential repositories, Elsevier recommends its own platform, Mendeley Data, as well as about 60 other more specific repositories.
Elsevier systematically refers to a data availability statement, which authors are invited to provide. For the Journal of Computational Physics, co-submission to Data in Brief is also proposed. In chemistry, data can be highlighted in Chemical Data Collections.

Wiley

The publisher’s requirements vary greatly, from encouraging to requiring deposit, and even peer review of data. Wiley has a partnership with Dryad, a repository which charges fees for depositing data. Some of its journals, in the medical or environmental science fields, cover the cost of deposit.
Generally, a “data availability statement” is required from researchers, who can indicate whether their data is open in a repository, subject to embargo or available upon request.
An FAQ gives more information about what is expected of authors regarding data.

Angewandte Chemie International Edition and Chemistry - A European Journal both offer fairly detailed guidance, organized by the type of data to supply (NMR, infrared, mass spectrometry, catalysis, precise method used for energy conversion and storage, etc.).

By contrast, Polymers for Advanced Technologies provides little information on the subject, simply indicating that an article can contain up to five graphs; others can be placed in “supplementary information”.

Taylor and Francis

T&F policies for data communication range from optional to required. As with Wiley, a “data availability statement” is requested from authors.
More extensive criteria exist for 7 Geoscience journals, where data deposits in suitable platforms are required. As with ACS, a partnership has been established with Figshare to host “supplementary information”.
Examination of the policy terms of chemistry journals shows two recurring features. On the one hand, the publisher urges researchers to provide a DOI for data sets associated with the article. On the other, it indicates that data is not peer reviewed, implying that it is the author’s responsibility to ensure the reliability of data.

EDP Sciences

EDP Sciences, a publisher that stresses its commitment to open science, presents its data sharing policy in the context of research funding: “EDP Sciences journals encourage authors to share and publish their data if it is legally and ethically possible.”
Authors are encouraged to deposit their data in an online repository so it is “available for human and machine reading, in order to contribute to the acceleration of scientific discovery”. They are also invited to deposit their data according to FAIR data principles and to provide a data availability statement. These recommendations apply to the following journals: Acta Acustica, Astronomy and Astrophysics, and Journal of Space Weather and Space Climate.

Cambridge University Press

The Journal of Fluid Mechanics, published by Cambridge University Press, sets out its research data requirements in its research transparency section: “All information required for reproducing the study must be supplied, either in the body of the document, or in repositories accessible to the public”.

A search engine for rating the transparency of journals for data

The Center for Open Science publishes an analysis grid on the transparency and openness of journals. This guide distinguishes 8 topics (Citation Standards, Data Transparency, Analytic Methods (Code) Transparency, Research Materials Transparency, Design and Analysis Transparency, Study Preregistration, Analysis Plan Preregistration, Replication) and three levels of compliance. A search engine can be used to check how far a journal complies with these principles. It only takes two clicks to find that Nature Chemistry (Springer Nature) scores nine points, compared with only one point for Forensic Chemistry (Elsevier).

  1. Miyakawa, T. No raw data, no science: another possible source of the reproducibility crisis. Mol Brain 13, 24 (2020). https://doi.org/10.1186/s13041-020-0552-2
  2. According to the authors of the study, this is because “journals with a high impact factor are likely to receive more attention from researchers and the media”, increasing pressure to comply with these measures.