Finding data

Browsing scientific search engines and data journals

General-purpose search engines

DataCite Search can be used to identify data sets: it covers almost 20 million data sets from almost 2,000 repositories. The engine is run by DataCite, a global provider of DOIs for scientific data.
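For readers who prefer to script their searches, the sketch below builds a query URL for DataCite's public REST API rather than the web interface described above. The base URL and the `query`, `resource-type-id` and `page[size]` parameters are assumptions that should be checked against the current DataCite API documentation. The function name `build_dataset_query_url` and the search terms are purely illustrative.

```python
from urllib.parse import urlencode

# Assumption: base URL of DataCite's public REST API for DOI records.
DATACITE_DOIS_URL = "https://api.datacite.org/dois"


def build_dataset_query_url(keywords: str, page_size: int = 10) -> str:
    """Return a URL that searches DataCite DOI records, limited to data sets."""
    params = {
        "query": keywords,              # free-text search terms
        "resource-type-id": "dataset",  # assumed filter: data sets only
        "page[size]": page_size,        # assumed paging parameter
    }
    # urlencode percent-encodes the values and the brackets in "page[size]".
    return f"{DATACITE_DOIS_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Illustrative search terms; paste the printed URL into a browser or HTTP client.
    print(build_dataset_query_url("perovskite solar cell", page_size=5))
```

The code only assembles the URL and does not contact the network, so it can be adapted or inspected before any request is sent.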

Dataset Search is Google's tool. After two years in beta, it was officially launched in January 2020, with an announced index of 25 million data sets. To be picked up by the search engine, data sets must come from websites that describe them using the schema.org structured format. Note that the indexing scope is very broad and includes statistical databases from national administrations. On the other hand, search filters are very limited.
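To make the schema.org requirement more concrete, here is a minimal sketch of the kind of description a data set landing page might embed. It uses the schema.org `Dataset` type serialized as JSON-LD. The selection of properties is an assumption for illustration, not a statement of what Google requires, and the title, URL, DOI and author name are invented placeholders. The helper names `dataset_jsonld` and `as_script_tag` are hypothetical.

```python
import json


def dataset_jsonld(name, description, url, doi, creator_name):
    """Build a minimal schema.org Dataset description as a JSON-LD string."""
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "identifier": doi,
        "creator": {"@type": "Person", "name": creator_name},
    }
    return json.dumps(record, indent=2)


def as_script_tag(jsonld):
    """Wrap the JSON-LD in the script element used to embed it in an HTML page."""
    return f'<script type="application/ld+json">\n{jsonld}\n</script>'


if __name__ == "__main__":
    # All values below are placeholders, not a real data set.
    markup = dataset_jsonld(
        name="Example spectroscopy measurements",
        description="Placeholder description of an illustrative data set.",
        url="https://repository.example.org/datasets/42",
        doi="https://doi.org/10.1234/example",
        creator_name="A. Researcher",
    )
    print(as_script_tag(markup))
```

Many data repositories generate this kind of markup automatically, so depositors rarely have to write it by hand.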

OpenAIRE (Open Access Infrastructure for Research in Europe) is a platform that indexes publications and research data. Resources are harvested from over 100 repositories. Searches can be narrowed by project name, funder, type of data, etc.

The Mendeley Data (Elsevier) search engine indexes 20 million data sets. It is also a repository where data can be deposited after creating an account.

Dimensions.ai and Lens.org are publication aggregators. Each offers a “dataset” filter that restricts the search to data sets among the different types of harvested records.

Other ideas: supplementary materials and data journals

Adding supplementary materials to publications is now a typical publisher requirement (see the Nature website, the RSC instructions on experimental data, and the editorial choices of the journal Molecular Brain, whose editor-in-chief rejected 40 articles between 2017 and 2019 because of absent or insufficient raw data).

In supplementary materials, authors explain their methods and calculations, and can attach additional data in tables, diagrams, etc. However, journal requirements and the amount of data that can be attached vary, which can make useful content harder to find. To overcome this, ACS chose to have the supporting information of its journals indexed automatically in Figshare.

Finally, consider data papers. In Web of Science, results can be filtered to show data papers only.
See our list of general and topical data journals in chemistry, physics, and related disciplines: