Extensive data + powerful platform = impactful analyses

The datasets

ICSR Lab combines a powerful computational platform with datasets of considerable size and breadth. It is available at no cost to users for scholarly research.

Full publication metadata from Scopus

  • Publication metadata, including each publication’s authors and affiliations, language information, title, DOI, ASJC subject codes and more

  • Scopus abstracts

  • Author names and profiles, including affiliation details and ORCID

  • Ability to conduct studies of author gender*, following the methodology used in Elsevier’s Gender Report 2020

  • Institution profiles, including name variants

  • Metadata for selected preprints from 2017 onwards

  • Funding metadata derived from funding acknowledgement statements

  • Detailed open access information

  • Citation counts, FWCI, and citation links enabling nuanced analyses of citation impact

  • Ability to use Scopus’ Advanced Search fields and operators for greater control when filtering publications

*Gender assignation metadata was derived using an AI-driven, inferred binary genderization methodology that is appropriate for bibliometrics or other large-scale analyses because such studies focus on trends at scale. The methodology cannot be used to unambiguously infer an individual’s gender identity, so the gender metadata cannot be used for individual-level or small-group analyses as an alternative to self-reported data.

PlumX Metrics broken down into separate categories and sources

  • Citations, encompassing traditional citations as well as, for example, clinical citation counts

  • Captures, such as Forks and Followers on GitHub or readers on Mendeley and SSRN

  • Mentions, including blog mentions, comments on various platforms and Wikipedia references

  • Social media, including Facebook interactions

Other datasets

Specialized ‘Workbenches’ requiring additional application processes

Specialized Workbenches provide access to additional datasets not included automatically in the ICSR Lab. Access to these is subject to peer review by our external advisors in the relevant fields and requires evidence of past research experience in the field.

Peer Review Workbench

This Workbench provides access to summarized metadata of the peer reviews for over one million proprietary Elsevier journal manuscripts submitted between 2018 and 2021 (updated annually), enabling systematic analysis of the peer review process across different disciplines, at scale.

The datasets in the Peer Review Workbench are transparently pre-processed to pre-filter and aggregate along dimensions required for each specific project.

How to apply

For more information and how to apply, please see our how to apply page.

All datasets in ICSR Lab are optimized for big data processing, and the list of available datasets continues to grow based on feedback and the data needs of proposed projects. Note that ICSR Lab is not optimized for text mining and does not contain the full text of articles. If you need full text, see Text and data mining for more information about using Elsevier’s full text API.

The breadth of Scopus data

In its initial release, data from Elsevier’s Scopus forms the backbone of ICSR Lab. Containing more than 82 million items, as well as the corresponding author and institutional profiles, Scopus is a rich, structured data source that covers 40+ languages and includes many enhancements to the data, such as calculated citation counts for each publication.

You can read more about the history and structure of Scopus and its analytical uses in this peer-reviewed open access article. This whitepaper describes the rigorous content curation mechanisms used to exclude poor-quality and predatory publications from Scopus, making Scopus through ICSR Lab a reliable database for academic publications.

Scopus facts and figures

Upload your own datasets for richer, linked data

You are welcome to upload and link your own or third-party datasets in ICSR Lab, so long as you have the appropriate rights to upload the data. This is very useful if you have a pre-curated dataset that you want to filter Scopus to, or if you are reusing annotations from previous work. Keep in mind that, as this is a shared research platform, these datasets will also be visible to other users until removed (though your code is of course private).
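As a rough illustration of what filtering Scopus to a pre-curated dataset could look like, the sketch below links a list of publication records to an uploaded list of DOIs in plain Python. The records, the field names ("doi", "title", "label") and the helper functions are hypothetical and do not reflect the actual ICSR Lab schema; in the Lab itself the same idea would more likely be written as a join in PySpark or SQL.

```python
# Hypothetical publication records; field names are illustrative only
# and are NOT the real ICSR Lab / Scopus schema.
scopus_records = [
    {"doi": "10.1000/a1", "title": "Paper A"},
    {"doi": "10.1000/B2", "title": "Paper B"},
    {"doi": "10.1000/c3", "title": "Paper C"},
]

# A hypothetical pre-curated list you might upload, e.g. DOIs
# annotated in earlier work.
curated = [
    {"doi": "10.1000/b2", "label": "included"},
    {"doi": "10.1000/zz", "label": "included"},
]


def normalize_doi(doi: str) -> str:
    """DOIs are case-insensitive, so compare them in lower case."""
    return doi.strip().lower()


def link_to_curated(records, curated_rows):
    """Keep only records whose DOI appears in the curated list,
    carrying the curated annotation along (an inner join on DOI)."""
    labels = {normalize_doi(row["doi"]): row["label"] for row in curated_rows}
    linked = []
    for rec in records:
        key = normalize_doi(rec["doi"])
        if key in labels:
            linked.append({**rec, "label": labels[key]})
    return linked


linked = link_to_curated(scopus_records, curated)
print(linked)
```

Normalizing the DOI before matching avoids silently dropping records whose DOIs differ only in letter case between the two sources.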

The platform

ICSR Lab runs on the cloud-based ‘Databricks’ platform and is accessible from major web browsers, meaning that you do not need to install any software locally to use it. All the data processing runs in the cloud on powerful Amazon Web Services infrastructure, ensuring quick responses to every query, large or small.

To use the Lab, you write queries in ‘notebooks’, which let you interactively run and re-run code written in one of several programming languages, including Python/PySpark, SQL, R and Scala. Because there is no user interface for exploring data without coding, we require that each team has at least one member with experience coding in one or more of these technologies. You can also use Databricks’ interactive visualizations to explore trends in your data using built-in graphing functions in the user interface.

Notebooks can be shared with and commented on by your collaborators. A detailed and granular system of rights gives you control over who can see the contents of and contribute to your notebooks.

Support materials to help you get the most out of the platform include:

  • Dataset documentation accessible from within the Lab

  • List of frequently asked questions and answers

  • Links to relevant learning resources for Python and PySpark

  • Direct support from professional data scientists in the ICSR Lab team via email

  • Example notebooks illustrating common queries and analyses that can be used as starting points for your own analysis. In the initial release, these include:

    • An example investigation of gender balance within a subject area

    • An example of how to use Databricks’ built-in visualizations
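To give a sense of the kind of aggregate analysis the gender-balance example above refers to, here is a minimal sketch in plain Python. It is not the Lab's example notebook and not the Gender Report methodology: the rows, the labels ("woman", "man", "unknown") and the function name are assumptions made for illustration. In keeping with the note on gender metadata, it only produces shares at the level of a whole year, never for individuals.

```python
from collections import Counter, defaultdict

# Hypothetical author-level rows for one subject area:
# (publication year, inferred gender label). Illustrative data only.
rows = [
    (2019, "woman"), (2019, "man"), (2019, "man"), (2019, "unknown"),
    (2020, "woman"), (2020, "woman"), (2020, "man"), (2020, "man"),
]


def share_of_women_by_year(rows):
    """Return, per year, the share of women among authors that have
    an inferred label; 'unknown' rows are left out of the denominator."""
    counts = defaultdict(Counter)
    for year, label in rows:
        counts[year][label] += 1

    shares = {}
    for year, c in sorted(counts.items()):
        known = c["woman"] + c["man"]
        # None signals that no author in that year had an inferred label.
        shares[year] = c["woman"] / known if known else None
    return shares


shares = share_of_women_by_year(rows)
for year, share in shares.items():
    print(year, share)
```

Excluding unlabeled authors from the denominator is one possible design choice; reporting the size of the "unknown" group alongside the shares would make the result easier to interpret.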
