Getting & Working with Bibliometric Data

Description: DSI Postdoc Dan Hicks will lead this workshop on finding and using bibliometric data. Bibliometrics is a quantitative "science of science" field that uses publication metadata to study the research process and outcomes. This workshop will introduce methods for getting bibliometric data — that is, publication metadata — from Scopus using a combination of the web interface and API. During the first hour, we will explore some of Scopus' advanced search functionality, and export metadatasets that can be analyzed in Excel, R, and other environments. We will also discuss the limitations of some common metrics used in bibliometrics, including citation counts, citation profiles, h-index, and Impact Factor. During the second hour, we will use the Elsevier APIs to retrieve a wider range of paper metadata, as well as author- and institution-level datasets.

The workshop will be interactive, so please bring your laptop! For the first hour, no particular software is required other than an up-to-date web browser. For the second hour, code demonstrations will be provided in R. Other major programming languages (including Python) can be used to access the Elsevier APIs, but we may not be able to provide support for them.

  • Web interface:
    • Scopus advanced search
    • Exporting metadatasets
  • Some bibliometrics:
    • Citation counts and profiles
    • h-index
    • The hazards of Impact Factor
  • Programmatic interface:
    • Creating a new API key
    • Scraping data using RCurl
    • Parsing data with XML
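
As an illustration of one metric from the outline above: the h-index is the largest number h such that an author has h papers with at least h citations each. The following is a minimal Python sketch of that definition, not the workshop's own code (the demonstrations are in R); the function name `h_index` and the sample citation counts are hypothetical.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            # The top `rank` papers all have at least `rank` citations.
            h = rank
        else:
            # Counts only decrease from here, so no larger h is possible.
            break
    return h


# Hypothetical citation counts for one author's five papers.
example_counts = [10, 8, 5, 4, 3]
print(h_index(example_counts))  # prints 4
```

The example also hints at one limitation of the metric: the paper with 10 citations could have 1,000 instead and the result would still be 4, because the h-index ignores how far a paper exceeds the threshold.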

Prerequisites: To follow along with the code demonstrations, bring a laptop with a working installation of R. The first portion of the workshop (using the Scopus web interface) will also be helpful to those who do not have a programming background.

Register here. All are welcome to attend, but DSI Affiliates have priority registration.

Resources: Video