Location: Instruction Room, Shields Library
Date: 09/29/2014
Time: 9 am - 5 pm

Link to materials.


Researchers in almost all fields now have access to a large number of Web-based data sources. In addition to vast amounts of biological, climate, cosmological and traffic data, we can digitally access census data, political information, images, ancient texts and, of course, live content and network data from social media such as Twitter and Facebook. Being able to access this data efficiently is a powerful skill for any researcher, and it can open up new lines of research.

We will run a one-day workshop on accessing data from Web pages and from Web services/APIs. The workshop is aimed at graduate students and postdocs, and we strongly encourage interested students from the humanities and social sciences to attend. One of its goals is to help researchers access the growing amount of data available through Web pages and structured programming interfaces (APIs).

The topics we will cover include:

  • Extracting data from HTML pages.
  • HTML forms.
  • XML, XPath and JSON.
  • The important aspects of HTTP.
  • Web Services and REST.
  • Authentication: passwords, OAuth, OAuth2.

At the end of the workshop, you will have learned about the core Web technologies and how to programmatically access data via the Web using these technologies from within a commonly used, interactive programming language.
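
As a rough illustration of what programmatic access can look like, here is a minimal sketch covering three of the topics above: pulling cells out of an HTML table, querying XML with an XPath-style expression, and reading JSON. The announcement does not name the workshop's language, so the use of Python here is an assumption, as are the function names and the made-up sample strings, which stand in for content that would normally be fetched over HTTP.

```python
import json
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Made-up sample documents (assumptions for illustration only); in practice
# these strings would be the bodies of HTTP responses.
SAMPLE_HTML = (
    "<table><tr><td>Site A</td><td>10</td></tr>"
    "<tr><td>Site B</td><td>20</td></tr></table>"
)
SAMPLE_XML = (
    "<stations><station id='a1'><temp>21.5</temp></station>"
    "<station id='b2'><temp>18.0</temp></station></stations>"
)
SAMPLE_JSON = '{"results": [{"user": "alice", "followers": 120}, {"user": "bob", "followers": 45}]}'


class TableCellParser(HTMLParser):
    """Collect the text found inside <td> cells, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])      # start a new row
        elif tag == "td":
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        # Only keep text that appears inside a cell of an open row.
        if self._in_cell and self.rows:
            self.rows[-1].append(data.strip())


def html_table_rows(html_text):
    """Return the table as a list of rows, each a list of cell strings."""
    parser = TableCellParser()
    parser.feed(html_text)
    return parser.rows


def xml_temperatures(xml_text):
    """Map each station id to its temperature.

    ElementTree supports only a limited subset of XPath.
    """
    root = ET.fromstring(xml_text)
    return {s.get("id"): float(s.findtext("temp")) for s in root.findall("./station")}


def json_followers(json_text):
    """Map each user name to a follower count from a JSON payload."""
    data = json.loads(json_text)
    return {r["user"]: r["followers"] for r in data["results"]}


if __name__ == "__main__":
    print(html_table_rows(SAMPLE_HTML))
    print(xml_temperatures(SAMPLE_XML))
    print(json_followers(SAMPLE_JSON))
```

The sketch uses only the standard library so that it runs without extra installation; real pages are often messier than these samples and may call for a more forgiving HTML parser and full XPath support.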