Much of the discussion up to now has been about the potential of web metrics to provide insights into the web of documents: how many times has a particular document been mentioned on the web? How many times has a web page been visited? How many times has a wiki page been updated? It has been suggested, however, that we are moving from a web of documents to a web of data, in which ever more data is available in a machine-readable format. This chapter starts with a description of this ‘web of data’ and its many facets, before expanding on the implications of this increasingly structured data for the development of various kinds of web metrics. Finally, the chapter considers some of the existing tools for investigating this web of data.
The web of data
The term ‘web of data’ is used here to refer to data that is structured in machine-readable form and has been published openly on the web (Stuart, 2011). It is not separate from the existing web, but is rather a subset of it, and it includes a wide range of technologies: from a Google spreadsheet hosted in the cloud to an Excel spreadsheet held in an institutional repository; from the APIs providing access to data from Web 2.0 sites and services to web pages marked up with microformats, microdata or Resource Description Framework in Attributes (RDFa). Some of these technologies have already been introduced earlier in the book. Here we look at some of the technologies that contribute to the web of data in a little more detail, weighing the advantages and disadvantages of each, before moving on to discuss some of the implications of the web of data (in its various guises) for the development of web metrics.
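To make the idea of machine-readable markup concrete, the following is a minimal sketch in Python of how structured data embedded in a web page might be counted. It is an illustration under stated assumptions rather than a recommended tool: the page fragment, the class name ItempropCounter and the function count_itemprops are all invented here, and only microdata's itemprop attribute is examined, using the HTML parser in Python's standard library.

```python
from collections import Counter
from html.parser import HTMLParser


class ItempropCounter(HTMLParser):
    """Count microdata `itemprop` names found in an HTML document.

    Hypothetical helper written for this illustration only.
    """

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for
        # bare attributes such as `itemscope`.
        for name, value in attrs:
            if name == "itemprop" and value:
                # One itemprop attribute may hold several
                # space-separated property names.
                self.counts.update(value.split())


# Hypothetical page fragment, invented for illustration.
SAMPLE_HTML = """
<div itemscope itemtype="https://schema.org/Book">
  <span itemprop="name">Example Title</span>
  <span itemprop="author">A. N. Author</span>
  <span itemprop="author">B. Writer</span>
</div>
"""


def count_itemprops(html):
    """Return a dict mapping each itemprop name to its frequency."""
    parser = ItempropCounter()
    parser.feed(html)
    parser.close()
    return dict(parser.counts)


print(count_itemprops(SAMPLE_HTML))
```

Run against the sample fragment, the sketch reports one name property and two author properties. The same counting approach could in principle be pointed at real pages, although RDFa and microformats use different attributes and would need their own handling.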
As has been argued elsewhere, librarians have a long history of providing access to documents, and are ideally positioned to facilitate access to the increasingly large web of data (Stuart, 2011). Facilitating access to the web of data offers a new avenue for information services to develop as traditional information services rapidly evolve. While learning to program would understandably be a daunting prospect for many librarians, so-called massive open online courses (MOOCs) from sites such as Udacity (www.udacity.com) and Coursera (www.coursera.org) provide a simple way for librarians to expand their technical skills.