Book contents
- Frontmatter
- Contents
- Acknowledgements
- Glossary
- 1 Introduction
- 2 The development of web archiving
- 3 Selection
- 4 Collection methods
- 5 Quality assurance and cataloguing
- 6 Preservation
- 7 Delivery to users
- 8 Legal issues
- 9 Managing a web archiving programme
- 10 Future trends
- Appendix 1 Web archiving and preservation tools
- Appendix 2 Model permissions form
- Appendix 3 Model test script
- Appendix 4 Model issues log
- Appendix 5 Model job description
- Bibliography
- Index
- Digital Preservation
5 - Quality assurance and cataloguing
Published online by Cambridge University Press: 08 June 2018
Summary
Introduction
Quality assurance is an essential component of any web archiving programme. All collection methods involve some degree of automation, and it is therefore vital to ensure that the selection policy and collection list are actually being implemented successfully. The nature and degree of quality assurance that is required or practical will depend upon the needs and resources of the collecting agency, and the selection approaches and collection methods employed. In general, the greater the scale of collection undertaken, the more basic the level of quality assurance that can realistically be employed. There is therefore invariably a trade-off between the number of resources that can be collected and the quality control that can be applied to them, and a policy decision is required as to the minimum acceptable level of assurance.
Whatever the level of detail at which it is applied, any quality assurance process should follow the basic model illustrated in Figure 5.1 overleaf.
This chapter describes these processes in detail, and identifies some of the most commonly encountered problems and their possible solutions. It also discusses the cataloguing of archived websites. Some form of catalogue description is required in order to manage any archival collection, and make it accessible to users. Although cataloguing may take place at various stages in the web archiving process, it is included here because an important element of quality assurance is to ensure that all necessary cataloguing is accurate and complete.
Pre-collection testing
Pre-collection testing is concerned with the identification of potential issues that may affect the quality of collected content, in advance of its acquisition. It is clearly desirable to identify and resolve as many potential problems as possible prior to collection, thereby minimizing the extent of post-collection testing required. Pre-collection testing will typically include two approaches: resource analysis and test collection.
Resource analysis
This involves the manual or automated analysis of the target web resource, in order to identify the appropriate collection method and any issues that are likely to arise during collection. At the most basic level, it will be necessary to determine whether the website is static or dynamic in nature and, if the latter, whether all of the target resources are linked or only available through database queries.
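As an illustration of what an automated first pass at resource analysis might look like, the following is a minimal sketch in Python, not a method prescribed by this chapter. It scans a single HTML page and reports hints that the site may be dynamic: links carrying query strings, links to files with extensions commonly associated with server-side scripts, and the presence of forms. The names `analyse_page`, `LinkAndFormScanner` and `DYNAMIC_EXTENSIONS`, and the list of extensions itself, are assumptions made for this example; a real analysis would need to examine many pages and would still require manual judgement.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Extensions treated here as hints of server-side generation.
# This list is an assumption for illustration and is far from complete.
DYNAMIC_EXTENSIONS = (".php", ".asp", ".aspx", ".jsp", ".cgi")


class LinkAndFormScanner(HTMLParser):
    """Collects hyperlink targets and counts forms in one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.form_count = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "form":
            self.form_count += 1


def analyse_page(html, base_url):
    """Return a rough summary of static/dynamic indicators for one page."""
    scanner = LinkAndFormScanner()
    scanner.feed(html)

    dynamic_links = []
    for href in scanner.links:
        # Resolve relative links against the page URL before inspecting them.
        parsed = urlparse(urljoin(base_url, href))
        if parsed.query or parsed.path.lower().endswith(DYNAMIC_EXTENSIONS):
            dynamic_links.append(parsed.geturl())

    return {
        "links": len(scanner.links),
        "dynamic_links": dynamic_links,
        # Forms often indicate content reachable only through database queries.
        "forms": scanner.form_count,
        "looks_dynamic": bool(dynamic_links) or scanner.form_count > 0,
    }


if __name__ == "__main__":
    sample = (
        '<a href="about.html">About</a>'
        '<a href="/search.php?q=archive">Search</a>'
        '<form action="/find"></form>'
    )
    print(analyse_page(sample, "http://example.org/"))
```

A page that reports forms but few ordinary links is a candidate for closer manual inspection, since its content may be available only through database queries and not through links that a crawler can follow.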
- Type: Chapter
- Book: Archiving Websites: a practical guide for information management professionals, pp. 69-81
- Publisher: Facet
- Print publication year: 2006