The successful transmediation of books and documents through digitization requires the synergetic partnership of many professional figures that have what may sometimes appear as contrasting goals at heart. On one side, there are those who look after the physical objects and strive to preserve them for future generations, and on the other those involved in the digitization of the objects, the information that they contain, and the management of the digital data. These complementary activities are generally considered separate, and when the current literature addresses both fields, it does so strictly within technical reports and guidelines, concentrating on procedures, optimal workflow, standards, and technical metadata. In particular, more often than not, conservation is presented as ancillary to digitization, with the role of the conservator restricted to the preparation of items for scanning, with no input into the digital product, leading to misunderstanding and clashes of interests. Surveying a variety of projects and approaches to the challenging conservation-digitization balance and fostering a dialogue amongst practitioners, this book aims to demonstrate that a dialogue between apparently contrasting fields is not only possible but in fact desirable and fruitful. Only through the synergetic collaboration of all the people involved in the digitization process, conservators included, can digital cultural objects be generated that more fully represent the original objects and their materiality, encouraging and enabling new research and widening the horizons of scholarship.
THE VATICAN LIBRARY, ever since its foundation in the mid-fifteenth century, has had as its primary tasks the protection and preservation of its collections, while also making them freely available to scholars. These cardinal concepts have been easily adapted to the new technologies for greater dissemination and preservation of the Library's cultural heritage.
Already in the nineteenth century, faced with the concrete danger of being unable to stop the ongoing deterioration of palimpsests that had been treated with gallic acid in order to read the undertext, photographic techniques were selected as a means to at least fix in time the status quo of the text of the most damaged manuscripts. At the beginning of the twentieth century, the internal photographic laboratory was formed, leading to the creation of a new professional figure that would develop over time. Due to the tragic circumstances of the first half of the twentieth century in Europe, the creation of photographic surrogates acquired a supplementary function: should the originals be destroyed, the images, preserved in different venues, could at least preserve the textual transmission for future generations. With this purpose in mind, a safe storage facility for microfilm copies of most of the manuscripts was created across the Atlantic Ocean. In 1975, the Vatican Library organized the interlibrary conference Conservation et reproduction des manuscrits et imprimés anciens, once again bringing photographic surrogates of manuscripts and early printed books to the forefront. In 1994, in collaboration with IBM and the Pontifical Catholic University of Rio de Janeiro, the Library took its first steps towards digitization, scanning a sample of 150 manuscripts taken from its most famous holdings. In time, the photographic campaign was gradually extended to the rest of the collections. At the dawn of this new century, a new technology is expanding considerably the means to disseminate the Vatican Library's heritage: digitization and the publishing of the resulting images online.
A Pharaonic Endeavour: The Complete Digitization of the Manuscript Holdings
The Vatican Library has planned to digitize all the manuscripts in its collections. This is a pharaonic undertaking, as there are about 82,000 manuscripts to digitize, the majority of which are bound in codex format.
A PARTNERSHIP BETWEEN the British Library and the Qatar Foundation began in 2012. The Qatar Foundation is a state-run non-profit organization focusing on education, scientific research, and community development, and it was within the context of this partnership that the digitization project was launched.
During the first phase of this collaborative project, in October 2014, the Qatar Digital Library, a new bilingual online portal providing access to previously undigitized British Library archive materials relating to Gulf history and Arabic science manuscripts, was launched as one of the project's agreed deliverables.
The project was then extended into a second phase, until the end of 2018, with the goal of uploading over a million pages of historical material related to the Gulf, in addition to the 500,000 images (maps, photographs, manuscripts, and letters, as well as audio and video files) already produced during the partnership's first phase. The BL/QDL Partnership and the Digitisation Project are still running today, and plans have been made for further extensions.
The Planning Phases
From the beginning, the conservation strand of the project was rooted in the planning process. A new conservation studio, with a team of three conservators, was set up within the spaces rented by the Qatar Foundation on the sixth floor of the British Library's main building, where everyone employed on the project was based.
During the first few weeks, contacts were made with other institutions where digitization projects had previously been established. Great support, especially in scoping the types of materials and equipment needed, was given by the conservation team involved in digitization preparation at The National Archives in Kew, where Catt Thompson-Baum had been the conservation manager; she generously shared the knowledge she had acquired over years of dealing with digitization workflows and conservation preparation requirements.
A very comprehensive document (“Guidelines for Conservation”) detailing the project's policies and procedures, revised many times since then, was also compiled during the first few months of the project.
THE NATIONAL ARCHIVES is the repository of the official record of the government of England and Wales. It holds records from all government departments dating back to the eleventh century. There are government documents in other collections, from a time when ministers kept their papers when they left their government role, but from 1838, when the first Public Record Office Act was passed, records created for government business belonged to the government. From that date, and through successive revisions of the Act, The National Archives (TNA) and its previous incarnations have retained the official record of government.
The Public Records Act governs, among other things, the statutory functions of preservation of and access to government records, and digitization is a key tool for both, as well as a vital source of revenue. This chapter focuses on mass commercial digitization of collections, rather than individual record capture for researchers in our reading rooms.
Surrogate Production and Data Creation
TNA holds over 220 kilometres of government history at the time of writing, an amount that increases each year as more records are opened by the government under transparency legislation, which dictates that records are opened to the public after a twenty-year period. Records are stored both on-site and at an off-site facility. Any open record can be requested for digitization, and the collections are valuable for both academic and family history research.
Digitization is an expensive, resource-intensive process, with multiple complex work streams surrounding the image capture itself to create the final digital product. The record sets at The National Archives are rarely consistent; media, condition, size, colour, and format of individual documents can vary greatly even within the same volume or file. Additionally, client requirements vary, so solutions to each work stream must be tailored to each project.
The National Archives has created surrogates from original documents since the 1960s using microfilm, and began creating digitized records from its own collection of microfilm in the early 2000s, the largest such project being the 1901 census, released in January 2002.
SPECTRAL IMAGING FOR book conservation has been slow to develop, with many of its previous uses in the conservation realm being to look at underdrawings in paintings. The focus on palimpsests and hidden text tended to conceal the more diverse uses of spectral imaging for assisting conservation decision-making. There is a range of applications for spectral imaging, and collaborations between preservation and conservation professionals are a critical component of how specific institutions have integrated and begun to utilize spectral imaging as part of their conservation toolbox. New uses of these existing imaging technologies have been presented or instigated as questions arise from conservation, scholarly, conservation science, and curatorial colleagues. These questions include: how to track changes over time before discoloration or fading appears; how to assess conservation treatments as they are being developed so as not to cause harm in the long term; how to facilitate better understanding of construction techniques in medieval printing or cartography; non-invasive identification of pigments and inks; comparison of visibly similar inks, pigments, and handwriting; and, of course, recovery of redacted text.
Spectral imaging measures the reflectance of materials to assess their chemical composition. This is similar to remote sensing, in which spectral measurements taken through the atmosphere differentiate areas of interest within the region being imaged. The response from the surface of a material can be characterized by the percentage of incoming energy (illumination) it reflects at each wavelength across the electromagnetic spectrum. Its spectral reflectance curve is an essentially unchanging property of the material. An important factor for cultural heritage materials is that chemical change or environmental deterioration can modify this spectral response.
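As an illustration, the reflectance curve described above can be sketched numerically. All of the wavelengths and intensity values below are invented for the example; using a white calibration target as a stand-in for the incoming illumination is one common approach, assumed here:

```python
import numpy as np

# Wavelengths sampled across the visible and near-infrared range (nm).
# These sample points and all intensity values are illustrative only.
wavelengths = np.array([400, 450, 500, 550, 600, 650, 700, 850, 940])

# Energy reflected by a point on the object at each wavelength (arbitrary units)
reflected = np.array([12.0, 15.0, 22.0, 30.0, 41.0, 48.0, 52.0, 60.0, 61.0])

# Energy reflected by a white calibration target under the same illumination,
# which approximates the incoming illumination itself
reference = np.array([80.0, 82.0, 85.0, 86.0, 87.0, 88.0, 88.0, 90.0, 90.0])

# Percentage reflectance at each wavelength: this curve is the
# (essentially stable) spectral signature of the material
reflectance = 100.0 * reflected / reference

for wl, r in zip(wavelengths, reflectance):
    print(f"{wl} nm: {r:.1f}% reflectance")
```

A curve computed this way can then be compared against reference curves for known materials, which is what makes non-invasive pigment and ink comparison possible, though such a comparison suggests rather than definitively identifies a material.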
One of the challenges for conservation laboratories is how to integrate spectral imaging with digitization procedures, even though many of the types of images captured with spectral imaging, for example ultraviolet and infrared images, are already utilized for conservation purposes. Throughout this chapter, the term “spectral” imaging will be used rather than “multispectral” or “hyperspectral.” These terms refer to the number of spectral bands in an imaging system, with more than ten bands generally considered “hyper”; however, some colleagues reserve “hyper” for systems with contiguous bands, so, to be inclusive, the term spectral will be used.
BOOKBINDINGS HAVE LONG been the Cinderella of the bibliographical world, mostly ignored unless extensively decorated, and the reason most often given for this by cataloguers and bibliographers has been the absence of any consistent and recognized terminology with which to describe them, especially those bindings which have little or no decoration. There are many reasons why no such terminology had been created, but a lack of serious research, the confusion inherent in inherited and inconsistent terminologies, and a general lack of the expertise required to recognize different structures and materials were chief among them. This has not been helped by the antiquarian book trade, which has over the past century and a half developed its own highly idiosyncratic and inconsistent, if not actually inaccurate, terminologies. Traditional bookbinding terms in English, as they have come down to us, refer mostly to nineteenth-century binding practice, as the first bookbinding manual in English dates only from 1811, and the terms used are therefore not necessarily helpful in describing earlier bookbinding practices. The emergence after the disastrous floods in Florence in 1966 of the distinct discipline now known as book conservation made the creation of such comprehensive and consistent terminology essential, as recording the distinctive features of bookbindings and their condition was a necessary part of book conservation. A small number of book conservators went on to do further research into historical book structures, extending and refining the newly created terminology and giving precise meanings to traditional terms that had often been used very loosely up to that date. Unfortunately, the new terms coined in this process by different researchers were not themselves always consistent, with the inevitable risk of creating further confusion rather than reducing it. 
As, however, more extensive use was made of databases to record such details, the need for consistency in the form of a standardized thesaurus became ever more pressing.
The digitization of the data from the survey of the bound manuscripts and the early printed books in the library of the monastery of Saint Catherine on Mount Sinai provided the genesis of the Language of Bindings (LoB) thesaurus, as the database required an organized terminology.
IN ADDITION TO their contents, the value of manuscripts and printed books lies in the tradition of writing, decorating, or illustrating, and is also reflected in the unity of the object: its physical form and features, its material, structure, and binding, to name just a few. Hence the preservation of a broad spectrum of historical evidence claims high priority. This applies in particular to digitization projects, which are intended to play an important preservation role by protecting fragile or valuable originals from handling once digital surrogates are used to present their content.
Lessons from the Past: New Specifications
At the Herzog August Library Wolfenbüttel (HAB), the long-time use of a wide range of reformatting methods for the historic holdings has caused characteristic damage, with fatal “side effects,” ever since the era of analogue photography. As a consequence of books being forced open, for example, their sewing threads tore, bands broke, or covering materials burst at the joints. Detached boards, or even bookblocks completely split at the spine, were unfortunately not rare either.
This created the paradoxical situation that a measure of preservation, namely the creation of surrogates to protect the originals, partly proved to be destructive. Many years have passed since then. This experience and the analysis of the ensuing damage led to the formulation of a decisive specification:
Manuscripts and early printed books require that the reformatting technique should be adapted to the individual demands of the object, and not vice versa.
Further steps inevitably had to follow: on the one hand, the HAB encouraged the urgently needed development of state-of-the-art imaging tools and, on the other, it began to integrate an upstream process for assessing the physical condition of books and the impact that handling during the conversion process would have on them.
Against this background, the Wolfenbüttel Library has already withdrawn from the imaging techniques common in the era of analogue reproduction, in which, without exception, books would be opened to 180 degrees and fixed in this position with the aid of a glass plate. When a book is opened to 180 degrees, the laminar pressure against the glass plate is applied not only to the two pages being captured and the media used on the pages of the manuscript or print, but to the entire book and its mechanics.
BOOKS, AS MOVABLE objects (of any material) provided with surfaces on which to store and deliver information, are a technology that has evolved, taking different forms and shapes with its progress. From clay, bone, and wooden tablets, to cloth and bark-paper folded concertinas, papyrus scrolls, and parchment and paper codices, books have been part of our culture for millennia.
When we think of the physical aspects of books, as empirically demonstrated by an image search on the web for the keyword “book,” typically we expect to see a book in its codex form, as this is the form with which we are nowadays most familiar, to the point that we simply call them books. Books in codex format are defined as “a collection of sheets of any material, folded double and fastened together at the back or spine, and usually protected by covers.” Despite the variations due to different historical and geographical traditions, codices are characterized by their inner working structure, which accounts for the success of the format: their pages (generally arranged in gatherings) are linked together at the spine, allowing for quick and easy browsing of the content.
The Book as a Black Box
Although often collected exclusively for their content and decoration, books, as cultural objects, present the added value of technological and material data, preserved from another time and place. Like other tools and objects that are readily handled in our everyday life, books are, however, used “unconsciously,” and few actually consider, or even notice, their physical form beyond the fact that they work as they are supposed to: they convey information. In order to read a physical book, one does not need any knowledge of bookbinding.
In this sense, to borrow a concept from information science and actor–network theory (ANT), books in codex format are, to the typical user, black boxes. A black box, in these terms, is a technological artifact that, while appearing obvious to the ordinary observer, in the sense that its behaviour is perceived “as known and predicted independently of its context,” can also be regarded as a complex entity, the essence of which depends on a diverse system of techniques, materials, processes, and actions.
Like men, books have a soul and a body. With the soul, or the literary portion, we have nothing to do at present; the body, which is the outer frame or covering, and without which the inner would be unusable, is the special work of the binder.
(William Blades, The Enemies of Books, 1902, p. 96)
THE SUCCESSFUL TRANSMEDIATION of books and documents through digitization requires the synergetic partnership of many professional figures that have what may sometimes appear as conflicting goals at heart. On one side, there are those who look after the physical objects and strive to preserve them for future generations (conservators and curators), and on the other those involved in the digitization of the objects, the information that they contain, and the management of the digital data (digitization professionals and digital humanists). These complementary activities are generally considered separate, and when the current literature addresses both groups, it does so strictly within technical reports and guidelines, concentrating on procedures, optimal workflow, standards, and technical metadata. In particular, more often than not, conservation is presented as ancillary to digitization, with the role of the conservator restricted to the preparation of items for scanning, with no input into the digital product, and this leads to misunderstanding and clashes of interests.
Digitization projects have become increasingly crucial for memory institutions and for cultural heritage in general, and have been at the core of the field of digital humanities from its beginnings, with an inevitably profound influence on conservation practice since the early 2000s. Here, as hinted at in the title, we will be concerned with the digitization of books and documents. The fields involved in this diverse practice landscape, beyond the best-practice procedures set out in technical manuals, have largely distinct authorships and readerships that do not overlap. This book tries to fill this gap. It strives to do so by showcasing, on the one hand, the need to understand the informational content of books as objects and the role that conservators could have in the creation of broader digital products and tools, and, on the other, the transformative and transcendent value that digital surrogates can and should bring to the table.
THE LIBRARY at Wellcome Collection in London holds an important and distinctive collection of medieval Western manuscripts in Latin, Greek, and vernacular languages related to medicine and health. The collection, of parchment and paper items, includes not only bound codices but also unbound documents, folding calendars, and scrolls. The subject material ranges from theoretical medical texts and compendia of information about plants and animals, to works on alchemy, astrology, and magic, as well as collections of recipes and healing charms. This reflects the breadth of the understanding and practice of medicine in the Middle Ages: information about the natural world, magic, and astrology informed thinking about the human body and enabled medical practitioners to diagnose, prognosticate, and administer treatments.
The Library has approximately 335 Western manuscripts dating from the eleventh to the fifteenth centuries. In 2014, as part of its wider digitization program, the Library started digitizing this collection, aiming to digitize all manuscripts that were sufficiently robust to be imaged. The project is currently ongoing. The digitized manuscripts, freely accessible online, are creating a rich and detailed resource for historical researchers, conservators, heritage scientists, and others who are interested in these remarkable objects. Before digitization commenced, an item-by-item conservation survey of the collection was carried out to assess the suitability of each manuscript for digitization, especially with respect to its overall condition and the opening angles permitted by its binding. Pre-digitization surveys were subsequently standardized by the library digitization support team, and this project was in many ways a pioneering exemplar for special collections digitization at Wellcome Collection. This chapter will discuss the curatorial context for the digitization project, the pre-digitization survey, the post-digitization preservation and conservation work, and the research potential of the project.
Prior to digitization, work was undertaken to develop the framework within which the digitized manuscripts would be hosted. The platform for hosting digital content, now called the viewer, was at this time known as the player. The Library's medieval and early modern specialist worked with colleagues in the digital team to formulate a list of required enhancements to Goobi, the back-end system for ingesting and processing the images that appear in the viewer.
DIGITIZATION PROGRAMS ARE now commonplace throughout the cultural and heritage sector. Moving beyond routine digitization, however, there are opportunities to use advanced image-capture methods from the computational sciences to analyze primary historical sources. This relationship can depend on conservators and conservation scientists to provide insight and access, and to prioritize issues by highlighting how and why particular documents or artifacts may benefit from an advanced imaging approach (for example, the virtual flattening of the damaged Great Parchment Book). In addition, imaging scientists can develop best practices for the application of computational technologies within the cultural and heritage space, for instance allowing more efficient use of multispectral imaging to reduce the time and cost of capture for particular conservation applications, such as the identification of pigments. This chapter reflects on my twenty years researching and teaching digitization technologies in cultural heritage and developing advanced interdisciplinary projects that require teams with wide-ranging expertise. It indicates both the benefits and the issues that may arise when attempting advanced imaging of cultural heritage objects, and stresses the role of the digital humanist, or the heritage scientist, in bridging disciplinary divides. It is hoped that this will demonstrate the kind of research success that can emerge when links are forged between the conservation and advanced digitization communities, and encourage others to build links that use advanced digital imaging as a research approach.
Digital Imaging and Cultural Heritage: A Personal Journey
I first saw digitization technologies in action in 1993. A friend's uncle demonstrated an early portable scanner while digitizing family photo albums. With a mind duly blown, I hunted out facilities at the University of Glasgow, where I had recently started my undergraduate degree in History of Art and English Literature. My flatmate, studying Physics, introduced me to the World Wide Web just as it was accelerating, and I rapidly became an avid user of Usenet alt.rec groups and GeoCities, on slow, bulky public University Library computers that heralded a networked, utopian future. Manually updated, lovingly curated websites such as Alan Liu's Voice of the Shuttle and Ross Scaife's Diotima indicated how digital cultural heritage would fit into this new digital world order.
MULTISPECTRAL IMAGING HAS proven to be a critical tool for the study of unseen ancient texts, sketches, and other features in cultural heritage objects. For decades, libraries and museums have used non-visible light to examine manuscripts and paintings. Multispectral imaging captures data across a significant part of the electromagnetic spectrum, including wavelengths beyond the visible range. This enables the extraction of additional information that the human eye fails to capture. The technology was originally developed for space-based imaging and surveillance, to penetrate foliage, clouds, and camouflage, and for astronomy, to study celestial objects. Following a decade and a half of development for cultural heritage projects, this same technology is now being used to enhance unseen features on fragile cultural heritage objects.
To conduct narrowband multispectral imaging, multiple images of an object are taken at different wavelengths of light, resulting in a digital “stack” or “cube” of images. Computer algorithms are then used to digitally combine the images and enhance particular characteristics of the imaged area. These processed images can reveal the faint traces of erased undertext, artifacts such as erasures and changes, and residues and areas of concern for preservation. This is particularly useful for revealing texts on palimpsests, an early form of recycling used when fresh parchment was not available: words previously written on parchment folios were scraped off or erased with pumice stone and acidic solutions, and the folios were then written over with iron gall ink. Narrowband multispectral imaging of palimpsests and other objects across different institutions and projects has provided conservators with a wealth of information, but with varied conservation support for the actual multispectral imaging and the use of the resulting images.
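As a minimal sketch of what such processing can look like, the following combines a synthetic image “stack” using principal component analysis, one commonly used enhancement technique. The band count, noise levels, and undertext signal here are all invented for the example, not drawn from any real imaging project:

```python
import numpy as np

rng = np.random.default_rng(0)
bands, h, w = 6, 32, 32                 # 6 wavelengths, 32x32 pixels (illustrative)

# Synthetic cube: parchment background plus a faint "erased" line that
# reflects slightly differently in the two longest-wavelength bands
cube = rng.normal(0.5, 0.05, (bands, h, w))
undertext = np.zeros((h, w))
undertext[10:12, 4:28] = 1.0
cube[4] += 0.08 * undertext
cube[5] += 0.10 * undertext

# Flatten to (pixels x bands) and run PCA via eigendecomposition of the
# band-to-band covariance matrix
X = cube.reshape(bands, -1).T           # shape (1024, 6)
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)          # shape (6, 6)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order

# Project every pixel onto the leading component and reshape back into
# an image; correlated variance (such as the undertext) tends to
# concentrate in the first few components
pc1 = (Xc @ eigvecs[:, -1]).reshape(h, w)
print("PC1 image shape:", pc1.shape)
```

In practice, imaging teams apply a range of statistical and spectral methods beyond PCA, and the processed images are evaluated by scholars and conservators rather than read off automatically.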
Fortunately, one of the pioneering cultural heritage narrowband multispectral imaging programs incorporated close collaboration with conservation professionals from the beginning to ensure the safe preparation and imaging of the object. The Archimedes Palimpsest Program incorporated best practices in conservation throughout its program planning, work processes, and development of the narrowband illumination. This then carried over to many subsequent imaging projects, setting the standard for multispectral imaging work processes geared toward appropriate handling and exposure of fragile materials in accordance with conservation guidelines.
ON THE NIGHT of September 2, 2018, the National Museum of Brazil was destroyed by a tragic fire. Wired Magazine commented on the event, stating that “all those artifacts could have been systematically backed up over the years with photographs, scans” and that “the academic community has not yet fully embraced the importance of archiving,” causing an uproar on the Twittersphere because of the naïvety of such an accusation. The costs of the mass digitization of all human culture are naturally prohibitive, given the time it would take, the funds needed for the reformatting process, and the maintenance of the digital data. Moreover, this also assumes that digitization creates a sort of virtual clone of the original item that can be backed up on the cloud, like the photographs taken on mobile phones. To digitize is not to replicate an artifact in all its nature, and, as is well understood by most today, digitization does not equal preservation.
Digitization does, however, capture information and create additional data sources that complement (and transcend) the originals. These become supplementary objects of study that scholars can use in tandem with the objects they capture, in turn informing research and work on the latter. On April 15, 2019, a fire beneath the roof of Notre-Dame cathedral in Paris destroyed the covering and the spire, damaging the windows and the vaulted ceiling of the medieval church. In 2010, however, the late Andrew Tallon, art professor at Vassar College (New York), aided by Paul S. Blaer, senior lecturer in the Department of Computer Science at Columbia University, had painstakingly captured the whole architectural structure, piece by piece, inside and outside, with a laser scanner (Leica ScanStation C10), collecting over one billion data points, accounting for around one terabyte of data. John Ochsendorf, structural engineer and historian of construction with an interest in masonry mechanics, has described the data collected by Tallon and Blaer as essential to understanding the built geometry of the structure and to its reconstruction. Drawings and diagrams, in fact, would not capture all the imperfections and are not as accurate as laser scans, with their millimetric precision.
THE GREAT PARCHMENT Book project represents a very good example of a successful relationship between different disciplines and professions. The research brought together archivists, a paleographer, and conservators from London Metropolitan Archives (LMA) and experts in digital technologies from the University College London (UCL) Department of Computer Science and the UCL Centre for Digital Humanities. This joint effort supported a four-year Engineering Doctorate (EngD) in the UCL Virtual Environments, Imaging, and Visualization program, funded by the Engineering and Physical Sciences Research Council (EPSRC) and LMA. The conservation work was funded by the National Manuscript Conservation Trust.
The aim was to create a digital copy that revealed the content of a manuscript whose text was illegible due to its fragile physical condition. The book (LMA reference CLA/049/EM/02/018) was unavailable for access due to the extreme fragility of its support. Traditional conservation alone could not return the document to a condition in which it could be handled acceptably and safely. Only the work of the UCL digitization team revealed the very useful information that the document held.
Today the volume is available for access online without the need to retrieve and handle the document unnecessarily.
The Great Parchment Book, owned by the Honourable The Irish Society, was commissioned in 1639 by Charles I with the aim of surveying all the estates in Derry~Londonderry managed by the City of London through The Irish Society and the City of London livery companies. This survey was compiled at a time of great political and social change and provides important information about the role of the City of London in the Protestant colonization and administration of Ulster, as well as about its population.
Since 1639 the book has been held in London. In February 1786, a fire in the Chamber of London at the Guildhall in the City of London destroyed most of the early records of The Irish Society, and only very few of the seventeenth-century documents remained. Among those which survived is the Great Parchment Book.
The volume was severely damaged by fire and subsequently by water, leaving it so distorted and fragile that for over 200 years it was unavailable for research.