The Bodleian Libraries’ collections are extraordinary and significant, both from a scholarly point of view and as material whose historic and aesthetic richness holds value for non-academic users. Each year the Libraries serve more than 65,000 readers, over 40% of them from beyond the University, while their critically acclaimed exhibitions attract almost 100,000 visitors. In an effort to make portions of our collections open to a wide variety of users from around the world for learning, teaching and research, the Bodleian Libraries have been digitizing library content for nearly twenty years. The result is over 200,000 freely available digital objects and at least another 1.5 million images awaiting release.
As at many academic libraries, though, our freely available digital collections have been placed online in project-driven websites, with content stored in discrete ‘silos’, each with its own metadata format and user interface, and with no common search interface enabling users to discover content or navigate across collections. Some of our collections are linked at portal pages, but each collection remains, with a few exceptions, isolated and difficult to search. In addition, only a few collections offer a machine-readable interface, or any way to link their data with similar data in other Bodleian collections or with collections at other institutions.
Digital.Bodleian aims to solve these problems by:
All of these tasks have been carried out using standards-compliant file formats and methods and with a view to future expansion, scalability and robustness.
The Digital.Bodleian project was initially funded by the JISC as part of the Resource Discovery programme, and began in November 2011.
The advantage iNQUIRE has for the Bodleian over other similar products is that it will sit on top of our image repository/digital asset management system, indexing metadata and displaying images directly, rather than storing the images or metadata in a closed or application-specific system.
While the iNQUIRE front-end is a commercial product developed by Armadillo Systems, the back-end makes use of open-source technologies for search, indexing, and image display, and Microsoft’s SQL Server for application-specific data.
The Bodleian already has in-house experience with Solr within BDLSS, so we expect to be able to make use of the Digital.Bodleian Solr index in other applications. Within Solr we are using a Dublin Core (DC)-based schema for searching and indexing, and we are writing data handlers to enable the ingest of metadata from our legacy image collections into this Solr schema.
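As an illustration of what such a data handler might do, here is a minimal Python sketch that maps one legacy record onto a flat DC-style document and serialises a batch for indexing. It is not the project's code: the legacy field names, the `LEGACY_TO_DC` mapping, the `collection` field and both function names are assumptions made for the example, and the real schema may differ.

```python
import json

# Hypothetical mapping from one legacy collection's field names to
# Dublin Core-style Solr field names. Neither side is the project's real schema.
LEGACY_TO_DC = {
    "shelfmark": "identifier",
    "caption": "title",
    "artist": "creator",
    "date_made": "date",
    "keywords": "subject",
}


def to_solr_doc(legacy_record, collection):
    """Convert one legacy record (a dict) into a flat DC-style Solr document."""
    doc = {"collection": collection}  # assumed field recording the source collection
    for legacy_field, dc_field in LEGACY_TO_DC.items():
        value = legacy_record.get(legacy_field)
        if value in (None, "", []):
            continue  # skip fields this legacy record does not populate
        doc[dc_field] = value
    return doc


def to_update_payload(docs):
    """Serialise documents as a JSON array of documents for a Solr update request."""
    return json.dumps(list(docs), ensure_ascii=False)
```

Each legacy collection would need its own mapping table, which keeps the per-collection differences in data rather than in code. Current Solr releases accept a JSON array of documents on the update handler; older releases may expect a different request format.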
N.B. While we are using a relatively sparse DC-based schema to drive the faceting and searching within Solr, the original source metadata, which may be richer, will still be linked from within the web interface and will be searched in full by the index.
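To make the faceting idea concrete, the following Python sketch builds a faceted query URL using Solr's standard `q`, `fq`, `facet` and `facet.field` request parameters. The base URL, the facet field names and the function name `build_facet_query` are assumptions for illustration and are not taken from the Digital.Bodleian configuration.

```python
from urllib.parse import urlencode


def build_facet_query(base_url, text, facet_fields, filters=None, rows=10):
    """Build a Solr select URL that searches for `text` and facets on DC-style fields.

    `filters` maps a field name to a single value the user has already
    selected in the interface; each becomes an `fq` filter query.
    """
    params = [
        ("q", text),
        ("rows", str(rows)),
        ("wt", "json"),
        ("facet", "true"),
        ("facet.mincount", "1"),  # omit facet values with no matching documents
    ]
    for field in facet_fields:
        params.append(("facet.field", field))
    for field, value in (filters or {}).items():
        # Escape backslashes and double quotes so the value stays one quoted phrase.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        params.append(("fq", f'{field}:"{escaped}"'))
    return f"{base_url}/select?{urlencode(params)}"
```

For example, `build_facet_query("http://localhost:8983/solr/demo", "psalter", ["creator", "date"], {"collection": "demo"})` would search for "psalter", restrict results to one collection, and ask Solr for value counts on the two sparse DC fields; the host and core name here are placeholders.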
Existing image collections are being converted (using Kakadu) from their existing formats, largely a mixture of Group 4 compressed bitonal TIFFs and uncompressed 24-bit colour TIFFs, to lossless JPEG 2000 files. Djatoka then serves these JPEG 2000 images as part of the iNQUIRE web interface.
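The two steps can be sketched roughly as follows, in Python. The first function assembles a call to Kakadu's `kdu_compress` tool with reversible (lossless) coding switched on; the second builds a Djatoka OpenURL request for a region of an image. The file names, resolver address, image identifier and function names are hypothetical, and the project's actual encoding settings are not given in the text, so only the lossless option is shown.

```python
from urllib.parse import urlencode


def kakadu_lossless_command(src_tiff, dest_jp2):
    """Return an argument list for a lossless TIFF to JPEG 2000 conversion.

    Creversible=yes selects the reversible wavelet transform, which is what
    makes the output lossless. Pass the list to subprocess.run() to execute it.
    """
    return ["kdu_compress", "-i", src_tiff, "-o", dest_jp2, "Creversible=yes"]


def djatoka_region_url(resolver, image_id, level=3, region=None, fmt="image/jpeg"):
    """Build a Djatoka getRegion request.

    `level` is the resolution level to extract; `region` is an optional
    (y, x, height, width) tuple, the order Djatoka expects.
    """
    params = [
        ("url_ver", "Z39.88-2004"),
        ("rft_id", image_id),
        ("svc_id", "info:lanl-repo/svc/getRegion"),
        ("svc_val_fmt", "info:ofi/fmt:kev:mtx:jpeg2000"),
        ("svc.format", fmt),
        ("svc.level", str(level)),
        ("svc.rotate", "0"),
    ]
    if region is not None:
        y, x, height, width = region
        params.append(("svc.region", f"{y},{x},{height},{width}"))
    return f"{resolver}?{urlencode(params)}"
```

A viewer would typically request many small regions at a chosen level and tile them in the browser, which is why the masters can stay as single lossless files while the derivatives sent to users are ordinary JPEGs.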
iNQUIRE makes use of MS SQL Server to store data generated in the iNQUIRE interface such as user collections, tags, and comments, and also for the management and administration of the interface.
N.B. Bibliographic metadata is retained on our systems in open formats. Solr indexes the metadata and iNQUIRE provides links back to this bibliographic metadata. MS SQL is used only for iNQUIRE application-specific content.
Please see the scan below for a hand-drawn sketch (from an internal Bodleian meeting) of the Digital.Bodleian architecture.
Matthew McGrattan, Digitization Service Manager and Imaging Specialist
Christine Madsen, Head of Digital Programmes
Yvonne Aburrow, Data Librarian
Monica Messaggi Kaya, DAMS Software Engineer
Armadillo Systems iNQUIRE http://www.inquireresearch.co.uk