4

Discovery systems, layers and tools, and the role of the electronic resources librarian

Abstract:

Until recently, libraries were the primary places where people went to find and retrieve information resources. In the late 1990s and early 2000s, however, commercial search engines such as Google quickly became the standard to which library search tools were compared, and it was often a painful comparison for libraries. In the past few years libraries have been able to provide users with search tools that begin to approximate the ease and immediate gratification of Google, and these tools have changed the dynamic of technical services. In the context of discovery layers, all technical services librarians must learn to think like electronic resources librarians.

Key words

discovery system

discovery layer

federated searching

OPAC

web-scale discovery

A collection is only as good as the systems to access it. (Hirshon, 1991: 57)

Libraries have historically been the place for the discovery of information resources, but this position was somewhat taken for granted for many years. There was not a great deal of competition, so there was little pressure to innovate past the point of initial automation and the creation of the OPAC. When Google and other search engines took on information retrieval, however, the game changed. Users flocked to these resources, and libraries found they were far behind in usability and ease of access.

In addition to the library systems office, staff working with electronic resource management were particularly well situated to take on this challenge. Facilitating discovery in libraries requires knowledge of the collections, the metadata used to describe them and the technology that supports them, and electronic resource librarians had been steadily developing expertise in all of these areas. Furthermore, those working with electronic resources had not relied on the catalog for some time. Staff became proficient in systems that facilitated access through a host of alternative discovery points other than the OPAC, such as A–Z e-journal and database listings, and tools that linked disparate databases, such as OpenURL resolvers and metasearch systems, in an attempt to guide users to resources. Full-text holdings as part of packages or collections also required librarians to learn how to do batch updates and large-scale data loads – skills that would become important in constructing and managing a system intended to cover all that a library can provide.

This experience with alternate discovery routes and data management has positioned electronic resource librarians optimally to play a significant role in investigating, implementing and managing the new discovery environments that now deeply interest libraries. Discovery environments, though, are intended to speak to all formats, so for the library to be most effective in providing such an environment, these skills must be held across technical services – not just by those well versed in ERM.

The evolution of the language of discovery

Searches for information can be divided into two main classes: known-item and subject-based. Either the user knows the work he or she is looking for, or he or she only has a sense of the general topic of the work and does not yet know the title, author, etc. It is this second type – the subject-based search – that rests most comfortably with the concept of “discovery”. This term has emerged as a critical part of the language of library search tools.

Even when the field of electronic information retrieval was emerging in the 1960s, the division from libraries was apparent. As Vickery (1965: 179) states in his monograph On Retrieval System Theory, “Information retrieval is traditionally a library problem. However, few of the terms in the index to this book are to be found in glossaries of librarianship.” As the structures and technologies of electronic information retrieval have evolved, so has the language used to talk about them. Some of the words that have been used in sometimes conflicting ways are next-generation catalogs, metasearch, federated search and discovery layers, interfaces and systems.

Although the term “metasearch” has often been used more broadly, for library searching it has become most directly associated with federated search. Likewise, “discovery”, a broad term on its own, now often implies a central index. Luther and Kelly (2011: 66), for example, distinguished metasearch tools, which “were the first attempts to meet this user expectation by querying each of the databases a library subscribed to and returning a single set of results”, from discovery, “which is modeled on the Google-style approach of building and then searching a unified index of available resources, instead of searching each database individually”.

Breeding (2010: 31) highlights the shift in language to “discovery”, focusing on the breadth of content as the distinguishing factor:

Initially, these new tools were called next-generation library catalogs, but now I prefer to call them discovery interfaces. They aim to provide access to all aspects of library collections, not just those managed in the traditional library catalog, which is limited to the content managed by the integrated library system.

Breeding (ibid.: 34) distinguishes the central index products further with the term “web-scale”. “We use the term web-scale to characterize the discovery platforms that aim to manage access through a single index to all library content to the same extent that search engines address content on the web.” Vaughan (2011: 6) also attaches expectations of performance to this kind of tool: “Web scale discovery can be considered a service capable of searching across a vast range of pre-harvested and indexed content quickly and seamlessly.”

“Discovery” for libraries, then, implies more than a search of the library catalog – it is at least a simultaneous search of the catalog and other local library databases such as institutional repositories; and web-scale implies even fuller breadth of content, mixing at least book and article-level records, as well as demonstrating simplicity and speed. The path to web-scale discovery and its trends in development are detailed below.

The OPAC

The initial OPACs, developed in the 1960s and 1970s and met with gradual acceptance by libraries in the 1970s, mirrored the structure of the card catalog by having the same points of access and bibliographic information. These OPACs eventually came to be called “first generation”. They were strict with users’ requests – requiring exact matches – and subject-based searches were reliant on the Library of Congress subject headings. OPACs of the later 1970s and into the 1980s, “second-generation” OPACs, brought the improvements of Boolean and keyword searching (Large and Beheshti, 1997).

The problem, and it is one that has significantly damaged the view of the library among its users, is that most OPACs stopped development here – at the second-generation state. There was interest in next- or third-generation OPACs that allowed for partial matches and relevance ranking even in the 1980s (Antelman et al., 2006; Large and Beheshti, 1997), but these new concepts were not often incorporated into the OPACs of the major ILS vendors.

What happened? Market stabilization. The OPAC was tied to the ILS, and it was difficult for libraries to change from one ILS to another. Once the ILS reached a high level of adoption by libraries, there was not enough competition to encourage vendors to develop the OPAC aggressively. As said by Calhoun (2006: 41) in her report on catalog integration to the Library of Congress, “There are few vendors, poorly capitalized, and libraries are a small and demanding market with, relatively speaking, little to invest in new ventures.” Libraries dream big but have shallow pockets, and when libraries have budget problems, so do the ILS vendors.

This is not to say that gradual improvements were not made in library search in this time. A great deal of attention was given to improving subject-based searching through enhancing records in the 1980s, and this continues today. There was also improvement to the look and feel of the interface of the OPAC, which is not insignificant to the user experience.

Overall, however, OPAC technology was at a standstill, causing a great deal of frustration for both librarians and users. Borgman (1986) wrote an article entitled “Why are online catalogs hard to use?”, but ten years later was compelled to write a follow-up, “Why are online catalogs still hard to use?”, stating “While user input is simpler and screen displays are much clearer and more attractive, the basic functionality of online catalogs has changed little since the late 1980s” (Borgman, 1996: 493). This was not much different even after another ten years: Antelman et al. (2006: 128) reported that “Library catalogs have represented stagnant technology for close to twenty years” and “the catalog has become for many students a call-number lookup system, with resource discovery happening elsewhere”. Calhoun (2006: 9–10) called the OPAC a “successful product”, but one that “has passed through a life cycle” and currently is “long on problems and short on unique benefits for users”. Surprisingly, even in a time of tremendous growth of search capability in the larger commercial world, library catalogs had not substantially improved their functionality for decades.

Catalog overlays

The key to improved discovery of catalog content was to separate the back end from the front end, or the interface from the ILS. According to Breeding (2010: 31–2):

One of the seminal breakthroughs in library automation involves the separation of resource management from resource discovery… The separation from automation… enables a more rapid, user-focused development strategy. Library management systems provide for the requirements of library personnel; discovery products serve library users.

Once this division had been realized, a new and exciting tool started to emerge in the mid-2000s: the catalog overlay. It used the ILS data, but was not tied to the ILS. Libraries and library vendors partnered with vendors from outside the library world such as MediaLab Solutions and Endeca, and used the rich MARC format to build search engines and interfaces that included faceted browsing of results, relevance ranking and even virtual shelves where the user could browse by call number as if he or she was in the physical stacks. Improving the catalog became an “infectious endeavor” (Breeding, 2006: 1).

One of the significant experiments in the world of catalog overlays was North Carolina State University Libraries’ application of the Endeca Information Access Platform to its ILS data, made live in January 2006. Replacing the ILS keyword search engine with search software that had been developed for the commercial web environment opened up a whole new world of browsing and relevance ranking, improving subject-based searching for the user tremendously. It also represented smarter use of existing data. “The Endeca-powered catalog… leverages the ‘ignored’ controlled vocabulary present in the bibliographic records – subject headings and classification numbers – to aid in improving topical searching” (Antelman et al., 2006: 130).

Other libraries focused on open source solutions, dedicating a great deal of local staff time to the development of new tools. Villanova University, for example, developed VuFind as a replacement for its OPAC and to be a single search box for repository and local database material, and released it to users in August 2008. Open source development of discovery tools – by libraries, for libraries – was meant to be both a cost saving (at least in cash, although staff costs can be high) and representative of partnership. “Lucia’s [the Villanova university librarian] intellectual argument for open source in libraries complements a more practical one: libraries working together can create better software and systems than individual libraries can afford to create on their own” (Houser, 2009: 95).

An indicator of the success of VuFind was the contributions from developers in other libraries. As Houser (ibid.: 97) said, “This was real validation: the software was of interest to other institutions and good enough to justify an investment in testing and development time.” This project also showed the intense interest libraries had in these products. The National Library of Australia, for example, was actually the first to implement Villanova’s VuFind, even before Villanova itself (ibid.: 93). Other significant open source discovery projects have been Blacklight, developed at the University of Virginia, and the eXtensible Catalog (XC) Project at the University of Rochester. The XC Project was not just about creating an effective search tool; it was also about “informing the future development of discovery metadata for libraries” (Bowen, 2008: 6).

Catalog overlays, unfortunately, also revealed the ugly deficiencies of the catalog records more than the traditional OPAC ever had. Libraries had always known that the means of discovery was only as good as the data beneath it, but the facets built from relevant hits brought misspelled authors’ names and variations in subject headings to the surface and made them obvious. Overlays also were not enough on their own. Google had shown that seemingly everything could be in one place, and this became what users wanted and expected. Due to the challenges inherent in incorporating electronic resources into the ILS, many catalogs represented only a small proportion of their libraries’ electronic resources. The growth of investment in electronic collections, combined with the absence of these collections from the catalog, put increased pressure on libraries to find another discovery solution for their users.

Federated searching

The first products to begin to approximate a holistic search for library users were federated search systems, which started to emerge in the late 1990s. Through one interface and one query, users could search the library catalog and multiple databases. The system would send the search out to these other sources and wait for a response, then compile and organize the results for review by the user.
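
The basic mechanics can be sketched in a few lines of code. The sketch below is illustrative only – the search endpoints, the JSON result format and the scoring field are all invented for the example – but it captures the fan-out-and-wait pattern that defines federated search.

```python
"""A minimal sketch of federated search fan-out. The endpoints and the
result format are hypothetical; real systems speak Z39.50 or
vendor-specific gateways rather than tidy JSON APIs."""
import asyncio
import aiohttp

# Hypothetical targets: each maps a source name to a JSON search endpoint.
TARGETS = {
    "catalog": "https://catalog.example.edu/search",
    "database_a": "https://a.example.com/api/search",
    "database_b": "https://b.example.com/api/search",
}

async def query_target(session, name, url, terms):
    """Send the user's query to one remote source and tag its results."""
    try:
        async with session.get(url, params={"q": terms},
                               timeout=aiohttp.ClientTimeout(total=10)) as resp:
            payload = await resp.json()
    except (aiohttp.ClientError, asyncio.TimeoutError):
        return []  # a slow or failed source simply contributes nothing
    return [dict(hit, source=name) for hit in payload.get("results", [])]

async def federated_search(terms):
    """Fan the query out to every source, wait, then merge the results."""
    async with aiohttp.ClientSession() as session:
        tasks = [query_target(session, name, url, terms)
                 for name, url in TARGETS.items()]
        result_sets = await asyncio.gather(*tasks)
    merged = [hit for hits in result_sets for hit in hits]
    # Naive cross-source "relevance": each source ranks with its own
    # opaque algorithm, which is why merged ranking is so limited.
    merged.sort(key=lambda hit: hit.get("score", 0), reverse=True)
    return merged

if __name__ == "__main__":
    for hit in asyncio.run(federated_search("information retrieval")):
        print(hit["source"], "-", hit.get("title"))
```

The wait-for-the-slowest-source pattern visible in the gather step is exactly why federated search felt slow: the user sees nothing complete until the last target responds or times out.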

These federated search products were inevitably compared to Google products. Some authors did direct comparisons, such as Cooke and Donlan’s (2008) study of the relative performance of Serials Solutions’s Central Search, Microsoft’s Live Search Academic and Google Scholar, and others wanted their federated search actually powered by Google. The University of Nevada at Reno, for example, experimented with using the Google Search Appliance as a federated search utility in 2004. This was a large endeavor and required partnership between Google, the library and a test vendor (EBSCO, with sample Academic Search Premier records), but it ultimately faced technical and interoperability challenges (Taylor, 2006).

While in theory federated search was a good solution, in practice these systems were (and still are) often slow and the relevance ranking limited and incomplete. Federated search on its own without a central index was for many a bump in the road toward providing an acceptable discovery experience for library users.

Web-scale discovery

If the basic Google search had libraries alarmed about the flight of their users, Google Scholar was a direct hit. As Marshall Breeding (2005: 27) said, “The entry of Google into the realm of scholarly information encroaches deeply into territory that librarians once considered their own.” York’s (2006: 119) analysis of library guides about Google Scholar reveals the high emotion that this new service brought to librarians: “the range of tone that characterizes the Scholar library guides, from wildly defensive to surprisingly embracive, often reveals the unvarnished fears and hopes of librarians as they took their first look at this new player”.

What was clear was that libraries had to step up the pace of search development, and fast, if they wanted to remain a primary search destination for researchers. Google Scholar in particular tipped Breeding (2005: 27), one of the closest eyes on the development of library search capabilities, away from federated searching:

The recent debut of Google Scholar has convinced me that the architecture that underlies the traditional library approach toward search and retrieval cannot succeed as the sole system that librarians rely on to simultaneously search multiple electronic resources. It now seems clear to me that the current strategy of metasearch that depends on live connections casting queries to multiple remote information sources cannot stand up to search systems based on centralized indexes that were created in advance based on harvested content.

Although the concept had been there for some time (Roy Tennant, as just one example, spoke in 1998 about a central index as a solution to the proliferation of digital libraries), the first vendor to approach web-scale discovery was OCLC as part of WorldCat Local, a customized version of WorldCat providing libraries with an OPAC alternative through integration with local library catalog records and services. In 2008, shortly after WorldCat Local moved to the production phase, OCLC announced partnerships that enriched the WorldCat Local search platform with article metadata from H.W. Wilson, the National Library of Medicine, the Modern Language Association, the British Library and its own ArticleFirst database, all retrieved from a central index. Soon thereafter, OCLC embarked on aggressive partnerships with publishers and content producers to increase article-level metadata in WorldCat Local. At the same time, it sold off content-rich assets, such as NetLibrary, to other library vendors and discontinued its cooperative licensing program, positioning itself as a service and system provider rather than a content producer in order to support web-scale development in both discovery and back-end management solutions.

Other library vendors followed suit in building indexed solutions; Serials Solutions, for example, launched Summon as a stand-alone central index in 2009. An interesting aspect of Summon’s development was that it did not incorporate the Serials Solutions federated search or discovery layer solutions, 360 Search or AquaBrowser, instead relying on a whole new approach using central index technology. Summon took the interface advances made by catalog overlays and metasearch tools and applied them to a more powerful, larger search with a great deal of attention given to the infrastructure behind the interface. In contrast, another serials management vendor, EBSCO, built its discovery system off its already successful search interface, EBSCOHost, by adding catalog and other local metadata to a unified index of its own content. ILS vendors have also created competitive discovery products. Ex Libris, for example, launched Primo Central Index from its existing federated search product. III, acknowledging the challenge (or impossibility) of creating one complete index, used a different approach and built Encore Synergy using live web connections between collections and services.

Web-scale discovery systems are largely about the breadth of content in their indexes. They have books from the library’s local catalog, records from the institutional repository and digital collections, article-level records for immense numbers of journals and book content that extends far beyond the library’s individual holdings. The overlap factor – how much of an individual library’s holdings the system can cover – can become a major issue in making a decision about which web-scale discovery vendor to choose, as can how deeply indexed those resources are. An exciting development has been the matching of mass digitization projects such as HathiTrust to locally held book titles. Using the digital full text of a book for discovery but presenting the library’s book as the result demonstrates how web-scale discovery can truly change the nature of navigating a library’s collection. These discovery systems, however, are part of the evolution from federated search and often continue to maintain that function to some degree. Due to the proprietary nature of content and indexing, this will unfortunately be necessary for the foreseeable future and might even worsen as competing products, especially ones tied to specific content producers, want to distinguish themselves from their competitors.

Because of these issues, one of the key distinctions made in the web-scale discovery world is whether or not the vendor is “content-neutral”. A number of systems are developed by vendors that also sell content, creating an interesting mess of partnerships amid stiff competition. Vendors and publishers can decide to use XML in proprietary ways or charge fees for access, if they agree to partner at all. Content-neutral players such as Ex Libris with Primo or OCLC with WorldCat Local might be able to say they have no dog in the fight, but they also do not have as much to put forward in a negotiation.

In addition to being about content, these systems are interfaces and connectors to local library services. They are expected to be highly capable discovery interfaces for users, with suggestions for alternate searches, partial-match hits, effective relevance and faceted browsing that allows users to narrow their search by subject, location, format and much more. They are expected to communicate with library services, displaying item availability and (hopefully) integrating ILS functions such as renewals. Development is now so fast for the products present in the market that Vaughan (2011: 11) in his report about web-scale discovery services did not even include a comparison chart of features, saying, “Given that things are changing so rapidly with these tools, and given the reality that quite a lot of local configuration options are available to each library customer, a comparison matrix did not seem appropriate.”

With the power of APIs and data management, it is important to note that development to enhance the effects of these commercial products is also taking place in libraries. In an article entitled “Hacking Summon”, for example, Klein (2010) describes how Oregon State University Libraries made improvements to location code display by consolidating the codes represented in Summon to a more concise and understandable list and adding a service between the Summon Availability API and the OPAC that summarized serials holdings and pushed online access links to the top.
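
In outline, the kind of intervention Klein describes is a small mapping-and-reordering service sitting between the availability API and the public display. The sketch below is a hedged approximation rather than Oregon State’s actual code: the location codes, labels and holdings format are invented for illustration.

```python
"""Hedged sketch of the middleware pattern described in "Hacking
Summon": collapse verbose ILS location codes into a concise display
list and push online-access links to the top. All codes, labels and
the holdings format are hypothetical."""

# Hypothetical mapping from raw ILS location codes to public labels;
# several raw codes can collapse into one user-facing label.
LOCATION_LABELS = {
    "mainst1": "Main Library Stacks",
    "mainst2": "Main Library Stacks",
    "docs": "Government Documents",
    "www": "Online",
}

def consolidate(holdings):
    """Rewrite raw availability data for display in the discovery layer."""
    display = []
    for holding in holdings:
        label = LOCATION_LABELS.get(holding["location_code"], "Ask at the Desk")
        display.append({
            "label": label,
            "call_number": holding.get("call_number"),
            "url": holding.get("url"),
            "online": label == "Online",
        })
    # Online access first, then remaining locations in label order.
    display.sort(key=lambda d: (not d["online"], d["label"]))
    return display

print(consolidate([
    {"location_code": "mainst1", "call_number": "Z699 .B74 2010"},
    {"location_code": "www", "url": "https://resolver.example.edu/link?id=123"},
]))
```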

This kind of work highlights the flexibility of this new discovery environment. For libraries that have advanced programming available in-house, all they may need from a commercial discovery product is its index, or they may decide to use the index from one service with the user interface from another. Claremont Colleges Library, for example, uses the Summon index to feed data to the Primo interface (Vaughan, 2011: 48), and Kyushu University in Japan uses the Summon index in combination with eXtensible Catalog software. More than ever before, libraries can build a customized discovery environment that combines the strengths of various commercial vendors and open source products with the capabilities within the library, crafting a discovery service specifically for their users. It becomes not a project completed and set aside, but a continuous process of improvement through attention to the user experience and the underlying data.

Themes of future development

Although none of us knows what the future will bring, a few themes of development have emerged in the world of discovery: a continued focus on the user; true one-stop shopping in the library’s web presence; bringing the web to the library and the library to the web; and attention to multimedia and data formats.

A continued focus on the user

One of the themes consistently present in the development of library search and discovery is a focus on the user. This is nothing new, as libraries exist to be used and to be of service to their patrons. Randall (1931: 31–2), for example, writing about card catalogs and subject description, said, “The problem, then, is to fit our catalogs to the patrons we serve… It can be done only by an intelligent study of the patrons themselves.” The focus on the user, however, can reach new heights of analysis with the technologies available now to track how users interact with systems. Usability testing has become almost expected for libraries wanting to improve the user experience of their websites and search tools, and possibly also to justify the need for improvement to administrators. As reported by the University of California Libraries Bibliographic Services Task Force (2005: 10), “Only through knowing our audience, respecting their needs, and imaginatively reengineering our operations, can we revitalize the library’s suite of bibliographic services.”

True one-stop shopping in the library’s web presence

The library discovery initiative is intended to make finding resources clear and easy for users. Sadeh (2007: 310) puts it well: “Unlike librarians, users are not aware of whether a resource is locally hosted or remotely hosted, free or licensed, MARC formatted or Dublin Core formatted, so libraries need to create an integrated, coherent environment that renders these distinctions invisible to the user.” Why stop, then, at just library resources and holdings? Why not include all library content, including the information the library provides about itself?

Some libraries have moved toward this concept by applying a creative mind to what they put in their catalog. The Orange County Library System in Florida, for example, marketed programs, databases and library services through its catalog by creating MARC records for them (Bost and Conklin, 2006). At the University of Michigan, librarians themselves are cataloged and easily found through a keyword search in the main discovery tool. The discoverability of a library’s LibGuides subject guides through Summon is also an important development, as it connects the initial search tool to deeper, curated learning tools.

Many librarians want to take this further and present only one web presence to users, including the entirety of the library website. If the effect of the Google single search box is clearly so appealing to people, shouldn’t libraries have it, too? As Schmidt (2011: 22) says, “We’re expecting people to learn two interfaces – and often two suboptimal interfaces – when we should be providing a single great one. Throw all of our database interfaces into the mix, and there’s even more of a burden… Ideally, there would be no visual distinction between your library website and catalog.” Breeding (2010: 33) agrees: “To the largest extent feasible, the library’s web presence should offer users a seamless experience that presents a consistent interface, despite the use of multiple technology and content products behind the scenes.” One way to do this, of course, is to have the central index of the library’s web-scale discovery service harvest the pages of the library’s website.

Bringing the web to the library and the library to the web

Bringing outside resources into the library discovery environment is one way of broadening its reach and making it more relevant to its users’ other web experience. Libraries have done this by pulling in book covers from Amazon or Syndetics, reviews from a number of sources including Amazon and ChiliFresh.com, readers’ advisory from EBSCO’s NoveList database and even maps from Google Maps (Zylstra, 2011). Not surprisingly, some librarians have issues with certain kinds of outside information coming in or being linked. Villanova University’s initial implementation of VuFind, for example, linked authors’ names to information about them in Wikipedia and brought in ratings from Amazon, but this content was removed before release because some librarians there found it “too casual” or “not authoritative” (Houser, 2009: 99).

Another way to connect the library and users is to provide a compact, built-in tool in the user’s web browser. LibX (www.libx.org/) is an Internet browser plug-in developed at Virginia Tech that provides direct access to a library’s catalog and selected databases. In addition to providing a search box within the header of the browser, LibX attempts to link ISBNs, ISSNs and DOIs (digital object identifiers) back to the library’s catalog or OpenURL resolver. As another example, Zotero (www.zotero.org/), resource management and citation generation software developed at George Mason University, can be embedded in an Internet browser to provide a connection between a library and its users through the storage of research materials.
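
LibX itself is JavaScript running in the browser, but the transformation at its heart – spotting an identifier in a page and rewriting it as a request to the library’s OpenURL resolver – can be sketched independently of any particular plug-in. In the Python sketch below, the resolver address is a placeholder and the identifier patterns are deliberately simplified.

```python
"""Sketch of identifier-to-OpenURL rewriting in the spirit of LibX.
The resolver base URL is a placeholder for a library's own OpenURL
resolver, and the regular expressions are simplified illustrations."""
import re
from urllib.parse import urlencode

RESOLVER = "https://resolver.example.edu/openurl"  # hypothetical
DOI_RE = re.compile(r"\b10\.\d{4,9}/\S+\b")
ISBN_RE = re.compile(r"\b(?:97[89][- ]?)?(?:\d[- ]?){9}[\dX]\b")

def openurl_for(text):
    """Build an OpenURL (Z39.88-2004 key/value form) for the first identifier found."""
    doi = DOI_RE.search(text)
    if doi:
        params = {"url_ver": "Z39.88-2004", "rft_id": f"info:doi/{doi.group(0)}"}
        return f"{RESOLVER}?{urlencode(params)}"
    isbn = ISBN_RE.search(text)
    if isbn:
        params = {
            "url_ver": "Z39.88-2004",
            "rft_val_fmt": "info:ofi/fmt:kev:mtx:book",
            "rft.isbn": re.sub(r"[- ]", "", isbn.group(0)),
        }
        return f"{RESOLVER}?{urlencode(params)}"
    return None  # nothing recognizable to link

print(openurl_for("The DOI handbook's example identifier is 10.1000/xyz123."))
```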

Some believe, however, that true interconnectedness can only be accomplished through linked data, or the process of creating connections across the web in a way that can be read by the tools and technologies involved rather than just by people. Singer (2009) gives a good summary of linked data and how it might apply to libraries, describing the connections of one core serial title to the licensed electronic resources package, link resolver, metasearch application, database provider and ILS. Singer (ibid.: 121) also makes a good case for why it is important for libraries to get a presence in the larger information environment using linked data: “The information the library contains also would be a welcome and heavily used resource if it was of the Web as opposed to standing apart from the rest of the information universe bridged by rickety connections into its silos, or as an island, inaccessible from the mainland.” Bradley (2009: 49) makes a similar case but focuses on the benefits to the library of connecting to outside resources: “Linked Data sources can allow libraries to link out to a much wider range of information, allowing a better sense of ‘aboutness’ – the places, people, and information that a resource describes, not just information about the resource itself.” Similar to sharing resources with other libraries, linked data challenges the boundaries of libraries and the traditional controls over data and access.
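
Singer’s serial-title example can be made concrete with a handful of RDF triples. The sketch below uses the rdflib Python library; every URI for the local package, resolver target and ILS record is invented for illustration.

```python
"""A small linked-data sketch after Singer (2009): one serial title
connected to the other systems that know about it. All local URIs
below are invented for illustration."""
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import DCTERMS, RDF

LIB = Namespace("https://library.example.edu/id/")  # hypothetical base URI
g = Graph()

serial = LIB["serial/0002-8231"]
g.add((serial, RDF.type, DCTERMS.BibliographicResource))
g.add((serial, DCTERMS.title,
       Literal("Journal of the American Society for Information Science")))
g.add((serial, DCTERMS.identifier, Literal("ISSN 0002-8231")))
# The same title linked to the licensed package, the link resolver
# target and the ILS bibliographic record that each describe it.
g.add((serial, DCTERMS.isPartOf, LIB["package/licensed-ejournals-2011"]))
g.add((serial, DCTERMS.relation, LIB["resolver/target/0002-8231"]))
g.add((serial, DCTERMS.relation, LIB["ils/bib/000123456"]))

# Serialize as Turtle: machine-readable statements that other systems
# (and other institutions) can follow and reuse.
print(g.serialize(format="turtle"))
```

Once such statements are published, the connections Singer describes stop being rickety point-to-point bridges and become links that any web agent can traverse.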

Multimedia and data formats

Although library collections do have multimedia components, they continue to be primarily text-based. Given the dramatic changes in technology that now allow the production of audio, video and image content cheaply and easily, there has been a tremendous rise in the sheer amount of multimedia content. Search engines are beginning to catch up, providing search results in multiple formats, but libraries will soon be expected to follow suit. This will challenge the storage, access and resource description abilities of libraries, as multimedia formats have significantly different needs than text-based formats.

The situation is similar for data formats, such as datasets. Quantitative and qualitative analyses are increasingly important in academic coursework, and libraries are under increasing pressure to support students’ data needs for their projects. Due to improvements in technology, data are easier to create than ever before in many fields, and libraries need to provide mechanisms of storage and retrieval for locally created datasets to ensure their preservation and continued use.

The role of technical services

The path of library discovery has involved a changing role for technical services. With the OPAC, the interface was often left to the public services or web services sides of the library, with technical services’ major attention given to the client side of the ILS and the data within the system. The data were also fairly static and primarily changed one record at a time. The growing attention to discovery and access, however, has brought with it a growing need for technical services to be involved with the tools patrons use to find library resources. Managing a discovery layer is about data management on a large scale and the interconnectivity between systems, and because those in technical services are versed in both the nature of the collections and the technology, they have a major contribution to make in the discovery environment.

This is easiest to see for the electronic resources found through a discovery layer. To implement a discovery layer effectively, librarians need to understand the structure of the electronic packages, titles and collections, the licensing or relationship with the vendor and the dynamic nature of electronic resources, such as those in aggregated collections. Those working within systems that connect tools to full-text resources, such as OpenURL resolvers, and within the knowledge bases that control information display and transfer are well situated to troubleshoot challenges within discovery layers. The discovery environment is not a static, stand-alone system like the ILS – it is an interconnected series of systems with dynamic metadata such as license terms, renewals and dates of coverage that work together to inform patron access.

Although discovery layers might be the most intuitive fit with electronic resources, they are ideally intended to include all of a library’s holdings (even the “just-in-time” content), and so they demand attention from all of technical services. To implement the discovery layer WorldCat Local, for example, a library’s book holdings must be reconciled with its WorldCat holdings. That the holdings of a given book are represented in WorldCat for the purposes of resource sharing with other libraries is something familiar to traditional technical services, but an extension of this concept is more familiar to the electronic resources librarian – the knowledge base of the ILS must be mapped to the knowledge base of OCLC. The library’s records must be exported, batch-updated to have the correct OCLC number present and reloaded en masse into the ILS. In this way, even physical format acquisitions and cataloging personnel must become familiar with dynamic metadata handled in batch forms, data that are up in the cloud and the cyclical workflows of updating material in the discovery environment. In effect, they must gain the skill sets of electronic resources librarians, even if they continue to deal only with physical-format materials.
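
The batch step just described – stamping each exported record with its OCLC number before reloading – gives a flavor of this work. The sketch below uses the pymarc library (classic, pre-5.0 field syntax) and assumes a lookup table from local bibliographic ID to OCLC number produced by an earlier matching step; the file names and identifiers are invented.

```python
"""Sketch of a reclamation-style batch update with pymarc: add each
record's OCLC number in an 035 field, then write the file for reload
into the ILS. The ID-to-OCLC table would come from a prior WorldCat
matching step; everything here is illustrative."""
from pymarc import MARCReader, MARCWriter, Field

# Hypothetical result of the matching step: local bib ID -> OCLC number.
OCLC_NUMBERS = {"000123456": "ocm12345678", "000123457": "ocn987654321"}

with open("export.mrc", "rb") as infile, open("reload.mrc", "wb") as outfile:
    writer = MARCWriter(outfile)
    for record in MARCReader(infile):
        if record is None:
            continue  # skip records pymarc could not parse
        control = record["001"]  # assume 001 carries the local bib ID
        oclc = OCLC_NUMBERS.get(control.value()) if control else None
        if oclc:
            # 035 $a is the conventional home for the OCLC control number.
            record.add_ordered_field(
                Field(tag="035", indicators=[" ", " "],
                      subfields=["a", f"(OCoLC){oclc}"])
            )
        writer.write(record)  # every record is reloaded, matched or not
    writer.close()
```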

Discovery layers also represent the changing relationship between technical services and library systems. Because many of these systems are hosted services and do not require a systems intermediary for changes, the role of the systems department has changed. Instead of being responsible for hardware maintenance and software upgrades for products like the OPAC, systems departments are needed for assistance with developing APIs and other interface tools. On the other hand, because technical services librarians can have direct access to and control over the discovery systems, they are brought even closer to the user experience.

Discovery in a broader context

Guthrie and Housewright (2011: 81), writing about the results of Ithaka’s Faculty Survey 2009, give urgency to the discovery dilemma for academic libraries: “Changing behaviors and practices increasingly put the academic library at risk of being disintermediated from the discovery process, a possibility that, if realized, could cause libraries to be irrelevant in one of their core functional areas.” They emphasize the need for careful financial investment and continuous assessment:

The fact that the perceived value of the gateway role has declined is a point that must be factored into libraries’ resource allocation decisions; the trend over the last decade makes an even more powerful argument that libraries need to consider very carefully the investments they make in search and discovery services… Libraries need to regularly assess whether their constituents continue to use and value the gateway services that they provide to ensure that the level of investments being made are justified by the benefits being delivered and valued by their constituents. (Ibid.: 88–9)

An interesting comparison can be made between the results of this survey and those of the Ithaka Library Director Survey 2010. Even with all the success of commercial, non-library discovery systems, and apparently against the search patterns and preferences of many researchers, academic libraries still want to be in the game of discovery:

Library directors believe that it is strategically important that their libraries be seen by users as the principal starting point in the discovery process. While they recognize that faculty members and students increasingly rely on resources outside the library for discovery of information and content, they would like to invest more in discovery tools to aid users. (Long and Schonfeld, 2010: 6)

Ultimately, although web-scale discovery systems are far from ideal, they are a large step in the right direction for libraries if they do want to return to being a primary online destination for researchers. Further, the linked data projects have shown that libraries do not have to be on their own any more, separated from the rest of the web. They can become the discoverable, interconnected resources they were always meant to be.

For technical services, this will likely mean that all personnel – those working with both tangible and intangible materials – will be expected to be involved in the data exchange between systems, many of which are not locally hosted or controlled. It is the investment of time in a discovery layer – both the front end and the underlying system and metadata infrastructure – that makes a truly good system. The “all hands on deck” situation of implementation, which often involves strong collaboration among library systems, technical services and public services, therefore needs to be carried forward into a commitment to steady, continual improvement.

Case studies

Spotlight on the University of Otago (New Zealand): Success through partnership

We spoke with Helen Brownlie, systems librarian at the University of Otago Library, to learn more about the decision process and implementation of a discovery layer within the Library Consortium of New Zealand. As this case shows, there is great benefit to libraries working together to analyze products and systems; the value of shared expertise rings true throughout the consortium’s evaluation process.

The Library Consortium of New Zealand (LCoNZ) is made up of four universities – AUT University, the University of Waikato, Victoria University of Wellington and the University of Otago. Unlike many consortia formed in the electronic resources boom that are focused on purchasing and licensing resources together, this consortium was created in 2004 with the primary goal of collaborating on a library management system.

The initial project, the implementation of Voyager, set the stage for the dynamic within LCoNZ. Together, the universities purchased hardware and selected an external vendor for hosting Voyager, but they decided to maintain separate installations of Voyager. In addition to the benefit of shared costs, the universities benefit from shared expertise, as the libraries work together on upgrades and system maintenance. A reflection of their collaboration as well as of their reciprocal borrowing privileges, the libraries also keep their OPACs the same except for branding.

Similar to the ILS situation, in the interest of building an effective search environment for users, LCoNZ moved together from Endeavor’s LinkFinderPlus for its OpenURL services to Serials Solutions’s 360 Link and, with it, the 360 Search federated search product. Similar to the Voyager implementation, the universities moved together on the decision but each had individual subscriptions to the products. At the University of Otago, 360 Search was used for three years. Due to Internet connection issues that are common to libraries in this region and other challenges such as the limited number of returned results, 360 Search caused frustration for patrons and library staff and was only used in a limited capacity. The library decided to include only general databases and target its use to undergraduates.

As 360 Search had provided only a limited metasearch tool for the consortium, the universities decided to form a project team to analyze the field of discovery and the products within it. This team did an initial evaluation of ten discovery layer products to understand the general state of the market. Members spoke to vendors and compared functions, and from this created a list of requirements for a discovery product. Using this, the team narrowed the initial list to four products; when two promising new products emerged in the market, they were added to the shortlist.

The project team divided the six products among members and each individual member did a more detailed evaluation. This led to a reconsideration of the requirements. The first list of requirements had included branding, advanced searches and other sophisticated features – things at odds with the Google-like search that was the ultimate goal of the project. The team refocused the project on users, and this cut the list of requirements down dramatically.

Rather than wanting something with lots of specialized features, they decided they wanted something that was not complicated. Despite the loading of MARC records for electronic resources into the OPAC, for example, users were still not finding everything they needed in one place, so simplicity became the goal. The important features narrowed largely to two primary requirements: a single search box, and as much content and coverage as possible.

After this evaluation, the shortlist was narrowed again to three: Summon, EBSCO Discovery Service and the Primo Central Index. A request for proposal was issued to these three companies, which all responded. Central to the decision process was the content and coverage of the discovery layers, including how well they represented the libraries’ holdings. Each library, for example, put forward a list of its top 100 journals based on the previous year’s usage. Fortunately, the universities had many of their general databases and subjects of coverage in common so there were not dramatically different situations to consider in the context of the discovery tools.

Based on this final analysis, Summon was selected by the project team, largely because it had such extensive coverage within its central index. For example, although the universities had different lists of top journals by usage, most (if not all) of the journals in each list were covered by Summon. Another factor was that, because the universities had already populated the Serials Solutions knowledge base, it would be an easier implementation than if they were starting over with a new vendor. Just as LCoNZ had done in the past, the institutions each implemented Summon as individual libraries.

For the University of Otago system, this implementation involved many libraries. In some ways, because they share library systems, the University of Otago system is more like a traditional consortium than LCoNZ. The University of Otago in Dunedin, the University of Otago, Christchurch and the University of Otago, Wellington each had a different ILS in the past, but when LCoNZ implemented Voyager the three campuses merged their databases and the University of Otago took on the administration of the ILS. It also took responsibility for managing access to the electronic resources of the other schools’ libraries. This involves maintaining four separate knowledge bases within 360 Core – one for each of the campuses and one for Otago Polytechnic, which shares one of the libraries on the Dunedin campus.

Because the University of Otago system needed one link associated with each resource in Summon, it implemented the system using the Dunedin campus knowledge base. The other institutions’ resources, though, are present through the MARC load from the ILS, as all the institutions’ records are represented there. Summon is still treated as an alternative search, not as a replacement for the OPAC, although that consideration is on the table for the future. The University of Otago is also part of a potential regional plan to do a joint usability study for discovery tools.

LCoNZ is now beginning to look at the market for next-generation library management systems. This kind of system, envisioned as a purely online environment, would inevitably change the dynamic of the LCoNZ institutions to some degree. Their shared hardware savings with a hosted ILS, for example, will not apply. However, the reason that the consortium was originally formed is still valid. They may move into other areas, but as the decision process of selecting a discovery layer has shown, there is great benefit to libraries working together to evaluate and implement library systems.

Spotlight on the Memorial University Libraries (Canada): Building expertise through experience

We spoke with a librarian at the Memorial University Libraries in Newfoundland, Canada, to learn more about how this library and its consortium, the Council of Atlantic University Libraries, have approached the concepts of search and discovery. It is an interesting contrast to the Library Consortium of New Zealand, as it speaks more to the perceived consortial benefit of implementing a system together rather than evaluating together with the intention to implement systems independently.

Memorial University of Newfoundland is a member of the Council of Atlantic University Libraries/Le Conseil des bibliothèques universitaires de l’Atlantique (CAUL-CBUA), a consortium of 18 colleges and universities within the Atlantic Canada region. Although the consortium was built largely around facilitating a document delivery system among the campuses and buying electronic resources together, in recent years members have also collaborated to implement search and discovery tools.

This shift toward a cooperative discovery approach can be seen as a somewhat non-deliberate evolution. When the consortium selected SirsiDynix Single Search, for example, it was largely based on a domino effect from selecting a link resolver together. Later, when the consortium selected WorldCat Local, it was not a joint decision but rather a cascading decision, as a number of institutions had selected WorldCat Local independently before the consortium decided to implement it together.

These search and discovery tools have been a rough road for the Memorial University Libraries. They made a concerted effort to implement Single Search in the fall of 2008, and unfortunately the most concrete result of the experience was learning that, thanks to being on an island, their Internet connection was very slow – too slow for a federated search environment such as that of Single Search to function well. All the inherent challenges to a federated search environment, such as effectively sorting results, were compounded by slow connections to the resources.

In December 2008 the Memorial University Libraries along with the rest of the CAUL-CBUA consortium decided to license WorldCat Local as the next attempt to provide an effective discovery tool for their users. The implementation was challenging at the Memorial University Libraries. It began with a reclamation process to match holdings in WorldCat – a significant undertaking in any library, and the Memorial University Libraries had only approximately a third of their holdings reflected in WorldCat prior to the reclamation. This process also included battles with vendors to get all the libraries’ digital collections represented in the product, not all of which were initially won.

Issues with field mapping continued for some time, as did problems with the daily update and synchronization process – when records were not accepted, it was difficult to diagnose why. The implementation took most of the following year, and a full workable version was not ready until mid-fall semester 2010. One of the disappointing realizations, especially because the combination of books and articles in one place had been an exciting opportunity, was that WorldCat Local relied heavily on federated searching for some article metadata sources. The libraries had already decided not to use this technology due to the challenges with Internet speed, but it was not easy to tell within the knowledge base at the time whether a given database was searched through the federated process or included in the central index. Another surprise was how poorly WorldCat Local handled known-item searching, which is especially critical for library staff users.

The Memorial University Libraries did user testing on WorldCat Local in the fall of 2010, soon after its release to the public. They focused on users with no previous experience with library systems, such as first-year undergraduate and graduate students, and included both known-item and subject-based searching of book content. Users did not show a strong preference between WorldCat Local and the libraries’ traditional SirsiDynix Symphony OPAC, and librarians noted that task scores were similar for both systems.

One of the good results to come out of the WorldCat Local implementation is that it highlighted a number of back-end internal processing issues. The Memorial University Libraries have taken a closer look at their cataloging and electronic resource record processing systems and workflows, trying to identify ways to manage their resources better. Thanks to this, the interrelationship between electronic resources management and discovery has become clearer – little problems made early on in acquisitions can impact user ease of use and discovery dramatically. This is largely due to the disparate databases that hold information about resources. For example, a user may search WorldCat Local and find it says that the library has a resource, but if this resource is not correctly represented in the link resolver, the user will not get to the full text.
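
The hand-off that fails in such cases is easy to picture. A discovery layer typically passes the user toward full text through an OpenURL request along the lines of the hedged example below (the resolver address and citation values are placeholders); if the journal is not correctly represented in the resolver’s knowledge base, that request dead-ends even though discovery reported a match.

```python
"""Illustration of the discovery-to-resolver hand-off that breaks when
the knowledge base is out of sync with discovery. The resolver address
and citation values are placeholders."""
from urllib.parse import urlencode

citation = {
    "url_ver": "Z39.88-2004",
    "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
    "rft.issn": "0000-0000",   # placeholder ISSN
    "rft.volume": "32",
    "rft.spage": "123",
}
print("https://resolver.example.edu/openurl?" + urlencode(citation))
# If no active full-text target matches rft.issn in the knowledge base,
# the resolver shows "no full text available" even though the discovery
# layer just told the user the library has the article.
```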

This awareness has played a large role in the next stage of development in discovery at the Memorial University Libraries. There has been a grassroots effort by the core team that implemented WorldCat Local to form a new Discovery Services Task Group to take a step back and closely evaluate the discovery environment in the context of the internal systems and processes it is built on. Central to this is the goal of having one common back-end database among the discovery tool, link resolver and ERMS.

To date the team has decided that it would only consider vendors that could provide all three components it is interested in (discovery tool, link resolver and ERMS). It has also eliminated products that gave concerns about compatibility with its existing ILS, as the workload to change the ILS would be too great. This narrowed the field of options down to two providers – EBSCO and Serials Solutions. It is currently gathering comparative data about the vendors’ products, trialing products where possible and experimenting with other libraries’ publicly accessible versions.

References

Antelman, Kristin, Lynema, Emily, Pace, Andrew K. Toward a twenty-first century library catalog. Information Technology and Libraries. 2006; 25(3):128–139.

Borgman, Christine L. Why are online catalogs hard to use? Lessons learned from information-retrieval studies. Journal of the American Society for Information Science. 1986; 37(6):387–400.

Borgman, Christine L. Why are online catalogs still hard to use? Journal of the American Society for Information Science. 1996; 47(7):493–503.

Bost, Wendi, Conklin, Jamie. Creating a one-stop shop: using the catalog to market collections and services. Florida Libraries. 2006; 49(2):5–7.

Bowen, Jennifer. Metadata to support next-generation library resource discovery: lessons from the eXtensible Catalog, Phase 1. Information Technology and Libraries. 2008; 27(2):6–19.

Bradley, Fiona. Discovering linked data. Library Journal. 2009; 134(7):48–50.

Breeding, Marshall. Plotting a new course for metasearch. Computers in Libraries. 2005; 25(2):27–29.

Breeding, Marshall. OPAC sustenance: Ex Libris to serve-up Primo. Smart Libraries. 2006; 26(3):3–4.

Breeding, Marshall. The state of the art in library discovery 2010. Computers in Libraries. 2010; 30(1):31–34.

Calhoun, Karen. The changing nature of the catalog and its integration with other discovery tools. Final report for the Library of Congress, 17 March 2006. Available at: www.loc.gov/catdir/calhoun-report-final.pdf (accessed: 5 July 2011).

Cooke, Rachel, Donlan, Rebecca. Thinking inside the box: comparing federated search results from Google Scholar, Live Search Academic, and Central Search. Journal of Library Administration. 2008; 46(3/4):31–42.

Guthrie, Kevin, Housewright, Ross. Repackaging the library: what do faculty think? Journal of Library Administration. 2011; 51(1):77–104.

Hirshon, Arnold. Beyond our walls. In: Racine, Drew, ed. Managing Technical Services in the 90’s. New York: Haworth Press; 1991:43–59.

Houser, John. The VuFind implementation at Villanova University. Library Hi Tech. 2009; 27(1):93–105.

Klein, Michael B. Hacking Summon. Code4Lib Journal. 2010; (11). Available at: http://journal.code4lib.org/articles/3655 (accessed: 4 June 2011).

Large, Andrew, Beheshti, Jamshid. OPACs: a research review. Library & Information Science Research. 1997; 19(2):111–133.

Long, Matthew P., Schonfeld, Roger C. Ithaka S+R Library Survey 2010: insights from US academic library directors. 2010. Available at: www.ithaka.org/ithaka-s-r/research/ithaka-s-r-library-survey-2010/insights-from-us-academic-library-directors.pdf (accessed: 5 July 2011).

Luther, Judy, Kelly, Maureen C. The next generation of discovery. Library Journal. 2011; 136(5):66–71.

Randall, William. The uses of library catalogs: a research project. In: Catalogers’ and Classifiers’ Yearbook, Vol. 2. Chicago, IL: American Library Association; 1931:24–32.

Sadeh, Tamar. Time for a change: new approaches for a new generation of library users. New Library World. 2007; 108(7/8):307–316.

Schmidt, Aaron. The user experience: a site divided. Library Journal. 2011; 136(6):22.

Singer, Ross. Linked library data now! Journal of Electronic Resources Librarianship. 2009; 21(2):114–126.

Taylor, Mary. Using the Google Search Appliance for federated searching. Internet Reference Services Quarterly. 2006; 10(3):45–55.

Tennant, Roy. Interoperability: the holy grail. Library Journal. 1998; 123(12):38–39.

University of California Libraries Bibliographic Services Task Force. Rethinking how we provide bibliographic services for the University of California. Final report, December 2005. Available at: http://libraries.universityofcalifornia.edu/sopag/BSTF/Final.pdf (accessed: 5 June 2011).

Vaughan, Jason. Web scale discovery services. Library Technology Reports. 2011; 47(1).

Vickery, B.C. On Retrieval System Theory, 2nd ed. Washington, DC: Butterworths; 1965.

York, Maurice C. Calling the scholars home. Internet Reference Services Quarterly. 2006; 10(3):117–133.

Zylstra, Robert. A mobile application for discovery. Computers in Libraries. 2011; 31(2):11–14.
