Bulletin, August/September 2007
Standards in Electronic Resource Management
by Rafal Kasprowski
The specifications published by the Digital Library Federation’s Electronic Resource Management Initiative (DLF ERMI www.diglib.org/standards/dlf-erm02.htm) in 2004 have become the de facto standard for the development of electronic resource management systems (ERMS or ERM systems). The document specifies data elements, functions and interrelationships between elements. A major objective of the group’s second phase (ERMI 2) is to develop standards for license expression and usage statistics collection, reducing the administrative costs of both data sets.
The need for ERM standardization has led the stakeholders to collaborate. ERMI 2 and EDItEUR (www.editeur.org) are attempting to incorporate the data elements proposed by ERMI into the ONIX (ONline Information eXchange - www.editeur.org) family of publisher data standards through the efforts of the License Expression Working Group (LEWG - www.niso.org/committees/License_Expression/ONIX-PLworkshopminutes19Dec.pdf). The objective is to make license data available in a single data container that could be downloaded directly into ERM systems. Similarly, members of the Standardized Usage Statistics Harvesting Initiative (SUSHI - www.niso.org/standards/resources/Z39-93_DSFTU.pdf) have developed a standard model for importing COUNTER-compliant usage data retrieved from vendors into ERM systems using a platform-independent protocol. COUNTER (Counting Online Usage of NeTworked Electronic Resources - www.projectcounter.org) is a standard promoted by the internationally based COUNTER project.
This report summarizes the session on ERM standards held at the 2006 ASIS&T Annual Meeting. A related session on ERM was held at the 2005 Annual Meeting (www.asis.org/Bulletin/Aug-06/kasprowski.html). The 2006 panel included Ted Koppel, Verde product manager at Ex Libris, Inc., and member of several key standard initiatives, notably SUSHI and LEWG; Adam Chandler, service design group coordinator for information technology and technical services at Cornell University Library and co-chair of SUSHI; Nathan D.M. Robertson, electronic resources librarian at the University of Maryland Thurgood Marshall Law Library and co-chair of LEWG. Koppel presented a historical overview of the standards development preceding and leading to the current ERM standardization efforts. Chandler discussed the creation of the SUSHI protocol and presented examples of current vendor applications of the protocol. Robertson explained the work of LEWG on a license expression format (as opposed to a rights expression language) that would facilitate the licensing process. The session ended with a question and answer period.
Electronic resource management builds on traditional library functions, such as serials management, metadata assignment and item identification. Some significant standards relevant to electronic resources have been around for a long time; other standards, laying the groundwork for standards to come, are more recent.
The ISSN, used for identifying many varieties of serial publications, has recently been modified into a linking ISSN (ISSN-L) to identify the different formats (paper, web, PDF, etc.) in which serials are being delivered. The ISBN, the NISO/ISO standard identifier for print and electronic books, has recently changed to a 13-digit format. SICI, the Z39.56 Serial Item and Contribution Identifier, identifies both the item (issue) and article (contribution) of a journal. This identifier is currently used for full-text retrieval and as a delivery mechanism for full-text suppliers JSTOR and OCLC. It is part of the Z39.88 OpenURL linking format and is used as a bar code by publishers for shipping and paper serials check-in. The DOI, Z39.84-2000 Digital Object Identifier, often accompanies citations and is used to access digital objects via a link resolver. Both the SICI and DOI can be partly structured on the ISSN.
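The move from the 10-digit to the 13-digit ISBN can be illustrated with a short sketch. It assumes the usual conversion rule (prefix 978, drop the old check character, recompute the check digit with alternating weights of 1 and 3); the function names are illustrative only.

```python
def isbn13_check_digit(first12: str) -> str:
    """Return the ISBN-13 check digit for the first 12 digits."""
    if len(first12) != 12 or not first12.isdigit():
        raise ValueError("expected exactly 12 digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert a 10-character ISBN to its 13-digit form using the 978 prefix."""
    chars = isbn10.replace("-", "").replace(" ", "")
    if len(chars) != 10:
        raise ValueError("expected a 10-character ISBN")
    # Drop the old check character (which may be 'X'), prepend 978, recompute.
    core = "978" + chars[:9]
    return core + isbn13_check_digit(core)


print(isbn10_to_isbn13("0-306-40615-2"))  # prints 9780306406157
```

The old check character is discarded rather than converted because the two formats use different check-digit schemes.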
New standards related to electronic resource management, besides those developed by SUSHI and LEWG, include the COUNTER standard, which facilitates the recording and exchange of usage statistics. It defines data elements, usage reports and the format for delivery of usage reports to ensure the data is comparable across publishers. It also promotes a code of practice that sets the minimum level of COUNTER compliance for statistics collection in the publishing industry.
The Joint Working Party for the Exchange of Serials Subscription Information (NISO/EDItEUR JWP) has been developing XML formats for exchanging serials data as part of the ONIX (www.editeur.org) for Serials Project. This effort resulted in three XML formats for data exchange: SOH (Serials Online Holdings) for communicating electronic serials holdings; SRN (Serials Release Notification) for serials issue (or article) publication notification; and SPS (Serials Products and Subscriptions) for communicating serials catalogue information or subscription details.
The Z39.88 OpenURL framework is a linking standard referring to the metadata string and to the technology. It is structured to accept and process citation data and resolve it to the item level, full text or other. A knowledge base is used as the library’s holdings repository to ensure that item-level linking reflects content available from the library.
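As a rough illustration of the "metadata string" side of OpenURL, the sketch below serializes a journal-article citation in the key/encoded-value style of Z39.88-2004. The resolver address and the citation values are made up for the example; a real link would point at the library's own resolver, which consults the knowledge base to decide where the item-level link should lead.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical link resolver address; each library would substitute its own.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def build_openurl(citation: dict, base: str = RESOLVER_BASE) -> str:
    """Serialize journal citation metadata as a key/encoded-value OpenURL."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
    }
    # Metadata describing the cited item (the referent) uses the "rft." prefix.
    for key, value in citation.items():
        params[f"rft.{key}"] = value
    return f"{base}?{urlencode(params)}"


# Placeholder citation data, not a real article.
citation = {
    "genre": "article",
    "jtitle": "Example Journal",
    "issn": "0000-0000",
    "volume": "12",
    "issue": "3",
    "spage": "45",
    "date": "2006",
}

url = build_openurl(citation)
print(url)

# The resolver would parse the same string back into citation fields.
parsed = parse_qs(urlsplit(url).query)
print(parsed["rft.jtitle"])
```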
Since 2003, several best practices guidelines have been produced for building and implementing metasearch engines (NISO RP-2005 documents 01 to 04). Two specific meta-searching standards, both in draft standard for trial use status as of October 2006, were also generated: the NISO Z39.91-200x Collection Description Specification and the NISO Z39.92-200x Information Retrieval Service Description Specification. These standards describe collections and services using Dublin Core (DCMES – Dublin Core Metadata Element Set) and are used for discovery and interoperability functions.
Electronic resource management is a fertile area for a variety of new standards that would pose few technical difficulties. Their success would depend on the stakeholders recognizing the problem, finding the solution and agreeing on an implementation format.
Possible standards may include an IP address change broadcast mechanism whereby libraries would update content providers of changes to their IP addresses. A service interruption information protocol compatible with ERM systems could in turn relay information from content providers to libraries about service unavailability for affected platforms. Scheduled and unscheduled access interruptions and status updates could be communicated to libraries using this protocol. The ERM systems would update their unavailability information accordingly and display “system is down” and “system is back up” messages to patrons via a public web service interface. An incident report transmission protocol would also allow libraries to report problems with electronic resources to the content providers.
As the use of ERM systems continues to grow, an ERM data exchange protocol for exchanging information between ERM systems would prove practical, especially for libraries belonging to the same consortium.
Standards addressing not only electronic resource management but also other library functions and processes are equally in demand. A unique identifier for collections and databases has been lacking for at least 20 years. With such a standard, packages could be referred to systematically by number, instead of by name as is currently the case. A unique identifier applicable to all libraries, library branches and facilities worldwide is also needed. The current Standard Address Number (SAN) identifies every library to the branch and department level, but is only used in North America, while the International Standard Identifier for Libraries and Related Organizations (ISIL, ISO 15511) is not detailed enough. A specification allowing acquisitions-related data to be shared between integrated library systems and ERM systems is in its initial development stages.
Birth of SUSHI
Although the retrieval of usage data by libraries was not a concern initially, regular collection of data across every COUNTER-compliant package licensed by a library creates a heavy workload. Retrieval was the bottleneck preventing wider use and interpretation of COUNTER reports, and the need to import usage statistics efficiently into a central repository for easier management became increasingly apparent.
Late in 2004 Adam Chandler met with Innovative Interfaces, Inc., to develop a process for importing statistics into ERM systems and building a protocol for exchanging COUNTER reports. Excel files were already used for exchanging COUNTER data, but Excel posed two problems: format inconsistency and ambiguity about character encoding. The COUNTER XML reports address these two problems, but most librarians do not have XML client tools on their desktops.
In June 2005 a group of librarians and vendors interested in the problem met at the ALA annual conference in Chicago, and work on a protocol officially began the following month. In October 2005 the NISO Standards Development Committee (SDC) recommended making the SUSHI protocol a NISO initiative. From November 2005 through summer 2006 SUSHI version 0.1 was developed and tested. NISO and COUNTER signed a memorandum of understanding that allowed the SUSHI protocol to use the COUNTER Code of Practice XML schemas. In September 2006 the SUSHI 1.0 Draft Standard for Trial Use was released.
The Z39.93-200X NISO SUSHI protocol defines only two messages: a report request going out and a report response coming back. The report request defines the organization making the request (requestor) and the organization for which the report is requested (customer reference). The requestor and the customer reference can be different in the case of a third party, such as a consolidation service, making the request on behalf of the customer. A report definition for the requested report and date range is also provided. The report response contains the original request and the desired COUNTER report.
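The two-message exchange described above can be modeled in a few lines. This is only a sketch of the roles involved, not the SUSHI schema itself: the class and field names are illustrative, the identifiers are invented, and the in-memory `usage_store` stands in for a content provider's reporting system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Party:
    identifier: str
    name: str


@dataclass(frozen=True)
class ReportDefinition:
    report_name: str  # which COUNTER report is wanted
    begin: date       # start of the requested date range
    end: date         # end of the requested date range


@dataclass(frozen=True)
class ReportRequest:
    requestor: Party           # organization making the request
    customer_reference: Party  # organization the report is about
    report_definition: ReportDefinition


@dataclass(frozen=True)
class ReportResponse:
    request: ReportRequest  # the original request is echoed back
    report: dict            # stand-in for the COUNTER report payload


def handle_request(request: ReportRequest, usage_store: dict) -> ReportResponse:
    """Look up the report for the referenced customer and wrap it in a response."""
    key = (request.customer_reference.identifier,
           request.report_definition.report_name)
    return ReportResponse(request=request, report=usage_store.get(key, {}))


# A consolidation service (requestor) asks on behalf of a library (customer).
request = ReportRequest(
    requestor=Party("consolidator-01", "Example Consolidation Service"),
    customer_reference=Party("library-42", "Example University Library"),
    report_definition=ReportDefinition("JR1", date(2006, 1, 1), date(2006, 6, 30)),
)
usage_store = {("library-42", "JR1"): {"Example Journal": 128}}

response = handle_request(request, usage_store)
print(response.report)
```

Keeping the requestor and the customer reference as separate fields is what lets a third party retrieve reports without impersonating the library.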
SUSHI messages are XML, and the service itself is described in the XML-based Web Services Description Language (WSDL). The protocol is designed primarily for COUNTER-compliant usage reports, although it can be adapted to other reporting standards. Libraries can develop additional scripts to combine the usage data with other data sets, such as acquisitions data, to obtain cost-per-use figures. Conventional web services security measures are used to ensure confidentiality: Secure Sockets Layer (SSL) encryption, Internet Protocol (IP) address recognition or client identification in the SUSHI message itself.
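A library script of the kind mentioned above might look like the following sketch, which divides each title's annual cost by its annual use count. The titles and figures are invented, and treating a title with no recorded use as "no figure" rather than zero is a design assumption.

```python
from typing import Dict, Optional


def cost_per_use(costs: Dict[str, float],
                 uses: Dict[str, int]) -> Dict[str, Optional[float]]:
    """Combine acquisitions costs with usage counts, title by title."""
    result: Dict[str, Optional[float]] = {}
    for title, cost in costs.items():
        count = uses.get(title, 0)
        # A title with no recorded use has no meaningful cost-per-use figure.
        result[title] = round(cost / count, 2) if count > 0 else None
    return result


costs = {"Journal A": 1200.00, "Journal B": 450.00}  # from acquisitions data
uses = {"Journal A": 300, "Journal B": 0}            # from usage reports

print(cost_per_use(costs, uses))  # {'Journal A': 4.0, 'Journal B': None}
```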
Adoption of SUSHI
As of November 2006 several vendors were working on including the SUSHI (www.niso.org/standards/resources/Z39-93_DSFTU.pdf) protocol in their product lines. Innovative Interfaces, Inc., which incorporated SUSHI version 0.1 in the 2006 release of its ERM module, announced a 1.0 version upgrade in the following months. Ex Libris planned to add SUSHI features to its Verde ERM system in mid-2007. Serials Solutions planned to release a SUSHI-compliant version of its COUNTER usage data collection service around the same time.
MPS Technologies offers a fee-based usage data collection service for libraries, called Scholarly Stats, and has been testing whether its usage reports can be downloaded into the ERM systems or usage reporting systems of other vendors using the SUSHI protocol.
A few content providers have already made their usage reporting systems compatible with the SUSHI protocol, but most lag behind. Content providers would save money by investing in simple machine-readable interfaces designed to handle the SUSHI protocol only and letting ERM systems or usage consolidation services manage the data.
As in the past, librarians want to use content and providers want to sell it, so what has caused this exchange to become invariably associated with arduous licensing procedures involving complicated legal jargon?
A grant of rights in intellectual property without conveying ownership is a licensing agreement. In the absence of a formal contract, the default terms for intellectual property are governed by copyright. Copyright law offers a good balance of restrictions and options for use, but early court decisions about emerging electronic products originally held that software was non-copyrightable. Without copyright protection, licenses served as an approach for content providers to protect their content. Software is now considered to be protected by copyright law; however, copyright still does not apply to some content, such as statements of fact in phonebook or database format. Licenses are the proven way for aggregators to protect their products. Libraries can in turn use licenses to make additional performance expectations, such as a certain amount of uptime, legally enforceable.
Unlike most contracts, which represent a negotiated agreement between parties, click-through licenses are contracts of adhesion, where a single party dictates the rights of use to the other parties. Case law on contracts of adhesion varies, but generally they are considered valid unless certain terms are unconscionable. They could possibly be considered binding even if they limit educational or professional use.
Before the advent of ERM systems, licenses were put away in filing cabinets and forgotten. Today licenses are being put to new use thanks to the work of the DLF ERMI.
Enter the DLF ERMI
In response to demand by libraries for better control over the rapidly growing online collections they were licensing, the DLF ERMI was established to define all the activities and concepts related to electronic resources. The group’s final report, published in 2004, became the de facto standard for building ERM systems. The document included, among other things, license element descriptions, which set the stage for a systematic approach to license management.
License Expression v. Rights Expression
Why was it necessary to develop a new standard for license expression when rights expression languages (RELs) dealing with the same permissions and prohibitions already existed? ERMI investigated the possibility of adapting license expression to rights expression at the time it was developing its best practices, but concluded that RELs could not convey the ambiguity present in many licenses and in copyright law.
RELs are designed for digital rights management (DRM), which is used to automatically enforce limitations on users' behaviors. For example, it is an application of DRM that may prevent the Interpol warning or preview trailers on DVDs from being skipped. This kind of strict machine enforcement of technical prohibitions on user behavior requires absolutely explicit and very granular expressions. Compared with the various explicit, interpreted and silent license values identified by the ERMI group, RELs only allow for two: Permitted (explicit) and Prohibited (explicit). This does not leave room for ambiguity, which is an integral part of copyright and licensing, as expressed by the existing case law on copyright and fair use: “[Adjudicating fair use] is not to be simplified with bright-line rules, for the statute, like the doctrine it recognizes, calls for case-by-case analysis.” [Campbell v. Acuff-Rose Music, Inc., 510 U.S. 569 (1994) <www.law.cornell.edu/supct/html/92-1292.ZS.html>]
Licenses regularly allow users to download and print a “reasonable number of copies for educational or personal use.” Computers cannot determine what a reasonable number is or what the user’s intended use is. So RELs are not expressive of copyright law or compatible with every licensing situation. In allowing libraries and providers not to have to define the exact number of permitted copies, licenses prevent negotiations from stalling and help both parties come to an agreement.
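The gap between the two vocabularies can be shown with a small sketch. The value labels below are modeled on the explicit, interpreted and silent categories mentioned above and should be read as an assumption, not as the normative ERMI list; the function name is likewise illustrative.

```python
from enum import Enum
from typing import Optional


class TermValue(Enum):
    PERMITTED_EXPLICIT = "Permitted (explicit)"
    PERMITTED_INTERPRETED = "Permitted (interpreted)"
    PROHIBITED_EXPLICIT = "Prohibited (explicit)"
    PROHIBITED_INTERPRETED = "Prohibited (interpreted)"
    SILENT = "Silent"


def as_rel_decision(value: TermValue) -> Optional[bool]:
    """Map a license value to a machine-enforceable yes/no, if one exists."""
    if value is TermValue.PERMITTED_EXPLICIT:
        return True
    if value is TermValue.PROHIBITED_EXPLICIT:
        return False
    # Interpreted and silent values depend on human judgment, so there is
    # nothing unambiguous for a machine to enforce.
    return None


for value in TermValue:
    print(f"{value.value}: {as_rel_decision(value)}")
```

Three of the five values have no yes/no equivalent, which is the information a strictly binary rights expression would discard.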
License Expression Working Group
The License Expression Working Group (LEWG) proposes that publishers work together with libraries in defining terms according to the same standard so that the terms can be formatted according to that standard and imported into ERM systems automatically rather than manually. The project has support from the publisher community, who would like all possible terms of a license to be encoded. The scope of LEWG thus extends beyond the specifications initially proposed by DLF ERMI.
LEWG is jointly sponsored by DLF, NISO, EDItEUR and PLS (Publishers Licensing Society of Great Britain). It takes EDItEUR’s ONIX standards as a basis for a new ONIX licensing message that will allow (but not require) greater specificity and granularity than the terms produced by DLF ERMI. For example, the license expression format would allow a specific encoding to indicate that interlibrary loan is allowed only with other libraries in the same country as the licensing library, instead of relying, as ERMI does, on a free-text note to express this constraint. That specific encoding would, of course, only be applied if the license itself included that constraint on interlibrary loan activities.
Libraries could negotiate license terms with providers following this licensing standard, code the terms directly into a license format and load the final version into the ERM system.
License Expression Format
ERM systems are currently able to specify license elements, such as the sharing of content with colleagues for scholarly use, using DLF ERMI specifications. Constraints on the license elements cannot be coded, however, and must be added in a free-text note. The proposed license expression format will allow the constraints on usage to be explicitly contained and coded. While back-end codification may be complex, non-licensing staff and end users will be presented with the licensing language in a simply expressed format.
The license expression format would follow a basic XML structure for coding the various terms of a license: definitions, supply terms, usage terms, payment terms and general terms. The definitions section will be the largest section, defining all agents related to the license, any component of the electronic resource, as well as dates, periods, locations, events and states, usages and references to external documents.
The following is an example of a usage term coded in the ONIX for Publications Licenses (ONIX-PL - www.editeur.org/) format, as of draft format 0.9.18 (Nov. 2006):
<AnnotationText> Authorized Users may also transmit such material to
a third-party colleague in hard copy or electronically, for personal
use or scholarly, educational, or scientific research or professional
use but in no case for re-sale. </AnnotationText>
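The five-part structure described above can be sketched as an XML skeleton built with Python's standard library. Apart from `AnnotationText`, which appears in the draft example, every element name here is a hypothetical placeholder derived from the section names in the text, not taken from the ONIX-PL draft schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names, one per section of a license expression.
SECTIONS = ["Definitions", "SupplyTerms", "UsageTerms",
            "PaymentTerms", "GeneralTerms"]


def build_license_skeleton(annotation: str) -> ET.Element:
    """Create an empty five-section license with one annotated usage term."""
    root = ET.Element("LicenseExpression")  # hypothetical root element
    for name in SECTIONS:
        ET.SubElement(root, name)
    # Attach the human-readable license wording to a single usage term.
    usage = ET.SubElement(root.find("UsageTerms"), "Usage")  # hypothetical
    ET.SubElement(usage, "AnnotationText").text = annotation
    return root


annotation = ("Authorized Users may also transmit such material to a "
              "third-party colleague in hard copy or electronically.")
root = build_license_skeleton(annotation)
print(ET.tostring(root, encoding="unicode"))
```

An editing tool of the kind discussed in the next section would hide this markup and let licensing staff fill in the sections through forms.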
License Expression Applications
Functional requirements are being developed for editing tools allowing users to write license expressions without having to manually code XML.
While the functional requirements for the editing tool were being gathered, it became apparent that license templates, license expressions and license interpretations could have distinct applications. License templates could be made available via a public repository to licensing bodies, which could modify them into their own private templates to share with customers and business partners. While the actual license expression represents a complete deal between a vendor and a library, the license interpretation refers to the non-explicit terms which need to be interpreted as permitted or prohibited. The ONIX-PL license would define these terms through a license interpretation form. The license interpretation could then be easily shared among all consortial partners, for example.
Click-through license expressions have been suggested as a solution to publishers not knowing what version of their click-through license their clients agreed to, especially when the license has been modified and the previous version is lost. An expression of the agreement would be created at the point of the click-through and sent in an encoded format to both the publisher and the client as a record of the transaction.
Mapping existing ERMI terms to the ONIX-PL license structure is seen as an immediate avenue for importing license expressions into ERM systems. The ERMI licensing terms as applied in the current ERM systems are a subset of the more granular and broadly scoped license expression and could serve as a rudimentary license expression format.
[Figure: License Expression, License Displays, License Policies]
Beyond the ERM System
It is foreseeable that as ERM systems take on an increasingly important role in libraries, they will encompass a broader range of ERM-related functions. Librarians would already like other features, such as acquisitions and serials management functions, and links to electronic data interchange (EDI) documents to be included in ERM systems. If widely adopted, the Serial Release Notification (SRN) format, for instance, could become an easy way for publishers to communicate issue availability to libraries, which would render manual checking obsolete and prevent unnecessary claims. Koppel predicts that in the near future the average ERM system may be more akin to an acquisitions system than to one of the current ERM systems, as the tendency in electronic resource management seems to be towards the integration of all library management functions. Some librarians even have the evolutionary belief that if systems enable electronic resource management, digital rights could be managed in the same way, possibly by the same product.
Standards for Content, Format or Unique Identification Mentioned in the Article
Collection Description Specification NISO Z39.91-200x
COUNTER (Counting Online Usage of NeTworked Electronic Resources)
DLF-ERMI (Digital Library Federation’s Electronic Resource Management Initiative)
DOI (Digital Object Identifier) Z39.84-2000
DCMES (Dublin Core Metadata Element Set)
Information Retrieval Service Description Specification NISO Z39.92-200x
ISIL (International Standard Identifier for Libraries and Related Organizations) ISO 15511
ISBN (International Standard Book Number)
ISSN-L (Linking International Standard Serial Number)
LEWG (License Expression Working Group)
NISO RP-2005-01 to NISO RP-2005-04 [best practices guidelines for building and implementing metasearch engines]
NISO/EDItEUR JWP (NISO/EDItEUR Joint Working Party for the Exchange of Serials Subscription Information)
ONIX (ONline Information eXchange)
ONIX-PL (ONIX for Publications Licenses)
OpenURL (The OpenURL Framework for Context-Sensitive Services)
SAN (Standard Address Number) NISO Z39.43
SICI (Serial Item and Contribution Identifier) NISO Z39.56
SOH (Serials Online Holdings) NISO/EDItEUR JWP
SPS (Serials Products and Subscriptions) NISO/EDItEUR JWP
SRN (Serials Release Notification) NISO/EDItEUR JWP
SUSHI (Standardized Usage Statistics Harvesting Initiative) NISO Z39.93-200X
Rafal Kasprowski is electronic resources coordinator in the University of Houston Libraries, Houston, TX 77204-2000; email at Rkasprowski<at>uh.edu. The following panelists are cited in this article and can be reached at these email addresses:
Adam Chandler, alc28<at>cornell.edu;
Ted Koppel, ted<at>exlibris-usa.com;
Nathan D.M. Robertson, nrobertson<at>law.umaryland.edu