Outline of the Session:
Uniform Resource Names (URN) (a type of Uniform Resource Identifier (URI))
Roles of names for objects — Why do we need names
In the network environment, names and citations become "actionable": they can be clicked on to get to the actual object.
Naming systems and resolution
Can one tell from an object what its name is? Computable names, such as SICI codes for journal articles or hash codes/signatures
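A computable name can be sketched as follows: the name is derived from the object's bytes alone, so anyone holding the object can recompute its name and check it. This is only an illustration. The choice of SHA-256 and the "sha256:" prefix are assumptions made for the example, not something the session specified.

```python
import hashlib

def computed_name(content: bytes) -> str:
    """Derive a name from the object's bytes alone (hypothetical 'sha256:' prefix)."""
    return "sha256:" + hashlib.sha256(content).hexdigest()

def matches(name: str, content: bytes) -> bool:
    """Check whether the object in hand carries the given computed name."""
    return computed_name(content) == name

article = b"Example article text"
name = computed_name(article)
print(name)
print(matches(name, article))         # same bytes give the same name
print(matches(name, article + b"!"))  # any change gives a different name
```

A hash-based name identifies one exact byte sequence, whereas a code such as SICI is computed from bibliographic attributes (journal, issue, pages) and so survives reformatting of the article.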
Privacy as a concern in the resolution database. Should it be private or public?
Who gets to use what names? Rights to names. Big issue in domain names.
What is wrong with URLs?
Role of IETF (Internet Engineering Task Force)
Structure of Uniform Resource Names (URN)
Look for presentations and other materials on www.acl.lanl.gov and www.acl.lanl.gov/~rdaniel
Both are members of the IETF.
Work of IETF
Structure of URNs
At the top level, URNs are divided into name spaces, such as inet, cid, http, handle, ILN, ISBN. Each name space has its own rules for names. Thus the structure for a URN is
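The structure itself did not survive in these notes. As a rough sketch, assuming the "urn:<NID>:<NSS>" form of the IETF URN syntax specification (RFC 2141), where NID is the name space identifier and NSS the name-space-specific string, a URN can be split into its two parts like this. The function and class names are hypothetical.

```python
import re
from typing import NamedTuple

class URN(NamedTuple):
    nid: str  # name space identifier, e.g. "isbn"
    nss: str  # name-space-specific string, interpreted by that name space's rules

# Assumed NID rule (RFC 2141): a letter or digit, then up to 31 letters, digits, or hyphens.
_URN_RE = re.compile(r"^urn:([A-Za-z0-9][A-Za-z0-9-]{0,31}):(.+)$", re.IGNORECASE)

def parse_urn(text: str) -> URN:
    """Split a URN into name space and name; raise ValueError if it is not a URN."""
    m = _URN_RE.match(text)
    if m is None:
        raise ValueError(f"not a URN: {text!r}")
    # The "urn" prefix and the NID are case-insensitive, so normalize the NID.
    return URN(nid=m.group(1).lower(), nss=m.group(2))

print(parse_urn("urn:isbn:0451450523"))
```

Only the split at the first two colons is generic; everything after the name space identifier is left untouched because each name space has its own rules.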
To resolve a URN (find metadata about the resource, incl. pointers to one or more copies of it), follow these steps:
URN standards are determined by IETF. IETF maintains mailing lists for people who want to participate in the formulation of specific standards. Join by signing up. Mailing lists are the final authority.
Progress: URN name spaces are being defined. There is work on a specification for a URN client (a program to resolve URNs). VRML (Virtual Reality Modeling Language) uses URNs.
Ron Daniel, Jr.
Whatever happened to URCs (Uniform Resource Characteristics)?
URCs were designed to link a URN with the appropriate URL(s)
URC history: IETF URI-WG working group 1992. URC originated as the data string to bind a URN to a set of URLs
The demise of URCs: The URI-WG defined URL schemes, defined URN and URC. Contemplated URN agents, but ended work. A URN-WG was established in Fall 1996, but a proposed URC-WG was not approved. The URN-WG adopted a very loose definition of URC.
Meanwhile, another organization, the W3C, worked on PICS (Platform for Internet Content Selection). PICS started out as a system to connect third-party ratings with URLs; it had a three-part architecture: labels, rating system, rules. Labels were numeric only. PICS-NG (Next Generation), defined in January 1997, dealt with strings and incorporated other metadata, such as the Dublin Core elements and digital signatures. It evolved into the Resource Description Framework (RDF), a broadly applicable approach to metadata on the Web. The data model is a directed graph consisting of nodes and arcs. Arcs are labeled with the type of relationship; these labels are part of the name space. RDF defines a system for establishing link types and specifies some broadly useful primitives. The RDF syntax is mapped to an XML format. The generic structure of RDF allows implementation of the basic ideas of the Warwick framework.
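The directed-graph data model can be illustrated with a minimal sketch: each arc is a (source, label, target) triple, and the label names the relationship type. This is not RDF syntax, only the underlying idea. The example URLs and the "DC:" prefix standing in for a Dublin Core name space are made up for the illustration.

```python
from typing import List, NamedTuple

class Arc(NamedTuple):
    source: str  # node the arc leaves (a resource)
    label: str   # relationship type, a name drawn from some name space
    target: str  # node the arc points to (a resource or a plain string value)

# Hypothetical example data.
graph = [
    Arc("http://example.org/report", "DC:Title", "Annual Report"),
    Arc("http://example.org/report", "DC:Creator", "http://example.org/staff/jones"),
    Arc("http://example.org/staff/jones", "DC:Title", "Jones home page"),
]

def arcs_from(node: str, arcs: List[Arc]) -> List[Arc]:
    """All arcs leaving a node, i.e. everything stated about that resource."""
    return [a for a in arcs if a.source == node]

def values(node: str, label: str, arcs: List[Arc]) -> List[str]:
    """Follow every arc with the given label out of a node and collect the targets."""
    return [a.target for a in arcs if a.source == node and a.label == label]

print(values("http://example.org/report", "DC:Title", graph))
print(len(arcs_from("http://example.org/report", graph)))
```

Because the target of one arc can be the source of another, descriptions chain together: the creator of the report is itself a node with its own title.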
Status of RDF: First draft of model and syntax Oct. 2, 1997. The syntax for types will be enhanced. Abbreviation syntax is being defined. Schemas and rules still to come.
Why is RDF likely to be important? Major browser vendors are heavily involved. RDF defines syntax and structure, allowing user communities to concentrate on semantics. RDF is the basis for using and reusing descriptive schemas.
Metadata, MARC, and the Dublin Core
1. MARC as a metadata standard
MARC follows Z39.2 and ISO 2709 in its structure. Field content is governed by other standards (AACR2r: Anglo-American Cataloguing Rules, 2nd ed., 1988 revision).
Use of MARC for cataloging Internet resources. New field 856 for URL and other identifiers. InterCat database (OCLC). Digital cataloging guidelines at LC.
Why MARC for Internet resources? Allows for incorporation in library catalogs. In many cases, a record for the print version exists and can be amended with a pointer to the digital version. Since MARC cataloging requires effort, it is applied only to high-level resources.
Using metadata for navigating digital collections; constructing finding aids.
LC National Digital Library. Framework; access aids; includes digital images which form part of multimedia objects. NDL uses logical names for objects. The repository system links a logical name to a physical location and to metadata about the object.
2. Integrate metadata from several data standards in digital libraries
For unified access, access aids using different metadata structures (MARC, SGML metadata record, HTML header) must be integrated. Example: LC Civil War photographs
3. Dublin Core (DC)
Warwick Framework: Conceptual framework for the coexistence of many varieties of metadata. Web a strategic application (HTML/XML syntax), but not the only one.
CNI/OCLC workshop on the Dublin Core and metadata for images, Sep. 1996. DC-4, Canberra, Australia, March 1997. Discussion: simplicity vs. flexibility; minimalist camp vs. structuralist camp (which advocates sub-elements and qualifiers). Result: Canberra qualifiers.
DC-5, Helsinki, Finland, Oct. 1997. Refinements and implementation strategies and standards. Agreed on formal data model expressed in RDF (Resource Description Framework, see Daniel above). Minimalist DC frozen with semantics for additional substructure/refinements. Closer DC - W3C collaboration
30+ DC implementation projects. Consensus on DC as lingua franca. Applications to non-electronic resources being considered
4. MARC and the Dublin Core. Other MARC mappings
MARC amended to cover all DC elements. Conversion DC to MARC (simple DC to MARC, complex DC to MARC) results in a skeletal MARC record which can then be enhanced. Several sample projects are underway. Conversion MARC to DC is easier and results in complete DC records (but loses information).
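The simple DC-to-MARC direction can be sketched as a table lookup that yields a skeletal record. The tag and subfield assignments below are assumptions, loosely based on the Library of Congress crosswalk, and should be checked against it before any real use. The function name and record layout are hypothetical.

```python
# Illustrative DC element -> (MARC tag, subfield) table; the assignments are assumptions.
DC_TO_MARC = {
    "title": ("245", "a"),
    "creator": ("720", "a"),
    "subject": ("653", "a"),
    "description": ("520", "a"),
    "identifier": ("856", "u"),
}

def dc_to_marc(dc_record: dict) -> list:
    """Return a skeletal record as (tag, subfield, value) tuples, ordered by tag.

    Elements without an entry in the table are skipped.
    """
    fields = []
    for element, element_values in dc_record.items():
        target = DC_TO_MARC.get(element.lower())
        if target is None:
            continue
        tag, subfield = target
        for value in element_values:
            fields.append((tag, subfield, value))
    # A stable sort by tag keeps repeated values in their original order.
    return sorted(fields, key=lambda field: field[0])

dc = {
    "Title": ["Civil War Photographs"],
    "Subject": ["Photographs", "United States History"],
    "Identifier": ["http://example.org/cwp"],
}
for field in dc_to_marc(dc):
    print(field)
```

The result is skeletal in exactly the sense of the notes: it has no leader, indicators, or fixed fields, which a cataloger would add when enhancing the record.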
MARC mappings to other standards
GILS (geospatial data)
5. Action needed
Guidelines and registry for qualifiers to DC
Integrate metadata in the global information infrastructure: Exploiting metadata in Web search services; library Web pages with DC metadata.
Dublin Core: purl.org/metadata/dublin_core
Intercat database: www.oclc.org:6990
Warwick framework: www.oclc.org:5046/~weibel/html-meta.html
The IETF standardization process is less formal than that of other standards bodies. Drafts are put out for comment, and everybody is free to comment.
Who is expected to create DC records? Both authors of documents (some elements could be filled in automatically by authoring software) and catalogers, but catalogers are more likely to produce MARC records (which are more complete).
Is there a movement to embed DC elements into non-text documents? The DC is intended to be applicable to non-text documents. The problem is how to link the metadata to images. One possibility is to create a separate HTML page with metatags not commonly found in the TIF header. Metadata can be (1) part of a document (in-line), (2) in separate records, or (3) created as a byproduct of retrieval operations.
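The separate-HTML-page option could look roughly like the sketch below, which turns a DC description of an image into meta tags for the head of a companion page. The "DC." name prefix follows the convention commonly used for embedding Dublin Core in HTML. The function name and the sample record are hypothetical.

```python
from html import escape

def dc_meta_tags(dc_record: dict) -> str:
    """Render DC elements as HTML <meta> tags, one per value."""
    lines = []
    for element, element_values in dc_record.items():
        for value in element_values:
            # escape() also converts quotes, so values are safe inside attributes.
            lines.append(
                f'<meta name="DC.{escape(element)}" content="{escape(value)}">'
            )
    return "\n".join(lines)

image_description = {
    "title": ["Portrait of a soldier"],
    "type": ["image"],
    "identifier": ["http://example.org/images/soldier.tif"],
}
print(dc_meta_tags(image_description))
```

Here the identifier element is what ties the description page to the image file; the image itself carries no metadata.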
Metadata for non-electronic resources, for example museum objects? Need to provide for an element that stores the type of the object. Another problem: how to relate the description of an object to the description of a digital image of the object?
Does DC require authority control, esp. of subjects? No requirement, could be agreed upon in special domains, otherwise choice of keywords is up to the author. There is a controlled list of resource types.
Do software producers (such as producers of SGML tools) introduce their own subject schemes as part of authoring tools? The only example would be the use of the Art and Architecture Thesaurus (AAT) in tools specifically designed for the art world. A DTD (Document Type Definition) may specify authority control, but authoring tools are not bound to a specific DTD.
Benefits of search systems based on DC vs. those based on MARC vs. combined systems? DC is intended for the Web; Web documents are unlikely to get full MARC records. More generally: MARC and other systems give more information than DC and therefore allow for more powerful searching; but for the same reason they require knowledge to produce and use. A user who has limited knowledge or wants to search multiple databases using different metadata standards should use DC, provided a search-time mapping from DC to the other standards is available.