Metadata Principle 2

Metadata Principle 2: Good metadata supports interoperability.

Teaching, learning, and research today take place in a distributed networked environment. It can be challenging to find resources that are distributed across the world’s libraries, archives, museums, and historical societies. To alleviate this problem, cultural heritage institutions must design their metadata systems to support the interoperability of these distributed systems.

Good metadata should be coherent, meaningful, and useful in global contexts beyond those in which it was created. This means that it must include all pertinent information about the object, since assumptions about the context in which it is accessed locally may no longer be valid in the wider networked environment. For example, a photo archive may not indicate in each record that the object being described is a photograph. However, in the wider network context, form and genre information becomes important. Digital collections with a topical focus are notorious for creating non-interoperable metadata when they assume that users know the main topic of the collection. When this metadata is shared in larger aggregations, descriptions that made sense in the context of the original collection can be mystifying. This has been dubbed the “on a horse” problem, from the description of a photograph in Harvard’s Teddy Roosevelt collection, where the title assigned to the photograph did not indicate who was sitting on the horse, since all the materials in the collection related to Roosevelt.
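One way a data provider might address the “on a horse” problem is to fill in collection-level context before records are shared. The following is a minimal sketch under stated assumptions: the field names loosely follow Dublin Core, and the record, the default values, and the function name make_shareable are all hypothetical illustrations, not taken from any actual collection.

```python
# Hypothetical collection-level context that local users take for granted.
COLLECTION_DEFAULTS = {
    "type": "photograph",
    "subject": "Roosevelt, Theodore, 1858-1919",
}


def make_shareable(record, defaults):
    """Return a copy of record with collection-level context filled in.

    Elements already present in the record take precedence over the defaults,
    and the original record is left unchanged.
    """
    shared = dict(defaults)
    shared.update(record)
    return shared


# Illustrative local record: meaningful inside the collection, cryptic outside it.
local_record = {"title": "On a horse"}
shared_record = make_shareable(local_record, COLLECTION_DEFAULTS)
print(shared_record)
```

The local record stays as the cataloger wrote it; only the copy sent to an aggregator carries the added form and subject information.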

The creation of accessible, meaningful shared collections implies responsibilities on the part of both data providers (organizations that create metadata records and contribute them to federated collections) and service providers (aggregators that provide access to federated collections or union catalogs). Data providers should strive to create consistent, standards-based metadata, to use appropriate controlled vocabularies and thesauri, and to follow appropriate data content (i.e., cataloging) standards. Service providers must implement metadata normalization, remediation, and enhancement, and should, as their name implies, provide additional “value-added” services such as vocabulary-assisted searching, subject clustering, and terminology mapping. Adherence to appropriate standards and collaboration between data providers and service providers are crucial elements of effective aggregated digital collections.

The goal of interoperability is to help users find and access information objects that are distributed across domains and institutions. Use of standard metadata schemes facilitates interoperability by allowing metadata records to be exchanged and shared by systems that support the chosen scheme.

Ideally, metadata schemes should be documented in a registry that provides standardized information for the definition, identification, and use of each data element. A registry defines metadata characteristics and formatting requirements to ensure that a metadata scheme and data elements in use by one organization can be applied consistently within the organization or community, reused by other communities, and interpreted by computer applications as well as human users.

When different metadata schemes must be used, one way to achieve interoperability is to map elements from one scheme to those of another. These mappings, or crosswalks, help users of one scheme to understand another, can be used in automatic translation of searches, and allow records created according to one scheme to be converted to another. If a locally created metadata scheme is used in preference to a standard scheme, a crosswalk to some standard scheme should be developed in anticipation of future interoperability needs.
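A crosswalk can be represented very simply as a table that maps each element of one scheme to an element of another. The sketch below assumes a hypothetical local scheme (photo_title, photographer, and so on) mapped to unqualified Dublin Core element names; the local element names, the sample record, and the function name crosswalk are invented for illustration. Real crosswalks usually also need rules for splitting, merging, and reformatting values, which this sketch omits.

```python
# Hypothetical local element names mapped to unqualified Dublin Core elements.
LOCAL_TO_DC = {
    "photo_title": "title",
    "photographer": "creator",
    "date_taken": "date",
    "caption": "description",
}


def crosswalk(record, mapping):
    """Convert a record from the source scheme to the target scheme.

    Returns a pair (converted, unmapped). converted maps each target element
    to a list of values, since several source elements may share one target.
    unmapped lists source elements that have no target, so that information
    lost in conversion is visible rather than silently dropped.
    """
    converted = {}
    unmapped = []
    for element, value in record.items():
        target = mapping.get(element)
        if target is None:
            unmapped.append(element)
        else:
            converted.setdefault(target, []).append(value)
    return converted, unmapped


# Illustrative local record; "negative_no" has no Dublin Core equivalent here.
local = {
    "photo_title": "Cavalry review",
    "photographer": "Unknown",
    "negative_no": "A-112",
}
dc_record, dropped = crosswalk(local, LOCAL_TO_DC)
print(dc_record)
print(dropped)
```

Reporting the unmapped elements is a design choice: it shows at a glance which local information a given crosswalk cannot carry into the standard scheme.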

Another way to increase interoperability is to support the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). Systems that support the OAI-PMH can expose their metadata to harvesters, allowing their metadata to be included in federated databases and used by external search services.

  • Open Archives Initiative website http://www.openarchives.org/. Links to the Protocol for Metadata Harvesting and guidelines for implementers. 
  • OAIster website http://www.oaister.org/. The University of Michigan’s OAIster search service contains millions of records for digitized cultural heritage materials harvested from hundreds of collections via the OAI-PMH. 
  • Best Practices for OAI Data Provider Implementations and Shareable Metadata website http://webservices.itcs.umich.edu/mediawiki/oaibp/index.php/Main_Page. A joint initiative between the Digital Library Federation and the National Science Digital Library. 
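To give a sense of what a harvester sends to an OAI-PMH data provider, the sketch below builds ListRecords request URLs. The verb, metadataPrefix, set, from, and resumptionToken arguments are defined by the protocol, and oai_dc is the Dublin Core format every OAI-PMH repository must support. The base URL and the function names are hypothetical, and the sketch only constructs URLs; it does not contact any server or parse responses.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build the first ListRecords request for a harvest.

    set_spec optionally restricts the harvest to one set, and from_date
    (for example "2009-01-01") to records changed on or after that date.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    if from_date:
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


def resume_url(base_url, token):
    """Build a follow-up request for the next batch of a large result list.

    resumptionToken is an exclusive argument: it is sent with the verb only.
    """
    return base_url + "?" + urlencode({"verb": "ListRecords", "resumptionToken": token})


# Hypothetical repository address, used purely for illustration.
BASE = "https://repository.example.org/oai"
first_request = list_records_url(BASE, from_date="2009-01-01")
next_request = resume_url(BASE, "abc123")
print(first_request)
print(next_request)
```

A harvester would typically issue the first request, then keep following resumption tokens returned by the repository until no further token is supplied.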

Yet another way to increase interoperability is to support protocols for cross-system searching, also called “metasearch.” Under this model, the metadata remains in the source repository, but the local search system accepts queries from remote search systems. The best-known protocol for cross-system search is the international standard Z39.50, which is being modernized for the web environment.

  • Library of Congress, SRU: Search/Retrieve via URL website http://www.loc.gov/standards/sru/. A standard protocol for passing Z39.50-like search queries in a URL, using the Common Query Language (CQL). This site also links to the SRW (Search/Retrieve Web Service) specification, in which queries are passed not in a URL as in SRU, but as XML messages sent over HTTP using SOAP (Simple Object Access Protocol).
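As a rough illustration of how an SRU query travels in a URL, the sketch below builds a searchRetrieve request. The parameter names (operation, version, query, startRecord, maximumRecords) come from the SRU 1.1 specification; the server address, the function name, and the sample CQL query are assumptions made for the example, and whether a given server supports the dc.title index depends on that server.

```python
from urllib.parse import urlencode


def sru_search_url(base_url, cql_query, maximum_records=10, start_record=1):
    """Build an SRU searchRetrieve URL carrying a CQL query.

    urlencode percent-encodes the query, so spaces, quotes, and the equals
    sign inside the CQL expression are transmitted safely.
    """
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "startRecord": start_record,
        "maximumRecords": maximum_records,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical SRU endpoint and an illustrative CQL title search.
search_url = sru_search_url("https://catalog.example.org/sru", 'dc.title = "roosevelt"')
print(search_url)
```

Because the query is carried entirely in the URL, a remote search system can issue it with an ordinary HTTP GET request, while the metadata itself stays in the source repository.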

 

Last updated: 04/21/2009