Metadata

One of the most challenging aspects of the digital environment is the identification of resources available on the Web. The existence of searchable descriptive metadata increases the likelihood that digital content will be discovered and used. Collection-level metadata is addressed in the COLLECTIONS section of this document (see COLLECTIONS Principle 2). This section addresses the description of individual objects and sets of objects within collections.

Metadata is structured information associated with an object for purposes of discovery, description, use, management, and preservation.

Metadata creation is an incremental process that should be a shared responsibility among various parts of an institution. Different types of metadata can be added by different people at various stages of an information object’s life cycle. For example, at the creation stage, metadata about an object’s authors, contributors, source, and intended audience could be provided by the original authors. At the organization stage, metadata about subjects, publishing history, and access rights could be recorded by catalogers or indexers. At the access and usage stage, evaluative information such as reviews and annotations could be added by the user. Creators of digital objects should be encouraged to embed as much metadata as possible within the object before it is shared or distributed. On the life cycle of information objects, see the article by Gail Hodge and the first chapter of Baca, Introduction to Metadata, both cited below under “General introductions to metadata issues.”

It is common to distinguish between three basic kinds of metadata. Descriptive metadata helps users find and obtain objects, distinguish one object or group of objects from another, and discover the subject or contents. Administrative metadata helps collection managers keep track of objects for such purposes as file management, rights management, and preservation. Structural metadata documents relationships within and among objects and enables users to navigate complex objects, such as the pages and chapters of a book.
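The three kinds of metadata can be pictured as separate parts of one record. The following is a minimal sketch, not a recommended schema: every class name, field name, and sample value is hypothetical and chosen only to illustrate the distinction.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DescriptiveMetadata:
    """Helps users find, identify, and select the object."""
    title: str
    creator: str
    subjects: List[str] = field(default_factory=list)


@dataclass
class AdministrativeMetadata:
    """Helps collection managers track files, rights, and preservation."""
    file_name: str
    rights_statement: str
    checksum: str


@dataclass
class StructuralMetadata:
    """Records the order of parts so users can navigate a complex object."""
    page_files: List[str] = field(default_factory=list)

    def page_number(self, file_name: str) -> int:
        """Return the 1-based position of a page image within the object."""
        return self.page_files.index(file_name) + 1


@dataclass
class DigitalObjectRecord:
    """One record that keeps the three kinds of metadata apart."""
    descriptive: DescriptiveMetadata
    administrative: AdministrativeMetadata
    structural: StructuralMetadata


# Invented sample values, for illustration only.
record = DigitalObjectRecord(
    descriptive=DescriptiveMetadata(
        title="Ship's logbook",
        creator="Unknown",
        subjects=["Voyages"],
    ),
    administrative=AdministrativeMetadata(
        file_name="logbook.zip",
        rights_statement="Rights status not evaluated",
        checksum="(placeholder)",
    ),
    structural=StructuralMetadata(
        page_files=["page_001.tif", "page_002.tif", "page_003.tif"],
    ),
)

print(record.structural.page_number("page_002.tif"))
```

Keeping the parts separate reflects the point made earlier that different people can supply different metadata at different stages of an object's life cycle.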

A primary reason for building digital collections is to increase access to the resources held by the organization. Creating broadly accessible descriptive metadata is a way to maximize access by current users and attract new user communities. Examples of metadata-based access tools include library catalogs, archival finding aids, museum inventory control systems, and search utilities such as Google.

Over the years, various metadata schemes have been developed for describing different types of objects. Within this multiplicity of schemes, there is a degree of consistency that supports interoperability. For example, most schemes provide for a creator or contributor name, date, title, and identifier. As cultural heritage institutions explore the metadata standards being adopted within their field, they should address interoperability early in their metadata implementation, so that their records are as likely as possible to work with other systems (see Metadata Principle 2 and Objects Principle 3). Institutions must consider not only which metadata schemes and information protocols are best suited to their collections, but also which controlled vocabularies and thesauri to implement (see Metadata Principle 3) and which data content (i.e., cataloging) standards are most suitable for the objects in their collections. There are long-established cataloging guidelines such as AACR (Anglo-American Cataloguing Rules), and newer and emerging standards such as DACS (Describing Archives: A Content Standard), CCO (Cataloging Cultural Objects), and RDA (Resource Description and Access). The cataloging standard that an institution chooses to follow, or the adaptation or combination of cataloging standards selected, is a key factor in providing good end-user access and creating sharable metadata records that work well in aggregated collections. See “Guidelines for Use” in the chart under Metadata Principle 1.
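Because most schemes share a small core of elements, a local record can often be mapped to a common scheme for sharing. The sketch below assumes a hypothetical local database whose field names (object_title, maker, and so on) are invented for illustration; the target names are the title, creator, date, and identifier elements of Dublin Core. It is one simple way to express a crosswalk, not a prescribed method.

```python
from typing import Dict, List, Tuple

# Hypothetical local field names mapped to Dublin Core element names.
LOCAL_TO_DC = {
    "object_title": "title",
    "maker": "creator",
    "date_made": "date",
    "accession_number": "identifier",
}


def crosswalk(local_record: Dict[str, str]) -> Tuple[Dict[str, List[str]], List[str]]:
    """Map a local record to Dublin Core elements.

    Returns the mapped record plus the local fields that have no
    equivalent in the mapping, so that the loss is visible.
    """
    mapped: Dict[str, List[str]] = {}
    unmapped: List[str] = []
    for name, value in local_record.items():
        dc_element = LOCAL_TO_DC.get(name)
        if dc_element is None:
            unmapped.append(name)
        else:
            # Dublin Core elements are repeatable, so values are kept in lists.
            mapped.setdefault(dc_element, []).append(value)
    return mapped, unmapped


# Invented sample record, for illustration only.
local = {
    "object_title": "Harbor at Dusk",
    "maker": "Unknown photographer",
    "date_made": "1912",
    "accession_number": "1998.14.3",
    "gallery_location": "Room 4",
}

dc_record, not_mapped = crosswalk(local)
print(dc_record)
print(not_mapped)
```

Reporting the unmapped fields is a deliberate choice in this sketch: a crosswalk to a simpler scheme usually drops some local detail, and it helps to know what was dropped.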

There is usually a direct relationship between the cost of metadata creation and the benefit to the user: describing each item is more expensive than describing collections or groups of items; using a rich, complex metadata scheme is more expensive than using a simple metadata scheme; applying standard subject vocabularies and classification schemes is more expensive than assigning a few uncontrolled keywords; and so on. However, expenditures in development often result in greater efficiency and effectiveness for the end user. Use of a standardized subject thesaurus or other controlled vocabulary, for example, can provide greater precision and recall in searching, and can enable future functionality, such as faceted subject browsing and dynamic searching of subject matter.

The following table, taken from Anne Gilliland’s essay in Introduction to Metadata (revised edition, 2008, cited below), provides a typology of data standards and how they should work together, with examples.

Type of Data Standard: Data structure standards (metadata element sets, schemas). These are “categories” or “containers” of data that make up a record or other information object.
Examples: the set of MARC (Machine-Readable Cataloging format) fields, Encoded Archival Description (EAD), Dublin Core Metadata Element Set (DCMES), Categories for the Description of Works of Art (CDWA), VRA Core Categories

Type of Data Standard: Data value standards (controlled vocabularies, thesauri, controlled lists). These are the terms, names, and other values that are used to populate data structure standards or metadata element sets.
Examples: Library of Congress Subject Headings (LCSH), Library of Congress Name Authority File (LCNAF), LC Thesaurus for Graphic Materials (TGM), Medical Subject Headings (MeSH), Art & Architecture Thesaurus (AAT), Union List of Artist Names (ULAN), Getty Thesaurus of Geographic Names (TGN), ICONCLASS

Type of Data Standard: Data content standards (cataloging rules and codes). These are guidelines for the format and syntax of the data values that are used to populate metadata elements.
Examples: Anglo-American Cataloguing Rules (AACR), Resource Description and Access (RDA), International Standard Bibliographic Description (ISBD), Cataloging Cultural Objects (CCO), Describing Archives: A Content Standard (DACS)

Type of Data Standard: Data format/technical interchange standards (metadata standards expressed in machine-readable form). This type of standard is often a manifestation of a particular data structure standard (type 1 above), encoded or marked up for machine processing.
Examples: MARC21, MARCXML, EAD XML DTD, METS, MODS, CDWA Lite XML schema, Simple Dublin Core XML schema, Qualified Dublin Core XML schema, VRA Core 4.0 XML schema
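To see how the four types of standard in the table can meet in a single record, consider the rough sketch below. It assumes Dublin Core elements as the data structure, a subject term of the kind a controlled vocabulary would supply as the data value, a made-up local content rule (names inverted, dates as four-digit years), and XML as the interchange format. The wrapper element name `record`, the helper functions, and the sample values are all hypothetical, and the sample values have not been checked against any real vocabulary or cataloging code.

```python
import re
import xml.etree.ElementTree as ET

# Data structure: elements from the Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def follows_content_rule(creator: str, date: str) -> bool:
    """Data content: a hypothetical local rule, assumed for this sketch.

    Names must be inverted ("Surname, Forename") and dates must be
    four-digit years.
    """
    return ", " in creator and re.fullmatch(r"\d{4}", date) is not None


def build_record(title: str, creator: str, subject: str, date: str) -> ET.Element:
    """Data format: express the record as XML for machine processing."""
    if not follows_content_rule(creator, date):
        raise ValueError("creator or date does not follow the local content rule")
    root = ET.Element("record")  # hypothetical wrapper element
    for element, value in (
        ("title", title),
        ("creator", creator),
        ("subject", subject),  # data value: would come from a controlled vocabulary
        ("date", date),
    ):
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return root


# Invented sample values, for illustration only.
record = build_record(
    title="Harbor at Dusk",
    creator="Doe, Jane",
    subject="Harbors",
    date="1912",
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

A production record would normally follow a published XML schema such as those listed in the table rather than an ad hoc wrapper element.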

The decisions about which metadata standard(s) to adopt and what levels of description to apply must be made within the context of the organization's purpose for creating the collection, the available human and technical resources, the users and intended usage, and approaches adopted within the particular field of inquiry or knowledge domain.

Questions to consider include, but are not limited to:

  • What is the purpose of the digital collection? 
  • What are the goals and objectives for building this collection? 
  • Who are the targeted users? What information do they need, and what is their typical information-seeking behavior? 
  • Are the materials to be accessed at the collection level or as individual items, or both? 
  • Do multiple versions or manifestations of the object need to be distinguished from each other? 
  • Will the collection or its objects have metadata before the digital collection is built? 
  • What subject discipline will be involved? What are the metadata standards that are commonly used within this discipline? 
  • What metadata standards are used by organizations in this domain? Which ones are most appropriate for this particular collection? 
  • How rich a description is needed, and does the metadata need to convey hierarchical relationships? 

Institutions should be aware that, depending upon the nature of their collections, a single metadata scheme may not suffice for all their needs. Thus a judicious combination of metadata schemes may be the best solution for some materials: for example, using EAD as the scheme at the collection level for archival collections with a common provenance, and MODS, VRA Core 4.0, CDWA Lite, or another appropriate scheme at the item level. Likewise, a well-thought-out selection of controlled vocabularies, published and collection-specific, should be applied as the data values to populate key access elements within the selected schemes.
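The effect of applying a controlled vocabulary to a key access element can be sketched in a few lines. The example assumes a tiny, hypothetical collection-specific vocabulary that maps variant keywords to a preferred term; the terms and sample records are invented, and real vocabularies are far larger and usually carry relationships that this sketch ignores.

```python
from collections import Counter
from typing import Iterable, List, Optional

# Hypothetical collection-specific vocabulary: variant form -> preferred term.
PREFERRED_TERMS = {
    "lighthouse": "Lighthouses",
    "lighthouses": "Lighthouses",
    "light house": "Lighthouses",
    "harbour": "Harbors",
    "harbor": "Harbors",
    "harbors": "Harbors",
}


def normalize_subject(keyword: str) -> Optional[str]:
    """Return the preferred term for a keyword, or None if it is not in the vocabulary."""
    key = " ".join(keyword.lower().split())  # fold case and collapse whitespace
    return PREFERRED_TERMS.get(key)


def facet_counts(records: Iterable[List[str]]) -> Counter:
    """Count how many records carry each preferred subject term."""
    counts: Counter = Counter()
    for keywords in records:
        terms = {normalize_subject(k) for k in keywords}
        terms.discard(None)  # uncontrolled keywords contribute no facet
        counts.update(terms)
    return counts


# Invented sample records: each is the list of keywords assigned to one object.
records = [
    ["Lighthouse", "harbour"],
    ["light  house"],
    ["Harbors", "fog"],
]

facets = facet_counts(records)
for term, count in sorted(facets.items()):
    print(term, count)
```

With uncontrolled keywords, the three spellings of the lighthouse term would be three separate search targets; after normalization they collect under one heading, which is what makes a faceted subject browse possible.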

Metadata Principle 1: Good metadata conforms to community standards in a way that is appropriate to the materials in the collection, users of the collection, and current and potential future uses of the collection.

Metadata Principle 2: Good metadata supports interoperability.

Metadata Principle 3: Good metadata uses authority control and content standards to describe objects and collocate related objects. 

Metadata Principle 4: Good metadata includes a clear statement of the conditions and terms of use for the digital object.

Metadata Principle 5: Good metadata supports the long-term curation and preservation of objects in collections.

Metadata Principle 6: Good metadata records are objects themselves and therefore should have the qualities of good objects, including authority, authenticity, archivability, persistence, and unique identification.  

Last updated: 04/17/2008