What is Metadata?

Dirk Riehle, riehle@acm.org, www.riehle.org
Credit Suisse, Postfach 100, 8070 Zurich, Switzerland.

Position paper for OOPSLA '99 Workshop on Metadata and Active Object Models.
Paper location: http://www.riehle.org/papers/1999/oopsla-1999-ws-21-pp.html

1 Definition

Metadata are data about (some other) data. They describe the structure and meaning of this other data, and they may control the processes and processing steps applied to it. "Other data" means any kind of data, be it base data or metadata.

2 Examples

2.1 RDBMS example

Consider an RDBMS table description of a CUSTOMER table. The table description defines that the customer table has columns like CUSTOMER_ID, FIRST_NAME, LAST_NAME, etc. This table description is itself data, but it is different data from a row in the table, which represents a specific customer.

The table description is an example of metadata. A row in this table is base data.
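The distinction can be sketched in a few lines of Python using the standard sqlite3 module. This is only an illustration: the in-memory database, the column types, and the sample customer are assumptions, not part of the original example.

```python
import sqlite3

# Hypothetical in-memory database; the column types are assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CUSTOMER ("
    "CUSTOMER_ID INTEGER PRIMARY KEY, "
    "FIRST_NAME TEXT, "
    "LAST_NAME TEXT)"
)
conn.execute(
    "INSERT INTO CUSTOMER (CUSTOMER_ID, FIRST_NAME, LAST_NAME) VALUES (?, ?, ?)",
    (1, "Ada", "Lovelace"),  # made-up sample customer
)

# Metadata: the table description, read from the database catalog.
# Each row of PRAGMA table_info is (cid, name, type, notnull, dflt_value, pk).
column_names = [row[1] for row in conn.execute("PRAGMA table_info(CUSTOMER)")]

# Base data: one row representing a specific customer.
customer_row = conn.execute(
    "SELECT * FROM CUSTOMER WHERE CUSTOMER_ID = 1"
).fetchone()

print(column_names)
print(customer_row)
```

Both queries return ordinary data, which is the point of the example: the table description can be stored and queried just like the rows it describes.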

2.2 Business rules example

Consider the data extraction process of loading data from a host-based operational system into a data warehousing database. Customer data flows from the host to the data warehousing database. During the process, transformation rules check the customer data for quality and make certain changes to the data.

The transformation rules for cleansing the data are an example of metadata. The customer data is base data.
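One way to picture this is to represent the transformation rules as data that a generic loading step interprets. The sketch below is an assumption-laden illustration: the TransformationRule class, the cleanse function, and the concrete rules are hypothetical and not taken from any particular data warehousing tool.

```python
from dataclasses import dataclass
from typing import Callable

Record = dict[str, str]


@dataclass(frozen=True)
class TransformationRule:
    """Metadata: describes one cleansing step for a named field."""
    field: str
    description: str
    transform: Callable[[str], str]


# Hypothetical cleansing rules, kept as data separate from the records.
RULES = [
    TransformationRule("FIRST_NAME", "trim surrounding whitespace", str.strip),
    TransformationRule("LAST_NAME", "trim surrounding whitespace", str.strip),
    TransformationRule("LAST_NAME", "normalize capitalization", str.title),
]


def cleanse(record: Record, rules: list[TransformationRule]) -> Record:
    """Apply every rule whose field is present; the input record is not modified."""
    cleaned = dict(record)
    for rule in rules:
        if rule.field in cleaned:
            cleaned[rule.field] = rule.transform(cleaned[rule.field])
    return cleaned


# Base data: a customer record as it might arrive from the host system.
host_record = {"CUSTOMER_ID": "42", "FIRST_NAME": "  Grace ", "LAST_NAME": " hopper"}
warehouse_record = cleanse(host_record, RULES)
print(warehouse_record)
```

Because the rules are data, they can be changed or extended without touching the cleanse function.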

2.3 Object-oriented example

Consider an object-oriented system of banking products. A key product category is accounts. Typically, there are many hundreds of different account types. Therefore, the system has a class Account and a class AccountType that provides information about a specific type of account.

The class AccountType provides metadata about specific Account instances, which are its base data. (By the way, AccountType is an instance of the Type Object pattern.)
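A minimal sketch of this arrangement in Python follows. Only the class names Account and AccountType come from the text; the attributes (interest_rate, overdraft_allowed, balance) and the withdraw method are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class AccountType:
    """Metadata: one instance per kind of account (hypothetical attributes)."""
    name: str
    interest_rate: Decimal
    overdraft_allowed: bool


@dataclass
class Account:
    """Base data: one instance per concrete account."""
    number: str
    account_type: AccountType
    balance: Decimal = Decimal("0")

    def withdraw(self, amount: Decimal) -> None:
        # The account consults its type object instead of hard-coding the rule.
        if amount > self.balance and not self.account_type.overdraft_allowed:
            raise ValueError(f"{self.account_type.name} accounts cannot be overdrawn")
        self.balance -= amount


savings = AccountType("Savings", Decimal("0.02"), overdraft_allowed=False)
checking = AccountType("Checking", Decimal("0.00"), overdraft_allowed=True)

savings_account = Account("CH-001", savings, Decimal("100"))
checking_account = Account("CH-002", checking, Decimal("100"))

checking_account.withdraw(Decimal("150"))  # allowed by the account type
try:
    savings_account.withdraw(Decimal("150"))  # rejected by the account type
except ValueError as error:
    print(error)
```

Adding a new kind of account then means creating another AccountType instance rather than writing another subclass of Account, which matters when there are hundreds of account types.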

3 Classification

Metadata may be

  • active or passive;
  • business or technical;
  • about other metadata or about base data.

3.1 Active and passive metadata

Active metadata is metadata that controls its base data and is operational in a general sense of the word. It not only describes its base data but defines how the base data is to be interpreted and used. In object-oriented systems, metadata become metaobjects, which unite the metadata with the associated control functions of the base data.

Passive metadata is purely descriptive metadata that is not used to control anything. Examples of passive metadata are textual descriptions, such as comments on some base data. Passive metadata is directed at the end-user who knows how to read and interpret it.
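The difference can be sketched with a single field description that carries both kinds of metadata. The FieldMetadata class, its attributes, and the store function below are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMetadata:
    name: str
    comment: str     # passive: read by people, never evaluated by the system
    max_length: int  # active: enforced whenever base data is stored


FIELDS = [
    FieldMetadata("FIRST_NAME", "Given name as shown on the contract", 30),
    FieldMetadata("LAST_NAME", "Family name as shown on the contract", 30),
]


def store(values: dict[str, str], fields: list[FieldMetadata]) -> dict[str, str]:
    """Accept the values only if they satisfy the active metadata."""
    for field in fields:
        value = values[field.name]
        if len(value) > field.max_length:
            raise ValueError(f"{field.name} exceeds {field.max_length} characters")
    return dict(values)


stored = store({"FIRST_NAME": "Ada", "LAST_NAME": "Lovelace"}, FIELDS)
print(stored)
```

Changing a comment has no effect on what store accepts, whereas changing max_length changes the behavior of the system.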

3.2 Business or technical metadata

Metadata has different users. Depending on these users, metadata may be viewed either as business metadata, describing business concepts and items, or as technical metadata, describing technical concepts or items. Technical metadata may include how business data is mapped onto an implementation structure, for example, how a high-level E/R model is mapped onto an RDBMS schema.

Frequently, business data maps directly onto technical data, so that there is no need to distinguish these two categories. Also, this classification may be extended to incorporate further stakeholders of the data warehousing process, like administrators.

3.3 Meta-metadata

The high road of metadata is data about metadata, or, for short, meta-metadata.

Meta-metadata are data that describe (and, in the case of active meta-metadata, control) the operations of metadata. This is more common than one might think. The definitions of modeling languages like E/R or UML are meta-metadata. Also, the modeling language extensions of UML for data warehousing are meta-metadata.

You need to deal with meta-metadata if you want to integrate metadata from different sources. This situation is the typical tool integration scenario as found in all kinds of heterogeneous tool environments.

For example, in data warehousing, you have to integrate metadata from three kinds of sources: the staging tools, the data warehouses and data marts, and the analysis systems.
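As a rough sketch of why meta-metadata helps here, suppose two tools describe the same table in different formats. A description of each tool's metadata format, expressed against a common model, lets a generic function translate both. The tool names, key names, and the META_MODEL structure below are all assumptions made for illustration.

```python
# Metadata as two hypothetical tools might export it, each in its own format.
staging_metadata = {"tbl": "CUSTOMER", "cols": ["CUSTOMER_ID", "FIRST_NAME"]}
warehouse_metadata = {
    "table_name": "CUSTOMER",
    "columns": ["CUSTOMER_ID", "FIRST_NAME"],
}

# Meta-metadata: describes how each tool's metadata is structured,
# as a mapping from common-model keys to the tool's own keys.
META_MODEL = {
    "staging": {"table": "tbl", "columns": "cols"},
    "warehouse": {"table": "table_name", "columns": "columns"},
}


def to_common(tool: str, metadata: dict) -> dict:
    """Translate tool-specific metadata into the common model."""
    mapping = META_MODEL[tool]
    return {common_key: metadata[tool_key] for common_key, tool_key in mapping.items()}


common_from_staging = to_common("staging", staging_metadata)
common_from_warehouse = to_common("warehouse", warehouse_metadata)
print(common_from_staging == common_from_warehouse)
```

Supporting a further tool in this sketch means adding one more entry to META_MODEL rather than writing another translation function.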

4 Further Aspects

The University of Zurich and Swiss Life provide a technical report that discusses further properties of metadata with a specific focus on data warehousing. Please see: Martin Staudt, Anca Vaduva, Thomas Vetterli. Metadata Management and Data Warehousing. Report 21. Zurich, Switzerland: Swiss Life, July 1999.

Copyright (©) 2007 Dirk Riehle. Some rights reserved. (Creative Commons License BY-NC-SA.) Original Web Location: http://www.riehle.org