Ontology (computer science)

In computer science, an ontology is the product of an attempt to formulate an exhaustive and rigorous conceptual schema for a given domain: a typically hierarchical data structure containing all the relevant entities, their relationships, and the rules that hold within that domain.

T. R. Gruber has described an ontology in this sense as "an explicit specification of a conceptualization".

Although the term 'ontology' has been used very loosely to label almost any conceptual classification scheme, practising computational ontologists hold that a true ontology should contain, at a minimum, not only a hierarchy of concepts organized by the subsumption relation (often called 'isa', 'subtype' or 'subclass'), but also other 'semantic relations' that specify how one concept is related to another. The most common of the semantic relations other than subsumption is the 'part-of' relation. In one formal notation, one might see a relation such as (isPartOf Spine Vertebrate), meaning that a 'Spine' (in that specific sense) is part of a Vertebrate. Ontologies are organized by concepts, not words, so the concept 'spine' referring to the spine of a book would have to be labeled by a different term, such as 'BookSpine'.
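The distinction between subsumption and other semantic relations can be illustrated with a minimal Python sketch. It is not taken from any particular ontology system: the Ontology class, its method names, and the concepts Dog, Mammal, Animal and Book are hypothetical and chosen only for illustration. Only Spine, Vertebrate, BookSpine and isPartOf come from the text above.

```python
from collections import defaultdict


class Ontology:
    """Toy concept store (hypothetical, for illustration only)."""

    def __init__(self):
        # concept -> set of direct superconcepts (the subsumption hierarchy)
        self.parents = defaultdict(set)
        # other semantic relations, stored as (relation, subject, object)
        self.relations = set()

    def add_isa(self, child, parent):
        self.parents[child].add(parent)

    def add_relation(self, name, subject, obj):
        self.relations.add((name, subject, obj))

    def ancestors(self, concept):
        """Collect every concept that subsumes `concept`, directly or not."""
        seen = set()
        stack = [concept]
        while stack:
            current = stack.pop()
            for parent in self.parents.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def is_a(self, concept, candidate):
        return concept == candidate or candidate in self.ancestors(concept)

    def holds(self, name, subject, obj):
        """True only if the relation was asserted explicitly."""
        return (name, subject, obj) in self.relations


onto = Ontology()
# Assumed example hierarchy; not part of the original text.
onto.add_isa("Dog", "Mammal")
onto.add_isa("Mammal", "Vertebrate")
onto.add_isa("Vertebrate", "Animal")
# (isPartOf Spine Vertebrate) from the text, plus the distinct BookSpine concept.
onto.add_relation("isPartOf", "Spine", "Vertebrate")
onto.add_relation("isPartOf", "BookSpine", "Book")

print(onto.is_a("Dog", "Animal"))
print(onto.holds("isPartOf", "BookSpine", "Vertebrate"))
```

Because concepts rather than words are the keys, 'Spine' and 'BookSpine' never collide, even though both would be called "spine" in ordinary English.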

This usage is different from, but related to, the philosophical meaning of the word ontology: the study of existence. The purpose of a computational ontology is not to specify what does or does not 'exist', but to create a database, which is a human artifact, containing concepts that refer to entities of interest to the ontologist and that will be useful in performing certain types of computations. For this reason, the abstruse reasoning used by philosophical ontologists can be helpful in recognizing and avoiding potential logical ambiguities, but where alternative ontological representations can serve the pragmatic purpose of the computational ontologist equally well, time constraints usually dictate that one choice is made and the others are ignored. For certain purposes, it can be better to ignore many of the details of the objects of interest. As a result, computational ontologies developed independently for different purposes will often differ greatly from each other.

Ontologies are commonly used in artificial intelligence and knowledge representation. Computer programs can use an ontology for a variety of purposes, including inductive reasoning, classification, and a variety of problem-solving techniques, as well as to facilitate communication and sharing of information between different systems.
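As a rough illustration of how a program might reason over an ontology, the following Python sketch derives a fact that was never stated directly. It assumes one simple rule that real systems may or may not adopt: a concept inherits the parts of every concept that subsumes it. The rule is deductive, the simplest kind of inference to show. The names SUBCLASS_OF, PART_OF, superclasses and inferred_parts are hypothetical, as are the concepts Dog, Mammal and Animal.

```python
# Assumed toy facts; each concept has at most one direct superclass here.
SUBCLASS_OF = {
    "Dog": "Mammal",
    "Mammal": "Vertebrate",
    "Vertebrate": "Animal",
}
# (part, whole) pairs, i.e. (isPartOf part whole)
PART_OF = {("Spine", "Vertebrate")}


def superclasses(concept):
    """Walk up the subsumption chain and return it, nearest first."""
    chain = []
    while concept in SUBCLASS_OF:
        concept = SUBCLASS_OF[concept]
        chain.append(concept)
    return chain


def inferred_parts(concept):
    """Parts asserted for the concept itself or for anything subsuming it."""
    kinds = [concept] + superclasses(concept)
    return {part for part, whole in PART_OF if whole in kinds}


print(inferred_parts("Dog"))     # Dog is a Vertebrate, so it has a Spine
print(inferred_parts("Animal"))  # nothing is asserted at this level
```

Nothing in the stored facts says that a Dog has a Spine; the program concludes it by combining the subsumption chain with the part-of assertion.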

An ontology that is not tied to a particular problem domain but attempts to describe general entities is known as a foundation ontology or upper ontology. Typically, more specialized schemas must be created to make the data useful for real-world decisions.

Such ontologies are commercially valuable, creating competition to define them. Peter Murray-Rust has claimed that this leads to "semantic and ontological warfare due to competing standards", and accordingly any standard foundation ontology is likely to be contested among commercial or political parties, each with its own idea of 'what exists' (in the philosophical sense). No single upper ontology has yet gained widespread acceptance as a de facto standard. Different organizations are attempting to define standards for specific domains. The 'Process Specification Language' (PSL) created by the National Institute of Standards and Technology (NIST) is one example.

Available ontologies

A well-known and quite comprehensive ontology available today is Cyc, a proprietary system under development since 1985, consisting of a foundation ontology and several domain-specific ontologies (called microtheories). A subset of that ontology has been released for free under the name OpenCyc (see http://opencyc.org/).

WordNet, a freely available database originally designed as a semantic network based on psycholinguistic principles, was expanded by the addition of definitions and is now also viewed as a dictionary. It qualifies as an upper ontology by including the most general concepts as well as more specialized concepts, related to each other not only by the subsumption relation but by other semantic relations as well, such as part-of and cause. However, unlike Cyc, it has not been formally axiomatized so as to make the logical relations between the concepts precise. It has been widely used in natural language processing research.

The Suggested Upper Merged Ontology (SUMO) is another attempt to define an upper ontology, created by an IEEE working group (predominantly by a group at Teknowledge) and freely available.

Ontology languages

To be useful, ontologies must be expressed in a concrete notation. An ontology language is a formal language by which an ontology is built. There have been a number of data languages for ontologies, both proprietary and standards-based:

The Cyc project had its own ontology language based on first-order logic, called CycL.
KIF (Knowledge Interchange Format) was, among other things, another ontology language.
OWL (Web Ontology Language) is a language for making ontological statements, developed as a follow-on from RDF and RDFS as well as from earlier ontology language projects, including OIL and DAML.