
Domain Modeling with OWL - Part 2

10.30.2012

Why Description Logic Matters

In this second installment of the OWL introductory series, we will be doing a little bit of math. If you ever plan to create a moderately complex OWL 2 model, understanding the mathematical foundations will give you the right intuitions and spare you unpleasant surprises. The math is especially relevant to the reasoning portion of the OWL toolset and tightly integrated reasoning services are one of the main distinguishing features of OWL as opposed to many other knowledge representation languages.

Last time, I gave you a bit of historical context on OWL and I mentioned that it is a fragment of first-order logic. This is strictly speaking correct, but a bit misleading as far as the original motivation of the formalism goes. OWL 2.0 is based on Description Logic (henceforth DL), which was developed in order to put semantic networks (or "conceptual graphs") and frames on a firm foundation. Several variants of these conceptual networks were proposed in the 70s, but the ideas at the core of DL first appeared as something called structured inheritance networks. Specifically, those ideas are the use of classes, instances and properties, combined with means to create complex logical expressions and, most importantly, the ability to make inferences about sub-classing and class membership. And this last aspect, the fact that non-trivial IS-A relationships between two classes or between a class and an instance can be inferred, is the crucial contribution of that work and the reason it eventually got a comprehensive mathematical treatment, gaining the name Description Logic.

When you create a domain model in OWL, you can state various constraints between classes and properties and those constraints have logical consequences. You are actually able to create a rather rich logical model of your domain and then ask interesting questions about it. Part of the knowledge that you would have represented is explicit. That's simply the collection of statements that you make to describe the domain. You can retrieve that knowledge by using the standard semantic web query language SPARQL or by direct API calls. No reasoning is involved here, just plain data querying. But another portion of your knowledge base is the implicit knowledge that can be derived as a consequence of your statements. Accessing that implicit knowledge requires a precise logical interpretation, reasoning services and an appropriate expression language to formulate queries. That is what DL is about.

In what follows, I will introduce the actual mathematical formalism and show you side-by-side how it maps to OWL 2.0. Then we will discuss a few uncommon assumptions that OWL reasoners make and what their consequences are. One of the goals of this introduction is to give you a head start should you want to read more about the subject of Description Logic.

For more information and an extensive literature review, please consult The Description Logic Handbook.

Let's Get Formal

Formal languages, mathematical logic languages just like programming languages, are specified in two parts: syntax and semantics. Logic languages in particular are traditionally specified through what is known as Tarski-style semantics, where the meaning of an expression in the language is ascribed via a correspondence with set-theoretical constructs. This allows the whole apparatus of model theory to be applied, and classical proof techniques can be used to show whether a language is decidable and, if so, to what complexity class it belongs.

Here is a summary of the DL language elements:

  • Atomic concepts, usually written in capital letters from the beginning of the alphabet: A, B, C, D etc. Those correspond to OWL classes.
  • The special concepts top and bottom, denoted by ⊤ and ⊥ respectively. In OWL, those are referred to as owl:Thing and owl:Nothing.
  • Roles, also in capital but from another part of the alphabet: R,S etc. DL roles are equivalent to OWL properties.
  • Individuals usually written in lower case: a, b, c etc. In OWL we call them individuals as well.
  • Logical operators like ⊓ (intersection), ⊔ (union), ∀ (every), ∃ (there exists), ¬ (complement) as well as number comparisons ≤, ≥ etc. In OWL, all those operators are represented differently depending on the OWL syntax used: XML, functional, or the very concise Manchester syntax.

The difference in terminology between OWL and DL shouldn't lead to confusion. Knowing the original DL terms and history should help one understand the meaning OWL gives to what are otherwise standard OO notions. For example, knowing that OWL properties come from DL roles hints at the idea of a connection between two entities, rather than one "belonging" to the other. I will be using both DL and OWL terms freely, and they should be considered synonymous. The single-letter naming conventions are used only when doing the math. In modeling, both DL and OWL use more descriptive names, usually camel case, starting with a capital letter for concepts and individuals, and with a lower-case letter for roles.

So the core elements in the language are concepts, roles and individuals. The interpretation of those elements is based on a universal set of things that are being talked about, a domain, and a mapping that assigns individual names to elements of that domain, concepts to subsets of the domain and roles to relations over the domain. Thinking about concepts as sets is not far from seeing them as classes, both in the mathematical and in the programming sense.

More formally, the meaning of the language elements is defined by an interpretation function ℑ that assigns a set to each named DL concept. We write xℑ (a superscript in standard notation) instead of ℑ(x) for whatever the interpretation maps x to. The domain of interpretation, or application domain in software engineering terms, is denoted by Δℑ. One may use just Δ for the set of individuals being talked about, the domain of discourse. This is an important distinction in mathematical logic, as one is constantly moving between formulas and what they denote. As an example, top and bottom are formally interpreted as the whole domain and the empty set respectively:

⊤ℑ = Δℑ
⊥ℑ = ∅

When proving theorems about a logic language, one frequently reasons about different possible interpretations and each interpretation is called a model, not to be confused with our domain models in software. A model may offer a different mapping of individuals, concepts and roles, but it may not change the meaning of the logical operators. For example, the intersection operation is analogous to what would be logical conjunction ∧ and is always interpreted as set intersection:

(C ⊓ D)ℑ = Cℑ ∩ Dℑ

Intuitively, half of what DL allows you to do is express concept descriptions, i.e. descriptions of sets of individuals. The atomic concepts and the roles are elementary descriptions and then you have operators in the syntax to make up more complex ones. The other half is description of individuals in terms of how they are classified and how they are related to other individuals.

In logical terms, concepts can be seen as one-place predicates and roles as two-place predicates. In fact, the claim that DL is a fragment of first-order logic (FOL) starts with that correspondence. Then it is easy to see how the formulas and statements below can be translated into FOL. So one can state in DL that a given individual belongs to a concept:

C(a) - a belongs to C, akin to OWL's ClassAssertion axiom.

or that two individuals are related by a role:

R(a,b) - b is a filler of the role R (or an R-filler) for a. This is akin to OWL's ObjectPropertyAssertion that we saw last time.

Note the phrasing here: b is "a" filler, not "the" filler, as there may be many. But the more interesting part is the complex concept descriptions that one is allowed to form.
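
Since we'll be coding against the Java OWLAPI later in the series, here's how these two kinds of assertions look there. This is a minimal sketch; the ontology IRI and all the names are invented for illustration:

import org.semanticweb.owlapi.apibinding.OWLManager;
import org.semanticweb.owlapi.model.*;

public class AboxAssertions {
    public static void main(String[] args) throws OWLOntologyCreationException {
        OWLOntologyManager m = OWLManager.createOWLOntologyManager();
        OWLDataFactory df = m.getOWLDataFactory();
        String ns = "http://example.org/cars#";
        OWLOntology ont = m.createOntology(IRI.create("http://example.org/cars"));

        OWLClass car = df.getOWLClass(IRI.create(ns + "Car"));                      // a concept C
        OWLObjectProperty owns = df.getOWLObjectProperty(IRI.create(ns + "owns")); // a role R
        OWLNamedIndividual tom = df.getOWLNamedIndividual(IRI.create(ns + "Tom")); // individual a
        OWLNamedIndividual h1 = df.getOWLNamedIndividual(IRI.create(ns + "H1"));   // individual b

        m.addAxiom(ont, df.getOWLClassAssertionAxiom(car, h1));                // C(a)
        m.addAxiom(ont, df.getOWLObjectPropertyAssertionAxiom(owns, tom, h1)); // R(a,b)
    }
}

The later sketches in this article reuse the m, df, ns and ont variables declared here.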

Describing Concepts

The power of DL as a language lies in its ability to describe classes of entities via complex logical formulas. That is what makes it into a useful logic language. In the table below you can see the list of available constructors for building those complex descriptions. The third column shows the equivalent OWL syntax in both the standard OWL Functional Syntax and the Manchester Syntax. I won't be covering the bloated and ugly XML syntax. IMHO, pushing XML as the default serialization mechanism for RDF/OWL is probably an important reason for the slowish adoption of the technology. The functional syntax is both complete and user-friendly. The Manchester syntax is incomplete, but even better looking than DL's own, and it is used in Protege whenever class expressions are needed. So I'm showing both of those.

DL Syntax | Name | OWL Syntax (functional / Manchester) | Meaning
⊥ | Bottom | owl:Nothing | The empty set.
⊤ | Top | owl:Thing | The entire domain of interest.
C ⊓ D | Intersection | ObjectIntersectionOf(C D) / C and D | The set of individuals that belong to both C and D.
C ⊔ D | Union | ObjectUnionOf(C D) / C or D | The individuals that belong either to C or to D (or to both!).
¬C | Complement | ObjectComplementOf(C) / not C | The set of all things that do not belong to the concept C.
∀R.C | Universal value restriction | ObjectAllValuesFrom(R C) / R only C | The individuals all of whose R-fillers belong to C. In OWL terms, the class of objects where all the values of the property R are of type C.
∃R.C | Existential quantification | ObjectSomeValuesFrom(R C) / R some C | The individuals that have at least one R-filler that belongs to C. In OWL terms, all objects that have at least one property R whose value is of type C.
{a, b, c, ...} | Enumeration | ObjectOneOf(a b c ...) / {a,b,c,...} | The concept consisting of exactly the individuals a, b, c, etc.
R:a | Individual value restriction | ObjectHasValue(R a) / R value a | The individuals having a as an R-filler. In OWL terms, the objects that have property R with value a.
≥ n R.C | Minimum cardinality | ObjectMinCardinality(n R C) / R min n C | The individuals that have at least n fillers of the role R belonging to the concept C.
≤ n R.C | Maximum cardinality | ObjectMaxCardinality(n R C) / R max n C | The individuals that have at most n fillers of the role R belonging to the concept C.
= n R.C | Exact cardinality | ObjectExactCardinality(n R C) / R exactly n C | Shorthand for ≤ n R.C and ≥ n R.C combined.

Each and every construct listed above has a precise formal, set-theoretic interpretation. For example, universal value restriction is interpreted thus:

(∀R.C)ℑ = { a ∈ Δℑ | ∀b : (a,b) ∈ Rℑ → b ∈ Cℑ }

As a little exercise, you could spell out the formal semantics of some of the other forms. This list of constructors constitutes a powerful means of expressing all sorts of concepts. Description Logic as a formalism has several variants with different computational characteristics. A particular variant is defined by the set of constructors that are allowed in it. Suffice it to say that all of them are available in OWL. So let's see a few concrete examples of what can be expressed in this language so far, staying on our automobile theme from last time:

DL Expression | Denoted Concept
Car ⊓ ¬Red | All cars that are not red.
∃hasPart.American | All objects that are at least in part made in America.
Person ⊓ ∀owns.(Hybrid ⊔ BioDiesel) | Climate-change-conscious people who don't own cars based exclusively on fossil fuels.

In the constructor table, I showed the OWL Manchester syntax right below the OWL functional syntax. To get a feel, here's what the last expression above looks like in the Manchester syntax:

Person and owns only (Hybrid or BioDiesel)
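
As a small taste of the coding to come, here is how one might build that same expression programmatically. A sketch against the Java OWLAPI, reusing m, df, ns, ont and owns from the earlier sketch; the class names are invented:

OWLClass person = df.getOWLClass(IRI.create(ns + "Person"));
OWLClass hybrid = df.getOWLClass(IRI.create(ns + "Hybrid"));
OWLClass bioDiesel = df.getOWLClass(IRI.create(ns + "BioDiesel"));

// Person ⊓ ∀owns.(Hybrid ⊔ BioDiesel)
OWLClassExpression greenPerson = df.getOWLObjectIntersectionOf(
        person,
        df.getOWLObjectAllValuesFrom(owns,
                df.getOWLObjectUnionOf(hybrid, bioDiesel)));

Note how the functional syntax maps one-to-one onto factory calls: ObjectIntersectionOf becomes getOWLObjectIntersectionOf and so on.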

If you are familiar with mathematical logic, you probably noticed the absence of variables. If this makes you uncomfortable, just think of concept expressions in DL as implicitly containing one free variable ranging over the domain of discourse. In other words, concept expressions are what you get as logical formulas in DL.

Making Statements - the TBox and the ABox

So far so good. We have seen how to make complex class descriptions in terms of simpler ones. Let's see how we can state facts (a.k.a. axioms). There are two fundamental kinds of axioms in DL: axioms expressing constraints purely within the conceptual model and axioms talking about individuals in the world being described. The former comprise the so-called TBox (Terminological box) while the latter comprise the ABox (Assertion box). OWL itself doesn't make that distinction, but reasoning algorithms use it and you will come across those terms in the literature and discussion groups. I already showed you the two main types of ABox axioms, concept and role assertions, with the following semantics:

C(a) is true in ℑ if aℑ ∈ Cℑ
R(a,b) is true in ℑ if (aℑ, bℑ) ∈ Rℑ

Another way to say the above is that the interpretation ℑ satisfies C(a) and that ℑ satisfies R(a,b). The concept and individual assignments that ℑ makes are consistent with those assertions. If an interpretation satisfies all axioms in an ABox, it is a model for that ABox. Concept and role assertions are not the only possible kinds of statements in an ABox, but they are the most important ones. Other assertions allow you to say when different names should be interpreted as the same individual and when not. More on them below.

In the TBox, the axioms establish a priori facts about concepts and roles. Two main types of axioms are used, inclusions and equalities:

C ⊑ D (inclusion or subsumption) is true in ℑ if Cℑ ⊆ Dℑ
C ≡ D (equality or definition) is true in ℑ if Cℑ = Dℑ
R ⊑ S (role subsumption) is true in ℑ if Rℑ ⊆ Sℑ

And similarly, an interpretation ℑ satisfies a TBox whenever it satisfies all axioms in it. Note that we can also define property inheritance in Description Logic, not only class inheritance. An example would be hasSon as a sub-role of hasChild - if somebody has a son, one can infer that they definitely have a child. Even though we've just used atomic names here, a full concept expression can appear on either side of an inclusion or an equality axiom. For example, we can define a Pedestrian as somebody who doesn't own a car:

Pedestrian ≡ Person ⊓ ∀owns.(¬Car)

From this, a reasoner can already trivially infer that a Pedestrian is a Person. As another example, we can say that true sports cars must have no more than two doors:

SportsCar ⊑ Car ⊓ (≤ 2 hasPart.Door)

The above axiom states that whenever something is known to be a sports car, it is definitely a car (so if somebody owns it, they can't be a pedestrian) and it can't have more than 2 doors. If you declare an individual as a SportsCar and then proceed to assign 4 different doors to it:

hasPart(MyCar, FrontLeftDoor)
hasPart(MyCar, FrontRightDoor)
hasPart(MyCar, BackLeftDoor)
hasPart(MyCar, BackRightDoor)


a reasoner would complain about an inconsistency in your knowledge base (provided the doors are declared to be distinct individuals - more on that below); in effect, it enforces the constraint.
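
Here's the whole scenario as an OWLAPI sketch, continuing the earlier setup. I'm assuming the HermiT reasoner, but any OWLReasonerFactory would do, and I put the demo in a scratch ontology so our main one stays consistent for later sketches:

// Needs: org.semanticweb.HermiT.ReasonerFactory and java.util.*
OWLClass sportsCar = df.getOWLClass(IRI.create(ns + "SportsCar"));
OWLClass door = df.getOWLClass(IRI.create(ns + "Door"));
OWLObjectProperty hasPart = df.getOWLObjectProperty(IRI.create(ns + "hasPart"));
OWLOntology demo = m.createOntology(IRI.create("http://example.org/cars-demo"));

// SportsCar ⊑ Car ⊓ (≤ 2 hasPart.Door)
m.addAxiom(demo, df.getOWLSubClassOfAxiom(sportsCar,
        df.getOWLObjectIntersectionOf(car,
                df.getOWLObjectMaxCardinality(2, hasPart, door))));

// MyCar is a SportsCar... with four doors.
OWLNamedIndividual myCar = df.getOWLNamedIndividual(IRI.create(ns + "MyCar"));
m.addAxiom(demo, df.getOWLClassAssertionAxiom(sportsCar, myCar));
Set<OWLIndividual> doors = new HashSet<>();
for (String name : new String[]{"FrontLeftDoor", "FrontRightDoor",
                                "BackLeftDoor", "BackRightDoor"}) {
    OWLNamedIndividual d = df.getOWLNamedIndividual(IRI.create(ns + name));
    m.addAxiom(demo, df.getOWLClassAssertionAxiom(door, d));
    m.addAxiom(demo, df.getOWLObjectPropertyAssertionAxiom(hasPart, myCar, d));
    doors.add(d);
}
// Without this, the reasoner would merge doors rather than complain (no UNA):
m.addAxiom(demo, df.getOWLDifferentIndividualsAxiom(doors));

OWLReasoner check = new ReasonerFactory().createReasoner(demo);
System.out.println(check.isConsistent()); // prints false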

Even though operators like intersection and union could be defined for roles, this is not done in OWL. There are other ways to specify role constraints at the conceptual level though. Besides role subsumption, one can constrain the source and target of roles, or domain and range of a property in OWL terms. Talking about "domains" and "ranges" is more familiar than "roles" and "fillers", and consistent with the view of OWL properties as binary relations. OWL provides the special axioms ObjectPropertyDomain and ObjectPropertyRange. However, one should keep in mind that such constraints can be specified using the existing DL tools and are in fact interpreted in exactly that way by OWL reasoners:

≥ 1 R.⊤ ⊑ C (domain of R is C)
⊤ ⊑ ∀R.C (range of R is C)

In English, the first axiom above says that anything with at least one role R filled by whatever also belongs to the concept C. In other words, whenever you have R(x, ?), you can infer that x ∈ C. Similarly, the second axiom says that any individual can only be a filler of role R if it belongs to the concept C. Therefore, a statement of the form R(?, x) would allow a DL inference engine to conclude that x belongs to C. Notice the pattern here that allows you to introduce a constraint that applies to everything: saying that the universe (⊤) is subsumed by a concept C is the same as saying that all individuals belong to C.
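
In OWLAPI terms, the dedicated domain/range axioms and their DL encodings are interchangeable. A sketch reusing the earlier declarations, with an invented Vehicle class:

OWLClass vehicle = df.getOWLClass(IRI.create(ns + "Vehicle"));

// The dedicated OWL axioms:
m.addAxiom(ont, df.getOWLObjectPropertyDomainAxiom(owns, person));
m.addAxiom(ont, df.getOWLObjectPropertyRangeAxiom(owns, vehicle));

// The equivalent DL encodings from above:
// ≥ 1 owns.⊤ ⊑ Person
m.addAxiom(ont, df.getOWLSubClassOfAxiom(
        df.getOWLObjectMinCardinality(1, owns, df.getOWLThing()), person));
// ⊤ ⊑ ∀owns.Vehicle
m.addAxiom(ont, df.getOWLSubClassOfAxiom(
        df.getOWLThing(), df.getOWLObjectAllValuesFrom(owns, vehicle)));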

Moreover, just like binary relations in classical set theory, roles in Description Logic can be classified semantically as symmetric, asymmetric etc. One can directly make such declarations about OWL properties as TBox axioms and enrich the conceptual model this way. And this is again where the beauty of DL and OWL shines. The logical apparatus that you learn in basic discrete math, something that you might otherwise use only for documentation purposes, is available in a simple declarative software modeling language. Here are the options and a refresher on what they mean:

Characteristic | Syntax | Meaning
Functional | FunctionalObjectProperty(isMadeBy) | Only one filler is permitted for the role: an object can have at most one value for the property.
Inverse | InverseObjectProperties(hasMade isMadeBy) | Says that hasMade ≡ isMadeBy⁻. Domain and range are swapped: isMadeBy⁻ = { (a,b) | (b,a) ∈ hasMade }.
Symmetric | SymmetricObjectProperty(isMarriedTo) | The relation goes both ways: (a,b) ∈ R ⇔ (b,a) ∈ R.
Asymmetric | AsymmetricObjectProperty(isOwnedBy) | The relation cannot go both ways: (a,b) ∈ R ⇒ (b,a) ∉ R.
Reflexive | ReflexiveObjectProperty(feeds) | Every individual is its own filler. For example, everybody feeds themselves.
Irreflexive | IrreflexiveObjectProperty(isMarriedTo) | No individual can be its own filler.
Transitive | TransitiveObjectProperty(isPartOf) | Whenever (a,b) ∈ R and (b,c) ∈ R, we also have (a,c) ∈ R.

Some of those role semantics can be expressed using the machinery we already have. For example, the fact that a role R is functional can be expressed as ⊤ ⊑ ≤ 1 R.⊤. But transitivity can't. Another such "irreducible" construction available in OWL is ObjectPropertyChain, which allows you to express role composition. You can say that a chain of roles that indirectly connects two individuals establishes a relationship between them. A common example of this is the uncle relationship, which would be defined in OWL like this:

SubObjectPropertyOf(ObjectPropertyChain(hasFather hasBrother) hasUncle)
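
The characteristics and the chain are all ordinary TBox axioms in the OWLAPI. A sketch with invented family roles (needs java.util.Arrays):

OWLObjectProperty hasFather = df.getOWLObjectProperty(IRI.create(ns + "hasFather"));
OWLObjectProperty hasBrother = df.getOWLObjectProperty(IRI.create(ns + "hasBrother"));
OWLObjectProperty hasUncle = df.getOWLObjectProperty(IRI.create(ns + "hasUncle"));
OWLObjectProperty isPartOf = df.getOWLObjectProperty(IRI.create(ns + "isPartOf"));

// TransitiveObjectProperty(isPartOf)
m.addAxiom(ont, df.getOWLTransitiveObjectPropertyAxiom(isPartOf));
// SubObjectPropertyOf(ObjectPropertyChain(hasFather hasBrother) hasUncle)
m.addAxiom(ont, df.getOWLSubPropertyChainOfAxiom(
        Arrays.asList(hasFather, hasBrother), hasUncle));

With those axioms in place, asserting hasFather(Bob, Jim) and hasBrother(Jim, Bill) lets a reasoner infer hasUncle(Bob, Bill).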

To sum up, it is common for DL systems to separate the knowledge into a purely conceptual model, sort of like a schema definition, the TBox, and actual data which associates individuals to concepts and assigns them roles, the ABox. Reasoning tools tend to use different algorithms and optimization techniques depending on whether they are dealing exclusively with a TBox or an ABox or a mix of both. The gist of the formalism is the ability to describe complex concepts in terms of simpler ones and to describe individuals in terms of how they are classified and how they relate to other individuals. It is a logic language with no variables all right, but not a propositional one. I've advertised the ability to make non-trivial inferences about the accumulated knowledge, so let's take a look at those now.

Reasoning with Concepts and Roles

There are a few core reasoning problems about the conceptual portion (TBox) of a DL knowledge base stemming from natural questions that one might ask. For example, is a given concept satisfiable (remember, when we say concept here, we may mean a possibly complex logical formula, a full description) in the sense that it is possible to find a model where that concept describes a non-empty set. Another question is whether one concept subsumes another (in OWL terms if a class is a sub-class of another), which can be reduced to satisfiability because:

C ⊑ D if and only if C ⊓ ¬D is not satisfiable.

Another question is if two concepts (or concept formulas) describe the same set of individuals. And again, this can be reduced to satisfiability by first observing that two concepts are equivalent if and only if both C ⊑ D and D ⊑ C are true. Finally, note that two concepts C and D are disjoint if C ⊓ D is unsatisfiable. If you think about it a bit, you'll find that all of those reasoning tasks are reducible to one another. For example, suppose you have an algorithm for subsumption. You can then determine if C is unsatisfiable by checking if it is subsumed by ⊥. The satisfiability question is more interesting to tool builders because that's how tableau algorithms tend to operate - they try to obtain a contradiction while building a model for a formula. And it is also sometimes a necessary condition for a whole ontology to be consistent. But to people, the subsumption question is often the more interesting one because it can be viewed as implication. If you think of concept expressions as logical formulas with one free variable, then concept subsumption is logical implication in the sense that whenever the sub-concept formula is true of an individual, the super-concept formula is true as well.

Another reason the subsumption question is interesting in practice is the ability of inference engines to list all named concepts subsumed by a given concept description, thus automatically constructing a conceptual hierarchy out of TBox constraints.
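
These reductions translate directly into reasoner calls. A sketch, again assuming HermiT and two arbitrary class expressions c and d built with the factory as before (NodeSet lives in org.semanticweb.owlapi.reasoner):

OWLReasoner reasoner = new ReasonerFactory().createReasoner(ont);

// Is c satisfiable, i.e. can it possibly have members?
boolean sat = reasoner.isSatisfiable(c);

// Is c ⊑ d? Per the reduction above: check that c ⊓ ¬d is unsatisfiable.
boolean subsumed = !reasoner.isSatisfiable(
        df.getOWLObjectIntersectionOf(c, df.getOWLObjectComplementOf(d)));

// Reasoners also answer subsumption directly and can hand you the whole
// inferred hierarchy, e.g. all named descendants of d:
NodeSet<OWLClass> below = reasoner.getSubClasses(d, false);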

Reasoning with Individuals

Unlike TBox reasoning where we are dealing purely with conceptual constraints, in the ABox we are stating facts about objects. Something may be wrong with a TBox if a concept is unsatisfiable, which simply means that no individual can belong to it, i.e. that it's equivalent to the bottom concept ⊥. And there's nothing special about that; there aren't any other consequences. Usually, when a contradiction is found in a logic language, it is devastating because it allows one to prove anything as a consequence. An unsatisfiable concept is a sort of a contradiction, but it doesn't make an ontology useless because it doesn't prevent one from creating a model of the set of axioms. That is, it doesn't prevent one from coming up with a sensible interpretation. It's just that the sets corresponding to the unsatisfiable concepts will be empty.

When a TBox has an unsatisfiable concept, it is simply called incoherent. That's an undesirable property, and a knowledge engineer should strive to maintain coherent ontologies. In fact, studying the consequences of incoherence is a topic on its own.

A contradiction involving the ABox however is a different story. It means that it's impossible to find a model for the axioms, i.e. there is no way to interpret them! And this is what defines an inconsistency in DL: an ontology (TBox+ABox) is called inconsistent if there's no model for it. Interestingly, concept satisfiability and consistency have been proven to be equivalent problems! An example of an inconsistency would be asserting that an individual belongs to two disjoint concepts:

Pedestrian(Tom)
Car(H1)
owns(Tom, H1)


Here we've stated that Tom is a Pedestrian and that he owns the individual H1, which we've asserted to be a Car. According to our definition of a Pedestrian above, Tom can only own things that are not cars. Therefore, a reasoner would infer that H1 ∈ ¬Car and that's an inconsistency. Note that the inference engine can't tell you exactly what you did wrong. If this were a real world example, perhaps the statement Pedestrian(Tom) is at fault. But a reasoner won't complain about it because there's no problem with declaring Tom a pedestrian per se.

While consistency is a natural question to ask, a more practical question is instance checking, which asks if a given individual belongs to a given concept. Since a concept can be a complex logical formula, this essentially allows you to ask a logical question about an object. And to check whether C(a) is true, one proves that adding its negation, ¬C(a), to the ontology as an axiom leads to an inconsistency. For example, to find out if a given car model is an all-American green energy car, we could ask if it is an instance of the concept:

∀hasPart.American ⊓ (Hybrid ⊔ BioDiesel)

Even more fun is the ability to ask for all individuals that belong to a concept. This is known as the retrieval problem. It is akin to querying a database by specifying the desired data's characteristics via a logical expression. Conversely, given an individual one may query for all the named concepts the individual belongs to. That is, we can ask for all the types that individual has been classified under.
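
Both instance checking and retrieval are one-liners over a reasoner. A sketch, reusing the reasoner and names from the earlier sketches (American and hasPart declared here for completeness):

OWLClass american = df.getOWLClass(IRI.create(ns + "American"));
OWLObjectProperty hasPart = df.getOWLObjectProperty(IRI.create(ns + "hasPart"));

// ∀hasPart.American ⊓ (Hybrid ⊔ BioDiesel)
OWLClassExpression greenAllAmerican = df.getOWLObjectIntersectionOf(
        df.getOWLObjectAllValuesFrom(hasPart, american),
        df.getOWLObjectUnionOf(hybrid, bioDiesel));

// Instance checking: does a particular individual belong to the concept?
boolean isGreen = reasoner.isEntailed(
        df.getOWLClassAssertionAxiom(greenAllAmerican, h1));

// Retrieval: all named individuals classified under the concept.
Set<OWLNamedIndividual> hits =
        reasoner.getInstances(greenAllAmerican, false).getFlattened();

// The converse: every named concept the individual belongs to.
NodeSet<OWLClass> types = reasoner.getTypes(h1, false);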

What About Data Values

Last time you learned that there are two kinds of OWL properties that an individual can have: object properties and data properties. In Description Logic, data properties are introduced by extending the formalism with concrete domains and further allowing standard logical predicates (i.e. n-ary boolean functions) over those domains. An example of a concrete domain is the set of natural numbers with binary predicates for the comparison operators <, ≤, ≥ etc. To make the formalism work, certain restrictions are imposed on the available set of predicates for a given domain. OWL supports only a limited form of this: a single data property can be restricted with datatype facets (e.g. an integer range), but predicates relating the values of different properties cannot be used in class definitions. Thus, it is not possible to define, say, the class of products whose sale price is lower than their regular price. So data values in OWL are used more or less like individuals, except they can only appear as role fillers. When we cover rules later, we will see how to get around this OWL limitation by using SWRL (the Semantic Web Rule Language).
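
To make that concrete, here's a sketch reusing the earlier OWLAPI setup (hasAge is an invented name, and OWLFacet lives in org.semanticweb.owlapi.vocab): a data property assertion, plus the kind of single-property facet restriction OWL 2 does allow.

OWLDataProperty hasAge = df.getOWLDataProperty(IRI.create(ns + "hasAge"));
m.addAxiom(ont, df.getOWLDataPropertyAssertionAxiom(hasAge, tom, df.getOWLLiteral(25)));

// hasAge some xsd:integer[>= 21] - a facet restriction on one property.
OWLDataRange atLeast21 = df.getOWLDatatypeRestriction(
        df.getIntegerOWLDatatype(),
        df.getOWLFacetRestriction(OWLFacet.MIN_INCLUSIVE, df.getOWLLiteral(21)));
OWLClassExpression ofDrinkingAge = df.getOWLDataSomeValuesFrom(hasAge, atLeast21);

What you cannot do is compare the values of two different properties inside a class definition; that's where SWRL will come in.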

Reasoning in an Open-Ended World

The economist John Maynard Keynes is famously quoted as saying "When the facts change, I change my mind. What do you do, sir?" So it goes for much of (sound) human reasoning. We are quick to draw conclusions, taking shortcuts, making assumptions, and when faced with new information we rapidly retract our deductions and change our reasoning. The alternative would be to very rarely commit ourselves to a conclusion, say "I'm not sure" most of the time, and only infer things that are certain so we don't have the embarrassment of being wrong. What's the right attitude? That's the debate between monotonic and non-monotonic reasoning.

In monotonic reasoning, when new axioms are added to the knowledge base, all existing inferences remain unchanged. In other words, knowledge can only grow, deductions are never retracted. If one never makes assumptions that are not explicitly stated, the reasoning will be monotonic. In non-monotonic reasoning, it is possible for new information to cause retraction of previously drawn conclusions. This happens if extra assumptions are made during inference.

Now, one can argue that non-monotonic reasoning is more practical; that's how humans do it, after all. Or one can argue that you don't want software to deliberately make mistakes by making the wrong assumptions. In software one cares about things like reusability, long-term maintenance, safety and context-independence. When you draw conclusions from a set of facts, you don't want to have to do bookkeeping about the context in which they were arrived at (did we know A at the time or not?). So the pioneers of the semantic web had the debate and went for monotonicity. Since the global semantic web is an open-ended knowledge source, constantly growing and being refined, monotonicity is the way to go. Good. However, that leads to what's arguably the most counter-intuitive aspect of working with OWL DL, especially if you're coming from a software background.

We've seen above the various logical statements that can be made in OWL 2.0. Now, it seems natural to assume that if you don't know whether a statement S is true, then you don't know; you can't simply decide that it's false, right? Well, that's what open-world semantics say as well. And that's what DL systems generally do: they make the open-world assumption (OWA): lack of knowledge of S does not automatically mean ¬S. Assuming otherwise is known as the closed-world assumption (CWA), which enables a reasoning style called negation-as-failure, where failing to deduce something entails its negation. Using negation-as-failure obviously leads to non-monotonic reasoning because new facts will invalidate the "failure" part.

Explained like this, the OWA doesn't seem like such a big pill to swallow; it feels fairly natural. But there are surprises and sometimes frustrations when you have spent all your life in a technical environment that operates under the CWA, namely conventional database systems, SQL in particular, as well as more traditional logical languages like Prolog. Adopting the OWA is nothing short of a paradigm shift for the practicing programmer. Let me give you at least one example why, and I promise we'll see more later on.

Say you have the following in the knowledge base:

AllAmerican ≡ ∀hasPart.American
American(Engine123)
hasPart(F100, Engine123)


Is AllAmerican(F100) true? Since the only part that F100 has is American, and since something is AllAmerican whenever all its parts are made in America, we'd expect AllAmerican(F100) to be true. But this inference can't be made because, under the open-world assumption, nothing prevents a new piece of information from asserting hasPart(F100, MichelinTires) some time in the future, i.e. that French-made tires are used on the car. In other words, information about the entity in question is incomplete. This is unlike the classic database world, where you'd do a query to list all parts of the entity, join that with the "MadeIn" table listing where parts were made, and you get the answer. The constraint that we've stated in the first axiom above will help detect an inconsistency if you assert both:

AllAmerican(F100)
hasPart(F100, MichelinTires)

But the concept definition itself doesn't provide a sure way to retrieve all its instances. On the other hand, if you define:

American ≡ ∃hasPart.American ⊔ madeIn:America
madeIn(Engine123, America)
hasPart(F100, Engine123)


this is a much more constructive definition. It only defines something that is at least partly American. Note how DL is capable of dealing with cyclic definitions. Now, if you ask for everything American, you will get both Engine123 and F100 in the result set. In general, problems with the OWA arise when a query or an expected inference relies on the knowledge base somehow having exhaustive information about how entities relate to all other possible entities.
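
Here is the open-world experiment as an OWLAPI sketch, continuing the earlier setup and reusing the reasoner. The second half closes the world for F100 with an exact cardinality, after which the inference goes through:

OWLClass allAmerican = df.getOWLClass(IRI.create(ns + "AllAmerican"));
OWLNamedIndividual f100 = df.getOWLNamedIndividual(IRI.create(ns + "F100"));
OWLNamedIndividual engine = df.getOWLNamedIndividual(IRI.create(ns + "Engine123"));

// AllAmerican ≡ ∀hasPart.American
m.addAxiom(ont, df.getOWLEquivalentClassesAxiom(allAmerican,
        df.getOWLObjectAllValuesFrom(hasPart, american)));
m.addAxiom(ont, df.getOWLClassAssertionAxiom(american, engine));
m.addAxiom(ont, df.getOWLObjectPropertyAssertionAxiom(hasPart, f100, engine));

OWLAxiom question = df.getOWLClassAssertionAxiom(allAmerican, f100);
reasoner.flush(); // sync the (buffering) reasoner with the new axioms
System.out.println(reasoner.isEntailed(question)); // false: other parts may exist

// Close the world for F100: it has exactly one part, period.
m.addAxiom(ont, df.getOWLClassAssertionAxiom(
        df.getOWLObjectExactCardinality(1, hasPart, df.getOWLThing()), f100));
reasoner.flush();
System.out.println(reasoner.isEntailed(question)); // true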

There are a few tricks that one can use to "close" the world by explicitly adding information or additional constraints to an ontology with the purpose of forcing certain inferences:

  • Listing all individuals explicitly with the enumeration constructor {...}. That's a way to tell the reasoner that it has complete information about the members of a class.
  • Imposing precise cardinality constraints. For example, one could state that a product has no more than, say, 3 parts. Then if all those 3 parts are explicitly listed, a reasoner knows no extra parts are possible and can decide if the product is AllAmerican or not.
  • Stating explicitly that something doesn't have a certain kind of property. For example, one could state that (¬∃hasPart.¬American)(F100).
  • Stating explicitly that something doesn't have a certain property. OWL allows negative property assertions with the NegativeObjectPropertyAssertion or the NegativeDataPropertyAssertion axioms. Note that those are syntactic sugar; you can say the same thing using concept complement: {a} ⊑ ¬∃P.{b}


All those are valid means to "close the world" and they are used in practice. But one must also keep in mind that the reasoning algorithm is separate from the modeling language. Nothing prevents you from applying non-monotonic reasoning with negation-as-failure to DL models in limited and controlled contexts.
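
A few of these tricks in OWLAPI form, continuing the previous sketch (MichelinTires and F100Part are invented names):

OWLNamedIndividual michelinTires = df.getOWLNamedIndividual(IRI.create(ns + "MichelinTires"));

// Enumerate a class so the reasoner knows its membership is complete:
m.addAxiom(ont, df.getOWLEquivalentClassesAxiom(
        df.getOWLClass(IRI.create(ns + "F100Part")),
        df.getOWLObjectOneOf(engine)));

// (¬∃hasPart.¬American)(F100) - F100 has no non-American part:
m.addAxiom(ont, df.getOWLClassAssertionAxiom(
        df.getOWLObjectComplementOf(
                df.getOWLObjectSomeValuesFrom(hasPart,
                        df.getOWLObjectComplementOf(american))),
        f100));

// Rule out one specific filler with a negative property assertion:
m.addAxiom(ont, df.getOWLNegativeObjectPropertyAssertionAxiom(
        hasPart, f100, michelinTires));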

UNA - The Unique Name Assumption

To complete our short account of the mathematical foundation of OWL, we will have to take a look at another consequential open-world aspect of DL reasoners - the unique name assumption (UNA), which OWL does not make. The UNA states that distinct names necessarily refer to distinct entities. Recall that names in OWL are URIs, that is, identifiers unique within the global namespace of all names in the semantic web. So we are saying here that several different identifiers, unique as they are, may actually identify the same thing. This is again something that we're not used to in classic logic systems or databases, where distinct identifiers refer to distinct entities.

Unlike the OWA, the UNA actually makes good sense in the context of Description Logic and it is a natural expectation that a knowledge engineer may rely on. However, OWL does not make that assumption and this is in part due to the global nature of naming in OWL. Everybody can come up with a vocabulary and it would be nice to be able to state post factum when we are talking about the same thing even when we were using a different name for it. In fact there's an axiom for that, called an agreement axiom:

x ≡ y

The agreement says that x and y refer to the same entity, so that all facts about x are also true about y and vice-versa. Before you conclude that this is akin to variable assignment and that the URIs of OWL individuals are like variables in a programming language, note that this statement is symmetric, it goes in both directions! Agreement is also something that can be automatically inferred by a reasoner, as well as a question that a user may ask of a knowledge base. To repeat: a reasoner is free to conclude that two different names, two different URIs, are actually referring to the same real world entity. This sort of inference, dabbling as it does with the sacred notion of identity, can lead to some rather unexpected results. Consider the sensible constraint that every car has exactly one owner:

Car ⊑ = 1 isOwnedBy.Person

Now, suppose also that at a certain point in time it is known, asserted in the knowledge base, that isOwnedBy(H1, Tom). Later, the car identified with H1 is acquired by Betty, so you assert isOwnedBy(H1, Betty) as well, yet you forget to remove the assertion about Tom's ownership. Or maybe Tom and Betty got married and she became a co-owner of the car. That should violate our sole ownership constraint, right? Wrong! For the reasoner, Tom and Betty are just names referring to an entity and, because names are not assumed to be unique, it happily concludes that Tom ≡ Betty. Married or not, they may very well object.
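
The merge is easy to reproduce with the OWLAPI; a sketch continuing our running example (isOwnedBy, Betty and the Person assertions are invented for the purpose):

OWLObjectProperty isOwnedBy = df.getOWLObjectProperty(IRI.create(ns + "isOwnedBy"));
OWLNamedIndividual betty = df.getOWLNamedIndividual(IRI.create(ns + "Betty"));

// Car ⊑ = 1 isOwnedBy.Person
m.addAxiom(ont, df.getOWLSubClassOfAxiom(car,
        df.getOWLObjectExactCardinality(1, isOwnedBy, person)));
m.addAxiom(ont, df.getOWLClassAssertionAxiom(person, tom));
m.addAxiom(ont, df.getOWLClassAssertionAxiom(person, betty));
m.addAxiom(ont, df.getOWLObjectPropertyAssertionAxiom(isOwnedBy, h1, tom));
m.addAxiom(ont, df.getOWLObjectPropertyAssertionAxiom(isOwnedBy, h1, betty));

reasoner.flush();
// No inconsistency; instead, the two names get merged:
System.out.println(reasoner.getSameIndividuals(tom)); // includes Betty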

Fortunately, getting around that behavior is much easier than with the OWA. There are a few direct ways to dissociate individuals and here are some OWL axioms available (standard accounts for DL don't have shorthand notations for these, but there are ways to encode the knowledge):

  1. owl:DifferentIndividuals(I1, I2, ..., In) states that the listed individuals are all distinct.
  2. owl:DisjointClasses(C1, C2, ..., Cn) states that each pair of classes Ci, Cj, i≠j, is disjoint, i.e. Ci ⊓ Cj ≡ ⊥.
  3. owl:DisjointUnionOf(C, C1, C2, ..., Cn) states that the class C is the union of C1, C2, ..., Cn and furthermore that each pair Ci, Cj, i≠j, is disjoint. In other words, C being the disjoint union of C1, C2, ..., Cn is the same thing as saying that C1, C2, ..., Cn forms a partition of C.


Being an exhaustive list of the sub-classes, the DisjointUnionOf axiom is a bit like an enumeration, but for classes rather than for the individuals in a class. So in case you were wondering: when a concept is defined through an enumeration of its individuals, say C ≡ {a,b,c}, that doesn't imply that the individuals in question, a, b and c, are different. If they are, you'd have to state it separately with owl:DifferentIndividuals(a,b,c).

So in the Tom and Betty situation described above, we have several means to avoid the undesirable inference. Of course we could just declare that they are different. But we can also have Tom belong to the concept Male declared as disjoint from the concept Female.

There may be other indirect ways to refine a model once you discover such an unwanted inference. Inference engines are often capable of giving an explanation of a certain inference so you can figure out the logical steps that led to it and break the chain somewhere else. In our example, we may have a relationship isMarriedTo(Tom,Betty) which may be declared in the TBox to be symmetric, which would imply isMarriedTo(Betty, Tom). To fix the problem, we could declare the isMarriedTo property as irreflexive which would imply Tom != Betty.
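
In OWLAPI terms, each of these fixes is a one-liner; after any of them, the knowledge base from the previous sketch becomes inconsistent rather than merging Tom and Betty (Male, Female and isMarriedTo are invented names):

OWLClass male = df.getOWLClass(IRI.create(ns + "Male"));
OWLClass female = df.getOWLClass(IRI.create(ns + "Female"));
OWLObjectProperty isMarriedTo = df.getOWLObjectProperty(IRI.create(ns + "isMarriedTo"));

// Fix 1: simply declare them distinct.
m.addAxiom(ont, df.getOWLDifferentIndividualsAxiom(tom, betty));

// Fix 2: disjoint classes separate them indirectly.
m.addAxiom(ont, df.getOWLDisjointClassesAxiom(male, female));
m.addAxiom(ont, df.getOWLClassAssertionAxiom(male, tom));
m.addAxiom(ont, df.getOWLClassAssertionAxiom(female, betty));

// Fix 3: a symmetric but irreflexive isMarriedTo implies Tom ≠ Betty,
// since Tom ≡ Betty would make Tom married to himself.
m.addAxiom(ont, df.getOWLSymmetricObjectPropertyAxiom(isMarriedTo));
m.addAxiom(ont, df.getOWLIrreflexiveObjectPropertyAxiom(isMarriedTo));
m.addAxiom(ont, df.getOWLObjectPropertyAssertionAxiom(isMarriedTo, tom, betty));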

Finally, note that some tools may allow you to set a global parameter to force the UNA, which has the same effect as declaring all individuals to be different. So check your documentation.

Conclusion

Ok, if you've read and understood the above, then you know almost all of OWL already. I deliberately stuck with the Description Logic terminology and syntax because I believe it forces one to stay in "math land" and think about OWL from a mathematical logic viewpoint, rather than through the prism of the OO programmer, with all the baggage that this entails.

OWL DL is a formalism that merges object-oriented modeling ideas into a mathematical logic that allows you to encode highly-structured knowledge and to make non-trivial inferences from it. One of the skills that you'd need to develop while working with OWL is the ability to formulate questions in terms of logical concept descriptions. And get comfortable with the idea that class constraints are expressed as logical formulas and that sub-classing is logical implication. Don't forget that knowledge is open-ended. And by the way, reasoning in OWL DL is sound, complete and decidable: it makes only true inferences, it makes all the inferences that follow, and you can't make it loop forever.

Coming Up

In the next and following installments, we'll dive into actual modeling and coding. As a piece of homework, I'd suggest you go through the very detailed Protege tutorial. We will be building an application based on an OWL model and using the standard OWLAPI.
