Domain Modeling with OWL - Part 2
Why Description Logic Matters?
In this second installment of the OWL introductory series, we will be doing a little bit of math. If you ever plan to create a moderately complex OWL 2 model, understanding the mathematical foundations will give you the right intuitions and spare you unpleasant surprises. The math is especially relevant to the reasoning portion of the OWL toolset and tightly integrated reasoning services are one of the main distinguishing features of OWL as opposed to many other knowledge representation languages.
Last time, I gave you a bit of historical context on OWL and mentioned that it is a fragment of first-order logic. This is, strictly speaking, correct, but a bit misleading as far as the original motivation of the formalism goes. OWL 2.0 is based on Description Logic (henceforth DL), which was developed in order to put semantic networks (or "conceptual graphs") and frames on a firm foundation. Several variants of these conceptual networks were proposed in the 70s, but the ideas at the core of DL first appeared as something called structured inheritance networks. Specifically, those ideas are the use of classes, instances and properties combined with means to create complex logical expressions and, most importantly, the ability to make inferences about subclassing and class membership. This last aspect, the fact that nontrivial IS-A relationships between two classes or between a class and an instance can be inferred, is the crucial contribution of that work and the reason it eventually got a comprehensive mathematical treatment, gaining the name Description Logic.
When you create a domain model in OWL, you can state various constraints between classes and properties and those constraints have logical consequences. You are actually able to create a rather rich logical model of your domain and then ask interesting questions about it. Part of the knowledge that you would have represented is explicit. That's simply the collection of statements that you make to describe the domain. You can retrieve that knowledge by using the standard semantic web query language SPARQL or by direct API calls. No reasoning involved here, just plain data querying. But another portion of your knowledge base is the implicit knowledge that can be derived as a consequence of your statements. Accessing that implicit knowledge requires a precise logical interpretation, reasoning services and an appropriate expression language to formulate queries. That is what DL is about.
In what follows, I will introduce the actual mathematical formalism and show you side by side how it maps to OWL 2.0. Then we will discuss a few uncommon assumptions that OWL reasoners make and what their consequences are. One of the goals of this introduction is to give you a head start should you want to read more about the subject of Description Logic.
For more information and an extensive literature review, please consult The Description Logic Handbook.
Let's Get Formal
Formal languages, mathematical logic languages just like programming languages, are specified in two parts: syntax and semantics. Logic languages in particular are traditionally specified through what is known as Tarski-style semantics, where the meaning of an expression in the language is ascribed via a correspondence with set-theoretical constructs. This allows the whole apparatus of model theory to be applied, and classical proof techniques can be used to show whether a language is decidable and, if so, to what complexity class it belongs.
Here is a summary of the DL language elements:
- Atomic concepts, usually written in capital letters from the beginning of the alphabet: A, B, C, D etc. Those correspond to OWL classes.
- The special concepts top and bottom, denoted by ⊤ and ⊥ respectively. In OWL, those are referred to as owl:Thing and owl:Nothing.
- Roles, also in capitals but from another part of the alphabet: R, S etc. DL roles are equivalent to OWL properties.
- Individuals, usually written in lower case: a, b, c etc. In OWL we call them individuals as well.
- Logical operators like ⊓ (intersection), ⊔ (union), ∀ (every), ∃ (there exists), ¬ (complement) as well as number comparison ≤, ≥ etc. In OWL, all those operators are represented differently depending on the OWL syntax used: XML, functional or the very concise Manchester syntax.
The difference in terminology between OWL and DL shouldn't lead to confusion. Knowing the original DL terms and history should help understand the meaning adopted by OWL of what are otherwise standard OO notions. For example, knowing that OWL properties come from DL roles hints at the idea of a connection between two entities rather than having one "belong" to the other. I will be using both DL and OWL terms freely, and they should be considered synonymous. The single letter naming conventions are used only when doing the math. In modeling, both DL and OWL use more descriptive names, usually camel case, starting with capital letter for concepts and individuals, and starting with lower case for roles.
So the core elements in the language are concepts, roles and individuals. The interpretation of those elements is based on a universal set of things that are being talked about, a domain, and a mapping that assigns individual names to elements of that domain, concepts to subsets of the domain and roles to relations over the domain. Thinking about concepts as sets is not far from seeing them as classes, both in the mathematical and in the programming sense.
More formally, the meaning of the language elements is defined by an interpretation function ℑ that assigns a set to each named DL concept. We write x^{ℑ} instead of ℑ(x) for the element that the interpretation maps x to. The domain of interpretation, or application domain in software engineering terms, is denoted by Δ^{ℑ}.
One may use just Δ for the namespace of individuals, the domain of
discourse. This is an important distinction in mathematical logic as one
is moving from formulas to what they are denoting and back. As an
example, top and bottom are formally interpreted as the whole domain and
the empty set respectively:
⊤^{ℑ} = Δ^{ℑ}
⊥^{ℑ} = ∅
When proving theorems about a logic language, one frequently reasons about different possible interpretations and each interpretation is called a model, not to be confused with our domain models in software. A model may offer a different mapping of individuals, concepts and roles, but it may not change the meaning of the logical operators. For example, the intersection operation is analogous to what would be logical conjunction ∧ and is always interpreted as set intersection:
(C ⊓ D)^{ℑ} = C^{ℑ} ∩ D^{ℑ}
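To make the set-theoretic reading concrete, here is a minimal Python sketch of an interpretation over a tiny finite domain. The element and concept names (car1, Car, Red and so on) are made up for illustration, and plain Python sets stand in for the mathematical ones.

```python
# A toy interpretation: a finite domain plus a set for each concept name.
# All names here are hypothetical examples, not part of any OWL API.
DOMAIN = frozenset({"car1", "car2", "tom"})

CONCEPTS = {
    "Car": frozenset({"car1", "car2"}),
    "Red": frozenset({"car1"}),
}

TOP = DOMAIN          # top is interpreted as the whole domain
BOTTOM = frozenset()  # bottom is interpreted as the empty set


def intersection(c, d):
    """The DL intersection of two concepts is plain set intersection."""
    return c & d


red_cars = intersection(CONCEPTS["Car"], CONCEPTS["Red"])
print(sorted(red_cars))  # ['car1']
```

Whatever sets a particular interpretation assigns to Car and Red, the operator itself always means set intersection.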
Intuitively, half of what DL allows you to do is express concept descriptions, i.e. descriptions of sets of individuals. The atomic concepts and the roles are elementary descriptions and then you have operators in the syntax to make up more complex ones. The other half is description of individuals in terms of how they are classified and how they are related to other individuals.
In logical terms, concepts can be seen as one-place predicates while roles as two-place predicates. In fact, the claim that DL is a fragment of first-order logic (FOL) starts with that correspondence. Then it is easy to see how the formulas and statements below can be translated into FOL. So one can state in DL that a given individual belongs to a concept:
C(a) - a belongs to C, akin to OWL's ClassAssertion axiom.
or that two individuals are related by a role:
R(a,b) - b is a filler of the role R (or an R-filler) for a. This is akin to OWL's ObjectPropertyAssertion that we saw last time.
Note the phrasing here: b is "a" filler, not "the" filler, as there may be many. But the more interesting part is the complex concept descriptions that one is allowed to form.
Describing Concepts
The power of DL as a language lies in its ability to describe classes of entities via complex logical formulas. That is what makes it into a useful logic language. In the table below you can see the list of available constructors for building those complex descriptions. The third column shows the equivalent OWL syntax in the standard OWL Functional Syntax and the Manchester Syntax. I won't be covering the bloated and ugly XML syntax. IMHO, pushing XML as the default serialization mechanism for RDF/OWL is probably an important reason for the slowish adoption of the technology. The functional syntax is both complete and user-friendly. The Manchester syntax is incomplete but even better looking than DL's own and used in Protege whenever class expressions are needed. So I'm showing both of those.
| DL Syntax | Name | OWL Syntax (functional / Manchester) | Meaning |
|---|---|---|---|
| ⊥ | Bottom | owl:Nothing | The empty set. |
| ⊤ | Top | owl:Thing | The entire domain of interest. |
| C ⊓ D | Intersection | ObjectIntersectionOf(C D) / C and D | The set of individuals that belong to both C and D. |
| C ⊔ D | Union | ObjectUnionOf(C D) / C or D | Describes the individuals that belong either to C or to D (or to both!). |
| ¬C | Complement | ObjectComplementOf(C) / not C | Describes the set of all things that do not belong to the concept C. |
| ∀R.C | Universal value restriction | ObjectAllValuesFrom(R C) / R only C | Describes the individuals all of whose R-fillers belong to C. In OWL terms, this is the class of objects where all the values of the property R are of type C. |
| ∃R.C | Existential quantification | ObjectSomeValuesFrom(R C) / R some C | Describes the individuals that have at least one R-filler that belongs to C. In OWL terms, all objects that have at least 1 property R whose value is of type C. |
| {a, b, c, ...} | Enumeration | ObjectOneOf(a b c ...) / {a, b, c, ...} | Describes the concept consisting of exactly the individuals a, b, c, etc. |
| R:a | Individual value restriction | ObjectHasValue(R a) / R value a | Describes the individuals having a as an R-filler. In OWL terms, the objects that have property R with value a. |
| ≥ n R.C | Minimum cardinality | ObjectMinCardinality(n R C) / R min n C | Describes the individuals that have at least n fillers of the role R belonging to the concept C. |
| ≤ n R.C | Maximum cardinality | ObjectMaxCardinality(n R C) / R max n C | Describes the individuals that have at most n fillers of the role R belonging to the concept C. |
| = n R.C | Exact cardinality | ObjectExactCardinality(n R C) / R exactly n C | This is a shorthand for ≤ n R.C and ≥ n R.C combined. |
Each and every construct listed above has a precise formal, set-theoretic interpretation. For example, universal value restriction is interpreted thus:
(∀R.C)^{ℑ} = { a ∈ Δ^{ℑ} | ∀b: (a,b) ∈ R^{ℑ} → b ∈ C^{ℑ} }
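The same definition can be written down almost literally in Python. The sketch below assumes a finite domain, concepts as sets and roles as sets of pairs; the function names and the toy data (tom, h1, bike) are my own and only serve the illustration.

```python
def complement(domain, concept):
    """Everything in the domain that is not in the concept."""
    return frozenset(domain) - frozenset(concept)


def all_values_from(domain, role, concept):
    """Universal restriction: every R-filler of a must be in the concept."""
    return frozenset(a for a in domain
                     if all(b in concept for (x, b) in role if x == a))


def some_values_from(domain, role, concept):
    """Existential restriction: a has at least one R-filler in the concept."""
    return frozenset(a for a in domain
                     if any(b in concept for (x, b) in role if x == a))


def min_cardinality(domain, role, concept, n):
    """a has at least n distinct R-fillers that belong to the concept."""
    return frozenset(a for a in domain
                     if len({b for (x, b) in role if x == a and b in concept}) >= n)


domain = frozenset({"tom", "h1", "bike"})
car = frozenset({"h1"})
owns = {("tom", "h1")}

print(sorted(some_values_from(domain, owns, car)))  # ['tom']
print(sorted(all_values_from(domain, owns, car)))   # ['bike', 'h1', 'tom']
```

Note the second result: an element with no fillers at all satisfies the universal restriction vacuously, which is exactly what the implication in the formula says.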
As a little exercise, you could spell out the formal semantics of some of the other forms. This list of constructors constitutes a powerful means of expressing all sorts of concepts. Description Logic as a formalism has several variants with different computational characteristics. A particular variant is defined by the set of constructors that are allowed in it. Suffice it to say that all of them are available in OWL. So let's see a few concrete examples of what can be expressed in this language so far, staying on our automobile theme from last time:
| DL Expression | Denoting Concept |
|---|---|
| Car ⊓ ¬Red | All cars that are not red. |
| ∃hasPart.American | All objects that are at least in part made in America. |
| Person ⊓ ∀owns.(Hybrid ⊔ BioDiesel) | Climate change conscious people that don't own cars based exclusively on fossil fuels. |
In the constructor table, I showed the OWL Manchester syntax right below the OWL functional syntax. To get a feel, here's what the last expression above looks like in the Manchester syntax:
Person and owns only (Hybrid or BioDiesel)
If you are familiar with mathematical logic, you probably noticed the absence of variables. If this makes you uncomfortable, just think of concept expressions in DL as implicitly containing one free variable ranging over the domain of discourse. In other words, concept expressions are what you get as logical formulas in DL.
Making Statements - the TBox and the ABox
So far so good. We have seen how to make complex class descriptions in terms of simpler ones. Let's see how we can state facts (a.k.a. axioms). There are two fundamental kinds of axioms in DL: axioms expressing constraints purely within the conceptual model and axioms talking about individuals in the world being described. The former comprise the so-called TBox (Terminological box) while the latter comprise the ABox (Assertion box). OWL itself doesn't make that distinction, but reasoning algorithms use it and you will come across those terms in the literature and discussion groups. I already showed you the two main types of ABox axioms, concept and role assertions with the following semantics:
C(a) is true in ℑ if a^{ℑ} ∈ C^{ℑ}
R(a,b) is true in ℑ if (a^{ℑ},b^{ℑ})∈R^{ℑ}
Another way to say the above is that the interpretation ℑ satisfies C(a) and that ℑ satisfies R(a,b). The concept and individual assignments that ℑ makes are consistent with those assertions. If an interpretation satisfies all axioms in an ABox, it is a model for that ABox. Concept and role assertions are not the only possible kinds of statements in an ABox, but they are the most important ones. Other assertions allow you to say when different names should be interpreted as the same individual and when not. More on them below.
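As a small illustration of what "satisfies" means, here is a Python sketch that checks a list of assertions against one hand-written interpretation. The tuple encoding of assertions and the element names e1 and e2 are assumptions of mine, not anything OWL prescribes.

```python
# One interpretation: names -> domain elements, concepts -> sets, roles -> sets of pairs.
individuals = {"Tom": "e1", "H1": "e2"}
concepts = {"Person": {"e1"}, "Car": {"e2"}}
roles = {"owns": {("e1", "e2")}}


def satisfies(assertion):
    """Check a single ABox assertion against the interpretation above."""
    kind = assertion[0]
    if kind == "concept":            # ("concept", "C", "a") stands for C(a)
        _, c, a = assertion
        return individuals[a] in concepts[c]
    if kind == "role":               # ("role", "R", "a", "b") stands for R(a,b)
        _, r, a, b = assertion
        return (individuals[a], individuals[b]) in roles[r]
    raise ValueError(f"unknown assertion kind: {kind}")


abox = [
    ("concept", "Person", "Tom"),
    ("concept", "Car", "H1"),
    ("role", "owns", "Tom", "H1"),
]

# The interpretation is a model of the ABox if it satisfies every assertion.
is_model = all(satisfies(axiom) for axiom in abox)
print(is_model)  # True
```

Change any of the sets, for instance by removing e2 from Car, and the same interpretation stops being a model of this ABox.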
In the TBox, the axioms establish a priori facts about concepts and roles. Two main types of axioms are used, inclusions and equalities:
C ⊑ D (inclusion or subsumption) is true in ℑ if C^{ℑ} ⊆ D^{ℑ}
C ≡ D (equality or definition) is true in ℑ if C^{ℑ} = D^{ℑ}
R ⊑ S (role subsumption) is true in ℑ if R^{ℑ} ⊆ S^{ℑ}
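In a finite interpretation these three conditions are one-liners. The following Python sketch uses made-up sets (the Vehicle concept and the individuals are hypothetical) to check one axiom of each kind.

```python
# Concepts as sets of elements, roles as sets of pairs (all data is illustrative).
sports_car = {"roadster"}
car = {"roadster", "sedan"}
vehicle = {"roadster", "sedan"}
has_son = {("ann", "bob")}
has_child = {("ann", "bob"), ("ann", "eve")}


def includes(sub, sup):
    """Inclusion, for concepts and roles alike: subset test."""
    return sub <= sup


def equal(c, d):
    """Equality (definition): both sides denote the same set."""
    return c == d


tbox_checks = {
    "SportsCar subsumed by Car": includes(sports_car, car),
    "Car equivalent to Vehicle": equal(car, vehicle),
    "hasSon subsumed by hasChild": includes(has_son, has_child),
}

satisfies_tbox = all(tbox_checks.values())
print(satisfies_tbox)  # True
```

Role subsumption is a subset test, not an equality test: hasChild is allowed to contain pairs that hasSon does not.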
And similarly, an interpretation ℑ satisfies a TBox whenever it satisfies all axioms in it. Note that we can also define property inheritance in Description Logic, not only class inheritance. An example would be hasSon that is a subrole of hasChild: if somebody has a son, one can infer that they definitely have a child. Even though we've just used atomic names here, a full concept expression can appear on either side of an inclusion or an equality axiom. For example we can define a Pedestrian as somebody who doesn't own a car:
Pedestrian ≡ Person ⊓ ∀owns.(¬Car)
From this, a reasoner can already trivially infer that a Pedestrian is a Person. As another example, we can say that true sports cars must have no more than two doors:
SportsCar ⊑ Car ⊓ ≤ 2 hasPart.Door
The above axiom states that whenever something is known to be a sports car, it is definitely a car (so if somebody owns it, they can't be a pedestrian) and it can't have more than 2 doors. If you declare an individual as a SportsCar and then proceed to assign 4 different doors to it:
hasPart(MyCar, FrontLeftDoor)
hasPart(MyCar, FrontRightDoor)
hasPart(MyCar, BackLeftDoor)
hasPart(MyCar, BackRightDoor)
a reasoner would complain about an inconsistency in your knowledge base, provided the four fillers are known to be Doors and to be distinct individuals (OWL does not take that for granted, as the discussion of the unique name assumption later in this article shows). In that sense, the axiom is enforced as a constraint.
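Whether the four assertions really clash with the cardinality restriction depends on how many distinct things the four door names denote. Here is a rough Python sketch of that point; the two denotation mappings are hypothetical interpretations I made up, not something a reasoner exposes.

```python
DOOR_NAMES = ["FrontLeftDoor", "FrontRightDoor", "BackLeftDoor", "BackRightDoor"]


def distinct_doors(denotation):
    """Count the distinct domain elements that the four names denote."""
    return len({denotation[name] for name in DOOR_NAMES})


def violates_max_two_doors(denotation):
    """True if the names denote more than 2 distinct elements."""
    return distinct_doors(denotation) > 2


# Every name denotes its own element, as if the names were declared different.
all_distinct = {name: name for name in DOOR_NAMES}

# Names are allowed to share a denotation, so only two elements remain.
merged = {
    "FrontLeftDoor": "d1", "BackLeftDoor": "d1",
    "FrontRightDoor": "d2", "BackRightDoor": "d2",
}

print(violates_max_two_doors(all_distinct))  # True
print(violates_max_two_doors(merged))        # False
```

As long as an interpretation like the second one is allowed, a model exists, so the inconsistency only shows up once the door names are stated to denote different individuals.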
Even though operators like intersection and union could be defined for
roles, this is not done in OWL. There are other ways to specify role
constraints at the conceptual level though. Besides role subsumption,
one can constrain the source and target of roles, or domain and range of
a property in OWL terms. Talking about "domains" and "ranges" is more
familiar than "roles" and "fillers", and consistent with the view of OWL
properties as binary relations. OWL provides the special axioms ObjectPropertyDomain and ObjectPropertyRange.
However, one should keep in mind that such constraints can be specified
using the existing DL tools and are in fact interpreted in exactly that
way by OWL reasoners:
≥ 1 R.⊤ ⊑ C (domain of R is C)
⊤ ⊑ ∀R.C (range of R is C)
In English, the first axiom above says that anything with at least 1 role R filled by whatever also belongs to the concept C. In other words, whenever you have R(x, ?), you can infer that x ∈ C. Similarly, the second axiom says that any individual can only be a filler of role R if it belongs to the concept C. Therefore, a statement of the form R(?, x) would allow a DL inference engine to conclude that x belongs to C. Notice the pattern here that allows you to introduce a constraint that applies to everything. Saying that the universe (⊤) is subsumed by a concept C is the same as saying that all individuals belong to C.
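The practical upshot is that domain and range axioms produce new type assertions rather than reject data. A minimal Python sketch of that inference step follows, assuming a dictionary encoding of the axioms that I chose for the example (isOwnedBy with domain Car and range Person).

```python
def infer_types(role_assertions, domain_of, range_of):
    """Derive concept assertions (concept, individual) from role assertions.

    role_assertions: iterable of (role, subject, filler) triples
    domain_of / range_of: dicts mapping a role name to a concept name
    """
    inferred = set()
    for role, subject, filler in role_assertions:
        if role in domain_of:
            inferred.add((domain_of[role], subject))  # R(x, ?) gives C(x)
        if role in range_of:
            inferred.add((range_of[role], filler))    # R(?, x) gives C(x)
    return inferred


assertions = [("isOwnedBy", "H1", "Tom")]
types = infer_types(assertions,
                    domain_of={"isOwnedBy": "Car"},
                    range_of={"isOwnedBy": "Person"})
print(sorted(types))  # [('Car', 'H1'), ('Person', 'Tom')]
```

If you come from a database background, this is worth pausing on: the axioms behave like inference rules, not like foreign key checks.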
Moreover, just like binary relations in classical set theory, roles in Description Logic can be classified semantically into symmetric, asymmetric etc. One can directly make such declarations about OWL properties as TBox axioms and enrich the conceptual model this way. And this is again where the beauty of DL and OWL shines. The logical apparatus that you learn in basic discrete math, something that you might use for documentation purposes, is available in a simple declarative software modeling language. Here are the options and a refresher on what they mean:
| Characteristic | Syntax | Meaning |
|---|---|---|
| Functional | FunctionalObjectProperty(isMadeBy) | Only one value is permitted as a role filler for R. An object can have only one such property. |
| Inverse | InverseObjectProperties(hasMade isMadeBy) | This says that hasMade ≡ isMadeBy^{-}. The domain and range are reversed: isMadeBy^{-} = { (a,b) : (b,a) ∈ isMadeBy }. |
| Symmetric | SymmetricObjectProperty(isMarriedTo) | Symmetric means that the relation goes both ways: (a,b)∈R ⇔ (b,a)∈R |
| Asymmetric | AsymmetricObjectProperty(isMadeBy) | Asymmetric means that the relation cannot go both ways: (a,b)∈R ⇒ (b,a)∉R |
| Reflexive | ReflexiveObjectProperty(feeds) | In a reflexive role, every individual is its own filler. For example, everybody feeds themselves. |
| Irreflexive | IrreflexiveObjectProperty(isMarriedTo) | In an irreflexive role, no individual can be its own filler. |
| Transitive | TransitiveObjectProperty(isPartOf) | In a transitive role, whenever (a,b) ∈ R and (b,c) ∈ R we have also (a,c) ∈ R. |
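If the discrete math is rusty, the characteristics above can be spelled out as checks over a finite relation. This is only a sketch with relations as Python sets of pairs; the sample relations (married, part_of, made_by) are invented for the example.

```python
def inverse(r):
    return {(b, a) for (a, b) in r}


def is_functional(r):
    """No element has two different fillers."""
    return all(b == c for (a, b) in r for (x, c) in r if a == x)


def is_symmetric(r):
    return set(r) == inverse(r)


def is_asymmetric(r):
    return all((b, a) not in r for (a, b) in r)


def is_reflexive(r, domain):
    return all((a, a) in r for a in domain)


def is_irreflexive(r):
    return all(a != b for (a, b) in r)


def is_transitive(r):
    return all((a, d) in r for (a, b) in r for (c, d) in r if b == c)


married = {("tom", "betty"), ("betty", "tom")}
part_of = {("piston", "engine"), ("engine", "car"), ("piston", "car")}
made_by = {("h1", "acme"), ("h2", "acme")}

print(is_symmetric(married), is_irreflexive(married))  # True True
print(is_transitive(part_of), is_asymmetric(part_of))  # True True
print(is_functional(made_by))                          # True
```

Keep in mind that these functions test a single given relation, whereas an OWL axiom such as TransitiveObjectProperty constrains every model and lets a reasoner derive the missing pairs.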
Some of those role semantics can be expressed using available machinery. For example, the fact that a role R is functional can be expressed as ⊤ ⊑ ≤ 1 R.⊤. But transitivity can't. Another such "irreducible" construction available in OWL is ObjectPropertyChain, which allows you to express role composition. You can say that a chain of roles that indirectly connects two individuals establishes a relationship between them. A common example of this is the uncle relationship, which would be defined in OWL like this:
SubObjectPropertyOf(ObjectPropertyChain(hasFather hasBrother) hasUncle)
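In set terms, the chain is relational composition. Here is a tiny Python sketch of what the axiom lets a reasoner derive; the family members ann, bob and carl are made up.

```python
def compose(r, s):
    """Relational composition: (a, c) whenever r relates a to b and s relates b to c."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


has_father = {("ann", "bob")}
has_brother = {("bob", "carl")}

# The chain axiom says the composition must be contained in hasUncle,
# so these pairs are the least a reasoner has to add.
has_uncle = compose(has_father, has_brother)
print(has_uncle)  # {('ann', 'carl')}
```

The axiom is an inclusion, so hasUncle may well contain other pairs too (a mother's brother, for instance) that do not come from this particular chain.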
To sum up, it is common for DL systems to separate the knowledge into a purely conceptual model, sort of like a schema definition, the TBox, and actual data which associates individuals to concepts and assigns them roles, the ABox. Reasoning tools tend to use different algorithms and optimization techniques depending on whether they are dealing exclusively with a TBox or an ABox or a mix of both. The gist of the formalism is the ability to describe complex concepts in terms of simpler ones and to describe individuals in terms of how they are classified and how they relate to other individuals. It is a logic language with no variables all right, but not a propositional one. I've advertised the ability to make nontrivial inferences about the accumulated knowledge, so let's take a look at those now.
Reasoning with Concepts and Roles
There are a few core reasoning problems about the conceptual portion (TBox) of a DL knowledge base, stemming from natural questions that one might ask. For example, is a given concept satisfiable, in the sense that it is possible to find a model where that concept describes a non-empty set? (Remember, when we say concept here, we may mean a possibly complex logical formula, a full description.) Another question is whether one concept subsumes another (in OWL terms, whether a class is a subclass of another), which can be reduced to satisfiability because:
C ⊑ D if and only if C ⊓ ¬D is not satisfiable.
Another question is whether two concepts (or concept formulas) describe the same set of individuals. Again, this can be reduced to satisfiability by first observing that two concepts are equivalent if and only if both C ⊑ D and D ⊑ C are true. Finally, note that two concepts C and D are disjoint if C ⊓ D is unsatisfiable. If you think about it a bit, you'd find that all of those reasoning tasks are reducible to one another. For example, suppose you have an algorithm for subsumption. You can then determine if C is unsatisfiable by checking if it is subsumed by ⊥.
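To see the reductions at work without a real reasoner, here is a deliberately tiny Python sketch. It is restricted to concepts built from atomic names with complement, intersection and union (no roles, empty TBox), where satisfiability boils down to trying every combination of atomic memberships for a single element. The nested-tuple encoding is my own convention, and real DL reasoners use tableau algorithms, not this brute force.

```python
from itertools import product

# Concept encoding (an assumption of this sketch):
#   "A"  |  "TOP"  |  "BOTTOM"  |  ("not", c)  |  ("and", c, d)  |  ("or", c, d)


def holds(c, member):
    """Does an element with the given atomic memberships belong to c?"""
    if c == "TOP":
        return True
    if c == "BOTTOM":
        return False
    if isinstance(c, str):
        return member[c]
    op = c[0]
    if op == "not":
        return not holds(c[1], member)
    if op == "and":
        return holds(c[1], member) and holds(c[2], member)
    if op == "or":
        return holds(c[1], member) or holds(c[2], member)
    raise ValueError(f"unknown operator: {op}")


def atoms(c):
    """Collect the atomic concept names occurring in c."""
    if c in ("TOP", "BOTTOM"):
        return set()
    if isinstance(c, str):
        return {c}
    return set().union(*(atoms(part) for part in c[1:]))


def satisfiable(c):
    names = sorted(atoms(c))
    return any(holds(c, dict(zip(names, bits)))
               for bits in product([False, True], repeat=len(names)))


def subsumed_by(c, d):
    """C is subsumed by D iff (C and not D) is unsatisfiable."""
    return not satisfiable(("and", c, ("not", d)))


def equivalent(c, d):
    return subsumed_by(c, d) and subsumed_by(d, c)


def disjoint(c, d):
    return not satisfiable(("and", c, d))


red_car = ("and", "Car", "Red")
print(subsumed_by(red_car, "Car"))       # True
print(subsumed_by("Car", red_car))       # False
print(disjoint("Car", ("not", "Car")))   # True
```

The point is only the shape of the reductions: subsumption, equivalence and disjointness are all phrased as calls to the single satisfiable function.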
The satisfiability question is more interesting to tool builders because that's how tableau algorithms tend to operate: they try to obtain a contradiction when building a model for a formula. And it is also sometimes a necessary condition for a whole ontology to be consistent. But to people, the subsumption question is often the more interesting one because it can be viewed as implication. If you think of concept expressions as logical formulas with one free variable, then concept subsumption is logical implication in the sense that whenever the subconcept formula is true of an individual, the superconcept formula is true as well.
Another reason the subsumption question is interesting in practice is the ability of inference engines to list all named concepts subsumed by a given concept description, thus automatically constructing a conceptual hierarchy out of TBox constraints.
Reasoning with Individuals
Unlike TBox reasoning where we are dealing purely with conceptual constraints, in the ABox we are stating facts about objects. Something may be wrong with a TBox if a concept is unsatisfiable, which simply means that no individual can belong to it, which simply means that it's equivalent to the bottom concept ⊥. And there's nothing special about that, there aren't any other consequences. Usually, when a contradiction is found in a logic language, it is devastating for the language because it allows one to prove anything as a consequence. An unsatisfiable concept is a sort of a contradiction, but it doesn't make an ontology useless because it doesn't prevent one from creating a model of the set of axioms. That is, it doesn't prevent one from coming up with a sensible interpretation. It's just that the sets corresponding to the unsatisfiable concepts will be empty.
When a TBox has an unsatisfiable concept, it is simply called incoherent. That's an undesirable property, and a knowledge engineer should strive to maintain coherent ontologies. In fact, studying the consequences of incoherence is a topic on its own.
A contradiction involving the ABox however is a different story. It means that it's impossible to find a model for the axioms, i.e. there is no way to interpret them! And this is what defines an inconsistency in DL: an ontology (TBox+ABox) is called inconsistent if there's no model for it. Interestingly, concept satisfiability and consistency have been proven to be equivalent problems! An example of an inconsistency would be asserting that an individual belongs to two disjoint concepts:
Pedestrian(Tom)
Car(H1)
owns(Tom, H1)
Here we've stated that Tom is a Pedestrian and that he owns the individual H1 which we've asserted to be a Car. According to our definition of a Pedestrian above, Tom can only own things that are not cars. Therefore, a reasoner would infer that H1 ∈ ¬Car and that's an inconsistency. Note that the inference engine can't tell you exactly what you did wrong. If this were a real world example, perhaps the statement Pedestrian(Tom) is at fault. But a reasoner won't complain about it because there's no problem with declaring Tom a pedestrian per se.
While consistency is a natural question to ask, a more practical question is instance checking, which asks if a given individual belongs to a given concept. Since a concept can be a complex logical formula, this essentially allows you to ask a logical question about an object. And to check whether C(a) is true, one needs to prove that adding its negation, ¬C(a), to the ontology as an axiom leads to an inconsistency. For example, to find if a given car model is an all American green energy car we could ask if it is an instance of the concept:
∀hasPart.American ⊓ (Hybrid ⊔ BioDiesel)
Even more fun is the ability to ask for all individuals that belong to a concept. This is known as the retrieval problem. It is akin to querying a database by specifying the desired data's characteristics via a logical expression. Conversely, given an individual one may query for all the named concepts the individual belongs to. That is, we can ask for all the types that individual has been classified under.
What About Data Values
Last time you learned that there are two kinds of OWL properties that an
individual can have: object properties and data properties. In
Description Logic, data properties are introduced by extending the
formalism with concrete domains and further allowing standard
logical predicates (i.e. n-ary boolean functions) over those domains. An
example of a concrete domain is the set of natural numbers with binary
predicates for the comparison operators <, ≥ etc. To make the formalism
work, certain restrictions are imposed on the available set of
predicates for a given domain. However, we won't go into details here because OWL supports concrete datatype predicates in class definitions only in a limited form: OWL 2 lets you restrict a single data value with datatype facets (for example, an integer greater than 21), but it has no predicates that compare the values of several data properties with each other. Beyond such restrictions, data values in OWL are used more or less like individuals except they can only appear as role fillers. When we cover rules later, we will see how to get around this OWL limitation by using the SWRL (Semantic Web Rule Language).
Reasoning in an Open-Ended World
The economist John Keynes is famously quoted as saying "When the facts change, I change my mind. What do you do, sir?" So it goes for much of (sound) human reasoning. We are quick to draw conclusions, taking shortcuts, making assumptions and faced with new information we rapidly retract our deductions and change our reasoning. The alternative would be to very rarely commit ourselves to a conclusion, say "I'm not sure" most of the time, only infer things that are certain so we don't have the embarrassment of being wrong. What's the right attitude? That's the debate between monotonic and nonmonotonic reasoning.
In monotonic reasoning, when new axioms are added to the knowledge base, all existing inferences remain unchanged. In other words, knowledge can only grow, deductions are never retracted. If one never makes assumptions that are not explicitly stated, the reasoning will be monotonic. In nonmonotonic reasoning, it is possible for new information to cause retraction of previously drawn conclusions. This happens if extra assumptions are made during inference.
Now, one can argue that nonmonotonic reasoning is more practical, that's how humans do it after all. Or one can argue that you don't want software to deliberately make mistakes by making the wrong assumptions. In software one cares about things like reusability, long-term maintenance, safety and context-independence. When you draw conclusions from a set of facts, you don't want to have to do bookkeeping in what context they were arrived at (did we know A at the time or no?). So the pioneers of the semantic web had the debate and went for monotonicity. Since the global semantic web is an open-ended knowledge source, constantly growing and being refined, monotonicity is the way to go. Good. However, that leads to what's arguably the most counterintuitive aspect of working with OWL DL, especially if you're coming from a software background.
We've seen above the various logical statements that can be made in OWL 2.0. Now, it seems natural to assume that if you don't know whether a statement S is true, then you don't know; you can't simply decide that it's false, right? Well, that's what open-world semantics say as well. And that's what DL systems generally do: they make the open-world assumption (OWA): lack of knowledge that S is true does not automatically mean ¬S. Assuming otherwise is known as the closed-world assumption (CWA), which enables a reasoning style called negation-as-failure where failing to deduce something entails its negation. Using negation-as-failure obviously leads to nonmonotonic reasoning because new facts will invalidate the "failure" part.
Explained like this, the OWA doesn't seem like such a large pill to swallow, it feels fairly natural. It turns out there are surprises and sometimes frustrations when you have spent all your life in a technical environment that operates under the CWA, namely conventional database systems, in particular SQL as well as more traditional logical languages like Prolog. It turns out, adopting the OWA is nothing short of a paradigm shift for the practicing programmer. Let me give you at least one example why and I promise we'll see more later on.
Say you have in the knowledge base
AllAmerican ≡ ∀hasPart.American
American(Engine123)
hasPart(F100, Engine123)
Is AllAmerican(F100) true? Since the only part that F100 has is American and since something is AllAmerican whenever all its parts are made in America, we'd expect AllAmerican(F100) to be true. But this inference can't be made because, according to the open-world assumption, nothing prevents a new piece of information from asserting hasPart(F100, MichelinTires) some time in the future, i.e. that French-made tires are used on the car. In other words, information about the entity in question is incomplete. This is unlike in the classic database world where you'd do a query to list all parts of the entity, join that with the "MadeIn" table listing where parts were made and you get the answer. The constraint that we've stated in the first axiom above will help detect an inconsistency if the tires are known not to be American and you assert both:
AllAmerican(F100)
hasPart(F100, MichelinTires)
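One way to see why AllAmerican(F100) is not entailed by the three original statements is to exhibit two models of them that disagree on it. The Python sketch below does just that with two hand-picked finite interpretations; the extra element "tires" and the dictionary layout are assumptions of the example.

```python
def all_values_from(domain, role, concept):
    """Elements all of whose fillers for the role belong to the concept."""
    return {a for a in domain
            if all(b in concept for (x, b) in role if x == a)}


def satisfies_facts(model):
    """American(Engine123) and hasPart(F100, Engine123) hold in the model."""
    return ("Engine123" in model["american"]
            and ("F100", "Engine123") in model["has_part"])


models = [
    # Model 1: the engine is the only part F100 has.
    {"domain": {"F100", "Engine123"},
     "american": {"Engine123"},
     "has_part": {("F100", "Engine123")}},
    # Model 2: same stated facts, plus a part nobody told us about.
    {"domain": {"F100", "Engine123", "tires"},
     "american": {"Engine123"},
     "has_part": {("F100", "Engine123"), ("F100", "tires")}},
]

answers = [
    "F100" in all_values_from(m["domain"], m["has_part"], m["american"])
    for m in models if satisfies_facts(m)
]
print(answers)       # [True, False]
print(all(answers))  # False: not true in every model, hence not entailed
```

A closed-world system effectively considers only the first model; an OWL reasoner has to account for the second one as well.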
But the concept definition itself doesn't provide a sure way to retrieve all its instances. On the other hand, if you define:
American ≡ ∃hasPart.American ⊔ madeIn:America
madeIn(Engine123, America)
hasPart(F100, Engine123)
this is a much more constructive definition. It only defines something
that is at least part American. Note how DL is capable of dealing with
cyclic definitions. Now, if you ask for everything American, you will
get both Engine123
and F100
in the result set.
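The reason this definition is "constructive" is that membership follows from facts that are already present. A rough Python sketch of that derivation, as a simple forward-chaining loop over the two assertions above (this is my own simplification, not how a tableau reasoner actually works):

```python
made_in = {("Engine123", "America")}
has_part = {("F100", "Engine123")}

# Base case: anything made in America is American.
american = {x for (x, place) in made_in if place == "America"}

# Recursive case: anything with an American part is American.
# Repeat until no new individuals are added.
changed = True
while changed:
    new = {whole for (whole, part) in has_part if part in american} - american
    american |= new
    changed = bool(new)

print(sorted(american))  # ['Engine123', 'F100']
```

The loop only uses the "if" direction of the definition, which is all that is needed to retrieve the instances that the knowledge base guarantees.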
In general, problems with OWA arise when a query or an expected inference rely on the knowledge base somehow having exhaustive information about how entities are related to all other possible entities. There are a few tricks that one can use to "close" the world by explicitly adding information or additional constraints to an ontology with the purpose of forcing certain inferences:
- Listing all individuals explicitly with the enumeration constructor {...}. That's a way to tell the reasoner that it has complete information about the members of a class.
- Imposing precise cardinality constraints. For example, one could state that a product has no more than, say, 3 parts. Then if all those 3 parts are explicitly listed (and known to be distinct), a reasoner knows no extra parts are possible and can decide if the product is AllAmerican or not.
- Stating explicitly that something doesn't have a certain kind of property. For example, one could state that (¬∃hasPart.¬American)(F100).
- Stating explicitly that something doesn't have a certain property. OWL allows negative property assertions with the NegativeObjectPropertyAssertion or the NegativeDataPropertyAssertion. Note that those are syntactic sugar. You can say the same thing using concept complement: {a} ⊑ ¬(∃P.{b})
All those are valid means to "close the world" and they are used in practice. But one must also keep in mind that the reasoning algorithm is separate from the modeling language. Nothing prevents you from applying nonmonotonic reasoning with negation-as-failure to DL models in limited and controlled contexts.
UNA - The Unique Name Assumption
To complete our short account of the mathematical foundation of OWL, we will have to take a look at another consequential open-world aspect of DL reasoners: the unique name assumption (UNA), which OWL does not make. The UNA states that distinct names necessarily refer to distinct entities. Recall that names in OWL are URIs, that is identifiers unique within the global namespace of all names in the semantic web. So we are saying here that several different identifiers, unique as they are, may actually identify the same thing. This is again something that we're not used to in classic logic systems or databases where distinct identifiers refer to distinct entities.
Unlike the OWA, the UNA actually makes good sense in the context of Description Logic and it is a natural expectation that a knowledge engineer may rely on. However, OWL does not make that assumption and this is in part due to the global nature of naming in OWL. Everybody can come up with a vocabulary and it would be nice to be able to state post factum that we are talking about the same thing even when we were using a different name for it. In fact there's an axiom for that, called an agreement axiom:
x ≈ y
The agreement says that x and y refer to the same entity so that all facts about x are also true about y and vice versa. Before you conclude that this is akin to variable assignment and the URIs of OWL individuals are like variables in a programming language, note that this statement is symmetric: it goes in both directions! Agreement is also something that can be automatically inferred by a reasoner as well as a question that a user may ask of a knowledge base. To repeat: a reasoner is free to conclude that two different names, two different URIs, are actually referring to the same real-world entity. This sort of inference that dabbles with the sacred notion of identity can lead to some rather unexpected inference results. Consider the sensible constraint that every car has exactly one owner:
Car ⊑ (= 1 isOwnedBy.Person)
Now, suppose also that at a certain point in time it is known, asserted in the knowledge base, that isOwnedBy(H1, Tom). Later, the car identified with H1 is acquired by Betty so you assert isOwnedBy(H1, Betty) as well, yet you forget to remove the assertion about Tom's ownership. Or, maybe Tom and Betty got married and she became a co-owner of the car. That should violate our sole ownership constraint, right? Wrong! For the reasoner, Tom and Betty are just names referring to an entity and because names are not assumed to be unique, it happily concludes that Tom ≈ Betty. Married or not, they may very well object.
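The reasoning step behind this surprise can be imitated in a few lines of Python. This is only a sketch of that one inference, not a reasoner: it assumes a property that allows at most one value per subject (as isOwnedBy does for cars here), and it does not chain groups across subjects. The function name infer_same_individuals is invented for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def infer_same_individuals(assertions: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Given (subject, value) assertions for a property that allows at most one
    value per subject, return the groups of names forced to co-refer."""
    values_by_subject: Dict[str, Set[str]] = defaultdict(set)
    for subject, value in assertions:
        values_by_subject[subject].add(value)
    # Without the UNA, two names for the single allowed value must denote one individual.
    return [names for names in values_by_subject.values() if len(names) > 1]


is_owned_by = [("H1", "Tom"), ("H1", "Betty")]
print(infer_same_individuals(is_owned_by))  # one group containing 'Tom' and 'Betty'
```

Nothing in the input says that Tom and Betty are different, so merging them is the only way to satisfy the constraint.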
Fortunately, getting around that behavior is much easier than with the OWA. There are a few direct ways to dissociate individuals, and here are some of the OWL axioms available (standard accounts of DL don't have shorthand notations for these, but there are ways to encode the knowledge):
owl:DifferentIndividuals(I1, I2, ..., In) states that the listed individuals are all distinct.
owl:DisjointClasses(C1, C2, ..., Cn) states that each pair of classes Ci, Cj, i≠j, is disjoint, i.e. Ci ∩ Cj = ∅.
owl:DisjointUnionOf(C, C1, C2, ..., Cn) states that the class C is the union of C1, C2, ..., Cn and furthermore that each pair Ci, Cj is disjoint, i.e. Ci ∩ Cj = ∅. In other words, C being the disjoint union of C1, C2, ..., Cn is the same thing as saying that C1, C2, ..., Cn form a partition of C.
Being an exhaustive list of the subclasses, the DisjointUnionOf axiom is a bit like an enumeration, but for classes rather than for the individuals in a class. So in case you were wondering, when a concept is defined through an enumeration of its individuals, say C = {a,b,c}, that doesn't imply that the individuals in question, a, b and c, are different. If they are, you'd have to state it separately with owl:DifferentIndividuals(a,b,c).
So in the Tom and Betty situation described above, we have several means to avoid the undesirable inference. Of course we could just declare that they are different. But we can also have Tom belong to the concept Male, declared as disjoint from the concept Female to which Betty belongs.
There may be other indirect ways to refine a model once you discover such an unwanted inference. Inference engines are often capable of giving an explanation of a certain inference so you can figure out the logical steps that led to it and break the chain somewhere else. In our example, we may have a relationship isMarriedTo(Tom, Betty) which may be declared in the TBox to be symmetric, which would imply isMarriedTo(Betty, Tom). To fix the problem, we could declare the isMarriedTo property as irreflexive, which would imply Tom != Betty.
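What happens once the two names are known to be different can also be sketched in Python. Under the assumptions labeled in the code (hypothetical helper names, a property limited to one value per subject), the two owners can no longer be merged. The knowledge base then becomes inconsistent, and that is what a reasoner would report instead of silently equating Tom and Betty.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Set, Tuple

Pair = FrozenSet[str]


def different_from_irreflexive(assertions: Iterable[Tuple[str, str]]) -> Set[Pair]:
    """P(a, b) with P irreflexive implies that a and b are different individuals."""
    return {frozenset((a, b)) for a, b in assertions if a != b}


def different_from_disjoint(members_c: Set[str], members_d: Set[str]) -> Set[Pair]:
    """If classes C and D are disjoint, no member of C can equal a member of D."""
    return {frozenset((a, b)) for a in members_c for b in members_d if a != b}


def clashes(owners: Set[str], different: Set[Pair]) -> List[Pair]:
    """All owners of one car must co-refer; return the pairs that cannot."""
    return [frozenset(pair) for pair in combinations(sorted(owners), 2)
            if frozenset(pair) in different]


owners_of_h1 = {"Tom", "Betty"}
via_marriage = different_from_irreflexive([("Tom", "Betty")])  # isMarriedTo is irreflexive
via_gender = different_from_disjoint({"Tom"}, {"Betty"})       # Male and Female are disjoint

print(clashes(owners_of_h1, set()))                # [] : nothing stops the merge
print(bool(clashes(owners_of_h1, via_marriage)))   # True: the constraint is violated
print(bool(clashes(owners_of_h1, via_gender)))     # True
```

Either route to Tom != Betty turns the silent merge into a detectable violation of the single-owner constraint.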
Finally, note that some tools may allow you to set a global parameter to force the UNA, which has the same effect as declaring all individuals to be different. So check your documentation.
Conclusion
OK, if you've read and understood the above then you know almost all of OWL already. I deliberately stuck with the Description Logic terminology and syntax because I believe it forces one to stay in "math land" and think about OWL from a mathematical logic viewpoint, rather than through the prism of the OO programmer with all the baggage that this entails.
OWL DL is a formalism that merges object-oriented modeling ideas into a mathematical logic that allows you to encode highly structured knowledge and to make nontrivial inferences from it. One of the skills that you'd need to develop while working with OWL is the ability to formulate questions in terms of logical concept descriptions. You should also get comfortable with the idea that class constraints are expressed as logical formulas and that subclassing is logical implication. Don't forget that knowledge is open-ended. And by the way, reasoning in OWL DL is sound, complete and decidable. This means a reasoner only makes true inferences, it makes all inferences, and you can't make it loop forever.
Coming Up
In the next and following installments, we'll dive into actual modeling and coding. As a piece of homework, I'd suggest you go through the very detailed Protégé tutorial. We will be building an application based on an OWL model and using the standard OWL API.