UML
The Unified Modeling Language (UML), created in the mid-1990s, is an industry standard that defines a set of modeling languages for making various kinds of models and diagrams in support of object-oriented problem analysis and software design. Its core languages are Class Diagrams for information/data modeling, and Sequence Diagrams, Activity Diagrams and State Diagrams (or State Charts) for process/behavior modeling.
UML class diagrams provide a visual syntax for expressing UML class models, which allow defining information and data models. They can be used both at the more abstract level of conceptual modeling for requirements engineering and at the more detailed level of design modeling for designing the model classes of an app. Their main building blocks are class rectangles and association lines.
A class rectangle has one, two or three compartments, containing the name of the class, its properties, and its methods. The purpose of a class is to classify objects and to define their properties and the methods that can be invoked on them.
RDF and OWL
The Resource Description Framework (RDF) was defined by the W3C in 2004 as a logical formalism that allows (1) formalizing information models in the form of RDF vocabularies, and (2) representing propositional information (e.g., meta-data) on the Web. The core of a UML class model can be expressed as an RDF vocabulary. However, many types of integrity constraints cannot be expressed in RDF. The W3C has therefore defined an extension of RDF, called the Web Ontology Language (OWL), which allows formalizing the core logic (classes, properties and integrity constraints) of a UML class model in the form of an OWL ontology.
An association line connects two class rectangles. The purpose of an association is to classify relationships (links) between objects. While UML classes have a direct counterpart in the class concepts of object-oriented programming (OOP) languages, UML associations do not have such a direct OOP counterpart. They are therefore often more difficult for developers to understand. Only in the special case of a unidirectional functional association is there a direct OOP counterpart: a reference property for referencing the objects that are linked to a given object by the association.
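A minimal JavaScript sketch of this special case may help. It assumes a hypothetical functional association from Book to Publisher; the class names, property names and data values are illustrative only. The association is coded as a reference property publisher whose value is a Publisher object.

```javascript
// Hypothetical model classes, for illustration only
class Publisher {
  constructor({ name }) {
    this.name = name;
  }
}

class Book {
  constructor({ isbn, title, publisher }) {
    this.isbn = isbn;
    this.title = title;
    // Reference property: codes the unidirectional functional
    // association "a book is linked to one publisher"
    this.publisher = publisher;
  }
}

const pub = new Publisher({ name: "Example Press" });
const book = new Book({ isbn: "006251587X", title: "Example Title", publisher: pub });

// Following the link from the book to its publisher
console.log(book.publisher.name);
```

Notice that the link can only be followed from the book to the publisher. Nothing in a Publisher object refers back to its books, which is what makes the association unidirectional.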
From a logical point of view, a class model defines a vocabulary, or language, for expressing various types of fact statements about objects. The knowledge representation languages RDF and OWL allow formalizing the vocabularies defined by class models and the fact statements made when instantiating their classes by creating objects. In this way, they help us understand the semantics of information models.
In a UML class diagram, a class has a name (shown in the first compartment of the class rectangle), and it may have properties (shown in the second compartment) and methods (shown in the third compartment). Properties and methods may be described with or without details. The following diagrams illustrate these options using the example of a class books or Book for describing books as information objects.
A class can be expressed in UML by just providing its name, without any further detail, like so
This option is useful for making sketches and overview diagrams. Using an ordinary English plural name like books makes the class diagram more readable for non-tech-savvy people.
A more informative description of a class is obtained by listing its properties, possibly without any further detail, like in the following example:
However, for better understanding the meaning of properties and for being able to code a class in an OO programming language, we need to know the range of each property, which is the type of its values. The range of a property can be either a primitive datatype or another class.
In the following diagram we use general implementation-agnostic datatype names (like “Integer”), for which a specific programming language may have specific names (like “int” in Java). Notice that we now use a common OOP naming convention of giving classes a capitalized singular (mixed-case) name like Book (or LearningUnit). This allows saying that “an instance of a class C is a C (object)”, like “an instance of Book is a book (object)”.
Notice how the standard identifier attribute isbn is marked with the keyword id appended to the property declaration in curly braces. This is the UML syntax for defining several kinds of property constraints discussed in the next chapter.
Finally, we can also define the methods and functions of a class in a third compartment, like so:
In this example, the Book class has a function checkISBN, which returns a string. Given a class diagram in this form, it is straightforward to code it in an OO programming language like JavaScript or Java.
Recall that in JavaScript a class is defined in the form of a constructor function that assigns the values of its parameters to the properties of the newly created object, like so:
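A minimal sketch of such a constructor function, assuming the three properties isbn, title and year of the Book class and one constructor parameter per property (the sample values are placeholders), could look as follows:

```javascript
// Constructor function: assigns the values of its parameters
// to the properties of the newly created object
function Book(isbn, title, year) {
  this.isbn = isbn;
  this.title = title;
  this.year = year;
}

// Creating an object with the help of the "new" operator
const book = new Book("006251587X", "Example Title", 2000);
console.log(book.title);
```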
In JavaScript, the (instance-level) methods of a class are defined as method slots of the constructor’s built-in prototype object. This is how we code the checkISBN method:
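The following is only a sketch under stated assumptions, since the exact check and the meaning of the returned string are not specified here. It assumes that an ISBN is a 10-character string consisting of 9 digits followed by a digit or “X”, and that the method returns an empty string when the check passes and an error message otherwise.

```javascript
function Book(isbn, title, year) {
  this.isbn = isbn;
  this.title = title;
  this.year = year;
}

// Instance-level method, defined as a slot of Book.prototype.
// Assumption: returns "" for a well-formed ISBN, else an error message.
Book.prototype.checkISBN = function () {
  if (typeof this.isbn !== "string" || !/^\d{9}(\d|X)$/.test(this.isbn)) {
    return "The ISBN must be a string of 9 digits followed by a digit or 'X'!";
  }
  return "";
};

const book = new Book("006251587X", "Example Title", 2000);
console.log(book.checkISBN() === "" ? "ISBN is well-formed" : book.checkISBN());
```

Because the method is stored in the prototype object, it is shared by all Book objects instead of being copied into each of them.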
If we don’t have to care about older web browsers, such as Internet Explorer 9, we can also use the new class definition syntax (introduced in the ES6 version of JavaScript) and combine the definition of properties and methods in one piece of code:
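A sketch of the Book class in this syntax, under the same assumptions as before (properties isbn, title and year; a checkISBN method that returns an empty string for a well-formed 10-character ISBN and an error message otherwise), might read:

```javascript
class Book {
  // The constructor defines the properties by assigning parameter values
  constructor(isbn, title, year) {
    this.isbn = isbn;
    this.title = title;
    this.year = year;
  }

  // Instance-level method (stored in Book.prototype behind the scenes)
  checkISBN() {
    if (typeof this.isbn !== "string" || !/^\d{9}(\d|X)$/.test(this.isbn)) {
      return "The ISBN must be a string of 9 digits followed by a digit or 'X'!";
    }
    return "";
  }
}

const book = new Book("006251587X", "Example Title", 2000);
console.log(book.checkISBN() === "");
```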
As opposed to JavaScript, Java has always had a language element class for defining classes:
We need to be aware of the ambiguity of the term “object”. We have to distinguish between objects in the sense of real-world objects (also called “business objects” or “entities”) and objects in an OO program, such as JS objects or Java objects. When we want to manage information about business objects of some type in an app, we represent them in the form of JS/Java objects instantiating a JS/Java class that represents their (business) object type. We call these classes model classes for two reasons: first because they implement the classes defined in an app’s data model, and second because they represent the ‘model’ part of an app’s Model-View-Controller codebase architecture.
Therefore, in a JS/Java app, a business object is a JS/Java object, but not every JS/Java object represents a business object because we use JS/Java objects for many purposes (e.g., in JavaScript, an array is a JS object, but it’s not a business object). The same applies to classes: (business) object types are represented as model classes, but not every JS/Java class is a model class because we may use JS/Java classes also for other purposes (e.g., in Java, a class can be used as a container for a method library, but such a class is not a model class).
Whenever an app has to manage the data of more than one object type, it is very likely that there are associations between some of them. For instance, in the following class diagram, there is an association between publishers and books and an association between books and people as authors.
An association between two classes can be read in both directions. The association between publishers and books associates a publisher with its published books and, in the other direction, a book with its publisher. The association between books and people as authors associates a book with its authors and, in the other direction, a person with their authored books.
As will be discussed in Volume 2, associations are characterized by multiplicity constraints, which restrict the possibilities of how many objects of the associated class can be linked to an object of the given class. In our example, we have a one-to-many association between publishers and books and a many-to-many association between books and people.
For keeping things simple, we only include one object type and no association in the apps discussed in this volume of the book. In Volume 2, we will discuss how to model associations and how to implement them.
In a new development project, we start our analysis and modeling effort by making a conceptual information model. This type of model is also called a domain model since it describes the entities of a given (real-world) problem domain, and does not model software entities.
Recall the conceptual information model for books obtained as the result of the inception phase:
Taking this conceptual model as a starting point, we have to make a number of design decisions for obtaining an information design model:
... isbn, title and year?
The result of this design phase is a design model like the following:
It is important to understand that such a design model provides an implementation-agnostic (platform-independent) computational design, that is, it does not use any concept or syntax of any specific programming language or technology. Therefore, the same design model can be used for deriving different platform-specific implementation models for different programming languages or technologies, such as for a Java- or PHP-based framework, or for a plain JavaScript approach.
Based on the design model, by replacing the platform-independent datatype names with JavaScript-specific datatype names, and by adding “setter” methods, we obtain the following JavaScript implementation model, which we prefer to call a JavaScript class model:
Notice that in this model, we have used the JavaScript datatypes string and number, and we have added the methods setISBN, setTitle and setYear. These “setter” methods are supposed to be used for setting a property to a new value, instead of directly assigning the value to the property.
Having a setter method for each property is a best-practice approach that allows more control over property value assignments. For instance, we could check the validity of values before they are assigned, or we could notify other modules of the app about the assignment event.
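The following sketch illustrates the validity-checking use case. The concrete checks (a 10-character ISBN pattern, a non-empty title, an integer year) and the choice of throwing a RangeError on invalid values are assumptions made for this example, not rules stated by the class model.

```javascript
class Book {
  constructor({ isbn, title, year }) {
    // Using the setters here applies the same checks on object creation
    this.setISBN(isbn);
    this.setTitle(title);
    this.setYear(year);
  }

  setISBN(isbn) {
    // Assumed pattern: 9 digits followed by a digit or "X"
    if (typeof isbn !== "string" || !/^\d{9}(\d|X)$/.test(isbn)) {
      throw new RangeError(`Invalid ISBN: ${isbn}`);
    }
    this.isbn = isbn;
  }

  setTitle(title) {
    if (typeof title !== "string" || title.trim() === "") {
      throw new RangeError("The title must be a non-empty string");
    }
    this.title = title;
  }

  setYear(year) {
    if (!Number.isInteger(year)) {
      throw new RangeError(`Invalid year: ${year}`);
    }
    this.year = year;
  }
}

const book = new Book({ isbn: "006251587X", title: "Example Title", year: 2000 });
try {
  book.setYear("next year");  // rejected before any assignment happens
} catch (e) {
  console.log(e.message);
}
console.log(book.year);  // still the previously assigned value
```

Since plain JavaScript does not prevent direct assignments like book.year = "abc", the protection offered by such setters relies on the convention of always using them.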
The implementation phase consists of making an implementation model for a specific technology platform, and then coding this model and testing the resulting program code. In this book, we make both JavaScript class models and Java class models, which are subsequently coded in plain JavaScript and in Java EE, respectively.
The entire transformation chain, from a conceptual model via a design model to a JavaScript class model (as a special type of implementation model), is summarized in the following figure.
In summary, the process of model-based development takes a conceptual model as the starting point for making a general (platform-independent) design model, from which one or more implementation models for a (set of) specific target technologies can be derived. Typically, they include a class model for an object-oriented programming language and a database model for an SQL DBMS. This process is illustrated by the following diagram:
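The Resource Description Framework (RDF), together with its extension RDF Schema, is a logical formalism that allows (1) formalizing information models in the form of RDF vocabularies, and (2) representing propositional information (e.g., meta-data) on the Web.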
RDF is the basis of the Semantic Web. It has several syntaxes, including the textual XML-based syntax of RDF/XML and the visual syntax of RDF Graphs.
Consider the Book class defined in the following class diagram:
The corresponding RDF vocabulary, with one class definition and three property definitions, is defined in the following RDF graph:
In an RDF graph, nodes with an elliptic shape represent “resources” (like properties and classes), and arrows represent relationships defined by a property. Each arrow between two nodes represents a statement (also called “triple”). For instance, the rdfs:range arrow between year and xs:int represents the statement that the range of the property year is the XML Schema datatype xs:int, where xs is a namespace prefix for the XML Schema namespace.
Notice that RDF has the predefined meta-classes rdfs:Class and rdf:Property, used to define classes and their properties with the help of the predefined property rdf:type. For instance, the rdf:type arrow between year and rdf:Property represents the statement that year is of type rdf:Property, that is, it is defined to be an RDF property.
RDF graphs are a formalism for theoretical purposes. They can be used for illustrating simple examples. As opposed to UML class diagrams, they are not useful for visually expressing realistic vocabularies, because they are convoluted and visually more complex than necessary.
The domain of a property has to be defined explicitly in an RDF vocabulary (with an rdfs:domain property statement), as opposed to a UML class diagram where it is defined implicitly. While it is natural to define properties in the context of a class, as in UML, RDF allows defining properties independently of any class.
The RDF/XML syntax allows publishing an RDF vocabulary on the Web. For instance, the simple Book vocabulary defined in the RDF graph above can be represented by the following RDF/XML document:
Notice that the values of the rdf:resource attribute must be URIs. If an attribute value is a fragment identifier like #Book, it represents a relative URI and is resolved into a full URI by appending the fragment identifier to the in-scope base URI, which may be defined with the xml:base attribute.
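This resolution step can be tried out with the standard URL class available in browsers and Node.js. In the following sketch, the base URI http://example.org/ex1 is an assumed example value of the kind that an xml:base attribute might provide.

```javascript
// Assumed base URI, as it might be defined with an xml:base attribute
const baseURI = "http://example.org/ex1";

// A fragment identifier used as the value of an rdf:resource attribute
const relativeURI = "#Book";

// Resolve the relative URI against the in-scope base URI
const fullURI = new URL(relativeURI, baseURI).href;
console.log(fullURI);  // http://example.org/ex1#Book
```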
If an attribute value is an absolute URI like “http://www.w3.org/2001/XMLSchema#string”, it contains a full namespace URI (like “http://www.w3.org/2001/XMLSchema”), even if a namespace prefix (like “xsd” or “xs”) is defined for it. This is because namespace prefixes can only be used for XML element and attribute names, but not for attribute values, which unfortunately makes RDF/XML hard to read for human users.
Notice that the RDF formalization of our simple UML class model above has several shortcomings:
We show how to solve these two issues with the greater expressivity of OWL below.
The propositional information items, or fact statements, expressible with RDF are statements like “ex:Book is a rdfs:Class” or “urn:isbn:006251587X is a ex:Book”, and statements like “the ex:isbn property value of urn:isbn:006251587X is ’006251587X’”.
Consequently, for a UML object definition like
we obtain several RDF fact statements:
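As a rough illustration of this mapping, the following JavaScript sketch derives such fact statements, as (subject, predicate, object) triples, from a hypothetical book object. The title and year values are placeholders, and the helper function toTriples as well as the use of the ex: prefix for property names are assumptions made for this example.

```javascript
// Derive one classification statement plus one property value
// statement per property slot of the given object
function toTriples(subject, className, obj) {
  const triples = [[subject, "rdf:type", className]];
  for (const [prop, value] of Object.entries(obj)) {
    triples.push([subject, `ex:${prop}`, value]);
  }
  return triples;
}

// Hypothetical object data (title and year are placeholders)
const book = { isbn: "006251587X", title: "Example Title", year: 2000 };
const triples = toTriples("urn:isbn:006251587X", "ex:Book", book);

for (const [s, p, o] of triples) {
  console.log(`${s} ${p} ${o}`);
}
```

An object with three property slots thus gives rise to four fact statements: one for its class membership and one for each property value.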
There are many use cases for machine-readable data (e.g., about people, events or products) embedded in web documents. For instance, search engines like Google can use such structured data for providing more meaningful search results.
Structured data, or meta-data, can be embedded in a web document by either adding a JSON-LD script element containing it, or by annotating the document’s content, e.g., the HTML elements of a web page, with RDFa.
Very limited annotation approaches, called “microformats” (proposed around 2005), are the historic predecessors of the general annotation language RDFa, which is derived from RDF. Some microformats, like vCard and vEvent, are still being used today, but they are increasingly replaced with one of the two general formats RDFa and JSON-LD.
The main author of HTML5, Ian Hickson, has proposed an alternative general annotation language, called microdata, with the goal of simplifying RDFa and remedying its usability issues (in particular, by dropping its use of XML namespaces). Despite the (rather unfortunate) choice of using different names for the same annotation concepts (like “itemprop” instead of “property”), Hickson’s microdata proposal succeeded in showing
Since Hickson ended his collaboration with the W3C, the microdata proposal did not obtain official W3C status, and web browsers have discontinued their support for it. However, it triggered a W3C proposal to use the RDFa Lite subset of RDFa, which “can be applied to most simple to moderate structured data markup tasks, without burdening the authors with additional complexities”.
We present a simple example for using structured data in a web page. Consider the following HTML fragment:
For this content, we may want to code the information that
... Person, which has been defined as a class by the search engine standard vocabulary schema.org; ...
Using the RDFa attributes typeof, vocab and property, we can code this information by adding the following annotations to the HTML content:
Using JSON-LD, as recommended by Google, we need to add a script element of type “application/ld+json” containing the meta-data:
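As a sketch of what such meta-data may look like, the following JavaScript code builds a JSON-LD record for a person and serializes it as the content of a script element. The person’s name is a hypothetical placeholder, and the use of the schema.org property name is an assumption about which information is to be coded.

```javascript
// Hypothetical meta-data about a person (the name is a placeholder)
const personData = {
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Jane Doe"
};

// Serialize the record as the content of a JSON-LD script element
const scriptElement =
  `<script type="application/ld+json">${JSON.stringify(personData)}</script>`;

console.log(scriptElement);
```

The @context entry plays the role of the RDFa vocab attribute, and @type plays the role of typeof.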
The propositional information expressed with RDFa annotations and JSON-LD corresponds to the following RDF/XML code:
OWL extends RDF by adding many additional language elements for expressing constraints, equalities and derived classes and properties in the context of defining vocabularies. Facts are expressed as in RDF (e.g., with rdf:Description).
OWL provides its own predefined language elements for defining classes and properties:
owl:Class is a subclass of rdfs:Class.
owl:DatatypeProperty is a subclass of rdf:Property. It classifies attributes. Therefore, the values of an owl:DatatypeProperty are data literals.
owl:ObjectProperty is a subclass of rdf:Property. It classifies reference properties corresponding to unidirectional binary associations. Since the values of a reference property are object references, the values of an owl:ObjectProperty are object references in the form of resource URIs.
We only show with the help of an example that an OWL vocabulary can represent a class diagram more faithfully than the corresponding RDF vocabulary because it allows expressing certain constraints.
Consider the standard identifier attribute isbn defined in the Book class. In an RDF vocabulary, this attribute is defined in the following way:
There are two issues with this RDF definition of an attribute:
Using OWL, we can remedy these shortcomings of RDF. The following OWL property definition makes it explicit that the property http://example.org/ex1#isbn is an attribute, while the added OWL restriction defines an “exactly one” cardinality constraint for it:
Since the ISBN attribute of the Book class has been designated as the standard identifier attribute in the UML class diagram above, we should define a uniqueness constraint for it. We can do this by including an owl:hasKey element within the class definition:
Both RDF and OWL have many usability issues. OWL in particular is so difficult to use that most potential users will be discouraged by it.
OWL was created by a community that is more concerned with formal logic than with information modeling and is not familiar with the concepts and terminology established in information modeling. As a result, OWL introduces many unfamiliar terms for concepts that had already been established and named in information modeling. It even contains duplicate names: an attribute is in most places called “data property”, but in some places it is called “datatype property” (specifically in OWL/RDF).
Usability issues of RDF are:
a. it does not make an explicit syntactic distinction between attributes (having a datatype as range) and reference properties (having an object type as range);
b. it does not allow expressing simple class definitions, which include mandatory value and single-value constraints, in an RDF vocabulary.
OWL is needed for getting these fundamental features.
Usability issues of OWL are:
Consider the problem of managing information about movies, like the Internet Movie Database.