In Chapter 3, we took a high-level tour of Alfresco Share and the Share Records Management site. In this chapter, we will look at the Alfresco Content Model and specifically look at the part of the model that is relevant to Records Management.
This chapter describes the mechanics for entering and configuring the content model within Alfresco. Each of the basic elements of the content model is discussed: types, aspects, properties, constraints, and associations. We will discuss how you can use these building blocks to design and build your own model, and then show how a new content model can be installed and made available from the Alfresco Share user interface.
Later in the chapter, we will look in detail at the built-in Alfresco Records Management Content Model. The model reveals much about the inner workings of Records Management within Alfresco and it also provides a very useful example of how a very rich content model can be created.
Content and metadata storage is a core capability of an enterprise content management system, and it is an area where Alfresco excels. The content model is the framework that prescribes exactly how content data will be stored and how it later can be searched for retrieval. The model describes the structure, the format, and inter-relationships of content. It also provides the framework for organizing content and assigning meaning to it.
While the Alfresco Content Model is built from a very small set of components, the richness and flexibility of those components enable potentially very complex content models to be created.
The content model is actually segmented into a collection of models. For example, Records Management and Workflow are each implemented as separate models.
Each of the individual models contains the description for the specific types of content that can be stored in the repository. Each content type contains a fixed set of metadata properties. Constraints can be applied to properties to limit or to closely define the range of the allowed values for the properties. Associations can also be modeled and associated with types to define relationships between content items such as parent-child relationships or content-to-content references. Dynamic properties and associations can be added at runtime by applying aspects to the content.
When a new piece of content is added to the Alfresco repository, a structure called a node is created to hold the content. Each node gets added to a tree of nodes in the repository and is associated with at least one other node in the tree that acts as its parent. Every node is assigned a content type from the content model. A node can be associated with only a single content type at any one time, although the type of a node could potentially change. Aspects containing additional properties and associations can also be added to or removed from the node at any time.
Alfresco also supports the ability to set ad hoc properties on a node, ones not defined by properties associated with either the type or with applied aspects. Ad hoc properties can be stored as name-value pairs in a generic property bag associated with a node and are called residual properties. While there may be isolated cases where the use of residual properties makes sense, a suggested best practice is to avoid the use of ad hoc properties and to explicitly define all properties that will be needed within the content model.
Creating new content models requires us to assign names to the elements of the models that we define. Our new model must be defined in a way that allows it to globally co-exist with the names used within all other content models that have already been defined.
A common problem when creating new element names for a content model is a conflict with the name of an element already used by another model definition. Name conflicts can prevent the software from running at all, or can cause data to become accidentally corrupted because of confusion over the naming of the elements.
Suppose, for example, that we decide to add a new property called container to a document type that we define in our new custom model. There would be a problem because that name conflicts with the Alfresco repository system content model that already has a property named container.
To avoid naming conflicts like this between content models, Alfresco uses namespaces. A namespace groups together all the elements of the content model and also provides a way to create names that will guarantee their global uniqueness.
Namespaces are typically written as URI strings that start with an HTTP address, usually belonging to the author or the author's company, and then followed by a path that describes or organizes the types of elements contained in the namespace. All standard Alfresco namespaces have URIs that start with http://www.alfresco.org. The URI typically ends with the version number for the namespace.
The table below shows a list of standard Alfresco Content Model namespaces. The namespace URIs can be quite long, and writing code that prefixes model element names with the full namespace URI everywhere makes for verbose and clumsy-looking code.
To avoid having to always write out the namespace URI with an element name, namespace prefixes are defined that significantly shorten the namespace reference. So, instead of having to refer to an element like {http://www.alfresco.org/model/system/1.0}container, we can simply write sys:container. The table lists the prefixes that are used by convention when referring to Alfresco namespaces. The files defining these models can be found in the tomcat\webapps\alfresco\WEB-INF\classes\alfresco\model directory.
| Common Prefix | Namespace | Description |
|---|---|---|
| alf | | General Alfresco Namespace |
| app | | Application Model |
| bpm | | Business Process Model |
| cm | | Content Domain Model |
| d | | Data Dictionary Model |
| fm | | Forum Model |
| st | | Site Model |
| sys | http://www.alfresco.org/model/system/1.0 | Repository System Model |
| dod | | DoD 5015.2 Records Management Model |
| rma | | Records Management Model |
Important namespaces that you'll see frequently referred to are the Content Domain Model and the Dictionary Model. New content models typically inherit from or reuse definitions of these foundational models. You might also notice the Site Model included in this list. The Site Model supports the management of data related to Alfresco Share sites. At the end of the list, there are also two content models that are used by the Alfresco Records Management implementation that we will talk about towards the end of the chapter.
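As a minimal sketch of how prefixes are bound in practice, a model file declares its own namespace and imports the namespaces it reuses. The ex prefix, the example.com URI, and the model name below are hypothetical placeholders. The dictionary and content URIs are assumed to follow the http://www.alfresco.org/model/.../1.0 pattern described above, so verify them against the model files in your installation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical model: the "ex" prefix and example.com URI are placeholders. -->
<model name="ex:exampleModel" xmlns="http://www.alfresco.org/model/dictionary/1.0">
  <description>Example custom model</description>
  <version>1.0</version>

  <imports>
    <!-- Bind prefixes for the models whose definitions are reused. -->
    <import uri="http://www.alfresco.org/model/dictionary/1.0" prefix="d"/>
    <import uri="http://www.alfresco.org/model/content/1.0" prefix="cm"/>
  </imports>

  <namespaces>
    <!-- Every element defined in this file is named with this prefix. -->
    <namespace uri="http://www.example.com/model/example/1.0" prefix="ex"/>
  </namespaces>

  <!-- Constraints, types, and aspects would be defined here. -->
</model>
```

Keeping every custom element under one author-owned namespace is what lets a name such as ex:container coexist with sys:container.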
Types in the Alfresco Content Model provide a way to classify content as it is added to the repository. Every node in the repository is assigned a single type, and the type brings along with it a set of properties, associations, and even aspects that are relevant for that kind of content.
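Types must be uniquely named and include the namespace prefix at the beginning of the type name. The elements enclosed by the <type> tag describe the behavior of a type; the examples later in this section use <title>, <parent>, <archive>, <properties>, <mandatory-aspects>, and <overrides>.
The <parent> element names the type from which a type inherits; all types ultimately inherit from sys:base. Subtypes inherit property, association, and constraint definitions from their parent type. Types can be nested to any depth. Some features of inherited properties, such as the default value, can be overridden in the subtype.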
Note that when defining both properties and associations for a type, the properties must be listed before the associations. It is also not possible to split the properties of a type among multiple <properties> tags; only a single <properties> tag can be used within any one type definition. An example of the definition of a content type can be found in the Records Management model for an rma:recordFolder:
```xml
<type name="rma:recordFolder">
  <title>Record Folder</title>
  <parent>cm:folder</parent>
  <archive>false</archive>
  <properties>
    <property name="rma:isClosed">
      <title>Record Folder Closed</title>
      <description>Indicates whether the folder is closed</description>
      <type>d:boolean</type>
      <protected>true</protected>
      <mandatory>true</mandatory>
      <default>false</default>
    </property>
  </properties>
  <mandatory-aspects>
    <aspect>cm:titled</aspect>
    <aspect>rma:recordComponentIdentifier</aspect>
    <aspect>rma:commonRecordDetails</aspect>
    <aspect>rma:filePlanComponent</aspect>
  </mandatory-aspects>
</type>
```
Overrides to properties inherited from the parent type can be defined in the subtype as follows:
```xml
<type>
  ...
  <overrides>
    <property name="cm:autoVersion">
      <default>false</default>
    </property>
  </overrides>
</type>
```
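The ordering rule described above (a single properties block, listed before any associations) can be sketched with a small custom type. All ex names are hypothetical, and the <associations> wrapper element is assumed from the usual layout of Alfresco model files rather than shown elsewhere in this chapter.

```xml
<!-- Hypothetical type: the "ex" names are placeholders, not part of Alfresco. -->
<type name="ex:invoice">
  <title>Invoice</title>
  <parent>cm:content</parent>
  <!-- One properties block only, placed before the associations. -->
  <properties>
    <property name="ex:invoiceNumber">
      <type>d:text</type>
      <mandatory>true</mandatory>
    </property>
  </properties>
  <associations>
    <association name="ex:relatedInvoices">
      <source>
        <mandatory>false</mandatory>
        <many>true</many>
      </source>
      <target>
        <class>ex:invoice</class>
        <mandatory>false</mandatory>
        <many>true</many>
      </target>
    </association>
  </associations>
  <mandatory-aspects>
    <aspect>cm:titled</aspect>
  </mandatory-aspects>
</type>
```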
Properties are one of the most important components of the definition for types and aspects. All properties in type and aspect definitions are grouped together and enclosed by a single <properties> tag. Each property is uniquely named by including a namespace prefix as the initial part of the name. The property name is an attribute of the property called name, as in <property name="rma:location">.
The elements enclosed by the <property> tag describe the behavior of a property; the examples throughout this chapter use <title>, <description>, <type>, <protected>, <mandatory>, <default>, <index>, and <constraints>.
A mandatory property is strictly enforced only when the <mandatory> tag is further qualified with a true value for the enforced attribute. When enforced is set to false, as in <mandatory enforced="false">, if the property is not set at the time of the transaction, the transaction will not be blocked, but after the transaction is completed, the node will be marked with the sys:incomplete aspect.
Every property must be typed. This means that each property is associated with a data type that is defined by the type element. type is the only element that is mandatory when defining a property. Alfresco has a wide range of data types available, and it's possible to add more if the data type that you need isn't available. In most cases, however, the standard data types offered by Alfresco are sufficient.
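The effect of the enforced attribute can be sketched with two mandatory properties that differ only in that setting. The property names are hypothetical, and the comments restate the behavior described above rather than anything verified against a particular Alfresco release.

```xml
<properties>
  <!-- Enforced: a transaction that leaves this property unset is blocked. -->
  <property name="ex:caseNumber">
    <type>d:text</type>
    <mandatory enforced="true">true</mandatory>
  </property>
  <!-- Not enforced: the transaction completes, and the node is then
       marked with the sys:incomplete aspect. -->
  <property name="ex:reviewer">
    <type>d:text</type>
    <mandatory enforced="false">true</mandatory>
  </property>
</properties>
```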
Because the core Alfresco software is written in Java, the data types available in a content model closely parallel the data types available in Java. The following table lists some of the common data types available for use in the Alfresco Content Model. The complete list of Alfresco data types can be found in the file tomcat\webapps\alfresco\WEB-INF\classes\alfresco\model\dictionaryModel.xml.
| Data type name | Java equivalent | Description |
|---|---|---|
| | | A text or character string. |
| | Alfresco custom type | Multilingual text. Able to store multiple translations of a text string. |
| | Alfresco custom type | Arbitrary content stored as a text or binary stream. |
| | | 32-bit signed two's complement integer. |
| | | 64-bit signed two's complement integer. |
| | | Single-precision 32-bit IEEE 754 floating point. |
| | | Double-precision 64-bit IEEE 754 floating point. |
| | | Date value. |
| | | Date and time value. |
| | | Boolean data, either true or false. |
| | | Locale to describe a geographical or cultural region. |
| | Alfresco custom type | A file path. |
| | | Any value, regardless of type. |
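A brief sketch of properties declared with a few different data types follows. The property names are hypothetical. d:text, d:boolean, and d:date appear in the examples in this chapter, while d:int and d:datetime are assumed here to be the dictionary names for the integer and date-and-time types, so check dictionaryModel.xml for the exact names.

```xml
<properties>
  <property name="ex:summary">
    <type>d:text</type>
  </property>
  <!-- Assumed name for the 32-bit integer type. -->
  <property name="ex:pageCount">
    <type>d:int</type>
  </property>
  <!-- Assumed name for the date and time type. -->
  <property name="ex:receivedAt">
    <type>d:datetime</type>
  </property>
  <property name="ex:isConfidential">
    <type>d:boolean</type>
    <default>false</default>
  </property>
</properties>
```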
Constraints limit the allowed range of values for a property. Within a model XML file, constraints can be defined independently of the definition for any one type or aspect. Constraints defined in this way can then be reused as part of the property definition anywhere within the model.
```xml
<property name="cm:userName">
  <type>d:text</type>
  <mandatory>true</mandatory>
  <constraints>
    <constraint ref="cm:userNameConstraint" />
  </constraints>
</property>
```
It is also possible to define an in-line constraint as part of the definition of the property. In this case, the constraint cannot be applied to any other property outside the one in which it is defined. A simple example of this is the following:
```xml
<property name="test:constrainedProp">
  <type>d:text</type>
  <constraints>
    <constraint type="LENGTH">
      <parameter name="minLength"><value>0</value></parameter>
      <parameter name="maxLength"><value>100</value></parameter>
    </constraint>
  </constraints>
</property>
```
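To contrast the two styles, the following sketch defines one named constraint at the model level and reuses it from two properties by reference. The ex names and the code pattern are hypothetical, and placing the definition in a model-level <constraints> block is assumed from the usual layout of Alfresco model files.

```xml
<!-- Defined once, at the model level. The "ex" names are hypothetical. -->
<constraints>
  <constraint name="ex:referenceCode" type="REGEX">
    <parameter name="expression"><value>[A-Z]{3}-[0-9]{4}</value></parameter>
    <parameter name="requiresMatch"><value>true</value></parameter>
  </constraint>
</constraints>

<!-- Referenced by name from any property in the model. -->
<properties>
  <property name="ex:orderCode">
    <type>d:text</type>
    <constraints>
      <constraint ref="ex:referenceCode"/>
    </constraints>
  </property>
  <property name="ex:shipmentCode">
    <type>d:text</type>
    <constraints>
      <constraint ref="ex:referenceCode"/>
    </constraints>
  </property>
</properties>
```

A named constraint keeps the rule in one place, so a change to the expression applies to every property that references it.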
Out of the box, Alfresco supports four types of constraints, each discussed in this section: REGEX, LENGTH, LIST, and MINMAX.
The REGEX constraint enforces the syntax, spelling, or format for a property value. The constraint expression is written using regular expression syntax. Valid <parameter> names for this constraint, as used in the examples that follow, include expression and requiresMatch.
An example of a REGEX constraint is cm:filename, which is used for validating filenames. Because requiresMatch is false, the expression describes the names that are not allowed. This constraint is defined as part of the content model. The definition is shown here:
```xml
<constraint name="cm:filename" type="REGEX">
  <parameter name="expression">
    <value><![CDATA[(.*["*\><?/:|]+.*)|(.*[.]?.*[.]+$)|(.*[ ]+$)]]></value>
  </parameter>
  <parameter name="requiresMatch"><value>false</value></parameter>
</constraint>
```
A simpler example, which constrains the value of the property to be an all-uppercase string, is as follows:
```xml
<constraint name="test:regexExample" type="REGEX">
  <parameter name="expression"><value>[A-Z]*</value></parameter>
  <parameter name="requiresMatch"><value>true</value></parameter>
</constraint>
```
Regular expressions are extremely powerful, but writing one can quickly become quite complex. There are many tutorials available online or books written about how to write them. Resources like http://regexlib.com/ offer a large library of online regular expressions that can be reused and also provide tools for online interactive debugging of regular expressions.
The LENGTH constraint enforces the lengths of strings to be within a range of values. Valid <parameter> names for this constraint, as used in the example that follows, include minLength and maxLength.
Consider the following example of a LENGTH constraint where the length of the string for the property value must be between 0 and 100:
```xml
<constraint name="test:lengthExample" type="LENGTH">
  <parameter name="minLength"><value>0</value></parameter>
  <parameter name="maxLength"><value>100</value></parameter>
</constraint>
```
The LIST constraint forces the values of a property to be one of the values contained in an enumerated list. Typically, a user enters the value for a LIST-constrained property by selecting it from a drop-down list containing all allowed values. Valid <parameter> names for this constraint, as used in the example that follows, include allowedValues and caseSensitive.
The Alfresco Content Model implementation for the DoD 5015.2 Records Management specification contains the following example of a LIST constraint:
```xml
<constraint name="dod:imageFormatList" type="LIST">
  <title>Image Formats</title>
  <parameter name="allowedValues">
    <list>
      <value>Binary Image Interchange Format (BIIF)</value>
      <value>GIF 89a</value>
      <value>Graphic Image Format (GIF) 87a</value>
      <value>Joint Photographic Experts Group (JPEG) (all versions)</value>
      <value>Portable Network Graphics (PNG) 1.0</value>
      <value>Tagged Image Interchange Format (TIFF) 4.0</value>
      <value>TIFF 5.0</value>
      <value>TIFF 6.0</value>
    </list>
  </parameter>
  <parameter name="caseSensitive"><value>true</value></parameter>
</constraint>
```
The MINMAX constraint enforces that a numeric value be within a range of numbers. Valid <parameter> names for this constraint, as used in the example that follows, include minValue and maxValue.
An example of a constraint on a numeric property that requires the number to be between 0 and 1000 is shown next:
```xml
<constraint name="test:minMaxExample" type="MINMAX">
  <parameter name="minValue"><value>0</value></parameter>
  <parameter name="maxValue"><value>1000</value></parameter>
</constraint>
```
Custom constraint types can be written too, but doing so requires Java. Built-in constraints are defined by the Java package org.alfresco.repo.dictionary.constraint. The <parameter> names for each constraint correspond to the setter methods of the Java class implementation for the constraint. An example and description of how to create a custom constraint can be found on the Alfresco wiki: http://wiki.alfresco.com/wiki/Constraints.
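On the model side, a custom constraint might be wired in roughly as follows. This is a sketch under two assumptions: that the type attribute accepts the fully qualified name of the implementing Java class, and that each parameter is passed to a matching setter on that class. The class name, constraint name, and parameter are all hypothetical, so consult the wiki page above for the supported approach.

```xml
<!-- Hypothetical: com.example.constraints.ChecksumConstraint stands in for a
     Java class that you would write and deploy yourself. -->
<constraint name="ex:checksum" type="com.example.constraints.ChecksumConstraint">
  <!-- Assumed to map to a setModulus(...) setter on the class, if it defines one. -->
  <parameter name="modulus"><value>97</value></parameter>
</constraint>
```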
Associations are relationships that are created between two types within the content model. Associations are ultimately realized as relationships between nodes in the repository and are controlled by the types assigned to the nodes. Associations must be uniquely named and include the namespace prefix at the beginning of the association name.
Two types of associations are possible — child associations and peer associations. Both types of associations consider one of the types as the source and the other as the target. The source is the type in which the association is defined.
For brevity, within the Alfresco Content Model, a peer association is simply referred to as an association. The elements enclosed by the <association> tag describe the behavior of an association; the example below uses <source> and <target>, each with <role>, <mandatory>, and <many>, plus a <class> element on the target.
The <class> element names the type of the target. A class of sys:base would allow the target to be any kind of content, since all types inherit from sys:base. This element is required for defining the target.
An example of a peer association can be found in contentModel.xml. The association here defines a reference from one item to another piece of content:
```xml
<association name="cm:references">
  <source>
    <role>cm:referencedBy</role>
    <mandatory>false</mandatory>
    <many>true</many>
  </source>
  <target>
    <class>cm:content</class>
    <role>cm:references</role>
    <mandatory>false</mandatory>
    <many>true</many>
  </target>
</association>
```
A child-association is described by the same set of enclosed elements, along with two additional elements that are supported only as part of the child-association definition.
An example of a child association can be found in the Records Management Content Model. This example shows a holds area that is capable of tracking the holds that have been placed:
```xml
<child-association name="rma:holds">
  <title>Holds</title>
  <source>
    <mandatory>false</mandatory>
    <many>false</many>
  </source>
  <target>
    <class>rma:hold</class>
    <mandatory>false</mandatory>
    <many>true</many>
  </target>
</child-association>
```
The mandatory flag is checked whenever a node with the association is being committed at the end of a transaction. This holds for both <association> and <child-association> tags. If the mandatory flag is true and is enforced, which is specified by writing <mandatory enforced="true">, then the commit will fail if the association element does not exist. If the mandatory flag is true but not enforced, the commit will succeed, but an aspect called sys:incomplete will be applied to the node.
When the two elements, mandatory and many, are considered together, they define the cardinality of the association. The following table shows how the cardinality can be determined, based on those two elements:
| | mandatory = true | mandatory = false |
|---|---|---|
| many = true | 1 or more | 0 or more |
| many = false | 1 | 0 or 1 |
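To make one cell of the table concrete, the following sketch defines a peer association whose target is mandatory and not many, giving a cardinality of exactly 1. The ex:approver name is hypothetical, and cm:person is assumed to be available as a type in the standard content model.

```xml
<!-- Hypothetical association: each source node must reference exactly one person. -->
<association name="ex:approver">
  <source>
    <mandatory>false</mandatory>
    <many>true</many>
  </source>
  <target>
    <class>cm:person</class>
    <!-- mandatory = true with many = false gives a cardinality of 1.
         enforced="true" makes the commit fail if the target is missing. -->
    <mandatory enforced="true">true</mandatory>
    <many>false</many>
  </target>
</association>
```

Changing many to true on the target would move this definition to the "1 or more" cell, and setting mandatory to false would give "0 or 1".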
With a child association, if you delete the parent node, the child nodes will be automatically deleted. In a peer association, deleting the source node will break the association, but will not cause any other nodes to be deleted.
Aspects are a shorthand method to group together property, association, and constraint definitions. Aspects can be applied to repository nodes, type definitions, or to the definition of other aspects. When an aspect is applied to, for example, a node, the properties and associations defined in the aspect are taken from it and added to those that already exist on the node. Application of aspects to types and to other aspects works in a similar way.
Much of what an aspect does overlaps with the functionality of a type. For example, like types, aspects support inheritance, with the concept of one aspect inheriting from a parent aspect. The one difference between types and aspects is that every node must have one and only one type, while any number of aspects can be applied to a node.
The application of multiple aspects to a node is often compared to multiple inheritance. Aspects can also be thought of as being similar to macros. A macro, once defined, can be reused by referring to it by its name. In the same way, a common practice in the Alfresco Content Model is to define an aspect and then apply it to many type and aspect definitions as a mandatory aspect. For example, the aspect cm:titled from the content model is often used in the definition of a type, bringing along with it standard definitions for the properties cm:title and cm:description.
Another advantage of aspects is that they can be dynamically applied at runtime to nodes. For example, when a record is declared within Records Management, only at that time are the properties that are relevant to managing records appended to the node. In this way, only metadata relevant to an object needs to be tracked. Aspects create a clean way to assign metadata to objects and avoid tracking metadata fields that are not relevant to an object.
The definition of an aspect is very similar to that of a type. Aspects must be uniquely named and include the namespace prefix at the beginning of the aspect name. The elements enclosed by the <aspect> tag describe the behavior of an aspect; the example below uses <title>, <properties>, and <mandatory-aspects>. As with types, features of inherited properties can be overridden.
A good example of an aspect that is defined in the Alfresco Records Management Content Model is rma:frozen. This aspect is applied to records that are subject to a hold:
```xml
<aspect name="rma:frozen">
  <title>Frozen</title>
  <properties>
    <property name="rma:frozenAt">
      <title>Frozen At Date</title>
      <type>d:date</type>
      <mandatory>true</mandatory>
    </property>
    <property name="rma:frozenBy">
      <title>Frozen By</title>
      <type>d:text</type>
      <mandatory>true</mandatory>
      <index enabled="true">
        <atomic>true</atomic>
        <stored>false</stored>
        <tokenised>false</tokenised>
      </index>
    </property>
  </properties>
  <mandatory-aspects>
    <aspect>rma:filePlanComponent</aspect>
  </mandatory-aspects>
</aspect>
```
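Aspect inheritance, mentioned earlier in this section, can be sketched with a pair of custom aspects. All ex names are hypothetical, and the enclosing <aspects> element is assumed from the usual layout of Alfresco model files.

```xml
<aspects>
  <!-- Hypothetical base aspect; all "ex" names are placeholders. -->
  <aspect name="ex:reviewable">
    <title>Reviewable</title>
    <properties>
      <property name="ex:reviewDueDate">
        <type>d:date</type>
        <mandatory>false</mandatory>
      </property>
    </properties>
  </aspect>

  <!-- Inherits ex:reviewDueDate from its parent aspect and adds one property. -->
  <aspect name="ex:legalReviewable">
    <title>Legal Reviewable</title>
    <parent>ex:reviewable</parent>
    <properties>
      <property name="ex:reviewingCounsel">
        <type>d:text</type>
      </property>
    </properties>
    <mandatory-aspects>
      <aspect>cm:titled</aspect>
    </mandatory-aspects>
  </aspect>
</aspects>
```

Applying ex:legalReviewable to a node would bring along the inherited due date, the counsel property, and the cm:title and cm:description properties from cm:titled.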