Content is the heart of an ECM system; all of its features and functionality revolve around content. Understanding the lifecycle of content in an ECM application is therefore very important for the architecture and maintenance of the system. Once content enters the application, it passes through different phases, which are common to most standard ECM applications.
However, the storage mechanism of the content varies in different ECM applications.
In this chapter, we will look in detail at the lifecycle of content in Alfresco and at how its different phases affect the different components of Alfresco. We will also examine the Alfresco database schema.
By the end of this chapter, you will have learned about:
Before going into detail about lifecycles, let's understand the content store and database schema. We already covered indexes in Chapter 5, Search.
The content store controls the creation and deletion of binary content in the filesystem. We have already covered a few details on this in earlier chapters. The dir.root property in the alfresco-global.properties file (<Tomcat_Home>/shared/classes) defines the root location for binary file storage.
For example, suppose the path specified in dir.root is /mnt/alf_data. Beneath this directory there are two folders, contentstore and contentstore.deleted, which are created the first time Alfresco is started. Let's look at each folder in detail:
contentstore: All active and archived content is stored here. A directory hierarchy is created based on the content creation time. Every file has a unique name and the .bin extension. For example, if a file named Employee Handbook.doc is uploaded to Alfresco on January 20, 2015 at 10:50 A.M., the file is stored as /mnt/alf_data/contentstore/2015/1/20/10/50/<unique name>.bin.
contentstore.deleted: Orphaned content that has been permanently deleted from Alfresco is moved to this directory by the orphan cleaner scheduler. Files in this directory can be removed at any time using the standard operating system remove command, for example by executing the rm /mnt/alf_data/contentstore.deleted/2015/1/23/13/34/xxxxx.bin command.
This is the general architecture of the default content store, which is named FileContentStore. Based on this default content store, Alfresco also provides several other types of content store. Here are a few details about each type of content store.
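To make the date-based layout of the contentstore folder concrete, here is a minimal Python sketch that builds a path in the same style as the Employee Handbook.doc example. It is an illustration only, not Alfresco's implementation: the function name content_store_path is hypothetical, and the use of a UUID as the unique file name is an assumption.

```python
from datetime import datetime
from pathlib import PurePosixPath
import uuid


def content_store_path(root, created, unique_name=None):
    """Build a path shaped like <root>/contentstore/YYYY/M/D/H/m/<unique name>.bin.

    Illustration only: how Alfresco generates the unique name is not shown
    in this chapter, so a random UUID is used here as an assumption.
    """
    name = unique_name or str(uuid.uuid4())
    return PurePosixPath(
        root,
        "contentstore",
        str(created.year),
        str(created.month),   # no zero padding, as in the example path
        str(created.day),
        str(created.hour),
        str(created.minute),
        f"{name}.bin",
    )


# The upload from the example: January 20, 2015 at 10:50 A.M.
uploaded = datetime(2015, 1, 20, 10, 50)
print(content_store_path("/mnt/alf_data", uploaded, "example"))
# prints /mnt/alf_data/contentstore/2015/1/20/10/50/example.bin
```

Note that the original file name (Employee Handbook.doc) does not appear anywhere in the path; only the creation time and the unique name do.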
As the name suggests, the encrypted ContentStore stores content in encrypted form in the filesystem. Each piece of content is encrypted with its own unique key. This unique key is in turn encrypted with a master key and stored in the Alfresco database. The encrypted ContentStore was introduced in version 5.0 of Alfresco. To enable it, you need a license file from Alfresco that has content store encryption enabled.
Here are the steps required to enable and configure the encrypted ContentStore.
Generate the master key in a keystore using keytool. A sample keytool command that can be used to generate the master key:
keytool -genkey -alias key1 -keyalg RSA -keystore <master keystore path> -keysize 2048
Set the following properties in alfresco-global.properties to enable content encryption. These properties can also be changed via JMX:
filecontentstore.subsystem.name=encryptedContentStore
Enables the encrypted content store.
cryptodoc.jce.keystore.type=
The keystore type for the master keys, such as jceks.
cryptodoc.jce.keystore.path=
The path of the keystore in which the master key was generated.
cryptodoc.jce.keystore.password=
The password for the keystore.
cryptodoc.jce.key.aliases=
cryptodoc.jce.key.passwords=
A comma-separated list of the passwords for fetching the master keys from the keystore.
cryptodoc.jce.keygen.defaultSymmetricKeySize=
The size of the symmetric key. The default is 128 bits.
The dir.root path specified in alfresco-global.properties remains the same.
Be careful when you choose the encrypted ContentStore. Once it is enabled, you cannot revert to the normal ContentStore. Also, if you are upgrading from an older version, only new content is encrypted; existing content remains unencrypted. Multi-tenancy is not supported with an encrypted store.
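Because the switch to the encrypted ContentStore cannot be undone, it can be worth sanity-checking the settings before restarting. The following Python sketch parses properties text and reports obvious gaps. It is a hypothetical helper, not an Alfresco tool: the function names, the sample values (keystore path, alias, passwords), and the rule that the number of aliases must match the number of passwords are all assumptions made for illustration.

```python
# Properties from the steps above that this sketch expects to be non-empty.
REQUIRED_KEYS = [
    "cryptodoc.jce.keystore.type",
    "cryptodoc.jce.keystore.path",
    "cryptodoc.jce.keystore.password",
    "cryptodoc.jce.key.aliases",
    "cryptodoc.jce.key.passwords",
]


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def split_list(value):
    """Split a comma-separated value into non-empty, trimmed items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def check_encryption_settings(props):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if props.get("filecontentstore.subsystem.name") != "encryptedContentStore":
        problems.append("encrypted content store is not enabled")
    for key in REQUIRED_KEYS:
        if not props.get(key):
            problems.append(f"{key} is empty or missing")
    aliases = split_list(props.get("cryptodoc.jce.key.aliases", ""))
    passwords = split_list(props.get("cryptodoc.jce.key.passwords", ""))
    # Assumption: one password is expected per key alias.
    if len(aliases) != len(passwords):
        problems.append("number of key aliases and key passwords differ")
    return problems


# Hypothetical sample values, for illustration only.
SAMPLE = """
filecontentstore.subsystem.name=encryptedContentStore
cryptodoc.jce.keystore.type=jceks
cryptodoc.jce.keystore.path=/path/to/master.keystore
cryptodoc.jce.keystore.password=changeit
cryptodoc.jce.key.aliases=key1
cryptodoc.jce.key.passwords=changeit
cryptodoc.jce.keygen.defaultSymmetricKeySize=128
"""

print(check_encryption_settings(parse_properties(SAMPLE)))
```

A check like this only catches missing or mismatched entries; it does not verify that the keystore exists or that the passwords are correct.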
The caching ContentStore works as a wrapper around any other ContentStore to provide caching and faster access to data. It is intended for slow backing storage, such as a slow disk or Amazon S3. If the normal content storage mechanism is slow, set up the caching ContentStore around it. Don't use it around FileContentStore if you have a fast disk.
Follow the steps below to configure the caching ContentStore (assuming the backing store is already configured).
Enable the caching-content-store-context.xml file located at <ALFRESCO_HOME>/shared/classes/alfresco/extension by renaming it from .sample to .xml.
Review the cachingContentStore bean in this file. Make sure the backingStore and quota are configured properly. The quota can be a standard quota or unlimited. With the standard quota manager, you can control the disk usage of cached files:
<bean id="cachingContentStore" class="org.alfresco.repo.content.caching.CachingContentStore" init-method="init">
    <property name="backingStore" ref="backingStore"/>
    <property name="cache" ref="contentCache"/>
    <property name="cacheOnInbound" value="${system.content.caching.cacheOnInbound}"/>
    <property name="quota" ref="standardQuotaManager"/>
</bean>
The backingStore bean refers to FileContentStore. Change the bean definition to match the backing store you use; there is no benefit in caching around FileContentStore. For example, if you are using S3ContentStore (details about this content store are covered later in this chapter), where caching is required, make sure backingStore refers to the correct ContentStore, as shown in the following sample code snippet:
<bean id="backingStore" class="org.alfresco.integrations.s3store.TenantS3ContentStore">
    <constructor-arg>
        <value>${dir.contentstore}</value>
    </constructor-arg>
</bean>
Configure the following properties in the alfresco-global.properties file. Default values are set in the repository.properties file:
dir.cachedcontent=${dir.root}/cachedcontent
Change this value if you want the cached content in a different path from the content root directory.
system.content.caching.cacheOnInbound=true
Enables caching of content during write operations. That way, whenever content is read, it is already in the cache.
system.content.caching.maxDeleteWatchCount=1
The number of times a file is observed as deleted before it is cleaned up from the cache.
system.content.caching.contentCleanup.cronExpression=0 0 3
system.content.caching.minFileAgeMillis=60000
The minimum time, in milliseconds, that a file stays in the cache before it can be deleted.
system.content.caching.maxUsageMB=4096
Used by the quota manager: the maximum amount of disk space, in MB, that can be used for the cache.
system.content.caching.maxFileSizeMB=0
Change this value if you want to limit the size of the files kept in the cache.
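The way the two quota-related properties interact can be sketched as a simple admission check. The following Python function is a hypothetical illustration, not the logic of Alfresco's quota manager: the function name may_cache is invented, and treating a maxFileSizeMB value of 0 as "no per-file limit" is an assumption.

```python
MB = 1024 * 1024


def may_cache(file_size_bytes, cache_usage_bytes,
              max_usage_mb=4096, max_file_size_mb=0):
    """Decide whether a file fits in the cache under a simple quota.

    max_usage_mb mirrors system.content.caching.maxUsageMB and
    max_file_size_mb mirrors system.content.caching.maxFileSizeMB.
    Assumption: a max_file_size_mb of 0 means no per-file limit.
    """
    # Reject files above the per-file limit, if one is set.
    if max_file_size_mb > 0 and file_size_bytes > max_file_size_mb * MB:
        return False
    # Reject files that would push total cache usage over the quota.
    return cache_usage_bytes + file_size_bytes <= max_usage_mb * MB


print(may_cache(10 * MB, 100 * MB))               # True: well under 4096 MB
print(may_cache(10 * MB, 4090 * MB))              # False: 4100 MB exceeds 4096 MB
print(may_cache(10 * MB, 0, max_file_size_mb=5))  # False: file is over 5 MB
```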
The S3 content store is a special content store that is required only when the Alfresco instance runs on the Amazon cloud (EC2) (refer to https://en.wikipedia.org/wiki/Amazon_Web_Services for more details). Alfresco provides this additional module so that Amazon's Simple Storage Service (S3) can be used for file storage. The Alfresco S3 content store is slower than the standard FileContentStore, so you can use it in combination with the caching ContentStore.
Follow these steps to configure the S3 connector:
Get the amp package for the S3 connector from Alfresco support.
Install the amp package using the Alfresco Module Package (AMP) installation procedure.
Set the following properties in the alfresco-global.properties file:
s3.accessKey=
Specify the access key for Amazon Web Services identification.
s3.secretKey=
Specify the Amazon Web Services secret key.
s3.bucketName=
Specify the name of the bucket that will be used for content storage. This bucket name should be unique.
Set the contentstore and contentstore.deleted paths using the same bucket name in the alfresco-global.properties file:
dir.contentstore=/AmazonBucketPath/contentstore
dir.contentstore.deleted=/AmazonBucketPath/contentstore.deleted
The content store selector provides a mechanism to bind content to a specific content store. Alfresco lets you configure multiple content stores and decide which content is stored in which store. This is very useful when the data of different folders needs to be kept in completely separate stores. It also lets you place old, rarely read content on a slow disk and all new content on a fast disk.
Follow the steps mentioned here to enable the content store selector:
Create a context file that defines the additional content store in <Alfresco_home>/shared/classes/alfresco/extension. A sample context file is provided with the support files of this book.
<bean id="projectMarketingContentStore" class="org.alfresco.repo.content.filestore.FileContentStore">
    <constructor-arg>
        <value>${dir.root}/storeProjectA</value>
    </constructor-arg>
</bean>
Add the new store to the storeSelectorContentStore bean. Take a look at the sample code snippet:
<bean id="storeSelectorContentStore" parent="storeSelectorContentStoreBase">
    <property name="defaultStoreName">
        <value>default</value>
    </property>
    <property name="storesByName">
        <map>
            <entry key="default">
                <ref bean="fileContentStore" />
            </entry>
            <entry key="projectMarketing">
                <ref bean="projectMarketingContentStore" />
            </entry>
            ...
</bean>
Update the eagerOrphanCleanup bean to map this list of stores, so that all the additional content stores are cleaned up in the same way as the default content store.
The cleanup schedule is set by the system.content.orphanCleanup.cronExpression property in alfresco-global.properties.
Expose in Share the cm:storeSelector aspect and cm:storeName, which is a property associated with this aspect. Find the aspects tag in the share-config-custom.xml file located at <ALFRESCO_HOME>/tomcat/shared/classes/alfresco/web-extension and add the storeSelector aspect to the list of visible aspects, as shown in the following code snippet:
<aspects>
    <!-- Aspects that a user can see -->
    <visible>
        ..
        <aspect name="cm:storeSelector" />
    </visible>
    ..
</aspects>
Also define the user-friendly name of the aspect, which is shown in the Share user interface, in the slingshot.properties file:
aspect.cm_storeSelector=Store Selector
Apply the aspect to the content and set storeName based on the store you want the content to be in. For example, if you want to store all marketing documents in the projectMarketing store, set the storeName value to projectMarketing, as defined in the store selector bean. The file is copied from the default content store to the new content store. If no value is specified in storeName, the default store is used. The file in the old content store remains as it is, but it is marked as orphaned so that the cleanup process can remove it.
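The selection rule described above, a named store with a fallback to the default, can be sketched in a few lines of Python. This is an illustration of the idea, not Alfresco code: the function name resolve_store is hypothetical, the store locations stand in for the configured beans, and raising an error for an unknown store name is an assumption.

```python
def resolve_store(stores_by_name, store_name, default_store_name="default"):
    """Pick the store for a piece of content from its storeName value.

    An empty or missing store_name falls back to the default store.
    Assumption: an unknown store name is treated as an error.
    """
    name = store_name or default_store_name
    try:
        return stores_by_name[name]
    except KeyError:
        raise ValueError(f"no content store registered under '{name}'") from None


# Stand-ins for the storesByName map of the store selector bean.
stores_by_name = {
    "default": "/mnt/alf_data/contentstore",
    "projectMarketing": "/mnt/alf_data/storeProjectA",
}

print(resolve_store(stores_by_name, "projectMarketing"))
# prints /mnt/alf_data/storeProjectA
print(resolve_store(stores_by_name, None))
# prints /mnt/alf_data/contentstore
```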