9.1. The persistence lifecycle

Because Hibernate is a transparent persistence mechanism (classes are unaware of their own persistence capability), it's possible to write application logic that is unaware whether the objects it operates on represent persistent state or temporary state that exists only in memory. The application shouldn't necessarily need to care that an object is persistent when invoking its methods. You can, for example, invoke the calculateTotalPrice() business method on an instance of the Item class in a unit test without having to consider persistence at all.
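To make this concrete, here is a minimal sketch of such a class. Only the names Item and calculateTotalPrice() come from the text; the fields and the pricing rule are assumptions made purely for illustration.

```java
import java.math.BigDecimal;

// Hypothetical stand-in for the Item class. The fields and the pricing rule
// are assumptions made for this sketch; only the method name comes from the text.
class Item {
    private final BigDecimal unitPrice;
    private final int quantity;

    Item(BigDecimal unitPrice, int quantity) {
        this.unitPrice = unitPrice;
        this.quantity = quantity;
    }

    // Pure business logic: no Session, no EntityManager, no database.
    BigDecimal calculateTotalPrice() {
        return unitPrice.multiply(BigDecimal.valueOf(quantity));
    }
}
```

A unit test can create an Item with new, call calculateTotalPrice(), and check the result. Whether the same class is also mapped to a database table makes no difference to that test.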

Any application with persistent state must interact with the persistence service whenever it needs to propagate state held in memory to the database (or vice versa). In other words, you have to call Hibernate (or Java Persistence) interfaces to store and load objects.

When interacting with the persistence mechanism in that way, it's necessary for the application to concern itself with the state and lifecycle of an object with respect to persistence. We refer to this as the persistence lifecycle: the states an object goes through during its life. We also use the term unit of work: a set of operations you consider one (usually atomic) group. Another piece of the puzzle is the persistence context provided by the persistence service. Think of the persistence context as a cache that remembers all the modifications and state changes you made to objects in a particular unit of work (this is somewhat simplified, but it's a good starting point).

We now dissect all these terms: object and entity states, persistence contexts, and managed scope. You're probably more accustomed to thinking about what statements you have to manage to get stuff in and out of the database (via JDBC and SQL). However, one of the key factors of your success with Hibernate (and Java Persistence) is your understanding of state management, so stick with us through this section.

9.1.1. Object states

Different ORM solutions use different terminology and define different states and state transitions for the persistence lifecycle. Moreover, the object states used internally may be different from those exposed to the client application. Hibernate defines only four states, hiding the complexity of its internal implementation from the client code.

The object states defined by Hibernate and their transitions in a state chart are shown in figure 9.1. You can also see the method calls to the persistence manager API that trigger transitions. This API in Hibernate is the Session. We discuss this chart in this chapter; refer to it whenever you need an overview.

We've also included the states of Java Persistence entity instances in figure 9.1. As you can see, they're almost equivalent to Hibernate's, and most methods of the Session have a counterpart on the EntityManager API (shown in italics). In other words, Hibernate offers a superset of the functionality standardized in Java Persistence.

Some methods are available on both APIs; for example, the Session has a persist() operation with the same semantics as the EntityManager's counterpart. Others, such as load() on the Session and getReference() on the EntityManager, share semantics but have different names.

During its life, an object can transition from a transient object to a persistent object to a detached object. Let's explore the states and transitions in more detail.

Figure 9.1. Object states and their transitions as triggered by persistence manager operations

Transient objects

Objects instantiated using the new operator aren't immediately persistent. Their state is transient, which means they aren't associated with any database table row and so their state is lost as soon as they're no longer referenced by any other object. These objects have a lifespan that effectively ends at that time, and they become inaccessible and available for garbage collection. Java Persistence doesn't include a term for this state; entity objects you just instantiated are new. We'll continue to refer to them as transient to emphasize the potential for these instances to become managed by a persistence service.

Hibernate and Java Persistence consider all transient instances to be nontransactional; any modification of a transient instance isn't known to a persistence context. This means that Hibernate doesn't provide any rollback functionality for transient objects.

Objects that are referenced only by other transient instances are, by default, also transient. For an instance to transition from transient to persistent state (that is, to become managed), you need either a call to the persistence manager or the creation of a reference from an already persistent instance.

Persistent objects

A persistent instance is an entity instance with a database identity, as defined in chapter 4, section 4.2, "Mapping entities with identity." That means a persistent and managed instance has a primary key value set as its database identifier. (There are some variations in when this identifier is assigned to a persistent instance.)

Persistent instances may be objects instantiated by the application and then made persistent by calling one of the methods on the persistence manager. They may even be objects that became persistent when a reference was created from another persistent object that is already managed. Alternatively, a persistent instance may be an instance retrieved from the database by execution of a query, by an identifier lookup, or by navigating the object graph starting from another persistent instance.

Persistent instances are always associated with a persistence context. Hibernate caches them and can detect whether they have been modified by the application.

There is much more to be said about this state and how an instance is managed in a persistence context. We'll get back to this later in this chapter.

Removed objects

You can delete an entity instance in several ways. For example, you can remove it with an explicit operation of the persistence manager. It may also become available for deletion if you remove all references to it, a feature available only in Hibernate or in Java Persistence with a Hibernate extension setting (orphan deletion for entities).

An object is in the removed state if it has been scheduled for deletion at the end of a unit of work, but it's still managed by the persistence context until the unit of work completes. In other words, a removed object shouldn't be reused because it will be deleted from the database as soon as the unit of work completes. You should also discard any references you may hold to it in the application (of course, after you finish working with it—for example, after you've rendered the removal-confirmation screen your users see).

Detached objects

To understand detached objects, you need to consider a typical transition of an instance. First it's transient, because it has just been created in the application. Next you make it persistent by calling an operation on the persistence manager. All of this happens in a single unit of work, and the persistence context for this unit of work is synchronized with the database at some point (when an SQL INSERT occurs).

The unit of work is now completed, and the persistence context is closed. But the application still has a handle: a reference to the instance that was saved. As long as the persistence context was active, the state of this instance was persistent. What is the state of the object you're holding a reference to now that the context has closed, and what can you do with it?

We refer to these objects as detached, indicating that their state is no longer guaranteed to be synchronized with database state; they're no longer attached to a persistence context. They still contain persistent data (which may soon be stale). You can continue working with a detached object and modify it. However, at some point you probably want to make those changes persistent—in other words, bring the detached instance back into persistent state.

Hibernate offers two operations, reattachment and merging, to deal with this situation. Java Persistence standardizes only merging. These features have a deep impact on how multitiered applications may be designed. The ability to return objects from one persistence context to the presentation layer and later reuse them in a new persistence context is a main selling point of Hibernate and Java Persistence. It enables you to create long units of work that span user think-time. We call this kind of long-running unit of work a conversation. We'll get back to detached objects and conversations soon.
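The difference between the two operations is easier to see in a toy model. The following sketch is plain Java and doesn't use Hibernate at all; the class names DetachedDemoItem and ToyMergeContext and all their methods are hypothetical. It assumes only what is generally true of the two operations: reattachment makes the detached instance itself managed again, whereas merging copies its state onto a managed instance and returns that instance.

```java
import java.util.HashMap;
import java.util.Map;

// Toy entity; the name and fields are hypothetical.
class DetachedDemoItem {
    final Long id;
    String name;

    DetachedDemoItem(Long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Toy persistence context: at most one managed instance per identifier.
// This models the concepts only; it is not how Hibernate is implemented.
final class ToyMergeContext {
    private final Map<Long, DetachedDemoItem> managed = new HashMap<>();

    // Simulates a load: returns the managed instance, creating it on first access.
    DetachedDemoItem load(Long id, String nameInDatabase) {
        return managed.computeIfAbsent(id, key -> new DetachedDemoItem(key, nameInDatabase));
    }

    // Reattachment: the detached instance itself becomes managed.
    void reattach(DetachedDemoItem detached) {
        DetachedDemoItem existing = managed.get(detached.id);
        if (existing != null && existing != detached) {
            throw new IllegalStateException(
                "A different instance with id " + detached.id + " is already managed");
        }
        managed.put(detached.id, detached);
    }

    // Merging: copy the detached state onto the managed instance and return it.
    // The argument itself stays detached.
    DetachedDemoItem merge(DetachedDemoItem detached) {
        DetachedDemoItem target =
            managed.computeIfAbsent(detached.id, key -> new DetachedDemoItem(key, detached.name));
        target.name = detached.name;
        return target;
    }

    boolean isManaged(DetachedDemoItem item) {
        return managed.get(item.id) == item;
    }
}
```

After merge(), the caller should continue working with the returned instance; the object that was passed in remains detached in this model.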

You should now have a basic understanding of object states and how transitions occur. Our next topic is the persistence context and the management of objects it provides.
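As a compact summary, the four states and the transitions described so far can be written down as a small state machine. This is only a sketch: the operation names are generic labels we chose for illustration, not methods of the Session or EntityManager, and the table contains only the transitions discussed in the text.

```java
import java.util.EnumMap;
import java.util.Map;

enum ObjectState { TRANSIENT, PERSISTENT, REMOVED, DETACHED }

// Generic labels for this sketch; they are not Hibernate API method names.
enum LifecycleOperation { MAKE_PERSISTENT, REMOVE, CLOSE_CONTEXT, REATTACH_OR_MERGE }

final class LifecycleSketch {
    private static final Map<ObjectState, Map<LifecycleOperation, ObjectState>> TRANSITIONS =
        new EnumMap<ObjectState, Map<LifecycleOperation, ObjectState>>(ObjectState.class);

    static {
        allow(ObjectState.TRANSIENT, LifecycleOperation.MAKE_PERSISTENT, ObjectState.PERSISTENT);
        allow(ObjectState.PERSISTENT, LifecycleOperation.REMOVE, ObjectState.REMOVED);
        allow(ObjectState.PERSISTENT, LifecycleOperation.CLOSE_CONTEXT, ObjectState.DETACHED);
        allow(ObjectState.DETACHED, LifecycleOperation.REATTACH_OR_MERGE, ObjectState.PERSISTENT);
    }

    private static void allow(ObjectState from, LifecycleOperation op, ObjectState to) {
        Map<LifecycleOperation, ObjectState> row = TRANSITIONS.get(from);
        if (row == null) {
            row = new EnumMap<LifecycleOperation, ObjectState>(LifecycleOperation.class);
            TRANSITIONS.put(from, row);
        }
        row.put(op, to);
    }

    // Returns the state after the operation, or rejects a transition
    // that this sketch doesn't model.
    static ObjectState apply(ObjectState from, LifecycleOperation op) {
        Map<LifecycleOperation, ObjectState> row = TRANSITIONS.get(from);
        ObjectState next = (row == null) ? null : row.get(op);
        if (next == null) {
            throw new IllegalStateException(op + " isn't modeled for state " + from);
        }
        return next;
    }
}
```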

9.1.2. The persistence context

You may consider the persistence context to be a cache of managed entity instances. The persistence context isn't something you see in your application; it isn't an API you can call. In a Hibernate application, we say that one Session has one internal persistence context. In a Java Persistence application, an EntityManager has a persistence context. All entities in persistent state and managed in a unit of work are cached in this context. We walk through the Session and EntityManager APIs later in this chapter. For now, you need to know what this (internal) persistence context buys you.

The persistence context is useful for several reasons:

  • Hibernate can do automatic dirty checking and transactional write-behind.

  • Hibernate can use the persistence context as a first-level cache.

  • Hibernate can guarantee a scope of Java object identity.

  • Hibernate can extend the persistence context to span a whole conversation.

All these points are also valid for Java Persistence providers. Let's look at each feature.

Automatic dirty checking

Persistent instances are managed in a persistence context—their state is synchronized with the database at the end of the unit of work. When a unit of work completes, state held in memory is propagated to the database by the execution of SQL INSERT, UPDATE, and DELETE statements (DML). This procedure may also occur at other times. For example, Hibernate may synchronize with the database before execution of a query. This ensures that queries are aware of changes made earlier during the unit of work.

Hibernate doesn't update the database row of every single persistent object in memory at the end of the unit of work. ORM software must have a strategy for detecting which persistent objects have been modified by the application. We call this automatic dirty checking. An object with modifications that have not yet been propagated to the database is considered dirty. Again, this state isn't visible to the application. With transparent transaction-level write-behind, Hibernate propagates state changes to the database as late as possible but hides this detail from the application. By executing DML as late as possible (toward the end of the database transaction), Hibernate tries to keep lock times in the database as short as possible. (DML usually creates locks in the database that are held until the transaction completes.)

Hibernate is able to detect exactly which properties have been modified so that it's possible to include only the columns that need updating in the SQL UPDATE statement. This may bring some performance gains. However, it's usually not a significant difference and, in theory, could harm performance in some environments. By default, Hibernate includes all columns of a mapped table in the SQL UPDATE statement (hence, Hibernate can generate this basic SQL at startup, not at runtime). If you want to update only modified columns, you can enable dynamic SQL generation by setting dynamic-update="true" in a class mapping. The same mechanism is implemented for insertion of new records, and you can enable runtime generation of INSERT statements with dynamic-insert="true". We recommend you consider this setting when you have an extraordinarily large number of columns in a table (say, more than 50); at some point, the network overhead of sending unchanged fields will be noticeable.
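The following sketch illustrates the difference between the two kinds of statement. It is not Hibernate's SQL generator: the class UpdateSqlSketch, its methods, and the table and column names used with it are all hypothetical. It shows only why the default statement can be built once from the mapping, whereas the dynamic variant depends on which values have changed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

final class UpdateSqlSketch {

    // Default form: every mapped column is set, whatever changed.
    // The string depends only on the mapping, so it can be built once at startup.
    static String staticUpdate(String table, List<String> columns, String idColumn) {
        return "UPDATE " + table + " SET " + assignments(columns) + " WHERE " + idColumn + " = ?";
    }

    // Dynamic form: only columns whose current value differs from the snapshot.
    // The string has to be built at runtime; empty means nothing needs updating.
    static Optional<String> dynamicUpdate(String table, Map<String, Object> snapshot,
                                          Map<String, Object> current, String idColumn) {
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, Object> entry : current.entrySet()) {
            if (!Objects.equals(snapshot.get(entry.getKey()), entry.getValue())) {
                changed.add(entry.getKey());
            }
        }
        if (changed.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(staticUpdate(table, changed, idColumn));
    }

    private static String assignments(List<String> columns) {
        StringBuilder sql = new StringBuilder();
        for (String column : columns) {
            if (sql.length() > 0) {
                sql.append(", ");
            }
            sql.append(column).append(" = ?");
        }
        return sql.toString();
    }
}
```

With three mapped columns and only one changed value, the static form still binds three parameters, and the dynamic form binds one. That saving is negligible for small tables and grows with the number of columns.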

In rare cases, you may also want to supply your own dirty-checking algorithm to Hibernate. By default, Hibernate compares an old snapshot of an object with its state at synchronization time, and it detects any modifications that require an update of the database state. You can implement your own routine by supplying a custom findDirty() method with an org.hibernate.Interceptor for a Session. We'll show you an implementation of an interceptor later in the book.
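Snapshot comparison is simple enough to model in a few lines. The sketch below is a toy, not Hibernate's implementation and not an Interceptor; TrackedItem and DirtyCheckingContext are hypothetical names, and the entity has a single property to keep the comparison trivial.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Toy entity with one mutable property; hypothetical.
class TrackedItem {
    final long id;
    String name;

    TrackedItem(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Toy persistence context that remembers a snapshot per managed instance.
final class DirtyCheckingContext {
    // Managed instance -> copy of its state when it was registered or last flushed.
    // TrackedItem doesn't override equals(), so keys are compared by identity.
    private final Map<TrackedItem, String> snapshots = new LinkedHashMap<>();

    void manage(TrackedItem item) {
        snapshots.put(item, item.name);
    }

    // Compares each instance with its snapshot, returns the identifiers that
    // would need an UPDATE, and refreshes the snapshots of those instances.
    List<Long> flush() {
        List<Long> dirtyIds = new ArrayList<>();
        for (Map.Entry<TrackedItem, String> entry : snapshots.entrySet()) {
            TrackedItem item = entry.getKey();
            if (!Objects.equals(entry.getValue(), item.name)) {
                dirtyIds.add(item.id);
                entry.setValue(item.name);
            }
        }
        return dirtyIds;
    }
}
```

Note that the application never marks anything as dirty. It just calls setters (here, assigns a field), and the context finds out at flush time by comparing.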

We'll also get back to the synchronization process (known as flushing) and when it occurs later in this chapter.

The persistence context cache

A persistence context is a cache of persistent entity instances. This means it remembers all persistent entity instances you've handled in a particular unit of work. Automatic dirty checking is one of the benefits of this caching. Another benefit is repeatable read for entities and the performance advantage of a cache scoped to the unit of work.

For example, if Hibernate is told to load an object by primary key (a lookup by identifier), it can first check the persistence context for the current unit of work. If the entity is found there, no database hit occurs—this is a repeatable read for an application. The same is true if a query is executed through one of the Hibernate (or Java Persistence) interfaces. Hibernate reads the result set of the query and marshals entity objects that are then returned to the application. During this process, Hibernate interacts with the current persistence context. It tries to resolve every entity instance in this cache (by identifier); only if the instance can't be found in the current persistence context does Hibernate read the rest of the data from the result set.
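Both lookups can be sketched with a map keyed by identifier. Again, this is a toy model under our own assumptions: FirstLevelCacheSketch, its methods, and the databaseLoader callback are hypothetical and stand in for whatever the persistence service does internally.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;
import java.util.function.Supplier;

final class FirstLevelCacheSketch<T> {
    private final Map<Long, T> managed = new HashMap<>();
    private final LongFunction<T> databaseLoader; // stands in for a SELECT by primary key
    private int databaseHits;

    FirstLevelCacheSketch(LongFunction<T> databaseLoader) {
        this.databaseLoader = databaseLoader;
    }

    // Lookup by identifier: check the context first, hit the database only on a miss.
    T get(long id) {
        T cached = managed.get(id);
        if (cached != null) {
            return cached;
        }
        databaseHits++;
        T loaded = databaseLoader.apply(id);
        if (loaded != null) {
            managed.put(id, loaded);
        }
        return loaded;
    }

    // Query result handling: if the identifier is already managed, return the
    // existing instance and skip reading the rest of the row.
    T resolveRow(long id, Supplier<T> readRestOfRow) {
        T cached = managed.get(id);
        if (cached != null) {
            return cached;
        }
        T hydrated = readRestOfRow.get();
        managed.put(id, hydrated);
        return hydrated;
    }

    int databaseHits() {
        return databaseHits;
    }
}
```

Calling get() twice with the same identifier returns the same Java instance and counts one database hit. A later resolveRow() for that identifier returns the same instance again, regardless of what the result set contains.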

The persistence context cache offers significant performance benefits and improves the isolation guarantees in a unit of work (you get repeatable read of entity instances for free). Because this cache only has the scope of a unit of work, it has no real disadvantages, such as lock management for concurrent access—a unit of work is processed in a single thread at a time.

The persistence context cache sometimes helps avoid unnecessary database traffic; but, more important, it ensures that:

  • The persistence layer isn't vulnerable to stack overflows in the case of circular references in a graph of objects.

  • There can never be conflicting representations of the same database row at the end of a unit of work. In the persistence context, at most a single object represents any database row. All changes made to that object may be safely written to the database.

  • Likewise, changes made in a particular persistence context are always immediately visible to all other code executed inside that persistence context and its unit of work (the repeatable-read guarantee for entities).
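The first two guarantees follow from the same mechanism, which a small sketch can show. Assume a toy table in which each row references one partner row, and two rows reference each other. The names GraphNode and CycleSafeLoader are hypothetical, and this isn't Hibernate's loader; the point is only that registering an instance in the context before following its references stops the recursion.

```java
import java.util.HashMap;
import java.util.Map;

// Toy entity with a single association; hypothetical.
class GraphNode {
    final long id;
    GraphNode partner;

    GraphNode(long id) {
        this.id = id;
    }
}

final class CycleSafeLoader {
    private final Map<Long, Long> partnerIdByRowId;              // toy table: id -> partner id
    private final Map<Long, GraphNode> context = new HashMap<>(); // toy persistence context

    CycleSafeLoader(Map<Long, Long> partnerIdByRowId) {
        this.partnerIdByRowId = partnerIdByRowId;
    }

    GraphNode load(long id) {
        GraphNode existing = context.get(id);
        if (existing != null) {
            return existing;               // already managed: this breaks the cycle
        }
        GraphNode node = new GraphNode(id);
        context.put(id, node);             // register before following references
        Long partnerId = partnerIdByRowId.get(id);
        if (partnerId != null) {
            node.partner = load(partnerId);
        }
        return node;
    }
}
```

Without the lookup at the top of load(), rows 1 and 2 referencing each other would recurse until the stack overflowed. With it, the graph is closed with exactly one object per row.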

You don't have to do anything special to enable the persistence context cache. It's always on and, for the reasons shown, can't be turned off.

Later in this chapter, we'll show you how objects are added to this cache (basically, whenever they become persistent) and how you can manage this cache (by detaching objects manually from the persistence context, or by clearing the persistence context).

The last two items on our list of benefits of a persistence context, the guaranteed scope of identity and the possibility to extend the persistence context to span a conversation, are closely related concepts. To understand them, you need to take a step back and consider objects in detached state from a different perspective.
