Transaction isolation (the “I” in ACID) is a critical part of any transactional system. This section explains isolation conditions, database locking, and transaction isolation levels. These concepts are important when deploying any transactional system.
Transaction isolation is defined in terms of isolation conditions called dirty reads, repeatable reads, and phantom reads. These conditions describe what can happen when two or more transactions operate on the same data.[48] To illustrate these conditions, let’s think about two separate client applications using their own instances of the TravelAgent EJB to access the same data: specifically, a cabin record with the primary key of 99. These examples revolve around the RESERVATION table, which is accessed by both the bookPassage( ) method (through the Reservation EJB) and the listAvailableCabins( ) method (through JDBC). (It might be a good idea to go back to Chapter 11 and review how the RESERVATION table is accessed through these methods. This will help you to understand how two transactions executed by two different clients can impact each other.) Assume that both methods have a transaction attribute of Required.
A dirty read occurs when a transaction reads uncommitted changes made by another transaction. If the transaction that made the changes is rolled back, the data read by the other transaction becomes invalid because the rollback undoes the changes. The reading transaction will not be aware that the data it has read has become invalid. Here’s a scenario showing how a dirty read can occur (illustrated in Figure 16-8):
Time 10:00:00: Client 1 executes the TravelAgent.bookPassage( ) method. Along with the Customer and Cruise EJBs, Client 1 had previously chosen Cabin 99 to be included in the reservation.
Time 10:00:01: Client 1’s TravelAgent EJB creates a Reservation EJB within the bookPassage( ) method. The Reservation EJB’s create( ) method inserts a record into the RESERVATION table, which reserves Cabin 99.
Time 10:00:02: Client 2 executes TravelAgent.listAvailableCabins( ). Client 1 has reserved Cabin 99, so it is not in the list of available cabins that is returned from this method.
Time 10:00:03: Client 1’s TravelAgent EJB executes the ProcessPayment.byCredit( ) method within the bookPassage( ) method. The byCredit( ) method throws an exception because the expiration date on the credit card has passed.
Time 10:00:04: The exception thrown by the ProcessPayment EJB causes the entire bookPassage( ) transaction to be rolled back. As a result, the record inserted into the RESERVATION table when the Reservation EJB was created is not made durable (i.e., it is removed). Cabin 99 is now available.
Client 2 is now using an invalid list of available cabins because Cabin 99 is available but is not included in the list. This omission would be serious if Cabin 99 were the last available cabin, because Client 2 would inaccurately report that the cruise was fully booked. The customer would presumably try to book a cruise on a competing cruise line.
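The timeline above can be imitated with a toy, single-threaded simulation in plain Java. This is only a sketch: the ToyReservationTable class and its methods are hypothetical names invented for illustration, they are not part of the Titan code or of any EJB or JDBC API, and a real database tracks uncommitted work very differently.

```java
import java.util.HashSet;
import java.util.Set;

final class DirtyReadDemo {

    // Hypothetical in-memory stand-in for the RESERVATION table. It keeps
    // committed rows apart from one transaction's uncommitted inserts.
    static final class ToyReservationTable {
        private final Set<Integer> committed = new HashSet<>();
        private final Set<Integer> pending = new HashSet<>();

        // The writing transaction reserves a cabin but has not committed yet.
        void insertUncommitted(int cabinId) {
            pending.add(cabinId);
        }

        // Committing makes every pending insert durable.
        void commit() {
            committed.addAll(pending);
            pending.clear();
        }

        // Rolling back discards every pending insert.
        void rollback() {
            pending.clear();
        }

        // A reader that tolerates dirty reads also sees uncommitted rows.
        boolean isReserved(int cabinId, boolean allowDirtyReads) {
            return committed.contains(cabinId)
                    || (allowDirtyReads && pending.contains(cabinId));
        }
    }

    public static void main(String[] args) {
        ToyReservationTable table = new ToyReservationTable();

        table.insertUncommitted(99);                      // 10:00:01 Client 1 reserves Cabin 99
        boolean dirtyView = table.isReserved(99, true);   // 10:00:02 Client 2 reads dirty data
        boolean cleanView = table.isReserved(99, false);  // what a committed-only reader sees
        table.rollback();                                 // 10:00:04 payment fails, rollback

        System.out.println("dirty reader saw reservation: " + dirtyView);
        System.out.println("committed-only reader saw reservation: " + cleanView);
        System.out.println("reserved after rollback: " + table.isReserved(99, true));
    }
}
```

The dirty reader is told that Cabin 99 is reserved even though the reservation never becomes durable, which is the stale answer Client 2 ends up holding in the scenario.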
A repeatable read occurs when the data read is guaranteed to look the same if read again during the same transaction. Repeatable reads are guaranteed in one of two ways: either the data read is locked against changes or the data read is a snapshot that doesn’t reflect changes. If the data is locked, it cannot be changed by any other transaction until the current transaction ends. If the data is a snapshot, other transactions can change the data, but these changes will not be seen by this transaction if the read is repeated. Here’s an example of a repeatable read (illustrated in Figure 16-9):
Time 10:00:00: Client 1 begins an explicit javax.transaction.UserTransaction.
Time 10:00:01: Client 1 executes TravelAgent.listAvailableCabins(2), asking for a list of available cabins that have two beds. Cabin 99 is in the list of available cabins.
Time 10:00:02: Client 2 is working with an interface that manages Cabin EJBs. Client 2 attempts to change the bed count on Cabin 99 from 2 to 3.
Time 10:00:03: Client 1 re-executes TravelAgent.listAvailableCabins(2). Cabin 99 is still in the list of available cabins.
This example is somewhat unusual because it uses javax.transaction.UserTransaction. This interface is covered in more detail later in this chapter; essentially, it allows a client application to control the scope of a transaction explicitly. In this case, Client 1 places transaction boundaries around both calls to listAvailableCabins( ), so that they are a part of the same transaction. If Client 1 didn’t do this, the two listAvailableCabins( ) methods would have executed as separate transactions and our repeatable read condition would not have occurred.
Although Client 2 attempted to change the bed count for Cabin 99 to 3, Cabin 99 still shows up in the Client 1 call to listAvailableCabins( ) when a bed count of 2 is requested. Either Client 2 was prevented from making the change (because of a lock) or Client 2 was able to make the change, but Client 1 is working with a snapshot of the data that doesn’t reflect that change.
A nonrepeatable read occurs when data retrieved by a subsequent read within the same transaction can differ from the data retrieved earlier. In other words, the subsequent read can see the changes made by other transactions.
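The difference between the two conditions can be sketched with a small in-memory model, assuming the snapshot approach to repeatable reads. ToyCabinStore, readLive( ), and snapshot( ) are hypothetical names made up for this illustration; the sketch copies a map to imitate a snapshot and says nothing about how a particular database implements one.

```java
import java.util.HashMap;
import java.util.Map;

final class RepeatableReadDemo {

    // Hypothetical store mapping a cabin ID to its bed count.
    static final class ToyCabinStore {
        private final Map<Integer, Integer> bedCounts = new HashMap<>();

        void setBedCount(int cabinId, int beds) {
            bedCounts.put(cabinId, beds);
        }

        // Reads the current value, including changes made since the
        // reading transaction began (a nonrepeatable read is possible).
        int readLive(int cabinId) {
            return bedCounts.get(cabinId);
        }

        // Copies the data as it stands now; later changes are not reflected.
        Map<Integer, Integer> snapshot() {
            return new HashMap<>(bedCounts);
        }
    }

    public static void main(String[] args) {
        ToyCabinStore store = new ToyCabinStore();
        store.setBedCount(99, 2);

        Map<Integer, Integer> snap = store.snapshot();  // Client 1's transaction begins
        int firstRead = snap.get(99);                   // first read: 2 beds

        store.setBedCount(99, 3);                       // Client 2 changes the bed count

        int repeatedFromSnapshot = snap.get(99);        // repeatable: still the old value
        int repeatedLive = store.readLive(99);          // nonrepeatable: sees the change

        System.out.println("first read: " + firstRead);
        System.out.println("repeated read from snapshot: " + repeatedFromSnapshot);
        System.out.println("repeated live read: " + repeatedLive);
    }
}
```

The snapshot reader gets 2 both times, while the live reader gets 3 on the second read; the live reader has experienced a nonrepeatable read.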
A phantom read occurs when new records added to the database are detectable by transactions that started prior to the insert. Queries will include records added by other transactions after their transaction has started. Here’s a scenario that includes a phantom read (illustrated in Figure 16-10):
Time 10:00:00: Client 1 begins an explicit javax.transaction.UserTransaction.
Time 10:00:01: Client 1 executes TravelAgent.listAvailableCabins(2), asking for a list of available cabins that have two beds. Cabin 99 is in the list of available cabins.
Time 10:00:02: Client 2 executes bookPassage( ) and creates a Reservation EJB. The reservation inserts a new record into the RESERVATION table, reserving Cabin 99.
Time 10:00:03: Client 1 re-executes TravelAgent.listAvailableCabins(2). Cabin 99 is no longer in the list of available cabins.
Client 1 places transaction boundaries around both calls to listAvailableCabins( ), so that they are part of the same transaction. In this case, the reservation was made between the listAvailableCabins( ) queries in the same transaction. Therefore, the record inserted in the RESERVATION table did not exist when the first listAvailableCabins( ) method was invoked, but it did exist and was visible when the second listAvailableCabins( ) method was invoked. The record inserted is called a phantom record.
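A rough way to picture the phantom record is to rerun the same query over a set of reservation rows that grows between the two calls. In the sketch below, availableCabins( ) is a hypothetical stand-in for the query behind listAvailableCabins( ), and the cabin IDs other than 99 are made up; it is not the book's JDBC code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class PhantomReadDemo {

    // Hypothetical stand-in for the availability query: a cabin is
    // available when no reservation row refers to it.
    static List<Integer> availableCabins(List<Integer> cabinIds, Set<Integer> reservedCabinIds) {
        List<Integer> available = new ArrayList<>();
        for (Integer id : cabinIds) {
            if (!reservedCabinIds.contains(id)) {
                available.add(id);
            }
        }
        return available;
    }

    public static void main(String[] args) {
        List<Integer> cabins = Arrays.asList(98, 99, 100);  // made-up cabin IDs
        Set<Integer> reservations = new HashSet<>();        // toy RESERVATION table

        // 10:00:01 Client 1 runs the query inside its transaction.
        List<Integer> firstResult = availableCabins(cabins, reservations);

        // 10:00:02 Client 2 inserts a reservation row for Cabin 99.
        reservations.add(99);

        // 10:00:03 Client 1 repeats the query in the same transaction and
        // the newly inserted (phantom) row changes the result.
        List<Integer> secondResult = availableCabins(cabins, reservations);

        System.out.println("first query:  " + firstResult);
        System.out.println("second query: " + secondResult);
    }
}
```

Nothing Client 1 read the first time was modified; the result differs only because a new row now matches the query, which is what separates a phantom read from a nonrepeatable read.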
Databases, especially relational databases, normally use several different locking techniques. The most common are read locks, write locks, and exclusive write locks. (I’ve taken the liberty of adding “snapshots,” although this isn’t a formal term.) These locking mechanisms control how transactions access data concurrently. Locking mechanisms impact the read conditions just described. These types of locks are simple concepts that are not directly addressed in the EJB specification. Database vendors implement these locks differently, so you should understand how your database addresses these locking mechanisms to best predict how the isolation levels described in this section will work.
Read locks prevent other transactions from changing data read during a transaction until the transaction ends, thus preventing nonrepeatable reads. Other transactions can read the data but not write to it. The current transaction is also prohibited from making changes. Whether a read lock locks only the records read, a block of records, or a whole table depends on the database being used.
Write locks are used for updates. A write lock prevents other transactions from changing the data until the current transaction is complete but allows dirty reads by other transactions and by the current transaction itself. In other words, the transaction can read its own uncommitted changes.
Exclusive write locks are used for updates. An exclusive write lock prevents other transactions from reading or changing the data until the current transaction is complete, which also prevents dirty reads by other transactions. Some databases do not allow transactions to read their own data while it is exclusively locked.
Some databases get around locking by providing every transaction with its own snapshot of the data. A snapshot is a frozen view of the data that is taken when the transaction begins. Snapshots can prevent dirty reads, nonrepeatable reads, and phantom reads. They can be problematic because the data is not real-time; it is old the instant the snapshot is taken.
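As a loose in-JVM analogy only, the sharing rules for read locks and exclusive write locks resemble those of java.util.concurrent.locks.ReentrantReadWriteLock: many readers may hold the read lock together, while the write lock excludes everyone else. The sketch below uses threads in place of transactions; otherThreadCanAcquire( ) is a hypothetical helper, and nothing here reflects how a database actually implements its locks. The nonexclusive write lock described above has no counterpart in this class.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class LockCompatibilityDemo {

    // Attempts to take the lock from a separate thread (standing in for a
    // second transaction) and reports whether the attempt succeeded.
    static boolean otherThreadCanAcquire(Lock lock) {
        AtomicBoolean acquired = new AtomicBoolean(false);
        Thread other = new Thread(() -> {
            if (lock.tryLock()) {
                acquired.set(true);
                lock.unlock();
            }
        });
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return acquired.get();
    }

    public static void main(String[] args) {
        ReentrantReadWriteLock rw = new ReentrantReadWriteLock();

        rw.readLock().lock();   // the first "transaction" holds a read lock
        try {
            System.out.println("under read lock, reader admitted: "
                    + otherThreadCanAcquire(rw.readLock()));
            System.out.println("under read lock, writer admitted: "
                    + otherThreadCanAcquire(rw.writeLock()));
        } finally {
            rw.readLock().unlock();
        }

        rw.writeLock().lock();  // the first "transaction" holds the exclusive lock
        try {
            System.out.println("under write lock, reader admitted: "
                    + otherThreadCanAcquire(rw.readLock()));
            System.out.println("under write lock, writer admitted: "
                    + otherThreadCanAcquire(rw.writeLock()));
        } finally {
            rw.writeLock().unlock();
        }
    }
}
```

Only the first attempt succeeds: a second reader is admitted alongside a read lock, but a writer is refused, and nobody is admitted while the exclusive lock is held.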
Transaction isolation is defined in terms of the isolation conditions (dirty reads, repeatable reads, and phantom reads). Isolation levels are commonly used in database systems to describe how locking is applied to data within a transaction.[49] The following terms are used to discuss isolation levels:
Read Uncommitted: The transaction can read uncommitted data (i.e., data changed by a different transaction that is still in progress). Dirty reads, nonrepeatable reads, and phantom reads can occur. Bean methods with this isolation level can read uncommitted changes.
Read Committed: The transaction cannot read uncommitted data; data that is being changed by a different transaction cannot be read. Dirty reads are prevented; nonrepeatable reads and phantom reads can occur. Bean methods with this isolation level cannot read uncommitted data.
Repeatable Read: The transaction cannot change data that is being read by a different transaction. Dirty reads and nonrepeatable reads are prevented; phantom reads can occur. Bean methods with this isolation level have the same restrictions as Read Committed and can execute only repeatable reads.
Serializable: The transaction has exclusive read and update privileges to data; different transactions can neither read nor write to the same data. Dirty reads, nonrepeatable reads, and phantom reads are prevented. This isolation level is the most restrictive.
These isolation levels are the same as those defined for JDBC. Specifically, they map to the static final constants TRANSACTION_READ_UNCOMMITTED, TRANSACTION_READ_COMMITTED, TRANSACTION_REPEATABLE_READ, and TRANSACTION_SERIALIZABLE in the java.sql.Connection interface. The behavior modeled by the isolation levels in the Connection interface is the same as the behavior described here.
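The four levels and the read conditions each one still permits can be laid out as a small lookup table. The Level enum and its field names below are hypothetical, introduced only for this sketch; the java.sql.Connection constants are the real JDBC ones.

```java
import java.sql.Connection;

final class IsolationLevelTable {

    // Hypothetical enum pairing each isolation level with its
    // java.sql.Connection constant and the read conditions that the
    // definitions above say can still occur at that level.
    enum Level {
        READ_UNCOMMITTED(Connection.TRANSACTION_READ_UNCOMMITTED, true, true, true),
        READ_COMMITTED(Connection.TRANSACTION_READ_COMMITTED, false, true, true),
        REPEATABLE_READ(Connection.TRANSACTION_REPEATABLE_READ, false, false, true),
        SERIALIZABLE(Connection.TRANSACTION_SERIALIZABLE, false, false, false);

        final int jdbcConstant;
        final boolean dirtyReads;
        final boolean nonrepeatableReads;
        final boolean phantomReads;

        Level(int jdbcConstant, boolean dirtyReads,
              boolean nonrepeatableReads, boolean phantomReads) {
            this.jdbcConstant = jdbcConstant;
            this.dirtyReads = dirtyReads;
            this.nonrepeatableReads = nonrepeatableReads;
            this.phantomReads = phantomReads;
        }
    }

    public static void main(String[] args) {
        // Print one row per level: which read conditions remain possible.
        for (Level level : Level.values()) {
            System.out.printf("%-16s jdbc=%d dirty=%b nonrepeatable=%b phantom=%b%n",
                    level, level.jdbcConstant, level.dirtyReads,
                    level.nonrepeatableReads, level.phantomReads);
        }
    }
}
```

Reading the table top to bottom, each level rules out one more condition than the level before it.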
The exact behavior of these isolation levels depends largely on the locking mechanism used by the underlying database or resource, and on how fully that database supports each level.
In EJB, the deployer sets transaction isolation levels in a vendor-specific way if the container manages the transaction. The EJB developer sets the transaction isolation level if the enterprise bean manages its own transactions. Up to this point, we have discussed only container-managed transactions; we will discuss bean-managed transactions later in this chapter.
Generally speaking, as the isolation levels become more restrictive, the performance of the system decreases because more restrictive isolation levels prevent transactions from accessing the same data. If isolation levels are very restrictive, like Serializable, then all transactions, even simple reads, must wait in line to execute. This can result in a system that is very slow. EJB systems that process a large number of concurrent transactions and need to be very fast will therefore avoid the Serializable isolation level where it is not necessary.
Isolation levels, however, also enforce consistency of data. More restrictive isolation levels help ensure that invalid data is not used for performing updates. The old adage “garbage in, garbage out” applies. The Serializable isolation level ensures that data is never accessed concurrently by transactions, thus ensuring that the data is always consistent.
Choosing the correct isolation level requires some research about the database you are using and how it handles locking. You must also balance the performance needs of your system against consistency. This is not a cut-and-dried process, because different applications use data differently.
Although there are only three ships in Titan’s system, the entity beans that represent them are included in most of Titan’s transactions. This means that many, possibly hundreds, of transactions will be accessing these Ship EJBs at the same time. Access to Ship EJBs needs to be fast or a bottleneck will occur, so we do not want to use a restrictive isolation level. At the same time, the ship data also needs to be consistent; otherwise, hundreds of transactions will be using invalid data. Therefore, we need to use a strong isolation level when making changes to ship information. To accommodate these conflicting requirements, we can apply different isolation levels to different methods.
Most transactions use the Ship EJB’s get methods to obtain information. This is read-only behavior, so the isolation level for the get methods can be very low—such as Read Uncommitted. The set methods of the Ship EJB are almost never used; the name of the ship probably will not change for years. However, the data changed by the set methods must be isolated to prevent dirty reads by other transactions, so we will use the most restrictive isolation level, Serializable, on the ship’s set methods. By using different isolation levels on different business methods, we can balance consistency against performance.
Different EJB servers allow different levels of granularity for isolation levels; some servers defer this responsibility to the database. Most EJB servers control the isolation level through the resource access API (e.g., JDBC and JMS) and may allow different resources to have different isolation levels, but will generally require that access to the same resource within a single transaction use a consistent isolation level. Consult your vendor’s documentation to find out the level of control your server offers.
Bean-managed transactions in session beans and message-driven beans, however, allow you to specify the transaction isolation level using the database’s API. The JDBC API, for example, provides a mechanism for specifying the isolation level of the database connection. For example:
    DataSource source = (javax.sql.DataSource)
        jndiCntxt.lookup("java:comp/env/jdbc/titanDB");
    Connection con = source.getConnection( );
    con.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
You can have different isolation levels for different resources within the same transaction, but all enterprise beans that use the same resource in a transaction should use the same isolation level.
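When a bean changes the isolation level on a connection it did not create, it may want to put the original level back afterward. The following is a minimal sketch under that assumption; IsolationHelper, switchIsolation( ), and runSerializable( ) are hypothetical names, not part of JDBC or EJB. The JDBC documentation describes the effect of calling setTransactionIsolation( ) in the middle of a transaction as implementation-defined, so the sketch assumes the switch happens before any transactional work starts.

```java
import java.sql.Connection;
import java.sql.SQLException;

final class IsolationHelper {

    // Sets the requested isolation level and returns the level that was
    // in effect before, so the caller can restore it later.
    static int switchIsolation(Connection con, int newLevel) throws SQLException {
        int previous = con.getTransactionIsolation();
        con.setTransactionIsolation(newLevel);
        return previous;
    }

    // Hypothetical convenience wrapper: runs the given work at the
    // Serializable level, then restores the previous level even if the
    // work throws an unchecked exception.
    static void runSerializable(Connection con, Runnable work) throws SQLException {
        int previous = switchIsolation(con, Connection.TRANSACTION_SERIALIZABLE);
        try {
            work.run();
        } finally {
            con.setTransactionIsolation(previous);
        }
    }
}
```

A bean method could pass the body of its database work to runSerializable( ) as a lambda; whether the requested level is honored still depends on the driver and database.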