Monitoring WebSphere Application Server applications
As with other applications, you want to know how your WebSphere Application Server and DB2 applications are performing in terms of elapsed time, response time, and transactions per second. Many tools and techniques are available to capture performance-related information in a WebSphere Application Server / DB2 environment.
Before you dive into the different tools and traces that can be used to collect and analyze performance data, it is important to establish a performance analysis strategy or framework in your installation.
Capturing information is the first step; you also must analyze and interpret the data. As performance data can be captured on both the WebSphere Application Server and the DB2 side, it is also important to be able to correlate the data that is collected on both sides.
This chapter covers the following topics:
8.1 Performance monitoring
Performance monitoring and analysis is a broad subject. The focus in this publication is on monitoring connections into DB2 for z/OS that come from WebSphere Application Server applications. This book looks at the WebSphere Application Server side, z/OS side, and the DB2 side, with an emphasis on DB2.
As noted earlier, it is important to establish a performance analysis strategy or framework in your installation before diving into the different tools and traces that can be used to collect and analyze performance data. Such a strategy typically consists of two components:
8.1.1 Continuous monitoring
First, you must decide what performance data to collect, either continuously or at regular intervals, to determine how your applications are performing on a day-to-day basis.
For DB2 for z/OS, this data is typically DB2 statistics and accounting trace records, and for WebSphere Application Server, the SMF 120 records. When using dynamic SQL, which is used by JDBC applications, it might be a good idea to capture information from the dynamic statement cache at regular intervals to track the performance of individual SQL statements over time. For more information about which DB2 information to capture, see 8.4.1, “Which information to gather” on page 395.
You can use this information to establish a profile for your applications that you can track over time. It helps you understand how your applications perform on a day-to-day basis, and determine what has changed if performance deteriorates.
8.1.2 Detailed monitoring
Normally, this type of monitoring is used only when there is a problem with the application's performance. These traces typically introduce a performance impact, so you do not want to leave them turned on permanently.
Many types of traces are available in the components of the application stack (WebSphere Application Server, the data server driver, the JVM, and DB2). You must understand which detailed traces are available to you and in which cases they can be useful.
8.2 Correlating performance data from different sources
Gathering performance data or trace data in the different components that are involved in running a transaction is one thing; correlating the data from those components is another. When you look at the overall performance of the system, correlation is not really an issue, but when you drill down to the transaction level, you must be able to tie together the data that is gathered by the different components. This publication focuses on correlating WebSphere Application Server data, DB2 performance data, and WLM and RMF data for accounting, workload management, or debugging purposes.
From the DB2 side, the planname, authorization ID, or transaction name (correlation name) is traditionally used to identify transactions. In a WebSphere Application Server environment, those items are not always available, or are often the same for all work coming from the application server, and are therefore not helpful.
For example, when you use a type 4 connection using JDBC, there is not really a DB2 plan (other than the generic DISTSERV plan that is used by everybody). Therefore, using the planname is not useful for discovering how specific applications are performing. The same situation applies to the usage of the DB2 authorization ID. In many cases, the application server uses a single authorization ID for all work that is being sent to DB2.
Therefore, to be able to identify WebSphere Application Server applications and correlate them with the information in DB2 and RMF, you typically must use different identifiers and different techniques than the DB2 planname and authorization ID.
8.2.1 Using client information strings for correlating data
A convenient way to establish a “link” between the WebSphere Application Server application and the work inside the database engine is to use client information strings, which are also called user agent strings. They provide extended information about the client to the server to allow better accounting, workload management, or debugging. The extended client information is sent to the database server when the application performs an action that accesses the server, such as running SQL statements.
Setting the client information in your application
You can use IBM Data Server Driver for JDBC and SQLJ-only methods to provide extended client information to the data source and additional information about the client to the server. However, in IBM Data Server Driver for JDBC and SQLJ Version 4.0 and later, these driver-only methods are deprecated; use java.sql.Connection.setClientInfo instead.
Therefore, the IBM Data Server Driver for JDBC and SQLJ-only methods (on com.ibm.db2.jcc.DB2Connection) in Table 8-1 are listed for reference only.
Table 8-1 Setting client information through Data Server Driver only methods
Method                                Information provided
setDB2ClientAccountingInformation     Accounting information
setDB2ClientApplicationInformation    Name of the application that is working with a connection
setDB2ClientDebugInfo                 The CLIENT DEBUGINFO connection attribute for the Unified debugger
setDB2ClientProgramId                 A caller-specified string that helps the caller identify which program is associated with a particular SQL statement
setDB2ClientUser                      User name for a connection
setDB2ClientWorkstation               Client workstation name for a connection
The IBM Data Server Driver for JDBC and SQLJ Version 4.0 and later supports the usage of client information properties that are part of the JDBC 4.0 standard. Use those properties instead of the ‘IBM DB2-only’ implementation. An application can also use the Connection.getClientInfo method to retrieve client information from the database server, or use the DatabaseMetaData.getClientInfoProperties method to determine which client information the IBM Data Server Driver for JDBC and SQLJ driver supports.
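As a sketch, an application might set the standard JDBC 4.0 client information properties as follows. The helper class name and the property values are hypothetical; only the property names come from the driver documentation described here:

```java
import java.sql.Connection;
import java.sql.SQLClientInfoException;

public class ClientInfoTagger {
    /**
     * Apply the JDBC 4.0 standard client information properties to an
     * existing connection. The property names are the ones that the
     * IBM Data Server Driver for JDBC and SQLJ supports for DB2 for z/OS.
     */
    public static void tag(Connection con, String app, String user,
                           String host, String acct) throws SQLClientInfoException {
        con.setClientInfo("ApplicationName", app);              // CURRENT CLIENT_APPLNAME
        con.setClientInfo("ClientUser", user);                  // CURRENT CLIENT_USERID
        con.setClientInfo("ClientHostname", host);              // CURRENT CLIENT_WRKSTNNAME
        con.setClientInfo("ClientAccountingInformation", acct); // CURRENT CLIENT_ACCTNG
    }
}
```

A program would call tag() right after it obtains the connection from the data source; Connection.getClientInfo can then be used to read the values back.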
Table 8-2 lists the client information property values that the IBM Data Server Driver for JDBC and SQLJ returns for DB2 for z/OS when the connection uses type 4 connectivity.
Table 8-2 Client properties that are set by the driver when using a type 4 connection to DB2 for z/OS
ApplicationName
  MAX_LEN (bytes): 32
  DEFAULT_VALUE: clientProgramName property value, if set; "db2jcc_application" otherwise.
  Description: The name of the application that is using the connection. This value is stored in the DB2 special register CURRENT CLIENT_APPLNAME.

ClientAccountingInformation
  MAX_LEN (bytes): 200
  DEFAULT_VALUE: A string that is the concatenation of the following values:
    - "JCCnnnnn", where nnnnn is the driver level, such as 04000.
    - The value that is set by DB2Connection.setDB2ClientWorkstation; if the value is not set, the default is the host name of the local host.
    - The applicationName property value, if set; 20 blanks otherwise.
    - The clientUser property value, if set; eight blanks otherwise.
  Description: The value of the accounting string from the client information that is specified for the connection. This value is stored in the DB2 special register CURRENT CLIENT_ACCTNG.

ClientHostname
  MAX_LEN (bytes): 18
  DEFAULT_VALUE: The value that is set by DB2Connection.setDB2ClientWorkstation; if the value is not set, the default is the host name of the local host.
  Description: The host name of the computer on which the application that is using the connection is running. This value is stored in the DB2 special register CURRENT CLIENT_WRKSTNNAME.

ClientUser
  MAX_LEN (bytes): 16
  DEFAULT_VALUE: The value that is set by DB2Connection.setDB2ClientUser; if the value is not set, the default is the current user ID that is used to connect to the database.
  Description: The name of the user on whose behalf the application that is using the connection is running. This value is stored in the DB2 special register CURRENT CLIENT_USERID.
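To make the composition of the default ClientAccountingInformation value concrete, the concatenation described above can be sketched as follows. This is a simplified illustration only; the real driver builds this string internally, and the exact padding of a supplied application name or user is an assumption here:

```java
public class DefaultAcctng {
    /**
     * Sketch of the default accounting string for a type 4 connection:
     * "JCC" plus the driver level, then the workstation name, then the
     * application name (or 20 blanks) and the client user (or 8 blanks).
     */
    public static String build(String driverLevel, String workstation,
                               String applicationName, String clientUser) {
        StringBuilder sb = new StringBuilder();
        sb.append("JCC").append(driverLevel);                            // e.g. "JCC04000"
        sb.append(workstation);                                          // client workstation name
        sb.append(applicationName != null ? applicationName : blanks(20));
        sb.append(clientUser != null ? clientUser : blanks(8));
        return sb.toString();
    }

    private static String blanks(int n) {
        char[] c = new char[n];
        java.util.Arrays.fill(c, ' ');
        return new String(c);
    }
}
```

Recognizing this layout is useful when you see CURRENT CLIENT_ACCTNG values in accounting reports and need to pick out the driver level or workstation name.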
Table 8-3 lists the client information property values that the IBM Data Server Driver for JDBC and SQLJ returns for DB2 for z/OS when the connection uses type 2 connectivity.
Table 8-3 Client properties that are set by the driver when using a type 2 connection to DB2 for z/OS
ApplicationName
  MAX_LEN (bytes): 32
  DEFAULT_VALUE: Empty string
  Description: The name of the application that is using the connection. This value is stored in the DB2 special register CURRENT CLIENT_APPLNAME.

ClientAccountingInformation
  MAX_LEN (bytes): 200
  DEFAULT_VALUE: Empty string
  Description: The value of the accounting string from the client information that is specified for the connection. This value is stored in the DB2 special register CURRENT CLIENT_ACCTNG.

ClientHostname
  MAX_LEN (bytes): 18
  DEFAULT_VALUE: Empty string
  Description: The host name of the computer on which the application that is using the connection is running. This value is stored in the DB2 special register CURRENT CLIENT_WRKSTNNAME.

ClientUser
  MAX_LEN (bytes): 16
  DEFAULT_VALUE: Empty string
  Description: The name of the user on whose behalf the application that is using the connection is running. This value is stored in the DB2 special register CURRENT CLIENT_USERID.
Specifying the client information inside the application has the advantage that each application can provide its own settings, which allows a high degree of granularity and makes detailed monitoring of applications and components possible.
The disadvantage of this approach is that you rely on the programmer to provide this information, and monitoring is often not their first priority. If the information is not passed, and client information is used to classify work in WLM, the program might run with the wrong priority and miss its service levels.
Setting the client information in WebSphere Application Server
Specifying the client information at the application server level removes the burden from the programmer and also allows you to dynamically change the settings without having to change the application code.
Specifying client information at the data source level
Figure 8-1 and Figure 8-2 on page 367 show how to specify the client information strings at the data source level.
Figure 8-1 Data source Custom properties
They must be entered as custom properties at the data source level. In this example, we use TradeClientUser for the clientUser property. When the application connects to DB2 through this data source, this results in the CURRENT CLIENT_USERID special register being set to TradeClientUser.
Figure 8-2 Specifying client information as data source custom properties
Flowing client information implicitly
If you do not want to set the client information explicitly at the data source level, or when the same data source is used by many different applications, you can choose to set the enableClientInformation property, as shown in Figure 8-3.
Figure 8-3 Using the enableClientInformation Custom property
For example, during our tests with the type 2 driver, we did not specify any specific client information strings. In that case, only the WebSphere Application Server application name (DayTrader-EE6) is passed to DB2 (QWHCEUTX, the user transaction name).
Specifying client information at the application level
Another place to specify the client information is by using the extended properties at the application level.
Figure 8-4 on page 369 and Figure 8-5 on page 369 show how to specify this information in more detail by using the Admin console. We use the D0ZG WASTestClientInfo application to demonstrate this feature.
Figure 8-4 Application Resource reference window
In this example, we specified only the clientApplicationInformation string.
Figure 8-5 Specifying client information as an extended data source property
8.2.2 Using client information strings to classify work in WLM and RMF reporting
You can use the client information when you classify work on the z/OS system. When work must be run on a z/OS system, it is classified by the z/OS workload manager (WLM) component. WLM assigns a priority to this piece of work based on the classification rules that you specify in the WLM policy.
Classifying work when using a type 4 connection
When work comes into DB2 (through the distributed address space) over a Java type 4 connection, an enclave is created and the work is classified by using the WLM classification criteria that are described in the WLM policy.
The following classification options relate to the usage of client information when using a type 4 connection:
AI Accounting information. This is the value of the DB2 accounting string that is associated with the DDF server thread, as described by the QMDAAINF field in the DB2 DSNDQMDA mapping macro. WLM imposes a maximum length of 143 bytes for accounting information. (The DB2 macros can be found in the hlq.SDSNMACS library.)
PC Process name. This attribute can be used to classify work by using the application name or the transaction name. The value is defined by the QWHCEUTX field in the DB2 DSNDQWHC mapping macro.
SPM Subsystem parameter. This qualifier has a maximum length of 255 bytes. Its content depends on the environment that you run in. When classifying DDF work, the first 16 bytes contain the client's user ID. The next 18 bytes contain the client's workstation name. The remaining 221 bytes are reserved.
If the client's user ID is shorter than 16 bytes, the attribute is padded with blanks after the user ID. If the client's workstation name is shorter than 18 bytes, the attribute is padded with blanks after the workstation name.
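The layout of the SPM qualifier's first 34 bytes can be sketched as follows. This is an illustration of the padding rules just described, not driver or WLM code, and the truncation behavior for over-length values is an assumption:

```java
public class SpmQualifier {
    /**
     * Build the first 34 bytes of the WLM SPM work qualifier for DDF work:
     * the client's user ID blank-padded to 16 bytes, followed by the
     * client's workstation name blank-padded to 18 bytes.
     */
    public static String build(String clientUserId, String workstationName) {
        return pad(clientUserId, 16) + pad(workstationName, 18);
    }

    private static String pad(String s, int length) {
        if (s.length() >= length) {
            return s.substring(0, length); // assumed: over-length values are cut off
        }
        StringBuilder sb = new StringBuilder(s);
        while (sb.length() < length) {
            sb.append(' '); // blank padding, as described above
        }
        return sb.toString();
    }
}
```

Keeping this fixed-offset layout in mind helps when you write SPM classification rules with a Start position, because the workstation name always begins at byte 17.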
There are many other classification types that can be used to qualify work. For more information, see the following resources:
The “Defining Work Qualifiers” section in z/OS MVS Planning: Workload Management, SA22-7602-20, found at:
The “Classification attributes” section in DB2 10 for z/OS Managing Performance, SC19-2978, found at:
The classification rules for the DDF work that we used during this project are shown in Figure 8-6 on page 371.
Subsystem-Type Xref Notes Options Help
--------------------------------------------------------------------------
Modify Rules for the Subsystem Type Row 19 to 41 of 41
Command ===> ___________________________________________ Scroll ===> CSR
Subsystem Type . : DDF Fold qualifier names? N (Y or N)
Description . . . DDF Work Requests
Action codes: A=After C=Copy M=Move I=Insert rule
B=Before D=Delete row R=Repeat IS=Insert Sub-rule
More ===>
--------Qualifier-------- -------Class--------
Action Type Name Start Service Report
DEFAULTS: DDFBAT ________
____ 1 SI D0Z* ___ DDFDEF RD0ZGDEF
____ 2 PC Trade* ___ DDFONL RTRADE0Z
____ 2 PC dwsClie* ___ DDFONL RDWS0Z
Figure 8-6 Classifying DDF work by using the subsystem and process name
The process name (or application name) is used to qualify the work. Both types of work (Trade* and dwsClie*) use the same service class (DDFONL), but to distinguish between them, we use a separate reporting class for each application:
The Trade* application uses RTRADE0Z.
The dwsClie* application uses RDWS0Z.
Classifying work when using a type 2 connection
When using a type 2 connection, the work comes into DB2 through the RRS attach. The work is already classified and the existing work unit is used to run the DB2 part of the work. The classification is done when the transaction starts at the application server, not when it comes into DB2 (as it is for DDF work).
WebSphere Application Server transaction qualification
WLM cannot classify work that is based on the HTTP URL. However, work can be classified in WLM by using transaction classes, and WebSphere Application Server offers a configuration option to assign transaction classes to HTTP URLs.
You define the URL to transaction class assignments in a WebSphere Application Server classification document, which is a common XML file, as shown in Figure 8-7.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Classification SYSTEM "Classification.dtd">
<Classification schema_version="1.0">
  <InboundClassification type="http" schema_version="1.0"
      default_transaction_class="WHTTP">
    <http_classification_info uri="/daytrader*" transaction_class="DTRADE">
    </http_classification_info>
    <http_classification_info uri="/wastestClientInfo*" transaction_class="DWS">
    </http_classification_info>
  </InboundClassification>
</Classification>
Figure 8-7 WebSphere Application Server classification document wlm.xml
The default transaction class is WHTTP. When the URI matches /daytrader*, the transaction is associated with the DTRADE transaction class, and transaction class DWS is used when the URI matches /wastestClientInfo*.
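Conceptually, the mapping that this classification file expresses is a prefix match on the URI. The following sketch mirrors the wlm.xml example; the class name is hypothetical, and this is not the actual WebSphere Application Server implementation:

```java
public class HttpClassifier {
    /** Map a request URI to a WLM transaction class, as in the wlm.xml example. */
    public static String classify(String uri) {
        if (uri.startsWith("/daytrader")) {
            return "DTRADE";           // http_classification_info uri="/daytrader*"
        }
        if (uri.startsWith("/wastestClientInfo")) {
            return "DWS";              // http_classification_info uri="/wastestClientInfo*"
        }
        return "WHTTP";                // default_transaction_class
    }
}
```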
The URI information is obtained from the deployment descriptor of the application. To retrieve this information, open the administration console, select the application that you want to classify, and click View Deployment Descriptor, as shown in Figure 8-8.
Figure 8-8 Selecting the application’s deployment descriptor
The context-root that is shown in Figure 8-9 is used to assign the transaction class.
Figure 8-9 DayTrader-EE6 deployment descriptor
Now that you have built the XML file, tell the application server to use this file by setting the wlm_classification_file environment variable to the name of your classification file. To do so, navigate to the WebSphere Application Server administration console and click Environment → Manage WebSphere variables, as shown in Figure 8-10.
Figure 8-10 Setting the wlm_classification_file variable
 
Note: Make sure that WebSphere Application Server has the necessary permissions to access the WLM classification file.
During the starting sequence, WebSphere Application Server issues a runtime message to confirm that the current WLM_CLASSIFICATION_FILE setting is being used, as shown
in Figure 8-11.
BBOM0001I wlm_classification_file: /u/rajesh/wlm.xml.
Figure 8-11 Current wlm_classification_file that is being used at start
You can also change the classification file, and check the current setting by using console commands, as shown in Figure 8-12. Message BBOO0211I indicates the success or failure of the RECLASSIFY command option.
F MZSR014,RECLASSIFY,FILE='/u/rajesh/wlm.xml'
 
BBOJ0129I: The /u/rajesh/wlm.xml workload classification file was 795
loaded at 2012/08/11 00:22:47.297 (GMT)
BBOO0211I MODIFY COMMAND RECLASSIFY,FILE='/u/rajesh/wlm.xml'
COMPLETED SUCCESSFULLY
 
 
BBOO0211I MODIFY COMMAND RECLASSIFY,FILE='/u/rajesh/wlm.xml' COMPLETED
WITH ERRORS
 
 
F MZSR014,DISPLAY,WORK,CLINFO
 
BBOJ0129I: The /u/rajesh/wlm.xml workload classification file was 798
loaded at 2012/08/11 00:22:47.297 (GMT)
BBOO0281I CLASSIFICATION COUNTERS FOR HTTP WORK
BBOO0282I CHECKED 0, MATCHED 0, USED 0, COST 3, DESC: HTTP root
BBOO0282I CHECKED 0, MATCHED 0, USED 0, COST 2, DESC: HTTP root
BBOO0282I CHECKED 0, MATCHED 0, USED 0, COST 3, DESC: HTTP root
BBOO0283I FOR HTTP WORK: TOTAL CLASSIFIED 0, WEIGHTED TOTAL COST 0
BBOO0188I END OF OUTPUT FOR COMMAND DISPLAY,WORK,CLINFO
Figure 8-12 Changing and displaying the workload classification file
MZSR014 is the name of one of our WebSphere Application Server servers.
For more information about the workload classification file, see the following website:
With the transaction workload classification in place on the WebSphere Application Server side, you can now use the transaction classes in the WLM classification rules, as illustrated in Figure 8-13 on page 375.
Subsystem-Type Xref Notes Options Help
--------------------------------------------------------------------------
Modify Rules for the Subsystem Type Row 6 to 11 of 11
Command ===> ___________________________________________ Scroll ===> CSR
Subsystem Type . : CB Fold qualifier names? N (Y or N)
Description . . . WebSphere/Component Broker
Action codes: A=After C=Copy M=Move I=Insert rule
B=Before D=Delete row R=Repeat IS=Insert Sub-rule
More ===>
--------Qualifier-------- -------Class--------
Action Type Name Start Service Report
DEFAULTS: WASONL      RCB
____ 1 CN MZSR01 ___ WASONL RMZSR01
____ 2 TC DTRADE ___ WASONL RTRADE
____ 2 TC DWS ___ WASONL RDWS
Figure 8-13 WebSphere work classification using transaction classes
In this example, we use the CB subsystem type to classify the WebSphere Application Server work:
CN Collection name. This is the logical server name that is defined by using the Component Broker System Management Utility. It represents a set of business objects that are grouped and run in a logical server. This is the WebSphere Application Server cluster name.
TC Transaction class. This is the name that results from mapping the URI to a name.
DTRADE and DWS are the transaction classes that were assigned through the WLM classification file. When a transaction arrives on the MZSR01 cluster and is assigned to the DTRADE transaction class, it runs by using the WASONL service class and the RTRADE RMF reporting class. Using a different reporting class allows you to distinguish between transaction classes even when they use the same service class.
8.2.3 Other techniques to segregate/correlate work
Although the usage of client strings is a preferred practice, you can use other techniques to distinguish between different applications or groups of applications that run in WebSphere Application Server and talk to DB2 for z/OS.
Using separate DB2 collections
When you use JDBC, all applications use the same DB2 packages. They are called SYS* and, by default, they are bound into a collection called NULLID. In that case, you cannot really distinguish between transactions by looking at the DB2 package name.
To get around this problem, you can bind the DB2 JDBC packages into separate collections, one per application or group of applications. To do so, use the DB2Binder utility and specify -collection collection-name.
You can also use the DB2 BIND PACKAGE command with the COPY(collection-name.package-name) keyword. For more information about this command, see 4.3.9, “Bind JDBC packages” on page 165.
On the WebSphere Application Server side, at the data source level, you can use a specific collection for that data source by setting the currentPackageSet property, as shown in Figure 8-14. You can also set this property inside your program on the Connection or DataSource object, but to spare the programmer the effort of specifying this type of information, specify this information at the data source level through the
administration console.
Figure 8-14 Setting currentPackageSet property
Applications that use the TradeDataSource data source now use the SYS* packages from the DAYTRADER collection. When you use the currentPackageSet property, all packages that are used by the applications that use this data source must be present in the collection you point to through the currentPackageSet property.
Using separate DB2 plan names when using a type 2 connection
If you use a type 2 connection, you do not have to specify a planname when you create a connection to DB2 for z/OS. DB2 uses an implicit planname, similar to when you use a type 4 connection. However, you can provide a planname for a type 2 connection. If you create a separate plan for each application (that uses a type 2 connection), you can use the planname to identify the application in much the same way that you do for other types of connections, such as CICS, IMS, and TSO. In this case, create a plan and point it to the standard JDBC packages. For example:
BIND PLAN(plntrade) PKLIST(NULLID.*) ..
In addition to creating the plan, you must indicate in the data source to use this particular plan by setting the planName property, as shown in Figure 8-15 on page 377.
Figure 8-15 Specifying the planName data source property
Now that you have set up a way to correlate WebSphere Application Server, DB2, and RMF information, you can start monitoring your applications.
8.3 Monitoring from WebSphere Application Server
This section describes the different monitoring options and tools you have at your disposal in WebSphere Application Server:
SMF120 records
WebSphere Performance Monitoring Infrastructure (PMI)
Using request metrics
8.3.1 Using SMF 120 records
In z/OS, you can use the system management facilities (SMF) component to gather and record data for evaluating system usage. WebSphere Application Server logs activity data through SMF record type 120. Record 120 includes many subtypes:
Server Activity record: Subtype 1
Server Interval record: Subtype 3
Java Platform, Enterprise Edition Container Activity Record: Subtype 5
Java Platform, Enterprise Edition Container Interval Record: Subtype 6
WebContainer Activity record: Subtype 7
WebContainer Interval record: Subtype 8
Request Activity record: Subtype 9
Outbound Request record: Subtype 10
WebSphere Application Server for z/OS Version 7 introduced SMF type 120 subtype 9. It bundles most of the data that is spread across the other subtypes, and adds more information, such as how much zAAP processing the server uses for a request. WebSphere Application Server creates one subtype 9 record for every request that the server processes, for both external requests (application requests) and internal requests, such as when the controller “talks to” the servant regions.
The other record 120 subtypes are still available, but because subtype 9 combines the information from the other subtypes, we use it to illustrate the type of data that is available.
Enabling SMF 120 data collection
To collect this information, you must make sure that SMF can write the record type that you want to collect. You can verify this by looking at your SMFPRMxx member in PARMLIB or by running the D SMF,O command. Partial sample output is shown in Example 8-1.
Example 8-1 D SMF,O output
D SMF,O
IEE967I 21.09.51 SMF PARAMETERS 799
MEMBER = SMFPRM00
SMFDLEXIT(USER3(IRRADU86)) -- DEFAULT
SMFDLEXIT(USER2(IRRADU00)) -- DEFAULT
SMFDPEXIT(USER3(IRRADU86)) -- DEFAULT
SMFDPEXIT(USER2(IRRADU00)) -- DEFAULT
EMPTYEXCPSEC(NOSUPPRESS) -- DEFAULT
MULCFUNC -- DEFAULT
DSPSIZMAX(2048M) -- DEFAULT
BUFUSEWARN(25) -- DEFAULT
BUFSIZMAX(0128M) -- DEFAULT
MAXEVENTINTRECS(00) -- DEFAULT
SYNCVAL(00) -- DEFAULT
DUMPABND(RETRY) -- DEFAULT
SUBSYS(STC,NOINTERVAL) -- SYS
SUBSYS(STC,NODETAIL) -- SYS
SUBSYS(STC,EXITS(IEFUSO)) -- PARMLIB
SUBSYS(STC,EXITS(IEFUJP)) -- PARMLIB
SUBSYS(STC,EXITS(IEFUJI)) -- PARMLIB
SUBSYS(STC,EXITS(IEFACTRT)) -- PARMLIB
SUBSYS(STC,EXITS(IEFU85)) -- PARMLIB
SUBSYS(STC,EXITS(IEFU84)) -- PARMLIB
SUBSYS(STC,EXITS(IEFU83)) -- PARMLIB
SUBSYS(STC,EXITS(IEFU29)) -- PARMLIB
SUBSYS(STC,TYPE(0:18,20:98,100:255)) -- PARMLIB
....
In this case, record types 100 - 255 are enabled, which includes the type 120 record.
By default, WebSphere Application Server does not write any records to SMF. You must activate the writing of these records at the application server level, which can be done in different ways. In this example, we use the administration console to enable SMF recording by clicking Servers → Server Types → WebSphere Application Servers, selecting the server, and clicking Java and Process Management → Process definition → Control → Environment entries.
Figure 8-16 shows where to find the Java and Process Management and Process definition options under the Server Infrastructure heading.
Figure 8-16 Java and Process Management option
The SMF options must be specified at the Control region level. Therefore, you must select the applicable control region and add the new environment variables there, as shown in Figure 8-17.
Figure 8-17 Adding an SMF property
We added the following options by using the administration console, which is shown in Figure 8-18, to activate the SMF recording.
Figure 8-18 SMF recording properties that are set through the administration console
WebContainer SMF recording (SMF 120 subtypes 7 and 8) is activated and deactivated along with SMF recording for the Java Platform, Enterprise Edition container (SMF 120 subtypes 5 and 6), so there are no specific options to activate subtypes 7 and 8.
Here are other properties that you can set (value = 1 to activate) through the
administration console:
server_SMF_request_activity_enabled to enable subtype 9
The following settings add additional information to the subtype 9 record:
 – server_SMF_request_activity_CPU_detail
 – server_SMF_request_activity_timestamps
 – server_SMF_request_activity_security
 – server_SMF_request_activity_async
server_SMF_outbound_enabled to enable subtype 10
The subtype 9 record can also be activated by using z/OS console commands, which are illustrated in Example 8-2. MZSR014 is the WebSphere Application Server name. You can also display the current settings that are in effect.
Example 8-2 Using MVS commands to activate SMF 120 type 9 recording
F MZSR014,SMF,REQUEST,ON
BBOO0211I MODIFY COMMAND SMF,REQUEST,ON COMPLETED SUCCESSFULLY
 
F MZSR014,SMF,REQUEST,CPU,ON
BBOO0211I MODIFY COMMAND SMF,REQUEST,CPU,ON COMPLETED SUCCESSFULLY
 
F MZSR014,SMF,REQUEST,TIMESTAMPS,ON
BBOO0211I MODIFY COMMAND SMF,REQUEST,TIMESTAMPS,ON COMPLETED
SUCCESSFULLY
 
F MZSR014,SMF,REQUEST,SECURITY,ON
BBOO0211I MODIFY COMMAND SMF,REQUEST,SECURITY,ON COMPLETED
SUCCESSFULLY
 
F MZSR014,DISPLAY,SMF
BBOO0344I SMF 120-9: FORCED_ON, CPU USAGE: FORCED_ON, TIMESTAMPS:
FORCED_ON, SECURITY INFO: FORCED_ON, ASYNC: OFF
BBOO0345I SMF 120-9: TIME OF LAST WRITE: 2012/08/07 18:25:15.814883,
SUCCESSFUL WRITES: 433, FAILED WRITES: 0
BBOO0346I SMF 120-9: LAST FAILED WRITE TIME: NEVER, RC: 0
BBOO0389I SMF 120-10: OFF
BBOO0387I SMF 120-10: TIME OF LAST WRITE: NEVER, SUCCESSFUL WRITES: 0,
FAILED WRITES: 0
BBOO0388I SMF 120-10: LAST FAILED WRITE TIME: NEVER, RC: 0
BBOO0188I END OF OUTPUT FOR COMMAND DISPLAY,SMF
 
Note: The changes that you make to the SMF 120 subtype 9 settings through console commands remain active only until the server is restarted, whereas changes that are made through the administration console persist after the server is restarted.
Analyzing SMF 120 information
If you want to learn how to format SMF type 120 records, you can download a sample Java application that is called the SMF Browser from the WebSphere Application Server for z/OS website:
The documentation for the SMF Browser is available in the browser package.
The WebSphere Application Server Information Center also has information that is related to this topic in the “Viewing the output data set” topic. It is available at the following website:
Another excellent source of information about SMF 120 subtype 9 records is the white paper Understanding SMF Record Type 120, Subtype 9. It is available at the following website:
In our example, we used the following commands to generate a summary (SUMPERF) and a detailed (DEFAULT) report of the SMF 120 records that were collected during one of the runs that were using the Trader sample application:
java -cp bbomsmfv.jar:batchsmf.jar com.ibm.ws390.sm.smfview.SMF 'INFILE(BART.WAS.TEST1.SMF120)' 'PLUGIN(PERFSUM,/tmp/smf120sum.txt)'
java -cp bbomsmfv.jar:batchsmf.jar com.ibm.ws390.sm.smfview.SMF 'INFILE(BART.WAS.TEST1.SMF120)' 'PLUGIN(DEFAULT,/tmp/smf120detail.txt)'
The second parm of the PLUGIN option indicates the file that the output is directed to.
The SMF 120 records contain much information. We describe only the subtype 9 record in detail here. Samples of subtypes 1, 3, 7, and 8 for both the summary and detailed output can be found in Appendix E, “SMF 120 records subtypes 1, 3, 7, and 8” on page 545.
Example 8-3 shows the summary (SUMPERF) output by the SMF Browser program for one of the SMF 120.9 (Request Activity) records. It shows the elapsed and CPU time (in microseconds) and the CPU time that was used on a zAAP engine, if one is available. In this case, the entire request was offloaded to zAAP (the CPU and zAAP times are the same). The record also provides information about the time the request came into the application server, and when it was queued, dispatched, and completed. The output also indicates which programs ran; in this case, a servlet and two JSPs.
Example 8-3 Subtype 9 summary
 
SMF -Record Time Server Bean/WebAppName Bytes Bytes # of El.Time CPU_Time(uSec) Other SMF 120.9
Numbr -Type hh:mm:ss Instance Method/Servlet toSvr frSvr Calls (msec) Tot-CPU zAAP Sections Present
1---+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+ ----------------
694 120.9 19:58:06 MZSR014 STC24171-HTTP / 25 584 584
.9Ts: 2012/08/10 23:58:06.377368 Received
.9Ts: 2012/08/10 23:58:06.377459 Queued
.9Ts: 2012/08/10 23:58:06.386165 Dispatched
.9Ts: 2012/08/10 23:58:06.401038 dispatchComplete
.9Ts: 2012/08/10 23:58:06.402788 Complete
.9N ip addr=9.12.6.9 port=24146 832 6176 .9Cl/daytrader/ap
9CPU:Web DayTrader-EE6#web.wa/TradeAppServlet 1 0 29
9CPU:Web DayTrader-EE6#web.wa//quote.jsp 1 0 61
9CPU:Web DayTrader-EE6#web.wa//displayQuote.jsp 1 2 214
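From the timestamps in the .9Ts lines, you can derive how long the request waited on the queue and how long it was in dispatch. The following sketch (the class and method names are ours) computes these deltas for the record in Example 8-3:

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Sketch: derive queue and dispatch times from the SMF 120-9
 *  timestamps as printed by the SMF Browser (Example 8-3). */
public class RequestTimes {
    static final DateTimeFormatter F =
        DateTimeFormatter.ofPattern("yyyy/MM/dd HH:mm:ss.SSSSSS");

    /** Microseconds elapsed between two browser-formatted timestamps. */
    static long micros(String from, String to) {
        return Duration.between(LocalDateTime.parse(from, F),
                                LocalDateTime.parse(to, F)).toNanos() / 1_000;
    }

    public static void main(String[] args) {
        String received         = "2012/08/10 23:58:06.377368";
        String queued           = "2012/08/10 23:58:06.377459";
        String dispatched       = "2012/08/10 23:58:06.386165";
        String dispatchComplete = "2012/08/10 23:58:06.401038";
        String complete         = "2012/08/10 23:58:06.402788";

        System.out.println("Queue wait (us):  " + micros(queued, dispatched));          // 8706
        System.out.println("In dispatch (us): " + micros(dispatched, dispatchComplete)); // 14873
        System.out.println("Total (us):       " + micros(received, complete));           // 25420
    }
}
```

For this request, most of the elapsed time was spent in dispatch rather than waiting on the queue.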
Example 8-4 shows the detailed (DEFAULT) output that is created by the SMF Browser program for the same SMF 120.9 (Request Activity) record that we analyzed in Example 8-3 on page 382. The detailed output contains much information.
One thing that might be of interest is the transaction class that is used by the transaction.
Example 8-4 Subtype 9 detailed output
--------------------------------------------------------------------------------
Record#: 694;
Type: 120; Size: 3624; Date: Fri Aug 10 19:58:06 EDT 2012;
SystemID: SC64; SubsystemID: WAS; Flag: 94;
Subtype: 9 (REQUEST ACTIVITY);
#Subtype Version: 2;
Index of this record: 1;
Total number of records: 1;
record continuation token * 000000c4 0120a481 -------- -------- *
#Triplets: 11;
Triplet #: 1; offsetDec: 204; offsetHex: cc; lengthDec: 76; lengthHex: 4c; count: 1;
Triplet #: 2; offsetDec: 280; offsetHex: 118; lengthDec: 156; lengthHex: 9c; count: 1;
Triplet #: 3; offsetDec: 436; offsetHex: 1b4; lengthDec: 68; lengthHex: 44; count: 1;
Triplet #: 4; offsetDec: 504; offsetHex: 1f8; lengthDec: 736; lengthHex: 2e0; count: 1;
Triplet #: 5; offsetDec: 1240; offsetHex: 4d8; lengthDec: 132; lengthHex: 84; count: 1;
Triplet #: 6; offsetDec: 1372; offsetHex: 55c; lengthDec: 188; lengthHex: bc; count: 1;
Triplet #: 7; offsetDec: 1560; offsetHex: 618; lengthDec: 420; lengthHex: 1a4; count: 3;
Triplet #: 8; offsetDec: 0; offsetHex: 0; lengthDec: 0; lengthHex: 0; count: 0;
Triplet #: 9; offsetDec: 1980; offsetHex: 7bc; lengthDec: 1644; lengthHex: 66c; count: 3;
Triplet #: 10; offsetDec: 0; offsetHex: 0; lengthDec: 0; lengthHex: 0; count: 0;
Triplet #: 11; offsetDec: 0; offsetHex: 0; lengthDec: 0; lengthHex: 0; count: 0;
Triplet #: 1; Type: PlatformNeutralSection;
Server Info Version : 1;
Cell Short Name : MZCELL;
Node Short Name : MZNODE4;
Cluster Short Name : MZSR01;
Server Short Name : MZSR014;
Server/Controller PID : 65569;
WAS Release : 8;
WAS Release x of .x.y.z: 5;
WAS Release y of .x.y.z: 0;
WAS Release z of .x.y.z: 0;
Reserved * 00000000 00000000 00000000 00000000 *
* 00000000 00000000 00000000 00000000 *
Triplet #: 2; Type: ZosServerInfoSection;
Server Info Version : 2;
System Name (CVTSNAME) : SC64;
Sysplex Name : SANDBOX;
Controller Name : MZSR014;
Controller Job ID : STC24171;
Controller STOKEN * 000002a4 00000062 -------- -------- *
Controller ASID (HEX) * 00a9---- -------- -------- -------- *
CPU Usage Overflow : 0;
CEEGMTO failed/unavailable : 1;
Cluster UUID * c9e1e23b fc4d1532 000002b0 00000004 *
* 00000048 -------- -------- -------- *
Server UUID * c9e1e24f 897f45a6 000002b0 00000004 *
* 00000048 -------- -------- -------- *
Daemon Group Name : MZCELL;
LE GMT Offset (Hours) from CEEGMTO : 0;
LE GMT Offset (Minutes) from CEEGMTO: 0;
LE GMT Offset (Seconds) from CEEGMTO: 0;
System GMT Offset from CVTLDTO (HEX) * ffffca5b 17000000 -------- -------- *
Maintenance Level : gm1215.02;
Reserved * 00000000 00000000 00000000 00000000 *
* 00000000 -------- -------- -------- *
Triplet #: 3; Type: PlatformNeutralRequestInfoSection;
Version : 1;
Dispatch Servant PID (HEX) * 00010083 -------- -------- -------- *
Dispatch Task ID * 25982400 0000003c -------- -------- *
Dispatch TCB CPU : 583;
Completion Minor Code * 00000000 -------- -------- -------- *
Reserved * 00000000 -------- -------- -------- *
Request Type : 2 (HTTP);
Reserved * 00000000 00000000 00000000 00000000 *
* 00000000 00000000 00000000 00000000 *
Triplet #: 4; Type: ZosRequestInfoSection;
Server Info Version : 2;
Time Received * ca002664 00198ea5 00000000 00000000 *
Time Queued * ca002664 001f3b25 00000000 00000000 *
Time Dispatched * ca002664 023f5e74 00000000 00000000 *
Time Dispatch Complete * ca002664 05e0ee74 00000000 00000000 *
Time Complete * ca002664 064e4ca3 00000000 00000000 *
Servant Job Name : MZSR014S;
Servant Job ID : STC24174;
Servant SToken * 000003f0 00000120 -------- -------- *
Servant ASID (HEX) * 00fc---- -------- -------- -------- *
Reserved for alignment * 0000---- -------- -------- -------- *
Servant Tcb Address * 007bdad0 -------- -------- -------- *
Servant TToken * 000003f0 00000120 0000004c 007bdad0 *
CPU Offload : 580;
Servant Enclave Token * 000000c4 0120a481 -------- -------- *
Reserved * 00000000 00000000 00000000 00000000 *
* 00000000 00000000 00000000 00000000 *
Enclave CPU So Far : 2393980;
zAAP CPU So Far : 2393980;
zAAP Eligible on CP : 0;
zIIP on CPU So Far : 0;
zIIP Qual Time So Far : 0;
zIIP CPU So Far : 0;
zAAP Normalization Factor : 256;
Enclave Delete CPU : 2419626;
Enclave Delete zAAP CPU : 2393980;
Enclave Delete zAAP Norm : 256;
Reserved * 00000000 -------- -------- -------- *
Enclave Delete zIIP Norm : 0;
Enclave Delete zIIP Service : 0;
Enclave Delete zAAP Service : 34;
Enclave Delete CPU Service : 34;
Enclave Delete Resp Time Ratio : 2;
Reserved for alignment * 00000000 00000000 00000000 -------- *
GTID * 00000000 00000000 00000000 00000000 *
* 00000000 00000000 00000000 00000000 *
* 00000000 00000000 00000000 00000000 *
* 00000000 00000000 00000000 00000000 *
* 00000000 00000000 00------ -------- *
Reserved for alignment * 000000-- -------- -------- -------- *
Dispatch Timeout : 0;
Transaction Class : ;
Flags * 84d00000 -------- -------- -------- *
Reserved * 00000000 00000000 00000000 00000000 *
* 00000000 00000000 00000000 00000000 *
Classification attributes: ;
Stalled thread dump action : 3;
CPU time used dump action : 3;
DPM dump action : 3;
Timeout recovery : 2;
Dispatch timeout : 300;
Queue timeout : 297;
Request timeout : 180;
CPU time used limit : 0;
DPM interval : 0;
Message Tag : ;
Obtained affinity : ;
Routing affinity : C9E1E24F897F45A6000002B00000000400000048sn6zGpx_39-MGb4qNtoil8h;
Triplet #: 5; Type: TimeStampSection;
Time Received : 2012/08/10 23:58:06.377368;
Time Queued : 2012/08/10 23:58:06.377459;
Time Dispatched : 2012/08/10 23:58:06.386165;
Time Dispatch Complete: 2012/08/10 23:58:06.401038;
Time Complete : 2012/08/10 23:58:06.402788;
Reserved for alignment * 0000---- -------- -------- -------- *
Triplet #: 6; Type: NetworkDataSection;
Version : 1;
Bytes Received : 832;
Bytes Sent : 6176;
Target Port : 99;
Origin String Length: 27;
Origin String : ip addr=9.12.6.9 port=24146;
Reserved * 00000000 00000000 00000000 00000000 *
* 00000000 00000000 00000000 00000000 *
Triplet #: 7; Type: ClassificationDataSection;
Version : 1;
Data Type : 6 (URI);
Data Length: 14;
Data : /daytrader/app (EBCDIC);
Triplet #: 7; Type: ClassificationDataSection;
Version : 1;
Data Type : 7 (Target Hostname);
Data Length: 19;
Data : wtsc64.itso.ibm.com (EBCDIC);
Triplet #: 7; Type: ClassificationDataSection;
Version : 1;
Data Type : 8 (Target Port);
Data Length: 2;
Data : 99 (EBCDIC);
Triplet #: 9; Type: CpuUsageSection;
Version : 1;
Data Type : 2;
Request Type : 2 (Web Container);
CPU Time : 29;
Elapsed Time : 0;
Invocation Count: 1;
String 1 Length : 21;
String 1 : 21 (DayTrader-EE6#web.war);
String 2 Length : 15;
String 2 : 15 (TradeAppServlet);
Triplet #: 9; Type: CpuUsageSection;
Version : 1;
Data Type : 2;
Request Type : 2 (Web Container);
CPU Time : 61;
Elapsed Time : 0;
Invocation Count: 1;
String 1 Length : 21;
String 1 : 21 (DayTrader-EE6#web.war);
String 2 Length : 10;
String 2 : 10 (/quote.jsp);
Triplet #: 9; Type: CpuUsageSection;
Version : 1;
Data Type : 2;
Request Type : 2 (Web Container);
CPU Time : 214;
Elapsed Time : 2;
Invocation Count: 1;
String 1 Length : 21;
String 1 : 21 (DayTrader-EE6#web.war);
String 2 Length : 17;
String 2 : 17 (/displayQuote.jsp);
--------------------------------------------------------------------------------
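The triplets at the top of the detailed output are the usual SMF self-describing layout: each triplet gives the offset, length, and count of one section type within the record, and a zero count means the section is absent. Note that in this browser output, the printed lengthDec appears to be the combined length of all count instances (triplet 7 at offset 1560 with length 420 ends exactly where triplet 9 starts, at offset 1980). That reading, and the class below, are ours:

```java
/** Sketch of the SMF triplet layout as printed in Example 8-4.
 *  Offsets and lengths in main() are the decimal values for record 694. */
public class Triplet {
    final int offset, length, count;

    Triplet(int offset, int length, int count) {
        this.offset = offset;
        this.length = length;
        this.count = count;
    }

    /** In this browser output, lengthDec covers all count instances,
     *  so the section ends at offset + length. */
    int end() { return offset + length; }

    /** A zero count (triplets 8, 10, and 11 above) means no section. */
    boolean present() { return count > 0; }

    public static void main(String[] args) {
        Triplet t7 = new Triplet(1560, 420, 3);  // ClassificationDataSection
        Triplet t9 = new Triplet(1980, 1644, 3); // CpuUsageSection
        // Sections are laid out back to back: triplet 7 ends where 9 begins
        System.out.println(t7.end() == t9.offset);
    }
}
```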
8.3.2 WebSphere Application Server Performance Monitoring Infrastructure
A typical web application consists of a web server, application server, and a database. Monitoring and tuning the application server is critical to the overall performance. Performance Monitoring Infrastructure (PMI) is the core monitoring infrastructure for WebSphere Application Server. PMI provides a comprehensive set of data that explains the runtime and application resource behavior. For example, PMI provides database connection pool size, servlet response time, Enterprise JavaBeans (EJB) method response time, Java virtual machine (JVM) garbage collection time, and processor usage.
Using PMI data, performance bottlenecks in the application server can be identified and addressed. For example, one of the PMI statistics in the Java DataBase Connectivity (JDBC) connection pool is the number of statements that are discarded from the prepared statement cache, which we use as an example to illustrate the value of PMI data. This statistic can be used to adjust the prepared statement cache size to minimize the discards and to improve the database query performance.
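To see why a too-small cache produces discards, consider a minimal LRU model (illustrative only; this is not the WebSphere Application Server implementation). Once the per-connection cache is full, preparing a new distinct statement evicts the least recently used entry, and each eviction shows up in PMI as a discard:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative LRU model of a prepared statement cache, keyed by
 *  statement text. Evictions correspond to PrepStmtCacheDiscardCount. */
public class StmtCacheModel {
    int discards = 0;
    final Map<String, Object> cache;

    StmtCacheModel(int capacity) {
        // accessOrder=true gives LRU behavior; removeEldestEntry evicts
        cache = new LinkedHashMap<String, Object>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Object> e) {
                if (size() > capacity) {
                    discards++;
                    return true;
                }
                return false;
            }
        };
    }

    void prepare(String sql) { cache.putIfAbsent(sql, new Object()); }

    public static void main(String[] args) {
        StmtCacheModel m = new StmtCacheModel(10); // small cache
        for (int i = 0; i < 60; i++) {
            m.prepare("stmt-" + i);                // 60 distinct statements
        }
        System.out.println(m.discards);            // 50 discards
    }
}
```

With a capacity of 10 and 60 distinct statements, 50 statements are discarded; raising the capacity to 60 drops the discards to zero, which is the effect the tuning in this section demonstrates.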
PMI data can be monitored and analyzed by IBM Tivoli® Performance Viewer, other Tivoli tools, your own applications, or third-party tools. As Tivoli Performance Viewer ships with WebSphere Application Server, we use it to visualize the PMI data in our example.
Java Platform, Enterprise Edition (Java EE) 1.4 includes a Performance Data Framework that is defined as part of JSR-077 (Java Platform, Enterprise Edition Management Specification). This framework specifies the performance data that must be available for various Java EE components. WebSphere Application Server PMI complies with Java EE 1.4 standards by implementing the Java EE 1.4 Performance Data Framework. In addition to providing statistics that are defined in Java EE 1.4, PMI provides additional statistics about the Java EE components, such as servlets and enterprise beans, and WebSphere Application Server-specific components, such as thread pools.
Obtaining PMI data
PMI data covers many performance aspects of web applications. As this publication focuses on web applications that access DB2 for z/OS database resources, we limit the discussion to a few parameters that are related to database access as an illustration of how to obtain and use PMI information.
You activate PMI data collection at the Application Server level. To do so, expand Monitoring and Tuning in the left pane of the administration console and click Performance Monitoring Infrastructure. Select the application server that you want to collect data for (MZSR014 in our case) and click Start Monitoring, as shown in Figure 8-19.
Figure 8-19 Start PMI collection
You receive confirmation that monitoring has started and the server is now in the Monitored status, as shown in Figure 8-20.
Figure 8-20 PMI collection that is activated for the application server
PMI can collect many different types of data at various levels of detail. To configure this collection, click Monitoring and Tuning → Request Metrics. We used the standard settings in our example. For more information about the different options and levels of granularity at which PMI data can be collected, see the following website:
Viewing PMI data
To view PMI data, use the Tivoli Performance Viewer tool that is built into WebSphere Application Server. To use it, expand Monitoring and Tuning in the left pane of the administration console and click Performance Viewer → Current activity. Then, select the server that you want to see the PMI data for. A window similar to Figure 8-21 opens.
Figure 8-21 Tivoli Performance Viewer - Servlet Summary Report
In this case, the servlet summary report of our DayTrader workload is shown. On the left, you have many options to display different summary reports and look at the different performance modules that visualize the PMI data. On the right, you see the (selected) report, which is the servlet summary report in this example. It shows the name of the servlet, the application it belongs to, and the average response time.
As you are interested in performance aspects that affect database access, look at the PMI data of the connection pool that is associated with our data source (we are using a type 4 connection during this DayTrader run). To do so, expand Performance modules → JDBC Connection Pools → JDBC Universal Driver (XA) and select jdbc/TradeDataSource. The result is shown in Figure 8-22. This snapshot is from the time when the workload was increasing. The CreateCount went up quickly from 0 to 50.
Figure 8-22 JDBC Connection Pool statistics at startup
The AllocateCount continues to go up as more transactions run. Notice that the count does not go beyond 50 connections. You can use the administration console to verify whether 50 is the maximum size of the connection pool by clicking Data Sources → TradeDataSourceXA → Connection pools, as shown in Figure 8-23.
Figure 8-23 Connection pool properties
Using performance advisors
Tivoli Performance Viewer also has some performance advisor modules that are built into it. Select the Advisor option in the left pane to activate it. The bottom part of the advisor output window contains a number of alerts and configuration tips, as shown in Figure 8-24.
Figure 8-24 Advisor output
Zoom in on the first alert, “TUNE0201: The rate of discard fro..”, by clicking it. The result is shown in Figure 8-25. The advisor indicates that there are many discards from the WebSphere Application Server statement cache. Creating a prepared statement object is a rather expensive operation, so discarding and creating many prepared statement objects is likely to affect performance. The discard rate is high, at 1400/sec.
Figure 8-25 Tuning advice TUNE0201W
To verify this discard rate, go to the performance metrics of the connection pool and look for the PrepStmtCacheDiscardCount statistic, as shown in Figure 8-26 on page 393.
Figure 8-26 Connection pool - PrepStmtCacheDiscardCount
This statistic confirms the alert from the performance advisor. So, what is the current setting for the statement cache size? You can verify the setting by going to the administration console and clicking Data sources → TradeDatasourceXA → WebSphere Application Server data source properties, as shown in Figure 8-27.
Figure 8-27 Data source statement cache size
A value of 10 is indeed low for this workload. Change it to 60. The result of the change is shown in Figure 8-28.
Figure 8-28 PrepStmtCacheDiscardCount after the change
As you can see, the discard count is now down to zero from 1400/sec, which is a great improvement.
This is just one example of how to use PMI and Tivoli Performance Viewer to analyze WebSphere Application Server performance data. For more information about the usage of PMI, see the WebSphere Application Server Information Center PMI topics found at:
A good article about WebSphere Application Server performance called “Case study: Tuning WebSphere Application Server V7 and V8 for performance” can be found at:
8.4 Monitoring from the DB2 side
Even after an application moves to production, it is important to keep monitoring the application. Over time, the behavior might change, for example, because the workload increases or the data becomes disorganized. Therefore, it is important to continuously, or at least periodically, check the performance of your applications. Dealing with all DB2 performance aspects is beyond the scope of this book, but this section provides an overview of the information that is available and how to analyze DB2 performance.
This section describes the following topics:
8.4.1 Which information to gather
Most installations run with a set of standard DB2 traces that are always active. The information is often used for chargeback purposes, but it contains much information that can be used to check the health of the system. As a preferred practice, have the following traces permanently active on all your DB2 systems:
DB2 Statistics trace classes
 – 1: System-wide information about the work that is performed by the DB2 system.
 – 3: Information about deadlocks, timeouts, lock escalations, and long running units of work. This trace class is valuable for identifying concurrency problems.
 – 4: DDF exception conditions.
 – 5: Data sharing statistics.
 
Tip: STATIME DSNZPARM determines the interval at which DB2 writes out its statistics information for classes 1 and 5. The default value is 5 minutes in Version 9 and 1 minute in Version 10. Use STATIME=1. The cost of gathering this information is negligible, and it provides valuable information for analyzing performance problems.
In DB2 10, IFCIDs 0001, 0002, 0202, 0217, 0225, and 0230 are no longer controlled by STATIME. These trace records are written at fixed, one-minute intervals.
DB2 Accounting trace classes
 – 1: Total elapsed time (ET) and CPU time of the thread/plan and many useful counters.
 – 2: In addition to the total time, the time (ET and CPU) spent inside DB2 is collected. This trace class is more expensive (typically around 2.5% processing time for online transactions) than gathering class 1 information. It can have an impact, especially for applications that issue many DB2 requests, such as fetch intensive applications (up to 10% in heavy batch environments).
However, this information is required to determine whether the time is spent in DB2 or elsewhere, and is valuable. So, unless you have a processor issue, activate accounting class 2 and leave it on. If that is not possible, activate it for one hour each day during a high activity period.
 – 3: Activating this trace class adds an additional level of granularity as you can use it to determine when DB2 must wait for something, for example, for a lock to become available, how many times that occurs, and how long you wait for it.
Gathering accounting class 3 information is cheap, so it is not a problem to have it on at all times. Only in cases where transactions experience many lock/latch contentions can tracking them have a noticeable impact. If that is the case, you can disable accounting trace class 3.
 – 7: Similar to accounting class 2, information about the ET and CPU time that is used is gathered, but at the package/DBRM level.
 – 8: Similar to accounting class 3 information, but at the package/DBRM level.
 – 10: This class obtains additional information at the package level about the type of SQL statements being run, locking information, and buffer pool related information.
Typically, accounting classes 1, 2, 3, 7, and 8 are turned on all the time.
The DB2 statistics and accounting information is normally written to SMF. DB2 statistics records use SMF type 100 and DB2 accounting records use SMF type 101 records. Before you send data to SMF, make sure that SMF is enabled so that it can write DB2 trace record types 100 and 101 (and 102, which is used for performance type records). For more information about this topic, see “Enabling SMF 120 data collection” on page 378.
Both traces can be started at DB2 start time through the SMFSTAT (statistics) and SMFACCT (for accounting) DSNZPARMs.
These traces must be started on each of the members of a DB2 data sharing group to be able to get the complete picture of all the work in the data sharing group.
8.4.2 Creating DB2 accounting records at a transaction boundary
To have a good understanding of the activity that is performed by WebSphere Application Server transactions against the database, it is important to create a DB2 accounting record at the transaction boundary at commit time. As WebSphere Application Server can use either RRS (type 2) or DRDA (type 4) attach, this section briefly explains how to create granular DB2 accounting data for each type.
Ensuring DB2 accounting records when using an RRS (T2)
Typically, DB2 creates an accounting record when a thread ends, or in the case of thread reuse, when a new user performs a sign-on operation. When you use WebSphere Application Server, a thread can be reused many times before it ends or a new user comes in, and a DB2 accounting record is produced. This situation makes it difficult to analyze the performance of such applications, as sometimes a thread runs 10 actual transactions before it produces a DB2 accounting record and other times only 5.
To avoid this situation, you can direct the RRS attach to write a DB2 accounting record at commit time (provided there are no open held cursors). The easiest way to achieve this task from a WebSphere Application Server application is to set the accountingInterval custom property on the data source to COMMIT by using the administration console, as shown in Figure 8-29 on page 397.
Figure 8-29 Specifying the accountingInterval custom property
If the transaction has no open WITH HOLD cursors, each time a commit point is reached (the application issues SRRCMIT explicitly or implicitly), DB2 cuts an accounting record. If the accounting interval is COMMIT and an SRRCMIT is issued while a held cursor is open, the accounting interval spans that commit and ends at the next valid accounting interval end point (such as the next SRRCMIT that is issued without open held cursors, application termination, or SIGNON with a new authorization ID).
Ensuring DB2 accounting records when using DRDA (T4)
To make sure that DB2 cuts an accounting record at the end of each transaction when using type 4 connectivity through DRDA, you do not have to specify a custom property. DB2 cuts an accounting record at commit time for DRDA work when the following conditions are true:
DSNZPARM CMTSTAT=INACTIVE
At commit time, DB2 is not using one or more of the following items:
 – A held cursor.
 – A held LOB locator.
 – A package that is bound with KEEPDYNAMIC(YES). However, if KEEPDYNAMIC is the only reason that prevents the thread from being pooled, an accounting record is still created. (A typical example of this is SAP applications.)
 – A declared temporary table that is active (the table was not explicitly dropped through the DROP TABLE statement or the ON COMMIT DROP TABLE clause on the DECLARE GLOBAL TEMPORARY TABLE statement).
If these conditions are met, a DB2 accounting record is cut and the WLM enclave is reset.
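The conditions can be summarized as a small decision function (a sketch; the method and parameter names are ours, not a DB2 API):

```java
/** Sketch of the rules above for DRDA work with CMTSTAT=INACTIVE:
 *  is an accounting record cut (and the DBAT pooled) at commit? */
public class CommitAccounting {
    static boolean recordCutAtCommit(boolean heldCursor,
                                     boolean heldLobLocator,
                                     boolean activeDeclaredTempTable,
                                     boolean keepDynamicYes) {
        // A held cursor, held LOB locator, or active declared temporary
        // table keeps the thread active and suppresses the record.
        if (heldCursor || heldLobLocator || activeDeclaredTempTable) {
            return false;
        }
        // KEEPDYNAMIC(YES) alone prevents pooling, but an accounting
        // record is still created (the SAP case mentioned above).
        return true;
    }

    public static void main(String[] args) {
        System.out.println(recordCutAtCommit(false, false, false, true));
        System.out.println(recordCutAtCommit(true, false, false, false));
    }
}
```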
8.4.3 DB2 rollup accounting
In DB2 8, rollup accounting was introduced for RRS and DRDA workloads. The idea behind rollup accounting is to reduce the number of SMF records that are produced by high volume DB2 workloads. Instead of writing a DB2 accounting record for each transaction, you can use rollup accounting to write an accounting record every x transactions, where x is determined by the value of ACCUMACC DSNZPARM. With ACCUMACC=10, for example, DB2 writes accounting records after 10 transactions complete.
When you use accounting rollup, you can also specify how DB2 aggregates the accounting records by using ACCUMUID DSNZPARM. You can look at ACCUMUID as an SQL GROUP BY specification. There are 18 different settings that you can specify for ACCUMUID. For more information, see DB2 10 for z/OS Installation and Migration Guide, GC19-2974.
In our example, we use ACCUMUID=2 during some of our tests. “2” indicates that the aggregation is done by “user application name” or “transaction name”; this is the value of the clientApplicationInformation property or CURRENT CLIENT_APPLNAME special register value.
So, with ACCUMACC=10 and ACCUMUID=2, if you run 20 transactions named tran_1 and 10 transactions of tran_2, DB2 produces three accounting records; one for all executions of tran_2 and two for the executions of tran_1. The information in such a rollup accounting record is the sum of all the work of these 10 transactions.
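The rollup arithmetic can be simulated as follows (a sketch; when and how DB2 flushes partial groups is simplified here):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simulation of ACCUMACC/ACCUMUID rollup as described above:
 *  transactions are grouped by transaction name (ACCUMUID=2) and one
 *  accounting record is written per group of ACCUMACC completions. */
public class RollupSim {
    static int recordsWritten(List<String> tranNames, int accumacc) {
        Map<String, Integer> pending = new HashMap<>();
        int records = 0;
        for (String name : tranNames) {
            int n = pending.merge(name, 1, Integer::sum);
            if (n == accumacc) {          // group complete: write a record
                records++;
                pending.put(name, 0);
            }
        }
        // Simplification: count each leftover partial group as one record
        // written later (for example, at thread or interval end).
        for (int n : pending.values()) {
            if (n > 0) records++;
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> trans = new ArrayList<>();
        for (int i = 0; i < 20; i++) trans.add("tran_1");
        for (int i = 0; i < 10; i++) trans.add("tran_2");
        System.out.println(recordsWritten(trans, 10)); // 3, as in the text
    }
}
```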
The advantage of using rollup accounting is obvious. It can reduce the number of (SMF) accounting records that DB2 produces. The disadvantage of using ACCUMACC is that you lose transaction granularity information. With rollup accounting, you can no longer see how each individual transaction performed because all accounting data of x transactions is rolled into a single accounting record. This situation can make it difficult to analyze performance problems, especially when the problem occurred only briefly or when using a high ACCUMACC value.
 
Tip: DB2 10 introduced an option to compress records that are written to SMF by using SMFCOMP=YES DSNZPARM, which compresses the SMF trace record before it is written to the SMF data set. If you use ACCUMACC > 1 to reduce the data volume that is produced by DB2 accounting records, you might want to consider switching to ACCUMACC=NO and SMFCOMP=YES to achieve this task, and keep the transaction level granularity that you give up by using ACCUMACC >1.
Another option to reduce the size of the SMF (offloaded) data sets is to use SMS compression on the data class (COMPACTION=YES) for those data sets.
8.4.4 Analyzing DB2 statistics data
DB2 statistics data provides information about the entire subsystem (or data sharing group). DB2 accounting data provides information about individual transactions (or a number of executions of a transaction when the ACCUMACC DSNZPARM is used).
Start by looking at the overall subsystem statistics. This information can be used to check the overall health of the DB2 system.
 
Note: If this DB2 system runs other work than just transactions coming from WebSphere Application Server, these other transactions are also included in the information of the DB2 statistics record.
We use IBM Tivoli OMEGAMON DB2 Performance Expert on z/OS V5.1.1 batch reporting to look at the DB2 statistics data.
Example 8-5 shows a sample SYSIN to create a DB2 statistics report.
Example 8-5 Create a DB2 statistics report
//SYSIN DD *
DB2PM
* *********************************
* GLOBAL PARMS
* *********************************
GLOBAL
*        Adjust for US East Coast DST
         TIMEZONE (+4)
         FROM(,21:29:00)
TO(,21:30:01)
*        Include the entire data sharing group
         INCLUDE( GROUP(DB0ZG))
***********************************
* STATISTICS REPORTS
***********************************
STATISTICS
            REPORT
LAYOUT (LONG)
EXEC
In our example, we want to look at a one-minute interval. (As the DB2 statistics interval is one minute, we could also have used a STATISTICS TRACE report instead.)
Statistics highlights
Example 8-6 shows the header of the statistics report. It indicates the interval that we are looking at and the DB2 subsystem, and also provides an idea about the number of threads that were created and the number of commits that occurred in the interval you are reporting on.
Example 8-6 Statistics report - highlights section
LOCATION: DB0Z OMEGAMON XE FOR DB2 PERFORMANCE EXPERT (V5R1M1) PAGE: 1-1
GROUP: DB0ZG STATISTICS REPORT - LONG REQUESTED FROM: ALL 21:29:00.00
MEMBER: D0Z1 TO: DATES 21:30:01.00
SUBSYSTEM: D0Z1 INTERVAL FROM: 08/13/12 21:29:00.00
DB2 VERSION: V10 SCOPE: MEMBER TO: 08/13/12 21:30:00.12
---- HIGHLIGHTS ----------------------------------------------------------------------------------------------------
INTERVAL START : 08/13/12 21:29:00.00 SAMPLING START: 08/13/12 21:29:00.00 TOTAL THREADS : 80.00
INTERVAL END : 08/13/12 21:30:00.12 SAMPLING END : 08/13/12 21:30:00.12 TOTAL COMMITS : 85191.00
INTERVAL ELAPSED: 1:00.120127 OUTAGE ELAPSED: 0.000000 DATA SHARING MEMBER: N/A
The number of commits includes the commits from both local attach work (such as TSO, RRS, and utilities) and the commits that are received for DRDA work. The number of threads is the number of create thread operations, which does not include the database access threads (DBATs) that are associated with DRDA work.
SQL DML and dynamic statement cache sections
To get a high-level idea about the type of SQL work that occurred during the reporting interval, you can use the SQL DML section (see Example 8-7).
Example 8-7 Statistics report - SQL DML section
SQL DML QUANTITY /SECOND /THREAD /COMMIT
--------------------------- -------- ------- ------- -------
SELECT 0.00 0.00 0.00 0.00
INSERT 3229.00 53.71 40.36 0.04
NUMBER OF ROWS 3229.00 53.71 40.36 0.04
UPDATE 13865.00 230.62 173.31 0.16
NUMBER OF ROWS 14688.00 244.31 183.60 0.17
MERGE 0.00 0.00 0.00 0.00
DELETE 823.00 13.69 10.29 0.01
NUMBER OF ROWS 823.00 13.69 10.29 0.01
PREPARE 109.9K 1827.56 1373.41 1.29
DESCRIBE 27676.00 460.34 345.95 0.32
DESCRIBE TABLE 0.00 0.00 0.00 0.00
OPEN 95096.00 1581.77 1188.70 1.12
CLOSE 1610.00 26.78 20.13 0.02
FETCH 1610.00 26.78 20.13 0.02
NUMBER OF ROWS 110.4K 1836.04 1379.79 1.30
TOTAL DML 253.8K 4221.25 3172.27 2.98
As the DayTrader application that we used during our testing is a JDBC application using dynamic SQL, it is important to verify that you have a good hit ratio in the global dynamic statement cache. You can verify that in the DYNAMIC SQL STMT section in the statistics report, as shown in Example 8-8.
Example 8-8 Statistics report - dynamic SQL statements section
DYNAMIC SQL STMT QUANTITY /SECOND /THREAD /COMMIT
--------------------------- -------- ------- ------- -------
PREPARE REQUESTS 109.9K 1827.56 1373.41 1.29
FULL PREPARES 7.00 0.12 0.09 0.00
SHORT PREPARES 109.9K 1827.42 1373.31 1.29
GLOBAL CACHE HIT RATIO (%) 99.99 N/A N/A N/A
Almost all prepares result in a short prepare, which means that the statement was found in the global dynamic statement cache, yielding a high cache hit ratio of 99.99%.
The DayTrader workload uses parameter markers, which increases the chance of finding a matching statement in the dynamic statement cache.
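The effect of parameter markers on the cache is easy to see: the dynamic statement cache matches on statement text, so markers keep the text identical across executions, while embedded literals make every execution a distinct statement. The table and column names below are illustrative, not the actual DayTrader schema:

```java
/** Why parameter markers help the dynamic statement cache:
 *  identical text across executions -> short prepares (cache hits). */
public class MarkerDemo {
    static String withMarker(String symbol) {
        // The symbol travels as a bind value; the text never changes
        return "SELECT price FROM quote WHERE symbol = ?";
    }

    static String withLiteral(String symbol) {
        // A different statement text for every value -> full prepare
        return "SELECT price FROM quote WHERE symbol = '" + symbol + "'";
    }

    /** Hit ratio as in Example 8-8: short prepares / prepare requests. */
    static double hitRatioPct(long fullPrepares, long totalPrepares) {
        return 100.0 * (totalPrepares - fullPrepares) / totalPrepares;
    }

    public static void main(String[] args) {
        System.out.println(withMarker("IBM").equals(withMarker("AAPL")));   // true
        System.out.println(withLiteral("IBM").equals(withLiteral("AAPL"))); // false
        // Example 8-8: 7 full prepares out of roughly 109,900 requests
        System.out.printf("%.2f%%%n", hitRatioPct(7, 109_900));
    }
}
```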
Subsystem services and DDF and DRDA location sections
The SUBSYSTEM SERVICES section (Example 8-9), and the DRDA REMOTE LOCS section (Example 8-10) can be used to quickly determine whether the bulk of the work is using a type 2 or a type 4 connection.
Using a type 4 connection
Example 8-9 shows the SUBSYSTEM SERVICES section for our run, which used a type 4 connection.
Example 8-9 Statistics report - subsystem services section
SUBSYSTEM SERVICES QUANTITY /SECOND /THREAD /COMMIT
--------------------------- -------- ------- ------- -------
IDENTIFY 20.00 0.33 0.25 0.00
CREATE THREAD 80.00 1.33 1.00 0.00
SIGNON 0.00 0.00 0.00 0.00
TERMINATE 100.00 1.66 1.25 0.00
ROLLBACK 0.00 0.00 0.00 0.00
COMMIT PHASE 1 0.00 0.00 0.00 0.00
COMMIT PHASE 2 0.00 0.00 0.00 0.00
READ ONLY COMMIT 1.00 0.02 0.01 0.00
UNITS OF RECOVERY INDOUBT 0.00 0.00 0.00 0.00
UNITS OF REC.INDBT RESOLVED 0.00 0.00 0.00 0.00
SYNCHS(SINGLE PHASE COMMIT) 80.00 1.33 1.00 0.00
QUEUED AT CREATE THREAD 0.00 0.00 0.00 0.00
SUBSYSTEM ALLIED MEMORY EOT 0.00 0.00 0.00 0.00
SUBSYSTEM ALLIED MEMORY EOM 0.00 0.00 0.00 0.00
SYSTEM EVENT CHECKPOINT 0.00 0.00 0.00 0.00
HIGH WATER MARK IDBACK 9.00 0.15 0.11 0.00
HIGH WATER MARK IDFORE 2.00 0.03 0.02 0.00
HIGH WATER MARK CTHREAD 9.00 0.15 0.11 0.00
When you use a type 2 connection, you expect to see high numbers in the various commit counters in the SUBSYSTEM SERVICES section, and when you use type 4, the commits show up in the DRDA REMOTE LOCS section of the DB2 statistics report. It is clear from the reports that this workload was using a type 4 connection, as almost all commit requests are in the (SINGLE PHASE) COMMITS bucket in the DRDA REMOTE LOCS section.
When the number of active (allied) threads exceeds the CTHREAD DSNZPARM value, new create thread requests are queued and the QUEUED AT CREATE THREAD counter is incremented. Typically, you want to see a zero value in this field. However, you might hit CTHREAD when there is a significant spike in the workload, or when things slow down for some reason. In those cases, it is better to queue the thread requests, or even deny them, than to let them start processing. A large CTHREAD value allows much work to start, but when the system is flooded, adding more work makes things worse. Therefore, queuing work at create thread time, or even outside DB2 (in the application server), is better than leaving the gates wide open (using a high CTHREAD value) and adding more work to a system that is already under stress.
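The quick check described above (compare the local single-phase commit counter against the DRDA (SINGLE PHASE) COMMITS counter) can be scripted. The helper below is purely illustrative and not part of any DB2 tooling; it only encodes the majority rule stated in the text.

```python
def dominant_connection_type(synchs_single_phase: float,
                             drda_single_phase_commits: float) -> str:
    """Rough classification: type 2 (RRS) commits show up in the
    SUBSYSTEM SERVICES section, type 4 (DRDA) commits in the
    DRDA REMOTE LOCS section.  (Helper name and the simple majority
    rule are illustrative assumptions.)"""
    total = synchs_single_phase + drda_single_phase_commits
    if total == 0:
        return "no commit activity"
    return ("type 4" if drda_single_phase_commits > synchs_single_phase
            else "type 2")

# Numbers from Example 8-9 / 8-10: 80 local synchs vs 85,110 DRDA commits
print(dominant_connection_type(80.0, 85110.0))  # type 4
```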
Example 8-10 Statistics report - DRDA remote locations section
DRDA REMOTE LOCS SENT RECEIVED
--------------------------- -------- --------
TRANSACTIONS N/A N/A
CONVERSATIONS 0.00 12447.00
CONVERSATIONS QUEUED 0.00
CONVERSATIONS DEALLOCATED 0.00
SQL STATEMENTS 0.00 412.6K
(SINGLE PHASE) COMMITS 0.00 85110.00
(SINGLE PHASE) ROLLBACKS 0.00 0.00
ROWS 110.4K 0.00
MESSAGES 814.8K 814.8K
BYTES 105.3M 135.9M
BLOCKS 188.6K 0.00
MESSAGES IN BUFFER N/A
CONT->LIM.BLOCK FETCH SWTCH N/A
STATEMENTS BOUND AT SERVER N/A
PREPARE REQUEST N/A N/A
LAST AGENT REQUEST N/A N/A
TWO PHASE COMMIT REQUEST N/A N/A
TWO PHASE BACKOUT REQUEST N/A N/A
FORGET RESPONSES N/A N/A
COMMIT RESPONSES N/A N/A
BACKOUT RESPONSES N/A N/A
THREAD INDOUBT-REM.L.COORD. 0.00
COMMITS DONE-REM.LOC.COORD. N/A
BACKOUTS DONE-REM.L.COORD. N/A
Because this is a workload that is using a type 4 connection, the work enters DB2 through the DDF address space, in which case it is interesting to have a look at the GLOBAL DDF ACTIVITY section as well, as shown in Example 8-11 on page 403.
Well-behaved transactions run a number of SQL statements and issue a commit. Then, the DBAT (which represents the thread in DB2) and the connection that is tied to the transport in the Data Server driver are disconnected from each other. The connection goes inactive (waiting for the next request from the application server to arrive) and the DBAT is put into a pool (so that it can be reused by other connections that must run SQL statements). These types of connections (that can go inactive at commit) are also called type 2 inactive connections, and the DBATs are called pooled DBATs.
The CMTSTAT subsystem parameter controls whether threads are made active or inactive after they successfully commit or roll back and hold no cursors. A thread can become inactive only if it holds no cursors, has no temporary tables that are defined, and runs no statements from the dynamic statement cache.
 
Note: Type 2 (inactive) connections have nothing to do with the type of Java driver that is used by the application. In fact, DB2 type 2 (inactive) connections are always associated with work entering DB2 through DRDA, which for Java applications means a type 4 driver.
For more information about setting active/inactive connections, see DB2 10 for z/OS Installation and Migration Guide, GC19-2974.
When a connection wants to process an SQL request and enters the DB2 server, the request is put on a queue to allow a DBAT to be selected from the pool, or created, to process the request.
The ACC QU INACT CONNS (TYPE 2) counter indicates how many of these inactive connections were put on this queue during the interval that you are looking at. It is a good indicator of the amount of DRDA work that goes through the system.
Typically, a connection is only on that queue for a short time. Since Version 10, DB2 provides information about the MIN/MAX and AVG QUEUE TIME in case you suspect that there is a problem with connections not being able to obtain a DBAT quickly.
If the maximum number of DBATs is reached (MAXDBAT DSNZPARM), new requests are queued and the DBAT/CONN QUEUED-MAX ACTIVE counter is incremented. Similar to the QUEUED AT CREATE THREAD counter, you want to have a zero value in this field under normal conditions. But as indicated above, it is often better to queue requests outside DB2 (and have a non-zero value in this field) than to let all the work into DB2 and get stuck during DB2 processing.
Well-behaved transactions commit regularly and allow the connection to become inactive and the DBAT to be pooled so other transactions (connections) can reuse a pooled DBAT. The number of times a pooled DBAT is reused can be found in the DISCON (POOL) DBATS REUSED counter.
There are conditions that do not let a connection go inactive. To optimize resource usage, you want to make sure that these conditions do not apply to your applications. For more information, see Chapter 10, “Managing DB2 threads”, in DB2 10 for z/OS Managing Performance, SC19-2978.
Example 8-11 shows that 110.0K requests came in during this interval, and DB2 was able to reuse a DBAT 110.0K times, which is optimal reuse.
Example 8-11 Statistics report - global DDF activity section
GLOBAL DDF ACTIVITY QUANTITY /SECOND /THREAD /COMMIT
--------------------------- -------- ------- ------- -------
DBAT/CONN QUEUED-MAX ACTIVE 0.00 0.00 0.00 N/A
CONN REJECTED-MAX CONNECTED 0.00 0.00 0.00 N/A
CONN CLOSED - MAX QUEUED 0.00 0.00 0.00 N/A
COLD START CONNECTIONS 0.00 0.00 0.00 0.00
WARM START CONNECTIONS 0.00 0.00 0.00 0.00
RESYNCHRONIZATION ATTEMPTED 0.00 0.00 0.00 0.00
RESYNCHRONIZATION SUCCEEDED 0.00 0.00 0.00 0.00
CUR TYPE 1 INACTIVE DBATS 0.00 N/A N/A N/A
HWM TYPE 1 INACTIVE DBATS 2.00 N/A N/A N/A
TYPE 1 CONNECTIONS TERMINAT 0.00 0.00 N/A N/A
CUR INACTIVE CONNS (TYPE 2) 4.00 N/A N/A N/A
HWM INACTIVE CONNS (TYPE 2) 20.00 N/A N/A N/A
ACC QU INACT CONNS (TYPE 2) 110.0K 1829.89 N/A N/A
CUR QU INACT CONNS (TYPE 2) 2.00 N/A N/A N/A
MIN QUEUE TIME 0.000008 N/A N/A N/A
MAX QUEUE TIME 0.116363 N/A N/A N/A
AVG QUEUE TIME 0.000078 N/A N/A N/A
HWM QU INACT CONNS (TYPE 2) 15.00 N/A N/A N/A
CUR ACTIVE AND DISCON DBATS 20.00 N/A N/A N/A
HWM ACTIVE AND DISCON DBATS 29.00 N/A N/A N/A
HWM TOTL REMOTE CONNECTIONS 22.00 N/A N/A N/A
CUR DISCON DBATS NOT IN USE 14.00 N/A N/A N/A
HWM DISCON DBATS NOT IN USE 29.00 N/A N/A N/A
DBATS CREATED 3.00 N/A N/A N/A
DISCON (POOL) DBATS REUSED 110.0K N/A N/A N/A
CUR ACTIVE DBATS-BND DEALLC 0.00 N/A N/A N/A
HWM ACTIVE DBATS-BND DEALLC 0.00 N/A N/A N/A
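The DDF counters above can be summarized in a small health-check script. The field values are taken from Example 8-11; the helper itself, the wording of the findings, and the idea of flagging any MAXDBAT queuing are illustrative assumptions, not DB2-documented rules.

```python
def ddf_pooling_check(acc_queued_inactive: float,
                      pooled_dbats_reused: float,
                      dbats_created: float,
                      queued_max_active: float,
                      avg_queue_time_sec: float) -> list:
    """Flags the conditions the text describes: DBAT reuse should be
    close to the number of incoming requests, MAXDBAT queuing should be
    zero, and the average wait for a DBAT should be tiny."""
    findings = []
    if acc_queued_inactive > 0:
        reuse_pct = 100.0 * pooled_dbats_reused / acc_queued_inactive
        findings.append(f"DBAT reuse: {reuse_pct:.1f}%")
    if queued_max_active > 0:
        findings.append("requests queued at MAXDBAT - investigate")
    if dbats_created > 0:
        findings.append(f"{dbats_created:.0f} new DBATs created")
    findings.append(f"avg DBAT queue time {avg_queue_time_sec * 1e6:.0f} us")
    return findings

# Values from Example 8-11: 110.0K requests queued, 110.0K DBAT reuses,
# 3 DBATs created, no MAXDBAT queuing, 78 microsecond average queue time
for line in ddf_pooling_check(110_000, 110_000, 3, 0, 0.000078):
    print(line)
```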
Using a type 2 connection
Example 8-12 shows the subsystem services section when you use a type 2 connection.
This information is from a different test run and the time interval is much larger in this case, so you cannot compare this data with the data from the type 4 run. In addition, because a type 2 connection uses a local RRS attach, all work is directed to a single DB2 member, which is the same member that runs the WebSphere Application Server. If there is a need to spread the work across multiple members, which is typically the case in a data sharing environment, you can run a WebSphere Application Server on the other LPAR and use the HTTP server to “spray” the work between the two application servers. For this example, the number of members or the way the workload is distributed is not important.
Example 8-12 Statistics report - subsystem services - T2
SUBSYSTEM SERVICES QUANTITY /SECOND /THREAD /COMMIT
--------------------------- -------- ------- ------- -------
IDENTIFY 58.00 0.11 1.02 0.00
CREATE THREAD 57.00 0.11 1.00 0.00
SIGNON 57.00 0.11 1.00 0.00
TERMINATE 1.00 0.00 0.02 0.00
ROLLBACK 0.00 0.00 0.00 0.00
COMMIT PHASE 1 0.00 0.00 0.00 0.00
COMMIT PHASE 2 147.4K 272.92 2585.53 0.11
READ ONLY COMMIT 1142.3K 2115.34 20.0K 0.89
UNITS OF RECOVERY INDOUBT 0.00 0.00 0.00 0.00
UNITS OF REC.INDBT RESOLVED 0.00 0.00 0.00 0.00
SYNCHS(SINGLE PHASE COMMIT) 0.00 0.00 0.00 0.00
QUEUED AT CREATE THREAD 0.00 0.00 0.00 0.00
SUBSYSTEM ALLIED MEMORY EOT 0.00 0.00 0.00 0.00
SUBSYSTEM ALLIED MEMORY EOM 0.00 0.00 0.00 0.00
SYSTEM EVENT CHECKPOINT 5.00 0.01 0.09 0.00
HIGH WATER MARK IDBACK 59.00 0.11 1.04 0.00
HIGH WATER MARK IDFORE 2.00 0.00 0.04 0.00
HIGH WATER MARK CTHREAD 59.00 0.11 1.04 0.00
Note the high number of COMMIT PHASE 2 and READ ONLY COMMIT requests. Transactions show only COMMIT PHASE 2 because DB2 is the only subsystem that is involved in the processing and no global transaction is defined.
When you use a type 2 connection, applications coming from WebSphere Application Server enter DB2 through RRS, and these RRS connections count towards the IDBACK DSNZPARM (which limits the number of background connections). If the high water mark (HIGH WATER MARK IDBACK) gets close to the IDBACK DSNZPARM value, you might have to increase that value.
For information about IDBACK and other DSNZPARMs, see 4.3.1, “DB2 connectivity installation parameters” on page 138.
Example 8-13 shows the GLOBAL DDF ACTIVITY section from the type 2 run. There is almost no DDF activity (ACC QU INACT CONNS (TYPE 2) is low), which is expected if the entire workload is using a Java type 2 connection (through RRS).
Example 8-13 Statistics report - global DDF activity section - T2
GLOBAL DDF ACTIVITY QUANTITY /SECOND /THREAD /COMMIT
--------------------------- -------- ------- ------- -------
DBAT/CONN QUEUED-MAX ACTIVE 0.00 0.00 0.00 N/A
CONN REJECTED-MAX CONNECTED 0.00 0.00 0.00 N/A
CONN CLOSED - MAX QUEUED 0.00 0.00 0.00 N/A
COLD START CONNECTIONS 0.00 0.00 0.00 0.00
WARM START CONNECTIONS 0.00 0.00 0.00 0.00
RESYNCHRONIZATION ATTEMPTED 0.00 0.00 0.00 0.00
RESYNCHRONIZATION SUCCEEDED 0.00 0.00 0.00 0.00
CUR TYPE 1 INACTIVE DBATS 0.00 N/A N/A N/A
HWM TYPE 1 INACTIVE DBATS 2.00 N/A N/A N/A
TYPE 1 CONNECTIONS TERMINAT 0.00 0.00 N/A N/A
CUR INACTIVE CONNS (TYPE 2) 0.00 N/A N/A N/A
HWM INACTIVE CONNS (TYPE 2) 50.00 N/A N/A N/A
ACC QU INACT CONNS (TYPE 2) 5.00 0.01 N/A N/A
CUR QU INACT CONNS (TYPE 2) 0.00 N/A N/A N/A
MIN QUEUE TIME 0.000002 N/A N/A N/A
MAX QUEUE TIME 0.000010 N/A N/A N/A
AVG QUEUE TIME 0.000005 N/A N/A N/A
HWM QU INACT CONNS (TYPE 2) 37.00 N/A N/A N/A
CUR ACTIVE AND DISCON DBATS 0.22 N/A N/A N/A
HWM ACTIVE AND DISCON DBATS 88.00 N/A N/A N/A
HWM TOTL REMOTE CONNECTIONS 78.00 N/A N/A N/A
CUR DISCON DBATS NOT IN USE 0.22 N/A N/A N/A
HWM DISCON DBATS NOT IN USE 88.00 N/A N/A N/A
DBATS CREATED 1.00 N/A N/A N/A
DISCON (POOL) DBATS REUSED 5.00 N/A N/A N/A
CUR ACTIVE DBATS-BND DEALLC 0.00 N/A N/A N/A
HWM ACTIVE DBATS-BND DEALLC 0.00 N/A N/A N/A
Locking and data sharing locking sections
WebSphere Application Server applications, and Java applications in general, often have issues with locking. This occurs mostly because these applications are not always designed with a high degree of concurrency in mind, and also because DB2 for z/OS locking techniques differ in some areas from those of other database management systems, which can result in different locking behavior when the application is deployed on a System z platform compared to other platforms or DBMSs.
As with all information in the DB2 statistics report, the locking sections contain information about the locking activity for the entire subsystem. In most cases, it is more interesting to look at the locking information for individual applications, so at the DB2 statistics level you typically just verify that the overall locking behavior and activity are fine. Example 8-14 shows the locking and data sharing locking sections from one of the runs that was performed during the project that produced this book.
To perform this type of high-level check, look at the following counters:
Suspensions: When an application requests a lock and that lock cannot be granted because an incompatible lock exists for the resource, the lock request is suspended. You want to compare the number of suspensions against the number of lock requests. If a significant percentage is suspended, applications do not scale, and the locking behavior must be investigated in more detail.
Timeouts: When a lock request is suspended for longer than the timeout value that is specified in the IRLMRWT DSNZPARM, it is timed out and a resource unavailable error is returned to the application. Timeouts are disruptive for applications, so they should be investigated in more detail. When statistics trace class 3 is active, DB2 writes an IFCID 196 trace record when a timeout occurs. This trace record includes the type of lock that is being requested, the resource that is being locked, and the holder of the lock. You want to see a zero (or low) value for the number of timeouts.
Deadlocks: Locks can be requested in such a way that transaction A is waiting for a lock that is held by transaction B, while transaction B is waiting for a lock that is held by transaction A. This is called a deadlock. IRLM detects these deadlocks, stops one of the transactions (known as the victim), and lets the other one continue. When statistics trace class 3 is active, DB2 writes an IFCID 172 trace record when a deadlock occurs. This trace record includes the types of locks that are being requested, the resources that are involved, and the holders and waiters of the locks. Deadlocks can also be disruptive for system throughput, so it is important to understand and minimize them. A zero (or low) value is recommended.
Lock escalation: When a transaction acquires more locks on a table space (part) than the value specified by LOCKMAX for that table space (part), the individual row or page locks are replaced by a gross lock on the table space (part). Depending on whether the application is reading or updating, the lock is escalated to S or X. As for deadlocks and timeouts, statistics trace class 3 creates an IFCID 337 trace record each time lock escalation occurs. The information in this record includes the object that is undergoing lock escalation at that time.
For more information about locking, see DB2 9 for z/OS: Resource Serialization and Concurrency Control, SG24-4725.
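As a simple illustration of the suspension check described above, the following sketch computes the percentage of lock requests that were suspended, using the counters from Example 8-14. The helper name and any threshold you apply to its result are illustrative assumptions.

```python
def lock_suspension_pct(suspensions_lock_only: float,
                        lock_requests: float) -> float:
    """Percentage of lock requests that were suspended because an
    incompatible lock was held; the text suggests investigating further
    when this percentage becomes significant."""
    if lock_requests == 0:
        return 0.0
    return 100.0 * suspensions_lock_only / lock_requests

# Values from Example 8-14: 530 lock-only suspensions, 590.5K lock requests
pct = lock_suspension_pct(530.0, 590_500.0)
print(f"{pct:.3f}% of lock requests were suspended")  # well below 1%
```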
Example 8-14 Statistics report - locking and data sharing locking sections
LOCKING ACTIVITY QUANTITY /SECOND /THREAD /COMMIT
--------------------------- -------- ------- ------- -------
SUSPENSIONS (ALL) 4043.00 67.25 50.54 0.05
SUSPENSIONS (LOCK ONLY) 530.00 8.82 6.63 0.01
SUSPENSIONS (IRLM LATCH) 3448.00 57.35 43.10 0.04
SUSPENSIONS (OTHER) 65.00 1.08 0.81 0.00
TIMEOUTS 0.00 0.00 0.00 0.00
DEADLOCKS 0.00 0.00 0.00 0.00
LOCK REQUESTS 590.5K 9822.30 7381.47 6.93
UNLOCK REQUESTS 132.7K 2207.78 1659.15 1.56
QUERY REQUESTS 22.00 0.37 0.27 0.00
CHANGE REQUESTS 27952.00 464.94 349.40 0.33
OTHER REQUESTS 0.00 0.00 0.00 0.00
LOCK ESCALATION (SHARED) 0.00 0.00 0.00 0.00
LOCK ESCALATION (EXCLUSIVE) 0.00 0.00 0.00 0.00
DRAIN REQUESTS 0.00 0.00 0.00 0.00
DRAIN REQUESTS FAILED 0.00 0.00 0.00 0.00
CLAIM REQUESTS 305.3K 5077.72 3815.91 3.58
CLAIM REQUESTS FAILED 0.00 0.00 0.00 0.00
 
DATA SHARING LOCKING QUANTITY /SECOND /THREAD /COMMIT
--------------------------- -------- ------- ------- -------
GLOBAL CONTENTION RATE (%) 1.21
FALSE CONTENTION RATE (%) 0.68
P/L-LOCKS XES RATE (%) 41.45
LOCK REQUESTS (P-LOCKS) 31956.00 531.54 399.45 0.38
UNLOCK REQUESTS (P-LOCKS) 31100.00 517.30 388.75 0.37
CHANGE REQUESTS (P-LOCKS) 3.00 0.05 0.04 0.00
SYNCH.XES - LOCK REQUESTS 258.0K 4291.54 3225.10 3.03
SYNCH.XES - CHANGE REQUESTS 19856.00 330.27 248.20 0.23
SYNCH.XES - UNLOCK REQUESTS 273.6K 4551.25 3420.27 3.21
BACKGROUND.XES -CHILD LOCKS 1.00 0.02 0.01 0.00
ASYNCH.XES -CONVERTED LOCKS 19311.00 321.21 241.39 0.23
SUSPENDS - IRLM GLOBAL CONT 3107.00 51.68 38.84 0.04
SUSPENDS - XES GLOBAL CONT. 0.00 0.00 0.00 0.00
SUSPENDS - FALSE CONT. MBR 3912.00 65.07 48.90 0.05
SUSPENDS - FALSE CONT. LPAR N/A N/A N/A N/A
REJECTED - XES 5.00 0.08 0.06 0.00
INCOMPATIBLE RETAINED LOCK 0.00 0.00 0.00 0.00
NOTIFY MESSAGES SENT 43.00 0.72 0.54 0.00
NOTIFY MESSAGES RECEIVED 80.00 1.33 1.00 0.00
P-LOCK/NOTIFY EXITS ENGINES 500.00 N/A N/A N/A
P-LCK/NFY EX.ENGINE UNAVAIL 0.00 0.00 0.00 0.00
PSET/PART P-LCK NEGOTIATION 0.00 0.00 0.00 0.00
PAGE P-LOCK NEGOTIATION 2041.00 33.95 25.51 0.02
OTHER P-LOCK NEGOTIATION 1.00 0.02 0.01 0.00
P-LOCK CHANGE DURING NEG. 2042.00 33.97 25.52 0.02
The data sharing locking section is also important:
IRLM global lock suspension: This is similar to local lock suspension, but the lock holder is on another member of the data sharing group. As global IRLM suspensions are more expensive than local lock suspensions, it is important to keep the IRLM global suspension to a minimum.
XES global contention: XES (the z/OS lock manager) does not use as many lock states as IRLM. Therefore, it is possible that XES thinks there is lock contention on a resource, but when IRLM checks its lock information for this resource, there is no contention. This is called XES contention. With the introduction of locking protocol 2 in DB2 Version 8, XES contention is normally low. Therefore, if you see a significant value, it warrants further investigation.
False contention: This occurs when two different resources hash to the same entry in the hash table of the lock structure. In this case, there is no real contention; on closer inspection, it turns out that different resources are being locked. False contention typically happens when the lock structure is not large enough, which increases the likelihood that different resources hash to the same entry.
IBM OMEGAMON XE for DB2 PE on z/OS calculates the GLOBAL CONTENTION RATE for you. Try to keep it below 3 - 5%. It also calculates the FALSE CONTENTION RATE. False contention should be less than 1 - 3% of the total number of IRLM requests sent to XES.
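The rates printed in Example 8-14 are consistent with the following approximation, which measures suspensions against the total IRLM interaction with XES (synchronous and asynchronously converted requests plus the suspensions themselves). The exact monitor formula may differ in detail; this sketch merely reproduces the numbers in the example from its own counters.

```python
def contention_rates(sync_lock, sync_change, sync_unlock,
                     susp_irlm, susp_xes, susp_false, async_converted):
    """Approximation of the global and false contention rates: all
    suspensions (and false suspensions) as a percentage of all lock
    interactions with XES.  (Assumed formula; reproduces Example 8-14.)"""
    total = (sync_lock + sync_change + sync_unlock +
             susp_irlm + susp_xes + susp_false + async_converted)
    global_rate = 100.0 * (susp_irlm + susp_xes + susp_false) / total
    false_rate = 100.0 * susp_false / total
    return global_rate, false_rate

# Counters from Example 8-14 (K values expanded)
g, f = contention_rates(258_000, 19_856, 273_600, 3107, 0, 3912, 19_311)
print(f"global contention {g:.2f}%  false contention {f:.2f}%")
```

With the example's counters this yields 1.21% global and 0.68% false contention, matching the report and comfortably inside the suggested 3 - 5% and 1 - 3% guidelines.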
Buffer pool section
Another set of important statistics to see whether the system is performing well can be found in the buffer pool (BP) section of the DB2 statistics report. DB2 provides this type of information for each of the buffer pools that are used by the different applications in the system. It is a common practice to use different buffer pools for different purposes. Typically, different pools are used for different types of objects, table spaces in one (or more) pools, indexes in others, or by the type of access against the object, that is, random versus sequential.
Another way to assign buffer pools is to dedicate certain pools to specific applications. Typically, the buffer pool assignments are a mixture of all the above. IBM OMEGAMON XE for DB2 PE on z/OS reports on each of the DB2 buffer pools that were used in a specific DB2 statistics interval, but it also rolls up all the BP activity into a single TOTAL section, as shown in Example 8-15. If you want to get an idea about the overall BP activity, this is the place to start your analysis. (The layout of the information for the individual BPs is identical to that of the TOTAL section.)
Example 8-15 Statistics report - buffer pool section
TOTAL READ OPERATIONS QUANTITY /SECOND /THREAD /COMMIT
--------------------------- -------- ------- ------- -------
BPOOL HIT RATIO (%) 99.94
BPOOL HIT RATIO (%) SEQU N/C
BPOOL HIT RATIO (%) RANDOM 99.96
GETPAGE REQUEST 519.0K 8632.22 6487.13 6.09
COND. REQUEST FAILED 0.00 0.00 0.00 0.00
GETPAGE REQUEST-SEQUENTIAL 0.00 0.00 0.00 0.00
COND. REQ-SEQU FAILED 0.00 0.00 0.00 0.00
GETPAGE REQUEST-RANDOM 519.0K 8632.22 6487.13 6.09
COND. REQ-RANDOM FAILED 0.00 0.00 0.00 0.00
SYNCHRONOUS READS 210.00 3.49 2.63 0.00
SYNCHRON. READS-SEQUENTIAL 0.00 0.00 0.00 0.00
SYNCHRON. READS-RANDOM 210.00 3.49 2.63 0.00
GETPAGE PER SYN.READ-RANDOM 2471.29
SEQUENTIAL PREFETCH REQUEST 0.00 0.00 0.00 0.00
SEQUENTIAL PREFETCH READS 0.00 0.00 0.00 0.00
PAGES READ VIA SEQ.PREFETCH 0.00 0.00 0.00 0.00
S.PRF.PAGES READ/S.PRF.READ N/C
LIST PREFETCH REQUESTS 5.00 0.08 0.06 0.00
LIST PREFETCH READS 0.00 0.00 0.00 0.00
PAGES READ VIA LIST PREFTCH 0.00 0.00 0.00 0.00
L.PRF.PAGES READ/L.PRF.READ N/C
DYNAMIC PREFETCH REQUESTED 1166.00 19.39 14.57 0.01
DYNAMIC PREFETCH READS 23.00 0.38 0.29 0.00
PAGES READ VIA DYN.PREFETCH 91.00 1.51 1.14 0.00
D.PRF.PAGES READ/D.PRF.READ 3.96
PREF.DISABLED-NO BUFFER 0.00 0.00 0.00 0.00
PREF.DISABLED-NO READ ENG 0.00 0.00 0.00 0.00
PAGE-INS REQUIRED FOR READ 224.00 3.73 2.80 0.00
TOTAL SORT/MERGE QUANTITY /SECOND /THREAD /COMMIT
--------------------------- -------- ------- ------- -------
MAX WORKFILES CONCURR. USED 0.00 N/A N/A N/A
MERGE PASSES REQUESTED 0.00 0.00 0.00 0.00
MERGE PASS DEGRADED-LOW BUF 0.00 0.00 0.00 0.00
WORKFILE REQ.REJCTD-LOW BUF 0.00 0.00 0.00 0.00
WORKFILE REQ-ALL MERGE PASS 0.00 0.00 0.00 0.00
WORKFILE NOT CREATED-NO BUF 0.00 0.00 0.00 0.00
WORKFILE PRF NOT SCHEDULED 0.00 0.00 0.00 0.00
 
TOTAL WRITE OPERATIONS QUANTITY /SECOND /THREAD /COMMIT
--------------------------- -------- ------- ------- -------
BUFFER UPDATES 46107.00 766.91 576.34 0.54
PAGES WRITTEN 0.00 0.00 0.00 0.00
BUFF.UPDATES/PAGES WRITTEN N/C
SYNCHRONOUS WRITES 0.00 0.00 0.00 0.00
ASYNCHRONOUS WRITES 0.00 0.00 0.00 0.00
PAGES WRITTEN PER WRITE I/O N/C
PAGES WRTN FOR CASTOUT I/O 554.00 9.21 6.92 0.01
NUMBER OF CASTOUT I/O 13.00 0.22 0.16 0.00
HORIZ.DEF.WRITE THRESHOLD 0.00 0.00 0.00 0.00
VERTI.DEF.WRITE THRESHOLD 0.00 0.00 0.00 0.00
DM THRESHOLD 0.00 0.00 0.00 0.00
WRITE ENGINE NOT AVAILABLE N/A N/A N/A N/A
PAGE-INS REQUIRED FOR WRITE 0.00 0.00 0.00 0.00
The BP information is made up of three sections: read activity, sort and merge activity, and write activity. In the read activity section, you want to check the following items:
GETPAGE REQUEST: This is the number of times an SQL statement had to request a page from the DB2 buffer manager component. When an SQL statement must retrieve a row, that row lives on a page, and that page lives in the DB2 buffer pool (or on disk, or in the group buffer pool (GBP) in a data sharing system). So, to obtain the row, there is a request to the DB2 buffer manager component to get the page (a getpage). The amount of getpage activity is a good measure of the amount of work that DB2 is performing in a certain interval.
SYNCHRONOUS READS: When the DB2 buffer manager finds that the page is not in the buffer pool (or GBP), it must read the page from disk. When DB2 reads a single page from disk and the application waits for this page to be brought into the BP, this is a synchronous read operation.
Even though disk I/O has improved dramatically over the last couple of years, it is still orders of magnitude slower than retrieving a page that is in the BP. Therefore, reducing the number of SYNC I/Os has a positive effect on the transaction response time. So, it is important to verify the amount of I/O activity in the system. (You also have BP information at the application (accounting) level to show the BP activity of individual users, plans, and applications.)
Prefetch reads: Besides synchronous I/Os, DB2 also has a number of I/O mechanisms where it anticipates I/Os and tries to bring in pages from disk before the application requests the page. These are called asynchronous I/Os, as the application does not wait for the I/O, and the I/Os are performed in the background by a DB2 system task.
There are three types of asynchronous I/Os:
 – SEQUENTIAL PREFETCH: The DB2 optimizer expects that data is processed sequentially, for example, when the entire table must be scanned to retrieve the result set, and DB2 reads a number of pages ahead of time (typically 32, but the number varies depending on the buffer pool size). The use of sequential prefetch has been reduced since DB2 9 in favor of dynamic prefetch.
 – LIST PREFETCH: This type of asynchronous I/O is used when the optimizer decides to use an access path that retrieves a number of record IDs (RIDs) (a page number and an ID map entry on that page) and uses the page numbers to retrieve a number of pages in a single I/O operation. All access paths that use list prefetch use this type of I/O when retrieving pages from disk.
 – DYNAMIC PREFETCH: This type of prefetch is activated dynamically when DB2 detects that data is being accessed in a sequential manner, and to anticipate subsequent getpages, it prefetches a number of pages. So, even when the optimizer does not decide to use a table space scan, when the application at run time appears to be going through the data in a sequential manner, DB2 triggers dynamic prefetch, bringing in pages that are likely to be needed.
For each type of prefetch, there are three types of counters:
 – ... REQUESTS: This number represents the number of times this type of prefetch was requested.
 – ... READS: This number represents the number of times that this type of prefetch triggered a prefetch engine to read in pages from disk. When all pages are in the BP, there is no need to perform an I/O operation. In this case, the counter for the number of requests is incremented, but the counter for the number of reads is not.
 – PAGES READ VIA...: This is the number of pages that were brought in by prefetch operations. IBM OMEGAMON XE for DB2 PE conveniently calculates the average number of pages that were read from disk by each prefetch operation in the ... PAGES READ/ ...READ field. In Example 8-15 on page 408, D.PRF.PAGES READ/D.PRF.READ is 3.96. Considering that the prefetch quantity is 32 in our example, fewer than four pages had to be read from disk; the rest were already in the BP.
If you want to look at a single number for BP performance, the BP hit ratio is often used. The BP hit ratio gives you some idea about whether the pages being requested by the applications are found in the BP. In our example, the hit ratio is 99.94%, which is high and not a typical number, especially considering that this is the total of all buffer pools in the system. Another measure of how your buffer pools are doing is the number of getpages per synchronous I/O (calculated by IBM OMEGAMON XE for DB2 PE as GETPAGE PER SYN.READ-RANDOM). As the application waits for synchronous I/Os to complete, the higher this ratio, the better. In our case, DB2 must perform a synchronous I/O only every 2471.29 getpage requests.
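Both ratios can be derived directly from the read counters. The sketch below uses the rounded TOTAL values from Example 8-15; because the report works with unrounded counters, the getpages-per-sync-read result comes out marginally different from the printed 2471.29.

```python
def bp_read_metrics(getpages, sync_reads,
                    pages_seq_pf, pages_list_pf, pages_dyn_pf):
    """Hit ratio = percentage of getpages satisfied without a read I/O
    (synchronous reads plus all pages brought in by prefetch count as
    read I/O).  Formula is the conventional one; counter names follow
    Example 8-15."""
    pages_read = sync_reads + pages_seq_pf + pages_list_pf + pages_dyn_pf
    hit_ratio = 100.0 * (getpages - pages_read) / getpages
    gp_per_sync = getpages / sync_reads if sync_reads else float("inf")
    return hit_ratio, gp_per_sync

# Rounded TOTAL values from Example 8-15: 519.0K getpages, 210 sync reads,
# 0 sequential, 0 list, and 91 dynamic prefetch pages read
hit, gp_per_io = bp_read_metrics(519_000, 210, 0, 0, 91)
print(f"hit ratio {hit:.2f}%")  # hit ratio 99.94%
print(f"{gp_per_io:.0f} getpages per synchronous read")
```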
Here are a few other counters to monitor:
PREF.DISABLED-NO BUFFER: This indicates the number of times DB2 was unable to perform a prefetch operation because there were no buffers available for the prefetch operation. This can occur because the sequential prefetch buffer pool threshold is reached (more than 90% of the pages in the BP are not available to be reused) or the VPSEQT buffer pool threshold is set to zero (effectively disabling all prefetch for that buffer pool). Hitting the BP sequential threshold is not good, so you might want to increase the BP size, or adjust the BP write thresholds to trigger writing back to disk sooner.
PREF.DISABLED-NO READ ENG: This indicates that DB2 ran out of prefetch engines. There are 600 prefetch engines in DB2 and all of them were in use at some point. This happens when many queries run that use DB2 parallelism, and each of the parallel tasks is triggering prefetch. As you cannot adjust the number of prefetch engines, it is better to spread out the queries that are using all these engines (run fewer queries in parallel).
PAGE-INS REQUIRED FOR READ: When DB2 must perform a read I/O operation, the page frame is fixed in real storage (to make sure that the frame is not stolen while I/O is performed into that frame). DB2 checks whether the page frame that holds the buffer in the BP must be paged in from auxiliary storage before doing the I/O (of the data) into that page frame. If the frame was on auxiliary storage, this counter is incremented. In that case, DB2 suffers a double I/O penalty: first the paging I/O that brings back the frame from auxiliary storage, and then another I/O to bring the actual data or index page from disk into the DB2 buffer pool. When you see a high number in this field, it is best to reevaluate the DB2 real storage requirements; maybe you must add more real storage to the LPAR. Another way to avoid paging for DB2 buffer pools is to use the PGFIX(YES) buffer pool option. This option fixes all the DB2 pages for that buffer pool in real storage so they are not paged out to auxiliary storage. However, make sure that there is enough real storage available on the LPAR, or fixing the DB2 buffer pools might increase paging activity for other workloads in the system.
These three counters should show a low or zero value. If that is not the case, you must investigate and at least understand why they are non-zero, even if you cannot remedy the problem.
If there is DB2 sort activity that is triggered by SQL activity, such as by ORDER BY or GROUP BY, the SORT/MERGE section shows non-zero values. It is important to make sure the workfile requests that degraded or were rejected are kept to a minimum.
The last section of the buffer pool report contains information about DB2 write operations, that is, when DB2 must write the updated pages in the buffer pool or group buffer pool back to disk. Most write activity is asynchronous to the application, meaning that applications typically do not have to wait for write I/Os to complete. However, it is a preferred practice to verify this section in the DB2 statistics report to make sure that there are no issues with the write performance.
Here are some typical counters to verify:
BUFFER UPDATES: This represents the number of rows that were inserted, updated, or deleted during the DB2 statistics interval. If multiple rows are updated on the same page, the counter is incremented multiple times, once for each row.
PAGES WRITTEN: This is the number of pages from the local buffer pool that were written back to disk.
SYNCHRONOUS WRITES: In some situations, DB2 can perform a synchronous write operation. This is a write that the application must wait for until it completes. A high number here means that many applications are waiting for a write I/O to complete, which is not a good thing. Therefore, if you see a high number here, you must investigate what triggers these writes and see whether you can reduce them.
ASYNCHRONOUS WRITES: This is the normal type of write operation. When the buffer pool hits certain thresholds (see below), it triggers an asynchronous write operation to externalize updated pages in the buffer pool back to disk. When a transaction reaches a commit point, DB2 writes to the log and forces the log buffers to disk to harden the fact that the transaction has reached a transaction boundary, but the updated pages are not written back to disk at that point. In case the system fails before the data pages are written back to disk (so with updates only in the buffer pool), DB2 can still recover because the information is on the DB2 log.
In this run, the asynchronous write counter is zero because all the objects in the test are group buffer pool dependent, so the writes to disk occur from the group buffer pool, not the local buffer pool.
PAGES WRTN FOR CASTOUT I/O: When you use DB2 data sharing, at commit the log buffers are written to disk but the updated pages are also written to the group buffer pool when the object is group buffer pool dependent. This ensures data coherency between the different members of the data sharing group. Later, the changed pages in the group buffer pool must be written back to disk. This is called castout processing. This counter represents the pages that are written back to disk by the castout operation on this member.
NUMBER OF CASTOUT I/O: This number represents the number of castout I/O operations, as opposed to the number of pages that are written by castout operations.
There are a number of buffer pool thresholds that indicate to DB2 that it is time to write updated pages back from the virtual pool to disk. Here are two of these thresholds:
HORIZ.DEF.WRITE THRESHOLD: When the number of unavailable pages reaches the DWQT buffer pool threshold (set to 30% by default, but you can change the value by running the -ALTER BUFFERPOOL command), deferred (asynchronous) writes are triggered.
VERTI.DEF.WRITE THRESHOLD: This is the same as the deferred write threshold, but at the page set level. When the number of updated pages for a data set reaches the VDWQT, deferred (asynchronous) writes begin for that data set. The default is 5%, but you can change the value by running the -ALTER BUFFERPOOL command. The VDWQT value can either be a percentage (of the BP size) or the number of changed pages that changed before the asynchronous writes are triggered.
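Both thresholds are changed with the -ALTER BUFFERPOOL command; for example (the buffer pool name and values below are illustrative, not a recommendation):

```
-ALTER BUFFERPOOL(BP2) DWQT(30) VDWQT(5,0)
```

The first VDWQT operand is a percentage of the buffer pool size; the second is an absolute number of changed pages, which is honored when the percentage operand is set to zero.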
For the workload that produced the BP numbers above, these values are zero. This is because all objects are GBP-dependent and the updated pages (a few per transaction) are written to the GBP at commit time, and are written back to disk through castout I/O operations. As such, the DWQT and VDWQT are not reached.
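The two threshold checks described above can be sketched as follows. This is a minimal illustration under simplified assumptions, not DB2 internals; the parameter names are invented, and the absolute-page form of VDWQT is omitted:

```python
def deferred_writes_triggered(bp_size, unavailable_pages,
                              changed_pages_per_dataset,
                              dwqt=0.30, vdwqt=0.05):
    """Sketch of DWQT/VDWQT deferred-write triggering.

    dwqt  - pool-level threshold on unavailable pages (default 30%)
    vdwqt - page set-level threshold on changed pages (default 5%)
    Returns whether the pool-level DWQT was reached, plus the data
    sets whose changed pages reached the VDWQT.
    """
    dwqt_hit = unavailable_pages >= dwqt * bp_size
    vdwqt_hit = [ds for ds, pages in changed_pages_per_dataset.items()
                 if pages >= vdwqt * bp_size]
    return dwqt_hit, vdwqt_hit

# 20,000-buffer pool; one data set accumulated many changed pages.
hit, datasets = deferred_writes_triggered(
    20000, 4000, {"TS1": 1200, "TS2": 50})
print(hit, datasets)  # False ['TS1']
```

In this sketch, the pool as a whole is below DWQT (4,000 unavailable pages against a 6,000-page threshold), but data set TS1 alone has crossed the 5% VDWQT, so deferred writes would start for that data set only.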
As with the BP read operation section, there are a few counters that should have a zero or low value:
DM THRESHOLD: This counter indicates that the number of unavailable pages in the BP reached 95% or more, which is the data manager threshold. Normally, DB2 performs one getpage request for a page, no matter how many rows are retrieved or updated on that page. If the threshold is exceeded, DB2 performs a getpage request for each row instead of each page, so if more than one row is retrieved or updated in a page, more than one getpage and release page request is performed against that page. This is undesirable, but it is an autonomic mechanism that is built into DB2 to slow things down, giving the write engines time to write some of the unavailable pages back to disk.
WRITE ENGINE NOT AVAILABLE: This counter is the number of times that a write engine was not available. This field is no longer populated by DB2, as it was too common to hit the maximum of 300 engines running at the same time. Under normal circumstances, it is not a problem when you hit this limit, as applications normally do not wait for write I/Os to complete.
PAGE-INS REQUIRED FOR WRITE: This counter is similar to the page-in counter for read operations. Before a write I/O from a buffer is done, the frame must be fixed in real storage. When DB2 detects that the frame is in auxiliary storage before the write I/O starts, it increments this counter.
For more information about buffer pool tuning, see DB2 9 for z/OS: Buffer Pool Monitoring and Tuning, REDP-4604.
Group buffer pool section
When you are running a DB2 data sharing system, and some of the objects are group buffer pool dependent (GBP-dependent), the DB2 statistics report also has a group buffer pool section, as shown in Example 8-16.
Example 8-16 Statistics report - group buffer pool section
GROUP TOTAL QUANTITY /SECOND /THREAD /COMMIT
----------------------------- -------- ------- ------- -------
GROUP BP R/W RATIO (%) 46.13 N/A N/A N/A
GBP-DEPENDENT GETPAGES 517.7K 8611.01 6471.19 6.08
SYN.READ(XI)-DATA RETURNED 13273.00 220.77 165.91 0.16
SYN.READ(XI)-NO DATA RETURN 169.00 2.81 2.11 0.00
SYN.READ(NF)-DATA RETURNED 111.00 1.85 1.39 0.00
SYN.READ(NF)-NO DATA RETURN 41.00 0.68 0.51 0.00
UNREGISTER PAGE 0.00 0.00 0.00 0.00
CLEAN PAGES SYNC.WRITTEN 0.00 0.00 0.00 0.00
CLEAN PAGES ASYNC.WRTN 0.00 0.00 0.00 0.00
REG.PAGE LIST (RPL) REQUEST 73.00 1.21 0.91 0.00
NUMBER OF PAGES RETR.FROM GBP 55.00 0.91 0.69 0.00
PAGES CASTOUT 554.00 9.21 6.92 0.01
UNLOCK CASTOUT 13.00 0.22 0.16 0.00
READ CASTOUT CLASS 17.00 0.28 0.21 0.00
READ DIRECTORY INFO 0.00 0.00 0.00 0.00
READ STORAGE STATISTICS 47.00 0.78 0.59 0.00
REGISTER PAGE 173.00 2.88 2.16 0.00
DELETE NAME 0.00 0.00 0.00 0.00
ASYNCH GBP REQUESTS 15997.00 266.08 199.96 0.19
EXPLICIT X-INVALIDATIONS 0.00 0.00 0.00 0.00
CASTOUT CLASS THRESHOLD 17.00 0.28 0.21 0.00
GROUP BP CASTOUT THRESHOLD 0.00 0.00 0.00 0.00
GBP CHECKPOINTS TRIGGERED 0.00 0.00 0.00 0.00
WRITE FAILED-NO STORAGE 0.00 0.00 0.00 0.00
WRITE TO SEC-GBP FAILED 0.00 0.00 0.00 0.00
DELETE NAME LIST SEC-GBP 0.00 0.00 0.00 0.00
DELETE NAME FROM SEC-GBP 0.00 0.00 0.00 0.00
UNLOCK CASTOUT STATS SEC-GBP 0.00 0.00 0.00 0.00
ASYNCH SEC-GBP REQUESTS 0.00 0.00 0.00 0.00
 
GROUP TOTAL CONTINUED QUANTITY /SECOND /THREAD /COMMIT
----------------------- -------- ------- ------- -------
WRITE AND REGISTER 21843.00 363.32 273.04 0.26
WRITE AND REGISTER MULT 3124.00 51.96 39.05 0.04
CHANGED PGS SYNC.WRTN 28879.00 480.35 360.99 0.34
CHANGED PGS ASYNC.WRTN 257.00 4.27 3.21 0.00
PAGES WRITE & REG MULT 7293.00 121.31 91.16 0.09
READ FOR CASTOUT 1.00 0.02 0.01 0.00
READ FOR CASTOUT MULT 83.00 1.38 1.04 0.00
PAGE P-LOCK LOCK REQ 31760.00 528.28 397.00 0.37
SPACE MAP PAGES 1807.00 30.06 22.59 0.02
DATA PAGES 14583.00 242.56 182.29 0.17
INDEX LEAF PAGES 15370.00 255.65 192.13 0.18
PAGE P-LOCK UNLOCK REQ 29010.00 482.53 362.63 0.34
PAGE P-LOCK LOCK SUSP 3469.00 57.70 43.36 0.04
SPACE MAP PAGES 272.00 4.52 3.40 0.00
DATA PAGES 1442.00 23.99 18.02 0.02
INDEX LEAF PAGES 1755.00 29.19 21.94 0.02
PAGE P-LOCK LOCK NEG 2041.00 33.95 25.51 0.02
SPACE MAP PAGES 113.00 1.88 1.41 0.00
DATA PAGES 942.00 15.67 11.77 0.01
INDEX LEAF PAGES 986.00 16.40 12.32 0.01
The main purpose of the group buffer pool in a DB2 data sharing environment is to ensure buffer coherency, that is, all members of the data sharing group can obtain the correct (most recent) version of the page that they want to process. To achieve this task, updated pages are written to the group buffer pool at commit time, and if these pages are present in the local buffer pool of other members, these pages are cross-invalidated (XI). The next time such a member requests that page, it sees the XI flag and retrieves a new (latest) copy of the page, either from the group buffer pool or from disk.
To reduce this type of data sharing impact (refreshing XI-pages) to a minimum, make sure that, when such a refresh operation must occur, the requested page is available in the group buffer pool. A page can be retrieved from the group buffer pool at microsecond speed, whereas reading from disk typically takes milliseconds.
To see whether DB2 can accomplish this task, check the group buffer pool section:
SYN.READ(XI)-DATA RETURNED: This is the number of times that this DB2 member found an XI page in its local buffer pool, went to the GBP, and found the page in the GBP.
SYN.READ(XI)-NO DATA RETURN: This is the number of times that this DB2 member found an XI page in its local buffer pool, went to the GBP, and did not find the page in the GBP.
In general, keep the Sync.Read(XI) miss ratio below 10% by using the following formulas:
TOTAL SYN.READ(XI) = SYN.READ(XI)-DATA RETURNED + SYN.READ(XI)-NO DATA RETURN
Sync.Read(XI) miss ratio = SYN.READ(XI)-NO DATA RETURN / TOTAL SYN.READ(XI)
In our example, the ratio is 169/(169 + 13273) = 1.26%, which is good.
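Plugging the counters from Example 8-16 into the formulas above can be done with a few lines (a simple sketch; the values are the SYN.READ(XI) counters from the report):

```python
def syncread_xi_miss_ratio(data_returned, no_data_return):
    """Sync.Read(XI) miss ratio: the fraction of XI refresh requests
    that missed the GBP and had to be satisfied from disk."""
    total = data_returned + no_data_return
    return no_data_return / total if total else 0.0

# SYN.READ(XI)-DATA RETURNED and SYN.READ(XI)-NO DATA RETURN totals.
ratio = syncread_xi_miss_ratio(13273, 169)
print(f"{ratio:.2%}")  # 1.26%, well below the 10% guideline
```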
Typically, only changed pages are written to the GBP (GBPCACHE(CHANGED) is the default). However, you can use the group buffer pool as an auxiliary storage level for unchanged pages as well if you specify GBPCACHE(ALL). This way, unchanged pages are also written to the GBP. When DB2 does not find a page in the local BP, it checks the GBP first before going to disk (GBP retrieval is much faster than I/O from disk). If the page is in the GBP, it is reused from there instead of reading from disk.
To verify the efficiency of this extra caching level, you can use the SYN.READ(NF)-DATA RETURNED and SYN.READ(NF)-NO DATA RETURN GBP statistics. They are similar to the XI counters, but they represent the times that DB2 was not able to find the page in the local BP, went to the GBP, and was either successful or unsuccessful in finding the page in the GBP. Our tests used the default GBPCACHE(CHANGED) setting, and the local BP hit ratio that we described in the (local) BP section above is high, so there is little benefit in using GBPCACHE(ALL) for this application.
The process of writing changed pages from the GBP back to disk is called castout processing. As with the local BP, castout processing is triggered by a number of thresholds:
CASTOUT CLASS THRESHOLD: This is the number of times that group buffer pool castout was initiated because the group buffer pool class castout threshold was exceeded. It is similar to the VDWQT threshold at the page set level for local buffer pools. Unlike the local BP, queues inside the GBP are not kept per page set; they are kept per castout class. When the number of changed pages in a class exceeds the percentage that is specified by CLASST, castout for that class is triggered. The default is 5%.
GROUP BP CASTOUT THRESHOLD: This is the number of times group buffer pool castout was initiated because the group buffer pool castout threshold was exceeded. This threshold is similar to the DWQT threshold for local buffer pools. When the number of changed pages exceeds the threshold, castout is triggered. The default GBPOOL threshold is 30%.
GBP CHECKPOINTS TRIGGERED: This is the number of times that group buffer pool castout was initiated because a group buffer pool checkpoint was taken (by default, every four minutes). It is similar to the DB2 system checkpoint, which triggers asynchronous writes for all updated pages in the local BP. You see a non-zero value only on the member that is the GBP structure owner, as the structure owner is responsible for GBP checkpoint processing.
In our test case, the castout is driven by the CLASST threshold, which is expected behavior. There are not enough different changed pages in the GBP to trigger the GBPOOL threshold.
As with the local BP, there are a few counters where a zero (or low) value is good:
WRITE FAILED-NO STORAGE: This counter is incremented when a process must write to the GBP but is unable to do so because the GBP is full. This is a bad condition. Applications must write all their changed pages to the GBP at commit time (for objects that are GBP-dependent). If they cannot do so, the transactions are suspended (in commit processing). DB2 does not fail these transactions immediately, but waits a few seconds and tries to write to the GBP again. If the write keeps failing, the pages are added to the logical page list (LPL) and require recovery.
If the problem is not caused by a momentary surge in activity, you need either to decrease the group buffer pool castout thresholds, to trigger castout earlier, or to increase the number of data entries in the group buffer pool, by either increasing the total size of the group buffer pool or adjusting the ratio of directory entries to data entries in favor of data entries.
 
Tip: You can also use the ALLOWAUTOALT(YES) option to allow XES to dynamically adjust your DB2 GBP. This feature is not intended to adjust GBP settings to deal with sudden spikes of activity, but it is designed to adjust the settings when the workload changes gradually over time.
For more information, see “Auto Alter capabilities” in DB2 10 for z/OS Data Sharing: Planning and Administration, SC19-2973 and “Identifying the coupling facility structures”, found at:
WRITE TO SEC-GBP FAILED: This is similar to the previous counter, but applies to writes to the secondary group buffer pool.
As page P-lock processing is handled by the DB2 buffer manager component, the DB2 group buffer pool statistics also contain valuable information about the page P-lock activity at the BP level. (The example above shows only the total for all GBPs, but the DB2 statistics report contains this type of information for each group buffer pool.)
The DB2 statistics record contains information about the page P-lock activity in terms of the number of requests, the number of suspensions, and the number of negotiations. It also distinguishes between page P-locks for space map pages, data pages (when using row-level locking), and index leaf pages. A higher number of page P-lock requests means that DB2 must do additional processing for these transactions, which typically translates into more processor time and increased elapsed time. Also, acquiring a page P-lock is less expensive than a suspension for a page P-lock, which in turn is less expensive than negotiating a page P-lock. (Unlike L-locks or transaction locks, P-locks can be negotiated between members, but it is an expensive process, typically requiring forced writes to the active log data set and synchronous writes to the group buffer pool.)
CPU information
The CPU Times section of the DB2 statistics report, which is shown in Example 8-17 on page 417, provides information about the amount of processing that is used by the different DB2 system address spaces:
SYSTEM SERVICES ADDRESS SPACE (ssidMSTR)
DATABASE SERVICES ADDRESS SPACE (ssidDBM1)
IRLM
DDF ADDRESS SPACE (ssidDIST)
Example 8-17 Statistics report - CPU Times section
CPU TIMES TCB TIME PREEMPT SRB NONPREEMPT SRB TOTAL TIME PREEMPT IIP SRB /COMMIT
------------------------------- --------------- --------------- --------------- --------------- --------------- --------------
SYSTEM SERVICES ADDRESS SPACE 0.032263 0.468393 0.011236 0.511892 N/A 0.000006
DATABASE SERVICES ADDRESS SPACE 0.011671 0.270898 0.075548 0.358117 0.011764 0.000004
IRLM 0.000014 0.000000 1.351361 1.351376 N/A 0.000016
DDF ADDRESS SPACE 3.492625 36.759432 0.721285 40.973342 22.948061 0.000481
TOTAL 3.536572 37.498723 2.159430 43.194726 22.959825 0.000507
The CPU time that is reported here is the amount of processing that is used by DB2 to perform system-related activity on behalf of the applications that are running SQL requests. For example, when DB2 must access a table space for the first time, the data set is not open yet. Therefore, the application is suspended, and DB2 switches to a task control block (TCB) in the DBM1 address space. This TCB performs the allocation and physical open of the data set. The processing to perform the allocation and physical open is charged to the TCB processing time of the DBM1 address space. After the data set open is done, the application is resumed and continues processing.
The WLM-managed SP address spaces are not considered system address spaces. The processing that is used by those spaces is reported in the DB2 nested activity section of the DB2 accounting reports for the different applications.
 
Note: The processing time that is reported for the DDF (ssidDIST) address space includes both the processing time that is used by system tasks running in this address space and the processing time that is used by all the database access threads (DBATs) in the system. DBATs run as pre-emptible Service Request Blocks (SRBs) in the DIST address space, so all the work that is done by remote connections running SQL statements against a DB2 for z/OS system shows up in the DB2 accounting records, and in the CPU Times section of the DB2 statistics report, under PREEMPT SRB time.
This section separates the CPU Time into the following types:
TCB Time: This is the amount of processing time that is used by work that runs using a task control block as a dispatchable unit of work.
PREEMPT SRB: This is the amount of processing time that is used by work that runs by using a pre-emptible service request block as a dispatchable unit of work.
NONPREEMPT SRB: This is similar to PREEMPT SRB, but this type of dispatchable unit must voluntarily relinquish control, whereas the pre-emptible types can be interrupted at any time by the MVS dispatcher. DB2 has few of these types of SRB. IRLM still uses this type of dispatchable unit, because IRLM requests are short, must run to completion without being interrupted, and must be serviced with a high priority.
TOTAL TIME: This is the sum of TCB Time, PREEMPT SRB, and NONPREEMPT SRB.
PREEMPT IIP SRB: This is the amount of processing that DB2 ran on a specialty engine, such as zIIP or zAAP. It is not included in the other CPU Time fields, as users are not charged for the use of the zIIP or zAAP engine.
The workload in Example 8-17 is a distributed workload, and the majority of the processing time is PREEMPT SRB time in the DIST address space. Note the considerable amount of zIIP offload for the DDF work: 22.948061 / (22.948061 + 36.759432) = 38.4%.
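The zIIP offload calculation above can be expressed directly (a simple sketch; the figures are the PREEMPT IIP SRB and PREEMPT SRB values from the DDF row of Example 8-17):

```python
def ziip_offload_pct(preempt_iip_srb, preempt_srb):
    """Fraction of pre-emptible SRB work that ran on a specialty engine.

    PREEMPT IIP SRB is reported separately and is not included in the
    other CPU time fields, so it must be added to the denominator.
    """
    return preempt_iip_srb / (preempt_iip_srb + preempt_srb)

# DDF address space row from Example 8-17 (seconds of CPU time).
print(f"{ziip_offload_pct(22.948061, 36.759432):.1%}")  # 38.4%
```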
When you use ZOSMETRICS=YES DSNZPARM, at each statistics interval, DB2 communicates with RMF and obtains some useful information about the LPAR that the DB2 system is running on, as shown in Example 8-18. This information includes the following items:
CP LPAR: The number of CPs on the LPAR.
CPU UTILIZATION LPAR: The total CPU usage of the LPAR at the time that the information is collected.
CPU UTILIZATION DB2: The CPU usage of all the DB2 system address spaces. These are all address spaces starting with ssid, where ssid is the DB2 subsystem name. In our example, the DB2 CPU usage is higher than the sum of DB2 MSTR + DB2 DBM1. This is because our workload is a distributed workload, so the CPU that is used by the ssidDIST address space is also included in this number.
REAL STORAGE LPAR: The amount of real storage on the LPAR.
USED REAL STORAGE DB2: The amount of real storage that is used by DB2.
USED VIRTUAL STOR DB2: The amount of virtual storage that is used by DB2, as seen by RMF. This is equal to the amount of real storage, so nothing is paged out to auxiliary storage.
Example 8-18 Statistics report - RMF CPU and storage metrics
CPU AND STORAGE METRICS QUANTITY
--------------------------- ------------------
CP LPAR 6.00
CPU UTILIZATION LPAR 45.00
CPU UTILIZATION DB2 5.00
CPU UTILIZATION DB2 MSTR 0.00
CPU UTILIZATION DB2 DBM1 1.00
UNREFERENCED INTERVAL COUNT 65535.00
REAL STORAGE LPAR (MB) 8191.00
FREE REAL STORAGE LPAR (MB) 1359.00
USED REAL STORAGE DB2 (MB) 327.00
VIRTUAL STORAGE LPAR (MB) 70991.00
FREE VIRTUAL STOR LPAR (MB) 65007.00
USED VIRTUAL STOR DB2 (MB) 327.00
There are many more sections in the DB2 statistics record with information about how the subsystem is doing. Describing all of them is beyond the scope of this book. For more information, see DB2 10 for z/OS Managing Performance, SC19-2978 and Tivoli OMEGAMON XE for DB2 on z/OS Report Reference, SH12-6963.
8.4.5 Analyzing DB2 accounting data
Besides information about the entire DB2 subsystem that is contained in the DB2 statistics trace records, DB2 can also collect information about the performance and work that is performed by individual transactions. This information is in the DB2 accounting trace records, IFCID 3 for plan level information, and IFCID 239 for package level information.
Example 8-19 on page 419 shows a sample SYSIN that creates a DB2 accounting report using the IBM OMEGAMON XE for DB2 PE on z/OS batch reporting facility.
Example 8-19 Create a DB2 accounting report
//SYSIN DD *
DB2PM
* *********************************
* GLOBAL PARMS
* *********************************
GLOBAL
*        Adjust for US East Coast DST
         TIMEZONE (+4)
         FROM(,21:29:00)
TO(,21:30:01)
*        Include the entire group
         INCLUDE( GROUP(DB0ZG))
* ************************************
* ACCOUNTING REPORTS
* ************************************
ACCOUNTING
REPORT
DDNAME(ACTRCDD1)
LAYOUT(LONG)
ORDER(TRANSACT)
EXEC
The statements in Example 8-19 generate an IBM OMEGAMON XE for DB2 PE on z/OS accounting report for all members of data sharing group DB0ZG (which includes two members named D0Z1 and D0Z2). The information is grouped by DB2 transaction name (ORDER(TRANSACT)). The report covers only a single minute, the same interval as Example 8-18 on page 418.
As this is an accounting report, the averages are based on the number of occurrences (the number of transactions that qualify for the filter criteria in that interval). This is different from an accounting trace report, which shows individual transactions (so there are no averages).
As with the statistics report, the accounting report consists of many sections. As an example, we use the sections of the transaction that is called ‘TraderClientApplication’. For more information about how we set the client information strings for this test, see “Specifying client information at the data source level” on page 366.
Identification and highlights
Example 8-20 on page 419 shows the identification section, elapsed time distribution, and class 2 time distribution section of the DB2 accounting report for the TraderClientApplication transaction.
Example 8-20 Accounting report - identification elapsed time and class 2 time distribution
LOCATION: DB0Z OMEGAMON XE FOR DB2 PERFORMANCE EXPERT (V5R1M1) PAGE: 1-4
GROUP: DB0ZG ACCOUNTING REPORT - LONG REQUESTED FROM: ALL 21:29:00.00
MEMBER: D0Z1 TO: DATES 21:30:01.00
SUBSYSTEM: D0Z1 ORDER: TRANSACT INTERVAL FROM: 08/13/12 21:29:00.37
DB2 VERSION: V10 SCOPE: MEMBER TO: 08/13/12 21:30:00.99
TRANSACT: TraderClientApplication
ELAPSED TIME DISTRIBUTION CLASS 2 TIME DISTRIBUTION
---------------------------------------------------------------- ----------------------------------------------------------------
APPL |===============================> 62% CPU |==========> 21%
DB2 |==========> 20% SECPU |============> 24%
SUSP |=========> 18% NOTACC |====> 9%
SUSP |=======================> 47%
Example 8-20 on page 419 provides a quick overview of where the majority of the time is spent: in the application, in DB2 using CPU (on a general CP or a specialty engine), or suspended in DB2 waiting for some known event to complete. In this case, most of the time in DB2 is spent being suspended. This does not necessarily mean that there is a problem with the transaction. It can be normal when the transaction is lightweight and the overall transaction response time is good and well within the service levels.
As we are looking at the overall performance of the TraderClientApplication here, we use an accounting report, which means that most counters are averages, which are based on the number of transactions that are run during the reporting interval. The denominator to calculate these averages is #OCCURRENCES, which can be found in the highlights section of the accounting report, as shown in Example 8-21.
Example 8-21 Accounting report - highlights
HIGHLIGHTS
--------------------------
#OCCURRENCES : 86521
#ALLIEDS : 0
#ALLIEDS DISTRIB: 0
#DBATS : 86521
#DBATS DISTRIB. : 0
#NO PROGRAM DATA: 0
#NORMAL TERMINAT: 86521
#DDFRRSAF ROLLUP: 0
#ABNORMAL TERMIN: 0
#CP/X PARALLEL. : 0
#IO PARALLELISM : 0
#INCREMENT. BIND: 0
#COMMITS : 86522
#ROLLBACKS : 1
#SVPT REQUESTS : 0
#SVPT RELEASE : 0
#SVPT ROLLBACK : 0
MAX SQL CASC LVL: 0
UPDATE/COMMIT : 0.21
SYNCH I/O AVG. : 0.000680
The example indicates that #DBATS is 86521, so we are clearly dealing with a type 4 connection. We are not using accounting rollup (#DDFRRSAF ROLLUP is zero). It is a preferred practice to check #COMMITS versus #ROLLBACKS to make sure that the vast majority of the work is succeeding (and committing), and not constantly failing (rolling back its unit of work).
The highlights section calculates the SYNCH I/O AVG. seen by DB2, which provides you with an easy way to do a smoke test and see whether average synchronous I/O times are correct. In this case, they are 680 microseconds.
The normal termination section, which is shown in Example 8-22 on page 421, gives you a quick idea about what triggered the writing of the accounting record. This section comes from the same accounting report as Example 8-21, which uses type 4 connectivity, so we expect to see a high number of TYPE2 INACTIVE threads (where DB2 separates the DBAT and the connection at commit time, by pooling the DBAT and making the connection inactive).
Example 8-22 Accounting report - normal termination
NORMAL TERM. AVERAGE TOTAL
--------------- -------- --------
NEW USER 0.00 1
DEALLOCATION 0.00 1
APPL.PROGR. END 0.00 0
RESIGNON 0.00 0
DBAT INACTIVE 0.00 0
TYPE2 INACTIVE 1.00 86519
RRS COMMIT 0.00 0
END USER THRESH 0.00 0
BLOCK STOR THR 0.00 0
STALENESS THR 0.00 0
SQL DML that is performed by the application
When monitoring transaction performance, it is important to have information about the transaction profile. An important part of the transaction profile is the number and type of SQL statements that are issued by the transaction against the database engine. This information is in the SQL DML section of the DB2 accounting report, as shown in Example 8-23.
Example 8-23 Accounting report - SQL DML
SQL DML AVERAGE TOTAL
-------- -------- --------
SELECT 0.00 0
INSERT 0.04 3274
ROWS 0.04 3274
UPDATE 0.16 14073
ROWS 0.17 14911
MERGE 0.00 0
DELETE 0.01 838
ROWS 0.01 838
DESCRIBE 0.32 28103
DESC.TBL 0.00 0
PREPARE 1.29 111659
OPEN 1.12 96651
FETCH 0.02 1634
ROWS 1.30 112137
CLOSE 0.02 1634
DML-ALL 2.98 257866
On average, the TraderClientApplication transaction runs 2.98 SQL statements, typically a PREPARE, an OPEN, and a FETCH. The number of FETCH operations is only 0.02, but the number of ROWS fetched is 1.30. This is the result of a DB2 10 performance enhancement for DRDA, where OPEN and FETCH are combined in a single operation; as a result, the FETCH operation is not counted. However, the number of rows fetched (1.30) accurately reflects the number of rows that were retrieved. The transaction also occasionally performs an INSERT or UPDATE operation.
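As a quick sanity check, each AVERAGE column value is simply the TOTAL divided by #OCCURRENCES from the highlights section. A short sketch using the figures from Example 8-21 and Example 8-23:

```python
occurrences = 86521  # #OCCURRENCES from the highlights section

# TOTAL column for a few SQL DML counters from Example 8-23.
totals = {"PREPARE": 111659, "OPEN": 96651,
          "ROWS FETCHED": 112137, "DML-ALL": 257866}

for name, total in totals.items():
    # Reproduces the AVERAGE column of the accounting report.
    print(f"{name:12s} {total / occurrences:5.2f}")
# DML-ALL works out to 257866 / 86521 = 2.98, matching the report.
```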
As the application is using dynamic SQL (because of the number of PREPARE operations), it is important to verify whether these PREPARE operations are able to use the global dynamic statement cache or not, that is, whether they are performing a short prepare (FOUND IN CACHE) or a full prepare (NOT FOUND IN CACHE). To verify this situation, see the DYNAMIC SQL STMT section in the accounting report, as shown in Example 8-24.
Example 8-24 Accounting report - DYNAMIC SQL STMT
DYNAMIC SQL STMT AVERAGE TOTAL
-------------------- -------- --------
REOPTIMIZATION 0.00 0
NOT FOUND IN CACHE 0.00 7
FOUND IN CACHE 1.29 111652
IMPLICIT PREPARES 0.00 0
PREPARES AVOIDED 0.00 0
CACHE_LIMIT_EXCEEDED 0.00 0
PREP_STMT_PURGED 0.00 0
CSWL - STMTS PARSED 0.00 0
CSWL - LITS REPLACED 0.00 0
CSWL - MATCHES FOUND 0.00 0
CSWL - DUPLS CREATED 0.00 0
In our case, we have a good cache hit ratio. The Trader application uses a limited number of SQL statements and they typically use parameter markers. When you use parameter markers, DB2 uses a ‘?’ at prepare time, and provides the actual value of the parameter marker at execution time. So, the only difference between the SQL statements is the value that is provided at run time. The actual SQL statement text (using the ‘?’) is used by DB2 to determine whether the statement is in the cache. Using a parameter marker instead of a literal value increases the chance of finding a statement cache match, allowing DB2 to reuse the cached statement.
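The cache-matching behavior can be illustrated with a toy sketch. This illustrates only the matching rule (the statement text, including any '?', is the cache key), not the DB2 implementation, and the table and column names are hypothetical:

```python
# Toy global dynamic statement cache, keyed by the statement text.
cache = {}
full_prepares = short_prepares = 0

def prepare(sql):
    """Return a 'prepared statement', preparing it only on a cache miss."""
    global full_prepares, short_prepares
    if sql in cache:
        short_prepares += 1      # FOUND IN CACHE (short prepare)
    else:
        cache[sql] = object()    # NOT FOUND IN CACHE (full prepare)
        full_prepares += 1
    return cache[sql]

# With a parameter marker, both executions share one cache entry.
prepare("SELECT PRICE FROM STOCK WHERE SYMBOL = ?")
prepare("SELECT PRICE FROM STOCK WHERE SYMBOL = ?")

# With literals, every distinct value is a different statement text.
prepare("SELECT PRICE FROM STOCK WHERE SYMBOL = 'IBM'")
prepare("SELECT PRICE FROM STOCK WHERE SYMBOL = 'XYZ'")

print(full_prepares, short_prepares)  # 3 1
```

The two literal statements each cause a full prepare, whereas the parameter-marker statement is prepared once and then found in the cache, which is why parameter markers raise the cache hit ratio.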
Locking information
Another important part of the transaction profile is the locking behavior of the transaction. The locking section and data sharing locking section is shown in Example 8-25. We already looked at some of the fields when we described the DB2 statistics report in “Locking and data sharing locking sections” on page 406. For more information about local and global suspensions, timeouts, deadlocks, and lock escalations, see that section. The only difference between the accounting and statistics fields is that the accounting information applies to a specific transaction/application, but the statistics data applies to all the transactions in the DB2 subsystem that ran during the reporting time frame.
Example 8-25 Accounting report - LOCKING and DATA SHARING
LOCKING AVERAGE TOTAL
--------------------- -------- --------
TIMEOUTS 0.00 0
DEADLOCKS 0.00 0
ESCAL.(SHARED) 0.00 0
ESCAL.(EXCLUS) 0.00 0
MAX PG/ROW LOCKS HELD 1.89 100
LOCK REQUEST 6.56 567488
UNLOCK REQUEST 1.21 104367
QUERY REQUEST 0.00 0
CHANGE REQUEST 0.33 28384
OTHER REQUEST 0.00 0
TOTAL SUSPENSIONS 0.05 4063
LOCK SUSPENSIONS 0.01 530
IRLM LATCH SUSPENS. 0.04 3531
OTHER SUSPENS. 0.00 2
 
DATA SHARING AVERAGE TOTAL
------------------- -------- --------
GLOBAL CONT RATE(%) 1.91 N/A
FALSE CONT RATE(%) 1.07 N/A
P/L-LOCKS XES(%) 47.70 N/A
LOCK REQ - PLOCKS 0.38 32490
UNLOCK REQ - PLOCKS 0.34 29613
CHANGE REQ - PLOCKS 0.00 3
LOCK REQ - XES 3.31 286161
UNLOCK REQ - XES 0.45 38937
CHANGE REQ - XES 0.26 22095
SUSPENDS - IRLM 0.04 3140
SUSPENDS - XES 0.00 0
CONVERSIONS- XES 0.22 19317
FALSE CONTENTIONS 0.05 4006
INCOMPATIBLE LOCKS 0.00 0
NOTIFY MSGS SENT 0.00 2
LOCK REQUESTS
This counter represents the number of (L-)LOCK requests that were sent to IRLM. It is important to reduce the number of locks as much as possible, as acquiring locks uses processing time, and locks might also prevent other transactions from accessing the same resource if they must access the page or row in a way that is not compatible with the lock that your transaction is holding. This can be achieved by application design and by using the DB2 lock avoidance techniques.
UNLOCK REQUESTS
This counter represents the number of (L-)UNLOCK requests that were sent to IRLM. A single UNLOCK request can release many locks in a single operation. For example, at commit time, DB2 issues an UNLOCK ANY request to release all locks that are no longer required. If a program that is using isolation level CS is fetching from a read only cursor, DB2 tries to avoid taking locks as much as possible (lock avoidance). However, DB2 might have to acquire locks as it fetches rows from the cursor. If a lock was acquired and the cursor moves off the row/page, that lock is released by an UNLOCK request.
So, DB2 issues UNLOCK requests mainly for this type of cursor fetching (unlocking a row/page at a time) or at commit (unlocking them all).
 
Tip: To assess the effectiveness of the DB2 data lock avoidance techniques at a high level, you can calculate #UNLOCK/COMMIT. If that value is greater than 5, DB2 lock avoidance is not effective. In that case, you might want to ensure that the application uses the ISOLATION(CS) and CURRENTDATA(NO) BIND options.
In Example 8-25 on page 422, UNLOCK/COMMIT is 1.21 (the average number of UNLOCK requests, because the number of commits is almost identical to the number of occurrences here), which is a good value. This is a high-level check. If, for example, the application is light and does only a few fetches, the ratio is still low even if lock avoidance is not working at all. Therefore, it is always important to check the overall transaction profile and not blindly apply any rules of thumb.
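The rule of thumb from the tip above is easy to compute from the accounting counters (a simple sketch; the threshold of 5 is the guideline stated in the tip, and the totals are taken from Example 8-25 and Example 8-21):

```python
def lock_avoidance_suspect(unlock_requests, commits):
    """UNLOCK/COMMIT ratio check: a value above 5 suggests that DB2
    lock avoidance is not effective for this workload."""
    ratio = unlock_requests / commits
    return ratio, ratio > 5

# UNLOCK REQUEST total and #COMMITS from the accounting report.
ratio, suspect = lock_avoidance_suspect(104367, 86522)
print(round(ratio, 2), suspect)  # 1.21 False
```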
For more information about lock avoidance, see DB2 9 for z/OS: Resource Serialization and Concurrency Control, SG24-4725.
MAX PG/ROW LOCKS HELD
There is an interesting piece of information in the accounting locking section that is not available in the statistics record: the MAX PG/ROW LOCKS HELD field. It is a useful indicator of the application's commit frequency. The counter applies to low-level locks only (page, row, LOB, and XML locks).
 
Tip: In general, try to issue a COMMIT frequently enough to keep the average MAX PG/ROW LOCKS HELD below 100.
The AVERAGE value that is shown in the accounting report is the average of the MAX PG/ROW LOCKS HELD values of all the accounting records that qualify for the report. The TOTAL is the maximum of all MAX PG/ROW LOCKS HELD values, that is, the “high water mark” of all accounting records that qualify for the report. For example, if transaction A has a MAX PG/ROW LOCKS HELD value of 10, and transaction B has a value of 20, then an accounting report that includes these two transactions shows an AVERAGE (average of the maxima) of 15, and a TOTAL (high water mark) of 20.
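The example above can be sketched directly (a trivial illustration of average-of-maxima versus high water mark):

```python
def max_locks_held_summary(per_record_max):
    """AVERAGE = mean of the per-record maxima; TOTAL = high water mark."""
    average = sum(per_record_max) / len(per_record_max)
    total = max(per_record_max)
    return average, total

# Transactions A and B from the example in the text.
print(max_locks_held_summary([10, 20]))  # (15.0, 20)
```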
SYNCH.XES - LOCK REQUEST
In a data sharing environment, locking gets an additional dimension as some of the locks might be sent to the coupling facility to ensure transaction integrity and data coherency. This requires extra CPU cycles, so checking the locking profile and the number of lock requests that must be sent to XES is even more important in a data sharing environment than in a non-data sharing environment.
The SYNCH.XES - LOCK REQUEST counter represents the total number of lock requests that were synchronously sent to XES. This number includes both P-lock (page set and page) and L-lock requests that were propagated to z/OS XES synchronously. This number is not incremented if the request is suspended during processing (either because of some type of global contention, or because the XES heuristic algorithm decided to convert the request from synchronous to asynchronous). The latter requests are included in the CONVERSIONS- XES counter.
Buffer pool information
The last piece of information that is important to have about the transaction profile is the (local and group) buffer pool information, which is shown in Example 8-26.
As with the locking information, we already looked at most of the information when we described these sections in the DB2 statistics report in “Buffer pool section” on page 408 and “Group buffer pool section” on page 413. The only difference between the accounting and statistics fields is that the accounting information applies to a specific transaction/application, but the statistics data applies to all the transactions in the DB2 subsystem that ran during the reporting time frame.
Example 8-26 shows only the totals for all buffer pools that are used by the transaction (TOTAL BPOOL). The accounting report contains the same information for each of the buffer pools that were accessed by the transaction.
Example 8-26 Accounting report - buffer pool and group buffer pool
TOTAL BPOOL ACTIVITY AVERAGE TOTAL
--------------------- -------- --------
BPOOL HIT RATIO (%) 99.95 N/A
GETPAGES 6.08 526390
GETPAGES-FAILED 0.00 0
BUFFER UPDATES 0.54 46784
SYNCHRONOUS WRITE 0.00 0
SYNCHRONOUS READ 0.00 221
SEQ. PREFETCH REQS 0.00 0
LIST PREFETCH REQS 0.00 5
DYN. PREFETCH REQS 0.01 1193
PAGES READ ASYNCHR. 0.00 56
 
GROUP TOTAL             AVERAGE TOTAL
--------------------- -------- --------
GBP-DEPEND GETPAGES 6.08 526223
READ(XI)-DATA RETUR 0.15 13393
READ(XI)-NO DATA RT 0.00 180
READ(NF)-DATA RETUR 0.00 111
READ(NF)-NO DATA RT 0.00 41
PREFETCH PAGES READ 0.00 55
CLEAN PAGES WRITTEN 0.00 0
UNREGISTER PAGE 0.00 0
ASYNCH GBP REQUESTS 0.17 15073
EXPLICIT X-INVALID 0.00 0
ASYNCH SEC-GBP REQ 0.00 0
PG P-LOCK LOCK REQ 0.37 32294
SPACE MAP PAGES 0.02 1862
DATA PAGES 0.17 14832
INDEX LEAF PAGES 0.18 15600
PG P-LOCK UNLOCK REQ 0.34 29495
PG P-LOCK LOCK SUSP 0.04 3502
SPACE MAP PAGES 0.00 277
DATA PAGES 0.02 1459
INDEX LEAF PAGES 0.02 1766
WRITE AND REGISTER 0.23 20201
WRITE & REGISTER MULT 0.04 3181
CHANGED PAGES WRITTEN 0.32 27385
From an application profile point of view, the following information is typically used.
GETPAGE REQUEST
This is the number of times an SQL statement must request a page from the DB2 buffer manager component. When an SQL statement must retrieve a row, that row lives on a page, and that page lives in the DB2 buffer pool (or on disk, or in the group buffer pool (GBP) in a data sharing system). So, to obtain the row, there is a request to the DB2 buffer manager component to get the page (a getpage). The amount of getpage activity is a good measure of the amount of work that DB2 is performing to satisfy the SQL requests from the application.
BUFFER UPDATES
This is the number of times a buffer update occurs. This field is incremented every time that a page is updated and is ready to be written to DASD/GBP. DB2 typically increments the counter for each row that is changed (inserted, updated, or deleted). For example, if an application updates two rows on the same page, you are likely to see GETPAGE REQUEST 1, BUFFER UPDATES 2 (provided no additional getpages were required to retrieve the page or any index updates were needed).
If one of the SQL statements triggers workfile activity, for example, to perform an ORDER BY, a GROUP BY, or a sort during merge scan join processing, this is counted as buffer update activity (the workfile pages are created and inserted into the buffer pool), even though the SQL statements themselves might only read data.
SYNCHRONOUS READS
When the DB2 buffer manager finds that the page is not in the buffer pool (or GBP), it must read the page from disk. When DB2 reads a single page from disk, and the application waits for this page to be brought into the BP, this is called a synchronous read operation. Even though disk I/O has improved dramatically over the last couple of years, it is still orders of magnitude slower than retrieving a page that is already in the BP. Therefore, reducing the number of synchronous I/Os has a positive effect on the transaction response time. So, it is important to verify the amount of I/O activity that an application must perform. A high number of synchronous I/O requests can be a sign of a poor access path (when combined with a high number of getpage requests) or a sign of a buffer pool that is performing poorly.
PAGES READ ASYNCHR.
This is the number of pages that were brought into the buffer pool by prefetch engines on behalf of this application. A high number of asynchronously read pages can indicate a sequential process. This might be normal, for example, during a batch process that must work its way through the entire customer database, but it could also be a sign of a process that is touching many more pages than it should (and these pages were brought in by a prefetch engine).
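A commonly used approximation of the buffer pool hit ratio can be derived from these counters: getpages that did not require a synchronous or asynchronous page read are hits. The sketch below, using the TOTAL column of Example 8-26, reproduces the reported 99.95% (the exact formula that OMPE uses may differ slightly):

```python
# Buffer pool hit ratio approximation: fraction of getpages satisfied
# without a page being read from disk. Values are the TOTAL column of
# Example 8-26; this is a sketch, not the exact OMPE formula.
def hit_ratio(getpages, sync_reads, async_pages_read):
    misses = sync_reads + async_pages_read
    return round(100.0 * (getpages - misses) / getpages, 2)

ratio = hit_ratio(getpages=526390, sync_reads=221, async_pages_read=56)
print(ratio)  # 99.95, matching BPOOL HIT RATIO (%) in the report
```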
GBP PG P-LOCK activity
When looking at the group buffer pool from an application point of view, it is important to check the page P-lock activity, as this activity introduces extra overhead. The group buffer pool section also has the breakdown (per group buffer pool) between data page P-locks (when using row level locking), space map P-locks, and page P-lock requests for index leaf pages. In our example, there is page P-lock activity for DATA PAGES, which indicates the use of row level locking (which is indeed the case for the Trader application). Switching the Trader application to use row level locking virtually eliminated all the deadlocks that existed before. Given that deadlocks are much more disruptive than the impact of page P-locks, this is certainly a good tradeoff.
Response time reporting for DRDA (T4) connections
When you analyze application elapsed and CPU time, it is important to understand where the time is being spent. DB2 distinguishes between accounting class 1, class 2, and class 3 times, as shown in Figure 8-30 on page 427 and Figure 8-31 on page 427.
The thread activity time is the entire time that the thread (application/transaction) was active. This time is identical to the DB2 class 1 elapsed time. It covers the time from the first SQL statement (which triggers the DB2 thread creation or reuse) until the thread ends or commits, depending on the type of attachment you use. This is described in more detail in 8.4.2, “Creating DB2 accounting records at a transaction boundary” on page 396.
Figure 8-30 Thread activity time reporting
The accounting class 2 time is the time that the transaction spends inside the DB2 database engine. When the thread is running an SQL statement, it can either be using CPU time, recorded as CLASS 2 CPU time (when running on a general-purpose engine) or SE CPU time (when running on a specialty engine), or it can be waiting for something. This waiting time is divided into class 3 wait time, which is a wait that DB2 is aware of and can account for, such as waiting for a lock that is held by another transaction in an incompatible state, and time where we know the thread was in DB2 but that does not belong to one of the class 3 wait times that DB2 can report on. This latter time is also known as not accounted time, and is typically a small portion of the class 2 elapsed time.
Figure 8-31 shows the life of a transaction and how the work inside and outside of DB2 is reported on, as accounting class 1, class 2, and class 3 time.
Figure 8-31 Accounting class1, 2, and 3 time reporting
Example 8-27 shows how the transaction is reported on in an OMPE accounting report. DB2 accounting information distinguishes between non-nested (activity on the main thread) and nested time (activity by stored procedures, UDFs, and triggers). This allows you to determine how much time is spent doing stored procedure work, for example, but this is beyond the scope of this publication. This section focuses on the non-nested activity, as the Trader application does not use any stored procedures, UDFs, or triggers.
Example 8-27 Accounting report - Class 1, 2, and 3 times
AVERAGE APPL(CL.1) DB2 (CL.2) IFI (CL.5)
------------ ---------- ---------- ----------
ELAPSED TIME 0.002269 0.000853 N/P
NONNESTED 0.002269 0.000853 N/A
STORED PROC 0.000000 0.000000 N/A
UDF 0.000000 0.000000 N/A
TRIGGER 0.000000 0.000000 N/A
CP CPU TIME 0.000201 0.000178 N/P
AGENT 0.000201 0.000178 N/A
NONNESTED 0.000201 0.000178 N/P
STORED PRC 0.000000 0.000000 N/A
UDF 0.000000 0.000000 N/A
TRIGGER 0.000000 0.000000 N/A
PAR.TASKS 0.000000 0.000000 N/A
SECP CPU 0.000000 N/A N/A
SE CPU TIME 0.000234 0.000201 N/A
NONNESTED 0.000234 0.000201 N/A
STORED PROC 0.000000 0.000000 N/A
UDF 0.000000 0.000000 N/A
TRIGGER 0.000000 0.000000 N/A
PAR.TASKS 0.000000 0.000000 N/A
SUSPEND TIME 0.000000 0.000399 N/A
AGENT N/A 0.000399 N/A
PAR.TASKS N/A 0.000000 N/A
STORED PROC 0.000000 N/A N/A
UDF 0.000000 N/A N/A
NOT ACCOUNT. N/A 0.000075 N/A
DB2 ENT/EXIT N/A 7.96 N/A
EN/EX-STPROC N/A 0.00 N/A
EN/EX-UDF N/A 0.00 N/A
DCAPT.DESCR. N/A N/A N/P
LOG EXTRACT. N/A N/A N/P
 
 
CLASS 3 SUSPENSIONS AVERAGE TIME AV.EVENT
-------------------- ------------ --------
LOCK/LATCH(DB2+IRLM) 0.000036 0.07
IRLM LOCK+LATCH 0.000033 0.05
DB2 LATCH 0.000003 0.02
SYNCHRON. I/O 0.000082 0.12
DATABASE I/O 0.000001 0.00
LOG WRITE I/O 0.000081 0.12
OTHER READ I/O 0.000012 0.01
OTHER WRTE I/O 0.000001 0.00
SER.TASK SWTCH 0.000001 0.00
UPDATE COMMIT 0.000000 0.00
OPEN/CLOSE 0.000001 0.00
SYSLGRNG REC 0.000000 0.00
EXT/DEL/DEF 0.000000 0.00
OTHER SERVICE 0.000000 0.00
ARC.LOG(QUIES) 0.000000 0.00
LOG READ 0.000000 0.00
DRAIN LOCK 0.000000 0.00
CLAIM RELEASE 0.000000 0.00
PAGE LATCH 0.000060 0.02
NOTIFY MSGS 0.000000 0.00
GLOBAL CONTENTION 0.000164 0.31
COMMIT PH1 WRITE I/O 0.000000 0.00
ASYNCH CF REQUESTS 0.000044 0.19
TCP/IP LOB XML 0.000000 0.00
TOTAL CLASS 3 0.000399 0.71
When the transaction is using a type 4 connection, as is the case here, the work comes in over the network into the DB2 DIST address space. As the class 1 time is the total time that the thread is active, it also includes the time that the transaction spends in the application server and in the network. As the class 2 time records only the time spent doing SQL-related activity, the time spent in the DDF address space not performing SQL activity is included in the class 1 time but not in the class 2 time; this is a small amount of time and processing.
Both class 1 and class 2 record both the elapsed and CPU time. For the CPU time, DB2 distinguishes between CPU time that is used on a general-purpose engine (CP CPU time) and time on a specialty engine (zIIP or zAAP, that is, SE CPU time). SE CPU time is not included in the CP CPU time.
The class 2 suspend time is the sum of all the CLASS 3 wait counters that DB2 tracks.
The class 3 suspensions section records the time and number of events that the transaction (on average, as this is an accounting report, not an accounting trace) was suspended for each of the suspension types that DB2 tracks.
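The relationship between the class 2 suspend time and the class 3 counters can be checked directly against Example 8-27. In the following sketch (values transcribed from the report), the individually rounded averages add up to 0.000400 rather than the reported 0.000399, which is just rounding noise:

```python
# The class 2 suspend time should equal the sum of the class 3 wait
# counters. Values are the non-zero average times (in seconds) from the
# class 3 section of the accounting report in Example 8-27.
class3_waits = {
    "LOCK/LATCH(DB2+IRLM)": 0.000036,
    "SYNCHRON. I/O":        0.000082,
    "OTHER READ I/O":       0.000012,
    "OTHER WRTE I/O":       0.000001,
    "SER.TASK SWTCH":       0.000001,
    "PAGE LATCH":           0.000060,
    "GLOBAL CONTENTION":    0.000164,
    "ASYNCH CF REQUESTS":   0.000044,
}
total_class3 = round(sum(class3_waits.values()), 6)
# The report shows TOTAL CLASS 3 = 0.000399; the 1-microsecond
# difference comes from each field being rounded independently.
print(total_class3)  # 0.0004
```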
In this case, the transaction response time is excellent: 2.269 milliseconds.
When you use a type 4 connection, you can calculate the time in DB2:
Class 2 non-nested ET + (SP + UDF + trigger Class 1 ET) + non-nested (Class 1 CP CPU + Class 1 SE CPU - Class 2 CP CPU - Class 2 SE CPU)
In our example, this is 0.000853 + (0 + 0 +0) + (0.000201 + 0.000234 - 0.000178 - 0.000201) = 0.000909, or 0.909 milliseconds.
The time outside DB2 can be calculated:
Total Class 1 ET - time in DB2 (that we previously calculated)
In our example this is 0.002269 - 0.000909 = 0.001360 or 1.36 milliseconds.
This time includes the time that the thread is performing work on the WebSphere Application Server, and the time that is spent in the network. When the thread is reused and does not cut an accounting record at commit, it also includes the time that the thread is idle (waiting for new work to arrive).
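The two formulas above can be written as a small Python helper (the function name is ours; the inputs are the average times from Example 8-27):

```python
# Time-in-DB2 calculation for a DRDA (type 4) connection, following the
# formula stated above. All times are average seconds from Example 8-27.
def t4_time_in_db2(cl2_nonnested_et, nested_cl1_et,
                   cl1_cp_cpu, cl1_se_cpu, cl2_cp_cpu, cl2_se_cpu):
    # Class 2 non-nested elapsed time, plus nested (SP/UDF/trigger)
    # class 1 elapsed time, plus the CPU consumed outside the DB2
    # engine (class 1 CPU minus class 2 CPU).
    return (cl2_nonnested_et + nested_cl1_et
            + (cl1_cp_cpu + cl1_se_cpu) - (cl2_cp_cpu + cl2_se_cpu))

in_db2 = round(t4_time_in_db2(0.000853, 0.0,
                              0.000201, 0.000234,
                              0.000178, 0.000201), 6)
outside_db2 = round(0.002269 - in_db2, 6)
print(in_db2, outside_db2)  # 0.000909 0.00136
```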
When using a type 4 connection, the accounting record also includes a distributed activity section, as shown in Example 8-28, from a DRDA requester with IP address 9.12.6.9 (our WebSphere Application Server).
Example 8-28 Accounting report - Distributed activity
---- DISTRIBUTED ACTIVITY -----------------------------------------------------------------------------------------------------
REQUESTER : ::9.12.6.9 #COMMIT(1) RECEIVED: 86522 MESSAGES SENT : 9.57 ROWS SENT : 1.30
PRODUCT ID : JDBC DRIVER #ROLLBK(1) RECEIVED: 0 MESSAGES RECEIVED: 9.57 BLOCKS SENT : 2.22
METHOD : DRDA PROTOCOL SQL RECEIVED : 4.85 BYTES SENT : 1237.17 #DDF ACCESSES: 86521
CONV.INITIATED : 0.15 BYTES RECEIVED : 1596.58 #RLUP THREADS: 86521
#THREADS INDOUBT : 0
#COMMIT(2) RECEIVED: N/A TRANSACTIONS RECV. : N/A #PREPARE RECEIVED: N/A MSG.IN BUFFER: N/A
#BCKOUT(2) RECEIVED: N/A #COMMIT(2) RES.SENT: N/A #LAST AGENT RECV.: N/A #FORGET SENT : N/A
#COMMIT(2) PERFORM.: N/A #BACKOUT(2)RES.SENT: N/A
#BACKOUT(2)PERFORM.: N/A
In an OMPE accounting report, most of the fields are averages (based on the number of occurrences), but some of the fields contain the total for all occurrences that are included in the report (the fields in bold). This section indicates how much DRDA traffic occurred in terms of the number of SQL statements, bytes, blocks, and messages. When blocking is used, rows are put into blocks, which are then sent out in messages. As these are short running transactions with only a small amount of data being passed, there is little blocking activity.
When DB2 accounting trace class 7, 8, or 10 is active, DB2 also produces accounting information at the program or package level, as shown in Example 8-29.
Example 8-29 Accounting report - Package level information
LOCATION: DB0Z OMEGAMON XE FOR DB2 PERFORMANCE EXPERT (V5R1M1) PAGE: 1-9
GROUP: DB0ZG ACCOUNTING REPORT - LONG REQUESTED FROM: ALL 21:29:00.00
MEMBER: D0Z1 TO: DATES 21:30:01.00
SUBSYSTEM: D0Z1 ORDER: TRANSACT INTERVAL FROM: 08/13/12 21:29:00.37
DB2 VERSION: V10 SCOPE: MEMBER TO: 08/13/12 21:30:00.99
TRANSACT: TraderClientApplication
SYSLN300 VALUE SYSLN300 TIMES SYSLN300 AVERAGE TIME AVG.EV TIME/EVENT
------------------ ------------------ ------------------ ------------ ------------------ ------------ ------ ------------
TYPE PACKAGE ELAP-CL7 TIME-AVG 0.000744 LOCK/LATCH 0.000034 0.06 0.000593
CP CPU TIME 0.000142 IRLM LOCK+LATCH 0.000032 0.04 0.000741
LOCATION DB0Z AGENT 0.000142 DB2 LATCH 0.000003 0.02 0.000177
COLLECTION ID JDBCNOHDBAT PAR.TASKS 0.000000 SYNCHRONOUS I/O 0.000082 0.12 0.000680
PROGRAM NAME SYSLN300 SE CPU TIME 0.000157 OTHER READ I/O 0.000012 0.01 0.001411
SUSPENSION-CL8 0.000353 OTHER WRITE I/O 0.000001 0.00 0.000482
ACTIVITY TYPE NONNESTED AGENT 0.000353 SERV.TASK SWITCH 0.000001 0.00 0.006281
ACTIVITY NAME 'BLANK' PAR.TASKS 0.000000 ARCH.LOG(QUIESCE) 0.000000 0.00 N/C
SCHEMA NAME 'BLANK' NOT ACCOUNTED 0.000091 ARCHIVE LOG READ 0.000000 0.00 N/C
SUCC AUTH CHECK 86521 AVG.DB2 ENTRY/EXIT 7.96 DRAIN LOCK 0.000000 0.00 0.000172
OCCURRENCES 86521 DB2 ENTRY/EXIT 688782 CLAIM RELEASE 0.000000 0.00 N/C
NBR OF ALLOCATIONS 86521 PAGE LATCH 0.000060 0.02 0.002916
SQL STMT - AVERAGE 2.98 CP CPU SU 8.11 NOTIFY MESSAGES 0.000000 0.00 0.000565
SQL STMT - TOTAL 257866 AGENT 8.11 GLOBAL CONTENTION 0.000164 0.31 0.000535
NBR RLUP THREADS 86521 PAR.TASKS 0.00 TCP/IP LOB XML 0.000000 0.00 N/C
SE CPU SU 9.18 TOTAL CL8 SUSPENS. 0.000353 0.52 0.000685
As the Trader workload is using JDBC, all the work that is performed by the application runs under the standard JDBC packages, in this case, SYSLN300. (We bound the package into a special collection called JDBCNOHDBAT - JDBC_No_High_performance_DBAT, so the regular packages do not use RELEASE(DEALLOCATE)).
At the package level, DB2 also records the ET and CPU time (GCP and SE) that was spent in the package (Class 7 time), and a large subset of the Class 3 suspension counters is also available at the package level (Class 8 time). When Class 10 is active, there is additional information about the SQL, locking, and buffer pool activity, but we did not record this information during this test.
Package level information is not that useful when using JDBC, as all work runs under the same set of packages. In this case, you must rely on using the client strings to trigger correct segregation of the JDBC work.
When you use SQLJ, package level information can be helpful. In the SQLJ case, the SQL statements run as static SQL and each application binds its own set of packages, and allows for a more granular view at the package level of the application’s database access pattern.
Response time reporting for RRS (type 2) connections
Although the accounting information for type 2 connections is largely the same as for a type 4 connection, there are a few differences that are worth mentioning.
Example 8-30 shows the accounting class 1, 2, and 3 information when using a type 2 (RRS) connection to access DB2 for z/OS. The application is identical to the one we used before (TraderClientApplication); we only changed the data source from a type 4 to a type 2 connection. However, the number of users being simulated was different, so you should not directly compare the type 2 and type 4 runs, as the application profile is different.
Example 8-30 Accounting report - Class 1, 2, and 3 times for T2
LOCATION: DB0Z OMEGAMON XE FOR DB2 PERFORMANCE EXPERT (V5R1M1) PAGE: 1-7
GROUP: DB0ZG ACCOUNTING REPORT - LONG REQUESTED FROM: NOT SPECIFIED
MEMBER: D0Z2 TO: NOT SPECIFIED
SUBSYSTEM: D0Z2 ORDER: TRANSACT INTERVAL FROM: 08/14/12 22:39:13.46
DB2 VERSION: V10 SCOPE: MEMBER TO: 08/14/12 22:48:25.51
TRANSACT: TraderClientApplication
ELAPSED TIME DISTRIBUTION CLASS 2 TIME DISTRIBUTION
---------------------------------------------------------------- ----------------------------------------------------------------
APPL |======================> 45% CPU |=> 2%
DB2 |=======> 15% SECPU |
SUSP |====================> 40% NOTACC |============> 25%
SUSP |====================================> 73%
AVERAGE APPL(CL.1) DB2 (CL.2) IFI (CL.5) CLASS 3 SUSPENSIONS AVERAGE TIME AV.EVENT HIGHLIGHTS
------------ ---------- ---------- ---------- -------------------- ------------ -------- --------------------------
ELAPSED TIME 0.018236 0.010092 N/P LOCK/LATCH(DB2+IRLM) 0.005214 0.70 #OCCURRENCES : 1304797
NONNESTED 0.018236 0.010092 N/A IRLM LOCK+LATCH 0.001948 0.26 #ALLIEDS : 1304797
STORED PROC 0.000000 0.000000 N/A DB2 LATCH 0.003266 0.44 #ALLIEDS DISTRIB: 0
UDF 0.000000 0.000000 N/A SYNCHRON. I/O 0.000001 0.00 #DBATS : 0
TRIGGER 0.000000 0.000000 N/A DATABASE I/O 0.000000 0.00 #DBATS DISTRIB. : 0
LOG WRITE I/O 0.000001 0.00 #NO PROGRAM DATA: 57
CP CPU TIME 0.000243 0.000197 N/P OTHER READ I/O 0.000000 0.00 #NORMAL TERMINAT: 0
AGENT 0.000243 0.000197 N/A OTHER WRTE I/O 0.000000 0.00 #DDFRRSAF ROLLUP: 130474
NONNESTED 0.000243 0.000197 N/P SER.TASK SWTCH 0.000817 0.11 #ABNORMAL TERMIN: 57
STORED PRC 0.000000 0.000000 N/A UPDATE COMMIT 0.000817 0.11 #CP/X PARALLEL. : 0
UDF 0.000000 0.000000 N/A OPEN/CLOSE 0.000000 0.00 #IO PARALLELISM : 0
TRIGGER 0.000000 0.000000 N/A SYSLGRNG REC 0.000000 0.00 #INCREMENT. BIND: 0
PAR.TASKS 0.000000 0.000000 N/A EXT/DEL/DEF 0.000000 0.00 #COMMITS : 1304734
OTHER SERVICE 0.000000 0.00 #ROLLBACKS : 0
SECP CPU 0.000000 N/A N/A ARC.LOG(QUIES) 0.000000 0.00 #SVPT REQUESTS : 0
LOG READ 0.000000 0.00 #SVPT RELEASE : 0
SE CPU TIME 0.000050 0.000041 N/A DRAIN LOCK 0.000000 0.00 #SVPT ROLLBACK : 0
NONNESTED 0.000050 0.000041 N/A CLAIM RELEASE 0.000000 0.00 MAX SQL CASC LVL: 0
STORED PROC 0.000000 0.000000 N/A PAGE LATCH 0.001033 0.08 UPDATE/COMMIT : 0.21
UDF 0.000000 0.000000 N/A NOTIFY MSGS 0.000000 0.00 SYNCH I/O AVG. : 0.000613
TRIGGER 0.000000 0.000000 N/A GLOBAL CONTENTION 0.000292 0.05
COMMIT PH1 WRITE I/O 0.000000 0.00
PAR.TASKS 0.000000 0.000000 N/A ASYNCH CF REQUESTS 0.000000 0.00
TCP/IP LOB XML 0.000000 0.00
SUSPEND TIME 0.000000 0.007358 N/A TOTAL CLASS 3 0.007358 0.95
AGENT N/A 0.007358 N/A
PAR.TASKS N/A 0.000000 N/A
STORED PROC 0.000000 N/A N/A
UDF 0.000000 N/A N/A
NOT ACCOUNT. N/A 0.002497 N/A
DB2 ENT/EXIT N/A 0.00 N/A
EN/EX-STPROC N/A 0.00 N/A
EN/EX-UDF N/A 0.00 N/A
DCAPT.DESCR. N/A N/A N/P
LOG EXTRACT. N/A N/A N/P
Now we are looking at an RRS (T2) connection, as indicated by a non-zero number in the #ALLIEDS field (in the highlights section). #DDFRRSAF ROLLUP is non-zero, which indicates that during this run we used rollup accounting (ACCUMACC=10).
This is also confirmed by the non-zero value in the END USER THRESH field (accounting records that are written because the ACCUMACC value was reached for the ACCUMUID aggregation field) in the Normal Term section, as shown in Example 8-31.
Example 8-31 Accounting report - Normal Term
NORMAL TERM. AVERAGE TOTAL
--------------- -------- --------
NEW USER 0.00 0
DEALLOCATION 0.00 0
APPL.PROGR. END 0.00 0
RESIGNON 0.00 0
DBAT INACTIVE 0.00 0
TYPE2 INACTIVE 0.00 0
RRS COMMIT 0.00 0
END USER THRESH 0.10 130474
BLOCK STOR THR 0.00 0
STALENESS THR 0.00 0
To calculate the time in DB2 for local applications, use the following formula:
Class 2 non-nested ET + (SP + UDF + trigger Class 1 ET)
In our example, this is 0.010092 + (0 + 0 +0) = 0.010092 or 10.092 milliseconds.
The time outside DB2 can be calculated:
Total Class 1 ET - time in DB2 (that we previously calculated)
In our example, this is 0.018236 - 0.010092 = 0.008144 or 8.144 milliseconds.
When using a local attach such as RRS, the CPU time spent in the application can be calculated:
Non-nested (Class 1 CP CPU + Class 1 SE CPU) - non-nested (Class 2 CP CPU + Class 2 SE CPU)
In our case:
(0.000243 + 0.000050) - (0.000197 + 0.000041) = 0.000055 or 55 microseconds
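The same arithmetic, written as a short Python sketch with the average times from Example 8-30 (variable names are ours):

```python
# Time breakdown for an RRS (type 2) connection, using the formulas
# above and the average class 1/class 2 times from Example 8-30.
cl1_et, cl2_et = 0.018236, 0.010092   # elapsed times
cl1_cp, cl1_se = 0.000243, 0.000050   # class 1 CP and SE CPU
cl2_cp, cl2_se = 0.000197, 0.000041   # class 2 CP and SE CPU

in_db2 = cl2_et  # plus nested SP/UDF/trigger class 1 ET, all zero here
outside_db2 = round(cl1_et - in_db2, 6)
appl_cpu = round((cl1_cp + cl1_se) - (cl2_cp + cl2_se), 6)
print(outside_db2, appl_cpu)  # 0.008144 5.5e-05
```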
Example 8-32 shows the accounting package level information (when accounting classes 7 and 8 are active). Type 2 JDBC and type 4 JDBC use the same packages (SYSLN300, in our case).
Example 8-32 Accounting report - Package information
SYSLN300 VALUE SYSLN300 TIMES SYSLN300 AVERAGE TIME AVG.EV TIME/EVENT
------------------ ------------------ ------------------ ------------ ------------------ ------------ ------ ------------
TYPE PACKAGE ELAP-CL7 TIME-AVG 0.008323 LOCK/LATCH 0.004466 0.60 0.007495
CP CPU TIME 0.000137 IRLM LOCK+LATCH 0.001541 0.20 0.007521
LOCATION DB0Z AGENT 0.000137 DB2 LATCH 0.002926 0.39 0.007482
COLLECTION ID JDBCNOHDBAT PAR.TASKS 0.000000 SYNCHRONOUS I/O 0.000001 0.00 0.000597
PROGRAM NAME SYSLN300 SE CPU TIME 0.000024 OTHER READ I/O 0.000000 0.00 N/C
SUSPENSION-CL8 0.006609 OTHER WRITE I/O 0.000000 0.00 0.010107
ACTIVITY TYPE NONNESTED AGENT 0.006609 SERV.TASK SWITCH 0.000817 0.11 0.007200
ACTIVITY NAME 'BLANK' PAR.TASKS 0.000000 ARCH.LOG(QUIESCE) 0.000000 0.00 N/C
SCHEMA NAME 'BLANK' NOT ACCOUNTED 0.001552 ARCHIVE LOG READ 0.000000 0.00 N/C
SUCC AUTH CHECK 0 AVG.DB2 ENTRY/EXIT N/P DRAIN LOCK 0.000000 0.00 N/C
OCCURRENCES 1304740 DB2 ENTRY/EXIT N/P CLAIM RELEASE 0.000000 0.00 N/C
NBR OF ALLOCATIONS 1304733 PAGE LATCH 0.001033 0.08 0.013138
SQL STMT - AVERAGE 3.81 CP CPU SU 7.95 NOTIFY MESSAGES 0.000000 0.00 0.002867
SQL STMT - TOTAL 4967798 AGENT 7.95 GLOBAL CONTENTION 0.000292 0.00 4.050997
NBR RLUP THREADS 1304733 PAR.TASKS 0.00 TCP/IP LOB XML 0.000000 0.00 N/C
SE CPU SU 1.41 TOTAL CL8 SUSPENS. 0.006609 0.79 0.008372
8.4.6 Monitoring threads and connections by using profiles
After we implement and activate the active thread profile monitoring for the DayTrader-EE6 application (for more information, see 4.3.17, “Using DB2 profiles” on page 180), we run the application to see whether the threshold of seven active threads was exceeded. During workload testing, we notice the messages that are shown in Figure 8-32 several times during workload execution.
DSNT772I 1 -D0Z1 DSNLQDIS A MONITOR PROFILE WARNING 352
CONDITION OCCURRED
31802 TIME(S) 2
IN PROFILE ID=1 3
WITH PROFILE FILTERING SCOPE=CLIENT_APPLNAME 4
WITH REASON=00E30505 5
DSNT772I 1 -D0Z1 DSNLQDIS A MONITOR PROFILE WARNING 177
CONDITION OCCURRED
12306 TIME(S) 2
IN PROFILE ID=1 3
WITH PROFILE FILTERING SCOPE=CLIENT_APPLNAME 4
WITH REASON=00E30505 5
Figure 8-32 Message DSNT772I
Where:
1. We scheduled the workload to run on both data sharing members, so we observed DSNT772I messages that are issued by both data sharing members.
2. nnnnn TIME(S) shows the number of times that the warning threshold was exceeded since the last DSNT772I message was issued. As observed in the syslog, DB2 issued the DSNT772I message every 5 minutes. Using the interval duration of 300 seconds, you can use these values to calculate the number of active threads that are needed per second:
 – D0Z1 = 31802 times
31802 / 300 = 106.01; 106.01 + 7 = 113.01 active threads per second
 – D0Z2 = 12306 times
12306 / 300 = 41.02; 41.02 + 7 = 48.02 active threads per second
3. PROFILE ID uniquely identifies the ID of the monitoring profile.
4. The filtering scope was for CLIENT_APPLNAME.
5. A warning occurred because the number of concurrent active threads exceeded the warning setting of the MONITOR THREADS keyword in the monitor profile with PROFILE ID 1 for the filtering scope CLIENT_APPLNAME.
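The per-member calculation in item 2 can be scripted as follows (the function name is ours; the warning counts come from Figure 8-32 and the warning threshold of seven threads from our monitor profile):

```python
# Estimate the number of concurrently active threads needed per second
# from a DSNT772I warning count, as in item 2 above: warnings per second
# over the interval, plus the warning threshold itself.
def active_threads_needed(warning_count, interval_seconds, warning_threshold):
    return round(warning_count / interval_seconds + warning_threshold, 2)

print(active_threads_needed(31802, 300, 7))  # 113.01 for member D0Z1
print(active_threads_needed(12306, 300, 7))  # 48.02 for member D0Z2
```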
Our DB2 setup collects statistics trace class 4 to include IFCID 402 information about DB2 profile warning and exception conditions. Using the IFCID 402 information, we create the OMPE record trace report that is shown in Figure 8-33 to obtain more information.
LOCATION: OMEGAMON XE FOR DB2 PERFORMANCE EXPERT (V5R1M1) PAGE: 1-1
GROUP: DB0ZG RECORD TRACE - LONG REQUESTED FROM: NOT SPECIFIED
MEMB TO: NOT SPECIFIED
SUBSYST ACTUAL FROM: 08/13/12 21:08:0
DB2 VERSION: V PAGE DATE: 08/13/12
0PRIMAUTH CONNECT INSTANCE END_USER WS_NAME TRANSACT
ORIGAUTH CORRNAME CONNTYPE RECORD TIME DESTNO ACE IFC DESCRIPTION DATA
PLANNAME CORRNMBR TCB CPU TIME ID
-------- -------- ----------- ----------------- ------ --- --- -------------- ---------------------------------
N/P CA037CD25FEA N/P N/P N/P
N/P 'BLANK' 21:08:00.15927885 132857 1 402 SYSTEM PROFILE NETWORKID: D0Z1 LUNAME: D0Z1 LUWSEQ:
N/P N/P MONITORING STA
|-----------------------------------------------------------------------------------------------------------------
|PROFILE ID : 1 (THR = THREAD, EXC = EXCEPTION, TSH = THRESHOLD)
|ACCUMULATED COUNTER OF ...
|THR EXC TSH EXCEEDED : 0 THR QUEUED/SUSP WHEN EXC TSH WAS EXCEEDED : 0
|REQUEST FAILED WHEN THR EXC TSH WAS EXCEEDED : 0 THR WARNING TSH BEING EXCEEDED :1019250
|CONNECTION EXC TSH BEING EXCEEDED : 0 CONNECTION WARN TSH BEING EXCEEDED : 0
|IDLE THR EXC TSH BEING EXCEEDED : 0 IDLE THR WARN TSH BEING EXCEEDED : 0
|-----------------------------------------------------------------------------------------------------------------
 
Figure 8-33 IFCID 402 record trace report
Additional information
For more information about the topics that are covered in this section, see the setup in 4.3.19, “Configure thread monitoring for the DayTrader-EE6 application” on page 187. For more information about managing and implementing DB2 profile monitoring, see Chapter 45, “Using profiles to monitor and optimize performance”, in DB2 10 for z/OS, Managing Performance, SC19-2978.
8.5 Using the performance database
Appendix D, “IBM OMEGAMON XE for DB2 performance database” on page 527 explains how to implement and populate the OMPE performance database (PDB) tables with DB2 statistics and accounting trace information. This section provides an example of an SQL table UDF that is used to hide the PDB query logic from the SQL user. The UDF receives the following input parameters:
clientApplicationInformation
For the DayTrader application, the clientApplicationInformation data source custom property was set to the value of TraderClientApplication.
Connection type
 – RRS for JDBC type 2 connections
 – DRDA for JDBC type 4 connections
Using the UDF is simple and straightforward. For example, to query the aggregated PDB accounting tables for JDBC type 2 connections that are collected on 14th August 2012, run the query that is shown in Example 8-33.
Example 8-33 Using the SQL table UDF to query JDBC type 2 accounting information
select * from
table(accounting('TraderClientApplication','RRS')) a
where substr("DateTime",1,10) = '2012-08-14'
order by "DateTime" ;
8.5.1 Querying aggregated JDBC type 2 accounting information
Using the UDF, we ran the query that is shown in Figure 8-34 on page 436 to obtain interval aggregated information about our JDBC type 2 workload execution. For each interval, the query returns aggregations on elapsed time, total and DB2 related processor and zIIP usage, number of commits, SQL DML, locks, get page requests, and row statistics on insert, update, and delete activities.
select
"DateTime"
, "Elapsed"
, "TotCPU"
, "TotzIIP"
, DB2CPU
, "DB2zIIP"
, "Commit"
, SQL
, "Locks"
, "RowsFetched"
, "RowsInserted"
, "RowsUpdated"
, "RowsDeleted"
, "GetPage"
from
table(accounting('TraderClientApplication','RRS')) a
where substr("DateTime",1,10) = '2012-08-14'
order by "DateTime" ;
---------+---------+---------+---------+---------+---------+---------+-------
DateTime Elapsed TotCPU TotzIIP DB2CPU DB2zIIP Commit SQL
---------+---------+---------+---------+---------+---------+---------+-------
2012-08-14-22.39 11.96 4.98 3.40 3.40 2.25 8970 39290
2012-08-14-22.41 2417.46 38.09 7.03 30.28 5.64 123299 164779
2012-08-14-22.42 3375.88 54.40 9.43 43.95 7.69 187120 248216
2012-08-14-22.43 3227.19 55.64 9.14 45.16 7.53 191309 252611
2012-08-14-22.44 3570.88 56.14 9.02 45.67 7.42 192869 255372
2012-08-14-22.45 3299.26 56.33 9.17 45.87 7.58 193129 254434
2012-08-14-22.46 3281.19 56.79 9.02 46.27 7.45 196599 259235
2012-08-14-22.47 3242.99 56.69 8.92 46.15 7.36 196339 259321
2012-08-14-22.48 250.33 4.40 .66 3.59 .54 15100 19722
-+---------+---------+---------+---------+---------
Locks Rows Rows Rows Rows GetPage
Fetched Inserted Updated Deleted
-+---------+---------+---------+---------+---------
92588 22366 6971 9945 0 171534
814550 62241 4646 22235 1152 757692
1223042 41686 7165 30923 1877 1149004
1248040 45798 7359 31095 1801 1184893
1261375 48539 7371 31672 1887 1203983
1258972 50600 7364 31069 1794 1197565
1282236 56418 7477 31693 1839 1221505
1281040 56666 7425 31869 1925 1218127
97848 19435 562 2391 136 92090
DSNE610I NUMBER OF ROWS DISPLAYED IS 9
DSNE616I STATEMENT EXECUTION WAS SUCCESSFUL, SQLCODE IS 100
Figure 8-34 PDB query JDBC type 2 aggregated accounting data
8.5.2 Querying aggregated JDBC type 4 accounting information
Using the UDF, we ran the query that is shown in Figure 8-35 to obtain interval aggregated information about our JDBC type 4 workload execution. For each interval, the query returns aggregations of elapsed time, total and DB2 related processor and zIIP usage, number of commits, SQL DML, locks, get page requests, and row statistics on insert, update, and delete activities.
select
"DateTime" , "Elapsed" , "TotCPU" , "TotzIIP" , DB2CPU , "DB2zIIP" , "Commit"
, SQL , "Locks" , "RowsFetched" , "RowsInserted" , "RowsUpdated" , "RowsDeleted" , "GetPage"
from table(accounting('TraderClientApplication','DRDA')) a
where substr("DateTime",1,10) = '2012-08-17' order by "DateTime" ;
---------+---------+---------+---------+---------+---------+-------
DateTime Elapsed TotCPU TotzIIP DB2CPU DB2zIIP Commit SQL
---------+---------+---------+---------+---------+---------+-------
2012-08-17-22.59 288.46 23.56 12.14 20.75 10.59 55235 13185
2012-08-17-22.59 291.66 27.38 14.90 23.69 12.76 64120 26333
2012-08-17-23.00 433.34 49.51 26.76 43.21 23.11 123548 27323
2012-08-17-23.00 423.71 48.04 24.41 42.37 21.32 114517 24805
2012-08-17-23.01 309.15 47.00 23.71 41.51 20.69 112304 24123
2012-08-17-23.01 375.32 51.78 28.08 45.12 24.23 131529 28668
2012-08-17-23.02 796.84 94.92 52.00 81.67 44.18 265297 58104
2012-08-17-23.02 141.11 20.66 10.31 18.35 9.06 46233 10033
2012-08-17-23.03 1101.42 124.68 68.45 106.66 57.73 360722 79375
2012-08-17-23.04 660.42 79.00 43.25 67.71 36.55 226388 50443
2012-08-17-23.04 125.28 27.55 13.98 23.90 11.95 76366 17038
2012-08-17-23.05 637.05 78.89 43.05 67.65 36.41 228661 49864
2012-08-17-23.05 161.45 28.07 14.36 24.35 12.28 78467 16906
--+---------+---------+---------+---------+-------
Locks Rows Rows Rows Rows GetPage
Fetched Inserted Updated Deleted
--+---------+---------+---------+---------+-------
367481 72629 2067 10083 541 818516
452411 91083 6059 17562 579 818516
813130 159809 4691 20264 1185 1469024
746300 148772 4415 18197 1061 1469024
730570 145980 4188 17805 1035 1523288
867468 170043 4969 21239 1256 1523288
1737859 343606 10085 42965 2515 1957605
301810 60545 1729 7414 443 1957605
2361915 467835 13735 58712 3479 2269742
1482703 290534 8826 37282 2172 1910393
497295 99019 2956 12587 740 1910393
1495054 295893 8654 36847 2182 1926461
508469 100866 2983 12480 707 1926461
Figure 8-35 PDB query JDBC type 4 aggregated accounting data
8.5.3 Using RTS to identify DB2 tables that are involved in DML operations
The query result in Figure 8-35 provides information about the number of rows fetched, inserted, updated, and deleted without telling you which tables these activities were performed on. All that we know is that the DayTrader-EE6 application accesses tables belonging to the DBTR8074 database.
Currently, we have no answers to the following questions:
Which tables were involved in SQL DML update, delete, insert, and fetch operations?
How many rows were inserted, updated, or deleted for each table?
If you extrapolate the SQL DML information, how large are the tables going to be in one month, six months, and one year?
Answering these questions is important for planning disk capacity, identifying candidates for partitioning, and identifying tables that need extra care when planning RUNSTATS, REORG, and backup activities. For example, you might use the RTS information to identify tables that are volatile or in need of extra REORG utility executions. If you identify tables that start small but are going to become huge, you might want to provide for table partitioning, and you might want to talk with the application developers to determine whether data-partitioned secondary indexes (DPSIs) are a good choice.
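To sketch the growth question, you can linearly extrapolate the row deltas that RTS reports for a representative measurement window. The following Python sketch uses hypothetical inputs (the 10-minute window, the 45000-row delta, and the 480 busy minutes per day are assumptions for illustration, not values measured in our environment):

```python
# Sketch: linear extrapolation of table growth from RTS row deltas.
# All numeric inputs are illustrative assumptions, not measured values.

def extrapolate_rows(rows_before, rows_after, window_minutes,
                     busy_minutes_per_day, days):
    """Project table size from the net row delta observed in a window."""
    delta = rows_after - rows_before
    rows_per_minute = delta / window_minutes
    growth_per_day = rows_per_minute * busy_minutes_per_day
    return rows_after + growth_per_day * days

# Hypothetical: a table grew by 45000 rows during a 10-minute window,
# and the workload runs about 8 busy hours (480 minutes) per day.
one_month = extrapolate_rows(0, 45000, 10, 480, 30)
print(f"Projected rows after 30 days: {one_month:,.0f}")
```

A linear projection like this is only a first approximation; workloads with daily or seasonal peaks need the window choice and busy-time assumption reviewed with the application owners.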
Let us focus on the query output that is shown in Figure 8-35 on page 437, in which we obtain information about the rows that are inserted, updated, and deleted during the workload execution that is performed on 17 August. Before and right after workload execution, we saved the RTS tables using the process that is described in 4.3.24, “DB2 real time statistics” on page 198. We then ran the query that is shown in Figure 8-36 on page 439 to determine the number of rows that were inserted, updated, and deleted for each of the tables that are accessed during workload execution.
WITH
Q1 AS
( SELECT
DBNAME,NAME,PARTITION,
NACTIVE,NPAGES,REORGINSERTS,REORGDELETES,REORGUPDATES,
REORGMASSDELETE,TOTALROWS,SNAPSHOTTS
FROM TABLESPACESTATS
WHERE SNAPSHOTTS = '2012-08-17-22.57.57.673670'
AND DBNAME = 'DBTR8074'
ORDER BY DBNAME, NAME, SNAPSHOTTS),
Q2 AS
( SELECT
DBNAME,NAME,PARTITION,
NACTIVE,NPAGES,REORGINSERTS,REORGDELETES,REORGUPDATES,
REORGMASSDELETE,TOTALROWS,SNAPSHOTTS
FROM TABLESPACESTATS
WHERE SNAPSHOTTS = '2012-08-17-23.08.49.191718'
AND DBNAME = 'DBTR8074'
ORDER BY DBNAME, NAME, SNAPSHOTTS)
SELECT
SUBSTR(Q1.DBNAME,1,8) AS DBNAME,
SUBSTR(Q1.NAME ,1,8) AS NAME,
Q1.PARTITION,
Q2.TOTALROWS - Q1.TOTALROWS AS #ROWS,
Q2.REORGINSERTS - Q1.REORGINSERTS AS INSERTS ,
Q2.REORGDELETES - Q1.REORGDELETES AS DELETES ,
Q2.REORGUPDATES - Q1.REORGUPDATES AS UPDATES ,
Q2.REORGMASSDELETE - Q1.REORGMASSDELETE AS MASSDELETE
FROM Q1,Q2
WHERE
(Q1.DBNAME,Q1.NAME,Q1.PARTITION) = (Q2.DBNAME,Q2.NAME,Q2.PARTITION)
---------+---------+---------+---------+---------+---------+---------+---
DBNAME NAME PARTITION #ROWS INSERTS DELETES UPDATES MASSDELETE
---------+---------+---------+---------+---------+---------+---------+---
DBTR8074 TSACCEJB 0 11489 11489 0 133129 1
DBTR8074 TSACPREJ 0 11489 11489 0 21603 1
DBTR8074 TSHLDEJB 0 2476 24037 21561 21561 1
DBTR8074 TSKEYGEN 0 3 3 0 83 0
DBTR8074 TSORDEJB 0 45598 45598 0 158305 1
DBTR8074 TSQUOEJB 0 1000 1000 0 45580 1
DSNE610I NUMBER OF ROWS DISPLAYED IS 6
Figure 8-36 Using RTS to determine workload-related table changes
The query output that is shown in Figure 8-36 shows table spaces, their SQL insert, update, and delete activities, and the number of rows that are stored in each table space upon workload completion. Because we store only one table per table space, we can relate the name of the table that was involved in the SQL DML operation to the table space that the table belongs to.
Querying performance indicator information
Obtaining the total sum for elapsed and CPU times and the total number of SQL, commits, locks, and getpage requests does not always provide an indication of good or bad application performance. To assess your application test results, you must perform an application target performance comparison. This process compares target performance indicators with the corresponding performance indicators of your workload execution. To assist you in performing a target performance comparison, the SQL table UDF that we introduce in 8.5, “Using the performance database” on page 434 provides the following key performance indicators (KPIs):
AVG-Time: Average elapsed time per commit
AVG-CPU: Average CPU usage per commit
Time/SQL: Average elapsed time per SQL
CPU/SQL: Average CPU usage per SQL
AVG-SQL: Average number of SQL per commit
LOCK/Tran: Average number of lock requests per commit
LOCK/SQL: Average number of locks per SQL
GETP/Tran: Average number of getpage requests per commit
GETP/SQL: Average number of getpage requests per SQL
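Each KPI is a simple ratio over the aggregated accounting columns. As an illustration of the arithmetic that the UDF performs, the following Python sketch recomputes the KPIs from the 2012-08-17-22.59 row of the aggregated type 4 accounting data in Figure 8-35:

```python
# Recompute the KPIs from one aggregated accounting interval.
# Values taken from the 2012-08-17-22.59 row of Figure 8-35.
interval = {
    "Elapsed": 288.46, "TotCPU": 23.56,
    "Commit": 55235, "SQL": 13185,
    "Locks": 367481, "GetPage": 818516,
}

kpis = {
    "AVG-Time":  interval["Elapsed"] / interval["Commit"],   # elapsed per commit
    "AVG-CPU":   interval["TotCPU"]  / interval["Commit"],   # CPU per commit
    "Time/SQL":  interval["Elapsed"] / interval["SQL"],      # elapsed per SQL
    "CPU/SQL":   interval["TotCPU"]  / interval["SQL"],      # CPU per SQL
    "AVG-SQL":   interval["SQL"]     / interval["Commit"],   # SQL per commit
    "LOCK/Tran": interval["Locks"]   / interval["Commit"],   # locks per commit
    "LOCK/SQL":  interval["Locks"]   / interval["SQL"],      # locks per SQL
    "GETP/Tran": interval["GetPage"] / interval["Commit"],   # getpages per commit
    "GETP/SQL":  interval["GetPage"] / interval["SQL"],      # getpages per SQL
}

for name, value in kpis.items():
    print(f"{name:9s} {value:10.6f}")
```

The results match the corresponding 2012-08-17-22.59 row in Figure 8-38 (for example, AVG-Time .005222 and LOCK/SQL 27.87).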
In our workload scenario, we use these performance indicators to compare the DayTrader-EE6 JDBC type 2 with the DayTrader-EE6 JDBC type 4 workload execution.
JDBC type 2
The performance indicators of the DayTrader-EE6 JDBC type 2 are shown in Figure 8-37.
select
"DateTime"
,"AVG-Time"
,"AVG-CPU"
,"Time/SQL"
,"CPU/SQL"
,"AVG-SQL"
,"LOCK/Tran"
,"LOCK/SQL"
,"GETP/Tran"
,"GETP/SQL"
from table(accounting('TraderClientApplication','RRS')) a
where substr("DateTime",1,10) = '2012-08-14'
order by "DateTime" ;
---------+---------+---------+---------+---------+---------+-
DateTime AVG-Time AVG-CPU Time/SQL CPU/SQL AVG-SQL
---------+---------+---------+---------+---------+---------+-
2012-08-14-22.39 .001333 .000555 .000304 .000126 4.380156
2012-08-14-22.41 .019606 .000308 .014670 .000231 1.336417
2012-08-14-22.42 .018041 .000290 .013600 .000219 1.326507
2012-08-14-22.43 .016868 .000290 .012775 .000220 1.320434
2012-08-14-22.44 .018514 .000291 .013983 .000219 1.324069
2012-08-14-22.45 .017083 .000291 .012967 .000221 1.317430
2012-08-14-22.46 .016689 .000288 .012657 .000219 1.318597
2012-08-14-22.47 .016517 .000288 .012505 .000218 1.320781
2012-08-14-22.48 .016578 .000291 .012692 .000223 1.306092
 
--------+---------+---------+---------+
LOCK/Tran LOCK/SQL GETP/Tran GETP/SQL
--------+---------+---------+---------+
10.321962 2.356528 19.123076 4.365843
6.606298 4.943287 6.145159 4.598231
6.536137 4.927329 6.140466 4.629048
6.523686 4.940560 6.193608 4.690583
6.540060 4.939362 6.242491 4.714624
6.518813 4.948128 6.200855 4.706780
6.522088 4.946230 6.213180 4.711960
6.524633 4.939977 6.204202 4.697371
6.480000 4.961362 6.098675 4.669404
Figure 8-37 Performance indicators JDBC type 2
JDBC type 4
The performance indicators of the DayTrader-EE6 JDBC type 4 workload are shown in Figure 8-38.
select
"DateTime"
,"AVG-Time"
,"AVG-CPU"
,"Time/SQL"
,"CPU/SQL"
,"AVG-SQL"
,"LOCK/Tran"
,"LOCK/SQL"
,"GETP/Tran"
,"GETP/SQL"
from
table(accounting('TraderClientApplication','DRDA')) a
where substr("DateTime",1,10) = '2012-08-17'
order by "DateTime" ;
---------+---------+---------+---------+---------+---------+
DateTime AVG-Time AVG-CPU Time/SQL CPU/SQL AVG-SQL
---------+---------+---------+---------+---------+---------+
2012-08-17-22.58 .002664 .000604 .001339 .000303 1.989921
2012-08-17-22.59 .005222 .000426 .021877 .001786 .238707
2012-08-17-22.59 .004548 .000427 .011075 .001039 .410683
2012-08-17-23.00 .003507 .000400 .015859 .001812 .221152
2012-08-17-23.00 .003699 .000419 .017081 .001936 .216605
2012-08-17-23.01 .002752 .000418 .012815 .001948 .214800
2012-08-17-23.01 .002853 .000393 .013091 .001806 .217959
2012-08-17-23.02 .003003 .000357 .013714 .001633 .219014
2012-08-17-23.02 .003052 .000446 .014064 .002059 .217009
2012-08-17-23.03 .003053 .000345 .013876 .001570 .220044
2012-08-17-23.04 .002917 .000348 .013092 .001566 .222816
2012-08-17-23.04 .001640 .000360 .007352 .001616 .223109
2012-08-17-23.05 .002786 .000345 .012775 .001582 .218069
2012-08-17-23.05 .002057 .000357 .009549 .001660 .215453
2012-08-17-23.06 .003343 .000345 .015042 .001555 .222288
---------+---------+---------+---------+
LOCK/Tran LOCK/SQL GETP/Tran GETP/SQL
---------+---------+---------+---------+
9.476892 4.762445 17.073746 8.580111
6.653046 27.871141 14.818792 62.079332
7.055692 17.180382 12.765377 31.083279
6.581490 29.759909 11.890309 53.765106
6.516936 30.086676 12.827999 59.222898
6.505289 30.285204 13.563969 63.146706
6.595260 30.259104 11.581385 53.135482
6.550616 29.909455 7.378918 33.691398
6.528021 30.081730 42.342158 195.116615
6.547743 29.756409 6.292219 28.595174
6.549388 29.393632 8.438578 37.872311
6.511994 29.187404 25.016276 112.125425
6.538299 29.982632 8.424965 38.634305
6.480036 30.076245 24.551225 113.951319
6.552779 29.478718 6.300677 28.344596
Figure 8-38 Performance indicators JDBC type 4
Conclusion
When we compare the JDBC type 2 with the JDBC type 4 performance indicators, we notice a ratio of less than one SQL statement per commit for the JDBC type 4 workload. This is caused by the data source custom property AUTOCOMMIT=ON, which causes the JDBC driver to issue an SQL commit for each SQL statement. We also notice that the JDBC type 4 workload has a higher resource usage per SQL statement for CPU, number of locks, and number of getpage requests.
8.6 Monitoring from the z/OS side with RMF
Now, we look at the monitoring that you can perform on the z/OS side. You can use an online monitor to look at the current system performance, but here we focus on how to look at the historical performance information by using RMF monitor postprocessor reports. There are many different reports that you can look at, but here we focus on the workload activity report.
In 8.2.2, “Using client information strings to classify work in WLM and RMF reporting” on page 369, we set up our WLM service classes and, for monitoring purposes, reporting classes:
RTRADE as the reporting class for the Trader application inside the WebSphere Application Server
RTRADE0Z as the reporting class for the DDF enclaves that run the DB2 work when the Trader application is using a type 4 connection.
Example 8-34 shows a sample JCL that you can use to run the RMF postprocessor. The first step sorts the data, which is especially important when you use multiple input data sets, for example, when combining data from multiple systems. The second step generates the reports. In this case, we use the following postprocessor control statement:
SYSRPTS(WLMGL(SCLASS,RCLASS,SCPER,RCPER,SYSNAM(SC64)))
This statement creates a sysplex-wide workload activity report. We look at only one of the systems, SC64 (SYSNAM(SC64)). The report has information about the different service classes (SCLASS) and report classes (RCLASS), the individual service class periods (SCPER) within each service class, and the individual reporting class periods (RCPER) within each reporting class.
For more information about the different postprocessor reporting options, see Chapter 17, “Long-term reporting with the Postprocessor”, in z/OS V1R13 Resource Measurement Facility (RMF) User's Guide, SC33-7990.
Example 8-34 JCL that is used to create the postprocessor workload activity report
//BAT4RMF JOB (999,POK),'BART JOB',CLASS=A,MSGCLASS=T,
// NOTIFY=&SYSUID,TIME=1440,REGION=0M
/*JOBPARM SYSAFF=SC63
//RMFSORT EXEC PGM=SORT
//SORTIN DD DISP=SHR,DSN=DB2SMF.WASRB.SC64.T4.SMFRMF
//SORTOUT DD DISP=(NEW,PASS),UNIT=(SYSDA,5),SPACE=(CYL,(800,800))
//SORTWK01 DD DISP=(NEW,DELETE),UNIT=SYSDA,SPACE=(CYL,(100,200))
//SORTWK02 DD DISP=(NEW,DELETE),UNIT=SYSDA,SPACE=(CYL,(100,200))
//SORTWK03 DD DISP=(NEW,DELETE),UNIT=SYSDA,SPACE=(CYL,(100,200))
//SORTWK04 DD DISP=(NEW,DELETE),UNIT=SYSDA,SPACE=(CYL,(100,200))
//SORTWK05 DD DISP=(NEW,DELETE),UNIT=SYSDA,SPACE=(CYL,(100,200))
//SYSPRINT DD SYSOUT=*
//SYSOUT DD SYSOUT=*
//SYSIN DD *
SORT FIELDS=(11,4,CH,A,7,4,CH,A),EQUALS
MODS E15=(ERBPPE15,36000,,N),E35=(ERBPPE35,3000,,N)
//SYSOUT DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//RMFPOST EXEC PGM=ERBRMFPP
//*
//*
//MFPINPUT DD DISP=(OLD,DELETE),DSN=*.RMFSORT.SORTOUT
//MFPMSGDS DD SYSOUT=*
//SYSOUT DD SYSOUT=*
//PPXSRPTS DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
  SYSRPTS(WLMGL(SCLASS,RCLASS,SCPER,RCPER,SYSNAM(SC64)))            
SYSOUT(T)
/*
8.6.1 Workload activity when using a type 4 connection
When the applications that run in WebSphere Application Server on z/OS use type 4 connectivity to DB2 for z/OS, the work consists of two pieces. The first part is the work that is done by the application inside the application server, and the second part is the DB2 work. As the work comes into DB2 through DDF, it has its own enclaves (inside the DDF address space) to represent and account for the DB2 part of the work.
Workload activity that is reported for the DB2 work in the RTRADE0Z reporting class
Example 8-35 shows the workload activity report class period report for the RTRADE0Z reporting class for period 1. The DDFONL service class that is used by this reporting class uses two periods. The period 2 part is shown in Example 8-36 on page 445. As an example, we picked a one-minute interval that started at 21.29.01. (We reduced the RMF interval to one minute to have more granularity in the reports.)
In this one-minute interval, DB2 completed 43968 transactions in period 1, or 732.8 transactions per second, by running, on average, 6.91 threads (enclaves) in parallel.
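The throughput figure follows directly from the report fields: END/S is ENDED divided by the interval length in seconds. A minimal Python sanity check:

```python
# Derive transactions per second from the RMF interval fields.
ended = 43968          # ENDED: transactions completed in the interval
interval_seconds = 60  # INTERVAL 01.00.000 (one minute)

end_per_second = ended / interval_seconds
print(f"END/S = {end_per_second:.2f}")  # matches the reported END/S of 732.80
```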
Example 8-35 Workload activity - reporting class RTRADE0Z period 1
                                                W O R K L O A D A C T I V I T Y
PAGE 72
z/OS V1R13 SYSPLEX SANDBOX DATE 08/13/2012 INTERVAL 01.00.000 MODE = GOAL
CONVERTED TO z/OS V1R13 RMF TIME 21.29.01
POLICY ACTIVATION DATE/TIME 08/10/2012 19.52.00
------------------------------------------------------------------------------------------------------------ REPORT CLASS PERIODS
REPORT BY: POLICY=WLMPOL REPORT CLASS=RTRADE0Z PERIOD=1
HOMOGENEOUS: GOAL DERIVED FROM SERVICE CLASS DDFONL
-TRANSACTIONS- TRANS-TIME HHH.MM.SS.TTT --DASD I/O-- ---SERVICE--- SERVICE TIME ---APPL %--- --PROMOTED-- ----STORAGE----
AVG 6.91 ACTUAL 9 SSCHRT 3.9 IOC 0 CPU 24.780 CP 20.60 BLK 0.000 AVG 0.00
MPL 6.91 EXECUTION 9 RESP 0.0 CPU 1447K SRB 0.000 AAPCP 0.00 ENQ 0.000 TOTAL 0.00
ENDED 43968 QUEUED 0 CONN 0.0 MSO 0 RCT 0.000 IIPCP 0.07 CRM 0.000 SHARED 0.00
END/S 732.80 R/S AFFIN 0 DISC 0.0 SRB 0 IIT 0.000 LCK 0.020
#SWAPS 0 INELIGIBLE 0 Q+PEND 0.0 TOT 1447K HST 0.000 AAP 0.00 SUP 0.000 -PAGE-IN RATES-
EXCTD 0 CONVERSION 0 IOSQ 0.0 /SEC 24117 AAP 0.000 IIP 20.69 SINGLE 0.0
AVG ENC 6.91 STD DEV 29 IIP 12.417 BLOCK 0.0
REM ENC 0.00 ABSRPTN 3491 SHARED 0.0
MS ENC 0.00 TRX SERV 3491 HSP 0.0
GOAL: RESPONSE TIME 000.00.01.000 FOR 90%
RESPONSE TIME EX PERF AVG --EXEC USING%-- -------------- EXEC DELAYS % ----------- -USING%- --- DELAY % --- %
SYSTEM ACTUAL% VEL% INDX ADRSP CPU AAP IIP I/O TOT CPU IIP CRY CNT UNK IDL CRY CNT QUI
SC64 100 4.9 0.5 6.4 1.5 0.0 1.2 0.0 52 52 0.7 0.0 0.0 45 0.0 0.0 0.0 0.0
----------RESPONSE TIME DISTRIBUTION----------
----TIME---- --NUMBER OF TRANSACTIONS-- -------PERCENT------- 0 10 20 30 40 50 60 70 80 90 100
HH.MM.SS.TTT CUM TOTAL IN BUCKET CUM TOTAL IN BUCKET |....|....|....|....|....|....|....|....|....|....|
< 00.00.00.500 43939 43939 100 100 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
<= 00.00.00.600 43941 2 100 0.0 >
<= 00.00.00.700 43955 14 100 0.0 >
<= 00.00.00.800 43957 2 100 0.0 >
<= 00.00.00.900 43966 9 100 0.0 >
<= 00.00.01.000 43968 2 100 0.0 >
<= 00.00.01.100 43968 0 100 0.0 >
<= 00.00.01.200 43968 0 100 0.0 >
<= 00.00.01.300 43968 0 100 0.0 >
<= 00.00.01.400 43968 0 100 0.0 >
<= 00.00.01.500 43968 0 100 0.0 >
<= 00.00.02.000 43968 0 100 0.0 >
<= 00.00.04.000 43968 0 100 0.0 >
> 00.00.04.000 43968 0 100 0.0 >
The number of CPU seconds that were needed to process these 43968 transactions is 24.78. The SERVICE TIME CPU value includes the CPU time on zAAP and zIIP engines (12.417 seconds on zIIP in this case). So, we used 12.363 seconds on a general CP.
The workload activity report also gives an indication, in percent of an engine, that this reporting or service class (period) used. This information can be found in the APPL% column. In this case, it is 20.6% of a general engine, or one-fifth of an engine. It also used 20.69% of a zIIP engine. In this example, the zIIP time is not included in the CP%.
 
Tip: The service time CPU includes the CPU seconds that are used on a zIIP or zAAP engine. The CP percentage in the APPL% column does not include the zIIP and zAAP processing.
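The relationship between the SERVICE TIME and APPL% fields can be verified with a few lines of arithmetic. This Python sketch uses the values from Example 8-35 and assumes the one-minute RMF interval:

```python
# Split the reported CPU service time into general-CP and zIIP portions
# and express each as a percentage of one engine over the interval.
cpu_service = 24.780   # SERVICE TIME CPU (includes zAAP and zIIP time)
iip_service = 12.417   # SERVICE TIME IIP (zIIP seconds)
aap_service = 0.000    # SERVICE TIME AAP (zAAP seconds)
interval_seconds = 60  # one-minute RMF interval

cp_service = cpu_service - iip_service - aap_service
appl_cp = cp_service / interval_seconds * 100
appl_iip = iip_service / interval_seconds * 100

print(f"General CP seconds: {cp_service:.3f}")  # 12.363, as derived in the text
print(f"APPL% CP:  {appl_cp:.2f}")              # ~20.6, matches CP 20.60
print(f"APPL% IIP: {appl_iip:.2f}")             # ~20.7, matches IIP 20.69
```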
A workload activity reporting or service class period report also indicates whether the WLM goal for the service class period is met. If the performance index (PI) is less than one, which is the case here, the goal is exceeded. When the PI = 1, you meet the goal, and when the PI > 1, WLM could not achieve the goal that is specified in the policy.
After the performance index in a period report, there is also a response time distribution section. In this run, 43939 transactions out of 43968 completed in less than or equal to 0.5 seconds, so you are exceeding the response time goal of 000.00.01.000 second for 90% of the transactions.
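The goal check can be reproduced from the response time distribution: compare the cumulative percentage of transactions that completed within the goal time against the goal percentage. A small Python sketch using the bucket values from Example 8-35:

```python
# Verify the "1 second for 90%" response time goal from the
# cumulative response time distribution buckets.
ended = 43968
cum_within_half = 43939  # CUM TOTAL at the < 00.00.00.500 bucket
cum_within_1s = 43968    # CUM TOTAL at the <= 00.00.01.000 bucket
goal_percent = 90.0

pct_half = cum_within_half / ended * 100
pct_1s = cum_within_1s / ended * 100
goal_met = pct_1s >= goal_percent

print(f"{pct_half:.2f}% within 0.5s, {pct_1s:.2f}% within 1s")
print(f"Goal met: {goal_met}")
```

With 99.93% of transactions completing in under 0.5 seconds, the goal is comfortably exceeded, which is consistent with the reported PI of 0.5.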
Example 8-36 shows the period 2 information for the RTRADE0Z reporting class in this one minute interval. Only 24 transactions finished in period two.
Example 8-36 Workload activity - reporting class RTRADE0Z period 2
                                                W O R K L O A D A C T I V I T Y
PAGE 73
z/OS V1R13 SYSPLEX SANDBOX DATE 08/13/2012 INTERVAL 01.00.000 MODE = GOAL
CONVERTED TO z/OS V1R13 RMF TIME 21.29.01
POLICY ACTIVATION DATE/TIME 08/10/2012 19.52.00
REPORT BY: POLICY=WLMPOL REPORT CLASS=RTRADE0Z PERIOD=2
HOMOGENEOUS: GOAL DERIVED FROM SERVICE CLASS DDFONL
-TRANSACTIONS- TRANS-TIME HHH.MM.SS.TTT --DASD I/O-- ---SERVICE--- SERVICE TIME ---APPL %--- --PROMOTED-- ----STORAGE----
AVG 0.03 ACTUAL 156 SSCHRT 0.1 IOC 0 CPU 0.014 CP 0.02 BLK 0.000 AVG 0.00
MPL 0.03 EXECUTION 156 RESP 0.0 CPU 804 SRB 0.000 AAPCP 0.00 ENQ 0.000 TOTAL 0.00
ENDED 24 QUEUED 0 CONN 0.0 MSO 0 RCT 0.000 IIPCP 0.00 CRM 0.000 SHARED 0.00
END/S 0.40 R/S AFFIN 0 DISC 0.0 SRB 0 IIT 0.000 LCK 0.000
#SWAPS 0 INELIGIBLE 0 Q+PEND 0.0 TOT 804 HST 0.000 AAP 0.00 SUP 0.000 -PAGE-IN RATES-
EXCTD 0 CONVERSION 0 IOSQ 0.0 /SEC 13 AAP 0.000 IIP 0.00 SINGLE 0.0
AVG ENC 0.03 STD DEV 255 IIP 0.003 BLOCK 0.0
REM ENC 0.00 ABSRPTN 500 SHARED 0.0
MS ENC 0.00 TRX SERV 500 HSP 0.0
GOAL: EXECUTION VELOCITY 40.0% VELOCITY MIGRATION: I/O MGMT 0.0% INIT MGMT 0.0%
RESPONSE TIME EX PERF AVG --EXEC USING%-- -------------- EXEC DELAYS % ----------- -USING%- --- DELAY % --- %
SYSTEM VEL% INDX ADRSP CPU AAP IIP I/O TOT CPU CRY CNT UNK IDL CRY CNT QUI
SC64 --N/A-- 0.0 0.0 0.0 0.0 0.0 0.0 0.0 50 50 0.0 0.0 50 0.0 0.0 0.0 0.0
----------RESPONSE TIME DISTRIBUTIONS----------
SYSTEM: SC64 ----INTERVAL: 00.01.00.000 ---MRT CHANGES: 0 ---
----TIME---- -NUMBER OF TRANSACTIONS- ------PERCENT------
HH.MM.SS.TTT CUM TOTAL IN BUCKET CUM TOTAL IN BUCKET
< 00.00.00.030 9 9 37.5 37.5
<= 00.00.00.036 14 5 58.3 20.8
<= 00.00.00.042 16 2 66.7 8.3
<= 00.00.00.048 17 1 70.8 4.2
<= 00.00.00.054 17 0 70.8 0.0
<= 00.00.00.061 18 1 75.0 4.2
<= 00.00.00.067 18 0 75.0 0.0
<= 00.00.00.073 18 0 75.0 0.0
<= 00.00.00.079 18 0 75.0 0.0
<= 00.00.00.085 18 0 75.0 0.0
<= 00.00.00.091 18 0 75.0 0.0
<= 00.00.00.122 18 0 75.0 0.0
<= 00.00.00.244 19 1 79.2 4.2
> 00.00.00.244 24 5 100 20.8
After looking at the different periods, look at all the transactions in the reporting class, with both periods combined, as shown in Example 8-37. The total number of transactions is 43992 (43968 in period 1 + 24 in period 2). Because almost all the transactions completed in period 1, the report for the entire report class is similar to the report of period 1. However, the report class report does not have a performance index or response time distribution information. That information is available only at the period report level.
Example 8-37 Workload activity - reporting class RTRADE0Z total
                                                W O R K L O A D A C T I V I T Y
PAGE 74
z/OS V1R13 SYSPLEX SANDBOX DATE 08/13/2012 INTERVAL 01.00.000 MODE = GOAL
CONVERTED TO z/OS V1R13 RMF TIME 21.29.01
POLICY ACTIVATION DATE/TIME 08/10/2012 19.52.00
------------------------------------------------------------------------------------------------------------ REPORT CLASS(ES)
REPORT BY: POLICY=WLMPOL REPORT CLASS=RTRADE0Z
DESCRIPTION =DDF DAY TRADER
-TRANSACTIONS- TRANS-TIME HHH.MM.SS.TTT --DASD I/O-- ---SERVICE--- SERVICE TIME ---APPL %--- --PROMOTED-- ----STORAGE----
AVG 6.93 ACTUAL 9 SSCHRT 4.0 IOC 0 CPU 24.794 CP 20.62 BLK 0.000 AVG 0.00
MPL 6.93 EXECUTION 9 RESP 0.0 CPU 1448K SRB 0.000 AAPCP 0.00 ENQ 0.000 TOTAL 0.00
ENDED 43992 QUEUED 0 CONN 0.0 MSO 0 RCT 0.000 IIPCP 0.07 CRM 0.000 SHARED 0.00
END/S 733.20 R/S AFFIN 0 DISC 0.0 SRB 0 IIT 0.000 LCK 0.020
#SWAPS 0 INELIGIBLE 0 Q+PEND 0.0 TOT 1448K HST 0.000 AAP 0.00 SUP 0.000 -PAGE-IN RATES-
EXCTD 0 CONVERSION 0 IOSQ 0.0 /SEC 24130 AAP 0.000 IIP 20.70 SINGLE 0.0
AVG ENC 6.93 STD DEV 29 IIP 12.420 BLOCK 0.0
REM ENC 0.00 ABSRPTN 3480 SHARED 0.0
MS ENC 0.00 TRX SERV 3480 HSP 0.0
Workload activity that is reported for the WebSphere Application Server work in the RTRADE reporting class
The section “Workload activity that is reported for the DB2 work in the RTRADE0Z reporting class” on page 444 described the workload activity report for the DDF side of the work, that is, the part of the transaction where it runs SQL requests in DB2 for z/OS.
When you use a type 4 connection, the part of the transaction that is running inside the WebSphere Application Server is represented by a different enclave. It is classified by the Subsystem Type CB in the WLM classification panels. (WebSphere Application Server uses Subsystem Type CB for enclave work.)
In Figure 8-13 on page 375, we use the Transaction Class to select the service class (WASONL) and reporting class (RTRADE) that applies to the trader application. The workload activity report for SC64 (that runs our WebSphere Application Server) for period 1 of the RTRADE reporting class is shown in Example 8-38.
Example 8-38 Workload activity - reporting class RTRADE period 1
                                                W O R K L O A D A C T I V I T Y
PAGE 70
z/OS V1R13 SYSPLEX SANDBOX DATE 08/13/2012 INTERVAL 01.00.000 MODE = GOAL
CONVERTED TO z/OS V1R13 RMF TIME 21.29.01
POLICY ACTIVATION DATE/TIME 08/10/2012 19.52.00
------------------------------------------------------------------------------------------------------------ REPORT CLASS PERIODS
REPORT BY: POLICY=WLMPOL REPORT CLASS=RTRADE PERIOD=1
HOMOGENEOUS: GOAL DERIVED FROM SERVICE CLASS WASONL
-TRANSACTIONS- TRANS-TIME HHH.MM.SS.TTT --DASD I/O-- ---SERVICE--- SERVICE TIME ---APPL %--- --PROMOTED-- ----STORAGE----
AVG 33.02 ACTUAL 50 SSCHRT 0.0 IOC 0 CPU 60.113 CP 41.19 BLK 0.000 AVG 0.00
MPL 33.02 EXECUTION 49 RESP 0.0 CPU 3510K SRB 0.000 AAPCP 32.28 ENQ 0.000 TOTAL 0.00
ENDED 39697 QUEUED 0 CONN 0.0 MSO 0 RCT 0.000 IIPCP 0.00 CRM 0.000 SHARED 0.00
END/S 661.62 R/S AFFIN 0 DISC 0.0 SRB 0 IIT 0.000 LCK 0.000
#SWAPS 0 INELIGIBLE 0 Q+PEND 0.0 TOT 3510K HST 0.000 AAP 58.99 SUP 0.000 -PAGE-IN RATES-
EXCTD 0 CONVERSION 0 IOSQ 0.0 /SEC 58504 AAP 35.396 IIP 0.00 SINGLE 0.0
AVG ENC 33.02 STD DEV 68 IIP 0.000 BLOCK 0.0
REM ENC 0.00 ABSRPTN 1772 SHARED 0.0
MS ENC 0.00 TRX SERV 1772 HSP 0.0
RESP -------------------------------- STATE SAMPLES BREAKDOWN (%) ------------------------------- ------STATE------
SUB P TIME --ACTIVE-- READY IDLE -----------------------------WAITING FOR----------------------------- SWITCHED SAMPL(%)
TYPE (%) SUB APPL LOCAL SYSPL REMOT
CB BTE 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
CB EXE 101 99.3 0.7 0.0 0.0 0.0 0.0 0.0
GOAL: RESPONSE TIME 000.00.01.000 FOR 90%
RESPONSE TIME EX PERF AVG --EXEC USING%-- -------------- EXEC DELAYS % ----------- -USING%- --- DELAY % --- %
SYSTEM ACTUAL% VEL% INDX ADRSP CPU AAP IIP I/O TOT AAP CPU Q CRY CNT UNK IDL CRY CNT QUI
MPL
SC64 100 13.5 0.5 31.0 0.3 0.9 0.0 0.0 7.8 7.5 0.2 0.1 0.0 0.0 91 0.0 0.0 0.0 0.0
----------RESPONSE TIME DISTRIBUTION----------
----TIME---- --NUMBER OF TRANSACTIONS-- -------PERCENT------- 0 10 20 30 40 50 60 70 80 90 100
HH.MM.SS.TTT CUM TOTAL IN BUCKET CUM TOTAL IN BUCKET |....|....|....|....|....|....|....|....|....|....|
< 00.00.00.500 39595 39595 100 100 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
<= 00.00.00.600 39637 42 100 0.1 >
<= 00.00.00.700 39652 15 100 0.0 >
<= 00.00.00.800 39666 14 100 0.0 >
<= 00.00.00.900 39678 12 100 0.0 >
<= 00.00.01.000 39684 6 100 0.0 >
<= 00.00.01.100 39687 3 100 0.0 >
<= 00.00.01.200 39689 2 100 0.0 >
<= 00.00.01.300 39689 0 100 0.0 >
<= 00.00.01.400 39689 0 100 0.0 >
<= 00.00.01.500 39689 0 100 0.0 >
<= 00.00.02.000 39697 8 100 0.0 >
<= 00.00.04.000 39697 0 100 0.0 >
> 00.00.04.000 39697 0 100 0.0 >
REPORT BY: POLICY=WLMPOL REPORT CLASS=RTRADE PERIOD=2
---------- ALL DATA ZERO ----------
It contains information similar to the workload activity report of the DDF work that was described in “Workload activity that is reported for the DB2 work in the RTRADE0Z reporting class” on page 444. However, consider the State Samples Breakdown section. State samples are collected on an ongoing basis and reported as a percentage of the average transaction response time. There are two phases:
BTE phase: The begin-to-end phase applies to requests that are handled by the application control region (ACR).
EXE phase: The execution phase applies to requests that are handled by the servant regions.
The performance is good; the state samples did not show any delays where the work is waiting.
For more information about WLM Delay Monitoring, go to the following website:
8.6.2 Workload activity when using a type 2 connection
When the applications that are running in WebSphere Application Server on z/OS use type 2 connectivity to DB2 for z/OS, the work is reported as one piece (compared to two pieces in the case of type 4). The enclave now includes both the work in WebSphere Application Server and the work in DB2 (when running SQL statements).
Workload activity that is reported for the RTRADE reporting class
To show some additional functionality of the RMF postprocessor, we use a duration report. A duration report does not report on individual RMF intervals, but groups multiple intervals into a single reporting interval.
 
Note: There are some caveats when using duration reports. For more information, go to:
Example 8-39 shows the SYSIN DD statements that are used to create a workload activity report for the 9-minute time frame 22:39 - 22:48.
Example 8-39 Duration report SYSIN
//SYSIN DD *
SUMMARY(TOT,INT)
DATE(08142012,08142012)
RTOD(2239,2248)
DINTV(0009)
REPORTS(CPU)
SYSRPTS(WLMGL(SCLASS,RCLASS,SCPER,RCPER))
SYSOUT(T)
/*
Example 8-40 shows the workload activity report class period report for the RTRADE reporting class for period 1. The report looks similar to the ones that we looked at before. When using a type 2 connection, all the work that is done by the WebSphere Application Server application, including the SQL activity, is done under the enclave that is created by the WebSphere Application Server control region. As a result, the APPL% CP is much higher than in the type 4 case, which is expected, as the DB2 work is also included now. The state samples breakdown shows a small percentage of delay for TYP8, which is the J2C resource manager delay that occurs when you call a J2C connector to a resource manager, such as DB2.
Example 8-40 Workload activity - reporting class RTRADE period 1
                                                W O R K L O A D A C T I V I T Y
PAGE 72
z/OS V1R13 SYSPLEX SANDBOX START 08/14/2012-22.39.00 INTERVAL 000.08.59 MODE = GOAL
CONVERTED TO z/OS V1R13 RMF END 08/14/2012-22.47.59
POLICY ACTIVATION DATE/TIME 08/10/2012 19.52.00
------------------------------------------------------------------------------------------------------------ REPORT CLASS PERIODS
REPORT BY: POLICY=WLMPOL REPORT CLASS=RTRADE PERIOD=1
HOMOGENEOUS: GOAL DERIVED FROM SERVICE CLASS WASONL
-TRANSACTIONS- TRANS-TIME HHH.MM.SS.TTT --DASD I/O-- ---SERVICE--- SERVICE TIME ---APPL %--- --PROMOTED-- ----STORAGE----
AVG 27.23 ACTUAL 39 SSCHRT 0.2 IOC 0 CPU 554.785 CP 80.93 BLK 0.000 AVG 0.00
MPL 27.23 EXECUTION 37 RESP 0.2 CPU 32396K SRB 0.000 AAPCP 0.62 ENQ 0.000 TOTAL 0.00
ENDED 392496 QUEUED 1 CONN 0.1 MSO 0 RCT 0.000 IIPCP 0.00 CRM 0.000 SHARED 0.00
END/S 726.85 R/S AFFIN 0 DISC 0.0 SRB 0 IIT 0.000 LCK 0.008
#SWAPS 0 INELIGIBLE 0 Q+PEND 0.1 TOT 32396K HST 0.000 AAP 21.81 SUP 0.000 -PAGE-IN RATES-
EXCTD 0 CONVERSION 0 IOSQ 0.0 /SEC 59994 AAP 117.756 IIP 0.00 SINGLE 0.0
AVG ENC 27.23 STD DEV 37 IIP 0.000 BLOCK 0.0
REM ENC 0.00 ABSRPTN 2203 SHARED 0.0
MS ENC 0.00 TRX SERV 2203 HSP 0.0
RESP -------------------------------- STATE SAMPLES BREAKDOWN (%) ------------------------------- ------STATE------
SUB P TIME --ACTIVE-- READY IDLE -----------------------------WAITING FOR----------------------------- SWITCHED SAMPL(%)
TYPE (%) SUB APPL TYP8 LOCAL SYSPL REMOT
CB BTE 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
CB EXE 95.6 99.0 0.5 0.0 0.0 0.4 0.0 0.0 0.0
CB : TYP8 - CONNECTOR
GOAL: RESPONSE TIME 000.00.01.000 FOR 90%
RESPONSE TIME EX PERF AVG --EXEC USING%-- -------------- EXEC DELAYS % ----------- -USING%- --- DELAY % --- %
SYSTEM ACTUAL% VEL% INDX ADRSP CPU AAP IIP I/O TOT CPU AAP Q CRY CNT UNK IDL CRY CNT QUI
MPL
SC64 100 2.2 0.5 28.4 1.5 0.1 0.0 0.0 70 69 0.6 0.5 0.0 0.0 28 0.0 0.0 0.0 0.0
----------RESPONSE TIME DISTRIBUTION----------
----TIME---- --NUMBER OF TRANSACTIONS-- -------PERCENT------- 0 10 20 30 40 50 60 70 80 90 100
HH.MM.SS.TTT CUM TOTAL IN BUCKET CUM TOTAL IN BUCKET |....|....|....|....|....|....|....|....|....|....|
< 00.00.00.500 392K 392K 100 100 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
<= 00.00.00.600 392K 34 100 0.0 >
<= 00.00.00.700 392K 9 100 0.0 >
<= 00.00.00.800 392K 7 100 0.0 >
<= 00.00.00.900 392K 6 100 0.0 >
<= 00.00.01.000 392K 4 100 0.0 >
<= 00.00.01.100 392K 7 100 0.0 >
<= 00.00.01.200 392K 12 100 0.0 >
<= 00.00.01.300 392K 5 100 0.0 >
<= 00.00.01.400 392K 24 100 0.0 >
<= 00.00.01.500 392K 18 100 0.0 >
<= 00.00.02.000 392K 12 100 0.0 >
<= 00.00.04.000 392K 7 100 0.0 >
> 00.00.04.000 392K 0 100 0.0 >
Example 8-41 shows a similar workload activity report class period report for the RTRADE reporting class for period 2. Only 48 transactions ended in this period.
Example 8-41 Workload activity - reporting class RTRADE period 2
                                                W O R K L O A D A C T I V I T Y
PAGE 73
z/OS V1R13 SYSPLEX SANDBOX START 08/14/2012-22.39.00 INTERVAL 000.08.59 MODE = GOAL
CONVERTED TO z/OS V1R13 RMF END 08/14/2012-22.47.59
POLICY ACTIVATION DATE/TIME 08/10/2012 19.52.00
REPORT BY: POLICY=WLMPOL REPORT CLASS=RTRADE PERIOD=2
HOMOGENEOUS: GOAL DERIVED FROM SERVICE CLASS WASONL
-TRANSACTIONS- TRANS-TIME HHH.MM.SS.TTT --DASD I/O-- ---SERVICE--- SERVICE TIME ---APPL %--- --PROMOTED-- ----STORAGE----
AVG 0.05 ACTUAL 633 SSCHRT 0.1 IOC 0 CPU 5.884 CP 0.35 BLK 0.000 AVG 0.00
MPL 0.05 EXECUTION 631 RESP 0.2 CPU 343611 SRB 0.000 AAPCP 0.05 ENQ 0.000 TOTAL 0.00
ENDED 48 QUEUED 2 CONN 0.2 MSO 0 RCT 0.000 IIPCP 0.00 CRM 0.000 SHARED 0.00
END/S 0.09 R/S AFFIN 0 DISC 0.0 SRB 0 IIT 0.000 LCK 0.012
#SWAPS 0 INELIGIBLE 0 Q+PEND 0.1 TOT 343611 HST 0.000 AAP 0.74 SUP 0.000 -PAGE-IN RATES-
EXCTD 0 CONVERSION 0 IOSQ 0.0 /SEC 636 AAP 3.982 IIP 0.00 SINGLE 0.0
AVG ENC 0.05 STD DEV 2.233 IIP 0.000 BLOCK 0.0
REM ENC 0.00 ABSRPTN 13K SHARED 0.0
MS ENC 0.00 TRX SERV 13K HSP 0.0
GOAL: EXECUTION VELOCITY 40.0% VELOCITY MIGRATION: I/O MGMT 31.2% INIT MGMT 31.2%
RESPONSE TIME EX PERF AVG --EXEC USING%-- -------------- EXEC DELAYS % ----------- -USING%- --- DELAY % --- %
SYSTEM VEL% INDX ADRSP CPU AAP IIP I/O TOT AAP CPU CRY CNT UNK IDL CRY CNT QUI
SC64 --N/A-- 31.2 1.3 0.0 7.3 15 0.0 0.0 49 46 2.8 0.0 0.0 29 0.0 0.0 0.0 0.0
----------RESPONSE TIME DISTRIBUTIONS----------
SYSTEM: SC64 ----INTERVAL: 00.08.59.998 ---MRT CHANGES: 0 ---
----TIME---- -NUMBER OF TRANSACTIONS- ------PERCENT------
HH.MM.SS.TTT CUM TOTAL IN BUCKET CUM TOTAL IN BUCKET
< 00.00.01.193 45 45 93.8 93.8
<= 00.00.01.432 45 0 93.8 0.0
<= 00.00.01.670 45 0 93.8 0.0
<= 00.00.01.909 45 0 93.8 0.0
<= 00.00.02.148 45 0 93.8 0.0
<= 00.00.02.387 45 0 93.8 0.0
<= 00.00.02.625 45 0 93.8 0.0
<= 00.00.02.864 45 0 93.8 0.0
<= 00.00.03.103 45 0 93.8 0.0
<= 00.00.03.341 45 0 93.8 0.0
<= 00.00.03.580 45 0 93.8 0.0
<= 00.00.04.774 45 0 93.8 0.0
<= 00.00.09.548 47 2 97.9 4.2
> 00.00.09.548 48 1 100 2.1
Example 8-42 shows the workload activity report for the entire RTRADE reporting class (392544 transactions in total).
Example 8-42 Workload activity - reporting class for trade (DRDA)
                                                W O R K L O A D A C T I V I T Y
PAGE 74
z/OS V1R13 SYSPLEX SANDBOX START 08/14/2012-22.39.00 INTERVAL 000.08.59 MODE = GOAL
CONVERTED TO z/OS V1R13 RMF END 08/14/2012-22.47.59
POLICY ACTIVATION DATE/TIME 08/10/2012 19.52.00
------------------------------------------------------------------------------------------------------------ REPORT CLASS(ES)
REPORT BY: POLICY=WLMPOL REPORT CLASS=RTRADE
DESCRIPTION =report class for trade (drda)
-TRANSACTIONS- TRANS-TIME HHH.MM.SS.TTT --DASD I/O-- ---SERVICE--- SERVICE TIME ---APPL %--- --PROMOTED-- ----STORAGE----
AVG 27.28 ACTUAL 39 SSCHRT 0.2 IOC 0 CPU 560.670 CP 81.28 BLK 0.000 AVG 0.00
MPL 27.28 EXECUTION 37 RESP 0.2 CPU 32740K SRB 0.000 AAPCP 0.67 ENQ 0.000 TOTAL 0.00
ENDED 392544 QUEUED 1 CONN 0.1 MSO 0 RCT 0.000 IIPCP 0.00 CRM 0.000 SHARED 0.00
END/S 726.94 R/S AFFIN 0 DISC 0.0 SRB 0 IIT 0.000 LCK 0.020
#SWAPS 0 INELIGIBLE 0 Q+PEND 0.1 TOT 32740K HST 0.000 AAP 22.54 SUP 0.000 -PAGE-IN RATES-
EXCTD 0 CONVERSION 0 IOSQ 0.0 /SEC 60630 AAP 121.738 IIP 0.00 SINGLE 0.0
AVG ENC 27.28 STD DEV 45 IIP 0.000 BLOCK 0.0
REM ENC 0.00 ABSRPTN 2223 SHARED 0.0
MS ENC 0.00 TRX SERV 2223 HSP 0.0
 

1 This Java public class, Class BrokerClientInfo, is a data structure that is used to describe client information.
2 The IDBACK subsystem parameter determines the maximum number of concurrent connections that can be identified to DB2 from batch.