Chapter 5. Runtime diagnostics

Runtime diagnostics

This chapter describes the enhancements that were made to Runtime Diagnostics (RTD) in z/OS V2R2.

RTD is a z/OS component that helps you find and resolve soft failures that might lead to sick but not dead (SBND) situations.

This chapter includes the following topics:

•5.1, “RTD overview”

•5.2, “Health-based routing integration with runtime diagnostics”

5.1 RTD overview

RTD was originally introduced in z/OS V1R12. It quickly analyzes a system that is in an SBND state and searches for evidence of soft failures. For more information about soft failures, see “PFA overview”. Soft failures can occur in the following areas:

•Component issues

•Global resource contention

•Important address space execution issues

You can use RTD when your operations staff report a problem on the system. The benefit of RTD is that it provides a timely, comprehensive analysis at a critical time without the need for a storage dump. This advantage can save you time.

You can use RTD to quickly analyze an ailing system for the following types of problems:

•Component problems that are identified as critical messages in OPERLOG

•ENQ, GRS latch contention for system address spaces, and z/OS UNIX file system contention

•Address spaces with high CPU usage

•Address spaces that appear to be in a task control block (TCB) enabled loop

•Local lock conditions

•JES2 health exceptions

•Server address space health exceptions

With that information, you can decide on the next step, such as one of the following tasks:

•Cancel the relevant jobs

•Further investigate the class of resources or a single address space by using a monitor, such as IBM RMF™ or OMEGAMON XE for z/OS

Use the following z/OS command to start RTD from your console or SDSF:

S HZR,SUB=MSTR

You can then start analyzing your system by entering the following command:

F HZR,ANALYZE

When you enter the ANALYZE command, RTD displays a report, as shown in Figure 5-1.

F HZR,ANALYZE
HZR0200I RUNTIME DIAGNOSTICS RESULT 319
SUMMARY: SUCCESS
REQ: 001 TARGET SYSTEM: SC81 HOME: SC81 2015/08/18 - 13:40:11
INTERVAL: 60 MINUTES
EVENTS:
FOUND: 04 - PRIORITIES: HIGH:02 MED:02 LOW:00
TYPES: CF:01 DUMPS:02 ENQ:01
----------------------------------------------------------------------
EVENT 01: HIGH - ENQ - SYSTEM: SC81 2015/08/18 - 13:40:11
ENQ WAITER - ASID:0035 - JOBNAME:HZSPROC - SYSTEM:SC81
ENQ BLOCKER - ASID:0014 - JOBNAME:HZSPROC - SYSTEM:SC81
QNAME: SYSDSN
RNAME: SYS1.SC81.HZSPDATA
ERROR: ADDRESS SPACES MIGHT BE IN ENQ CONTENTION.
ACTION: USE YOUR SOFTWARE MONITORS TO INVESTIGATE BLOCKING JOBS AND
ACTION: ASIDS.
----------------------------------------------------------------------
EVENT 02: HIGH - CF - SYSTEM: SC81 2015/08/18 - 12:42:41
IXC585E STRUCTURE HZS_HEALTHCHKLOG IN COUPLING FACILITY CF8B,
PHYSICAL STRUCTURE VERSION CF61F249 A0D1E082,
IS AT OR ABOVE STRUCTURE FULL MONITORING THRESHOLD OF 80%:
SPACE USAGE IN-USE TOTAL %
ENTRIES: 1645 1954 84
ERROR: INDICATED STRUCTURE IS APPROACHING FULL MONITORING THRESHOLD.
ACTION: D XCF,STR,STRNAME=strname TO GET STRUCTURE INFORMATION.
ACTION: INCREASE STRUCTURE SIZE OR TAKE ACTION AGAINST APPLICATION.
----------------------------------------------------------------------
EVENT 03: MED - DUMPS - SYSTEM: SC81 2015/08/18 - 13:05:10
IEA799I AUTOMATIC ALLOCATION OF SVC DUMP DATASET FAILED
DUMPID=018 REQUESTED BY JOB (CONSOLE )
DYNALLOC FAILED RETURN CODE=04 ERROR RSN CODE=970C INFO RSN CODE=0000
SMS RSN CODE=4379
ERROR: THE SYSTEM WAS UNABLE TO ALLOCATE A DUMP DATA SET FOR A DUMP.
ACTION: D D TO VIEW ALLOCATION STATUS. DD ADD,VOL=volser TO ADD DUMP
ACTION: RESOURCES.
----------------------------------------------------------------------
EVENT 04: MED - DUMPS - SYSTEM: SC81 2015/08/18 - 13:05:20
IEA799I AUTOMATIC ALLOCATION OF SVC DUMP DATASET FAILED
DUMPID=019 REQUESTED BY JOB (HSIBMGR )
DYNALLOC FAILED RETURN CODE=04 ERROR RSN CODE=970C INFO RSN CODE=0000
SMS RSN CODE=4379
ERROR: THE SYSTEM WAS UNABLE TO ALLOCATE A DUMP DATA SET FOR A DUMP.
ACTION: D D TO VIEW ALLOCATION STATUS. DD ADD,VOL=volser TO ADD DUMP
ACTION: RESOURCES.
----------------------------------------------------------------------

Figure 5-1 Output of the RTD analyze command
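If you capture the report text (for example, from the system log), a short script can pull out the event headers and the ERROR and ACTION lines for triage. The following Python sketch is not part of RTD. It assumes that the report follows the layout that is shown in Figure 5-1, and the names Event, parse_report, and high_priority are hypothetical helpers that are introduced only for this illustration.

```python
import re
from dataclasses import dataclass, field

# Event header as it appears in Figure 5-1, for example:
# "EVENT 01: HIGH - ENQ - SYSTEM: SC81 2015/08/18 - 13:40:11"
# Assumption: other reports use the same header layout.
EVENT_RE = re.compile(
    r"^EVENT\s+(\d+):\s+(\w+)\s+-\s+(\w+)\s+-\s+SYSTEM:\s+(\S+)\s+"
    r"(\d{4}/\d{2}/\d{2})\s+-\s+(\d{2}:\d{2}:\d{2})$"
)


@dataclass
class Event:
    number: int
    priority: str          # HIGH, MED, or LOW in the sample report
    kind: str              # for example ENQ, CF, or DUMPS
    system: str
    timestamp: str
    errors: list = field(default_factory=list)
    actions: list = field(default_factory=list)


def parse_report(text):
    """Collect the events of one report, keeping ERROR and ACTION text."""
    events = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = EVENT_RE.match(line)
        if match:
            number, priority, kind, system, date, time = match.groups()
            current = Event(int(number), priority, kind, system,
                            f"{date} {time}")
            events.append(current)
        elif current is not None and line.startswith("ERROR:"):
            current.errors.append(line[len("ERROR:"):].strip())
        elif current is not None and line.startswith("ACTION:"):
            current.actions.append(line[len("ACTION:"):].strip())
    return events


def high_priority(events):
    """Return only the events that the report marks as HIGH."""
    return [event for event in events if event.priority == "HIGH"]


# Two events copied from Figure 5-1 as sample input.
SAMPLE = """\
HZR0200I RUNTIME DIAGNOSTICS RESULT 319
EVENT 01: HIGH - ENQ - SYSTEM: SC81 2015/08/18 - 13:40:11
QNAME: SYSDSN
ERROR: ADDRESS SPACES MIGHT BE IN ENQ CONTENTION.
ACTION: USE YOUR SOFTWARE MONITORS TO INVESTIGATE BLOCKING JOBS AND
ACTION: ASIDS.
----------------------------------------------------------------------
EVENT 03: MED - DUMPS - SYSTEM: SC81 2015/08/18 - 13:05:10
ERROR: THE SYSTEM WAS UNABLE TO ALLOCATE A DUMP DATA SET FOR A DUMP.
ACTION: D D TO VIEW ALLOCATION STATUS. DD ADD,VOL=volser TO ADD DUMP
ACTION: RESOURCES.
"""

if __name__ == "__main__":
    for event in high_priority(parse_report(SAMPLE)):
        print(event.number, event.kind, " ".join(event.actions))
```

Because ACTION text wraps across several lines in the report, the sketch stores each line separately and joins them only when it prints, so the original wording is preserved.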

5.2 Health-based routing integration with runtime diagnostics

In z/OS V2R2, health-based routing is an enhancement to Workload Manager (WLM) dynamic workload routing. Its purpose is to further reduce the impact of health issues in middleware or transaction manager servers.

WLM provides a health service that is called IWM4HLTH to enable multiple callers to report on a server’s health. The server identifies itself and can provide reasons for its health ratings.

When you run the F HZR,ANALYZE command, RTD calls a new WLM query service, IWM4QHLT, to obtain server health states. The information is then used for diagnostic and serviceability purposes.

The health indicator is an integer from 0 to 100 that shows how well a server is performing. If any server shows a current health value that is less than 100, RTD returns a SERVERHEALTH event. When predictive failure analysis (PFA) starts RTD for checks that can indicate that a metric is too low, the event is returned to PFA and is included in the PFA check exception report.

These new functions improve routing recommendations and diagnostic reporting about server health states.
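The reporting rule for the health indicator can be illustrated with a few lines of Python. This sketch only models the rule that is described in this section (an integer of 0 - 100, with any value below 100 worth reporting). It does not call IWM4QHLT or any other WLM service, and the function name and the server names are hypothetical.

```python
def servers_needing_attention(health_by_server):
    """Return (server, health) pairs with health below 100, worst first.

    health_by_server maps a server name to its current health value.
    The values are assumed to be integers in the range 0 - 100.
    """
    flagged = []
    for server, health in health_by_server.items():
        if not isinstance(health, int) or not 0 <= health <= 100:
            raise ValueError(f"invalid health value for {server}: {health!r}")
        if health < 100:
            flagged.append((server, health))
    # Sort by health value so that the least healthy server comes first.
    return sorted(flagged, key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical server names and values, for illustration only.
    sample = {"SRVA": 100, "SRVB": 40, "SRVC": 85}
    for server, health in servers_needing_attention(sample):
        print(f"{server}: health {health}")
```

A server at exactly 100 is treated as fully healthy and is omitted, which mirrors the "less than 100" condition in the text.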
