What is new with IBM Cluster Aware AIX and Reliable Scalable Clustering Technology
This chapter provides details about what is new with Cluster Aware AIX (CAA) and with Reliable Scalable Clustering Technology (RSCT).
This chapter covers the following topics:
4.1 Cluster Aware AIX
This section describes some of the new CAA features.
4.1.1 Cluster Aware AIX tunables
This section describes the CAA tunables and how they behave. Example 4-1 shows the list of the CAA tunables with IBM AIX 7.2.0.0 and IBM PowerHA V7.2.0. Newer versions can have more tunables, different defaults, or both.
 
Attention: Do not change any of these tunables without the explicit permission of IBM technical support.
In general, never modify these values manually because they are set and managed by PowerHA.
Example 4-1 List of Cluster Aware AIX tunables
# clctrl -tune -a
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).communication_mode = u
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).config_timeout = 240
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).deadman_mode = a
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).link_timeout = 30000
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).local_merge_policy = m
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).network_fdt = 20000
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).no_if_traffic_monitor = 0
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).node_down_delay = 10000
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).node_timeout = 30000
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).packet_ttl = 32
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).remote_hb_factor = 1
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).repos_mode = e
ha72cluster(71a0d83c-e467-11e5-8022-4217e0ce7b02).site_merge_policy = p
#
4.1.2 What is new in Cluster Aware AIX: Overview
The following new features are included in CAA:
Automatic Repository Update (ARU)
Also known as Automatic Repository Replacement (ARR). For more information, see 4.2, “Automatic repository update for the repository disk” on page 79.
Monitor /var usage
New -g option for the lscluster command
Interface Failure Detection:
 – Tuning for Interface Failure Detection
 – Send multicast packets to generate incoming traffic
 – Implementation of network monitor (NETMON) within CAA
Functional enhancements
 – Reduce dependency of CAA node name on host name
 – Roll back on mkcluster failure or partial success
Reliability, availability, and serviceability (RAS) enhancements
 – Message improvements
 – Several syslog.caa serviceability improvements
 – Enhanced Dead Man Switch (DMS) error logging
4.1.3 Monitoring /var usage
Starting with PowerHA V7.2, the /var file system is monitored by default. This monitoring is done by the clconfd subsystem. The following default values are used:
Threshold 75% (range 70 - 95)
Interval 15 min (range 5 - 30)
To change the default values, use the chssys command. The -t option specifies the threshold (in percent), and the -i option specifies the interval (in minutes):
chssys -s clconfd -a "-t 80 -i 10"
To check which values are currently in use, you have two options: run ps -ef | grep clconfd, or run odmget -q "subsysname='clconfd'" SRCsubsys. Example 4-2 shows the output of the two commands when the default values are used. With the defaults, the cmdargs line in the odmget output is empty, and the ps -ef output shows no arguments after clconfd.
Example 4-2 Check clconfd (when default values are used)
# ps -ef | grep clconfd
    root 3713096 3604778 0 17:50:30 - 0:00 /usr/sbin/clconfd
#
# odmget -q "subsysname='clconfd'" SRCsubsys
SRCsubsys:
 
subsysname = "clconfd"
synonym = ""
cmdargs = ""
path = "/usr/sbin/clconfd"
uid = 0
auditid = 0
standin = "/dev/null"
standout = "/dev/null"
standerr = "/dev/null"
action = 1
multi = 0
contact = 2
svrkey = 0
svrmtype = 0
priority = 20
signorm = 2
sigforce = 9
display = 1
waittime = 20
grpname = "caa"
Example 4-3 shows what happens when you change the default values, and what the output of odmget and ps -ef looks like after that change.
 
Important: You need to stop and start the subsystem to activate your changes.
Example 4-3 Change monitoring for /var
# chssys -s clconfd -a "-t 80 -i 10"
0513-077 Subsystem has been changed
#
# stopsrc -s clconfd
0513-044 The clconfd Subsystem was requested to stop.
#
# startsrc -s clconfd
0513-059 The clconfd Subsystem has been started. Subsystem PID is 13173096.
# ps -ef | grep clconfd
    root 13173096 3604778 0 17:50:30 - 0:00 /usr/sbin/clconfd -t 80 -i 10
#
# odmget -q "subsysname='clconfd'" SRCsubsys
 
SRCsubsys:
subsysname = "clconfd"
synonym = ""
cmdargs = "-t 80 -i 10"
path = "/usr/sbin/clconfd"
uid = 0
auditid = 0
standin = "/dev/null"
standout = "/dev/null"
standerr = "/dev/null"
action = 1
multi = 0
contact = 2
svrkey = 0
svrmtype = 0
priority = 20
signorm = 2
sigforce = 9
display = 1
waittime = 20
grpname = "caa"
If the threshold is exceeded, an entry is written to the AIX error log. Example 4-4 shows what such an error entry can look like.
Example 4-4 Error message of /var monitoring
LABEL: CL_VAR_FULL
IDENTIFIER: E5899EEB
 
Date/Time: Fri Nov 13 17:47:15 2015
Sequence Number: 1551
Machine Id: 00F747C94C00
Node Id: esp-c2n1
Class: S
Type: PERM
WPAR: Global
Resource Name: CAA (for RSCT)
 
Description
/var filesystem is running low on space
 
Probable Causes
Unknown
 
Failure Causes
Unknown
 
Recommended Actions
RSCT could malfunction if /var gets full
Increase the filesystem size or delete unwanted files
 
Detail Data
Percent full
          81
Percent threshold
          80
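To check whether this error was already reported on a node, you can, for example, query the AIX error log for the label that is shown in Example 4-4 (the errpt -a option adds the detailed view):
# errpt -aJ CL_VAR_FULL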
4.1.4 New lscluster option -g
Starting with AIX 7.1 TL4 and AIX 7.2, there is an additional option for the CAA lscluster command. The new option -g lists the used communication paths of CAA.
 
Note: At the time of writing, this option was not available in AIX versions earlier than
AIX 7.1.4.
The lscluster -i command lists all communication paths that are seen by CAA, but it does not show whether all of them can potentially be used for heartbeating. This is particularly the case if you use a network that is set to private, or if you removed a network from the PowerHA configuration.
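To get a quick overview of the interfaces that CAA sees on each node, you can, for example, filter the lscluster -i output (the egrep filter is used here only to shorten the output):
# lscluster -i | egrep "Node |Interface number"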
Using all interfaces
When using the standard way to configure a cluster, all configured networks in AIX are added to the PowerHA and CAA configuration. In our test cluster, we configured two IP interfaces in AIX. Example 4-5 shows the two networks in our PowerHA configuration, all set to public.
Example 4-5 The cllsif command with all interfaces on public
> cllsif
Adapter Type Network Net Type Attribute Node IP Address Hardware Address Interface Name Global Name Netmask Alias for HB Prefix Length
 
n1adm boot adm_net ether public powerha-c2n1 10.17.1.100 en1 255.255.255.0 24
powerha-c2n1 boot service_net ether public powerha-c2n1 172.16.150.121 en0 255.255.0.0 16
c2svc service service_net ether public powerha-c2n1 172.16.150.125 255.255.0.0 16
n2adm boot adm_net ether public powerha-c2n2 10.17.1.110 en1 255.255.255.0 24
powerha-c2n2 boot service_net ether public powerha-c2n2 172.16.150.122 en0 255.255.0.0 16
c2svc service service_net ether public powerha-c2n2 172.16.150.125 255.255.0.0 16
#
In this case, the lscluster -i output looks like what is shown in Example 4-6.
Example 4-6 The lscluster -i command (all interfaces on public)
> lscluster -i
Network/Storage Interface Query
 
Cluster Name: ha72cluster
Cluster UUID: 63d12f4e-e61b-11e5-8016-4217e0ce7b02
Number of nodes reporting = 2
Number of nodes stale = 0
Number of nodes expected = 2
 
Node powerha-c2n1.munich.de.ibm.com
Node UUID = 63b68a36-e61b-11e5-8016-4217e0ce7b02
Number of interfaces discovered = 3
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E0:CE:7B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.121 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, en1
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E0:CE:7B:05
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 10.17.1.100 broadcast 10.17.1.255 netmask 255.255.255.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 3, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
 
Node powerha-c2n2.munich.de.ibm.com
Node UUID = 63b68a86-e61b-11e5-8016-4217e0ce7b02
Number of interfaces discovered = 3
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E4:E6:1B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.122 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, en1
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E4:E6:1B:05
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 10.17.1.110 broadcast 10.17.1.255 netmask 255.255.255.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 3, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
root@powerha-c2n1:/>
Example 4-7 shows the output of the lscluster -g command. If you compare it with the lscluster -i output, you do not find any differences because, in this example, all of the networks can potentially be used for heartbeating.
Example 4-7 The lscluster -g command output in relation to the cllsif output
# lscluster -g
Network/Storage Interface Query
 
Cluster Name: ha72cluster
Cluster UUID: 63d12f4e-e61b-11e5-8016-4217e0ce7b02
Number of nodes reporting = 2
Number of nodes stale = 0
Number of nodes expected = 2
 
Node powerha-c2n1.munich.de.ibm.com
Node UUID = 63b68a36-e61b-11e5-8016-4217e0ce7b02
Number of interfaces discovered = 3
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E0:CE:7B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.121 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, en1
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E0:CE:7B:05
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 10.17.1.100 broadcast 10.17.1.255 netmask 255.255.255.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 3, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
 
Node powerha-c2n2.munich.de.ibm.com
Node UUID = 63b68a86-e61b-11e5-8016-4217e0ce7b02
Number of interfaces discovered = 3
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E4:E6:1B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.122 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, en1
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E4:E6:1B:05
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 10.17.1.110 broadcast 10.17.1.255 netmask 255.255.255.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 3, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
root@powerha-c2n1:/>
One network set to private
The following examples in this section describe the lscluster command output when you decide to change one or more networks to private. Example 4-8 shows the starting point for this example. In our testing environment, we changed one network to private.
 
Note: Private networks cannot be used for any services. When you want to use a service IP address, the network must be public.
Example 4-8 The cllsif command (private)
# cllsif
Adapter Type Network Net Type Attribute Node IP Address Hardware Address Interface Name Global Name Netmask Alias for HB Prefix Length
 
n1adm service adm_net ether private powerha-c2n1 10.17.1.100 en1 255.255.255.0 24
powerha-c2n1 boot service_net ether public powerha-c2n1 172.16.150.121 en0 255.255.0.0 16
c2svc service service_net ether public powerha-c2n1 172.16.150.125 255.255.0.0 16
n2adm service adm_net ether private powerha-c2n2 10.17.1.110 en1 255.255.255.0 24
powerha-c2n2 boot service_net ether public powerha-c2n2 172.16.150.122 en0 255.255.0.0 16
c2svc service service_net ether public powerha-c2n2 172.16.150.125 255.255.0.0 16
#
Because we did not change the architecture of our cluster, the output of the lscluster -i command is still the same, as shown in Example 4-6 on page 70.
 
Remember: You must synchronize your cluster before the change to private is visible in CAA.
Example 4-9 shows the lscluster -g command output after the synchronization. If you compare the output of the lscluster -g command with the lscluster -i command, or with the lscluster -g output from the previous example, you see that the entries for en1 (in our example) no longer appear. The list of interfaces that can potentially be used for heartbeating is shorter.
Example 4-9 The lscluster -g command (one private network)
# lscluster -g
Network/Storage Interface Query
 
Cluster Name: ha72cluster
Cluster UUID: 55430510-e6a7-11e5-8035-4217e0ce7b02
Number of nodes reporting = 2
Number of nodes stale = 0
Number of nodes expected = 2
 
Node powerha-c2n1.munich.de.ibm.com
Node UUID = 55284db0-e6a7-11e5-8035-4217e0ce7b02
Number of interfaces discovered = 2
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E0:CE:7B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.121 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
 
Node powerha-c2n2.munich.de.ibm.com
Node UUID = 55284df6-e6a7-11e5-8035-4217e0ce7b02
Number of interfaces discovered = 2
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E4:E6:1B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.122 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
#
Removing networks from PowerHA
The examples in this section describe the lscluster command output when you remove one or more networks from the list of known networks in PowerHA. Example 4-10 shows the starting point for this example. In our test environment, we removed the adm_net network.
Example 4-10 The cllsif command (removed network)
# cllsif
Adapter Type Network Net Type Attribute Node IP Address Hardware Address Interface Name Global Name Netmask Alias for HB Prefix Length
 
powerha-c2n1 boot service_net ether public powerha-c2n1 172.16.150.121 en0 255.255.0.0 16
c2svc service service_net ether public powerha-c2n1 172.16.150.125 255.255.0.0 16
powerha-c2n2 boot service_net ether public powerha-c2n2 172.16.150.122 en0 255.255.0.0 16
c2svc service service_net ether public powerha-c2n2 172.16.150.125 255.255.0.0 16
#
Because we did not change the architecture of our cluster, the output of the lscluster -i command is still the same as listed in Example 4-6 on page 70.
You must synchronize your cluster before the removal of the network is visible in CAA.
Example 4-11 shows the lscluster -g output after the synchronization. If you now compare the output of the lscluster -g command with the previous lscluster -i command, or with the lscluster -g output in “Using all interfaces” on page 69, you see that the entries about en1 (in our example) do not appear.
When you compare the content of Example 4-11 with the content of Example 4-9 on page 74 in “One network set to private” on page 74, you see that the output of the lscluster -g commands is identical.
Example 4-11 The lscluster -g command output (removed network)
# lscluster -g
Network/Storage Interface Query
 
Cluster Name: ha72cluster
Cluster UUID: 63d12f4e-e61b-11e5-8016-4217e0ce7b02
Number of nodes reporting = 2
Number of nodes stale = 0
Number of nodes expected = 2
 
Node powerha-c2n1.munich.de.ibm.com
Node UUID = 63b68a36-e61b-11e5-8016-4217e0ce7b02
Number of interfaces discovered = 2
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E0:CE:7B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.121 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
 
Node powerha-c2n2.munich.de.ibm.com
Node UUID = 63b68a86-e61b-11e5-8016-4217e0ce7b02
Number of interfaces discovered = 2
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 42:17:E4:E6:1B:02
Smoothed RTT across interface = 0
Mean deviation in network RTT across interface = 0
Probe interval for interface = 990 ms
IFNET flags for interface = 0x1E084863
NDD flags for interface = 0x0021081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 172.16.150.122 broadcast 172.16.255.255 netmask 255.255.0.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.16.150.121
Interface number 2, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 750
Mean deviation in network RTT across interface = 1500
Probe interval for interface = 22500 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
root@powerha-c2n1:/>
#
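To verify quickly which interfaces CAA excludes from heartbeating, you can, for example, capture both outputs and compare them. This is a minimal sketch; the file names are arbitrary:
# lscluster -i > /tmp/lscluster_i.out
# lscluster -g > /tmp/lscluster_g.out
# diff /tmp/lscluster_i.out /tmp/lscluster_g.out
Interfaces that appear only in the lscluster -i output, such as en1 in our examples, are not used for heartbeating.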
4.1.5 Interface failure detection
PowerHA V7.1 had a fixed network failure detection time of about 5 seconds. In PowerHA V7.2, the default is now 20 seconds, and it is controlled by the network_fdt tunable.
 
Note: The network_fdt tunable is also available in PowerHA V7.1.3. To get it for your PowerHA V7.1.3 version, you must open a PMR and request the Tunable FDT interim fix bundle.
The self-adjusting network heartbeating behavior of CAA that was introduced with PowerHA V7.1.0 is still present and used, but it has no impact on the network failure detection time.
The network_fdt tunable can be set to zero to maintain the default behavior. Otherwise, it can be set to a value that is 5 - 10 seconds less than node_timeout.
The recognition time for a network problem is not affected by this tunable: it remains 0 seconds for hard failures and 5 seconds for soft failures (unchanged since PowerHA V7.1.0). CAA continues to check the network, but it waits until the end of the defined timeout before it creates a network down event.
For PowerHA nodes, when the effective level of CAA is 4, also known as the 2015 release, CAA automatically sets the network_fdt to 20 seconds and the node_timeout to 30 seconds.
To check for the effective CAA level, use the lscluster -c command. The last two lines of the lscluster -c output list the local CAA level and the effective CAA level. In normal situations, these two show the same level. In case of an operating system update, it can temporarily show different levels. Example 4-12 shows the numbers that you get when you are on CAA level 4.
Example 4-12 The lscluster -c command to check for CAA level
# lscluster -c
Cluster Name: ha72cluster
Cluster UUID: 55430510-e6a7-11e5-8035-4217e0ce7b02
Number of nodes in cluster = 2
Cluster ID for node powerha-c2n1: 1
Primary IP address for node powerha-c2n1: 172.16.150.121
Cluster ID for node powerha-c2n2: 2
Primary IP address for node powerha-c2n2: 172.16.150.122
Number of disks in cluster = 1
Disk = hdisk2 UUID = 7c83d44b-9ac1-4ed7-ce3f-4e950f7ac9c6 cluster_major = 0 cluster_minor = 1
Multicast for site LOCAL: IPv4 228.16.150.121 IPv6 ff05::e410:9679
Communication Mode: unicast
Local node maximum capabilities: CAA_NETMON, AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
Effective cluster-wide capabilities: CAA_NETMON, AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
Local node max level: 40000
Effective cluster level: 40000
 
Note: Depending on the installed AIX Service Pack and fixes, the CAA level might not be displayed.
In this case, the only way to know whether the CAA level is 4 is to check whether AUTO_REPOS_REPLACE is listed for the effective cluster-wide capabilities in the output of lscluster -c command.
For example, use the following command:
# lscluster -c | grep "Effective cluster-wide capabilities"
Effective cluster-wide capabilities: CAA_NETMON, AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
#
Example 4-13 shows how to both check and change the CAA network tunable attribute by using the CAA native clctrl command.
Example 4-13 Using clctrl to change the CAA network tunable
# clctrl -tune -o network_fdt
PHA72a_cluster(641d80c2-bd87-11e5-8005-96d75a7c7f02).network_fdt = 20000
 
# clctrl -tune -o network_fdt=10000
1 tunable updated on cluster PHA72a_cluster
 
# clctrl -tune -o network_fdt
PHA72a_cluster(641d80c2-bd87-11e5-8005-96d75a7c7f02).network_fdt = 10000
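Because network_fdt should be set to a value that is 5 - 10 seconds less than node_timeout, it is a good idea to check both tunables before changing either of them. For example, using the same clctrl syntax that is shown in Example 4-13:
# clctrl -tune -o network_fdt
# clctrl -tune -o node_timeout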
4.2 Automatic repository update for the repository disk
This section describes the new PowerHA ARU feature for the PowerHA repository disk.
4.2.1 Introduction to the Automatic Repository Update
Starting with PowerHA V7.0.0, PowerHA uses a shared disk, which is called the PowerHA repository disk, for various purposes. The availability of this repository disk is critical to the operation of PowerHA clustering and its nodes. The initial implementation of the repository disk, at PowerHA V7.0.0, did not allow for the operation of PowerHA cluster services if the repository disk failed, making that a single point of failure (SPOF).
With later versions of PowerHA, features were added to make the cluster more resilient if there is a PowerHA repository disk failure. The ability to survive a repository disk failure, in addition to the ability to manually replace a repository disk without an outage, increased the resiliency of PowerHA. With PowerHA V7.2.0, a new feature to increase the resiliency further was introduced, and this is called ARU.
The purpose of ARU is to automate the replacement of a failed active repository disk without intervention from a system administrator and without affecting the active cluster services. All that is needed is to define the backup repository disks that PowerHA can use if the active repository disk fails.
When the active repository disk fails, PowerHA detects the failure and verifies that the disk is not usable. If the disk is unusable, PowerHA attempts to switch to the backup repository disk. If the switch is successful, the backup repository disk becomes the active repository disk.
4.2.2 Requirements for Automatic Repository Update
ARU has the following requirements:
AIX 7.1.4 or AIX 7.2.0.
PowerHA V7.2.0.
The storage that is used for the backup repository disk has the same requirements as the primary repository disk.
For more information about the PowerHA repository disk requirements, see IBM Knowledge Center.
4.2.3 Configuring Automatic Repository Update
ARU is configured automatically when you configure a backup repository disk for PowerHA. Essentially, defining a backup repository disk is all that is required.
This section shows an example of ARU in a 2-site, 2-node cluster. The cluster configuration is similar to Figure 4-1.
Figure 4-1 Storage example for PowerHA ARU showing linked and backup repository disks
For the purposes of this example, we configure a backup repository disk for each site of this 2-site cluster.
Configuring a backup repository disk
The following process details how to configure a backup repository disk. For our example, we perform this process for each site in our cluster.
1. Using SMIT, run smitty sysmirror and select Cluster Nodes and Networks → Manage Repository Disks → Add a Repository Disk. Because our example is a 2-site cluster, you are prompted for a site and are then given a selection of possible repository disks. The following panels provide more details.
When you select Add a Repository Disk, you are prompted to select a site, as shown in Example 4-14.
Example 4-14 Selecting Add a Repository Disk in a multi-site cluster
Manage Repository Disks
 
Move cursor to desired item and press Enter.
 
Add a Repository Disk
Remove a Repository Disk
Show Repository Disks
 
Verify and Synchronize Cluster Configuration
 
 
+--------------------------------------------------------------------------+
| Select a Site |
| |
| Move cursor to desired item and press Enter. |
| |
| primary_site1 |
| standby_site2 |
| |
| F1=Help F2=Refresh F3=Cancel |
| F8=Image F10=Exit Enter=Do |
F1| /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
2. After you select primary_site1, the Add a Repository Disk panel opens, as shown in Example 4-15.
Example 4-15 Add a Repository Disk panel
Add a Repository Disk
 
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
 
[Entry Fields]
Site Name primary_site1
* Repository Disk [] +
 
 
 
F1=Help F2=Refresh F3=Cancel F4=List
F5=Reset F6=Command F7=Edit F8=Image
F9=Shell F10=Exit Enter=Do
3. Next, press F4 in the Repository Disk field to display the repository disk selection list, as shown in Example 4-16.
Example 4-16 Backup repository disk selection
Add a Repository Disk
 
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
[Entry Fields]
Site Name primary_site1
* Repository Disk [] +
 
+--------------------------------------------------------------------------+
| Repository Disk |
| |
| Move cursor to desired item and press F7. |
| ONE OR MORE items can be selected. |
| Press Enter AFTER making all selections. |
| |
| hdisk3 (00f61ab295112078) on all nodes at site primary_site1 |
| hdisk4 (00f61ab2a61d5bc6) on all nodes at site primary_site1 |
| hdisk5 (00f61ab2a61d5c7e) on all nodes at site primary_site1 |
| |
| F1=Help F2=Refresh F3=Cancel |
F1| F7=Select F8=Image F10=Exit |
F5| Enter=Do /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
4. After you select the appropriate disk, the selection is shown in Example 4-17.
Example 4-17 Add a Repository Disk preview panel
Add a Repository Disk
 
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
 
[Entry Fields]
Site Name primary_site1
* Repository Disk [(00f61ab295112078)] +
 
 
 
 
F1=Help F2=Refresh F3=Cancel F4=List
F5=Reset F6=Command F7=Edit F8=Image
F9=Shell F10=Exit Enter=Do
5. Press Enter to apply the change. The confirmation panel opens, as shown in Example 4-18.
Example 4-18 Backup repository disk addition confirmation panel
COMMAND STATUS
 
Command: OK stdout: yes stderr: no
 
Before command completion, additional instructions may appear below.
 
Successfully added one or more backup repository disks.
To view the complete configuration of repository disks use:
"clmgr query repository" or "clmgr view report repository"
 
 
 
 
 
 
F1=Help F2=Refresh F3=Cancel F6=Command
F8=Image F9=Shell F10=Exit /=Find
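The backup repository disk can also typically be added from the command line with the clmgr command. The following is a hedged sketch: hdisk4 is an assumed free shared disk at primary_site1, and the exact clmgr syntax can vary by PowerHA level. The second command, which the SMIT output also references, displays the resulting repository configuration.
# clmgr add repository hdisk4 SITE=primary_site1
# clmgr query repository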
4.2.4 Automatic Repository Update operations
PowerHA ARU operations are automatic when a backup repository disk is configured.
Successful Automatic Repository Update operation
Our scenario has a 2-site cluster and a backup repository disk per site.
To induce a failure of the primary repository disk, we log in to the Virtual I/O Server (VIOS) servers that present storage to the cluster LPARs and deallocate the disk LUN that corresponds to the primary repository disk on one site of our cluster. This disables the primary repository disk, and PowerHA ARU detects the failure and automatically activates the backup repository disk as the active repository disk.
This section presents the following examples that were captured during this process:
1. Before disabling the primary repository disk, we look at the lspv command output and note that the active repository disk is hdisk1, as shown in Example 4-19.
Example 4-19 Output of the lspv command in an example cluster
hdisk0 00f6f5d09570f647 rootvg active
hdisk1 00f6f5d0ba49cdcc caavg_private active
hdisk2 00f6f5d0a621e9ff None
hdisk3 00f61ab2a61d5c7e None
hdisk4 00f61ab2a61d5d81 testvg01 concurrent
hdisk5 00f61ab2a61d5e5b testvg01 concurrent
hdisk6 00f61ab2a61d5f32 testvg01 concurrent
2. We then proceed to log in to the VIOS servers that present the repository disk to this logical partition (LPAR) and de-allocate that logical unit (LUN) so that the cluster LPAR no longer has access to that disk. This causes the primary repository disk to fail.
3. PowerHA ARU detects the failure and activates the backup repository disk as the active repository disk. You can verify this behavior in the syslog.caa log file. This log file logs the ARU activities and shows the detection of the primary repository disk failure and the activation of the backup repository disk. See Example 4-20.
Example 4-20 The /var/adm/ras/syslog.caa file showing repository disk failure and recovery
Nov 12 09:13:29 primo_s2_n1 caa:info cluster[14025022]: caa_config.c run_list 1377 1 = = END REPLACE_REPOS Op = = POST Stage = =
Nov 12 09:13:30 primo_s2_n1 caa:err|error cluster[14025022]: cluster_utils.c cluster_repository_read 5792 1 Could not open cluster repository device /dev/rhdisk1: 5
Nov 12 09:13:30 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_kern_repos_check 11769 1 Could not read the respository.
Nov 12 09:13:30 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_run_log_method 11862 1 START '/usr/sbin/importvg -y caavg_private_t -O hdisk1'
Nov 12 09:13:32 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_run_log_method 11893 1 FINISH return = 1
Nov 12 09:13:32 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_run_log_method 11862 1 START '/usr/sbin/reducevg -df caavg_private_t hdisk1'
Nov 12 09:13:32 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_run_log_method 11893 1 FINISH return = 1
Nov 12 09:13:33 primo_s2_n1 caa:err|error cluster[14025022]: cluster_utils.c cluster_repository_read 5792 1 Could not open cluster repository device /dev/rhdisk1: 5
Nov 12 09:13:33 primo_s2_n1 caa:info cluster[14025022]: cl_chrepos.c destroy_old_repository 344 1 Failed to read repository data.
Nov 12 09:13:34 primo_s2_n1 caa:err|error cluster[14025022]: cluster_utils.c cluster_repository_write 5024 1 return = -1, Could not open cluster repository device /dev/rhdisk1: I/O error
Nov 12 09:13:34 primo_s2_n1 caa:info cluster[14025022]: cl_chrepos.c destroy_old_repository 350 1 Failed to write repository data.
Nov 12 09:13:34 primo_s2_n1 caa:warn|warning cluster[14025022]: cl_chrepos.c destroy_old_repository 358 1 Unable to destroy repository disk hdisk1. Manual intervention is required to clear the disk of cluster identifiers.
Nov 12 09:13:34 primo_s2_n1 caa:info cluster[14025022]: cl_chrepos.c automatic_repository_update 2242 1 Replaced hdisk1 with hdisk2
Nov 12 09:13:34 primo_s2_n1 caa:info cluster[14025022]: cl_chrepos.c automatic_repository_update 2255 1 FINISH rc = 0
Nov 12 09:13:34 primo_s2_n1 caa:info cluster[14025022]: caa_protocols.c recv_protocol_slave 1542 1 Returning from Automatic Repository replacement rc = 0
4. As an extra verification, the AIX error log has an entry showing that a successful repository disk replacement occurred, as shown in Example 4-21.
Example 4-21 AIX error log showing successful repository disk replacement message
LABEL: CL_ARU_PASSED
IDENTIFIER: 92EE81A5
 
Date/Time: Thu Nov 12 09:13:34 2015
Sequence Number: 1344
Machine Id: 00F6F5D04C00
Node Id: primo_s2_n1
Class: H
Type: INFO
WPAR: Global
Resource Name: CAA ARU
Resource Class: NONE
Resource Type: NONE
Location:
 
Description
Automatic Repository Update succeeded.
 
Probable Causes
Primary repository disk was replaced.
 
Failure Causes
A hardware problem prevented local node from accessing primary repository disk.
 
Recommended Actions
Primary repository disk was replaced using backup repository disk.
 
Detail Data
Primary Disk Info
hdisk1 6c1b76e1-3e0a-ff3c-3c43-cb6c3881c3bf
Replacement Disk Info
hdisk2 5890b139-e987-1451-211e-24ba89e7d1df
It is safe to remove the failed repository disk and replace it. The replacement disk can become the new backup repository disk by following the steps in “Configuring a backup repository disk” on page 81.
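To confirm on a node whether an ARU operation ran and whether it succeeded, you can, for example, filter the AIX error log by the ARU error labels (CL_ARU_PASSED is shown in Example 4-21, and CL_ARU_FAILED appears in the failure scenario that follows):
# errpt -J CL_ARU_PASSED,CL_ARU_FAILED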
Possible ARU failure situations
Some administrative activities can affect the operation of ARU, specifically any activity that uses the backup repository disk. If a volume group (VG) was previously created on a backup repository disk and the disk was not cleaned up afterward, ARU cannot operate properly.
In our sample scenario, we complete the following steps:
1. Configure a backup repository disk that previously had an AIX volume group (VG).
2. Export the AIX VG so that the lspv command no longer displays a VG for the disk. However, exporting does not remove the VG information from the disk itself, so the disk still contains it.
3. For our example, we run the AIX lspv command. Our backup repository disk is hdisk2, and it shows a PVID but no VG, as shown in Example 4-22.
Example 4-22 Output of lspv command in an example cluster showing hdisk2
hdisk0 00f6f5d09570f647 rootvg active
hdisk1 00f6f5d0ba49cdcc caavg_private active
hdisk2 00f6f5d0a621e9ff None
hdisk3 00f61ab2a61d5c7e None
hdisk4 00f61ab2a61d5d81 testvg01 concurrent
hdisk5 00f61ab2a61d5e5b testvg01 concurrent
hdisk6 00f61ab2a61d5f32 testvg01 concurrent
4. We disconnect the primary repository disk from the LPAR by going to the VIOS and de-allocating the disk LUN from the cluster LPAR. This makes the primary repository disk fail immediately.
ARU attempts to perform the following actions:
a. Checks that the primary repository disk is not accessible.
b. Switches to the backup repository disk (but this action fails).
5. ARU leaves an error message in the AIX error report, as shown in Example 4-23.
Example 4-23 Output of the AIX errpt command showing failed repository disk replacement
LABEL: CL_ARU_FAILED
IDENTIFIER: F63D60A2
 
Date/Time: Wed Nov 11 17:15:17 2015
Sequence Number: 1263
Machine Id: 00F6F5D04C00
Node Id: primo_s2_n1
Class: H
Type: INFO
WPAR: Global
Resource Name: CAA ARU
Resource Class: NONE
Resource Type: NONE
Location:
 
Description
Automatic Repository Update failed.
 
Probable Causes
Unknown.
 
Failure Causes
Unknown.
 
Recommended Actions
Try manual replacement of cluster repository disk.
 
Detail Data
Primary Disk Info
hdisk1 6c1b76e1-3e0a-ff3c-3c43-cb6c3881c3bf
6. In addition, the CAA log /var/adm/ras/syslog.caa shows that ARU verifies the primary repository disk and cannot read it, as shown in Example 4-24.
Example 4-24 Selected messages from the /var/adm/ras/syslog.caa log file
Nov 12 09:13:20 primo_s2_n1 caa:info unix: *base_kernext_services.c aha_thread_queue 614 The AHAFS event is EVENT_TYPE=REP_DOWN DISK_NAME=hdisk1 NODE_NUMBER=2 NODE_ID=0xD9DDB48A889411E580106E8DDB7B3702 SITE_NUMBER=2 SITE_ID=0xD9DE2028889411E580106E8DDB7B3702 CLUSTER_ID=0xD34E8658889411E580026E8DDB
Nov 12 09:13:20 primo_s2_n1 caa:info unix: caa_sock.c caa_kclient_tcp 231 entering caa_kclient_tcp ....
Nov 12 09:13:20 primo_s2_n1 caa:info unix: *base_kernext_services.c aha_thread_queue 614 The AHAFS event is EVENT_TYPE=VG_DOWN DISK_NAME=hdisk1 VG_NAME=caavg_private NODE_NUMBER=2 NODE_ID=0xD9DDB48A889411E580106E8DDB7B3702 SITE_NUMBER=2 SITE_ID=0xD9DE2028889411E580106E8DDB7B3702 CLUSTER_ID=0xD34E8
Nov 12 09:13:20 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_run_log_method 11862 1 START '/usr/lib/cluster/caa_syslog '
Nov 12 09:13:20 primo_s2_n1 caa:info unix: kcluster_event.c find_event_disk 742 Find disk called for hdisk4
Nov 12 09:13:20 primo_s2_n1 caa:info unix: kcluster_event.c ahafs_Disk_State_register 1504 diskState set opqId = 0xF1000A0150301A00
Nov 12 09:13:20 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_run_log_method 11893 1 FINISH return = 0
Nov 12 09:13:20 primo_s2_n1 caa:info cluster[14025022]: caa_message.c inherit_socket_inetd 930 1 IPv6=::ffff:127.0.0.1
Nov 12 09:13:20 primo_s2_n1 caa:info cluster[14025022]: cluster_utils.c cl_kern_repos_check 11769 1 Could not read the respository.
Nov 12 09:13:20 primo_s2_n1 caa:info cluster[14025022]: caa_message.c cl_recv_req 172 1 recv successful, sock = 0, recv rc = 32, msgbytes = 32
Nov 12 09:13:20 primo_s2_n1 caa:info cluster[14025022]: caa_protocols.c recv_protocol_slave 1518 1 Automatic Repository Replacement request being processed.
7. ARU attempts to activate the backup repository disk, but it fails because an AIX VG previously existed on this disk, as shown in Example 4-25.
Example 4-25 Messages from the /var/adm/ras/syslog.caa log file showing an ARU failure
Nov 12 09:11:26 primo_s2_n1 caa:info unix: kcluster_lock.c xcluster_lock 659 xcluster_lock: nodes which responded: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Nov 12 09:11:26 primo_s2_n1 caa:info cluster[8716742]: cluster_utils.c cl_run_log_method 11862 1 START '/usr/sbin/mkvg -y caavg_private_t hdisk2'
Nov 12 09:11:26 primo_s2_n1 caa:info cluster[8716742]: cluster_utils.c cl_run_log_method 11893 1 FINISH return = 1
Nov 12 09:11:26 primo_s2_n1 caa:err|error cluster[8716742]: cl_chrepos.c check_disk_add 2127 1 hdisk2 contains an existing vg.
Nov 12 09:11:26 primo_s2_n1 caa:info cluster[8716742]: cl_chrepos.c automatic_repository_update 2235 1 Failure to move to hdisk2
Nov 12 09:11:26 primo_s2_n1 caa:info cluster[8716742]: cl_chrepos.c automatic_repository_update 2255 1 FINISH rc = -1
Nov 12 09:11:26 primo_s2_n1 caa:info cluster[8716742]: caa_protocols.c recv_protocol_slave 1542 1 Returning from Automatic Repository replacement rc = -1
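To avoid this failure, ensure that a disk contains no leftover VG information before you add it as a backup repository disk. The following is a minimal, hedged sketch: hdisk2 is the candidate disk from our example, and the dd command is destructive, so run it only if the disk contains no data that you still need. The lqueryvg command reports any VG information that is still stored on the disk, and the dd command overwrites the beginning of the disk so that no stale VG descriptor area remains:
# lqueryvg -Atp hdisk2
# dd if=/dev/zero of=/dev/rhdisk2 bs=1024k count=10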
Recovering from a failed ARU event
In “Possible ARU failure situations” on page 85, we showed an example of what can prevent a successful repository disk replacement by ARU. To recover from that failed event, we manually switch the repository disks by using the PowerHA SMIT panels.
Complete the following steps:
1. Using SMIT, run smitty sysmirror and select Problem Determination Tools → Replace the Primary Repository Disk. In our sample cluster, we have multiple sites, so we select a site, as shown in Example 4-26.
Example 4-26 Site selection prompt after selecting “Replace the Primary Repository Disk”
Problem Determination Tools
 
Move cursor to desired item and press Enter.
 
[MORE...1]
View Current State
PowerHA SystemMirror Log Viewing and Management
Recover From PowerHA SystemMirror Script Failure
Recover Resource Group From SCSI Persistent Reserve Error
Restore PowerHA SystemMirror Configuration Database from Active Configuration
Release Locks Set By Dynamic Reconfiguration
Cluster Test Tool
+--------------------------------------------------------------------------+
| Select a Site |
| |
| Move cursor to desired item and press Enter. |
| |
| primary_site1 |
| standby_site2 |
| |
[M| F1=Help F2=Refresh F3=Cancel |
| F8=Image F10=Exit Enter=Do |
F1| /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
2. In our example, we select standby_site2 and a panel opens with an option to select the replacement repository disk, as shown in Example 4-27.
Example 4-27 Prompt to select a new repository disk
Select a new repository disk
 
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
[Entry Fields]
Site Name standby_site2
* Repository Disk [] +
 
 
F1=Help F2=Refresh F3=Cancel F4=List
F5=Reset F6=Command F7=Edit F8=Image
F9=Shell F10=Exit Enter=Do
3. Pressing the F4 key shows the available backup repository disks, as shown in Example 4-28.
Example 4-28 SMIT menu prompting for replacement repository disk
Select a new repository disk
 
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
 
[Entry Fields]
Site Name standby_site2
* Repository Disk [] +
 
 
 
+--------------------------------------------------------------------------+
| Repository Disk |
| |
| Move cursor to desired item and press Enter. |
| |
| 00f6f5d0ba49cdcc |
| |
| F1=Help F2=Refresh F3=Cancel |
F1| F8=Image F10=Exit Enter=Do |
F5| /=Find n=Find Next |
F9+--------------------------------------------------------------------------+
4. Selecting the backup repository disk opens the SMIT panel showing the selected disk, as shown in Example 4-29.
Example 4-29 SMIT panel showing the selected repository disk
Select a new repository disk
 
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
 
[Entry Fields]
Site Name standby_site2
* Repository Disk [00f6f5d0ba49cdcc] +
 
 
 
 
 
F1=Help F2=Refresh F3=Cancel F4=List
F5=Reset F6=Command F7=Edit F8=Image
F9=Shell F10=Exit Enter=Do
5. Last, pressing the Enter key runs the repository disk replacement. After the repository disk is replaced, the panel that is shown in Example 4-30 opens.
Example 4-30 SMIT panel showing a successful repository disk replacement
COMMAND STATUS
 
Command: OK stdout: yes stderr: no
 
Before command completion, additional instructions may appear below.
 
chrepos: Successfully modified repository disk or disks.
 
New repository "hdisk1" (00f6f5d0ba49cdcc) is now active.
The configuration must be synchronized to make this change known across the clus
ter.
 
 
 
 
 
F1=Help F2=Refresh F3=Cancel F6=Command
F8=Image F9=Shell F10=Exit /=Find
n=Find Next
Now, it is safe to remove the failed repository disk and replace it. The replacement disk can become the new backup repository disk by following the steps that are described in “Configuring a backup repository disk” on page 81.
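The same manual replacement can also typically be performed from the command line. The following is a hedged sketch: hdisk3 is an assumed replacement disk, and the exact clmgr syntax can vary by PowerHA level. As the SMIT output notes, the configuration must be synchronized afterward, which the second command does:
# clmgr replace repository hdisk3
# clmgr sync cluster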
4.3 Reliable Scalable Cluster Technology overview
This section provides an overview of Reliable Scalable Cluster Technology (RSCT), its components, and the communication path between these components. This section also describes what parts of it are used by PowerHA. The items that are described here are not new but are needed for a basic understanding of the PowerHA underlying infrastructure.
4.3.1 What Reliable Scalable Cluster Technology is
RSCT is a set of software components that provide a comprehensive clustering environment for AIX, Linux, Solaris, and Microsoft Windows operating systems. RSCT is the infrastructure that is used by various IBM products to provide clusters with improved system availability, scalability, and ease of use.
4.3.2 Reliable Scalable Cluster Technology components
This section describes the RSCT components and how they communicate with each other.
Reliable Scalable Cluster Technology components overview
For a more detailed description of the RSCT components, see IBM RSCT for AIX: Guide and Reference, SA22-7889.
The main RSCT components are explained in this section:
Resource Monitoring and Control (RMC) subsystem
This is the scalable and reliable backbone of RSCT. RMC runs on a single machine or on each node (operating system image) of a cluster, and it provides a common abstraction for the resources of the individual system or the cluster of nodes. You can use RMC for single-system monitoring or for monitoring nodes in a cluster. In a cluster, RMC provides global access to subsystems and resources throughout the cluster, thus providing a single monitoring and management infrastructure for clusters.
RSCT core resource managers (RMs)
A resource manager is a software layer between a resource (a hardware or software entity that provides services to some other component) and RMC. An RM maps programmatic abstractions in RMC into the actual calls and commands of a resource.
RSCT cluster security services
This RSCT component provides the security infrastructure that enables RSCT components to authenticate the identity of other parties.
Group Services subsystem
This RSCT component provides cross-node/process coordination on some cluster configurations.
Topology Services subsystem
This RSCT component provides node and network failure detection on some cluster configurations.
Communication between RSCT components
Today, the RMC subsystem and the RSCT core RMs are the only components that use the RSCT cluster security services. Since the availability of PowerHA V7, RSCT Group Services can use either Topology Services or CAA. Figure 4-2 shows the RSCT components and their relationships.
Figure 4-2 RSCT components
The RMC application programming interface (API) is the only interface that applications can use to exchange data with the RSCT components. RMC manages the RMs and receives data from them. Group Services is a client of RMC. If PowerHA V7 is installed, Group Services connects to CAA; otherwise, it connects to RSCT Topology Services.
RSCT domains
An RSCT management domain is a set of nodes with resources that can be managed and monitored from one of the nodes, which is designated as the management control point (MCP). All other nodes are considered to be managed nodes. Topology Services and Group Services are not used in a management domain. Figure 4-3 shows the high-level architecture of an RSCT management domain.
Figure 4-3 RSCT-managed domain (architecture)
An RSCT peer domain is a set of nodes that have a consistent knowledge of the existence of each other, and of the resources shared among them. On each node within the peer domain, RMC depends on a core set of cluster services, which include Topology Services, Group Services, and cluster security services. Figure 4-4 shows the high-level architecture of an RSCT peer domain.
Figure 4-4 RSCT peer domain (architecture)
Group Services are used in peer domains. If PowerHA V7 is installed, CAA is used instead of Topology Services; otherwise, Topology Services are used as well.
Combination of management and peer domains
You can have a combination of both types of domains (management domain and peer domains).
Figure 4-5 shows the high-level architecture for how an RSCT-managed domain and RSCT peer domains can be combined. In this example, Node Y is an RSCT management server. You have three nodes as managed nodes (Node A, Node B, and Node C). Node B and Node C are part of an RSCT peer domain.
You can have multiple peer domains within a managed domain. A node can be part of a managed domain and a peer domain. A given node can belong to only a single peer domain, as shown in Figure 4-5.
Figure 4-5 Management and peer domain (architecture)
 
Important: A node can belong to only one RSCT peer domain.
Example of a management and a peer domain
The example here is simplified: it shows one Hardware Management Console (HMC) that manages three LPARs, two of which are used for a 2-node PowerHA cluster.
In a Power Systems environment, the HMC is always the management server in the RSCT management domain. The LPARs are clients to this server from an RSCT point of view. For example, this management domain is used to do dynamic LPAR (DLPAR) operations on the different LPARs.
Figure 4-6 shows this simplified setup.
Figure 4-6 Example management and peer domain
RSCT peer domain on Cluster Aware AIX
When RSCT operates on nodes in a CAA cluster, a peer domain is created that is equivalent to the CAA cluster. This RSCT peer domain presents largely the same set of functions to users and software as peer domains that are not based on CAA. A peer domain that operates without CAA autonomously manages and monitors the configuration and liveness of the nodes and interfaces that it comprises.
In contrast, the peer domain that represents a CAA cluster acquires configuration information and liveness results from CAA. This introduces some differences in the mechanics of peer domain operations, but few in the view of the peer domain that is available to users.
Only one CAA cluster can be defined on a set of nodes. Therefore, if a CAA cluster is defined, the peer domain that represents it is the only peer domain that can exist, and it exists and is online for the life of the CAA cluster.
Figure 4-7 illustrates the relationship that is described in this section.
Figure 4-7 RSCT peer domain and CAA
When your cluster is configured and synchronized, you can check the RSCT peer domain by using the lsrpdomain command. To list the nodes in this peer domain, you can use the lsrpnode command. Example 4-31 shows a sample output of these commands.
Example 4-31 Listing RSCT peer domain information
# lsrpdomain
Name OpState RSCTActiveVersion MixedVersions TSPort GSPort
c2n1_cluster Online 3.1.5.0 Yes 12347 12348
# lsrpnode
Name OpState RSCTVersion
c2n2.munich.de.ibm.com Online 3.2.1.0
c2n1.munich.de.ibm.com Online 3.2.1.0
#
The RSCTActiveVersion number of the lsrpdomain output can show a back-level version number. This is the lowest RSCT version that is required by a new joining node. In a PowerHA environment, there is no need to modify this number.
A MixedVersions value of Yes means that at least one node has a higher RSCT version than the displayed active version. The lsrpnode command lists the RSCT version that is used by each node.
Updating the RSCT peer domain version
If you like, you can upgrade the RSCT version of the RSCT peer domain, which is reported by the lsrpdomain command. To do this, use the command that is listed in Example 4-32.
Example 4-32 Updating the RSCT peer domain
# export CT_MANAGEMENT_SCOPE=2; runact -c IBM.PeerDomain CompleteMigration Options=0
#
Doing such an update does not give you any advantages in a PowerHA environment. In fact, if you delete the cluster and then re-create it manually or by using an existing snapshot, the RSCT peer domain version is back at the original version, which was 3.1.5.0 in our example.
Checking for Cluster Aware AIX
To do a quick check of the CAA cluster, you can use the lscluster -c command or the lscluster -m command. Example 4-33 shows example output of these two commands. In most situations, if the lscluster command returns output, CAA is running. To be on the safe side, use the lscluster -m command.
Example 4-33 shows that, in our case, CAA is running on the local node where we ran the lscluster command, but CAA is stopped on the remote node.
To stop CAA on that node, we use the clmgr off node powerha-c2n1 STOP_CAA=yes command.
Example 4-33 The lscluster -c and lscluster -m commands
# lscluster -c
Cluster Name: c2n1_cluster
Cluster UUID: d19995ae-8246-11e5-806f-fa37c4c10c20
Number of nodes in cluster = 2
Cluster ID for node c2n1.munich.de.ibm.com: 1
Primary IP address for node c2n1.munich.de.ibm.com: 172.16.150.121
Cluster ID for node c2n2.munich.de.ibm.com: 2
Primary IP address for node c2n2.munich.de.ibm.com: 172.16.150.122
Number of disks in cluster = 1
Disk = caa_r0 UUID = 12d1d9a1-916a-ceb2-235d-8c2277f53d06 cluster_major = 0 cluster_minor = 1
Multicast for site LOCAL: IPv4 228.16.150.121 IPv6 ff05::e410:9679
Communication Mode: unicast
Local node maximum capabilities: AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
Effective cluster-wide capabilities: AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
#
# lscluster -m | egrep "Node name|State of node"
Node name: powerha-c2n1.munich.de.ibm.com
State of node: DOWN
Node name: powerha-c2n2.munich.de.ibm.com
State of node: UP  NODE_LOCAL
#
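To bring CAA and cluster services back online on the stopped node, you can typically use the counterpart of the stop command that is shown above. This is a hedged example; the START_CAA option is assumed to be available at your PowerHA level:
# clmgr on node powerha-c2n1 START_CAA=yes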
Peer domain on CAA linked clusters
Starting with PowerHA V7.1.2, linked clusters can be used. An RSCT peer domain that operates on linked clusters encompasses all nodes at each site. The nodes that comprise each site cluster are all members of the same peer domain.
Figure 4-8 shows how this looks from an architecture point of view.
Figure 4-8 RSCT peer domain and CAA linked cluster
Example 4-34 shows what the RSCT peer domain looks like in our 2-node cluster.
Example 4-34 Output of the lsrpdomain command
# lsrpdomain
Name OpState RSCTActiveVersion MixedVersions TSPort GSPort
primo_s1_n1_cluster Online 3.1.5.0 Yes 12347 12348
# lsrpnode
Name OpState RSCTVersion
primo_s2_n1 Online 3.2.1.0
primo_s1_n1 Online 3.2.1.0
#
Because we defined each of our nodes to a different site, the lscluster -c command lists one node per site and a multicast address for each site. Example 4-35 shows a sample output from node 1.
Example 4-35 Output of the lscluster command (node 1)
# lscluster -c
Cluster Name: primo_s1_n1_cluster
Cluster UUID: d34e8658-8894-11e5-8002-6e8ddb7b3702
Number of nodes in cluster = 2
Cluster ID for node primo_s1_n1: 1
Primary IP address for node primo_s1_n1: 192.168.100.20
Cluster ID for node primo_s2_n1: 2
Primary IP address for node primo_s2_n1: 192.168.100.21
Number of disks in cluster = 4
Disk = hdisk2 UUID = 2f1b2492-46ca-eb3b-faf9-87fa7d8274f7 cluster_major = 0 cluster_minor = 1
Disk = UUID = 6c1b76e1-3e0a-ff3c-3c43-cb6c3881c3bf cluster_major = 0 cluster_minor = 2
Disk = hdisk3 UUID = 20d93b0c-97e8-85ee-8b71-b880ccf848b7 cluster_major = 0 cluster_minor = 3
Disk = UUID = 5890b139-e987-1451-211e-24ba89e7d1df cluster_major = 0 cluster_minor = 4
Multicast for site primary_site1: IPv4 228.168.100.20 IPv6 ff05::e4a8:6414
Multicast for site standby_site2: IPv4 228.168.100.21 IPv6 ff05::e4a8:6415
Communication Mode: unicast
Local node maximum capabilities: CAA_NETMON, AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
Effective cluster-wide capabilities: CAA_NETMON, AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
#
Example 4-36 shows the output from node 2.
Example 4-36 Output of the lscluster command (node 2)
# lscluster -c
Cluster Name: primo_s1_n1_cluster
Cluster UUID: d34e8658-8894-11e5-8002-6e8ddb7b3702
Number of nodes in cluster = 2
Cluster ID for node primo_s1_n1: 1
Primary IP address for node primo_s1_n1: 192.168.100.20
Cluster ID for node primo_s2_n1: 2
Primary IP address for node primo_s2_n1: 192.168.100.21
Number of disks in cluster = 4
Disk = UUID = 2f1b2492-46ca-eb3b-faf9-87fa7d8274f7 cluster_major = 0 cluster_minor = 1
Disk = UUID = 20d93b0c-97e8-85ee-8b71-b880ccf848b7 cluster_major = 0 cluster_minor = 3
Disk = hdisk2 UUID = 5890b139-e987-1451-211e-24ba89e7d1df cluster_major = 0 cluster_minor = 4
Disk = hdisk1 UUID = 6c1b76e1-3e0a-ff3c-3c43-cb6c3881c3bf cluster_major = 0 cluster_minor = 2
Multicast for site standby_site2: IPv4 228.168.100.21 IPv6 ff05::e4a8:6415
Multicast for site primary_site1: IPv4 228.168.100.20 IPv6 ff05::e4a8:6414
Communication Mode: unicast
Local node maximum capabilities: CAA_NETMON, AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
Effective cluster-wide capabilities: CAA_NETMON, AUTO_REPOS_REPLACE, HNAME_CHG, UNICAST, IPV6, SITE
#
4.4 PowerHA, Reliable Scalable Clustering Technology, and Cluster Aware AIX
Starting with PowerHA V7.1, the CAA component is used instead of the RSCT Topology Services in a PowerHA V7 setup. Figure 4-9 shows the connections between PowerHA V7, RSCT, and CAA. Mainly, the path from PowerHA to RSCT Group Services, and from there to CAA and back, is used. The potential communication to RMC is rarely used.
Figure 4-9 PowerHA, Reliable Scalable Clustering Technology, and Cluster Aware AIX overview
4.4.1 Configuring PowerHA, Reliable Scalable Clustering Technology, and Cluster Aware AIX
There is no need to configure RSCT or CAA separately. You only need to configure or migrate PowerHA, as shown in Figure 4-10. To set it up, use the smitty sysmirror panels or the clmgr command (a minimal clmgr sketch follows Figure 4-10). The different migration processes operate in a similar way.
Figure 4-10 Set up PowerHA, Reliable Scalable Clustering Technology, and Cluster Aware AIX
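The following minimal sketch shows one way to create and synchronize a basic 2-node cluster with the clmgr command. The cluster name, node names, and repository disk are placeholders, and the exact attribute names can differ between PowerHA versions, so treat it as an outline rather than a definitive procedure.
# Placeholder names: mycluster, nodes nodeA and nodeB, repository disk hdisk1.
clmgr add cluster mycluster NODES=nodeA,nodeB REPOSITORY=hdisk1
# Verify and synchronize the configuration; this step creates the
# CAA cluster and the RSCT peer domain in the background.
clmgr sync cluster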
4.4.2 Relationship between PowerHA, Reliable Scalable Clustering Technology, and Cluster Aware AIX
This section describes, from a high-level point of view, the relationship between PowerHA, RSCT, and CAA. The intention of this section is to give you a general understanding of what is running in the background. The examples in this section are based on a 2-node cluster.
In normal situations, there is no need to use CAA or RSCT commands because these components are managed by PowerHA.
All PowerHA components are up
In a cluster where the state of PowerHA is up on all nodes, you also have all of the RSCT and CAA services running, as shown in Figure 4-11.
Figure 4-11 All cluster services are up
To check whether the services are up, you can use different commands. In the following examples, we use the clmgr, clRGinfo, lsrpdomain, and lscluster commands. Example 4-37 shows the output of the clmgr and clRGinfo PowerHA commands.
Example 4-37 Checking PowerHA when all is up
# clmgr -a state query cluster
STATE="STABLE"
# clRGinfo
-----------------------------------------------------------------------------
Group Name Group State Node
-----------------------------------------------------------------------------
Test_RG ONLINE CL1_N1
OFFLINE CL1_N2
#
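If you want to script this check, you can evaluate the output of the clmgr command. The following minimal sketch assumes only the STATE values that are shown in this section (STABLE, WARNING, and OFFLINE).
state=$(clmgr -a state query cluster)   # prints, for example, STATE="STABLE"
case "$state" in
    *STABLE*)  echo "PowerHA cluster is stable" ;;
    *WARNING*) echo "PowerHA cluster reports a warning" ;;
    *)         echo "PowerHA cluster state is $state" ;;
esac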
To check whether RSCT is running, use the lsrpdomain command. Example 4-38 shows the output of the command.
Example 4-38 Checking for RSCT when all components are running
# lsrpdomain
Name OpState RSCTActiveVersion MixedVersions TSPort GSPort
CL1_N1_cluster Online 3.1.5.0 Yes 12347 12348
#
To check whether CAA is running correctly, we use the lscluster command. You must specify an option with the lscluster command; we use the -m option in Example 4-39. In most cases, any other valid option works as well, but to be sure, use the -m option.
The general behavior is that when you get a valid output, CAA is running. Otherwise, you get an error message that informs you that the cluster services are not active.
Example 4-39 Checking for CAA when all is up
# lscluster -m | egrep "Node name|State of node"
Node name: powerha-c2n1
State of node: UP
Node name: powerha-c2n2
State of node: UP  NODE_LOCAL
#
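This behavior can be used to script the check. The following minimal sketch looks for the error text that lscluster prints when the cluster services are not active (see Example 4-48). It is only a sketch: the exact message wording can differ between AIX levels.
# Report whether CAA is active, based on the lscluster -m output.
out=$(lscluster -m 2>&1)
case "$out" in
    *"Cluster services are not active"*) echo "CAA is not active on this node" ;;
    *)                                   echo "CAA appears to be active on this node" ;;
esac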
One node that is stopped with Unmanage
In a cluster where one node is stopped with the Unmanage option, all of the underlying components (RSCT and CAA) remain running. Figure 4-12 illustrates what happens when LPAR A is stopped with the Unmanage option.
Figure 4-12 One node where all RGs are unmanaged
The following examples use the same commands as in “All PowerHA components are up” on page 101 to check the status of the different components. Example 4-40 shows the output of the clmgr and clRGinfo PowerHA commands.
Example 4-40 Checking PowerHA: One node in state Unmanaged
# clmgr -a state query cluster
STATE="WARNING"
# clRGinfo
-----------------------------------------------------------------------------
Group Name Group State Node
-----------------------------------------------------------------------------
Test_RG UNMANAGED CL1_N1
UNMANAGED CL1_N2
#
As expected, the output of the lsrpdomain RSCT command shows that RSCT is still online (see Example 4-41).
Example 4-41 Checking RSCT: One node in state Unmanaged
# lsrpdomain
Name OpState RSCTActiveVersion MixedVersions TSPort GSPort
CL1_N1_cluster Online 3.1.5.0 Yes 12347 12348
#
Also, as expected, checking for CAA shows that it is running, as shown in Example 4-42.
Example 4-42 Checking CAA: One node in state Unmanaged
# lscluster -m | egrep "Node name|State of node"
Node name: powerha-c2n1
State of node: UP
Node name: powerha-c2n2
State of node: UP  NODE_LOCAL
#
PowerHA stopped on all nodes
When you stop PowerHA on all cluster nodes, you get the situation that is shown in Figure 4-13: PowerHA is stopped on all cluster nodes, but RSCT and CAA are still running. You have the same situation after a restart of all your cluster nodes (assuming that you do not use the automatic startup of PowerHA).
Figure 4-13 PowerHA stopped on all cluster nodes
Again, we use the same commands as in “All PowerHA components are up” on page 101 to check the status of the different components. Example 4-43 shows the output of the clmgr and clRGinfo PowerHA commands.
As expected, the clmgr command shows that PowerHA is offline, and clRGinfo returns an error message.
Example 4-43 Checking PowerHA: PowerHA stopped on all cluster nodes
# clmgr -a state query cluster
STATE="OFFLINE"
# clRGinfo
Cluster IPC error: The cluster manager on node CL1_N1 is in ST_INIT or NOT_CONFIGURED state and cannot process the IPC request.
#
The output of the RSCT lsrpdomain command shows that RSCT is still online (Example 4-44).
Example 4-44 Checking RSCT: PowerHA stopped on all cluster nodes
# lsrpdomain
Name OpState RSCTActiveVersion MixedVersions TSPort GSPort
CL1_N1_cluster Online 3.1.5.0 Yes 12347 12348
#
The check for CAA shows that it is running, as shown in Example 4-45.
When RSCT is running, CAA must be up as well; this statement is true only for a PowerHA cluster.
Example 4-45 Checking CAA: PowerHA stopped on all cluster nodes
# lscluster -m | egrep "Node name|State of node"
Node name: powerha-c2n1
State of node: UP
Node name: powerha-c2n2
State of node: UP  NODE_LOCAL
#
All cluster components are stopped
By default, CAA and RSCT are automatically started as part of an operating system restart (if the system is configured by PowerHA).
There are situations when you need to stop all three cluster components, for example, when you must change the RSCT or CAA code, as shown in Figure 4-14 on page 105.
To stop all cluster components, you can use the clmgr off cluster STOP_CAA=yes command. For more information about starting and stopping CAA, see 4.4.3, “How to start and stop CAA and RSCT” on page 106.
Figure 4-14 All cluster services stopped
Example 4-46 shows the status of the cluster with all services stopped. As in the previous examples, we use the clmgr and clRGinfo commands.
Example 4-46 Checking PowerHA: All cluster services stopped
# clmgr -a state query cluster
STATE="OFFLINE"
root@CL1_N1:/home/root# clRGinfo
Cluster IPC error: The cluster manager on node CL1_N1 is in ST_INIT or NOT_CONFIGURED state and cannot process the IPC request.
#
The lsrpdomain command shows that the RSCT cluster is offline, as shown in Example 4-47.
Example 4-47 Checking RSCT: All cluster services stopped
# lsrpdomain
Name OpState RSCTActiveVersion MixedVersions TSPort GSPort
CL1_N1_cluster Offline 3.1.5.0 Yes 12347 12348
#
The lscluster command returns an error message in this case, as shown in Example 4-48.
Example 4-48 Check CAA: All cluster services stopped
# lscluster -m
lscluster: Cluster services are not active on this node because it has been stopped.
#
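As a summary of this section, the three layers can be checked together. The following minimal sketch combines the commands that are used in the previous examples; it assumes that lscluster returns a nonzero exit status when the cluster services are not active, and it only prints the raw state of each layer.
rsct_state=$(lsrpdomain | awk 'NR==2 {print $2}')   # OpState column: Online or Offline
echo "PowerHA: $(clmgr -a state query cluster)"     # STATE="STABLE", "WARNING", or "OFFLINE"
echo "RSCT: $rsct_state"
if lscluster -m > /dev/null 2>&1; then              # assumption: nonzero exit when CAA is stopped
    echo "CAA: active"
else
    echo "CAA: not active"
fi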
4.4.3 How to start and stop CAA and RSCT
CAA and RSCT are stopped and started together. They are automatically started as part of an operating system start (if the system is configured by PowerHA).
If you want to stop CAA and RSCT, you must use the clmgr command (at the time of writing, SMIT does not support this operation). To stop them, use the STOP_CAA=yes argument. This argument applies to both CAA and RSCT, and it can be used for the complete cluster or for a set of nodes.
The fact that you stopped CAA manually is preserved across an operating system restart. Therefore, if you want to start PowerHA on a node where CAA and RSCT were deliberately stopped, you must use the START_CAA=yes argument.
To start CAA and RSCT, you can use the clmgr command with the argument START_CAA=yes. This command also starts PowerHA.
Example 4-49 shows how to stop or start CAA and RSCT. Each of these commands stops or starts all three components (PowerHA, RSCT, and CAA).
Example 4-49 Using clmgr to start and stop CAA and RSCT
To Stop CAA and RSCT:
- clmgr off cluster STOP_CAA=yes
- clmgr off node system-a STOP_CAA=yes
 
To Start CAA and RSCT:
- clmgr on cluster START_CAA=yes
- clmgr on node system-a START_CAA=yes
Starting with AIX 7.1 TL4 or AIX 7.2, you can also use the clctrl command to stop or start CAA and RSCT. To stop them, use the -stop option of the clctrl command; this action also stops PowerHA. To start CAA and RSCT, use the -start option. If -start is used, only CAA and RSCT start; to start PowerHA, you must use the clmgr command or SMIT afterward.
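The following minimal sketch shows how these options can be combined. The cluster name myclustername is a placeholder, and the exact clctrl flags can vary by AIX level, so check the clctrl documentation for your release before using them.
clctrl -stop -n myclustername -a    # stop CAA and RSCT on all nodes (PowerHA goes down too)
clctrl -start -n myclustername -a   # start CAA and RSCT only on all nodes
clmgr on cluster                    # PowerHA must be started separately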