CHAPTER 7

Shell Scripting

Shell scripting is an important skill that every professional DBA must possess. Most DBAs use this coding technique to automate many critical database administration tasks such as backups and monitoring. As part of your job, you'll be asked to maintain scripts that prior DBAs have written, and you'll also need to write new shell scripts. The better you are at scripting, the better you'll be able to perform your job. To be effective at shell scripting, DBAs minimally need to be familiar with four types of tools:

  • SQL
  • Text editor
  • Shell interface
  • Shell scripting language

The most common tools that map to the previous list are SQL*Plus, the vi (or vim) editor, and the Bash shell; the Bash shell serves as both a shell interface and a scripting language. That bulleted list is the minimal set of tools you should be familiar with. You'll also be invoking a wide variety of Oracle utilities and features from shell scripts, such as SQL*Plus, RMAN, Data Pump, external tables, and so on.

As discussed in previous chapters, a shell is a command-line interface that allows you to interact with the Linux/Solaris kernel. Some shells that are commonly used by DBAs and developers are the Bash shell (bash), Korn shell (ksh), and C shell (csh or improved tcsh). As explained in Chapter 3, this book focuses on the Bash shell, so we won’t rehash those details here.

A shell script is an OS file that contains one or more commands that a shell can execute. Any command that the shell can execute can be placed in a shell script file. For example, a shell script can run other shell scripts, executable binary files (such as sqlplus or rman), or any system utility (such as df, ps, and so on).

DBAs use shell scripts for critical tasks such as proactively monitoring the database and taking backups. These critical tasks need to be repeatable and reliable. DBAs typically use shell scripts to run a series of commands that accomplish the desired task. DBAs like shell scripting because they don’t need to be expert programmers to write shell scripts. With a base knowledge (and a good example), it is fairly quick and easy to create a shell script that performs the desired task.

The purpose of this chapter is to provide a core set of techniques to enable you to successfully write shell scripts.

This chapter does not cover all the facets of shell scripting (that would take an entire book to accomplish). Instead, we give you the common approaches and shell scripting fundamentals that DBAs need to perform their jobs, along with useful real-world examples. Once you become adept at shell scripting, you'll be able to leverage these skills to automate critical database jobs. One could argue that a database administrator might exclusively use a graphical user interface (GUI) to automate database tasks. However, we strongly recommend you learn how to write shell scripts to automate database administration tasks. If the GUI ever becomes inaccessible or doesn't do something you need it to do, for your job's sake, you had better know how to write and debug a shell script.

The first recipe of this chapter introduces you to the basics of how to write a shell script.

7-1. Writing a Simple Shell Script

Problem

You’re new to shell programming, and you want to write a simple script to determine whether your database is accessible.

Solution

Use an editor to create a new file (see Chapter 4 for details on using the vi file editor). Place the following text within the file:

#!/bin/bash
# Check if database is up.
ORACLE_SID=O1212
ORACLE_HOME=/orahome/app/oracle/product/12.1.0.2/db_1
PATH=$ORACLE_HOME/bin:$PATH
echo "select 'DB up' from dual;" | sqlplus -s system/foo
exit 0

Modify the previous code to match your ORACLE_SID and ORACLE_HOME variables and your system schema password. After you create the file, you’ll also need to modify the permissions on the file to be executable. In this example, the file name is dbcheck.bsh:

$ chmod +x dbcheck.bsh

This code changes the permission of the file to executable (x). Now you should be able to successfully run the program:

$ dbcheck.bsh

If your database is up, you should receive a message like this:

'DBUP
-----
DB up

All the checking within the script is performed with one line of code:

echo "select 'DB up' from dual;" | sqlplus -s system/foo

The echo command pipes a valid SQL statement to the sqlplus executable. The sqlplus executable will then attempt to log on with the system schema and run the statement.

Note  Adding the file extensions .sh, .bsh, or .bash to the end of a Bash shell script is a common industry practice. Keep in mind that file extensions are meaningless in Linux/Solaris environments (other than helping you document the type of script). This is different from DOS, in which .exe, .com, and .bat indicate executable OS files.

How It Works

The first line of the shell script in the “Solution” section of this recipe needs a little more explanation. We reproduced the line here for the discussion:

#!/bin/bash

The # character is normally used to comment out a line in a shell script; for example:

# This is a comment.

One exception to that rule is when #! appears as the first text in the shell script. When #! is placed on the first line, it can then be combined with a path and program name. The path and program name specify the location and name of the program that will interpret the commands within the script. This is important because it means you can have the script run with a designated shell regardless of the interactive shell you’re using.

Note  The two-character sequence of #! is sometimes referred to as the shebang.

If you don’t specify a #!/<path>/<program>, the shell you’re currently logged on to will be used to interpret the commands within the script. We recommend that you specify the path and shell program on the first line of your script so that there is no ambiguity about which shell will be used to interpret the commands within the script.
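To see the effect of the shebang, the following short script (a hypothetical example) prints the version of the bash interpreter running it. Because of the #!/bin/bash line, the BASH_VERSION variable is set even if you launch the script from a ksh or csh session:

```shell
#!/bin/bash
# BASH_VERSION is only set when bash itself is interpreting the script,
# so this confirms which shell the shebang selected.
echo "Interpreted by bash version: $BASH_VERSION"
exit 0
```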

On most systems, the bash interpreter is in the /bin directory. If you don’t know the location of your bash executable, you can use the which or whereis command to locate it:

$ whereis bash
bash: /bin/bash

When you first attempt to run a shell script, you may receive an error similar to the following:

-bash: dbcheck.bsh: command not found

It means that your PATH variable doesn’t include the current working directory. To work around this, you can reset the PATH variable to include the current working directory. This example exports the PATH variable to include the current working directory:

$ export PATH=$PATH:.

Another method for ensuring that the Bash shell can locate a script is to include the complete directory path to where the script resides. In this example, the script is located in the directory /home/oracle and is run as shown here:

$ /home/oracle/dbcheck.bsh

You can also instruct the shell to look in the current working directory to determine the location of the script. You do this by placing ./ before the script name:

$ ./dbcheck.bsh

When first running the script, you may also receive a message like this:

Error 6 initializing SQL*Plus
SP2-0667: Message file sp1<lang>.msb not found

This message most likely means that the ORACLE_HOME variable isn’t set correctly. In this situation, ensure that you set the ORACLE_HOME variable to the correct location.

BASH SHELL EXIT COMMAND

You can place an exit command at any location within a shell script to instruct the Bash shell to immediately terminate the program. A successful exit is normally specified with an exit or an exit 0. Exiting a shell script when a failure condition has been detected is indicated by a nonzero value, such as an exit 1. We recommend that you explicitly place an exit 0 command within your shell script to indicate a successful completion. You should also use a nonzero value such as exit 1 when an error condition has been detected.

Each Bash shell command that executes will also return an exit code. If a command executes successfully, it will terminate with a status of 0. If there has been some sort of a failure, the exit code will be nonzero. You can check the status of an exit code by inspecting the $? variable:

$ echo $?
0

The $? variable holds the exit value of the previously executed command. The nonsuccess value of an exit code will vary by each command. For example, the grep utility will return a 0 on successfully finding a match, a 1 if no matches are found, and a 2 if there has been some sort of a syntax error or missing input.
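The following snippet illustrates grep's exit codes. Because the exit status of a pipeline is the status of its last command, $? here reflects the grep result:

```shell
#!/bin/bash
# grep returns 0 when a match is found...
echo "ora_smon_O1212" | grep smon > /dev/null
echo "exit code after a match: $?"      # prints 0
# ...and 1 when no match is found.
echo "ora_smon_O1212" | grep xyz > /dev/null
echo "exit code after no match: $?"     # prints 1
exit 0
```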

7-2. Checking Simple Conditions

Problem

You want to check for a condition such as whether a critical database background process is running and send an e-mail if there is a problem.

Solution

Use the if/then/else Bash control structure to check for a condition and perform an appropriate action. The following example uses an if/then/else structure to determine whether the Oracle system monitor (SMON) process is running and sends an e-mail if the process is not detected:

#!/bin/bash
# Check for background process.
ORACLE_SID=O1212
critProc=ora_smon
ps -ef | grep -v 'grep' | grep ${critProc}_$ORACLE_SID
if [ $? -eq 0 ]; then
  echo "SMON $ORACLE_SID is available."
else
  echo "SMON $ORACLE_SID issue." | mailx -s "issue with SMON $ORACLE_SID" [email protected]
fi
exit 0

Place the preceding code in a file named bgcheck.bsh and make it executable:

$ chmod +x bgcheck.bsh

Then run it:

$ ./bgcheck.bsh

If the SMON process is running, you’ll see output similar to the following:

oracle   27910     1  0 May11 ?        00:00:00 ora_smon_O1212
SMON O1212 is available.

The previous example uses the $? variable. This variable is often used after conditional statements to evaluate the success or failure of the previous command. The $? variable contains the status of the previous command executed. If the previously executed command was successful, the $? variable will contain a 0; otherwise, it will contain a nonzero value.

How It Works

The if/then/else control structure comes in three basic forms. The first one states that if a condition is true, execute the following commands. The syntax is as follows:

if condition ; then
  commands
fi

On the first line of code in the previous example, the keyword then is a separate command from the if keyword, so you must insert a semicolon to mark the end of the if condition. Another way of writing the previous bit of code is as follows:

if condition
then
  commands
fi

The next form of the if/then/else structure states if a condition is true, execute the following commands. If the first condition is false, execute a separate set of commands. Its syntax is as follows:

if condition ; then
  commands
else
   commands
fi

The third form of the if/then/else structure states that if a condition is true, execute the first set of commands; otherwise, check for the next condition. If it is true, execute the commands. This functionality is enabled with the elif keyword. You can have many elif conditions in this form. Its syntax is as follows:

if condition ; then
   commands
elif condition ; then
   commands
elif condition ; then
   commands
fi

Tip  You can also check for success (or not) of a command via the two conditional operators || and &&. See recipe 7-8 for more details on conditional operators.

7-3. Testing a Condition

Problem

You want to write a script that checks for certain conditions, such as the number of parameters passed to a script. Based on the condition, you want to perform an action such as displaying an informational message or exiting the script.

Solution

As shown in recipe 7-2, the if/then/else structure is an important programming technique. However, it is the combination of if/then/else with a condition that can be tested that gives you a much more powerful tool to automate DBA tasks. The test command enables you to check a condition within an if command. Here is the basic syntax for the test command:

test operand1 operator operand2

The test command can also be written with the [ ] syntax. This syntax uses a left square bracket to start the command and then finishes the command with a right square bracket. Its syntax is as follows:

[ operand1 operator operand2 ]

Note  The shell script examples in this chapter use the [ ] form of the test command.

For some test conditions, an operand1 isn’t required. The syntax for this condition is as follows:

[ operator operand2 ]

The previous test conditional checks will exit with a status of 0 (true) or 1 (false), depending on the evaluation of the condition. Ensure that you have a space between the operands, operators, and brackets. The space is how the shell knows where the operator and operand are separated. If there is no space between the operator, operand, and brackets, the shell will interpret the value as one string, which will result in erroneous outcomes.
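The spacing requirement is easy to get wrong, so here's a minimal sketch showing the correct form, with the failure mode noted in a comment:

```shell
#!/bin/bash
x=5
# Correct: spaces separate the brackets, operands, and operator.
if [ "$x" -eq 5 ]; then
  echo "x is 5"
fi
# Incorrect: [$x -eq 5] would be parsed as a single word, typically
# failing with a "command not found" style error.
exit 0
```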

To bring if/then/else and test together, we’ll write a small but useful piece of code that checks to see whether the correct number of parameters are passed to a script. The script will use the $# variable. The $# variable automatically gets assigned to the number of positional parameters typed at the command line and passed into the script. This variable is handy when you want to check for the correct number of parameters passed to a script. The script will also use the $1 variable to display the first parameter passed to the script. When parameters are passed to a script, the first parameter is automatically stored in the $1 variable, the second parameter is stored in $2, and so on.

The following bit of code uses the -ne conditional check to determine whether the number of parameters passed to the script is not equal to 1:

#!/bin/bash
if [ $# -ne 1 ]
then
  echo "Wrong number of parameters passed to script."
  exit 1
else
  echo "$1 passed to the script"
fi
exit 0

We next place this code in a script named ss.bsh and make it executable:

$ chmod +x ss.bsh

Now run the script with no parameters passed to it:

$ ./ss.bsh

Here is the corresponding output:

Wrong number of parameters passed to script.

Now run the script with one parameter passed to it:

$ ./ss.bsh testparm

Here is the output:

testparm passed to the script

The $0 parameter is often used in conjunction with the $# parameter to display the syntax required when invoking a script. Within a shell script, the $0 parameter contains the name of the shell script being executed. Here’s a slight variation of the previous code that uses the $0 variable to display the name of the script:

#!/bin/bash
if [ $# -ne 1 ]
then
  echo "Wrong number of parameters passed to script."
  echo "Usage: $0 ORACLE_SID"
  exit 1
else
  echo "$1 passed to the script"
fi
exit 0

The -ne operator is an arithmetic operator and is used to test whether the operands are not equal. If the script (named ss.bsh) is called without passing exactly one parameter to it, the following output is displayed:

Wrong number of parameters passed to script.
Usage: ./ss.bsh ORACLE_SID

Notice the ./ in front of the script name in the previous output. To eliminate the ./ from the output, use the basename command. This utility is used to strip any directory or suffix information from file names; for example:

#!/bin/bash
Pgm=$(basename $0)
if [ $# -ne 1 ]
then
  echo "Wrong number of parameters passed to script."
  echo "Usage: $Pgm ORACLE_SID"
  exit 1
else
  echo "$1 passed to the script"
fi
exit 0

The script is executed again:

$ ./ss.bsh

Notice in the output that the name of the script is displayed without directory information:

Wrong number of parameters passed to script.
Usage: ss.bsh ORACLE_SID

One last note: the following line of code from the prior script needs a bit more explanation:

Pgm=$(basename $0)

The previous line uses a technique known as command substitution. Command substitution allows you to take the output of a command and populate a variable. The basic syntax for doing this is as follows:

variable=$(shell commands)

This is a powerful feature that allows you to populate variables with the output of other commands. In our example, we took the output of basename $0 and populated the Pgm variable. Be aware that in older versions of the shell you may see command substitution implemented with the backtick syntax:

variable=`shell commands`

This is an older style and won’t be used in any of our examples.
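As a further illustration of command substitution, the following sketch (the variable names are our own) populates variables from the date and uname commands and uses them to build a message:

```shell
#!/bin/bash
# Capture command output into variables via $(...).
today=$(date +%Y-%m-%d)   # current date in YYYY-MM-DD format
hostName=$(uname -n)      # name of the server
echo "Report generated on $hostName at $today"
exit 0
```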

How It Works

Testing for conditions is an integral part of shell script logic. The “Solution” section provides a practical example of using the test command with the [ ] syntax. There are several additional use cases for which you’ll need to test for conditions (e.g., an arithmetic condition, a string, if a file exists, and so on). Several real-world scenarios are explored in the following sections.

Testing an Arithmetic Condition

We have occasionally been in a situation in which an abnormally high number of SQL sessions get connected to the database. It might be due to an erroneous program initiating connections to the database and never exiting. In this scenario, it is useful to have a shell script that periodically checks to see whether the number of SQL processes running on the server is less than a certain threshold.

Here’s a simple script that checks to see whether the number of SQL processes is fewer than 300. If there are 300 or more processes executing, the script sends an e-mail:

#!/bin/bash
crit_var=$(ps -ef | grep sqlplus | grep -v grep | wc -l)
if [ $crit_var -lt 300 ]; then
  echo $crit_var
  echo "processes running normal"
else
  echo "too many processes"
  echo $crit_var | mailx -s "too many sqlplus procs" [email protected]
fi
exit 0

In the prior script, command substitution is used to populate the crit_var variable with the output of the ps command. The value of crit_var is then tested with the -lt (less than) arithmetic operator.

Several arithmetic operators are available with the Bash shell. Table 7-1 gives a brief description of each operator.

Table 7-1. Arithmetic Operators

Operator   Description
---------  -----------------------------------------------------
-eq        True if two integers are equal
-ne        True if two integers are not equal
-lt        True if operand1 is less than operand2
-le        True if operand1 is less than or equal to operand2
-gt        True if operand1 is greater than operand2
-ge        True if operand1 is greater than or equal to operand2
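As a quick illustration of these arithmetic operators, the following sketch (the counts and threshold are made-up values) uses -ge to test whether a session count has reached a limit:

```shell
#!/bin/bash
sessCnt=310     # hypothetical number of current sessions
threshold=300   # hypothetical alert threshold
# -ge: true if sessCnt is greater than or equal to threshold.
if [ "$sessCnt" -ge "$threshold" ]; then
  echo "session count at or above threshold"
else
  echo "session count below threshold"
fi
exit 0
```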

Testing Strings

You can use strings with test conditions, and there is a wide variety of ways to use string comparisons. For example, you may want to check to ensure that you’re logged on as a certain OS user before you run commands in a script. This example checks to see whether the user running the script is oracle:

#!/bin/bash
checkUser=oracle
curWho=$(whoami)
if [ "$curWho" != "$checkUser" ]; then
  echo "You are currently logged on as: $curWho"
  echo "Must be logged in as $checkUser to run this script.."
  exit 1
fi
exit 0

In the preceding bit of code, we use command substitution to populate the curWho variable with the output of the whoami command. The curWho variable is then checked to see whether it matches the string of oracle via the != (not equal) operator. If the user doesn’t match, the script displays informational messages and exits the script. Table 7-2 lists test operations for strings and their descriptions.

Table 7-2. String Operators

String Operator     Description
------------------  --------------------------------------
-z string           True if the string is empty
-n string           True if the string is not empty
string1 = string2   True if string1 equals string2
string1 != string2  True if the strings are not equal
string1 < string2   True if string1 sorts before string2
string1 > string2   True if string1 sorts after string2

Accepting Input from the Command Line

Another useful example of a string comparison is to read user input from the keyboard and verify an operation. Suppose you want to check the current Oracle SID variable before continuing to run more commands within the script. This is useful if you work with multiple databases contained on one physical server. This script displays the value of ORACLE_SID and asks whether you want to continue running the script:

#!/bin/bash
keepGoing=n
echo "Current value of ORACLE_SID: $ORACLE_SID"
echo -n "Do you want to continue? y/n "
read keepGoing
if [ "$keepGoing" = "y" ]; then
  echo "Continue to run script."
else
  echo "Exiting script"
  exit 1
fi
exit 0

Testing for the Existence of a File

In addition to arithmetic and string comparisons, you can also perform various tests on OS files. The test command allows you to perform checks such as the availability of a file, the file type, and so on. For example, you may want to determine whether a log file exists; if it does, you want it to send an e-mail to the appropriate support person. This script uses the -e (exists) parameter of the test command to determine this:

#!/bin/bash
checkFile=/home/trace/error.log
if [ -e $checkFile ]; then
  mail -s "error.log exists" [email protected] <$checkFile
else
  echo "$checkFile does not exist"
fi
exit 0

If you want your shell script to do nothing after checking for a condition, use the colon (:) command (sometimes called no-operation or null). For example, the following bit of code does nothing if it detects that the given file exists:

#!/bin/bash
checkFile=/home/oracle/error.log
if [ -e $checkFile ]; then
  :
else
  echo "$checkFile does not exist"
fi
exit 0

Table 7-3 contains descriptions of the Bash shell tests for file operations.

Table 7-3. File Operators

File Operator     Description
----------------  ------------------------------------------------------------
-a                True if file exists
-b                True if file is a block device file
-c                True if file is a character device file
-d                True if file is a directory
-e                True if file exists
-f                True if file exists and is a regular file
-g                True if file has the set-group-id permission set
-h                True if file is a symbolic link
-L                True if file is a symbolic link
-k                True if file's sticky bit is set
-p                True if file is a named pipe
-r                True if file is readable (by the current user)
-s                True if file exists and is not empty
-S                True if file is a socket
-u                True if file is set-user-id
-w                True if file is writable (by the current user)
-x                True if file is executable
-O                True if file is effectively owned by the current user
-G                True if file is effectively owned by the current user's group
-N                True if file has been modified since it was last read
file1 -nt file2   True if file1 is newer than file2
file1 -ot file2   True if file1 is older than file2
file1 -ef file2   True if file1 is a hard link to file2

Tip  The test command options will vary by OS version. For a complete listing of available test operations in your environment, use the help test command or the man test command.
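As one more example, this sketch (the file paths are hypothetical) combines the -e and -nt operators to report when a log file has been updated more recently than a saved copy:

```shell
#!/bin/bash
# Hypothetical paths; adjust for your environment.
curLog=/home/oracle/error.log
oldCopy=/home/oracle/error.log.bak
# -e: the log exists; -nt: the log is newer than the saved copy.
if [ -e "$curLog" ] && [ "$curLog" -nt "$oldCopy" ]; then
  echo "$curLog is newer than $oldCopy"
fi
exit 0
```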

7-4. Checking Complex Conditions

Problem

You need to perform a sophisticated set of condition checks, such as checking for free disk space on a server. When a particular mount point reaches a certain threshold, you want to send an e-mail to the DBA team. You suspect that you’ll need something more sophisticated than if/then/else checking, such as a case statement.

Solution

In many cases, a simple if/then/else construct is all you need to check a condition. However, as soon as you are presented with many different actions to take, the if/then/else syntax can become unwieldy and nonintuitive. In these situations, use a case statement instead. The basic syntax for a case statement is as follows:

case expression in
  pattern1)
     commands ;;
  pattern2)
     commands ;;
esac

The next example in this section uses a case statement to check for free disk space. The script relies on the output of the df command, so to understand how the script works, first run df -h and view its output:

$ df -h

Here’s the corresponding output for this server:

Filesystem            Size  Used Avail Use% Mounted on
/dev/xvda2            191G  165G   17G  91% /
/dev/xvda1            996M  136M  809M  15% /boot

The two mount points on this server are / and /boot. Looking ahead to the script, the mntlist variable within the script has been defined to match the mount points on this server that require monitoring for used space:

mntlist="/ /boot"

With that understanding, consider the following code, which uses a case statement to perform several checks:

#!/bin/bash
BOX=$(uname -a | awk '{print $2}')
mntlist="/ /boot"
for ml in $mntlist
do
  echo "Mount point: $ml"
  usedSpc=$(echo $(df -h $ml | awk '{print $5}' | grep -v Use | cut -d "%" -f1 -))
  case $usedSpc in
    [0-9])
      diskStat="relax, lots of disk space: $usedSpc"
      ;;
    [1-7][0-9])
      diskStat="disk space okay: $usedSpc"
      ;;
    [8][0-9])
      diskStat="space getting low: $usedSpc"
      ;;
    [9][0-9])
      diskStat="warning, running out of space: $usedSpc"
      echo $diskStat $ml | mailx -s "space on: $BOX" [email protected]
      ;;
    [1][0][0])
      diskStat="update resume, no space left: $usedSpc"
      echo $diskStat $ml | mailx -s "space on: $BOX" [email protected]
      ;;
    *)
      diskStat="huh?: $usedSpc"
      ;;
  esac
  # end case
  echo $diskStat
done
# end for
exit 0

Assume that the preceding code is placed in a script named filesp.bsh and made executable:

$ chmod +x filesp.bsh

Next the script is executed:

$ ./filesp.bsh

And the output is displayed:

Mount point: /
warning, running out of space: 91
Mount point: /boot
disk space okay: 15

Within the script, the usedSpc variable gets assigned a value that shows what percentage of disk space is used on a mount point. The case statement then examines usedSpc to determine within which range the variable falls. Finally, if a given mount point exceeds the 90% full threshold, the script e-mails the DBA a message to indicate that there could be an issue.

How It Works

The code in the “Solution” section of this recipe uses shell commands, a case statement, and various coding techniques in a few lines of code. The result is a small but extremely useful script that monitors disk space. The usedSpc line of the script needs additional explanation. We repeat it here for convenience:

usedSpc=$(echo $(df -h $ml | awk '{print $5}' | grep -v Use | cut -d "%" -f1 -))

The usedSpc variable is contained within a looping structure for ml in $mntlist. The loop executes for each mount point defined in the mntlist variable. The ml variable is assigned for the current mount point being examined. The output of the df command is piped to the awk command, which extracts the fifth column. This in turn is passed to the grep command, which eliminates any output that contains the string "Use". This output is piped to the cut command, which extracts the first field delimited by a % character.

Note that there’s an echo command embedded into the line of code; it is used because there’s extra space included in the output of the df command on some systems, and echo removes that extra space. The resultant string should be the percentage of disk space used on the mount point in question.
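You can see this whitespace-trimming behavior of echo in isolation. In this sketch, the unquoted variable is word-split by the shell, and echo outputs the remaining word without the surrounding spaces:

```shell
#!/bin/bash
raw="   91   "
# $raw is deliberately unquoted: the shell word-splits it, and
# echo emits the remaining word with no surrounding whitespace.
trimmed=$(echo $raw)
echo "before: [$raw]  after: [$trimmed]"   # prints: before: [   91   ]  after: [91]
exit 0
```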

You may have to tweak the usedSpc line of code, depending on the output of df for your system. For example, the output of the df command might not display the string "Use" on some platforms. Case in point: on some Solaris systems, the output from the df command displays the string "Capacity" to indicate the amount of disk space used; for example:

$ df -h

Here is some sample output on a Solaris system:

Filesystem             Size   Used  Available Capacity  Mounted on
orapool1/ora01         350G   203G       147G      59%  /ora01
orapool2/ora02         350G   265G        85G      76%  /ora02
orapool1/ora03         350G   254G        96G      73%  /ora03

In this situation, you can modify the script to use grep to filter out the string "Capacity"; for example:

usedSpc=$(echo $(df -h $ml | awk '{print $5}' | grep -v Capacity | cut -d "%" -f1 -))

Here’s the corresponding output when filesp.bsh is run on a Solaris system:

Mount point: /ora01
disk space okay: 59
Mount point: /ora02
disk space okay: 76
Mount point: /ora03
disk space okay: 73

Also note that for this Solaris system, the mntlist variable in the script needs to be defined as follows:

mntlist="/ora01 /ora02 /ora03"

Let’s now go back to the example in the “Solution” section of this recipe: the case statement performs a sophisticated set of string comparisons on the value stored in the usedSpc variable. The case statement will check each condition until it finds a match. When a condition is met, the case statement runs any statements within the matched section and then exits.

An example will help clarify this concept. Let’s look at the first condition in the case statement in the “Solution” section of this recipe:

[0-9])
diskStat="relax, lots of disk space: $usedSpc"
;;

In the preceding snippet of code, the case statement checks the value of the variable to see whether it is a one-digit string that contains a value within the range of 0 through 9. If it matches, it sets the diskStat variable to an appropriate message and exits the case statement.

Take a look at the second condition in the case statement:

[1-7][0-9])
diskStat="disk space okay: $usedSpc"
;;

In this bit of code, the case statement checks for a two-digit number. The first character it looks for must be in the range of 1 through 7. The second character can be any number from 0 to 9. If the pattern matches, the diskStat variable is set to an appropriate value, and the case statement exits.

Now examine the conditions near the end of the case statement:

[9][0-9])
diskStat="warning, running out of space: $usedSpc"
echo $diskStat $ml | mailx -s "space on: $BOX" [email protected]
;;
[1][0][0])
diskStat="update resume, no space left: $usedSpc"
echo $diskStat $ml | mailx -s "space on: $BOX" [email protected]
;;

The idea here is that if a mount point is 90% used or above, send an e-mail to the DBA, warning that disk space is getting low. Finally, if no match is made, the catchall clause *) will be executed, and the case statement will be exited.

The structure of the case statement allows you to perform complicated comparisons that would probably drive you crazy if you tried to code them using if/then/else statements. Table 7-4 lists some common pattern-matching characters used in case statements.

Table 7-4. Common Character-Matching Patterns

Pattern      Description
-----------  ------------------------------------------------------
a|b          Matches either a or b
*            Matches any string of characters, often used as a catchall
[abc]        Matches any character a, b, or c
[a-c]        Matches any character a, b, or c
[0-9]        Matches any character 0 through 9
"<string>"   Matches the string enclosed in the quotes
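To illustrate the a|b alternation and the * catchall together, here's a small sketch (the backup actions are hypothetical) that branches on the day of the week:

```shell
#!/bin/bash
day=$(date +%a)   # abbreviated day name, e.g. Mon (locale dependent)
case $day in
  Sat|Sun)
    echo "weekend: run full backup" ;;
  Mon|Tue|Wed|Thu|Fri)
    echo "weekday: run incremental backup" ;;
  *)
    echo "unrecognized day: $day" ;;
esac
exit 0
```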

One final note: typically we’ll use the cron utility (see Chapter 10 for details) to automatically have the disk monitoring script run on a periodic basis (e.g., once every 30 minutes or once per hour). In this way, we’re warned when a server is running out of space. You may be thinking, “Isn’t it the system administrator’s job to monitor disk space?” Yes it is, but we’ve been in numerous situations (too many to count) in which a mount point filled up, the database ceased working, and the production support DBA gets called. By running your own disk space–monitoring script, you’ll give yourself some warning regarding disk fullness issues and save yourself a lot of headaches.

7-5. Repeating a Task

Problem

You want to perform a check on several databases running on a server. You don’t want to have to create a script for each database; in other words, you’d rather write one script in a flexible manner to be used for all databases.

Solution

A for loop allows you to re-run a section of code a fixed number of times. This control construct is particularly useful because DBAs often have a known set of databases or files that need to be operated on. The for loop syntax is as follows:

for name [in list]
do
  commands that can use $name
done

The following code illustrates the power of a for loop. In this environment, there are three databases that are being monitored for a critical background process. The for loop allows you to provide an input list and have the same code re-executed for each database name in the input list:

#!/bin/bash
SID_LIST="dev1 dev2 dev3"
critProc=ora_smon
for curSid in $SID_LIST
do
  ps -ef | grep -v 'grep' | grep ${critProc}_$curSid
  if [ $? -eq 0 ]; then
    echo "$curSid is available."
  else
    echo "$curSid has issues." | mail -s "issue with $curSid" [email protected]
  fi
done
exit 0

In this manner, you can efficiently use code to repeat a task.

How It Works

The for loop iterates through each argument passed in to the parameter list. This control structure is ideal for a fixed input list. (Depending on which shell you use, the syntax may be slightly different from the one described in the “Solution” section.)

There are a few other aspects about looping that require further analysis. For example, you can use the built-in Bash shell $@ variable to pass a parameter list to a for loop. The $@ variable contains a quoted list of arguments passed to the script. By default, a for loop will use $@ if no input list is provided. The previous code snippet can be slightly modified to take advantage of this technique, as shown here:

#!/bin/bash
critProc=ora_smon
for curSid in $@
do
  ps -ef | grep -v 'grep' | grep -i ${critProc}_$curSid
  if [ $? -eq 0 ]; then
    echo "$curSid is available."
  else
    echo "$curSid has issues." | mail -s "issue with $curSid" [email protected]
  fi
done
exit 0

Assume that the preceding bit of code is placed in a file named dbup.bsh. It can now be run from the command line to pass in a list of databases to check:

$ dbup.bsh dev1 dev2 dev3

One last note: there are many different methods to implement a for loop. For example, it is possible to iterate through a for loop based on the output of a command:

for a in $(ls /home/oracle)
do
    echo "$a"
done

You can also iterate based on a sequence of numbers:

for a in {1..10}
do
  echo "$a "
done

Here’s another common method for looping through a sequence of numbers:

for ((a=1; a <= 10; a++))
do
  echo "$a "
done

The method that you choose depends on your personal preference and the task at hand.

7-6. Iterating Until a Condition Is Met

Problem

You want to perform an operation an unknown number of times until a certain condition is achieved.

Solution

The while and until flow control constructs allow a piece of code to iterate until a condition is met. In contrast with a for loop construct, the while and until loops are useful when the number of times needed to continue looping is not known beforehand. The while loop runs until a test condition has a zero exit status. The syntax for the while loop is as follows:

while condition ; do
   commands
done

A small example will demonstrate the utility of the while loop. Suppose that you want to check the sqlplus process count on a box every 15 seconds. If it exceeds a certain value, you want to send an e-mail and exit. Here’s some code to do just that:

#!/bin/bash
crit_var=0
while [ $crit_var -lt 300 ]; do
  crit_var=$(ps -ef | grep sqlplus | wc -l)
  echo "Number of sqlplus processes: $crit_var"
  sleep 15
done
echo $crit_var | mailx -s "too many sqlplus procs" [email protected]
exit 0

The until control construct is similar to the while loop. The until loop runs until a test condition has a nonzero exit status:

until condition ; do
  commands
done

Next is a script that uses an until loop in the same fashion as the while loop example:

#!/bin/bash
crit_var=0
until [ $crit_var -ge 300 ]; do
  crit_var=$(ps -ef | grep sqlplus | wc -l)
  echo "Number of sqlplus processes: $crit_var"
  sleep 15
done
echo $crit_var | mailx -s "too many sqlplus procs" [email protected]
exit 0

In this way, you can continually execute a task until a condition is met.

How It Works

The while or until constructs are useful when you need to iterate but don’t know in advance the number of iterations. In other words, the requirement is to loop until a condition has been met and then exit.

Here’s another useful example of using a while loop. Sometimes it is useful when debugging scripts to iterate through all arguments passed to a shell script and view the parameter values. This snippet of code uses a while loop to display all parameters passed into a script:

while [ $# -ne 0 ]; do
  echo $1
  shift 1
done

The code is placed in a shell script named test.bsh, made executable, and run as follows:

$ chmod +x test.bsh
$ test.bsh dev1 dev2 dev3

Here is the corresponding output:

dev1
dev2
dev3

In the previous code sample, the shift command is used to move the positional parameters one position to the left. You can think of the positional parameters as an array of values, and (when invoked) the shift command (destructively) moves these values left in the array by the specified number of values.

An example helps to clarify this shifty concept. Suppose that there are three parameters passed into a program: A, B, and C. The positional variable $1 will contain A, $2 will contain B, and $3 will contain C. When you issue the shift 1 command, $1 now contains B, $2 contains C, and $3 now contains nothing. Another shift will move C into $1, and $2 and $3 will now be empty, and so forth.
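The A, B, C example can be verified directly. This minimal sketch uses set -- to assign the positional parameters explicitly and then shows the effect of a single shift:

```shell
# Sketch of the A, B, C example: set the positional parameters
# explicitly, then watch shift move them left.
set -- A B C
echo "before shift: \$1=$1 \$2=$2 \$3=$3"   # $1=A $2=B $3=C
shift 1
echo "after shift : \$1=$1 \$2=$2 \$3=$3"   # $1=B $2=C $3=
```

After the shift, $3 is empty and $# reports two remaining parameters.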

One last note: there are many different ways to implement a while loop. Here’s a common while loop structure that C programmers will recognize:

((a = 1))
while (( a <= 10 ))
do
  echo "$a "
  ((a += 1))
done

Here’s an example of using a while loop with a function (more on functions later in this chapter):

a=0
condition ()
{
  ((a++))
  if [ $a -lt 11 ]
  then
    return 0  # true
  else
    return 1  # false
  fi
}
while condition
do
  echo "$a"
done

And here is yet another way to implement an until loop:

a=1
until (( a > 10 ))
do
  echo "$a"
  (( a++ ))
done

The method that you choose depends on your personal preference and the task at hand.

7-7. Displaying a Menu of Choices

Problem

You want to present a menu of choices for the shell script user to pick from.

Solution

The select command allows you to create a menu from an input list. If the input list is omitted, the positional parameters (contained in the $@ variable) are used to construct the menu. The syntax of the select command is nearly identical to that of the for command:

select name in [input list]
do
  commands that use $name
done

Listed next is a shell script that uses the select command to query the contents of the /etc/oratab file (/var/opt/oracle/oratab in Solaris environments) and set your Oracle OS variables, depending on which ORACLE_SID value you choose:

#!/bin/bash
# Sets Oracle environment variables.
# Setup: 1. Put oraset file in /etc (Linux), in /var/opt/oracle (Solaris)
#        2. Ensure /etc or /var/opt/oracle is in $PATH
# Usage: batch mode: . oraset <SID>
#        menu mode:  . oraset
#====================================================
if [ -f /etc/oratab ]; then
  OTAB=/etc/oratab
elif [ -f /var/opt/oracle/oratab ]; then
  OTAB=/var/opt/oracle/oratab
else
  echo 'oratab file not found.'
  exit
fi
#
if [ -z "$1" ]; then
  SIDLIST=$(egrep -v '#|\*' ${OTAB} | cut -f1 -d:)
  # PS3 indicates the prompt to be used for the Bash select command.
  PS3='SID? '
  select sid in ${SIDLIST}; do
    if [ -n "$sid" ]; then
      HOLD_SID=$sid
      break
    fi
  done
else
  if egrep -v '#|\*' ${OTAB} | grep -w "${1}:" >/dev/null; then
    HOLD_SID=$1
  else
    echo "SID: $1 not found in $OTAB"
  fi
  shift
fi
#
export ORACLE_SID=$HOLD_SID
export ORACLE_HOME=$(egrep -v '#|\*' $OTAB|grep -w $ORACLE_SID:|cut -f2 -d:)
export ORACLE_BASE=${ORACLE_HOME%%/product*}
export TNS_ADMIN=$ORACLE_HOME/network/admin
export ADR_BASE=$ORACLE_BASE/diag
export PATH=$ORACLE_HOME/bin:/usr/ccs/bin:/opt/SENSsshc/bin/:/bin:/usr/bin:.:/var/opt/oracle:/usr/sbin:/etc
export LD_LIBRARY_PATH=/usr/lib:$ORACLE_HOME/lib

In this example, the code is placed in a file named oraset in the /etc directory on Linux (or /var/opt/oracle on Solaris). The oraset script is made executable as follows:

$ chmod +x oraset

Before you run oraset, view the contents of the oratab file for this example:

ORA12CR1:/orahome/app/oracle/product/12.1.0.1/db_1:N
O1212:/orahome/app/oracle/product/12.1.0.2/db_1:N
TRG:/orahome/app/oracle/product/12.1.0.2/db_1:N
O112:/orahome/app/oracle/product/11.2.0.4/db_1:N
ORA117:/orahome/app/oracle/product/11.1.0/db_1:N

The names of the databases in the preceding text are ORA12CR1, O1212, and so on. The path to each database’s home directory comes next on the line (separated from the database name by a colon). The last column should contain a Y or an N, which indicates whether you want the database to be automatically restarted when the system reboots (this takes additional setup besides just the Y/N).

Now run oraset from the command line as follows:

$ . /etc/oraset

When running oraset from the command line, you should be presented with a menu such as this (derived from the database names in your oratab file):

1) ORA12CR1
2) O1212
3) TRG
4) O112
5) ORA117
SID?

In this example, you can now enter 1, 2, 3, 4, or 5 to set the OS variables required for whichever database you want to use. This allows you to set up required Oracle OS variables interactively, regardless of the number of database installations on the server.

How It Works

The Bash shell built-in select command provides an easy way of presenting a menu of choices to the shell script user. The oraset script presented in the “Solution” section of this recipe is a good example of a DBA using the select command to help accurately and consistently set the required OS variables when you have multiple databases running on a server.

When running oraset, be sure to use the . (dot) notation, which instructs the shell to source the script. Sourcing tells your current shell process to inherit any variables set within an executed script. If you don’t use the . notation, the variables set within the script are visible only within the context of the subshell that is spawned when the script is executed.
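A minimal sketch (the file name /tmp/setvar.sh and variable MYVAR are hypothetical) makes the difference between executing and sourcing concrete:

```shell
# Sketch contrasting executing a script with sourcing it.
unset MYVAR
printf '#!/bin/bash\nexport MYVAR=from_script\n' > /tmp/setvar.sh
chmod +x /tmp/setvar.sh
#
/tmp/setvar.sh                  # runs in a subshell; MYVAR does not persist
echo "after execute: ${MYVAR:-unset}"
#
. /tmp/setvar.sh                # sourced; MYVAR persists in the current shell
echo "after source : ${MYVAR}"
```

The first echo prints unset because the exported variable dies with the subshell; the second prints from_script because sourcing runs the script in the current shell process.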

Note  In the Bash shell, the source command and the . built-in are synonymous. (The C shell provides the source command but not the . notation.)

You can run the oraset script either from the command line or from a startup file (such as .profile, .bash_profile, or .bashrc). For example, place this line in your startup file:

. /etc/oraset

Now every time you log in to the server, you’ll see a menu of choices that you can use to indicate the database for which you want the OS variables set. If you want the OS variables automatically set to a particular database, put an entry in your startup files such as this:

. /etc/oraset TRG

The prior line will run the oraset file for the TRG database and set the OS variables appropriately.

7-8. Running Commands Based on Success/Failure of the Previous Command

Problem

You have some custom database monitoring code that you have set up. The job should be continuously running on the database server, but sometimes the job unexpectedly dies. You want to set up another job that determines whether the job isn’t running; if not, it should restart the monitoring job. In other words, in one line of code, you need to run a command to see whether the process is running; if not, run another command to restart the job.

Solution

In one line of code, use the || and && control operators to conditionally execute a command based on the status of the previously run command. Here’s the basic syntax for how || works:

<run command 1> || <run command 2 if command 1 did not succeed>

And here’s the syntax for the && operator:

<run command 1> && <run command 2 if command 1 succeeded>

Here’s a simple example illustrating how it works:

$ ls myfile.txt || echo "file does not exist"

Here’s the output indicating that the echo command was executed (meaning that the ls command did not succeed in listing the myfile.txt file):

myfile.txt: No such file or directory
file does not exist

Now suppose that you create the following file:

$ touch myfile.txt

Re-run the command:

$ ls myfile.txt || echo "file does not exist"

The echo command is not executed because the ls command succeeded. The only output returned is the output of the ls command:

myfile.txt

Now that you have that background information, examine the following line of code:

$ ps -ef | grep dbwatch | grep -v grep || nohup /home/oracle/bin/dbwatch.bsh &

If the output from ps -ef | grep dbwatch | grep -v grep does not return a value (meaning that dbwatch is not running on the server), the following code is executed to restart the process:

nohup /home/oracle/bin/dbwatch.bsh &

If the ps command does return a value, it means the process is running, so don’t execute the script to restart the job. If you want to automate this, you can place a line in cron as follows:

33 * * * * ps -ef | grep dbwatch | grep -v grep || nohup /home/oracle/bin/dbwatch.bsh &

The preceding code is automatically run every hour (33 minutes after the hour) and checks to see whether dbwatch is running. If not, it is restarted.

How It Works

Sometimes it is useful (on one line of code) to have a command conditionally execute, depending on the success or failure of the previous immediately run command. The && and || operators are designed to do just that. The “Solution” section showed an example using ||; next is an example that uses &&, which means that you want a command to execute if the previous command was successful.

In this next line of code, an e-mail will be sent only if the grep command successfully finds the string "ORA-00600" in the alert.log file:

$ grep ORA-00600 alert*.log && echo "DB prob" | mailx -s "ORA 600 error" [email protected]

On one line of code, you can conditionally execute commands based on the success or failure of the prior command. The examples in this recipe provide another shell-scripting tool that you can use in creative ways to monitor for various activities and alert you if there are problems.
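The two operators can also be combined on a single line to report both outcomes. This is a sketch only (the directory names are hypothetical), and it carries a caveat worth remembering: in cmd && A || B, the B command also runs if A itself fails, so keep A to something simple such as an echo:

```shell
# Sketch: one-line status reporting with && and ||.
# Caveat: in "cmd && A || B", B also runs if A itself fails,
# so keep A simple (an echo is safe).
mkdir -p /tmp/chk_demo && echo "directory ready" || echo "mkdir failed"
ls /tmp/no_such_dir 2>/dev/null && echo "found" || echo "not found"
```

The first line reports that the directory is ready; the second reports not found because the ls fails.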

7-9. Modularizing Scripts

Problem

You want to make your scripts more modular and functional. You determine that shell functions will help accomplish this task.

Solution

Functions, which are commonly used in most programming languages, are blocks of commands that perform actions. You can think of a function as a small script within another script that compartmentalizes a section of code. The idea is to define a section of code once and then call that section of code multiple times from other parts of the program. Using functions allows you to modularize your code and make it more reusable, readable, and maintainable.

Like variables, functions must be declared before you can use them. Not surprisingly, the function command is used to declare functions. To illustrate the use of functions, let’s say you need to create a reusable bit of code that displays some debugging information. This example creates a function named showMsg:

function showMsg {
  echo "----------------------------------------"
  echo "You’re at location: $1 in the $0 script."
  echo "----------------------------------------"
} # showMsg

The function can now be referenced anywhere in the script after the point at which it was declared. For example, suppose you want to use the previous function in a script. You can add it before it is called and then reference the function multiple times within the code:

#!/bin/bash
debug=1
function showMsg {
  echo "----------------------------------------"
  echo "You’re at location: $1 in the $0 script."
  echo "----------------------------------------"
} # showMsg
#
SID_LIST="dev1 dev2 dev3"
critProc=ora_smon
#
if [[ $debug -eq 1 ]]; then
  showMsg 1
fi
#
for curSid in $SID_LIST
do
  ps -ef | grep -v ’grep’ | grep ${critProc}_$curSid
  if [ $? -eq 0 ]; then
    echo "$curSid is available."
  else
    echo "$curSid has issues." | mail -s "issue with  $curSid" [email protected]
  fi
done
#
if [[ $debug -eq 1 ]]; then
  showMsg 2
fi
#
exit 0

Assume that the preceding code is placed in a script named d.bsh. After it is executed, here is the output:

----------------------------------------
You’re at location: 1 in the ./d.bsh script.
----------------------------------------
----------------------------------------
You’re at location: 2 in the ./d.bsh script.
----------------------------------------

In this way, you can use a function to define the code once and execute it many times within the script.

How It Works

Functions allow you to organize large shell scripts into modular pieces of code, allowing for easier debugging, maintenance, and reusability. Functions can also be declared and invoked directly from the OS command line. From a DBA perspective, this gives you a very powerful tool that allows you to create and use any number of useful functions that can be run as if they were OS commands. For example, create a file named dba_fcns and place in it the following commands:

#!/bin/bash
#------------------------------------------------------------------------------#
# sshl : ssh with your login to remote host
  function sshl {
    echo "ssh -l $LOGNAME $*"
    ssh -l $LOGNAME $*
  } # sshl
#------------------------------------------------------------------------------#
# ssho : ssh with the oracle userid to remote host
  function ssho {
    echo "ssh -l oracle $*"
    ssh -l oracle $*
  } # ssho
#------------------------------------------------------------------------------#
# chkps: check for a process on the box
  function chkps {
    ps -ef | grep -i $1 | grep -v grep
  } # chkps
#------------------------------------------------------------------------------#

Now source the file as shown here:

$ . dba_fcns

You now have access to the sshl, ssho, and chkps functions from the OS command line. To illustrate this, the chkps function is called while passing in the string of smon to operate on:

$ chkps smon

You’ll most likely collect many functions in your DBA tool bag to alleviate having to type long, typo-prone shell commands. Yes, for the previous simple functions, you could create aliases to accomplish essentially the same task. However, functions give you the additional ability to combine several different commands, use parameters, and display useful informational messages.
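As a contrast with aliases, here is a minimal sketch (the chkalert function and the file name are hypothetical) of a parameterized function that combines several commands, something an alias cannot easily replicate because aliases perform only simple text substitution:

```shell
# Sketch (chkalert and the file name are hypothetical): a function
# that takes parameters and combines commands.
function chkalert {
  # $1 = file to search, $2 = pattern; prints a header, then matches
  echo "=== matches for '$2' in $1 ==="
  grep "$2" "$1"
} # chkalert
#
echo "ORA-00600: internal error" > /tmp/alert_demo.log
chkalert /tmp/alert_demo.log ORA-00600
```

After sourcing a file containing this function, you can run chkalert from the command line just like any OS utility.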

7-10. Passing Parameters to Scripts

Problem

You don’t like hard-coding variables in your script. You want to change a script to set variables based on parameters passed to the script so your code is more reusable, flexible, and maintainable.

Solution

First, take a look at this script with hard-coded values in it for the database SID:

#!/bin/bash
ORACLE_SID=brdstn
rman target / <<EOF
backup database;
EOF

The << characters instruct the command running (in this case, the rman utility) to receive its input from anything that appears between the first EOF and the last EOF. You don’t have to use EOF for start and finish markers; you can use any text string. It simply marks the beginning and end of where the shell directs the input for the command running.

If you want to use this script as it is to back up a different database, you have to manually edit it and change the name of the database, which isn’t very efficient. A better approach is to modify the script so that it can dynamically be passed the name of the database to be backed up. Assume for this example that the script name is back.bsh. The script is modified as shown to accept the database name as input:

#!/bin/bash
ORACLE_SID=$1
rman target / <<EOF
backup database;
EOF

In the previous bit of code, $1 is a built-in variable in the Bash shell. The $1 variable holds the first parameter passed to the script when invoking it. Now the script can be run by passing in a database name. In this example, pass in the name of devdb:

$ back.bsh devdb

How It Works

Passing parameters to scripts allows for greater flexibility and reusability of a script. You can pass any number of arguments into a shell script. The first parameter is referenced inside the script as $1, the second parameter is referenced as $2, and so on. These shell variables are known as positional parameters, which are special variables that are set internally by the shell and are available for you to use within a script.

If you pass more than nine positional parameters to a script, you must use braces {} to wrap the number portion of any multidigit parameter when you reference it. Without braces, the variable $10 is interpreted as the contents of $1 with a 0 concatenated to it, whereas ${10} is interpreted as the contents of the tenth parameter.

An example helps to illustrate this point. Suppose that you pass 10 parameters to a script, as shown here:

$ myscript.bsh a b c d e f g h i j

For illustration purposes, suppose that this line of code is contained within the script:

echo $10

In this case, a result of a0 is produced because it is echoing the contents of parameter 1 concatenated with a 0.

When the braces are used with the echo command, the line of code produces a j, which is the tenth parameter that was passed to the script:

echo ${10}

Remember to use the braces any time you reference a multidigit parameter within a shell script, or else you won’t get the results you intended.

Note that besides positional parameters, the shell provides other special variables for you. Some of the more useful special shell variables are described in Table 7-5. Examples of using these variables are found throughout this chapter.

Table 7-5. Special Shell Variables

Name

Description

$1 - $n

Positional parameters that hold values for parameters passed to the script.

$?

The exit status of the last command. Contains a 0 value for successfully executed commands. Contains a nonzero value for commands that failed. This nonzero value depends on what the command actually returned.

$0

Within a shell script, contains the name of the shell script being executed.

$#

The number of positional parameters passed to a script.

$$

The process number of the shell. Can be used to generate unique file names.

$!

The process number of the most recently executed background process.

$*

Contains all the positional parameters passed to the script.
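A short sketch (the showvars function is hypothetical) ties several of the Table 7-5 variables together:

```shell
#!/bin/bash
# Sketch (showvars is hypothetical) displaying several special
# shell variables from Table 7-5.
showvars() {
  echo "script   : $0"         # name of the executing script
  echo "arg count: $#"         # number of parameters passed in
  echo "all args : $*"         # all parameters as one string
  echo "first arg: ${1:-none}" # first positional parameter
}
showvars dev1 dev2 dev3
true
echo "exit status of true: $?"   # 0, because true succeeded
echo "current shell PID  : $$"   # handy for unique file names
```

Note that $0 still reports the script name inside the function, whereas $#, $*, and $1 refer to the arguments passed to the function itself.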

7-11. Processing Parameters

Problem

When passing parameters to a shell script, you require an efficient built-in method for processing the parameters. For example, you want a simple method to test for unexpected parameters or missing parameters.

Solution

The getopts tool is a built-in shell command that provides an efficient mechanism for validating switches and parameters passed into a shell script. The best way to understand how this works is to examine a script that uses getopts to process parameters. Suppose that you have the requirement to pass in to an RMAN backup script the database name to be backed up and whether backup compression should be enabled. You want to be able to see whether the correct parameters have been passed in and display an informative message if the parameters are incorrect. Here is a script that uses getopts to examine and act on parameters passed to an RMAN backup script:

#!/bin/bash
PRG=$(basename $0)
USAGE="Usage: $PRG -s SID [-c compress] [-h]"
if [ $# -eq 0 ]; then
  echo $USAGE
  exit 1
fi
#
# In the OPTSTRING variable, if the first character is a :, then suppress system-
# generated messages. If a char is followed by :, then an argument is expected to be
# passed in for a given option. The OPTARG environment variable contains the
# argument passed in for a given option.
#
OPTSTRING=":s:c:h"
while getopts "$OPTSTRING" ARGS; do
  case $ARGS in
  s) ORACLE_SID=${OPTARG}
     ;;
  c) COMP_SWITCH=$(echo ${OPTARG} | tr '[A-Z]' '[a-z]')
     if [ "$COMP_SWITCH" = "compress" ]; then
       COMP_MODE=" as compressed backupset "
     else
       echo $USAGE
       exit 1
     fi
     ;;
  h) echo $USAGE
     exit 0
     ;;
  *) echo "Error: Not a valid switch or missing argument."
     echo ${USAGE}
     exit 1
     ;;
  esac
done
#
echo rman backup
rman target / <<EOF
backup $COMP_MODE database;
EOF
#
exit 0

Assuming that the prior code is placed in a script named rman.bsh, ensure that the script is executable:

$ chmod +x rman.bsh

The valid parameters to this script are s (accompanied with ORACLE_SID), c (compress), and h (help). First, here’s what happens when the script is run with an invalid parameter:

$ rman.bsh -v mydb
Error: Not a valid switch or missing argument.
Usage: rman.bsh -s SID [-c compress] [-h]

The usage note specifies the correct way to run the script; for example:

$ rman.bsh -s O1212 -c compress
connected to target database: O1212 (DBID=353735090)
RMAN>
Starting backup...

If you want to display help for the shell script, do so as follows:

$ rman.bsh -h
Usage: rman.bsh -s SID [-c compress] [-h]

This simple example demonstrates the flexibility and ease with which parameters are evaluated using getopts.

How It Works

The getopts (get options) utility enables you to efficiently inspect and process command-line switches and parameters. The getopts program ensures that a standard interface is used for shell program parameter handling. The basic syntax for getopts is as follows:

getopts optstring name

The OPTSTRING variable contains the list of options expected to be passed in from the command line when the script is executed. NAME is the variable used to read the command-line options one by one. The getopts command also relies on the OPTARG environment variable. This variable contains the argument value passed in for each option.

With that understanding, the script in the “Solution” section contains the following two lines of code:

OPTSTRING=":s:c:h"
while getopts "$OPTSTRING" ARGS; do

When the first character in OPTSTRING is :, it suppresses any system-generated error messages. In other words, a leading : instructs the getopts command to handle all error messages that are generated (and not display system-generated messages).

If an option character is followed by :, an argument is expected on the command line. The s and c options are both followed by colons, so they require arguments to be passed into the script. The h option is not followed by a colon, so it does not require an argument.

Note  There is also a getopt command (no s on the end). This command is used in a similar fashion to getopts. View the man getopt documentation for more information.

7-12. Running Database Commands in Scripts

Problem

You want to run a database utility command within a shell script. For example, you want to run an RMAN backup from within a shell script.

Solution

There are several techniques for running database commands from within shell scripts. These two techniques are commonly used:

  • Running commands directly
  • Capturing output in a variable

These techniques are described in the following sections.

Running a Command Directly

Here is a script that invokes the Oracle RMAN utility and takes a backup of the database:

#!/bin/bash
ORACLE_SID=DEV_DB
rman target / <<EOF
backup database;
EOF
exit 0

The << characters instruct the command that is running (in this case, the rman utility) to receive its input from anything that appears between the first EOF and the last EOF. You don’t have to use EOF for start and finish markers; you can use any text string. It simply marks the beginning and end of where the shell directs the input for the command running.

This technique applies to any Oracle utility. The following runs a SQL*Plus command within a shell script:

#!/bin/bash
ORACLE_SID=DEV_DB
sqlplus -s <<EOF
/ as sysdba
select sysdate from dual;
EOF
exit 0

Capturing Output in a Variable

Command substitution is a technique in which you run a command and store its output in a variable. You can use command substitution to run a database utility (such as SQL*Plus). For example, you want to determine whether critical materialized views are refreshing on a daily basis. One way of doing this is to select a count from the data dictionary view USER_MVIEWS, where the last refresh date is greater than one day. This bit of code uses command substitution to run a database command and capture the output of the command in the critVar variable:

#!/bin/bash
critVar=$(sqlplus -s <<EOF
pdb_m/abc@papd
SET HEAD OFF FEED OFF
SELECT count(*) FROM user_mviews WHERE sysdate-last_refresh_date > 1;
EXIT;
EOF)

The script returns a value into the variable critVar, and you can test to see whether the value is 0:

if [ $critVar -ne 0 ]; then
mail -s "Problem with MV refresh" [email protected] <<EOF
MVs not okay.
EOF
else
  echo "MVs okay."
fi
exit 0

If the value in the critVar variable isn’t 0, the script will send an e-mail.

How It Works

The basic technique for running database utilities within a script is to run the utility directly (as you would from the command line) or use command substitution. The key here is that you must use the following syntax:

database_utility << EOF
<run database_utility commands>
EOF

The code instructs the shell to execute the database utility; anything that appears between << EOF and the next EOF is then passed as input to the database utility. The string "EOF" can be any string. We use EOF as our standard.

The following example uses EOF for the start and end markers. Any text between the start and end markers is executed by SQL*Plus. This particular script displays any database objects that have been created within the last week:

#!/bin/bash
newobjs=$(sqlplus -s << EOF
fbar/invqi@INVQI
select object_name
from dba_objects
where created > sysdate - 7
and owner not in ('SYS','SYSTEM');
EOF)
echo $newobjs | mailx -s "new objects" [email protected]
exit 0

Be aware that there are two techniques for achieving command substitution in a shell script:

  • $(command)
  • `command`

For example, if you want to return the name of a server into a variable, you can use two techniques. The first is the following:

$ BOX=$(uname -a | awk '{print $2}')

The second is the following:

$ BOX=`uname -a | awk '{print $2}'`

The $(command) is more modern and is thus the preferred technique. Just be aware that you may see command substitution implemented with the `command` syntax.
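One concrete advantage of $(command) is that it nests cleanly, whereas backticks require escaping the inner pair. This sketch (the task of counting entries in the bash binary's directory is hypothetical) nests one substitution inside another:

```shell
# Sketch: $() nests without the escaping that backticks require.
# Find the directory holding the bash binary, then count its entries.
BASH_DIR=$(dirname "$(which bash)")
FILE_COUNT=$(ls "$BASH_DIR" | wc -l)
echo "$BASH_DIR contains $FILE_COUNT entries"
```

The backtick equivalent of the first line would be BASH_DIR=`dirname \`which bash\``, which quickly becomes unreadable as nesting deepens.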

One last note: if you’re using a data dictionary view (within a shell script) that contains a $ as part of the view name, you must escape with a backslash the dollar sign within the shell script. For example, the following selects from the view V$DATAFILE the number of datafiles that have an OFFLINE status:

#!/bin/bash
nf=$(sqlplus -s << EOF
/ as sysdba
set head off
select count(*)
from v\$datafile
where status='OFFLINE';
EOF)
echo "offline count: $nf"  | mailx -s "# files offline" [email protected]

You must escape the $ as shown in the script (e.g., v\$datafile). If you don’t escape the $, the shell script interprets the view name as a shell variable. The backslash (\) in front of the $ instructs the shell script to ignore the meaning of special characters.
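The behavior is easy to demonstrate outside SQL*Plus. In this sketch, the shell expands the unescaped $datafile (an unset variable here, so it expands to nothing), while the escaped version passes the literal dollar sign through to the heredoc output:

```shell
# Sketch: inside an unquoted heredoc, the shell expands $datafile
# (unset here, so it expands to nothing), while \$ passes a literal $.
unset datafile
cat <<EOF
unescaped: v$datafile
escaped  : v\$datafile
EOF
```

The first line of output is just "unescaped: v"; the second preserves "v$datafile", which is what the database utility needs to see.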

7-13. Crafting a Robust DBA Shell Script

Problem

You want to write a flexible and reusable shell script that incorporates the techniques used by experienced shell writers.

Solution

Most shell scripts that DBAs use require the following functionality:

  1. Sets the shell
  2. Validates parameters passed to the script
  3. Sets any special variables to be used in the script
  4. Sets the Oracle environment variables
  5. Calls the Oracle utility
  6. Captures the output in a unique log file name
  7. Sends an e-mail indicating the success or failure of the job
  8. Exits the script

Listed next is a basic shell script that uses these techniques to determine whether a SQL*Plus connection can be made to a database. The line numbers have been included for discussion purposes; they should be deleted before you attempt to run the script:

1  #!/bin/bash
2  PRG=$(basename $0)
3  #
4  # Validate parameters
5  USAGE="Usage: ${PRG} <database name> "
6  if [ $# -ne 1 ]; then
7     echo "${USAGE}"
8     exit 1
9  fi
10  #
11  # Set variables used in the script
12  SID=${1}
13  CONSTR=system/foo@${SID}
14  MAILX='/bin/mailx'
15  MAIL_LIST='[email protected]'
16  LOG_DIR=/home/oracle/scripts
17  DAY=$(date +%F)
18  LOG_FILE=${LOG_DIR}/${PRG}.${DAY}.$$.log
19  LOC_SID=O1212
20  BOX=$(uname -a | awk '{print $2}')
21  #
22  # Source oracle variables
23  . /etc/oraset $LOC_SID
24  #
25  # Attempt to connect to database via SQL*Plus
26  crit_var=$(sqlplus -s <<EOF
27  $CONSTR
28  SET HEAD OFF FEED OFF
29  select ’success’ from dual;
30  EOF)
31  #
32  # Write output to log file
33  echo ${crit_var} > $LOG_FILE
34  #
35  # Send status
36  echo $crit_var | grep success >/dev/null 2>&1
37  if [[ $? -ne 0 ]]; then
38    $MAILX -s "Problem with ${SID} on ${BOX}" $MAIL_LIST <$LOG_FILE
39  else
40    echo "Success: ${SID} on ${BOX}" | 
41    $MAILX -s "Success: ${SID} okay on ${BOX}" $MAIL_LIST
42  fi
43  #
44  exit 0

Tip  If you’re using vi for an editor, use the set number and set nonumber commands to toggle the viewing of line numbers (see recipe 4-11 for more details).

How It Works

The shell script in the “Solution” section of this recipe uses a wide variety of shell-programming techniques. You can use these methods to automate a diverse assortment of DBA tasks. Line numbers are included in the shell program to describe the purpose of each line. Table 7-6 contains a brief description of each line of code.

Table 7-6. Explanation of Shell Script to Check on Database Status

Line Number    Explanation

1              Specifies the Bash shell command interpreter for this script.

2              Captures the name of the shell script in the PRG shell variable. The $0 variable contains the name of the program. The basename command strips off any directory text that is prepended to the program name.

3–4            Comments.

5              Constructs an information string and places it in the USAGE variable.

6–9            If the number of parameters is not equal to 1, displays the script usage string and exits the program. See recipe 7-11 for an advanced discussion of processing variables.

10–11          Comments.

12             Sets the SID variable to the parameter passed into the script.

13             Sets the CONSTR variable to contain the SQL*Plus database connection string.

14             Sets the MAILX variable to the path and name of the mail utility on the server.

15             Specifies the e-mail address of the DBA(s) to receive the job status.

16             Sets the LOG_DIR variable to the directory of the log files.

17             Sets the DAY variable to the current date string.

18             Specifies the LOG_FILE to be a combination of the program name and date. The $$ variable is a unique process identifier that allows you to generate multiple log files per day.

19             Sets the LOC_SID to a local instance name on the box that the shell script is running on.

20             Sets the BOX variable to contain the name of the local database server.

21–22          Comments.

23             Uses a program to set the required OS variables, such as ORACLE_HOME. See recipe 7-7 for an example of a file that sets the Oracle variables.

24–25          Comments.

26             Captures in the crit_var variable the output of the SQL*Plus command. Initiates a connection to SQL*Plus. EOF specifies the starting point for the SQL*Plus commands.

27             Connects to SQL*Plus with the value in CONSTR.

28–29          Runs the SQL*Plus formatting and SQL commands.

30             EOF specifies the end of the text to be interpreted as SQL.

31–32          Comments.

33             Writes the contents of the crit_var variable to the log file.

34–35          Comments.

36             Examines the contents of the crit_var variable for the string success, which should have been returned by the SQL*Plus command.

37             $? contains the exit status of the previously run command. If the grep command found the string success, $? contains 0; otherwise, $? contains a nonzero value.

38             Sends an e-mail indicating there is a problem.

39–41          $? is equal to 0; the grep command found the string success in the crit_var variable. Sends an e-mail indicating that the database is up.

42             The end of the if statement.

43             Blank comment line.

44             Exits the shell script with a success status (indicated by a 0).

CREATING A LOCK FILE

One common method to ensure that only one instance of a shell script runs at a time is to create a lock file for the script. When the script executes, if the lock file already exists, either the job is currently running or it previously terminated abnormally (leaving the lock file in place). Place the following code at the beginning of your script, and modify the LOCKFILE parameter to match your environment:

LOCKFILE=/ora01/oradata/BRDSTN/lock/rman.lock
if [ -f $LOCKFILE ]; then
  echo "lock file exists, exiting..."
  exit 1
else
  echo "DO NOT REMOVE,  RMAN LOCKFILE" > $LOCKFILE
fi

At the end of the script, remove the lock file:

if [ -f $LOCKFILE ]; then
  rm $LOCKFILE
fi

The use of a lock file ensures that if the script is already running and is called again, it won’t start a new job.
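A variation on this technique uses the shell's trap command so that the lock file is removed even when the script exits abnormally (for example, when it is killed or hits an error). This is a sketch; adjust the LOCKFILE path to match your environment:

```shell
#!/bin/bash
LOCKFILE=/tmp/rman.lock   # illustrative path; adjust for your environment

if [ -f "$LOCKFILE" ]; then
  echo "lock file exists, exiting..."
  exit 1
fi
echo "DO NOT REMOVE, RMAN LOCKFILE" > "$LOCKFILE"

# Remove the lock file on any exit: normal completion, error, or signal
trap 'rm -f "$LOCKFILE"' EXIT

# ... long-running work goes here ...
```

With the trap in place, there is no need for explicit cleanup code at the end of the script.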

7-14. Running Scripts in the Background

Problem

You work in a distributed environment and have database servers in remote locations. You want to run a job in the background that will continue to run, even if there are network problems or after you log off the box.

Solution

Use the & (ampersand) character to place a job in the background. This example runs the rman.bsh script in the background:

$ rman.bsh &
[1] 6507

From the previous output, [1] indicates the job number, and 6507 is the process identifier. You can verify that the program is running in the background via the jobs command:

$ jobs
[1]+  Running                rman.bsh &

On some older systems, you may be required to use the nohup (no hangup) command to ensure that the job will still execute even if you log off the server. If using an older shell, use this syntax to place a job in the background:

$ nohup rman.bsh &

To stop a background job, use the kill command. This next line of code stops job 1:

$ kill %1

How It Works

Sometimes you’ll need to run jobs in the background. A job running in the background has the advantage of continuing to execute even if the following situations occur:

  • You log off the box
  • Network issues cause your terminal session to become disconnected
  • Your session is disconnected due to a server session timeout setting

For long-running jobs, you can run them in the background and not have to worry about restarting the job just because you become disconnected from the server.

Explaining & and nohup

By default, when you run a shell script from the command line, it will run in the foreground. To execute a job in the background, place an ampersand character at the end of the command string. Here is the general syntax:

$ <command>  &

Using & ensures that a command will continue to run, even if you log off the box. On some older systems, you may be required to use the nohup command to achieve this functionality. In this example, the nohup command is used to run a shell script in the background:

$ nohup export_db.bsh &

By default, the output from a nohup command is written to a file named nohup.out. You can monitor the job by continuously viewing the output file:

$ tail -f nohup.out

You can redirect the output to the file of your choice as follows:

$ nohup export_db.bash >exp.out &

You can interactively monitor the output of the previous job by viewing it with the tail -f command:

$ tail -f exp.out
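When redirecting output this way, it is usually wise to capture standard error as well, and the $! variable records the process ID of the most recent background job so that you can check on it (or kill it) later. A sketch of the pattern, reusing the export_db.bash script from above (the pid-file name is illustrative):

```shell
# Redirect both standard output and standard error, then save the PID
nohup ./export_db.bash > exp.out 2>&1 &
echo $! > export.pid

# Later, check whether the job is still alive; kill -0 sends no signal,
# it only tests that the process exists
if kill -0 "$(cat export.pid)" 2>/dev/null; then
  echo "export job is still running"
fi
```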

Note  If you want a job to consistently run in the background at a specified time, use a scheduling utility such as cron. See Chapter 10 for details on automating jobs.

Using screen to Detach and Reattach to a Session

Using the & and nohup commands is the traditional way of keeping a job running in the background. You can also use the Linux screen command to achieve the same result with significantly more functionality. The screen command starts a terminal session on your server that will persist for you even if there is an unexpected network disruption.

To start a screen session, issue the following command:

$ screen

If you receive an error message such as “Cannot open terminal /dev/pts/1,” change the permissions on that file as root:

# chmod a+rw /dev/pts/1

When you invoke screen, you see a terminal from which you can type commands and run scripts. The difference between a screen session and a normal terminal session is that the screen session will continue to run even after you are detached.

For example, suppose that you are at your work location and you log on to a database server and start a screen session. You then start a long-running backup job in your screen session. After the job is started, you detach from the screen session by pressing Ctrl+A and then the D key (press the Ctrl and the A key at the same time, release them, and then press the D key). You can then drive home, remotely log on to the database server, and reattach to the screen session you started while you were at work. You can monitor the backup job as if you were looking at the same terminal you started at work.

Here’s a simple example of how this works. Type screen, as shown here:

$ screen

Print the current working directory so that you have some output on the screen that you can recognize when you attach to this screen session from another terminal:

$ pwd
/home/oracle

Now press Ctrl+A and then the D key, which detaches you from the screen session. You should see the following message:

[detached]

Now start a different terminal session and log on to the database server. Issue the following command to display any detached screen sessions:

$ screen -ls
There is a screen on:
        31334.pts-1.rmougprd2   (Detached)
1 Socket in /tmp/uscreens/S-oracle.

You can reattach to any screen session using the -r (reattach) option followed by [[pid.]tty[.host]]. For this particular example, you can re-establish the screen connection by typing this:

$ screen -r 31334.pts-1.rmougprd2

You should now see the output of the previously entered pwd command. It is as if you never left the screen terminal session. This is a very powerful utility that can be used to start jobs and then monitor them from another remote terminal session. You can even share a screen session with other users.

To display screen online help, press Ctrl+A and then the ? key. To leave a screen session, use the exit command, which will stop your screen session.

7-15. Monitoring the Progress of a Script

Problem

You’re executing a shell script and want to monitor its progress.

Solution

Sometimes you have to monitor the progress of a long-running shell script. You can use the Linux tail command with the -f (follow) switch to display the output of a job as it is written to a log file. In this example, the output of a backup job is redirected to an output file named rmanback.out:

$ rmanback.bash >rmanback.out 2>&1

From another session, the output being written to the log file is interactively viewed with the tail -f command:

$ tail -f rmanback.out

Here is a snippet of typical output that might be displayed to the screen:

channel ORA_DISK_2: starting archive log backupset
channel ORA_DISK_2: specifying archive log(s) in backup set
input archive log thread=1 sequence=868 recid=859 stamp=628426116
...

How It Works

DBAs often use the tail command to monitor files such as alert logs and to view potential issues with the database as they are happening. In this example, you continuously follow the display of what’s being written to an Oracle database alert.log file:

$ tail -f alert_BRDSTN.log

Here’s a snippet of typical output written to the alert.log file:

Completed: ALTER DATABASE BACKUP CONTROLFILE TO TRACE
DBID: 2917656785
Thread 1 advanced to log sequence 71
Current  log# 2 seq# 71 mem# 0: /ora01/oradata/BRDSTN/oradata/redo02a.log

When you want to discontinue viewing the contents of a log file, press Ctrl+C to break out of the tail command.
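You can also combine tail with grep to watch for specific errors as they appear; grep's --line-buffered switch prevents output from being delayed while the file grows. The alert log name here is the one used above:

```shell
# Print only ORA- errors from the alert log as they are written
tail -f alert_BRDSTN.log | grep --line-buffered "ORA-"
```

As with a plain tail -f, press Ctrl+C to stop watching.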

7-16. Debugging a Script

Problem

Your script isn’t doing what you expected. You want to debug the script.

Solution

The Bash shell has several features that are useful for debugging and troubleshooting problems in scripts. The -n (no execution) switch allows you to check the syntax of a script before you run it. To check a Bash shell script for syntax errors, use -n as shown here:

$ bash -n db.bash

If the script contains any errors, it will display a message such as this:

db.bash: line 10: syntax error: unexpected end of file

Another useful debugging feature is the -o xtrace option, which instructs the Bash shell to display every command before it is executed. This option also shows any variable substitutions and expansions, so you can view the actual variable values used when the shell script executes. You can invoke the -o xtrace feature from the command line as follows:

$ bash -o xtrace <script name>

Notice that the output contains lines that don’t seem to have anything to do with your code:

+ alias ’rm=rm -i’
+ alias ’cp=cp -i’
+ alias ’mv=mv -i’
+ ’[’ -f /etc/bashrc ’]’
+ . /etc/bashrc
+++ id -gn
+++ id -un
+++ id -u
++ ’[’ root = root -a 0 -gt 99 ’]’
++ umask 022
++ ’[’ ’’ ’]’
+ export JAVA_HOME=/opt/java

That’s because the first several lines of output come from code executed in startup scripts (such as /etc/bashrc). Also of note: the number of plus signs indicates the level of nesting of a command within the script.

How It Works

As shell scripts become longer and more complex, it can be difficult to pinpoint the source of bugs within a script. This problem can be especially acute when you maintain code that somebody else wrote.

If you just want to see the tracing for specific commands within the script, embed set -o xtrace directly within your code at the desired location. In this example, tracing is turned on before the if statement and then turned off at the end:

set -o xtrace
if [ $? -eq 0 ]; then
  echo "$critProc is available."
else
  echo "$critProc has issues." | mail -s "problem with  $critProc" [email protected]
fi
set +o xtrace

Here is what the output looks like when the prior script is run with tracing enabled:

++ ’[’ 0 -eq 0 ’]’
++ echo ’ is available.’
 is available.
++ set +o xtrace

To enable a set command feature, you must use the minus (-) sign, which may seem counterintuitive. Equally counterintuitive, use a plus (+) sign to disable a set command feature.

Note  You can also use the set -x command to print each command (along with its expanded arguments) before it is executed; to turn tracing off, use set +x. These commands are identical to set -o xtrace and set +o xtrace, respectively.
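One more trick worth knowing: the Bash PS4 variable controls the prefix that xtrace prints before each traced command. Including LINENO (and BASH_SOURCE) in PS4 pinpoints exactly which line of which file is executing, which is a great help in longer scripts. A sketch of the technique; note the single quotes, which defer expansion until trace time:

```shell
#!/bin/bash
# Include the file name and line number in every trace line
export PS4='+ ${BASH_SOURCE}:${LINENO}: '
set -x
db_status="up"
echo "database is $db_status"
set +x
```

With this PS4 in place, each trace line begins with something like + ./script.bash:5: instead of a bare plus sign.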
