Shell scripting is an essential skill that every professional DBA must possess. Most DBAs use this coding technique to automate many critical database administration tasks such as database backups and monitoring. As part of your job, you'll be asked to maintain scripts that prior DBAs have written, and you'll also be required to write new shell scripts as unique needs arise. The better you are at scripting, the better you'll be able to perform your job.
To be effective at shell scripting, DBAs minimally need to be familiar with four types of tools:
SQL application
Text editor
Shell interface
Shell scripting
The most common tools that map to the previous list are SQL*Plus, the vi editor, and the Bash shell. The Bash shell is both a shell interface and a scripting language. The previous bulleted list is a minimal list of tools that you should be familiar with. You'll also need to use a wide variety of Oracle database utilities in shell scripts, such as RMAN, Export/Import, Data Pump, SQL*Loader, and so on.
As discussed in previous chapters, a shell is a command-line interface that allows you to interact with the Linux kernel. Some shells that are commonly used by DBAs in Linux environments are the Bourne Again shell (bash) and the Korn shell (ksh). As explained in Chapter 3, this book focuses on the Bash shell, so we won't rehash those details here.
A shell script is an operating system file that contains one or more commands that a shell can execute. Any command that the shell can execute can be placed in a shell script file. For example, a shell script can run other shell scripts, executable binary files (such as sqlplus or rman), or any Linux utility.
DBAs use shell scripts for critical tasks such as proactively monitoring the database and taking backups. These critical tasks need to be repeatable and testable. DBAs typically use shell scripts to run a series of commands that accomplish the desired task. DBAs like shell scripting because they don't need to be expert programmers to write shell scripts. With a base knowledge (and a good example), it's fairly quick and easy to create a shell script that performs the desired task.
The purpose of this chapter is to provide you with a core set of techniques to enable you to successfully write shell scripts. This chapter does not cover all the facets of shell scripting (that would take an entire book to accomplish). Rather, we provide you with the common approaches and shell scripting fundamentals that DBAs need to perform their jobs. Once you become adept at shell scripting, you'll be able to leverage these skills to automate critical database jobs (see Chapter 11 for details on automating tasks).
The first recipe of this chapter introduces you to the basics of how to write a shell script. If you're already familiar with the basics, then you can skip recipe 7-1.
One could argue that a database administrator might exclusively use a graphical user interface (GUI) or a browser user interface (BUI) tool to automate database tasks. We strongly recommend you learn how to write shell scripts to automate DBA tasks. If the GUI or BUI ever becomes inaccessible or doesn't do something you need it to do, then—for your job's sake—you had better know how to write and debug a shell script.
You're new to shell programming, and you want to write a simple script to determine whether your database is accessible.
Use an editor to create a new file (see Chapter 4 for details on using the vi file editor). Place within the file the following text:
#!/bin/bash
ORACLE_SID=SCDEV
ORACLE_HOME=/orahome/oracle/product/10.2.0.1/db_1
PATH=$ORACLE_HOME/bin:$PATH
echo "select 'DB up' from dual;" | sqlplus -s system/foo
exit 0
Modify the previous code to match your ORACLE_SID and ORACLE_HOME variables and your system schema password. After you've created the file, you'll also need to modify the permissions on the file to be executable. In this example, the file name is dbcheck.bsh:
$ chmod u+x dbcheck.bsh
This changes the permission for the owner (u) of the file to executable (x). Now you should be able to successfully run the program:
$ dbcheck.bsh
If your database is up, you should receive a message like this:
'DBUP
-----
DB up
All of the checking within the script is performed with one line of code:
echo "select 'DB up' from dual;" | sqlplus -s system/foo
The echo command pipes a valid SQL statement to the sqlplus executable. The sqlplus executable will then attempt to log on with the system schema and run the statement.
The first line of the shell script in the "Solution" section of this recipe needs a little more explanation. We've reproduced the line here for the discussion:
#!/bin/bash
The # character is normally used to comment out a line in a shell script. One exception to that rule is when #! appears as the first text in the shell script. When #! is placed on the first line, it can then be combined with a path and program name. The path and program name specify the location and name of the program that will interpret the commands within the script. This is important because it means you can have the script run with a designated shell regardless of the interactive shell you're using.

If you don't specify a #!/<path>/<program>, then the shell you're currently logged on to will be used to interpret the commands within the script. We recommend you specify the path and shell program on the first line of your script so that there is no ambiguity about which shell will be used to interpret the commands within the script.
On most Linux distributions, the bash interpreter is in the /bin directory. If you don't know the location of your bash executable, you can use the which or whereis command to locate it:
$ whereis bash
bash: /bin/bash
When you first attempt to run a shell script, you may receive an error similar to the following:
-bash: dbcheck.bsh: command not found
This means your PATH variable doesn't include the current working directory. To work around this, you can reset your PATH variable to include the current working directory. This example exports the PATH variable to include the current working directory:
$ export PATH=$PATH:.
Another method for ensuring that the Bash shell can locate a script is to include the complete directory path to where the script resides. In this example, the script is located in the directory /home/oracle and is run as shown here:
$ /home/oracle/dbcheck.bsh
You can also instruct the shell to look in the current working directory to determine the location of the script. You do this by placing a ./ before the script name:
$ ./dbcheck.bsh
Adding the file extension .sh, .bsh, or .bash to the end of a Bash shell script is a common industry practice. Keep in mind that file extensions are meaningless in Linux/Unix environments (other than helping you document the type of script). This is different from DOS, where .exe, .com, and .bat indicate executable operating system files.
You want to check for a condition such as whether a critical database background process is running and send an e-mail if there is a problem.
Use the if/then/else Bash control structure to check for a condition and perform an appropriate action. The following example uses an if/then/else structure to determine whether the Oracle system monitor process is running and sends an e-mail if the process is not detected:
#!/bin/bash
SID=SAND
critProc=ora_smon
ps -ef | grep -v 'grep' | grep ${critProc}_$SID
if [ $? -eq 0 ]; then
  echo "$SID is available."
else
  echo "$SID has issues." | mail -s "problem with $SID" [email protected]
fi
exit 0
The previous example uses the $? variable. This variable is often used after conditional statements to evaluate the success or failure of the previous command. The $? variable contains the status of the last executed command. If the previously executed command was successful, then the $? variable will contain a zero; otherwise, it will contain a nonzero value.
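You can observe this behavior directly from the command line. A minimal sketch (the failing path name here is made up for illustration):

```shell
ls /tmp >/dev/null             # a command that normally succeeds
echo "exit status was: $?"     # 0 indicates success

ls /no/such/dir 2>/dev/null    # a command that fails (hypothetical path)
status=$?
echo "exit status was: $status"  # a nonzero value indicates failure
```

Note that $? is reset by every command, so if you need the status later, capture it in a variable (as done with status above) immediately after the command runs.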
The if/then/else control structure comes in three basic forms. The first one states that if a condition is true, then execute the following commands. Its syntax is as follows:
if condition ; then
  commands
fi
On the first line of code in the previous example, the keyword then is a separate command, so you must insert a semicolon to indicate the termination point of the if condition. Another way of writing the previous bit of code is as follows:
if condition
then
  commands
fi
The next form of the if/then/else structure states that if a condition is true, execute the following commands. If the first condition is false, then execute a separate set of commands. Its syntax is as follows:
if condition ; then
  commands
else
  commands
fi
The third form of the if/then/else structure states that if a condition is true, then execute the first set of commands; otherwise, check for the next condition, and if it is true, execute the commands. This functionality is enabled with the elif keyword. You can have many elif conditions in this form. Its syntax is as follows:
if condition ; then
  commands
elif condition ; then
  commands
elif condition ; then
  commands
fi
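Here is a short, self-contained sketch of the elif form; the percentage value is hard-coded for illustration (in a real monitoring script it would be calculated):

```shell
#!/bin/bash
pctUsed=75   # hypothetical value; a real script would compute this
if [ $pctUsed -lt 70 ]; then
  echo "space okay"
elif [ $pctUsed -lt 90 ]; then
  echo "space getting low"
else
  echo "critically low on space"
fi
```

With pctUsed set to 75, the first condition is false, the elif condition is true, and the script prints the "space getting low" message.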
You want to write a script that checks for certain conditions such as the number of parameters passed to a script. Based on the condition, you want to perform an action such as display an informational message or exit the script.
As shown in recipe 7-2, the if/then/else structure is an important programming technique. However, it is the combination of if/then/else with a condition that can be tested that gives you a much more powerful tool to automate DBA tasks. The test command gives you the ability to check a condition within an if command. Here is the basic syntax for the test command:
test operand1 operator operand2
The test command can also be written with the [ ] syntax. This syntax uses a left square bracket to start the command and a right square bracket to finish it. Its syntax is as follows:
[ operand1 operator operand2 ]
The shell script examples in this chapter will use the [ ] form of the test command.
For some test conditions, an operand1 isn't required. The syntax for this condition is as follows:
[ operator operand2 ]
The previous test conditional checks will exit with a status of 0 (true) or 1 (false), depending on the evaluation of the condition. Ensure that you have a space between the operands, operators, and brackets. The space is how the shell distinguishes the operator and the operands. If there is no space between the operator, operand, and brackets, then the shell will interpret the value as one string, which will result in erroneous outcomes.
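To see why the spaces matter, compare these two checks (the variable and value are arbitrary); the second form, shown commented out, is the classic beginner's mistake:

```shell
x=5
if [ $x -eq 5 ]; then echo "match"; fi   # correct: spaces inside the brackets
# if [$x -eq 5]; then echo "match"; fi   # wrong: the shell sees "[5" as a
#                                        # command name and reports an error
```

Running the correct form prints "match"; the commented-out form would produce a "command not found" style error instead of a comparison.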
To bring if/then/else and test together, you'll write a small—but useful—piece of code that checks whether the correct number of parameters is passed to a script. The script will use the $# variable, which automatically gets assigned the number of positional parameters typed at the command line. The $# variable is handy when you want to check for the correct number of parameters passed to a script.
The following bit of code uses the -ne conditional check to determine whether the number of variables passed to the script is not equal to 1:
#!/bin/bash
if [ $# -ne 1 ]
then
  echo "Wrong number of parameters passed to script."
  exit 1
fi
The $0 parameter is often used in conjunction with the $# parameter to display the correct syntax required when invoking a script. Within a shell script, the $0 parameter contains the name of the shell script being executed. Here's a slight variation of the previous code that uses the $0 variable to display the name of the script:
if [ $# -ne 1 ]
then
  echo "Wrong number of parameters passed to script."
  echo "Usage: $0 ORACLE_SID"
  exit 1
fi
The -ne operator is an arithmetic operator and is used to test whether the operands are not equal. If the script is called without passing exactly one parameter to it, then the following output is displayed:
Wrong number of parameters passed to script.
Usage: ./ss.bsh ORACLE_SID
Notice that there is a ./ in front of the script name in the previous output. To scrape the ./ out of the output, use the basename command as follows:
Pgm=$(basename $0)
if [ $# -ne 1 ]
then
  echo "Wrong number of parameters passed to script."
  echo "Usage: $Pgm ORACLE_SID"
  exit 1
fi
In the previous piece of code, notice the first line of code uses a shell technique known as command substitution. Command substitution allows you to take the output of a command and load it into a variable. The basic syntax for doing this is as follows:
variable=$(shell commands)
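For example, this hypothetical snippet loads the output of the date and whoami commands into variables, which could then be used to build a log message or file name:

```shell
today=$(date '+%Y-%m-%d')   # load today's date into a variable
runUser=$(whoami)           # load the current OS user into a variable
echo "Script run on $today by $runUser"
```

Anything that writes to standard output can be captured this way, which is why command substitution shows up constantly in DBA scripts.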
Another arithmetic check that you may want to do is to verify the amount of physical memory on your database server. Here's a simple script that verifies that the database server memory is greater than 1 gigabyte:
#!/bin/bash
thresh=1048576
totMem=$(grep MemTotal /proc/meminfo | awk '{print $2}')
if [ $totMem -lt $thresh ]; then
  echo "Total Memory $totMem is less than: $thresh"
  exit 1
else
  echo "Total Memory is: $totMem"
fi
Several arithmetic operators are available with the Bash shell. Table 7-1 gives a brief description of each operator.
Table 7.1. Arithmetic Operators
Operator | Description |
---|---|
int1 -eq int2 | True if two integers are equal |
int1 -ne int2 | True if two integers are not equal |
int1 -gt int2 | True if int1 is greater than int2 |
int1 -ge int2 | True if int1 is greater than or equal to int2 |
int1 -lt int2 | True if int1 is less than int2 |
int1 -le int2 | True if int1 is less than or equal to int2 |
You can also use strings with test conditions. There are a wide variety of ways to use string comparisons. Table 7-2 lists test operations for strings and their descriptions. For example, you may want to ensure that you're logged on as a certain operating system user before you run commands in a script. This example checks whether the user running the script is oracle:
#!/bin/bash
checkUser=oracle
curWho=$(whoami)
if [ "$curWho" != "$checkUser" ]; then
  echo "You are currently logged on as: $curWho"
  echo "Must be logged in as $checkUser to run this script."
  exit 1
fi
In the previous bit of code, the curWho variable is assigned the name of the user running the script. That string variable is then checked to see whether it matches the string oracle. If the user doesn't match, then the script displays informational messages and exits.
Table 7.2. String Operators
String Operator | Description |
---|---|
-z string | True if the string is empty |
-n string | True if the string is not empty |
string1 = string2 | True if the strings are equal |
string1 != string2 | True if the strings are not equal |
string1 < string2 | True if string1 sorts before string2 |
string1 > string2 | True if string1 sorts after string2 |
Another useful example of a string comparison is to read input from a user and verify an operation. Suppose you want to check the current Oracle SID variable before continuing to run more commands within the script. This is useful if you work with multiple databases contained on one physical server. This script displays the value of ORACLE_SID and asks whether you want to continue running the script:
#!/bin/bash
keepGoing=n
echo "Current value of ORACLE_SID: $ORACLE_SID"
echo -n "Do you want to continue? y/n "
read keepGoing
if [ "$keepGoing" = "y" ]; then
  echo "Continue to run script."
else
  echo "Exiting script"
  exit 1
fi
In addition to arithmetic and string comparisons, you can also perform various tests on operating system files. The test command allows you to perform checks such as the availability of a file, the file type, and so on. Table 7-3 contains descriptions of the Bash shell tests for file operations.
For example, you may want to determine whether an error log file exists and, if it does, then send an e-mail to the appropriate support person. This script uses the -e parameter of the test command to determine this:
#!/bin/bash
checkFile=/home/trace/error.log
if [ -e $checkFile ]; then
  mail -s "errors" [email protected] <$checkFile
else
  echo "$checkFile does not exist"
fi
If you want your shell script to do nothing after checking for a condition, then use the colon (:) command (sometimes called no-operation or null). For example, the following bit of code does nothing if it detects that the given file exists:
#!/bin/bash
checkFile=/home/oracle/.bashrc
if [ -e $checkFile ]; then
  :
else
  echo "$checkFile does not exist"
fi
Table 7.3. File Operators
File Operator | Description |
---|---|
-a file | True if file exists |
-b file | True if file is a block device file |
-c file | True if file is a character device file |
-d file | True if file is a directory |
-e file | True if file exists |
-f file | True if file exists and is a regular file |
-g file | True if file has the set-group-id bit set |
-h file | True if file is a symbolic link |
-L file | True if file is a symbolic link |
-k file | True if file's sticky bit is set |
-p file | True if file is a named pipe |
-r file | True if the file is readable (by current user) |
-s file | True if file exists and is not empty |
-S file | True if file is socket |
-t fd | True if file descriptor fd is open and refers to a terminal |
-w file | True if file is writable (by current user) |
-x file | True if file is executable |
-O file | True if file is effectively owned by current user |
-G file | True if file is effectively owned by current user's group |
-N file | True if file has been modified since it was last read |
file1 -nt file2 | True if file1 is newer than file2 |
file1 -ot file2 | True if file1 is older than file2 |
file1 -ef file2 | True if file1 and file2 refer to the same device and inode |
The test options will vary by vendor and version of Linux. For a complete listing of available test operations in your environment, use the help test command or the man test command.
You need to perform a sophisticated set of condition checks. You wonder whether there is a structure more powerful than if/then/else.
In many cases (pun intended), a simple if/then/else construct is all you need to check a condition. However, as soon as you are presented with many different actions to take, the if/then/else syntax can become unwieldy and nonintuitive. In these situations, use a case statement instead. The basic syntax for a case statement is as follows:
case expression in
  pattern1)
    commands
    ;;
  pattern2)
    commands
    ;;
esac
The following bit of code uses a case statement to check for the percentage of free disk space in the archive log file location and e-mails the DBA a report:
#!/bin/bash
archLoc=/dev/sda2
usedSpc=$(df -h $archLoc | awk '{print $5}' | grep -v Use | cut -d "%" -f1 -)
#
case $usedSpc in
[0-9])
  arcStat="relax, lots of disk space: $usedSpc"
  ;;
[1-7][0-9])
  arcStat="disk space okay: $usedSpc"
  ;;
[8][0-9])
  arcStat="gulp, space getting low: $usedSpc"
  ;;
[9][0-9])
  arcStat="red alert, running out of space: $usedSpc"
  ;;
[1][0][0])
  arcStat="update resume, no space left: $usedSpc"
  ;;
*)
  arcStat="huh?: $usedSpc"
  ;;
esac
#
BOX=$(uname -a | awk '{print $2}')
echo $arcStat | mail -s "archive space on: $BOX" [email protected]
exit 0
In the prior bit of code, the usedSpc variable gets assigned a value that shows what percentage of disk space is used on a mount point. The case statement then examines usedSpc to determine which range the variable falls within. Lastly, it e-mails the production support alias an informational message.
If the device name being assigned to archLoc is longer than 21 characters, then the df command does not neatly display the output on one line (and thus breaks the previous script). If this is the situation on your server, then you can create a symbolic link (see recipe 5-34 for details) with a name that is 21 characters or fewer.
The code in the "Solution" section of this recipe compresses a wide variety of shell commands and coding techniques into a short amount of code. The result is a small but extremely useful script that monitors disk space. The third line of the script needs additional explanation. We repeat it here for convenience:
usedSpc=$(df -h $archLoc | awk '{print $5}' | grep -v Use | cut -d "%" -f1 -)
The output of the df (disk free) command is piped to the awk command, which extracts the fifth column. This in turn is passed to the grep command, which eliminates any output that contains the string Use. This output is piped to the cut command, which extracts the first field delimited by a % character. The resultant string should be the percentage of disk space used on the mount point in question.
On some Linux/Unix platforms, the output of the df command might not display the string Use. For example, on Solaris, the output from the df command uses the word Capacity to indicate the amount of disk space used. In those situations, you can modify the script to use egrep to filter out multiple strings. This example uses egrep to filter a line containing either use or capacity from the output:
usedSpc=$(df -h $archLoc|awk '{print $5}'|egrep -iv "use|capacity"|cut -d "%" -f1 -)
In the example in the "Solution" section of this recipe, the case statement performs a sophisticated set of string comparisons on the value stored in the usedSpc variable. The case statement will check each condition until it finds a match. When a condition is met, the case statement runs any statements within the matched section and then exits.
An example will help clarify this concept. Let's look at the first condition in the case statement in the "Solution" section of this recipe:
[0-9])
  arcStat="relax, lots of space: $usedSpc"
  ;;
In the previous snippet of code, the case statement checks the value of the variable to see whether it is a one-digit string that contains a value within the range of 0 through 9. If it matches, then it sets the arcStat variable to an appropriate message and exits the case statement.
Take a look at the second condition in the case statement:
[1-7][0-9])
  arcStat="space okay: $usedSpc"
  ;;
In the prior bit of code, the case statement checks for a two-digit number. The first character it's looking for must be in the range of 1 through 7. The second character can be any number from 0 to 9. If the pattern matches, the arcStat variable is set to an appropriate value, and the case statement exits.
The case statement will continue to attempt to find a match based on the specified patterns. If no match is made, then the catchall clause *) will be executed, and the case statement will be exited. The structure of the case statement allows you to perform complicated comparisons that would probably drive you crazy if you tried to code them using if/then/else statements. Table 7-4 lists common pattern-matching characters used in case statements.
Table 7.4. Common Character Matching Patterns
Pattern | Description |
---|---|
a|b | Matches either an a or a b |
* | Matches any string of characters, often used for a catchall |
? | Matches any single character |
[abc] | Matches any character a, b, or c |
[0-9] | Matches any character 0 through 9 |
"string" | Matches the string enclosed in the quotes |
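As a simple illustration of these patterns, the following sketch (the response value is hard-coded for illustration) classifies a single character of user input:

```shell
ans=Y    # hypothetical user response; a real script would use read
case $ans in
  y|Y)   echo "answered yes" ;;
  n|N)   echo "answered no" ;;
  [0-9]) echo "answered with a single digit" ;;
  *)     echo "unrecognized answer" ;;
esac
```

With ans set to Y, the first pattern matches and "answered yes" is printed; the catchall *) pattern fires only when nothing above it matches.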
Case closed!
You want to repeatedly perform a task a specific number of times or for a specific number of arguments that are passed in a list.
A for loop allows you to rerun a section of code a fixed number of times. This control construct is particularly useful because DBAs often have a known set of databases or files that need to be operated on. The for loop syntax is as follows:
for name [in list]
do
  commands that can use $name
done
The following code illustrates the power of a for loop. In this environment, there are three databases that are being monitored for a critical background process. The for loop allows you to provide an input list and have the same code reexecuted for each database name in the input list:
#!/bin/bash
SID_LIST="dev1 dev2 dev3"
critProc=ora_smon
for curSid in $SID_LIST
do
  ps -ef | grep -v 'grep' | grep ${critProc}_$curSid
  if [ $? -eq 0 ]; then
    echo "$curSid is available."
  else
    echo "$curSid has issues." | mail -s "issue with $curSid" [email protected]
  fi
done
The for loop iterates through each argument passed in the parameter list. This control structure is ideal for (no pun intended) a fixed input list. Depending on which shell you use, the syntax may be slightly different from the one described in the "Solution" section.
You can use the built-in Bash shell $@ variable to pass a list to a for loop. The $@ variable contains a quoted list of arguments passed to the script. By default, a for loop will use $@ if no input list is provided. The previous code snippet can be slightly modified to take advantage of this technique, as shown here:
#!/bin/bash
critProc=ora_smon
for curSid in $@
do
  ps -ef | grep -v 'grep' | grep -i ${critProc}_$curSid
  if [ $? -eq 0 ]; then
    echo "$curSid is available."
  else
    echo "$curSid has issues." | mail -s "issue with $curSid" [email protected]
  fi
done
Assume that the previous bit of code is placed in a file named dbup.bsh. It can now be run from the command line to pass in a list of databases to check:
$ dbup.bsh dev1 dev2 dev3
The while and until flow control constructs allow a piece of code to iterate until a condition is met. In contrast to a for loop construct, the while and until loops are useful when the number of times needed to continue looping is not known beforehand. The while loop runs as long as a test condition has a zero exit status and stops when the condition returns a nonzero status. The syntax for the while loop is as follows:
while condition ; do
  commands
done
The until control construct is similar to the while loop, but with the logic reversed: the until loop runs as long as a test condition has a nonzero exit status and stops once the condition returns a zero status.
until condition ; do
  commands
done
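For instance, this small sketch (the starting count and limit are arbitrary) loops until the counter exceeds 3:

```shell
count=1
until [ $count -gt 3 ]; do
  echo "iteration $count"      # prints iteration 1, 2, 3
  count=$((count + 1))
done
```

The loop body runs while the test fails (count is not greater than 3) and stops as soon as the test succeeds, leaving count at 4.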
Sometimes it's useful when debugging scripts to iterate through all arguments passed to a shell script and view the parameter values. This next snippet of code uses a while loop to display all parameters passed into a script:
while [ $# -ne 0 ]; do
  echo $1
  shift 1
done
In the previous code sample in the "Solution" section, the shift command is used to move the positional parameters one position to the left. You can think of the positional parameters as an array of values, and (when invoked) the shift command (destructively) moves these values left in the array by the specified number of positions.
An example will help clarify this shifty concept. Suppose there are three parameters passed into a program: A, B, and C. This means $1 will contain A, $2 will contain B, and $3 will contain C. When you issue the shift 1 command, $1 now contains B, $2 contains C, and $3 now contains nothing. Another shift will move C into $1, and $2 and $3 will now be empty, and so forth.
The select command allows you to create a menu from an input list. If the input list is omitted, then the positional parameters (contained in the $@ variable) are used to construct the menu. The syntax of the select command is nearly identical to that of the for command:
select name in [input list]
do
  commands that use $name
done
Listed next is a shell script that uses the select command to query the contents of the /var/opt/oracle/oratab file and set your Oracle operating system variables depending on which value of ORACLE_SID you choose.
#!/bin/bash
# Why: Sets Oracle environment variables.
# Setup: 1. Put oraset file in /var/opt/oracle
#        2. Ensure /var/opt/oracle is in $PATH
# Usage: batch mode: . oraset <SID>
#        menu mode: . oraset
#====================================================
OTAB=/var/opt/oracle/oratab
if [ -z $1 ]; then
  SIDLIST=$(grep -v '#' ${OTAB} | cut -f1 -d:)
  # PS3 indicates the prompt to be used for the Bash select command.
  PS3='SID? '
  select sid in ${SIDLIST}; do
    if [ -n $sid ]; then
      HOLD_SID=$sid
      break
    fi
  done
else
  if grep -v '#' ${OTAB} | grep -w "${1}:">/dev/null; then
    HOLD_SID=$1
  else
    echo "SID: $1 not found in $OTAB"
  fi
  shift
fi
#
export ORACLE_SID=$HOLD_SID
export ORACLE_HOME=$(grep -v '#' $OTAB|grep -w $ORACLE_SID:|cut -f2 -d:)
export ORACLE_BASE=${ORACLE_HOME%%/product*}
export TNS_ADMIN=$ORACLE_HOME/network/admin
export PATH=$ORACLE_HOME/bin:/usr/ccs/bin:/opt/SENSsshc/bin/:/bin:/usr/bin:.:/var/opt/oracle
export LD_LIBRARY_PATH=/usr/lib:$ORACLE_HOME/lib
When you run the script, be sure to run it with the source command, or the . notation, in one of the following ways:
$ . oraset <database name>
Or you can run the script interactively and pick your database name from a menu as follows:
$ . oraset
The . instructs the shell to source the script. Sourcing tells your current shell process to inherit any variables set within the executed script. If you don't use the . notation, then the variables set within the script are visible only within the context of the subshell that is spawned when the script is executed.
In the Bash and C shells, the source command and the . built-in are synonymous.
When running the oraset program, suppose the contents of your /var/opt/oracle/oratab file contain the following text:
BRDSTN1:/global/ORAHOME1/product/10.2:Y
DEV9:/global/ORAHOME1/product/9.2:Y
The names of the databases in the previous text are BRDSTN1 and DEV9. The path to each database's home directory comes next in the line (separated from the database name by a :). The last column should contain a Y or an N and indicates whether you want the database to automatically be restarted when the system reboots (see Chapter 11 for details).
When you run oraset from the command line, you should be presented with a menu like this:
1) BRDSTN1
2) DEV9
In this example, you can now type a 1 or a 2 to set the operating system variables for whichever database you want to use. This allows you to interactively set up OS variables regardless of the number of database installations on the server.
Use the && and || control operators to conditionally execute a command based on the status of the previously run command. The && operator says that if the first command was successful (exits with an exit status of zero), then run the second command in the list. If the first command did not run successfully, then do not run the second command. For example, in this next line of code, an e-mail will be sent only if the grep command finds the string ORA-00600 in the alert.log file:
$ grep ORA-00600 alert.log && echo "DB prob" | Mail -s "ORA 600 error" [email protected]
The || control operator says that if the first command is not successful (runs with a nonzero exit status), then the next command should be run. This next command checks for the existence of a log file, and if it is not found, then an e-mail is sent:
$ ls alert.log || echo "no log" | Mail -s "log file error" [email protected]
The previous examples give you another shell scripting tool that you can use in creative ways to monitor for various activities and alert you if there are problems.
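The two operators can also be chained on one line; in this hypothetical sketch (the lock file path is made up), the success message prints only if the touch succeeds, and the failure message only if it fails:

```shell
# attempt to create a scratch lock file, then report the outcome
touch /tmp/dbcheck.lock \
  && echo "lock file created" \
  || echo "could not create lock file"
rm -f /tmp/dbcheck.lock    # clean up the scratch file
```

One caution with the combined a && b || c form: if a succeeds but b itself fails, c runs too, so reserve this idiom for cases where b is a simple command (such as echo) that won't fail.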
Sometimes it's useful to have a command execute depending on the success or failure of the immediately preceding command. For example, you may want to check the status of a background process. If the background process is not running, then have the shell program e-mail you an appropriate message.
You want to make your scripts more modular and functional. You determine that shell functions will help accomplish this task.
Functions are commonly used in most programming languages. A function is a block of commands that perform an action. You can think of a function as a small script within another script that compartmentalizes a section of code. Using functions allows you to modularize your code and make it more reusable, readable, and maintainable.
Like variables, functions must be declared before you can use them. Not surprisingly, the function command is used to declare functions. To illustrate the use of functions, let's say you need to create a reusable bit of code that displays some debugging information. This example creates a function named showMsg:
#!/bin/bash
function showMsg {
  echo "----------------------------------------"
  echo "You're at location: $1 in the $0 script."
  echo "----------------------------------------"
} # showMsg
The function can now be referenced anywhere in the script after the point at which it was declared. For example, further down in the script this snippet of code invokes the showMsg function:
showMsg 1
# more code goes here...
showMsg 2
Functions can also be declared and invoked directly from the operating system command line. From a DBA perspective, this gives you a very powerful tool that allows you to create and use any number of useful functions that can be run as if they were operating system commands. For example, create a file named dbaFunk.bsh, and place in it the following commands:
#!/bin/bash
#------------------------------------------------------------------------------#
# sshl : ssh with your login to remote host
function sshl {
  echo "ssh -l $LOGNAME $*"
  ssh -l $LOGNAME $*
} # sshl
#------------------------------------------------------------------------------#
# ssho : ssh with the oracle userid to remote host
function ssho {
  echo "ssh -l oracle $*"
  ssh -l oracle $*
} # ssho
#------------------------------------------------------------------------------#
# chkps: check for a process on the box
function chkps {
  ps -ef | grep -i $1
} # chkps
#------------------------------------------------------------------------------#
Now source the file as shown here:
$ . dbaFunk.bsh
You now have access to these functions—sshl, ssho, and chkps—from the operating system command line. To illustrate this, the chkps function is called while passing in the string oracle to operate on:
$ chkps oracle
You'll most likely collect many functions in your DBA tool bag to alleviate having to type long, typo-prone shell commands. Yes, for the previous simple functions, you could create aliases to accomplish essentially the same task. However, functions give you the additional ability to combine several different commands, use parameters, and echo useful informational messages.
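For instance, here's a minimal sketch of the kind of helper that goes beyond what an alias can do: it accepts a parameter, runs several commands, and echoes informational messages. The function name and default filesystem argument are illustrative, not from any particular DBA toolkit:

```shell
# chkfs: report space on a filesystem and print status messages.
# The name and the default argument of / are hypothetical examples.
function chkfs {
  echo "Checking space on filesystem: ${1:-/}"
  df -h "${1:-/}"
  echo "Check of ${1:-/} complete."
} # chkfs

chkfs /tmp
```

After sourcing a file containing this function, you could run `chkfs /tmp` from the command line just like any other operating system command.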
You don't like hard-coding variables in your script. You want to change a script to set variables based on parameters passed to the script. This makes your code more reusable, flexible, and maintainable.
First, take a look at this script with hard-coded values in it for the database SID:
#!/bin/bash
ORACLE_SID=brdstn
rman target / <<EOF
backup database;
EOF
The << characters instruct the command being run (in this case, the rman utility) to receive its input from anything that appears between the first EOF and the last EOF. You don't have to use EOF for the start and finish markers; it can be any text string. It simply marks the beginning and end of where the shell directs the input for the command being run.
If you want to use this script (as it is) to back up a different database, then you'll have to manually edit it and change the name of the database. That isn't very efficient. A better approach would be to modify the script so that it can dynamically be passed the name of the database to be backed up. Assume for this example that the script name is back.bsh. The script is modified as shown to accept the database name as input:
#!/bin/bash
ORACLE_SID=$1
rman target / <<EOF
backup database;
EOF
In the previous bit of code, $1 is a built-in variable in the Bash shell. The $1 variable holds the first parameter passed to the script when invoking it. Now the script can be run by passing in a database name. In this example, we pass in the name devdb:
$ back.bsh devdb
You can pass any number of arguments into a shell script. The first parameter is referenced inside the script as $1, the second parameter is referenced as $2, and so on. These shell variables are known as positional parameters.
Positional parameters are special variables that are set internally by the shell and are available for you to use within a script. The shell provides several special variables for you. Some of the more useful of these variables are described in Table 7-5. Examples of using these variables will be sprinkled throughout the chapter.
Table 7-5. Special Shell Variables
Name | Description |
---|---|
$1–$9 | Positional parameters that hold values for parameters passed to the script. |
$? | The exit status of the last command. Contains a 0 for successfully executed commands. Contains a nonzero value for commands that failed. This nonzero value depends on what the command actually returned. |
$0 | Within a shell script, contains the name of the shell script being executed. |
$# | The number of positional parameters passed to a script. |
$$ | The process number of the shell. Can be used to generate unique file names. |
$! | The process number of the most recently executed background process. |
$* | Contains all the positional parameters passed to the script. |
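A short sketch shows several of these special variables in action; the exact values printed (such as the process number) will of course vary from run to run:

```shell
#!/bin/bash
# Demonstrate a few of the special shell variables from Table 7-5.
ls /no/such/directory 2>/dev/null
LAST_STATUS=$?                 # $? : exit status of the last command (nonzero here)
echo "exit status (\$?)    : $LAST_STATUS"
echo "script name (\$0)    : $0"
echo "process number (\$\$) : $$"
sleep 1 &
BG_PID=$!                      # $! : PID of the most recent background process
echo "background PID (\$!) : $BG_PID"
wait
```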
If you pass in more than nine positional parameters to a script, you will have to use braces ({}) to wrap the number portion of any multidigit parameter. This is because $10 will be interpreted as the contents of $1 with a 0 concatenated to it, whereas ${10} will be interpreted as the contents of the tenth parameter. An example will help illustrate this point; say you pass ten parameters to a script as shown here:
$ myscript.bsh a b c d e f g h i j
For illustration purposes, suppose this line of code is contained within the script:
echo $10
In this case, that will produce a result of a0, because it is echoing the contents of parameter 1 concatenated with a 0.
When the braces are used with the echo command:
echo ${10}
that will produce a j, which is the tenth parameter that was passed to the script. Remember to use the braces anytime you reference a multidigit parameter within a shell script, or you will probably not get the results you intended.
Use the getopts function to process parameters. Here is a script that uses getopts to examine and act on parameters passed to an RMAN backup script:
#!/bin/bash
PRG=$(basename $0)
USAGE="Usage: $PRG -d SID [-c compress] [-h]"
if [ $# -eq 0 ]; then
  echo $USAGE
  exit 1
fi
#
# If char followed by :, then argument is expected; the argument
# should be separated from the option by a white space.
OPTSTRING=":d:c:h"
#
while getopts "$OPTSTRING" ARGS; do
  case $ARGS in
  d) ORACLE_SID=${OPTARG}
     ;;
  c) COMP_SWITCH=$(echo ${OPTARG} | tr '[A-Z]' '[a-z]')
     if [ $COMP_SWITCH = "compress" ]; then
       COMP_MODE=" as compressed backupset "
     else
       echo $USAGE
       exit 1
     fi
     ;;
  h) echo $USAGE
     exit 0
     ;;
  *) echo "Error: Not a valid switch or missing argument."
     echo ${USAGE}
     exit 1
     ;;
  esac
done
#
echo rman backup
rman target / <<EOF
backup $COMP_MODE database;
EOF
#
exit 0
The getopts (get options) utility provides you with an efficient way to inspect and process command-line switches. The getopts program ensures a standard interface is used for shell program parameter handling.
The while getopts command will iterate through each character in the OPTSTRING. This next bulleted list describes how getopts handles characters in the string:
When the first character in the string is a :, that instructs the shell script to generate its own error messages. If the first character is not a :, then the getopts command handles processing errors.
If an option character is followed by a :, then an argument is expected on the command line; the argument should be separated from the option by a space.
If an option is missing its argument, ARGS is set to a :, and OPTARG is set to the option missing an argument.
If an option is invalid, ARGS is set to a ? question mark, and OPTARG is set to the invalid option passed in.
In the "Solution" section of the recipe, the string is named OPTSTRING and is defined as follows:
OPTSTRING=":d:c:h"
The OPTSTRING starts with a :, which specifies that the shell script should handle error conditions and generate any related error messages. If OPTSTRING does not start with a :, the getopts command displays any applicable error messages itself.
The d and c options are both followed by colons and therefore require arguments. The h option is not followed by a colon and therefore does not require an argument.
Assume that the script in the "Solution" section is named rman.bsh. The following is a valid way to invoke the script:
$ rman.bsh -d ORCL -c compress
If you attempt to run the script with an invalid option, an error is returned:
$ rman.bsh -g
Error: Not a valid switch or missing argument.
Usage: rman.bsh -d SID [-c compress] [-h]
For this shell script, the -d (database) switch signifies a valid database SID. The -c (compress) switch specifies whether compression should be used. The -h (help) switch displays the correct usage of the script.
Linux also has a getopt command (no s on the end). This command is used in a similar fashion to getopts. View the man getopt documentation for more information.
There are several techniques for running database commands from within shell scripts. Two commonly used techniques are:
Running commands directly
Capturing output in a variable
These techniques are described in the following sections.
To run commands directly, invoke the database utility within the script. For example, here is a script that invokes the Oracle RMAN utility and takes a backup of the control file:
#!/bin/bash
ORACLE_SID=DEV_DB
rman target / <<EOF
backup current controlfile;
EOF
The << characters instruct the command being run (in this case, the rman utility) to receive its input from anything that appears between the first EOF and the last EOF. You don't have to use EOF for the start and finish markers; it can be any text string. It simply marks the beginning and end of where the shell directs the input for the command being run.
This technique applies to any Oracle utility. The following runs a SQL*Plus command within a shell script:
#!/bin/bash
ORACLE_SID=RMDB1
sqlplus -s <<EOF
/ as sysdba
select sysdate from dual;
EOF
Capturing the output of a command within a variable is known as command substitution. DBAs use this technique often in shell scripts.
For example, a DBA task might be to determine whether some Oracle materialized views are refreshing on a daily basis. One way of doing this is to select a count from the Oracle data dictionary view USER_MVIEWS where the last refresh date is greater than one day old:
#!/bin/bash
critVar=$(sqlplus -s <<EOF
pdb_m/abc@papd
SET HEAD OFF FEED OFF
SELECT count(*) FROM user_mviews
WHERE sysdate-last_refresh_date > 1;
EXIT;
EOF)
The script returns a value into the variable critVar, and then you can test to see whether the value is 0:
if [ $critVar -ne 0 ]; then
  mail -s "Problem with MV refresh" [email protected] <<EOF
MVs not okay.
EOF
else
  echo "MVs okay."
fi
If the value isn't 0, the script will send an e-mail to the appropriate support e-mail address.
You can put any commands that a shell can interpret inside a script. The basic technique for running database tasks is to invoke the database utility and run any specific commands that the utility can interpret. The database commands need to be enclosed by a "start of command" and "end of command" marker.
The following example uses EOF for the start and end of a SQL*Plus command that reports the names of objects created in the database in the last week:
#!/bin/bash
newobjs=$(sqlplus -s << EOF
fbar/invqi@INVQI
select object_name
from dba_objects
where created > sysdate - 7
and owner not in ('SYS','SYSTEM');
EOF)
echo $newobjs | mailx -s "new objects" [email protected]
There are two techniques for achieving command substitution in a shell script:
$(command)
`command`
For example, if you wanted to return the name of a server into a variable, it could be accomplished with the following techniques:
$ BOX=$(uname -a | awk '{print$2}')
or the following:
$ BOX=`uname -a | awk '{print$2}'`
Either technique is a valid method. Depending on which shell you use, the $(command) command substitution technique might not be available.
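One practical difference between the two forms is nesting: $() nests cleanly, whereas nested backticks must be escaped with backslashes. A small sketch (the inner date command is just an example):

```shell
# Nested command substitution: both lines capture the current year.
YEAR1=$(echo $(date +%Y))        # $() nests without escaping
YEAR2=`echo \`date +%Y\``        # backticks require backslash escapes
echo "YEAR1=$YEAR1 YEAR2=$YEAR2"
```

Both variables end up holding the same value; the $() form is generally easier to read and maintain.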
If you're using a data dictionary view (within a shell script) that contains a dollar sign ($) as part of the view name, then you must escape the dollar sign within the shell script. For example, the following selects from the view V$DATAFILE the number of datafiles that have an OFFLINE status:
#!/bin/bash
nf=$(sqlplus -s << EOF
/ as sysdba
set head off
select count(*) from v\$datafile where status='OFFLINE';
EOF)
echo "offline count: $nf" | mailx -s "# files offline" [email protected]
If you don't escape the dollar sign, the shell script interprets the view name as a shell variable. The backslash (\) in front of the dollar sign instructs the shell to ignore the meaning of special characters.
You want to write a flexible and reusable shell script that incorporates the techniques used by experienced shell writers.
Most shell scripts that DBAs use require the following functionality:
Set the shell.
Validate parameters passed to the script.
Set any special variables to be used in the script.
Set the Oracle environment variables.
Call the Oracle utility.
Capture the output in a unique log file name.
Send an e-mail indicating the success or failure of the job.
Exit the script.
Listed next is a basic shell script that uses these techniques to determine whether a SQL*Plus connection can be made to a database. The line numbers have been included for discussion purposes and should be taken out before attempting to run the script:
1  #!/bin/bash
2  PRG=$(basename $0)
3  #
4  # Validate parameters
5  USAGE="Usage: ${PRG} <database name> "
6  if [ $# -ne 1 ]; then
7     echo "${USAGE}"
8     exit 1
9  fi
10 #
11 # Set variables used in the script
12 SID=${1}
13 CONSTR=system/foo@${SID}
14 MAILX='/bin/mailx'
15 MAIL_LIST='[email protected]'
16 LOG_DIR=/orahome/oracle/scripts
17 DAY=$(date +%F)
18 LOG_FILE=${LOG_DIR}/${PRG}.${DAY}.$$.log
19 LOC_SID=SCDEV
20 BOX=$(uname -a | awk '{print$2}')
21 #
22 # Source oracle variables
23 . /var/opt/oracle/oraset $LOC_SID
24 #
25 # Attempt to connect to database via SQL*Plus
26 crit_var=$(sqlplus -s <<EOSQL
27 $CONSTR
28 SET HEAD OFF FEED OFF
29 select 'success' from dual;
30 EOSQL)
31 #
32 # Write output to log file
33 echo ${crit_var} > $LOG_FILE
34 #
35 # Send status
36 echo $crit_var | grep success 2>&1 >/dev/null
37 if [[ $? -ne 0 ]]; then
38    $MAILX -s "Problem with ${SID} on ${BOX}" $MAIL_LIST <$LOG_FILE
39 else
40    echo "Success: ${SID} on ${BOX}" |
41    $MAILX -s "Success: ${SID} okay on ${BOX}" $MAIL_LIST
42 fi
43 #
44 exit 0
If you're using vi for an editor, use the set number and set nonumber commands to toggle the viewing of line numbers (see recipe 4-11 for more details).
The shell script in the "Solution" section of this recipe uses a wide variety of shell programming techniques. You can use these methods to automate a diverse assortment of DBA tasks. We included line numbers in the shell program so that we could describe the purpose of each line. Table 7-6 contains a brief description of each line of code.
Table 7-6. Explanation of Shell Script to Check on Database Status
Line Number | Explanation |
---|---|
1 | Specifies the Bash shell command interpreter for this script. |
2 | Captures the name of the shell script in the PRG shell variable. The basename command strips off any directory path information. |
3–4 | Comments. |
5 | Construct an informational string and place it in the USAGE variable. |
6–9 | If the number of parameters is not equal to 1, then display the script usage string and exit the program. See recipe 7-11 for an advanced discussion of processing variables. |
10–11 | Comments. |
12 | Set the SID variable to the parameter passed into the script. |
13 | Set the CONSTR variable to contain the SQL*Plus database connection string. |
14 | Set the MAILX variable to the path of the mailx utility. |
15 | Specify the e-mail address of the DBA(s) to receive the job status. |
16 | Set the LOG_DIR variable to the directory that contains the log files. |
17 | Set the DAY variable to the current date. |
18 | Specify the LOG_FILE name; the $$ variable is used to generate a unique log file name. |
19 | Set the LOC_SID variable to a local instance name (used when sourcing the Oracle variables). |
20 | Set the BOX variable to contain the name of the local server. |
21–22 | Comments. |
23 | Use a program to set the required operating system variables such as ORACLE_HOME. See recipe 7-7 for an example of a file that sets the Oracle variables. |
24–25 | Comments. |
26 | Capture in the crit_var variable the output of the SQL*Plus command. |
27 | Connect to SQL*Plus with the connection string contained in the CONSTR variable. |
28–29 | Run the SQL*Plus formatting and SQL command. |
30 | The EOSQL marker specifies the end of the text directed as input to SQL*Plus. |
31–32 | Comments. |
33 | Write the contents of the crit_var variable to the log file. |
34–35 | Comments. |
36 | Examine the contents of the crit_var variable for the string success. |
37 | The $? variable contains the status of the previously executed command; if it is nonzero, the string success was not found. |
38 | Send an e-mail indicating there is a problem. |
39–41 | The else branch: send an e-mail indicating the database connection was successful. |
42 | The fi keyword marks the end of the if statement. |
43 | Blank comment line. |
44 | Exit the shell script with a success status (indicated by a 0). |
You work in a distributed environment and have database servers in remote locations. You want to run a job in the background that will continue to run, even if there are network problems or you log off the box.
Use the ampersand (&) character to place a job in the background. This example runs the rman.bsh script in the background:
$ rman.bsh &
[1] 6507
From the previous output, the [1] indicates the job number, and 6507 is the process identifier. You can verify that the program is running in the background via the jobs command:
$ jobs
[1]+ Running    rman.bsh &
On some older systems, you may be required to use the nohup (no hangup) command to ensure that the job will still execute even if you log off the server. If using an older shell, use this syntax to place a job in the background:
$ nohup rman.bsh &
To stop a background job, use the kill command. This next line of code stops job 1:
$ kill %1
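The whole cycle of launching, inspecting, and stopping a background job can be sketched with a harmless stand-in for a long-running script (the sleep command here is just a placeholder for something like rman.bsh):

```shell
# Launch a placeholder long-running job in the background.
sleep 60 &
BGPID=$!                    # $! holds the PID of the background job
echo "background job started with PID $BGPID"
kill $BGPID                 # stop it by PID (equivalent to kill %1 for job 1)
wait $BGPID 2>/dev/null     # reap the process; ignore the kill exit status
echo "background job stopped"
```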
In today's global environment, DBAs may find themselves logging onto remote servers halfway around the planet. Sometimes DBAs find themselves running long jobs on a remote server from the command line. For example, you might encounter an issue with a backup job, and you want to kick it off manually so that you can baby-sit it. One issue that arises is that sometimes the network connection being used can be somewhat unstable, and it will terminate your session before the given job has completed.
By default, when you run a shell script from the command line, it will run in the foreground. To execute a job in the background, place an ampersand character at the end of the command string. Here is the general syntax:
$ <command> &
Using & ensures that a command will continue to run, even if you log off the box. On some older systems, you may be required to use the nohup command to achieve this functionality. In this example, the nohup command is used to run a shell script in the background:
By default, the output from a nohup command is written to a file named nohup.out. You can redirect the output to the file of your choice as follows:
$ nohup export_db.bash >exp.out &
You can interactively monitor the output of the previous job by viewing it with the tail -f command:
$ tail -f exp.out
If you want a job to consistently run in the background at a specified time, then use a scheduling utility such as cron. See Chapter 11 for details on automating jobs.
The & and nohup commands are the traditional ways of keeping a job running in the background. You can use the Linux screen command to achieve the same result with significantly more functionality. The screen command starts a terminal session on your server that will persist even if there is an unexpected network disruption.
To start a screen session, issue the following command:
$ screen
If you receive an error like "Cannot open terminal /dev/pts/1," then as root change the permissions on that file:
# chmod a+rw /dev/pts/1
When you invoke screen, you are presented with a terminal from which you can type commands and run scripts. The difference between a screen session and a normal terminal session is that the screen session will continue to run even after you are detached.
For example, say you are at your work location and log on to a database box and start a screen session. You then start a long-running backup job in your screen session. After the job is started, you detach from the screen session by pressing Ctrl+A and then the D key. You can then drive home, remotely log on to the database server, reattach to the screen session you started at work, and monitor the backup job as if you were looking at the same terminal you started at work.
Here's a simple example of how this works. Type screen, as shown here:
$ screen
Print the current working directory so that you have some output on the screen that you can recognize when you attach to this screen session from another terminal:
$ pwd
/home/oracle
Now press Ctrl+A and then the D key (that's pressing the Ctrl and A keys at the same time, releasing them, and then pressing the D key). This detaches you from the screen session. You should see the following message:
[detached]
Now start a different terminal session, and log on to the database server. Issue the following command to display any detached screen sessions:
$ screen -ls
There is a screen on:
        31334.pts-1.rmougprd2   (Detached)
1 Socket in /tmp/uscreens/S-oracle.
You can reattach to any screen session using the -r (reattach) option followed by [[pid.]tty[.host]]. For this particular example, you can reestablish the screen connection by typing this:
$ screen -r 31334.pts-1.rmougprd2
You should now see the output of the previously entered pwd command. It's as if you never left the screen terminal session. This is a very powerful utility that can be used to start jobs and then monitor them from another remote terminal session. You can even share a screen session with other users.
To display screen online help, press Ctrl+A and then the ? key. (That's pressing the Ctrl key and the A key at the same time, releasing those keys, and then pressing the ? key.) To leave a screen session, use the exit command. This will stop your screen session.
Sometimes you'll need to monitor the progress of a long-running shell script. You can use the Linux tail command with the -f (follow) switch to display the output of a job as it is written to a log file. In this example, the output of a backup job is redirected to an output file named rmanback.out:
$ rmanback.bash >rmanback.out 2>&1
From another session, the output being written to the log file is interactively viewed with the tail -f command:
$ tail -f rmanback.out
Here is a snippet of typical output that might be displayed to the screen:
channel ORA_DISK_2: starting archive log backupset
channel ORA_DISK_2: specifying archive log(s) in backup set
input archive log thread=1 sequence=868 recid=859 stamp=628426116
input archive log thread=1 sequence=869 recid=860 stamp=628426523
input archive log thread=1 sequence=870 recid=861 stamp=628466994
channel ORA_DISK_2: starting piece 1 at 21-JUL-07
DBAs often use the tail command to monitor things like alert logs and view potential issues with the database as they are happening. In this example, we continuously follow what's being written to an Oracle database alert.log file:
$ tail -f alert_BRDSTN.log
Here's a snippet of typical output written to the alert.log file:
Completed: ALTER DATABASE BACKUP CONTROLFILE TO TRACE
DBID: 2917656785
Fri Jul 20 23:13:09 2007
Thread 1 advanced to log sequence 71
Current log# 2 seq# 71 mem# 0: /ora01/oradata/BRDSTN/oradata/redo02a.log
When you want to discontinue viewing the contents of a log file, press Ctrl+C (press the Ctrl and C keys at the same time) to break out of the tail command.
The Bash shell has several features that are useful for debugging and troubleshooting problems in scripts. The -n switch (no execution) allows you to check the syntax of a script before you run it. To check a Bash shell script for syntax errors, use the -n switch as shown here:
$ bash -n db.bash
If the script contains any errors, it will display a message such as this:
db.bsh: line 10: syntax error: unexpected end of file
Another useful debugging feature is the -o xtrace option. This instructs the Bash shell to display every command before it is executed. This option also shows any variable substitutions and expansions, so you can view the actual variable values used when the shell script executes. You can invoke the -o xtrace feature from the command line as follows:
$ bash -o xtrace <script name>
You'll notice that the output contains lines that don't seem to have anything to do with your code:
+ alias 'rm=rm -i'
+ alias 'cp=cp -i'
+ alias 'mv=mv -i'
+ '[' -f /etc/bashrc ']'
+ . /etc/bashrc
+++ id -gn
+++ id -un
+++ id -u
++ '[' root = root -a 0 -gt 99 ']'
++ umask 022
++ '[' '' ']'
+ export JAVA_HOME=/opt/java
That's because the first several lines in the output are from code executed in profile scripts that are in your Linux environment. Also of note, the plus signs in the output indicate the level of nesting of a command within the script.
As shell scripts become longer and more complex, it can sometimes be difficult to pinpoint the source of a bug within a script. This can be especially true when maintaining code that somebody else wrote.
If you just want to see tracing for specific commands within the script, embed set -o xtrace directly within your code at the desired location. In this example, tracing is turned on before the if statement and then turned off at the end:
set -o xtrace
if [ $? -eq 0 ]; then
  echo "$critProc is available."
else
  echo "$critProc has issues." | mail -s "problem with $critProc" [email protected]
fi
set +o xtrace
To enable a set command feature, you must use the minus (-) sign. This may seem counterintuitive. Equally counterintuitive, use a plus (+) sign to disable a set command feature.
You can also use the set -x command to print each command (with its expanded arguments) before it is executed. To turn off tracing, use set +x. This is identical to the set -o xtrace command.
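A minimal sketch shows the toggle in action; note that the trace lines are written to standard error, so they don't interfere with the script's normal output:

```shell
# Turn tracing on for a single assignment, then turn it back off.
set -x
TRACED_MSG="tracing was enabled here"   # this line is echoed to stderr by xtrace
set +x
echo "$TRACED_MSG"                      # this line runs untraced
```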