The previous sections covered all the basics of a SELECT statement using DUAL and how strings and numbers are constructed, compared, and combined. Now we can start looking at some of Oracle's built-in single-row functions that operate on strings and numbers in database table columns.
In both Oracle SQL and most programming languages, a function is a predefined set of steps that can be accessed using a common name. A function may include zero, one, or more arguments that are passed to the function, and it may return a result. For example, the SQRT function calculates the square root of a number and returns a value of 1.414214 when called with an argument of 2: SQRT(2) = 1.414214.
function
A named set of predefined programming language commands that performs a specific task given zero, one, or more arguments and returns a value.
Single-row functions are functions that may have zero, one, or more arguments and will return one result for each row returned in the query. Functions can be called in the SELECT, WHERE, and ORDER BY clauses of a SELECT statement. (The WHERE and ORDER BY clauses are used to restrict and organize query output, as explained in the next chapter.)
single-row function
A function that may have zero, one, or more arguments and will return one result for each row returned in a query.
NOTE
All of these functions are available for use in both SQL and PL/SQL (Oracle's SQL-based programming language). As of Oracle9i, SQL and PL/SQL share the same core SQL engine.
In this section, we'll cover the highlights of Oracle's string functions, numeric functions, date functions, conversion functions, and general functions that don't fall neatly into any of the other categories.
String functions are functions that perform some kind of transformation on a string literal, a column containing a string, or an expression consisting of string literals and table columns. String functions will return a string as the result of the transformation. Table 3.2 briefly describes the built-in string functions.
Function | Description |
---|---|
ASCII | Returns the decimal equivalent of the first character of a string |
CHR | Given a decimal number, returns the ASCII equivalent character |
CONCAT | Concatenates two strings |
INITCAP | Converts the first letter of each word in a string to uppercase |
INSTR | Searches a string for an occurrence of another string |
LENGTH | Returns the length of a string |
LOWER | Converts all characters in a string to lowercase |
LPAD | Left-fills a character string with a given character for a specified total length |
LTRIM | Trims a specific character from the front of a string |
REGEXP_INSTR | Searches a string for an occurrence of a regular expression |
REGEXP_REPLACE | Replaces occurrences of a specified regular expression with another string |
REGEXP_SUBSTR | Returns a substring of another string matching a regular expression |
REPLACE | Replaces occurrences of a specified string within another string |
RPAD | Right-fills a string with a given character for a specified total length |
RTRIM | Trims a specific character from the end of a string |
SOUNDEX | Returns a phonetic equivalent of a string |
SUBSTR | Returns a specified portion of a string |
TRANSLATE | Converts single characters to alternate single characters in a string |
TREAT | Changes the declared type of an expression |
TRIM | Removes leading, trailing, or both leading and trailing characters from a string |
UPPER | Converts all characters in a string to uppercase |
Let's consider some practical uses for string functions. Now that Scott's widget company is off the ground, Scott regrets some of the shortcuts he took when creating the initial version of the database. The users don't find the reports very readable, and it would look a lot better if the names were in uppercase and lowercase letters.
The INITCAP function offers a quick way to clean up names and addresses that may be in all uppercase, all lowercase, or mixed case. It will work for a first pass over the data to at least make the names and addresses somewhat readable. Until Scott can overhaul the database, he can use the INITCAP function and column aliases to make things look a bit better:
string function
A function that operates on string literals, columns containing strings, or an expression containing string literals and table columns, returning a string as the result.
select empno "Empl#", initcap(ename) "EmplName" from emp; Empl# EmplName ---------- ---------- 7369 Smith 7499 Allen
7521 Ward 7566 Jones 7654 Martin 7698 Blake 7782 Clark 7788 Scott 7839 King 7844 Turner 7876 Adams 7900 James 7902 Ford 7934 Miller 14 rows selected.
NOTE
The INITCAP function cannot capitalize mixed-case names correctly. For example, if one of the employee names were McDonald, the INITCAP function would not capitalize that name correctly (unless there was a space between MC and DONALD, which wouldn't be right either).
The next day, the Publications department wants to put the employee numbers and names on an intranet web page. The web page designers would like the employee number left justified and the employee name right justified, for a total width of 40 characters. Between the employee number and name must be a series of dots (or periods). To provide the complete 40-character field, Scott must use the LENGTH and LPAD functions in addition to what he already had from the example above:
select empno || lpad(initcap(ename),40-length(empno),'.') "Employee Directory" from emp; Employee Directory 7369............................... Smith 7499............................... Allen 7521................................ Ward 7566............................... Jones 7654.............................. Martin 7698............................... Blake 7782............................... Clark 7788............................... Scott 7839................................ King 7844.............................. Turner
7876............................... Adams 7900............................... James 7902................................ Ford 7934.............................. Miller 14 rows selected.
This query uses three string functions: two of them are nested within another function, plus a concatenation operation. Let's break down the query to clarify how it works.
As you've seen, the function call INITCAP(ename) changes the first letter of each word to uppercase. The function call LENGTH(empno) returns the length of a character string. In this case, there is an implicit conversion of a numeric type to a string type. An implicit conversion occurs automatically when Oracle evaluates an expression; conversely, an explicit conversion occurs when the SQL statement makes no assumptions about how Oracle will convert one datatype to another and uses one or more of the built-in functions to perform the conversion. The column is converted to a character string, and the length of the converted character string is returned.
implicit conversion
Conversion of one datatype to another that occurs automatically when columns or constants with dissimilar datatypes appear in an expression.
The LPAD function will left-pad a character string to a specified number of characters with the character you specify. Scott wants to end up with a total of 40 characters, so he subtracts the number of characters that the employee number would take up. Here, he will left-pad the employee name with periods, less the amount of space taken up by the employee number. Once the LPAD function is evaluated, he will concatenate the employee number at the front, and once again, he will allow the implicit conversion of the employee number from numeric to string.
explicit conversion
Conversion of one datatype to another in an expression using function calls such as TO_CHAR instead of relying on automatic conversion rules (implicit conversion).
Finally, Scott wants the title for the report to look readable, so he assigns a column alias to the result of the concatenated function calls. The column alias can act as a report title.
Numeric functions are functions that perform some kind of transformation on a numeric literal, a column containing a number, or an expression consisting of numeric literals and table columns. Numeric functions will return a number as the result of the transformation. Table 3.3 briefly describes the built-in numeric functions.
numeric function
A function that operates on numeric literals, columns containing numbers, or an expression containing numeric literals and table columns, returning a number as the result.
Function | Description |
---|---|
ABS | Returns the absolute value of the argument |
ACOS | Returns the arc cosine |
ASIN | Returns the arc sine |
ATAN | Returns the arc tangent |
ATAN2 | Returns the arc tangent of two values |
BITAND | Performs a bitwise AND on two arguments |
CEIL | Returns the next highest integer |
COS | Returns the cosine |
COSH | Returns the hyperbolic cosine |
EXP | Raises e (2.718281828...) to the specified power |
FLOOR | Returns the next lowest integer |
LN | Returns the natural logarithm (base e) |
LOG | Returns the base 10 logarithm |
MOD | Returns the remainder of the first argument divided by the second, using FLOOR in the calculation |
NANVL | Returns an alternate value if the first argument is non-numeric |
POWER | Raises a number to an arbitrary power |
REMAINDER | Returns the remainder of the first argument divided by the second, similar to MOD except that REMAINDER uses ROUND |
ROUND | Returns a rounded value to an arbitrary precision |
SIGN | Returns −1 if the argument is negative, 0 if 0, or 1 if positive |
SIN | Returns the sine |
SQRT | Returns the square root of the argument |
TAN | Returns the tangent |
TRUNC | Truncates a number to an arbitrary precision |
Scott's company has survived its first month and has even turned a small profit. Scott wants to find a way to distribute the first month's profit in a fair manner, so he turns to the company mathematician and statistician, Julie. She suggests that the employees get a one-time bonus that is based on the square root of their current salary. Scott can run the following query to see what the potential bonuses might be using the SQRT function:
select ename, sal, sqrt(sal) from emp;
ENAME SAL SQRT(SAL) ---------- ---------- ---------- SMITH 700 26.4575131 ALLEN 1600 40 WARD 1250 35.3553391 JONES 2975 54.5435606 MARTIN 1250 35.3553391 BLAKE 2850 53.3853913 CLARK 2450 49.4974747 SCOTT 3000 54.7722558 KING 5000 70.7106781 TURNER 1300 36.0555128 ADAMS 1100 33.1662479 JAMES 950 30.82207 FORD 3000 54.7722558 MILLER 1600 40 14 rows selected.
Scott seems to like this idea, since the bonuses for the highest paid workers are not as big of a percentage of their base wage as they are for the lowest paid workers.
The report is a bit unreadable; Scott wants the bonus rounded to two digits with a better heading for the bonus. The new query looks something like this, using the ROUND function:
select ename, sal, round(sqrt(sal),2) "Bonus" from emp; ENAME SAL Bonus ---------- ---------- ---------- SMITH 700 26.46 ALLEN 1600 40 WARD 1250 35.36 JONES 2975 54.54 MARTIN 1250 35.36 BLAKE 2850 53.39 CLARK 2450 49.5 SCOTT 3000 54.77 KING 5000 70.71 TURNER 1300 36.06 ADAMS 1100 33.17 JAMES 950 30.82 FORD 3000 54.77
MILLER 1600 40 14 rows selected.
The report is looking better, but the Bonus column is still not formatted quite right. We'll look at ways to fix this in the section on conversion functions later in this chapter.
Since a lot of employees are on commission, Scott may want to base the bonus on both the salary and commission. We'll look at how to do this in the section on general functions.
Date functions are functions that perform some kind of transformation on a date literal, a column containing a date, or an expression consisting of date literals and table columns. Date functions will return a date or a string containing a portion of the date as the result of the transformation. Table 3.4 describes the date-related functions.
Function | Description |
---|---|
ADD_MONTHS | Increments a date value by a number of months |
CURRENT_DATE | Returns the current date for the session's time zone |
CURRENT_ TIMESTAMP | Returns the current date and time in the session's time zone to a particular precision |
DBTIMEZONE | Returns the database time zone as an offset in hours and minutes from UTC |
EXTRACT | Returns a portion of the date and time (e.g., hour, month) from a timestamp value |
FROM_TZ | Returns a timestamp with the time zone for a given combination of an individual timestamp and time zone |
LAST_DAY | Returns the last day of the month for a given date |
LOCALTIMESTAMP | Returns the current date and time in the session's time zone to a given precision |
MONTHS_BETWEEN | Returns the numeric number of months between two date arguments |
NEW_TIME | Returns a date in a second time zone given a date in the first time zone |
NEXT_DAY | Finds the next occurrence of a specific day of the week given a date |
ROUND | Rounds a date value to a specific unit of time |
SESSIONTIMEZONE | Returns the database time zone (DBTIMEZONE) unless altered during the session |
SYS_EXTRACT_UTC | Returns the UTC for a timestamp with time zone value |
SYSDATE | Returns the current date and time |
SYSTIMESTAMP | Returns a timestamp with the time zone for the database date and time |
TRUNC | Truncates a date value to a specified unit of time |
TZ_OFFSET | Converts a text time zone to a numeric offset |
NOTE
Date and time handling has been greatly enhanced since Oracle9i. Not only can the precision of Oracle's timestamp datatypes support fractions of a second to nine decimal places, other functions and system parameters smooth the process of handling Oracle servers and sessions across multiple time zones. This is handy for companies with national and international business.
When Scott started his widget company, he hired most of the people away from a competitor. As part of the employment agreement, he kept the new employees' original hire date for the new company. He wants to see how many employees have been working for the company (or competitor) more than 250 months. He can run this query to get the answer:
date function
A function that performs some kind of transformation on a date literal, a column containing a date, or an expression consisting of date literals and table columns. Date functions return a date or a string containing a portion of the date as the result of the transformation.
select ename, hiredate, months_between(sysdate,hiredate) "Months" from emp; ENAME HIREDATE Months ---------- --------- ---------- SMITH 17-DEC-80 260.608914 ALLEN 20-FEB-81 258.51214 WARD 22-FEB-81 258.447624 JONES 02-APR-81 257.092785 MARTIN 28-SEP-81 251.254076 BLAKE 01-MAY-81 256.125043 CLARK 09-JUN-81 254.866979 SCOTT 19-APR-87 184.544398 KING 17-NOV-81 249.608914 TURNER 08-SEP-81 251.899237 ADAMS 23-MAY-87 183.415366
JAMES 03-DEC-81 249.060527 FORD 03-DEC-81 249.060527 MILLER 23-JAN-82 247.415366 14 rows selected.
Note that there are two functions being called: SYSDATE and MONTHS_ BETWEEN. SYSDATE has no arguments; it merely returns the current date and time, so the parentheses must be omitted. The MONTHS_BETWEEN function returns the difference between dates in months. If you wanted to know the number of days instead, you would not need the MONTHS_BETWEEN function and could use the expression SYSDATE-HIREDATE instead. Date arithmetic returns values in units of days.
As the name implies, conversion functions convert between numbers, strings, and date values. The common conversion functions are described in Table 3.5.
Function | Description |
---|---|
ASCIISTR | Converts non-ASCII characters to ASCII |
CAST | Converts one datatype to another |
NUMTODSINTERVAL | Converts a number and a character string representing a unit of time to an INTERVAL DAY TO SECOND type |
NUMTOYMINTERVAL | Converts a number and a character string representing a unit of time to an INTERVAL YEAR TO MONTH type |
TO_CHAR | Converts a date or a number to character format |
TO_DATE | Converts a character format date to a DATE datatype |
TO_DSINTERVAL | Converts a character string to an INTERVAL DAY TO SECOND datatype |
TO_NUMBER | Converts a character string to an internal numeric format |
TO_YMINTERVAL | Converts a character string to an INTERVAL YEAR TO MONTH datatype |
Scott knows he can improve on the query he used to see which employees have been with the company more than 250 months. Rather than see the number of months since the original hire date, he wants to see the dates when the employee will reach or has reached the 250-month mark. For this result, he will use the NUMTOYMINTERVAL function to add 250 months to the hire date:
select ename, hiredate, hiredate + numtoyminterval(250,'month') "250 Months" from emp; ENAME HIREDATE 250 Month ---------- --------- --------- SMITH 17-DEC-80 17-OCT-01 ALLEN 20-FEB-81 20-DEC-01 WARD 22-FEB-81 22-DEC-01 JONES 02-APR-81 02-FEB-02 MARTIN 28-SEP-81 28-JUL-02 BLAKE 01-MAY-81 01-MAR-02 CLARK 09-JUN-81 09-APR-02 SCOTT 19-APR-87 19-FEB-08 KING 17-NOV-81 17-SEP-02 TURNER 08-SEP-81 08-JUL-02 ADAMS 23-MAY-87 23-MAR-08 JAMES 03-DEC-81 03-OCT-02 FORD 03-DEC-81 03-OCT-02 MILLER 23-JAN-82 23-NOV-02 14 rows selected.
Scott could have used the function TO_YMINTERVAL('20-10') to add 20 years and 10 months (250 months total) to the hire date. Whether to use one method or another depends on how you want to specify the format—as a discrete number of months or years or as a combination of months and years.
Now that Scott knows more about the conversion functions, he wants to revisit one of the queries he wrote previously:
select ename, sal, round(sqrt(sal),2) "Bonus" from emp;
The problem with this query was that the default numeric formatting didn't look good, even after applying the ROUND function. Scott can apply another function here, TO_CHAR, to force the bonus to have two decimal places, even if the bonus does not have any significance beyond the first decimal point. The TO_ CHAR function specifies the value to be formatted and the desired format, and it can be used to format both numbers and date values. Here, Scott wants to fix that rounded number:
select ename, sal, to_char(round(sqrt(sal),2),'9999.99') "Bonus" from emp;
ENAME SAL Bonus ---------- ---------- -------- SMITH 700 26.46 ALLEN 1600 40.00 WARD 1250 35.36 JONES 2975 54.54 MARTIN 1250 35.36 BLAKE 2850 53.39 CLARK 2450 49.50 SCOTT 3000 54.77 KING 5000 70.71 TURNER 1300 36.06 ADAMS 1100 33.17 JAMES 950 30.82 FORD 3000 54.77 MILLER 1600 40.00 14 rows selected.
In addition to the '9' digit in the format, you can use '0' to force leading zeros, a '$' to show dollar amounts, a '-' for leading or trailing signs, commas to make large numbers more readable, or even roman numerals. Table 3.6 shows a few sample numeric formats and how the value 7322.8 would look in that format.
Format | Result |
---|---|
99999.99 | 7322.80 |
$999.999 | ######### |
00999.90 | 07322.80 |
99,999.99 | 7,322.80 |
S9999 | +7323 |
9.9999EEEE | 7.3228E+03 |
Notice that when a number will not fit into the format provided, it is displayed as all #s. Notice also that rounding will automatically occur if there are not enough positions to the right of the decimal to accommodate the entire number.
The category of general functions covers all of the functions that don't fit neatly into a single category. Many of them are shortcuts that allow the DBA or developer to avoid needing to use PL/SQL for certain types of processing, such as a conditional operation that would normally require more than one statement. Table 3.7 briefly describes the general functions.
Scott is continuing to analyze the profitability versus expenses in his widget company by looking at the total compensation for each employee. Most employees are salaried, but a few are salaried with a commission. Scott's first attempt at a total compensation calculation is something like this:
select ename, sal+comm from emp; ENAME SAL+COMM ---------- ---------- SMITH ALLEN 1900 WARD 1750 JONES MARTIN 2650 BLAKE CLARK
SCOTT KING TURNER 1300 ADAMS JAMES FORD MILLER 14 rows selected.
Wait a minute, what happened to the salaries for the other employees? As noted earlier in the chapter, NULL values provide a great benefit in that they can indicate that a value is unknown, unavailable, or not applicable. However, when combined in some kind of calculation with non-NULL values, the result will always be NULL. For example, adding 15 to an unknown value will result in a new value that is also unknown.
In the case of the employee salaries and commissions, however, Scott wants to treat the commissions as zero if they are NULL for the purpose of calculating total compensation. For this, he will use the NVL function. NVL takes two arguments. The first argument is compared to NULL, and if it is NULL, it returns the second argument; otherwise, it returns the first argument. Scott's query can be modified with the NVL function to produce the correct results:
select ename, sal+nvl(comm,0) from emp; ENAME SAL+NVL(COMM,0) ---------- --------------- SMITH 700 ALLEN 1900 WARD 1750 JONES 2975 MARTIN 2650 BLAKE 2850 CLARK 2450 SCOTT 3000 KING 5000 TURNER 1300 ADAMS 1100 JAMES 950 FORD 3000 MILLER 1600 14 rows selected.
That looks a lot better. Other, more esoteric functions such as VSIZE are more often used by DBAs to determine how much space a particular column for a particular row is using, in bytes:
select ename, vsize(ename), sal, vsize(sal) from emp; ENAME VSIZE(ENAME) SAL VSIZE(SAL) ---------- ------------ ---------- ---------- SMITH 5 700 2 ALLEN 5 1600 2 WARD 4 1250 3 JONES 5 2975 3 MARTIN 6 1250 3 BLAKE 5 2850 3 CLARK 5 2450 3 SCOTT 5 3000 2 KING 4 5000 2 TURNER 6 1300 2 ADAMS 5 1100 2 JAMES 5 950 3 FORD 4 3000 2 MILLER 6 1600 2 14 rows selected.
The lengths for the employee names make sense, but why would a salary of 3000 take up less space than a salary of 2450? This is because all numbers are stored internally in scientific notation. Only the 3 from the 3000 salary needs to be stored with an exponent of 3, whereas the salary 2450 is stored as 2.45 with an exponent of 3. More digits of precision require more storage space in Oracle's variable internal numeric format.