Hive supports the following built-in functions:
Data type |
Function |
Description |
BIGINT |
round(double a) |
Returns the rounded BIGINT value of the double. |
BIGINT |
floor(double a) |
Returns the maximum BIGINT value that is equal or less than the double. |
BIGINT |
ceil(double a) |
Returns the minimum BIGINT value that is equal or greater than the double. |
double |
rand(), rand(int seed) |
Returns a random number (that changes from row to row). Specifying the seed will make sure the generated random number sequence is deterministic. |
string |
concat(string A, string B,...) |
Returns the string resulting from concatenating B after A. For example, concat('foo', 'bar') results in 'foobar'. This function accepts an arbitrary number of arguments and returns the concatenation of all of them. |
string |
substr(string A, int start) |
Returns the substring of A starting from start position till the end of string A. For example, substr('foobar', 4) results in 'bar'. |
string |
substr(string A, int start, int length) |
Returns the substring of A starting from start position with the given length, for example, |
string |
upper(string A) |
Returns the string resulting from converting all characters of A to uppercase, for example, upper('fOoBaR') results in 'FOOBAR'. |
string |
ucase(string A) |
Same as upper. |
string |
lower(string A) |
Returns the string resulting from converting all characters of B to lowercase, for example, lower('fOoBaR') results in 'foobar'. |
string |
lcase(string A) |
Same as lower. |
string |
trim(string A) |
Returns the string resulting from trimming spaces from both ends of A, for example, trim('foobar ') results in 'foobar'. |
string |
ltrim(string A) |
Returns the string resulting from trimming spaces from the beginning (left hand side) of A. For example, ltrim(' foobar ') results in 'foobar '. |
string |
rtrim(string A) |
Returns the string resulting from trimming spaces from the end (right hand side) of A. For example, rtrim(' foobar') results in 'foobar'. |
string |
regexp_replace(string A, string B, string C) |
Returns the string resulting from replacing all substrings in B that match the Java regular expression syntax (See Java regular expressions syntax) with C. For example, regexp_replace('foobar', 'oo|ar', ) returns 'fb'. |
int |
size(Map<K.V>) |
Returns the number of elements in the map type. |
int |
size(Array<T>) |
Returns the number of elements in the array type. |
value of <type> |
cast(<expr> as <type>) |
Converts the results of the expression expr to <type>, for example, cast('1' as BIGINT) will convert the string '1' to its integral representation. A null is returned if the conversion does not succeed. |
string |
from_unixtime(int unixtime) |
Convert the number of seconds from the UNIX epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the format of 1970-01-01 00:00:00. |
string |
to_date(string timestamp) |
Return the date part of a timestamp string: to_date("1970-01-01 00:00:00") = "1970-01-01". |
int |
year(string date) |
Return the year part of a date or a timestamp string: year("1970-01-01 00:00:00") = 1970, year("1970-01-01") = 1970. |
int |
month(string date) |
Return the month part of a date or a timestamp string: month("1970-11-01 00:00:00") = 11, month("1970-11-01") = 11. |
int |
day(string date) |
Return the day part of a date or a timestamp string: day("1970-11-01 00:00:00") = 1, day("1970-11-01") = 1. |
string |
get_json_object(string json_string, string path) |
Extract a json object from a json string based on the json path specified, and return json string of the extracted a .json object. It will return null if the input json string is invalid. |
The following built-in aggregate functions are supported in Hive:
Data type |
Functions |
Description |
BIGINT |
count(*), count(expr), count(DISTINCT expr[, expr_.]) |
count(*) – Returns the total number of retrieved rows, including rows containing NULL values; count(expr) – Returns the number of rows for which the supplied expression is non-NULL; count(DISTINCT expr[, expr]) – Returns the number of rows for which the supplied expressions are unique and non-NULL. |
DOUBLE |
sum(col), sum(DISTINCT col) |
Returns the sum of the elements in the group or the sum of the distinct values of the column in the group. |
DOUBLE |
avg(col), avg(DISTINCT col) |
Returns the average of the elements in the group or the average of the distinct values of the column in the group. |
DOUBLE |
min(col) |
Returns the minimum value of the column in the group. |
DOUBLE |
max(col) |
Returns the maximum value of the column in the group. |