Built-in functions

Hive supports the following built-in functions:

Data type

Function

Description

BIGINT

round(double a)

Returns the rounded BIGINT value of the double.

BIGINT

floor(double a)

Returns the maximum BIGINT value that is equal or less than the double.

BIGINT

ceil(double a)

Returns the minimum BIGINT value that is equal or greater than the double.

double

rand(), rand(int seed)

Returns a random number (that changes from row to row). Specifying the seed will make sure the generated random number sequence is deterministic.

string

concat(string A, string B,...)

Returns the string resulting from concatenating B after A. For example, concat('foo', 'bar') results in 'foobar'. This function accepts an arbitrary number of arguments and returns the concatenation of all of them.

string

substr(string A, int start)

Returns the substring of A starting from start position till the end of string A. For example, substr('foobar', 4) results in 'bar'.

string

substr(string A, int start, int length)

Returns the substring of A starting from start position with the given length, for example,
substr('foobar', 4, 2) results in 'ba'.

string

upper(string A)

Returns the string resulting from converting all characters of A to uppercase, for example, upper('fOoBaR') results in 'FOOBAR'.

string

ucase(string A)

Same as upper.

string

lower(string A)

Returns the string resulting from converting all characters of B to lowercase, for example, lower('fOoBaR') results in 'foobar'.

string

lcase(string A)

Same as lower.

string

trim(string A)

Returns the string resulting from trimming spaces from both ends of A, for example, trim('foobar ') results in 'foobar'.

string

ltrim(string A)

Returns the string resulting from trimming spaces from the beginning (left hand side) of A. For example, ltrim(' foobar ') results in 'foobar '.

string

rtrim(string A)

Returns the string resulting from trimming spaces from the end (right hand side) of A. For example, rtrim(' foobar') results in 'foobar'.

string

regexp_replace(string A, string B, string C)

Returns the string resulting from replacing all substrings in B that match the Java regular expression syntax (See Java regular expressions syntax) with C. For example, regexp_replace('foobar', 'oo|ar', ) returns 'fb'.

int

size(Map<K.V>)

Returns the number of elements in the map type.

int

size(Array<T>)

Returns the number of elements in the array type.

value of <type>

cast(<expr> as <type>)

Converts the results of the expression expr to <type>, for example, cast('1' as BIGINT) will convert the string '1' to its integral representation. A null is returned if the conversion does not succeed.

string

from_unixtime(int unixtime)

Convert the number of seconds from the UNIX epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the format of 1970-01-01 00:00:00.

string

to_date(string timestamp)

Return the date part of a timestamp string: to_date("1970-01-01 00:00:00") = "1970-01-01".

int

year(string date)

Return the year part of a date or a timestamp string: year("1970-01-01 00:00:00") = 1970, year("1970-01-01") = 1970.

int

month(string date)

Return the month part of a date or a timestamp string: month("1970-11-01 00:00:00") = 11, month("1970-11-01") = 11.

int

day(string date)

Return the day part of a date or a timestamp string: day("1970-11-01 00:00:00") = 1, day("1970-11-01") = 1.

string

get_json_object(string json_string, string path)

Extract a json object from a json string based on the json path specified, and return json string of the extracted a .json object. It will return null if the input json string is invalid.

 

The following built-in aggregate functions are supported in Hive:

Data type

Functions

Description

BIGINT

count(*), count(expr), count(DISTINCT expr[, expr_.])

count(*) – Returns the total number of retrieved rows, including rows containing NULL values; count(expr) – Returns the number of rows for which the supplied expression is non-NULL; count(DISTINCT expr[, expr]) – Returns the number of rows for which the supplied expressions are unique and non-NULL.

DOUBLE

sum(col), sum(DISTINCT col)

Returns the sum of the elements in the group or the sum of the distinct values of the column in the group.

DOUBLE

avg(col), avg(DISTINCT col)

Returns the average of the elements in the group or the average of the distinct values of the column in the group.

DOUBLE

min(col)

Returns the minimum value of the column in the group.

DOUBLE

max(col)

Returns the maximum value of the column in the group.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset