DataFusion Functions
This page is generated from the Apache DataFusion project's documents:
- DataFusion Scalar Functions
- DataFusion Scalar Functions (NEW)
- DataFusion Aggregate Functions
- DataFusion Window Functions
Scalar Functions
Scalar functions operate on a single row at a time and return a single value.
Math Functions
- abs
- acos
- acosh
- asin
- asinh
- atan
- atan2
- atanh
- cbrt
- ceil
- cos
- cosh
- cot
- degrees
- exp
- factorial
- floor
- gcd
- isnan
- iszero
- lcm
- ln
- log
- log10
- log2
- nanvl
- pi
- pow
- power
- radians
- random
- round
- signum
- sin
- sinh
- sqrt
- tan
- tanh
- trunc
abs
Returns the absolute value of a number.
abs(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
acos
Returns the arc cosine or inverse cosine of a number.
acos(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
acosh
Returns the area hyperbolic cosine or inverse hyperbolic cosine of a number.
acosh(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
asin
Returns the arc sine or inverse sine of a number.
asin(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
asinh
Returns the area hyperbolic sine or inverse hyperbolic sine of a number.
asinh(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
atan
Returns the arc tangent or inverse tangent of a number.
atan(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
atan2
Returns the arc tangent or inverse tangent of expression_y / expression_x.
atan2(expression_y, expression_x)
Arguments
- expression_y: First numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- expression_x: Second numeric expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
atanh
Returns the area hyperbolic tangent or inverse hyperbolic tangent of a number.
atanh(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
cbrt
Returns the cube root of a number.
cbrt(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
ceil
Returns the nearest integer greater than or equal to a number.
ceil(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
cos
Returns the cosine of a number.
cos(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
cosh
Returns the hyperbolic cosine of a number.
cosh(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
cot
Returns the cotangent of a number.
cot(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
degrees
Converts radians to degrees.
degrees(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
exp
Returns the base-e exponential of a number.
exp(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
factorial
Factorial. Returns 1 if value is less than 2.
factorial(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
floor
Returns the nearest integer less than or equal to a number.
floor(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
gcd
Returns the greatest common divisor of expression_x and expression_y. Returns 0 if both inputs are zero.
gcd(expression_x, expression_y)
Arguments
- expression_x: First numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
- expression_y: Second numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
isnan
Returns true if a given number is +NaN or -NaN otherwise returns false.
isnan(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
iszero
Returns true if a given number is +0.0 or -0.0 otherwise returns false.
iszero(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
lcm
Returns the least common multiple of expression_x and expression_y. Returns 0 if either input is zero.
lcm(expression_x, expression_y)
Arguments
- expression_x: First numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
- expression_y: Second numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
ln
Returns the natural logarithm of a number.
ln(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
log
Returns the base-x logarithm of a number. Can either provide a specified base, or if omitted then takes the base-10 of a number.
log(base, numeric_expression)
log(numeric_expression)
Arguments
- base: Base numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
log10
Returns the base-10 logarithm of a number.
log10(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
log2
Returns the base-2 logarithm of a number.
log2(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
nanvl
Returns the first argument if it's not NaN. Returns the second argument otherwise.
nanvl(expression_x, expression_y)
Arguments
- expression_x: Numeric expression to return if it's not NaN. Can be a constant, column, or function, and any combination of arithmetic operators.
- expression_y: Numeric expression to return if the first expression is NaN. Can be a constant, column, or function, and any combination of arithmetic operators.
pi
Returns an approximate value of π.
pi()
pow
Alias of power.
power
Returns a base expression raised to the power of an exponent.
power(base, exponent)
Arguments
- base: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
- exponent: Exponent numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
Aliases
- pow
radians
Converts degrees to radians.
radians(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
random
Returns a random float value in the range [0, 1). The random seed is unique to each row.
random()
round
Rounds a number to the nearest integer.
round(numeric_expression[, decimal_places])
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
- decimal_places: Optional. The number of decimal places to round to. Defaults to 0.
signum
Returns the sign of a number.
Negative numbers return -1.
Zero and positive numbers return 1.
signum(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
sin
Returns the sine of a number.
sin(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
sinh
Returns the hyperbolic sine of a number.
sinh(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
sqrt
Returns the square root of a number.
sqrt(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
tan
Returns the tangent of a number.
tan(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
tanh
Returns the hyperbolic tangent of a number.
tanh(numeric_expression)
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
trunc
Truncates a number to a whole number or truncated to the specified decimal places.
trunc(numeric_expression[, decimal_places])
Arguments
- numeric_expression: Numeric expression to operate on. Can be a constant, column, or function, and any combination of operators.
- decimal_places: Optional. The number of decimal places to
truncate to. Defaults to 0 (truncate to a whole number). If
decimal_placesis a positive integer, truncates digits to the right of the decimal point. Ifdecimal_placesis a negative integer, replaces digits to the left of the decimal point with0.
Conditional Functions
coalesce
Returns the first of its arguments that is not null. Returns null if all arguments are null. This function is often used to substitute a default value for null values.
coalesce(expression1[, ..., expression_n])
Arguments
- expression1, expression_n: Expression to use if previous expressions are null. Can be a constant, column, or function, and any combination of arithmetic operators. Pass as many expression arguments as necessary.
Example
> select coalesce(null, null, 'datafusion');
+----------------------------------------+
| coalesce(NULL,NULL,Utf8("datafusion")) |
+----------------------------------------+
| datafusion |
+----------------------------------------+
greatest
Returns the greatest value in a list of expressions. Returns null if all expressions are null.
greatest(expression1[, ..., expression_n])
Arguments
- expression1, expression_n: Expressions to compare and return the greatest value.. Can be a constant, column, or function, and any combination of arithmetic operators. Pass as many expression arguments as necessary.
Example
> select greatest(4, 7, 5);
+---------------------------+
| greatest(4,7,5) |
+---------------------------+
| 7 |
+---------------------------+
ifnull
Alias of nvl.
least
Returns the smallest value in a list of expressions. Returns null if all expressions are null.
least(expression1[, ..., expression_n])
Arguments
- expression1, expression_n: Expressions to compare and return the smallest value. Can be a constant, column, or function, and any combination of arithmetic operators. Pass as many expression arguments as necessary.
Example
> select least(4, 7, 5);
+---------------------------+
| least(4,7,5) |
+---------------------------+
| 4 |
+---------------------------+
nullif
Returns null if expression1 equals expression2; otherwise it returns expression1.
This can be used to perform the inverse operation of coalesce.
nullif(expression1, expression2)
Arguments
- expression1: Expression to compare and return if equal to expression2. Can be a constant, column, or function, and any combination of operators.
- expression2: Expression to compare to expression1. Can be a constant, column, or function, and any combination of operators.
Example
> select nullif('datafusion', 'data');
+-----------------------------------------+
| nullif(Utf8("datafusion"),Utf8("data")) |
+-----------------------------------------+
| datafusion |
+-----------------------------------------+
> select nullif('datafusion', 'datafusion');
+-----------------------------------------------+
| nullif(Utf8("datafusion"),Utf8("datafusion")) |
+-----------------------------------------------+
| |
+-----------------------------------------------+
nvl
Returns expression2 if expression1 is NULL otherwise it returns expression1.
nvl(expression1, expression2)
Arguments
- expression1: Expression to return if not null. Can be a constant, column, or function, and any combination of operators.
- expression2: Expression to return if expr1 is null. Can be a constant, column, or function, and any combination of operators.
Example
> select nvl(null, 'a');
+---------------------+
| nvl(NULL,Utf8("a")) |
+---------------------+
| a |
+---------------------+\
> select nvl('b', 'a');
+--------------------------+
| nvl(Utf8("b"),Utf8("a")) |
+--------------------------+
| b |
+--------------------------+
Aliases
- ifnull
nvl2
Returns expression2 if expression1 is not NULL; otherwise it returns expression3.
nvl2(expression1, expression2, expression3)
Arguments
- expression1: Expression to test for null. Can be a constant, column, or function, and any combination of operators.
- expression2: Expression to return if expr1 is not null. Can be a constant, column, or function, and any combination of operators.
- expression3: Expression to return if expr1 is null. Can be a constant, column, or function, and any combination of operators.
Example
> select nvl2(null, 'a', 'b');
+--------------------------------+
| nvl2(NULL,Utf8("a"),Utf8("b")) |
+--------------------------------+
| b |
+--------------------------------+
> select nvl2('data', 'a', 'b');
+----------------------------------------+
| nvl2(Utf8("data"),Utf8("a"),Utf8("b")) |
+----------------------------------------+
| a |
+----------------------------------------+
String Functions
- ascii
- bit_length
- btrim
- char_length
- character_length
- chr
- concat
- concat_ws
- contains
- ends_with
- find_in_set
- initcap
- instr
- left
- length
- levenshtein
- lower
- lpad
- ltrim
- octet_length
- overlay
- position
- repeat
- replace
- reverse
- right
- rpad
- rtrim
- split_part
- starts_with
- strpos
- substr
- substr_index
- substring
- substring_index
- to_hex
- translate
- trim
- upper
- uuid
ascii
Returns the Unicode character code of the first character in a string.
ascii(str)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
Example
> select ascii('abc');
+--------------------+
| ascii(Utf8("abc")) |
+--------------------+
| 97 |
+--------------------+
> select ascii('🚀');
+-------------------+
| ascii(Utf8("🚀")) |
+-------------------+
| 128640 |
+-------------------+
Related functions:
bit_length
Returns the bit length of a string.
bit_length(str)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
Example
> select bit_length('datafusion');
+--------------------------------+
| bit_length(Utf8("datafusion")) |
+--------------------------------+
| 80 |
+--------------------------------+
Related functions:
btrim
Trims the specified trim string from the start and end of a string. If no trim string is provided, all whitespace is removed from the start and end of the input string.
btrim(str[, trim_str])
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- trim_str: String expression to operate on. Can be a constant, column, or function, and any combination of operators. Default is whitespace characters.
Example
> select btrim('__datafusion____', '_');
+-------------------------------------------+
| btrim(Utf8("__datafusion____"),Utf8("_")) |
+-------------------------------------------+
| datafusion |
+-------------------------------------------+
Alternative Syntax
trim(BOTH trim_str FROM str)
trim(trim_str FROM str)
Aliases
- trim
Related functions:
char_length
Alias of character_length.
character_length
Returns the number of characters in a string.
character_length(str)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
Example
> select character_length('Ångström');
+------------------------------------+
| character_length(Utf8("Ångström")) |
+------------------------------------+
| 8 |
+------------------------------------+
Aliases
- length
- char_length
Related functions:
chr
Returns the character with the specified ASCII or Unicode code value.
chr(expression)
Arguments
- expression: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
Example
> select chr(128640);
+--------------------+
| chr(Int64(128640)) |
+--------------------+
| 🚀 |
+--------------------+
Related functions:
concat
Concatenates multiple strings together.
concat(str[, ..., str_n])
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- str_n: Subsequent string expressions to concatenate.
Example
> select concat('data', 'f', 'us', 'ion');
+-------------------------------------------------------+
| concat(Utf8("data"),Utf8("f"),Utf8("us"),Utf8("ion")) |
+-------------------------------------------------------+
| datafusion |
+-------------------------------------------------------+
Related functions:
concat_ws
Concatenates multiple strings together with a specified separator.
concat_ws(separator, str[, ..., str_n])
Arguments
- separator: Separator to insert between concatenated strings.
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- str_n: Subsequent string expressions to concatenate.
Example
> select concat_ws('_', 'data', 'fusion');
+--------------------------------------------------+
| concat_ws(Utf8("_"),Utf8("data"),Utf8("fusion")) |
+--------------------------------------------------+
| data_fusion |
+--------------------------------------------------+
Related functions:
contains
Return true if search_str is found within string (case-sensitive).
contains(str, search_str)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- search_str: The string to search for in str.
Example
> select contains('the quick brown fox', 'row');
+---------------------------------------------------+
| contains(Utf8("the quick brown fox"),Utf8("row")) |
+---------------------------------------------------+
| true |
+---------------------------------------------------+
ends_with
Tests if a string ends with a substring.
ends_with(str, substr)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- substr: Substring to test for.
Example
> select ends_with('datafusion', 'soin');
+--------------------------------------------+
| ends_with(Utf8("datafusion"),Utf8("soin")) |
+--------------------------------------------+
| false |
+--------------------------------------------+
> select ends_with('datafusion', 'sion');
+--------------------------------------------+
| ends_with(Utf8("datafusion"),Utf8("sion")) |
+--------------------------------------------+
| true |
+--------------------------------------------+
find_in_set
Returns a value in the range of 1 to N if the string str is in the string list strlist consisting of N substrings.
find_in_set(str, strlist)
Arguments
- str: String expression to find in strlist.
- strlist: A string list is a string composed of substrings separated by , characters.
Example
> select find_in_set('b', 'a,b,c,d');
+----------------------------------------+
| find_in_set(Utf8("b"),Utf8("a,b,c,d")) |
+----------------------------------------+
| 2 |
+----------------------------------------+
initcap
Capitalizes the first character in each word in the input string. Words are delimited by non-alphanumeric characters.
initcap(str)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
Example
> select initcap('apache datafusion');
+------------------------------------+
| initcap(Utf8("apache datafusion")) |
+------------------------------------+
| Apache Datafusion |
+------------------------------------+
Related functions:
instr
Alias of strpos.
left
Returns a specified number of characters from the left side of a string.
left(str, n)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- n: Number of characters to return.
Example
> select left('datafusion', 4);
+-----------------------------------+
| left(Utf8("datafusion"),Int64(4)) |
+-----------------------------------+
| data |
+-----------------------------------+
Related functions:
length
Alias of character_length.
levenshtein
Returns the Levenshtein distance between the two given strings.
levenshtein(str1, str2)
Arguments
- str1: String expression to compute Levenshtein distance with str2.
- str2: String expression to compute Levenshtein distance with str1.
Example
> select levenshtein('kitten', 'sitting');
+---------------------------------------------+
| levenshtein(Utf8("kitten"),Utf8("sitting")) |
+---------------------------------------------+
| 3 |
+---------------------------------------------+
lower
Converts a string to lower-case.
lower(str)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
Example
> select lower('Ångström');
+-------------------------+
| lower(Utf8("Ångström")) |
+-------------------------+
| ångström |
+-------------------------+
Related functions:
lpad
Pads the left side of a string with another string to a specified string length.
lpad(str, n[, padding_str])
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- n: String length to pad to.
- padding_str: Optional string expression to pad with. Can be a constant, column, or function, and any combination of string operators. Default is a space.
Example
> select lpad('Dolly', 10, 'hello');
+---------------------------------------------+
| lpad(Utf8("Dolly"),Int64(10),Utf8("hello")) |
+---------------------------------------------+
| helloDolly |
+---------------------------------------------+
Related functions:
ltrim
Trims the specified trim string from the beginning of a string. If no trim string is provided, all whitespace is removed from the start of the input string.
ltrim(str[, trim_str])
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- trim_str: String expression to trim from the beginning of the input string. Can be a constant, column, or function, and any combination of arithmetic operators. Default is whitespace characters.
Example
> select ltrim(' datafusion ');
+-------------------------------+
| ltrim(Utf8(" datafusion ")) |
+-------------------------------+
| datafusion |
+-------------------------------+
> select ltrim('___datafusion___', '_');
+-------------------------------------------+
| ltrim(Utf8("___datafusion___"),Utf8("_")) |
+-------------------------------------------+
| datafusion___ |
+-------------------------------------------+
Alternative Syntax
trim(LEADING trim_str FROM str)
Related functions:
octet_length
Returns the length of a string in bytes.
octet_length(str)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
Example
> select octet_length('Ångström');
+--------------------------------+
| octet_length(Utf8("Ångström")) |
+--------------------------------+
| 10 |
+--------------------------------+
Related functions:
overlay
Returns the string which is replaced by another string from the specified position and specified count length.
overlay(str PLACING substr FROM pos [FOR count])
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- substr: Substring to replace in str.
- pos: The start position to start the replace in str.
- count: The count of characters to be replaced from start position of str. If not specified, will use substr length instead.
Example
> select overlay('Txxxxas' placing 'hom' from 2 for 4);
+--------------------------------------------------------+
| overlay(Utf8("Txxxxas"),Utf8("hom"),Int64(2),Int64(4)) |
+--------------------------------------------------------+
| Thomas |
+--------------------------------------------------------+
position
Alias of strpos.
repeat
Returns a string with an input string repeated a specified number.
repeat(str, n)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- n: Number of times to repeat the input string.
Example
> select repeat('data', 3);
+-------------------------------+
| repeat(Utf8("data"),Int64(3)) |
+-------------------------------+
| datadatadata |
+-------------------------------+
replace
Replaces all occurrences of a specified substring in a string with a new substring.
replace(str, substr, replacement)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- substr: Substring expression to replace in the input string. Substring expression to operate on. Can be a constant, column, or function, and any combination of operators.
- replacement: Replacement substring expression to operate on. Can be a constant, column, or function, and any combination of operators.
Example
> select replace('ABabbaBA', 'ab', 'cd');
+-------------------------------------------------+
| replace(Utf8("ABabbaBA"),Utf8("ab"),Utf8("cd")) |
+-------------------------------------------------+
| ABcdbaBA |
+-------------------------------------------------+
reverse
Reverses the character order of a string.
reverse(str)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
Example
> select reverse('datafusion');
+-----------------------------+
| reverse(Utf8("datafusion")) |
+-----------------------------+
| noisufatad |
+-----------------------------+
right
Returns a specified number of characters from the right side of a string.
right(str, n)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- n: Number of characters to return.
Example
> select right('datafusion', 6);
+------------------------------------+
| right(Utf8("datafusion"),Int64(6)) |
+------------------------------------+
| fusion |
+------------------------------------+
Related functions:
rpad
Pads the right side of a string with another string to a specified string length.
rpad(str, n[, padding_str])
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- n: String length to pad to.
- padding_str: String expression to pad with. Can be a constant, column, or function, and any combination of string operators. Default is a space.
Example
> select rpad('datafusion', 20, '_-');
+-----------------------------------------------+
| rpad(Utf8("datafusion"),Int64(20),Utf8("_-")) |
+-----------------------------------------------+
| datafusion_-_-_-_-_- |
+-----------------------------------------------+
Related functions:
rtrim
Trims the specified trim string from the end of a string. If no trim string is provided, all whitespace is removed from the end of the input string.
rtrim(str[, trim_str])
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- trim_str: String expression to trim from the end of the input string. Can be a constant, column, or function, and any combination of arithmetic operators. Default is whitespace characters.
Example
> select rtrim(' datafusion ');
+-------------------------------+
| rtrim(Utf8(" datafusion ")) |
+-------------------------------+
| datafusion |
+-------------------------------+
> select rtrim('___datafusion___', '_');
+-------------------------------------------+
| rtrim(Utf8("___datafusion___"),Utf8("_")) |
+-------------------------------------------+
| ___datafusion |
+-------------------------------------------+
Alternative Syntax
trim(TRAILING trim_str FROM str)
Related functions:
split_part
Splits a string based on a specified delimiter and returns the substring in the specified position.
split_part(str, delimiter, pos)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- delimiter: String or character to split on.
- pos: Position of the part to return.
Example
> select split_part('1.2.3.4.5', '.', 3);
+--------------------------------------------------+
| split_part(Utf8("1.2.3.4.5"),Utf8("."),Int64(3)) |
+--------------------------------------------------+
| 3 |
+--------------------------------------------------+
starts_with
Tests if a string starts with a substring.
starts_with(str, substr)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- substr: Substring to test for.
Example
> select starts_with('datafusion','data');
+----------------------------------------------+
| starts_with(Utf8("datafusion"),Utf8("data")) |
+----------------------------------------------+
| true |
+----------------------------------------------+
strpos
Returns the starting position of a specified substring in a string. Positions begin at 1. If the substring does not exist in the string, the function returns 0.
strpos(str, substr)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- substr: Substring expression to search for.
Example
> select strpos('datafusion', 'fus');
+----------------------------------------+
| strpos(Utf8("datafusion"),Utf8("fus")) |
+----------------------------------------+
| 5 |
+----------------------------------------+
Alternative Syntax
position(substr in origstr)
Aliases
- instr
- position
substr
Extracts a substring of a specified number of characters from a specific starting position in a string.
substr(str, start_pos[, length])
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- start_pos: Character position to start the substring at. The first character in the string has a position of 1.
- length: Number of characters to extract. If not specified, returns the rest of the string after the start position.
Example
> select substr('datafusion', 5, 3);
+----------------------------------------------+
| substr(Utf8("datafusion"),Int64(5),Int64(3)) |
+----------------------------------------------+
| fus |
+----------------------------------------------+
Alternative Syntax
substring(str from start_pos for length)
Aliases
- substring
substr_index
Returns the substring from str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned.
substr_index(str, delim, count)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- delim: The string to find in str to split str.
- count: The number of times to search for the delimiter. Can be either a positive or negative number.
Example
> select substr_index('www.apache.org', '.', 1);
+---------------------------------------------------------+
| substr_index(Utf8("www.apache.org"),Utf8("."),Int64(1)) |
+---------------------------------------------------------+
| www |
+---------------------------------------------------------+
> select substr_index('www.apache.org', '.', -1);
+----------------------------------------------------------+
| substr_index(Utf8("www.apache.org"),Utf8("."),Int64(-1)) |
+----------------------------------------------------------+
| org |
+----------------------------------------------------------+
Aliases
- substring_index
substring
Alias of substr.
substring_index
Alias of substr_index.
to_hex
Converts an integer to a hexadecimal string.
to_hex(int)
Arguments
- int: Integer expression to operate on. Can be a constant, column, or function, and any combination of operators.
Example
> select to_hex(12345689);
+-------------------------+
| to_hex(Int64(12345689)) |
+-------------------------+
| bc6159 |
+-------------------------+
translate
Translates characters in a string to specified translation characters.
translate(str, chars, translation)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- chars: Characters to translate.
- translation: Translation characters. Translation characters replace only characters at the same position in the chars string.
Example
> select translate('twice', 'wic', 'her');
+--------------------------------------------------+
| translate(Utf8("twice"),Utf8("wic"),Utf8("her")) |
+--------------------------------------------------+
| there |
+--------------------------------------------------+
trim
Alias of btrim.
upper
Converts a string to upper-case.
upper(str)
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
Example
> select upper('dataFusion');
+---------------------------+
| upper(Utf8("dataFusion")) |
+---------------------------+
| DATAFUSION |
+---------------------------+
Related functions:
uuid
Returns UUID v4 string value which is unique per row.
uuid()
Example
> select uuid();
+--------------------------------------+
| uuid() |
+--------------------------------------+
| 6ec17ef8-1934-41cc-8d59-d0c8f9eea1f0 |
+--------------------------------------+
Binary String Functions
decode
Decode binary data from textual representation in string.
decode(expression, format)
Arguments
- expression: Expression containing encoded string data
- format: Same arguments as encode
Related functions:
encode
Encode binary data into a textual representation.
encode(expression, format)
Arguments
- expression: Expression containing string or binary data
- format: Supported formats are:
base64,hex
Related functions:
Regular Expression Functions
Apache DataFusion uses a PCRE-like regular expression syntax (minus support for several features including look-around and backreferences). The following regular expression functions are supported:
regexp_count
Returns the number of matches that a regular expression has in a string.
regexp_count(str, regexp[, start, flags])
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- regexp: Regular expression to operate on. Can be a constant, column, or function, and any combination of operators.
- start: - start: Optional start position (the first position is 1) to search for the regular expression. Can be a constant, column, or function.
- flags: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
- i: case-insensitive: letters match both upper and lower case
- m: multi-line mode: ^ and $ match begin/end of line
- s: allow . to match \n
- R: enables CRLF mode: when multi-line mode is enabled, \r\n is used
- U: swap the meaning of x* and x*?
Example
> select regexp_count('abcAbAbc', 'abc', 2, 'i');
+---------------------------------------------------------------+
| regexp_count(Utf8("abcAbAbc"),Utf8("abc"),Int64(2),Utf8("i")) |
+---------------------------------------------------------------+
| 1 |
+---------------------------------------------------------------+
regexp_like
Returns true if a regular expression has at least one match in a string, false otherwise.
regexp_like(str, regexp[, flags])
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- regexp: Regular expression to operate on. Can be a constant, column, or function, and any combination of operators.
- flags: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
- i: case-insensitive: letters match both upper and lower case
- m: multi-line mode: ^ and $ match begin/end of line
- s: allow . to match \n
- R: enables CRLF mode: when multi-line mode is enabled, \r\n is used
- U: swap the meaning of x* and x*?
Example
select regexp_like('Köln', '[a-zA-Z]ö[a-zA-Z]{2}');
+--------------------------------------------------------+
| regexp_like(Utf8("Köln"),Utf8("[a-zA-Z]ö[a-zA-Z]{2}")) |
+--------------------------------------------------------+
| true |
+--------------------------------------------------------+
SELECT regexp_like('aBc', '(b|d)', 'i');
+--------------------------------------------------+
| regexp_like(Utf8("aBc"),Utf8("(b|d)"),Utf8("i")) |
+--------------------------------------------------+
| true |
+--------------------------------------------------+
Additional examples can be found here
regexp_match
Returns the first regular expression matches in a string.
regexp_match(str, regexp[, flags])
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- regexp: Regular expression to match against. Can be a constant, column, or function.
- flags: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
- i: case-insensitive: letters match both upper and lower case
- m: multi-line mode: ^ and $ match begin/end of line
- s: allow . to match \n
- R: enables CRLF mode: when multi-line mode is enabled, \r\n is used
- U: swap the meaning of x* and x*?
Example
> select regexp_match('Köln', '[a-zA-Z]ö[a-zA-Z]{2}');
+---------------------------------------------------------+
| regexp_match(Utf8("Köln"),Utf8("[a-zA-Z]ö[a-zA-Z]{2}")) |
+---------------------------------------------------------+
| [Köln] |
+---------------------------------------------------------+
SELECT regexp_match('aBc', '(b|d)', 'i');
+---------------------------------------------------+
| regexp_match(Utf8("aBc"),Utf8("(b|d)"),Utf8("i")) |
+---------------------------------------------------+
| [B] |
+---------------------------------------------------+
Additional examples can be found here
regexp_replace
Replaces substrings in a string that match a regular expression.
regexp_replace(str, regexp, replacement[, flags])
Arguments
- str: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- regexp: Regular expression to match against. Can be a constant, column, or function.
- replacement: Replacement string expression to operate on. Can be a constant, column, or function, and any combination of operators.
- flags: Optional regular expression flags that control the behavior of the regular expression. The following flags are supported:
- g: (global) Search globally and don't return after the first match
- i: case-insensitive: letters match both upper and lower case
- m: multi-line mode: ^ and $ match begin/end of line
- s: allow . to match \n
- R: enables CRLF mode: when multi-line mode is enabled, \r\n is used
- U: swap the meaning of x* and x*?
Example
> select regexp_replace('foobarbaz', 'b(..)', 'X\\1Y', 'g');
+------------------------------------------------------------------------+
| regexp_replace(Utf8("foobarbaz"),Utf8("b(..)"),Utf8("X\1Y"),Utf8("g")) |
+------------------------------------------------------------------------+
| fooXarYXazY |
+------------------------------------------------------------------------+
SELECT regexp_replace('aBc', '(b|d)', 'Ab\\1a', 'i');
+-------------------------------------------------------------------+
| regexp_replace(Utf8("aBc"),Utf8("(b|d)"),Utf8("Ab\1a"),Utf8("i")) |
+-------------------------------------------------------------------+
| aAbBac |
+-------------------------------------------------------------------+
Additional examples can be found here
Time and Date Functions
- current_date
- current_time
- current_timestamp
- date_bin
- date_format
- date_part
- date_trunc
- datepart
- datetrunc
- from_unixtime
- make_date
- now
- to_char
- to_date
- to_local_time
- to_timestamp
- to_timestamp_micros
- to_timestamp_millis
- to_timestamp_nanos
- to_timestamp_seconds
- to_unixtime
- today
current_date
Returns the current UTC date.
The current_date() return value is determined at query time and will return the same date, no matter when in the query plan the function executes.
current_date()
Aliases
- today
current_time
Returns the current UTC time.
The current_time() return value is determined at query time and will return the same time, no matter when in the query plan the function executes.
current_time()
current_timestamp
Alias of now.
date_bin
Calculates time intervals and returns the start of the interval nearest to the specified timestamp. Use date_bin to downsample time series data by grouping rows into time-based "bins" or "windows" and applying an aggregate or selector function to each window.
For example, if you "bin" or "window" data into 15 minute intervals, an input timestamp of 2023-01-01T18:18:18Z will be updated to the start time of the 15 minute bin it is in: 2023-01-01T18:15:00Z.
date_bin(interval, expression, origin-timestamp)
Arguments
-
interval: Bin interval.
-
expression: Time expression to operate on. Can be a constant, column, or function.
-
origin-timestamp: Optional. Starting point used to determine bin boundaries. If not specified defaults 1970-01-01T00:00:00Z (the UNIX epoch in UTC). The following intervals are supported:
- nanoseconds
- microseconds
- milliseconds
- seconds
- minutes
- hours
- days
- weeks
- months
- years
- century
Example
-- Bin the timestamp into 1 day intervals
> SELECT date_bin(interval '1 day', time) as bin
FROM VALUES ('2023-01-01T18:18:18Z'), ('2023-01-03T19:00:03Z') t(time);
+---------------------+
| bin |
+---------------------+
| 2023-01-01T00:00:00 |
| 2023-01-03T00:00:00 |
+---------------------+
2 row(s) fetched.
-- Bin the timestamp into 1 day intervals starting at 3AM on 2023-01-01
> SELECT date_bin(interval '1 day', time, '2023-01-01T03:00:00') as bin
FROM VALUES ('2023-01-01T18:18:18Z'), ('2023-01-03T19:00:03Z') t(time);
+---------------------+
| bin |
+---------------------+
| 2023-01-01T03:00:00 |
| 2023-01-03T03:00:00 |
+---------------------+
2 row(s) fetched.
date_format
Alias of to_char.
date_part
Returns the specified part of the date as an integer.
date_part(part, expression)
Arguments
-
part: Part of the date to return. The following date parts are supported:
- year
- quarter (emits value in inclusive range [1, 4] based on which quartile of the year the date is in)
- month
- week (week of the year)
- day (day of the month)
- hour
- minute
- second
- millisecond
- microsecond
- nanosecond
- dow (day of the week)
- doy (day of the year)
- epoch (seconds since Unix epoch)
-
expression: Time expression to operate on. Can be a constant, column, or function.
Alternative Syntax
extract(field FROM source)
Aliases
- datepart
date_trunc
Truncates a timestamp value to a specified precision.
date_trunc(precision, expression)
Arguments
-
precision: Time precision to truncate to. The following precisions are supported:
- year / YEAR
- quarter / QUARTER
- month / MONTH
- week / WEEK
- day / DAY
- hour / HOUR
- minute / MINUTE
- second / SECOND
-
expression: Time expression to operate on. Can be a constant, column, or function.
Aliases
- datetrunc
datepart
Alias of date_part.
datetrunc
Alias of date_trunc.
from_unixtime
Converts an integer to RFC3339 timestamp format (YYYY-MM-DDT00:00:00.000000000Z). Integers and unsigned integers are interpreted as seconds since the unix epoch (1970-01-01T00:00:00Z) return the corresponding timestamp.
from_unixtime(expression[, timezone])
Arguments
- expression: The expression to operate on. Can be a constant, column, or function, and any combination of operators.
- timezone: Optional timezone to use when converting the integer to a timestamp. If not provided, the default timezone is UTC.
Example
> select from_unixtime(1599572549, 'America/New_York');
+-----------------------------------------------------------+
| from_unixtime(Int64(1599572549),Utf8("America/New_York")) |
+-----------------------------------------------------------+
| 2020-09-08T09:42:29-04:00 |
+-----------------------------------------------------------+
make_date
Make a date from year/month/day component parts.
make_date(year, month, day)
Arguments
- year: Year to use when making the date. Can be a constant, column or function, and any combination of arithmetic operators.
- month: Month to use when making the date. Can be a constant, column or function, and any combination of arithmetic operators.
- day: Day to use when making the date. Can be a constant, column or function, and any combination of arithmetic operators.
Example
> select make_date(2023, 1, 31);
+-------------------------------------------+
| make_date(Int64(2023),Int64(1),Int64(31)) |
+-------------------------------------------+
| 2023-01-31 |
+-------------------------------------------+
> select make_date('2023', '01', '31');
+-----------------------------------------------+
| make_date(Utf8("2023"),Utf8("01"),Utf8("31")) |
+-----------------------------------------------+
| 2023-01-31 |
+-----------------------------------------------+
Additional examples can be found here
now
Returns the current UTC timestamp.
The now() return value is determined at query time and will return the same timestamp, no matter when in the query plan the function executes.
now()
Aliases
- current_timestamp
to_char
Returns a string representation of a date, time, timestamp or duration based on a Chrono format. Unlike the PostgreSQL equivalent of this function numerical formatting is not supported.
to_char(expression, format)
Arguments
- expression: Expression to operate on. Can be a constant, column, or function that results in a date, time, timestamp or duration.
- format: A Chrono format string to use to convert the expression.
- day: Day to use when making the date. Can be a constant, column or function, and any combination of arithmetic operators.
Example
> select to_char('2023-03-01'::date, '%d-%m-%Y');
+----------------------------------------------+
| to_char(Utf8("2023-03-01"),Utf8("%d-%m-%Y")) |
+----------------------------------------------+
| 01-03-2023 |
+----------------------------------------------+
Additional examples can be found here
Aliases
- date_format
to_date
Converts a value to a date (YYYY-MM-DD).
Supports strings, integer and double types as input.
Strings are parsed as YYYY-MM-DD (e.g. '2023-07-20') if no Chrono formats are provided.
Integers and doubles are interpreted as days since the unix epoch (1970-01-01T00:00:00Z).
Returns the corresponding date.
Note: to_date returns Date32, which represents its values as the number of days since unix epoch(1970-01-01) stored as signed 32 bit value. The largest supported date value is 9999-12-31.
to_date('2017-05-31', '%Y-%m-%d')
Arguments
- expression: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
- format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.
Example
> select to_date('2023-01-31');
+-------------------------------+
| to_date(Utf8("2023-01-31")) |
+-------------------------------+
| 2023-01-31 |
+-------------------------------+
> select to_date('2023/01/31', '%Y-%m-%d', '%Y/%m/%d');
+---------------------------------------------------------------------+
| to_date(Utf8("2023/01/31"),Utf8("%Y-%m-%d"),Utf8("%Y/%m/%d")) |
+---------------------------------------------------------------------+
| 2023-01-31 |
+---------------------------------------------------------------------+
Additional examples can be found here
to_local_time
Converts a timestamp with a timezone to a timestamp without a timezone (with no offset or timezone information). This function handles daylight saving time changes.
to_local_time(expression)
Arguments
- expression: Time expression to operate on. Can be a constant, column, or function.
Example
> SELECT to_local_time('2024-04-01T00:00:20Z'::timestamp);
+---------------------------------------------+
| to_local_time(Utf8("2024-04-01T00:00:20Z")) |
+---------------------------------------------+
| 2024-04-01T00:00:20 |
+---------------------------------------------+
> SELECT to_local_time('2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels');
+---------------------------------------------+
| to_local_time(Utf8("2024-04-01T00:00:20Z")) |
+---------------------------------------------+
| 2024-04-01T00:00:20 |
+---------------------------------------------+
> SELECT
time,
arrow_typeof(time) as type,
to_local_time(time) as to_local_time,
arrow_typeof(to_local_time(time)) as to_local_time_type
FROM (
SELECT '2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels' AS time
);
+---------------------------+------------------------------------------------+---------------------+-----------------------------+
| time | type | to_local_time | to_local_time_type |
+---------------------------+------------------------------------------------+---------------------+-----------------------------+
| 2024-04-01T00:00:20+02:00 | Timestamp(Nanosecond, Some("Europe/Brussels")) | 2024-04-01T00:00:20 | Timestamp(Nanosecond, None) |
+---------------------------+------------------------------------------------+---------------------+-----------------------------+
## combine `to_local_time()` with `date_bin()` to bin on boundaries in the timezone rather
## than UTC boundaries
> SELECT date_bin(interval '1 day', to_local_time('2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels')) AS date_bin;
+---------------------+
| date_bin |
+---------------------+
| 2024-04-01T00:00:00 |
+---------------------+
> SELECT date_bin(interval '1 day', to_local_time('2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels')) AT TIME ZONE 'Europe/Brussels' AS date_bin_with_timezone;
+---------------------------+
| date_bin_with_timezone |
+---------------------------+
| 2024-04-01T00:00:00+02:00 |
+---------------------------+
to_timestamp
Converts a value to a timestamp (YYYY-MM-DDT00:00:00Z). Supports strings, integer, unsigned integer, and double types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no [Chrono formats] are provided. Integers, unsigned integers, and doubles are interpreted as seconds since the unix epoch (1970-01-01T00:00:00Z). Returns the corresponding timestamp.
Note: to_timestamp returns Timestamp(Nanosecond). The supported range for integer input is between -9223372037 and 9223372036. Supported range for string input is between 1677-09-21T00:12:44.0 and 2262-04-11T23:47:16.0. Please use to_timestamp_seconds for the input outside of supported bounds.
to_timestamp(expression[, ..., format_n])
Arguments
- expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.
Example
> select to_timestamp('2023-01-31T09:26:56.123456789-05:00');
+-----------------------------------------------------------+
| to_timestamp(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+-----------------------------------------------------------+
| 2023-01-31T14:26:56.123456789 |
+-----------------------------------------------------------+
> select to_timestamp('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+--------------------------------------------------------------------------------------------------------+
| to_timestamp(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+--------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00.123456789 |
+--------------------------------------------------------------------------------------------------------+
Additional examples can be found here
to_timestamp_micros
Converts a value to a timestamp (YYYY-MM-DDT00:00:00.000000Z). Supports strings, integer, and unsigned integer types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided. Integers and unsigned integers are interpreted as microseconds since the unix epoch (1970-01-01T00:00:00Z) Returns the corresponding timestamp.
to_timestamp_micros(expression[, ..., format_n])
Arguments
- expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.
Example
> select to_timestamp_micros('2023-01-31T09:26:56.123456789-05:00');
+------------------------------------------------------------------+
| to_timestamp_micros(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+------------------------------------------------------------------+
| 2023-01-31T14:26:56.123456 |
+------------------------------------------------------------------+
> select to_timestamp_micros('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+---------------------------------------------------------------------------------------------------------------+
| to_timestamp_micros(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+---------------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00.123456 |
+---------------------------------------------------------------------------------------------------------------+
Additional examples can be found here
to_timestamp_millis
Converts a value to a timestamp (YYYY-MM-DDT00:00:00.000Z). Supports strings, integer, and unsigned integer types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided. Integers and unsigned integers are interpreted as milliseconds since the unix epoch (1970-01-01T00:00:00Z). Returns the corresponding timestamp.
to_timestamp_millis(expression[, ..., format_n])
Arguments
- expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.
Example
> select to_timestamp_millis('2023-01-31T09:26:56.123456789-05:00');
+------------------------------------------------------------------+
| to_timestamp_millis(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+------------------------------------------------------------------+
| 2023-01-31T14:26:56.123 |
+------------------------------------------------------------------+
> select to_timestamp_millis('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+---------------------------------------------------------------------------------------------------------------+
| to_timestamp_millis(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+---------------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00.123 |
+---------------------------------------------------------------------------------------------------------------+
Additional examples can be found here
to_timestamp_nanos
Converts a value to a timestamp (YYYY-MM-DDT00:00:00.000000000Z). Supports strings, integer, and unsigned integer types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided. Integers and unsigned integers are interpreted as nanoseconds since the unix epoch (1970-01-01T00:00:00Z). Returns the corresponding timestamp.
to_timestamp_nanos(expression[, ..., format_n])
Arguments
- expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.
Example
> select to_timestamp_nanos('2023-01-31T09:26:56.123456789-05:00');
+-----------------------------------------------------------------+
| to_timestamp_nanos(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+-----------------------------------------------------------------+
| 2023-01-31T14:26:56.123456789 |
+-----------------------------------------------------------------+
> select to_timestamp_nanos('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+--------------------------------------------------------------------------------------------------------------+
| to_timestamp_nanos(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+--------------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00.123456789 |
+---------------------------------------------------------------------------------------------------------------+
Additional examples can be found here
to_timestamp_seconds
Converts a value to a timestamp (YYYY-MM-DDT00:00:00.000Z). Supports strings, integer, and unsigned integer types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided. Integers and unsigned integers are interpreted as seconds since the unix epoch (1970-01-01T00:00:00Z). Returns the corresponding timestamp.
to_timestamp_seconds(expression[, ..., format_n])
Arguments
- expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.
Example
> select to_timestamp_seconds('2023-01-31T09:26:56.123456789-05:00');
+-------------------------------------------------------------------+
| to_timestamp_seconds(Utf8("2023-01-31T09:26:56.123456789-05:00")) |
+-------------------------------------------------------------------+
| 2023-01-31T14:26:56 |
+-------------------------------------------------------------------+
> select to_timestamp_seconds('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y');
+----------------------------------------------------------------------------------------------------------------+
| to_timestamp_seconds(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) |
+----------------------------------------------------------------------------------------------------------------+
| 2023-05-17T03:59:00 |
+----------------------------------------------------------------------------------------------------------------+
Additional examples can be found here
to_unixtime
Converts a value to seconds since the unix epoch (1970-01-01T00:00:00Z). Supports strings, dates, timestamps and double types as input. Strings are parsed as RFC3339 (e.g. '2023-07-20T05:44:00') if no Chrono formats are provided.
to_unixtime(expression[, ..., format_n])
Arguments
- expression: Expression to operate on. Can be a constant, column, or function, and any combination of arithmetic operators.
- format_n: Optional Chrono format strings to use to parse the expression. Formats will be tried in the order they appear with the first successful one being returned. If none of the formats successfully parse the expression an error will be returned.
Example
> select to_unixtime('2020-09-08T12:00:00+00:00');
+------------------------------------------------+
| to_unixtime(Utf8("2020-09-08T12:00:00+00:00")) |
+------------------------------------------------+
| 1599566400 |
+------------------------------------------------+
> select to_unixtime('01-14-2023 01:01:30+05:30', '%q', '%d-%m-%Y %H/%M/%S', '%+', '%m-%d-%Y %H:%M:%S%#z');
+-----------------------------------------------------------------------------------------------------------------------------+
| to_unixtime(Utf8("01-14-2023 01:01:30+05:30"),Utf8("%q"),Utf8("%d-%m-%Y %H/%M/%S"),Utf8("%+"),Utf8("%m-%d-%Y %H:%M:%S%#z")) |
+-----------------------------------------------------------------------------------------------------------------------------+
| 1673638290 |
+-----------------------------------------------------------------------------------------------------------------------------+
today
Alias of current_date.
Array Functions
- array_any_value
- array_append
- array_cat
- array_concat
- array_contains
- array_dims
- array_distance
- array_distinct
- array_element
- array_empty
- array_except
- array_extract
- array_has
- array_has_all
- array_has_any
- array_indexof
- array_intersect
- array_join
- array_length
- array_max
- array_ndims
- array_pop_back
- array_pop_front
- array_position
- array_positions
- array_prepend
- array_push_back
- array_push_front
- array_remove
- array_remove_all
- array_remove_n
- array_repeat
- array_replace
- array_replace_all
- array_replace_n
- array_resize
- array_reverse
- array_slice
- array_sort
- array_to_string
- array_union
- arrays_overlap
- cardinality
- empty
- flatten
- generate_series
- list_any_value
- list_append
- list_cat
- list_concat
- list_contains
- list_dims
- list_distance
- list_distinct
- list_element
- list_empty
- list_except
- list_extract
- list_has
- list_has_all
- list_has_any
- list_indexof
- list_intersect
- list_join
- list_length
- list_max
- list_ndims
- list_pop_back
- list_pop_front
- list_position
- list_positions
- list_prepend
- list_push_back
- list_push_front
- list_remove
- list_remove_all
- list_remove_n
- list_repeat
- list_replace
- list_replace_all
- list_replace_n
- list_resize
- list_reverse
- list_slice
- list_sort
- list_to_string
- list_union
- make_array
- make_list
- range
- string_to_array
- string_to_list
array_any_value
Returns the first non-null element in the array.
array_any_value(array)
Arguments
- array: Array expression. Can be a constant, column, or function, and any combination of array operators.
Example
> select array_any_value([NULL, 1, 2, 3]);
+-------------------------------+
| array_any_value(List([NULL,1,2,3])) |
+-------------------------------------+
| 1 |
+-------------------------------------+
Aliases
- list_any_value
array_append
Appends an element to the end of an array.
array_append(array, element)
Arguments
- array: Array expression. Can be a constant, column, or function, and any combination of array operators.
- element: Element to append to the array.
Example
> select array_append([1, 2, 3], 4);
+--------------------------------------+
| array_append(List([1,2,3]),Int64(4)) |
+--------------------------------------+
| [1, 2, 3, 4] |
+--------------------------------------+
Aliases
- list_append
- array_push_back
- list_push_back
array_cat
Alias of array_concat.
array_concat
Concatenates arrays.
array_concat(array[, ..., array_n])
Arguments
- array: Array expression. Can be a constant, column, or function, and any combination of array operators.
- array_n: Subsequent array column or literal array to concatenate.
Example
> select array_concat([1, 2], [3, 4], [5, 6]);
+---------------------------------------------------+
| array_concat(List([1,2]),List([3,4]),List([5,6])) |
+---------------------------------------------------+
| [1, 2, 3, 4, 5, 6] |
+---------------------------------------------------+
Aliases
- array_cat
- list_concat
- list_cat
array_contains
Alias of array_has.
array_dims
Returns an array of the array's dimensions.
array_dims(array)
Arguments
- array: Array expression. Can be a constant, column, or function, and any combination of array operators.
Example
> select array_dims([[1, 2, 3], [4, 5, 6]]);
+---------------------------------+
| array_dims(List([1,2,3,4,5,6])) |
+---------------------------------+
| [2, 3] |
+---------------------------------+
Aliases
- list_dims
array_distance
Returns the Euclidean distance between two input arrays of equal length.
array_distance(array1, array2)
Arguments
- array1: Array expression. Can be a constant, column, or function, and any combination of array operators.
- array2: Array expression. Can be a constant, column, or function, and any combination of array operators.
Example
> select array_distance([1, 2], [1, 4]);
+------------------------------------+
| array_distance(List([1,2], [1,4])) |
+------------------------------------+
| 2.0 |
+------------------------------------+
Aliases
- list_distance
array_distinct
Returns distinct values from the array after removing duplicates.
array_distinct(array)
Arguments
- array: Array expression. Can be a constant, column, or function, and any combination of array operators.
Example
> select array_distinct([1, 3, 2, 3, 1, 2, 4]);
+---------------------------------+
| array_distinct(List([1,2,3,4])) |
+---------------------------------+
| [1, 2, 3, 4] |
+---------------------------------+
Aliases
- list_distinct
array_element
Extracts the element with the index n from the array.
array_element(array, index)
Arguments
- array: Array expression. Can be a constant, column, or function, and any combination of array operators.
- index: Index to extract the element from the array.
Example
> select array_element([1, 2, 3, 4], 3);
+-----------------------------------------+
| array_element(List([1,2,3,4]),Int64(3)) |
+-----------------------------------------+
| 3 |
+-----------------------------------------+
Aliases
- array_extract
- list_element
- list_extract
array_empty
Alias of empty.
array_except
Returns an array of the elements that appear in the first array but not in the second.
array_except(array1, array2)
Arguments
- array1: Array expression. Can be a constant, column, or function, and any combination of array operators.
- array2: Array expression. Can be a constant, column, or function, and any combination of array operators.
Example
> select array_except([1, 2, 3, 4], [5, 6, 3, 4]);
+----------------------------------------------------+
| array_except([1, 2, 3, 4], [5, 6, 3, 4]); |
+----------------------------------------------------+
| [1, 2] |
+----------------------------------------------------+
> select array_except([1, 2, 3, 4], [3, 4, 5, 6]);
+----------------------------------------------------+
| array_except([1, 2, 3, 4], [3, 4, 5, 6]); |
+----------------------------------------------------+
| [1, 2] |
+----------------------------------------------------+
Aliases
- list_except
array_extract
Alias of array_element.
array_has
Returns true if the array contains the element.
array_has(array, element)