Skip to main content

Array Functions

This section describes functions and operators to inspect and transform array values. In the following, array(T) denotes an array type with element type T. The element type is omitted when irrelevant.

Access & Inspection

The following functions yield access to array elements or metadata. These operations are compatible with all (nullable and non-nullable) element types.

SignatureDescriptionExample
array(T)[int]TReturns the n-th element of the array (1-indexed).(array[1,2,3])[1]1
array(T)[int:int]TReturns the subarray within the given boundes (1-indexed, inclusive).(array[1,2,3])[2:3]{2,3}
array_length(array)intReturns the length of the array.array_length(array[1,2,3])3
array_to_string(array, text [, text])Converts the array into a textual representation, with the given element separator and (optional) null indicator.array_to_string(array[1,2,3], ';')1;2;3
array_to_string(array[3,2,1,null], '⏰', '🎉')3⏰2⏰1⏰🎉
array_contains(array, value)Checks if a given value is contained within the array.array_contains(array[1,3,4], 3)true
array_contains(array[1,3,4], 2)false
array_position(array, value)Returns the index of the first occurrence of value inside array. Comparisons are done using IS NOT DISTINCT FROM semantics, so it is possible to search for NULL. Returns NULL if the element is not found.array_position(array[1,3,4,3], 3)2
array_contains(array[1,3,4,3], 2)NULL
array_positions(array, value)Returns the an array containing the indices of all occurrences of value inside array. Comparisons are done using IS NOT DISTINCT FROM semantics, so it is possible to search for NULL. NULL is returned only if the array is NULL; if the value is not found in the array, an empty array is returned.array_positions(array[1,3,4,3], 3)[2,4]
array_contains(array[1,3,4,3], 2)[]

Transformations

The following functions produce new array values from existing ones. These operations are compatible with all (nullable and non-nullable) element types.

SignatureDescriptionExample
array_prepend(T,array(T))array(T)
or
T || array(T)array(T)
Inserts a value at the front of the array.
If null is prepended to an array with non-nullable element type, the resulting element type is nullable.
array_prepend(1, array[2,3,4])array[1,2,3,4]
array_prepend(null, [2,3,4])array[null,2,3,4]
array_append(array(T), T)array(T)
or
array(T) || Tarray(T)
Inserts a value at the end of the array.
If null is appended to an array with non-nullable element type, the resulting element type is nullable.
array_append(array[1,2,3], 4)array[1,2,3,4]
array_append(array[1,2,3], null)array[1,2,3,null]
array_cat(array(T), array(T))array(T)
or
array(T) || array(T)array(T)
Concatenates two arrays.
The resulting element type is non-nullable if and only if both input element types are non-nullable.
array_cat(array[1,2], array[3,4])array[1,2,3,4]
array_cat(array[1,null], array[3,4])array[1,null,3,4]
array_cat(array[1,2], array[null,4])array[1,2,null,4]

Inner Products

Hyper offers high-performance implementations of common inner products in vector spaces. These functions operate on two input vectors (represented as arrays) and produce a scalar value of type double precision.

Contrary to other array operations, vector functions only apply to a certain set of arrays:

  • the element type must be real or double precision
  • both arrays must have the same element type (but may have different element nullability)
  • both arrays must have the same length

If any of the input arrays contains a null element, or is itself null, the result of an inner product will be null. Passing arrays of different lengths or incompatible types will result in an error.

In the following table, signatures and examples are abbreviated for clarity. vec denotes a suitable array type (i.e., array(real) or array(double precision not null)). It is implied that both array arguments have the same element type. Similarly, the {1.0, 2.0, 3.0} syntax in the example column represents a suitable array value (e.g., '{1.0, 2.0, 3.0}'::array(real not null)).

SignatureDescriptionExample
dot_product(vec, vec)double precisionComputes the conventional dot product between two vectors.dot_product({1.0, 2.0, 3.0}, {-1.0, 2.0, -3.0})-6.0
dot_product({1.0, null, 3.0}, {-1.0, 2.0, -3.0})null
cosine_similarity(vec, vec)double precisionComputes cosine similarity between two vectors.cosine_similarity({1.0, 2.0, 3.0}, {-1.0, 2.0, -3.0})-0.42857...
cosine_similarity({1.0, 2.0, 3.0}, {null, 2.0, -3.0})null
tip

If possible, prefer arrays with non-nullable element types (i.e., array(real not null) or array(double precision not null)) when computing inner products. This allows Hyper to skip element null checks, resulting in better performance.

With inner products, vector search can be implemented directly in Hyper.

info

Vector search is a search technique to find semantically similar items (called documents) in a given corpus. Each document is represented as an embedding vector in a high-dimensional vector space, e.g. using word-count statistics or machine learning models. Two documents are considered similar if their embeddings are "close by" in the space  as measured by an inner product.

For illustration, consider the problem of finding products similar to a search string in the product catalog of an e-commerce business. To simplify the example, let's assume that the product catalog is already loaded into a temporary table products with the following columns:

  • description_vec (type: array(real not null)) a description of the product, reprsented as a suitable embedding vector
  • name (type: text) the name of the product

Further, let's assume that the search string has already been converted into a vector {1.1, -0.2, 0.7, -0.3}, using the same embedding model as the description_vec column.

Retrieving the top five most-similar products can be expressed in SQL as follows:

select
name,
dot_product('{1.1, -0.2, 0.7, -0.3}'::array(real not null), description_vec) as score
from products
limit 5
order by score descending
note

The embedding vector in this example has been chosen as four-dimensional to keep the query readable; realistic applications likely use much higher dimensionalities.