The ML.HASH_BUCKETIZE function
This document describes the ML.HASH_BUCKETIZE function, which lets you convert a string expression to a deterministic hash and then bucketize it by the modulo value of that hash.
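To make the hash-then-modulo idea concrete, the following is a minimal sketch of the same two steps written with general-purpose GoogleSQL functions. It assumes FARM_FINGERPRINT as a stand-in hash; this page does not state which hash ML.HASH_BUCKETIZE uses internally, so the bucket numbers from this query might not match the function's results.

```sql
-- Illustrative sketch only: FARM_FINGERPRINT is an assumed stand-in hash,
-- not necessarily the hash that ML.HASH_BUCKETIZE uses.
SELECT
  f,
  FARM_FINGERPRINT(f) AS hashed,               -- step 1: deterministic INT64 hash
  ABS(MOD(FARM_FINGERPRINT(f), 3)) AS bucket   -- step 2: modulo into 3 buckets (0 to 2)
FROM UNNEST(['a', 'b', 'c', 'd']) AS f;
```

Because the hash is deterministic, the same string always lands in the same bucket. ABS is applied because MOD can return a negative remainder when the hash value is negative.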
Syntax
ML.HASH_BUCKETIZE(string_expression, hash_bucket_size)
Arguments
ML.HASH_BUCKETIZE takes the following arguments:
- string_expression: the STRING expression to bucketize.
- hash_bucket_size: an INT64 value that specifies the number of buckets to create. This value must be greater than or equal to 0. If hash_bucket_size equals 0, the function only hashes the string without bucketizing the hashed value.
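The two modes of hash_bucket_size described above can be compared side by side. The following query is a minimal sketch that reuses the input values from the example on this page; the column aliases hash_only and bucket are chosen here for illustration.

```sql
SELECT
  f,
  ML.HASH_BUCKETIZE(f, 0) AS hash_only,  -- hash_bucket_size = 0: hash without bucketizing
  ML.HASH_BUCKETIZE(f, 3) AS bucket      -- hash_bucket_size = 3: hash, then bucketize into 3 buckets
FROM UNNEST(['a', 'b', 'c', 'd']) AS f;
```

Both columns are INT64 values. With a positive hash_bucket_size the result identifies a bucket, while with 0 it is the hashed value itself.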
Output
ML.HASH_BUCKETIZE returns an INT64 value that identifies the bucket.
Example
The following example bucketizes string expressions into three buckets:
SELECT f, ML.HASH_BUCKETIZE(f, 3) AS bucket FROM UNNEST(['a', 'b', 'c', 'd']) AS f;
The output looks similar to the following:
+---+--------+
| f | bucket |
+---+--------+
| a | 0      |
+---+--------+
| b | 1      |
+---+--------+
| c | 1      |
+---+--------+
| d | 2      |
+---+--------+
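In the output above, b and c share a bucket, which is expected whenever distinct strings hash to the same modulo value. As a rough way to see how values spread across buckets, the following sketch counts the inputs per bucket; the num_values alias is chosen here for illustration.

```sql
SELECT
  ML.HASH_BUCKETIZE(f, 3) AS bucket,
  COUNT(*) AS num_values                 -- number of input strings that landed in this bucket
FROM UNNEST(['a', 'b', 'c', 'd']) AS f
GROUP BY bucket
ORDER BY bucket;
```

A larger hash_bucket_size generally reduces such collisions at the cost of more distinct feature values.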
What's next
- For information about feature preprocessing, see Feature preprocessing overview.
- For information about the supported SQL statements and functions for each model type, see End-to-end user journey for each model.