Bloom skip index

Bloom skip indexes are local indexes and must be declared with the LOCAL keyword. The INDEX clause has the same form as a secondary index declaration, except that it includes LOCAL and specifies USING bloom_filter or USING bloom_ngram_filter. See also local indexes.

CREATE TABLE `<table_name>` (
    ...
    INDEX `<index_name>`
        LOCAL
        USING bloom_filter | bloom_ngram_filter
        ON ( <index_columns> )
        [WITH ( <parameter_name> = <parameter_value>[, ...])]
    [,   ...]
)

Where:

  • <index_name>: Index name.
  • LOCAL: Required keyword for Bloom skip indexes.
  • <index_columns>: List of columns used to build the index; the number of columns depends on the table type and index type.
  • COVER (...) and data columns are not supported for Bloom skip indexes.

WITH (...) parameters:

  • bloom_filter
    • false_positive_probability: Target false-positive rate of the filter, that is, the fraction of fragments that are read even though they do not contain the requested value. The value must lie in the open interval (0, 1). A lower value reduces unnecessary reads but increases index size.
      • If omitted, the default is 0.0001 for row-oriented tables and 0.1 for column-oriented tables. The stricter default on row-oriented tables suits OLTP point lookups; the looser default on column-oriented tables suits analytical scans, where a higher false-positive rate is an acceptable price for a smaller index.
    • Indexed columns: YQL types that support equality comparison, except Yson, Json, and JsonDocument (see comparison operators). A column-oriented table allows one indexed column; a row-oriented table allows several.
  • bloom_ngram_filter (String and Utf8 columns; column-oriented tables only)
    • ngram_size: N-gram length, an integer from 3 to 8 (default 3).
    • false_positive_probability: Target false-positive rate (range (0, 1); default 0.1).
    • case_sensitive: Whether n-grams respect character case: true or false (default true).
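The trade-off between false_positive_probability and index size can be estimated with the textbook Bloom filter formulas. The sketch below is only an illustration: the source does not state how YDB actually sizes its filters, and the helper names bits_per_value and hash_count are hypothetical.

```python
import math


def bits_per_value(p: float) -> float:
    """Optimal filter bits per distinct value for a target false-positive rate p.

    Textbook formula: m / n = -ln(p) / (ln 2)^2.
    Assumption: an ideally sized classic Bloom filter, not YDB's actual layout.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("false_positive_probability must be in (0, 1)")
    return -math.log(p) / (math.log(2) ** 2)


def hash_count(p: float) -> int:
    """Optimal number of hash functions, k = -log2(p), rounded to an integer."""
    if not 0.0 < p < 1.0:
        raise ValueError("false_positive_probability must be in (0, 1)")
    return max(1, round(-math.log2(p)))


if __name__ == "__main__":
    # Compare the two documented defaults and the value used in the examples.
    for p in (0.1, 0.01, 0.0001):
        print(f"p={p}: {bits_per_value(p):.2f} bits/value, {hash_count(p)} hashes")
```

Under these assumptions, the row-oriented default of 0.0001 needs about four times as many bits per value as the column-oriented default of 0.1, because the size grows with the logarithm of the target rate.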

Creating and altering these indexes on an existing table is described in ALTER TABLE ADD INDEX.

Examples

bloom_filter index

CREATE TABLE events (
    id Uint64,
    resource_id Utf8,
    PRIMARY KEY (id),
    INDEX idx_bloom LOCAL USING bloom_filter
        ON (resource_id)
        WITH (false_positive_probability = 0.01)
);
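How such an index lets a lookup skip fragments can be pictured with a toy model. The sketch below assumes one small filter per fragment and SHA-256 based hashing; the class ToyBloomFilter, the helper functions, the filter size, and the sample values are all hypothetical and do not describe the internals of YDB.

```python
import hashlib


class ToyBloomFilter:
    """Minimal Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, value: str):
        # Double hashing: derive num_hashes bit positions from one digest.
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, value: str) -> None:
        for pos in self._positions(value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, value: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(value)
        )


def build_filters(fragments, num_bits=256, num_hashes=7):
    """Build one filter per fragment from the values of the indexed column."""
    filters = []
    for values in fragments:
        bloom = ToyBloomFilter(num_bits, num_hashes)
        for value in values:
            bloom.add(value)
        filters.append(bloom)
    return filters


def fragments_to_read(filters, value):
    """Indexes of fragments that cannot be skipped for an equality lookup."""
    return [i for i, bloom in enumerate(filters) if bloom.might_contain(value)]


if __name__ == "__main__":
    # Hypothetical resource_id values stored in three fragments.
    fragments = [["res-1", "res-2"], ["res-3", "res-4"], ["res-5", "res-6"]]
    filters = build_filters(fragments)
    print(fragments_to_read(filters, "res-3"))
```

The result always includes the fragment that really holds the value (index 1 here). Any other index in the result is a false positive, and false_positive_probability bounds how often that happens.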

bloom_ngram_filter index

CREATE TABLE events (
    id Uint64,
    message Utf8,
    PRIMARY KEY (id),
    INDEX idx_ngram LOCAL USING bloom_ngram_filter
        ON (message)
        WITH (
            ngram_size = 3,
            false_positive_probability = 0.01,
            case_sensitive = true
        )
);
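The effect of ngram_size and case_sensitive can be sketched as follows. This is a simplified model: it splits text into n-grams by Unicode code point and keeps them in an exact set instead of a Bloom filter, whereas the source does not say how YDB tokenizes or stores n-grams. The functions ngrams and can_skip are hypothetical names.

```python
def ngrams(text: str, ngram_size: int = 3, case_sensitive: bool = True) -> set[str]:
    """Return the set of n-grams of text; empty if text is shorter than ngram_size."""
    if not case_sensitive:
        text = text.lower()
    return {text[i:i + ngram_size] for i in range(len(text) - ngram_size + 1)}


def can_skip(fragment_ngrams: set[str], pattern: str,
             ngram_size: int = 3, case_sensitive: bool = True) -> bool:
    """A fragment can be skipped only if some n-gram of the pattern is absent.

    Assumption: an exact set stands in for the Bloom filter, so this toy
    version has no false positives.
    """
    return not ngrams(pattern, ngram_size, case_sensitive) <= fragment_ngrams


if __name__ == "__main__":
    # Hypothetical message stored in one fragment, indexed case-insensitively.
    stored = ngrams("Disk ERROR on node 7", 3, case_sensitive=False)
    print(can_skip(stored, "error", 3, case_sensitive=False))
    print(can_skip(stored, "timeout", 3, case_sensitive=False))
    print(can_skip(stored, "on", 3, case_sensitive=False))
```

In this model a search string shorter than ngram_size produces no n-grams, so the fragment can never be skipped on its basis. A larger ngram_size makes each n-gram more selective but raises that minimum useful length.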