Adding, removing, and renaming a index

Adding an index

ADD INDEX — adds an index with the specified name and type for a given set of columns. Grammar:

ALTER TABLE `<table_name>`
  ADD INDEX `<index_name>`
    [GLOBAL|LOCAL]
    [SYNC|ASYNC]
    [USING <index_type>]
    ON ( <index_columns> )
    [COVER ( <cover_columns> )]
    [WITH ( <parameter_name> = <parameter_value>[, ...])]
  [,   ...]

GLOBAL/LOCAL — global or local index; depending on the index type (<index_type>), only one of them may be available:
- GLOBAL — an index implemented as a separate table or set of tables. Synchronous updates to such an index require distributed transactions.
- LOCAL — a local index within a shard of a row-oriented or column-oriented table. Does not require distributed transactions for updates, but does not provide pruning during search.
<index_name> — unique index name that will be used to access data.
SYNC/ASYNC — the index synchronization mode.
- SYNC — a synchronous index. This is the default value.
- ASYNC — an asynchronous index.
<index_type> — index type, currently supported:
- secondary — secondary index. Only GLOBAL is available. This is the default value.
- vector_kmeans_tree — vector index. Described in detail in Vector index.
- fulltext_plain — basic fulltext index. Described in detail in Fulltext index.
- fulltext_relevance — fulltext index with BM25 statistics for relevance scoring. Described in detail in Fulltext index.
<index_columns> — comma-separated list of column names for the table being created. This list defines the composition and order of columns included in the index key. Must be specified. The index key will include both the columns listed and the columns from the table's primary key.
<cover_columns> — comma-separated list of column names from the created table that will be saved in the index in addition to index key columns, providing the ability to get additional data without accessing the table. Empty by default.
<parameter_name> and <parameter_value> — index parameters specific to a particular <index_type>.

Parameters for all index types:

parallel - maximum number of parallel partition-based workers used during index build (an integer between 1 and MaxBuildIndexShardsInFlight from SchemeShardConfig).
- If not specified, currently defaults to 32 or MaxBuildIndexShardsInFlight if it's lower. Default MaxBuildIndexShardsInFlight is 1000. Default parallelism selection logic may be changed in future versions.
- You may set a smaller limit to reduce the impact of index build on the DB performance.
- You may also set a larger limit to speed up the index build if you have enough hardware resources.

Parameters specific to vector indexes:

common parameters for all vector indexes:
- vector_dimension - embedding vector dimensionality (should be between 1 and 16384)
- vector_type - vector value type (float, uint8, or int8)
- distance - distance function (cosine, manhattan, or euclidean), mutually exclusive with similarity
- similarity - similarity function (inner_product or cosine), mutually exclusive with distance
specific parameters for vector_kmeans_tree (read more about the index type):
- clusters - number of centroids for k-means algorithm (should be between 2 and 2048)
- levels - number of levels in the tree (should be between 1 and 16)
- overlap_clusters - the number of nearest clusters to add each vector to (default 1)
- the total number of nodes in the tree, calculated as clusters raised to the power of levels, should be no more than 1073741824
- the product of vector_dimension and clusters should be no more than 4194304

Note

For vector indexes, the vector_type and vector_dimension parameters can be omitted if the table is not empty — they are determined automatically based on the row contents. The levels and clusters parameters are also determined automatically, and the table may be empty for them, but this is highly unrecommended because the default values in that case are levels=1, clusters=2. It is far better to create the index on a table that already has data loaded, so that the values can be determined correctly.

Parameters specific to fulltext indexes:

common parameters for all fulltext indexes:
- tokenizer - tokenizer type (standard, whitespace, or keyword)
- use_filter_lowercase - lowercase filter (true or false)
- use_filter_length - token length filter (true or false); when true, tokens shorter than filter_length_min or longer than filter_length_max are not indexed and are ignored during search
- filter_length_min - minimum token length (positive integer); only applied when use_filter_length=true
- filter_length_max - maximum token length (positive integer); only applied when use_filter_length=true
- use_filter_snowball - Snowball stemmer filter (true or false)
- language - language for the Snowball stemmer (for example, english, russian)
- use_filter_ngram - n-gram filter (true or false)
- use_filter_edge_ngram - edge n-gram filter (true or false)
- filter_ngram_min_length - minimum n-gram length (positive integer)
- filter_ngram_max_length - maximum n-gram length (positive integer, (\ge) filter_ngram_min_length)

You can also add a secondary index using the YDB CLI table index command.

Warning

Supported only for row-oriented tables. Support for column-oriented tables is currently under development.

Examples

A regular secondary index:

ALTER TABLE `series`
  ADD INDEX `title_index`
  GLOBAL ON (`title`);

Vector index:

ALTER TABLE `series`
  ADD INDEX emb_cosine_idx GLOBAL SYNC USING vector_kmeans_tree
  ON (embedding) COVER (title)
  WITH (
    distance="cosine", vector_type="float", vector_dimension=512
  );

A fulltext index:

ALTER TABLE `series`
  ADD INDEX ft_idx GLOBAL USING fulltext_plain
  ON (title)
  WITH (tokenizer=standard, use_filter_lowercase=true);

Altering an index

Indexes have type-specific parameters that can be tuned. Global indexes, whether synchronous or asynchronous, are implemented as hidden tables, and their automatic partitioning and followers settings can be adjusted just like those of regular tables.

Note

Currently, specifying secondary index partitioning settings during index creation is not supported in either the ALTER TABLE ADD INDEX or the CREATE TABLE INDEX statements.

ALTER TABLE <table_name> ALTER INDEX <index_name> SET <setting_name> <value>;
ALTER TABLE <table_name> ALTER INDEX <index_name> SET (<setting_name_1> = <value_1>, ...);

<table_name>: The name of the table whose index is to be modified.
<index_name>: The name of the index to be modified.
<setting_name>: The name of the setting to be modified, which should be one of the following:

Note

These settings cannot be reset.

<value>: The new value for the setting. Possible values include:
- ENABLED or DISABLED for the AUTO_PARTITIONING_BY_SIZE and AUTO_PARTITIONING_BY_LOAD settings
- "PER_AZ:<count>" or "ANY_AZ:<count>" where <count> is the number of replicas for the READ_REPLICAS_SETTINGS
- An integer of Uint64 type for the other settings

Example

The query in the following example enables automatic partitioning by load for the index named title_index of the table series, sets its minimum partition count to 5, and enables one follower per AZ for every partition:

ALTER TABLE `series` ALTER INDEX `title_index` SET (
    AUTO_PARTITIONING_BY_LOAD = ENABLED,
    AUTO_PARTITIONING_MIN_PARTITIONS_COUNT = 5,
    READ_REPLICAS_SETTINGS = "PER_AZ:1"
);

Deleting an index

DROP INDEX: Deletes the index with the specified name. The code below deletes the index named title_index.

ALTER TABLE `series` DROP INDEX `title_index`;

You can also remove a index using the YDB CLI table index command.

Renaming an index

RENAME INDEX: Renames the index with the specified name.

If an index with the new name exists, an error is returned.

Replacement of atomic indexes under load is supported by the command ydb table index rename in the YDB CLI and by YDB SDK ad-hoc methods.

Example of index renaming:

ALTER TABLE `series` RENAME INDEX `title_index` TO `title_index_new`;

Was the article helpful?

Overview

COLUMN