Adding, removing, and renaming a index
Adding an index
ADD INDEX — adds an index with the specified name and type for a given set of columns. Grammar:
ALTER TABLE `<table_name>`
ADD INDEX `<index_name>`
[GLOBAL|LOCAL]
[SYNC|ASYNC]
[USING <index_type>]
ON ( <index_columns> )
[COVER ( <cover_columns> )]
[WITH ( <parameter_name> = <parameter_value>[, ...])]
[, ...]
-
GLOBAL/LOCAL— global or local index; depending on the index type (<index_type>), only one of them may be available:GLOBAL— an index implemented as a separate table or set of tables. Synchronous updates to such an index require distributed transactions.LOCAL— a local index within a shard of a row-oriented or column-oriented table. Does not require distributed transactions for updates, but does not provide pruning during search.
-
<index_name>— unique index name that will be used to access data. -
SYNC/ASYNC— the index synchronization mode.SYNC— a synchronous index. This is the default value.ASYNC— an asynchronous index.
-
<index_type>— index type, currently supported:secondary— secondary index. OnlyGLOBALis available. This is the default value.vector_kmeans_tree— vector index. Described in detail in Vector index.fulltext_plain— basic fulltext index. Described in detail in Fulltext index.fulltext_relevance— fulltext index with BM25 statistics for relevance scoring. Described in detail in Fulltext index.
-
<index_columns>— comma-separated list of column names for the table being created. This list defines the composition and order of columns included in the index key. Must be specified. The index key will include both the columns listed and the columns from the table's primary key. -
<cover_columns>— comma-separated list of column names from the created table that will be saved in the index in addition to index key columns, providing the ability to get additional data without accessing the table. Empty by default. -
<parameter_name>and<parameter_value>— index parameters specific to a particular<index_type>.
Parameters for all index types:
parallel- maximum number of parallel partition-based workers used during index build (an integer between1andMaxBuildIndexShardsInFlightfromSchemeShardConfig).- If not specified, currently defaults to
32orMaxBuildIndexShardsInFlightif it's lower. DefaultMaxBuildIndexShardsInFlightis1000. Default parallelism selection logic may be changed in future versions. - You may set a smaller limit to reduce the impact of index build on the DB performance.
- You may also set a larger limit to speed up the index build if you have enough hardware resources.
- If not specified, currently defaults to
Parameters specific to vector indexes:
- common parameters for all vector indexes:
vector_dimension- embedding vector dimensionality (should be between 1 and 16384)vector_type- vector value type (float,uint8, orint8)distance- distance function (cosine,manhattan, oreuclidean), mutually exclusive withsimilaritysimilarity- similarity function (inner_productorcosine), mutually exclusive withdistance
- specific parameters for
vector_kmeans_tree(read more about the index type):clusters- number of centroids for k-means algorithm (should be between 2 and 2048)levels- number of levels in the tree (should be between 1 and 16)overlap_clusters- the number of nearest clusters to add each vector to (default 1)- the total number of nodes in the tree, calculated as
clustersraised to the power oflevels, should be no more than 1073741824 - the product of
vector_dimensionandclustersshould be no more than 4194304
Note
For vector indexes, the vector_type and vector_dimension parameters can be omitted if the table is not empty — they are determined automatically based on the row contents. The levels and clusters parameters are also determined automatically, and the table may be empty for them, but this is highly unrecommended because the default values in that case are levels=1, clusters=2. It is far better to create the index on a table that already has data loaded, so that the values can be determined correctly.
Parameters specific to fulltext indexes:
- common parameters for all fulltext indexes:
tokenizer- tokenizer type (standard,whitespace, orkeyword)use_filter_lowercase- lowercase filter (trueorfalse)use_filter_length- token length filter (trueorfalse); whentrue, tokens shorter thanfilter_length_minor longer thanfilter_length_maxare not indexed and are ignored during searchfilter_length_min- minimum token length (positive integer); only applied whenuse_filter_length=truefilter_length_max- maximum token length (positive integer); only applied whenuse_filter_length=trueuse_filter_snowball- Snowball stemmer filter (trueorfalse)language- language for the Snowball stemmer (for example,english,russian)use_filter_ngram- n-gram filter (trueorfalse)use_filter_edge_ngram- edge n-gram filter (trueorfalse)filter_ngram_min_length- minimum n-gram length (positive integer)filter_ngram_max_length- maximum n-gram length (positive integer, (\ge)filter_ngram_min_length)
You can also add a secondary index using the YDB CLI table index command.
Warning
Supported only for row-oriented tables. Support for column-oriented tables is currently under development.
Examples
A regular secondary index:
ALTER TABLE `series`
ADD INDEX `title_index`
GLOBAL ON (`title`);
ALTER TABLE `series`
ADD INDEX emb_cosine_idx GLOBAL SYNC USING vector_kmeans_tree
ON (embedding) COVER (title)
WITH (
distance="cosine", vector_type="float", vector_dimension=512
);
A fulltext index:
ALTER TABLE `series`
ADD INDEX ft_idx GLOBAL USING fulltext_plain
ON (title)
WITH (tokenizer=standard, use_filter_lowercase=true);
Altering an index
Indexes have type-specific parameters that can be tuned. Global indexes, whether synchronous or asynchronous, are implemented as hidden tables, and their automatic partitioning and followers settings can be adjusted just like those of regular tables.
Note
Currently, specifying secondary index partitioning settings during index creation is not supported in either the ALTER TABLE ADD INDEX or the CREATE TABLE INDEX statements.
ALTER TABLE <table_name> ALTER INDEX <index_name> SET <setting_name> <value>;
ALTER TABLE <table_name> ALTER INDEX <index_name> SET (<setting_name_1> = <value_1>, ...);
-
<table_name>: The name of the table whose index is to be modified. -
<index_name>: The name of the index to be modified. -
<setting_name>: The name of the setting to be modified, which should be one of the following:
Note
These settings cannot be reset.
<value>: The new value for the setting. Possible values include:ENABLEDorDISABLEDfor theAUTO_PARTITIONING_BY_SIZEandAUTO_PARTITIONING_BY_LOADsettings"PER_AZ:<count>"or"ANY_AZ:<count>"where<count>is the number of replicas for theREAD_REPLICAS_SETTINGS- An integer of
Uint64type for the other settings
Example
The query in the following example enables automatic partitioning by load for the index named title_index of the table series, sets its minimum partition count to 5, and enables one follower per AZ for every partition:
ALTER TABLE `series` ALTER INDEX `title_index` SET (
AUTO_PARTITIONING_BY_LOAD = ENABLED,
AUTO_PARTITIONING_MIN_PARTITIONS_COUNT = 5,
READ_REPLICAS_SETTINGS = "PER_AZ:1"
);
Deleting an index
DROP INDEX: Deletes the index with the specified name. The code below deletes the index named title_index.
ALTER TABLE `series` DROP INDEX `title_index`;
You can also remove a index using the YDB CLI table index command.
Renaming an index
RENAME INDEX: Renames the index with the specified name.
If an index with the new name exists, an error is returned.
Replacement of atomic indexes under load is supported by the command ydb table index rename in the YDB CLI and by YDB SDK ad-hoc methods.
Example of index renaming:
ALTER TABLE `series` RENAME INDEX `title_index` TO `title_index_new`;