Vector index
Warning
Supported only for row-oriented tables. Support for column-oriented tables is currently under development.
Alert
Vector indexes are available in test mode in the main branch. They will be fully available in version 25.1.
The following features are not supported:
- Index update: the main table can be modified, but the existing index is not updated. To reflect the changes, a new index must be built. If necessary, the existing index can be atomically replaced with the newly built one.
- Building an index for vectors with bit quantization.
These limitations may be removed in future versions.
Warning
Creating an empty table with a vector index is pointless, because tables with vector indexes do not currently allow mutations.
Instead, use the ALTER TABLE ... ADD INDEX command to add a vector index to an existing table.
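As a rough sketch of that approach, the statement below adds a vector index to a table that already contains data. It assumes that ALTER TABLE ... ADD INDEX accepts the same index clause as the CREATE TABLE syntax described on this page; the table, column, and index names are illustrative and simply mirror the example at the end of this section.

```sql
-- Assumption: user_articles already exists and is populated;
-- the index clause mirrors the CREATE TABLE syntax on this page.
ALTER TABLE user_articles
    ADD INDEX emb_cosine_idx
    GLOBAL SYNC USING vector_kmeans_tree
    ON (user, embedding) COVER (title, text)
    WITH (
        distance="cosine",
        vector_type="float",
        vector_dimension=512,
        clusters=128,
        levels=2
    );
```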
A vector index in a row-oriented table is created with the same syntax as a secondary index, by specifying vector_kmeans_tree as the index type. The subset of syntax available for vector indexes:
CREATE TABLE `<table_name>` (
    ...
    INDEX `<index_name>`
        GLOBAL
        [SYNC]
        USING vector_kmeans_tree
        ON ( <index_columns> )
        [COVER ( <cover_columns> )]
        [WITH ( <parameter_name> = <parameter_value>[, ...])]
    [, ...]
)
Where:
- <index_name> - unique index name for data access
- SYNC - indicates synchronous data writing to the index. This is the only currently available option, and it is used by default.
- <index_columns> - comma-separated list of table columns used for index searches (the last column is used as the embedding, the others as filtering columns)
- <cover_columns> - list of additional table columns stored in the index to enable retrieval without accessing the main table
- <parameter_name> and <parameter_value> - list of key-value parameters:
  - common parameters for all vector indexes:
    - vector_dimension - embedding vector dimensionality (16384 or less)
    - vector_type - vector value type (float, uint8, int8, or bit)
    - distance - distance function (cosine, manhattan, or euclidean), mutually exclusive with similarity
    - similarity - similarity function (inner_product or cosine), mutually exclusive with distance
  - specific parameters for vector_kmeans_tree (see Vector Index Type `vector_kmeans_tree`):
    - clusters - number of centroids for the k-means algorithm (values greater than 1000 may degrade performance)
    - levels - number of levels in the tree
Warning
Vector indexes with vector_type=bit are not currently supported.
Example
CREATE TABLE user_articles (
    article_id Uint64,
    user String,
    title String,
    text String,
    embedding String,
    INDEX emb_cosine_idx GLOBAL SYNC USING vector_kmeans_tree
        ON (user, embedding) COVER (title, text)
        WITH (
            distance="cosine",
            vector_type="float",
            vector_dimension=512,
            clusters=128,
            levels=2
        ),
    PRIMARY KEY (article_id)
)
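A query against this index might look like the sketch below. It is not taken from this page: it assumes the YQL VIEW clause for reading through a named index and the Knn::CosineDistance function for ranking. The parameter $target is hypothetical and is assumed to hold the query embedding in the same serialized form as the embedding column; the user value is made up for illustration.

```sql
-- Hypothetical parameter: the serialized query embedding.
DECLARE $target AS String;

SELECT article_id, title, text
FROM user_articles VIEW emb_cosine_idx
WHERE user = "john"  -- filtering column from the index definition
ORDER BY Knn::CosineDistance(embedding, $target)
LIMIT 10;
```

Under these assumptions, title and text are listed in COVER, so such a query could be served from the index without reading the main table.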