Graph Index since v0.5.0

VectorChord's index type vchordg is a disk-based graph index. It has low memory consumption.

Let's start by creating a table named items with an embedding column of type vector(n), and then populate it with sample data.

To create a vchordg index, you can use the following SQL.

sql

CREATE INDEX ON items USING vchordg (embedding vector_l2_ops);

TIP

This feature is in preview.

Tuning

When building an index, two options usually need tuning: m and ef_construction. m is the maximum number of neighbors per vertex, and ef_construction is the size of the dynamic list containing the nearest neighbors during insertion.

In search, you need to tune ef_search. ef_search is the size of the dynamic list containing the nearest neighbors during search.

sql

CREATE INDEX ON items USING vchordg (embedding vector_l2_ops) WITH (options = $$
m = 64
ef_construction = 128
$$);

SET vchordg.ef_search TO '128';
SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 10;

As a disk-based index, vchordg usually requires only the quantized vectors to be kept in the buffer pool to maintain performance. By default, vchordg quantizes a -dimensional vector to bits. Let the number of rows be . Then, the total memory required for the index is bits. If you have very limited memory and are using ultra-high dimensional vectors, consider quantizing a vector to bits. Then, the total memory required for the index is bits.

sql

CREATE INDEX ON items USING vchordg (embedding vector_l2_ops) WITH (options = $$
bits = 1
m = 64
ef_construction = 128
$$);

The index building can be sped up by using multiple processes. Refer to PostgreSQL Tuning.

Reference

Operator Classes vchordg

The following table lists all available operator classes supported by vchordg.

Operator Class	Description	Operator 1	Operator 2
`vector_l2_ops`	index works for `vector` type and Euclidean distance	`<->(vector,vector)`	`<<->>(vector,vector)`
`vector_ip_ops`	index works for `vector` type and negative inner product	`<#>(vector,vector)`	`<<#>>(vector,vector)`
`vector_cosine_ops`	index works for `vector` type and cosine distance	`<=>(vector,vector)`	`<<=>>(vector,vector)`
`halfvec_l2_ops`	index works for `halfvec` type and Euclidean distance	`<->(halfvec,halfvec)`	`<<->>(halfvec,halfvec)`
`halfvec_ip_ops`	index works for `halfvec` type and negative inner product	`<#>(halfvec,halfvec)`	`<<#>>(halfvec,halfvec)`
`halfvec_cosine_ops`	index works for `halfvec` type and cosine distance	`<=>(halfvec,halfvec)`	`<<=>>(halfvec,halfvec)`

<<->>, <<#>>, <<=>> are operators defined by VectorChord.

For more information about <<->>, <<#>>, <<=>>, refer to Similarity Filter.

All operator classes are available since version 0.3.0.

Indexing Options vchordg

`bits` since v0.5.0

Description: The ratio of bits to dimensions after RaBitQ quantization. bits = 2 provides better recall and QPS, while bits = 1 consumes less memory.
Type: integer
Default: 2
Example:
- bits = 2 means a -dimensional vector is quantized to bits.
- bits = 1 means a -dimensional vector is quantized to bits .

`m` since v0.5.0

Description: The maximum number of neighbors per vertex. The larger m is, the better the performance, but the higher the storage requirement. m corresponds to in HNSW and in DiskANN.
Type: integer
Default: 32
Example:
- m = 32 means that there are at most neighbors per vertex.
- m = 64 means that there are at most neighbors per vertex.

`ef_construction` since v0.5.0

Description: The size of the dynamic list containing the nearest neighbors during insertion. The larger ef_construction is, the better the performance, but the slower the insertion. ef_construction corresponds to in HNSW and in DiskANN.
Type: integer
Default: 64
Example:
- ef_construction = 64 means that the size of the dynamic list containing the nearest neighbors is during insertion.
- ef_construction = 128 means that the size of the dynamic list containing the nearest neighbors is during insertion.

`alpha` since v0.5.0

Description: The alpha values selected during pruning. alpha corresponds to in DiskANN. This option must be an ascending list, where the first element is 1.0 and the last element is less than 2.0.
Type: list of floats
Default: [1.0, 1.2]
Example:
- alpha = [1.0, 1.2] is equivalent to setting alpha = 1.2 in DiskANN.
- alpha = [1.0] is equivalent to the default pruning strategy in HNSW.
Note: this option is ineffective when the distance metric is negative inner product.

`beam_construction` since v0.5.0

Description: Beam width used during insertion. Beam width refers to the number of vertices accessed at once. Since the time to randomly read a small number of sectors from the SSD is almost the same as reading a single sector, a larger beam width effectively reduces the number of round trips to the SSD, resulting in better performance. Since it increases computation, it is disadvantageous for indexes that fit entirely in memory.
Type: integer
Default: 1
Example:
- beam_construction = 8 means that the index accesses 8 vertices at once during insertion.
- beam_construction = 1 means that the index accesses 1 vertex at once during insertion.

Search Parameters vchordg

`vchordg.ef_search` since v0.5.0

Description: The size of the dynamic list containing the nearest neighbors in search. The larger vchordg.ef_search is, the better the recall, but the worse the QPS. vchordg.ef_search corresponds to in HNSW and DiskANN.
Type: integer
Default: 64
Domain: [1, 65535]
Example:
- SET vchordg.ef_search = 64 indicates the size of the dynamic list containing the nearest neighbors is during search.
- SET vchordg.ef_search = 128 indicates the size of the dynamic list containing the nearest neighbors is during search.

`vchordg.beam_search` since v0.5.0

Description: Beam widths used during search. Beam width refers to the number of vertices accessed at once. Since the time to randomly read a small number of sectors from the SSD is almost the same as reading a single sector, a larger beam width effectively reduces the number of round trips to the SSD, resulting in better performance. Since it increases computation, it is disadvantageous for indexes that fit entirely in memory.
Type: integer
Default: 1
Domain: [1, 65535]
Example:
- SET vchordg.beam_search = 8 indicates that the index accesses 8 vertices at once during search.
- SET vchordg.beam_search = 1 indicates that the index accesses 1 vertex at once during search.

Graph Index since v0.5.0 ​

Tuning ​

Reference ​

Operator Classes vchordg ​

Indexing Options vchordg ​

bits since v0.5.0 ​

m since v0.5.0 ​

ef_construction since v0.5.0 ​

alpha since v0.5.0 ​

beam_construction since v0.5.0 ​

Search Parameters vchordg ​

vchordg.ef_search since v0.5.0 ​

vchordg.beam_search since v0.5.0 ​