Configuration Description

When creating a collection in Milvus, you must specify the vector dimension. The test tool reads the dimension from the dataset when it loads the data. However, KBest supports only a limited range of dimensions (see dim in Table 1), and index type names are case sensitive. Table 1 describes the configuration items in the config.yml file of the ANN-Benchmarks test tool.

After creating an index in Milvus, check the logs for error messages. An error message that repeats continuously indicates misconfigured parameters. Use the error details to diagnose and correct the configuration so that queries run correctly.
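The log check described above can be partly automated. The following is a minimal sketch, assuming a hypothetical log layout of "[timestamp] [LEVEL] message"; the pattern, the function name repeated_errors, and the sample lines are illustrative only and do not reproduce real Milvus output, so adjust the pattern to the log format of your deployment.

```python
import re
from collections import Counter

# Hypothetical log layout: "[timestamp] [LEVEL] message".
# This is an assumption for illustration, not the actual Milvus log format.
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]*)\]\s+\[(?P<level>[A-Z]+)\]\s+(?P<msg>.*)$")


def repeated_errors(log_lines, min_repeats=3):
    """Return {message: count} for ERROR messages seen at least min_repeats times."""
    counts = Counter()
    for line in log_lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "ERROR":
            counts[match.group("msg")] += 1
    return {msg: n for msg, n in counts.items() if n >= min_repeats}


# Made-up sample lines, used only to exercise the function.
sample = [
    "[t1] [ERROR] example failure A",
    "[t2] [INFO] example status",
    "[t3] [ERROR] example failure A",
    "[t4] [ERROR] example failure B",
    "[t5] [ERROR] example failure A",
]

print(repeated_errors(sample))
```

A message that appears in the result is a candidate for a looping error; compare its details against the ranges and allowed values in Table 1.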

Table 1 Parameter description

| Parameter | Description | Value Type and Range | Recommended Value | Configuration Principle |
| --- | --- | --- | --- | --- |
| index_type | Index type specified during the test. | std::string, "KBEST" | KBEST | None |
| metric_type | Distance measurement mode specified during the test. | const char*: L2 (Euclidean distance) or IP (inner product) | None | This parameter is set by the dataset and does not require configuration. |
| dim | Feature dimension. | Integer, [1, 2999] | None | This parameter is set by the dataset and does not require configuration. |
| R | Number of neighboring nodes. | Integer, [11, 499] | [50] | This parameter affects the graph construction time and final index quality. The value 50 is recommended. If the value is too large, graph construction may take too long and search performance may deteriorate. If the value is too small, search precision may drop. |
| L | Size of the candidate node list during graph construction. | Integer, [11, 1999] | [100] | This parameter affects the graph construction time and final index quality. The value 100 is recommended. If the value is too large, graph construction may take too long. |
| A | Angle threshold used for pruning during graph construction. | Integer, [10, 360] | [60] | Use 120 for IP datasets and 60 for L2 datasets. |
| init_builder_type | Name of the algorithm used to build the index. | const std::string&: "RNNDescent" or "NNDescent" | "RNNDescent" | Unless otherwise specified, RNNDescent is preferred. |
| consecutive | Block size. | Integer, [1, 31] | [20] | Adjust the value as required. |
| efs | Size of the candidate node list during search. | Integer, [1, number_of_graph_construction_nodes] | [400] | For small-scale datasets, the value ranges from 10 to 500. A larger efs value increases search precision but reduces search performance, so use the smallest value that meets the precision requirement. |
| num_search_thread | Number of threads used during query. | Integer, [1, number_of_CPU_cores] | [1] | Adjust the value as required. |
| build_index_type | Index type used to select neighboring nodes during graph construction. | const std::string&: "HNSW", "SSG", "NSG", or "TSDG" | "SSG" | Unless otherwise specified, SSG is preferred. |
| graph_opt_iter | Number of rounds of index self-iteration during graph construction. | Integer, [0, 30] | [6] | This parameter affects the graph construction time and final index quality. If the value is too large, graph construction may take too long. |
| reorder | Indicates whether to perform reordering after graph construction. | Boolean, true or false | [true] | This parameter affects the graph construction time and final index quality. You are advised to enable it. |
| adding_pref | Threshold for inserting a hyperparameter candidate set before retrieval. | Integer, greater than 0 | [52] | This parameter limits the retrieval path length so that retrieval can stop early. Adjust the value as required. |
| patience | Retrieval patience value. | Integer, greater than 0 | [80] | This parameter limits the retrieval path length so that retrieval can stop early. Adjust the value as required. |
| level | Quantization level. Range change allowed. | Integer, [0, 3] | [2] | Level 1 indicates SQ8 quantization, and level 2 indicates SQ4 quantization. Use 1 for IP datasets and 2 for L2 datasets. |
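
Because misconfigured parameters surface only as log errors after the index is created, it can help to check values against Table 1 beforehand. The following is a minimal sketch with ranges, allowed strings, and recommended values copied from Table 1. The names validate and recommended_params are hypothetical helpers for illustration; they are not part of KBest, Milvus, or ANN-Benchmarks.

```python
# Ranges and allowed values copied from Table 1.
INT_RANGES = {
    "dim": (1, 2999),
    "R": (11, 499),
    "L": (11, 1999),
    "A": (10, 360),
    "consecutive": (1, 31),
    "graph_opt_iter": (0, 30),
    "level": (0, 3),
}
# String values are case sensitive, so "kbest" is rejected.
ALLOWED_STRINGS = {
    "index_type": {"KBEST"},
    "metric_type": {"L2", "IP"},
    "init_builder_type": {"RNNDescent", "NNDescent"},
    "build_index_type": {"HNSW", "SSG", "NSG", "TSDG"},
}
# Only the lower bound is checked here. The upper bounds of efs and
# num_search_thread depend on the dataset size and the CPU core count.
POSITIVE_INTS = {"efs", "num_search_thread", "adding_pref", "patience"}


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate(params):
    """Return a list of human-readable problems; an empty list means no problem found."""
    errors = []
    for name, value in params.items():
        if name in INT_RANGES:
            lo, hi = INT_RANGES[name]
            if not _is_int(value) or not lo <= value <= hi:
                errors.append(f"{name}={value!r} is outside [{lo}, {hi}]")
        elif name in ALLOWED_STRINGS:
            if value not in ALLOWED_STRINGS[name]:
                errors.append(f"{name}={value!r} is not one of {sorted(ALLOWED_STRINGS[name])}")
        elif name in POSITIVE_INTS:
            if not _is_int(value) or value < 1:
                errors.append(f"{name}={value!r} must be an integer greater than 0")
        elif name == "reorder":
            if not isinstance(value, bool):
                errors.append(f"reorder={value!r} must be true or false")
    return errors


def recommended_params(metric_type):
    """Recommended values from Table 1; A and level depend on the metric type."""
    if metric_type not in ALLOWED_STRINGS["metric_type"]:
        raise ValueError(f"unknown metric_type: {metric_type!r}")
    is_ip = metric_type == "IP"
    return {
        "index_type": "KBEST",
        "metric_type": metric_type,
        "R": 50,
        "L": 100,
        "A": 120 if is_ip else 60,
        "init_builder_type": "RNNDescent",
        "consecutive": 20,
        "efs": 400,
        "num_search_thread": 1,
        "build_index_type": "SSG",
        "graph_opt_iter": 6,
        "reorder": True,
        "adding_pref": 52,
        "patience": 80,
        "level": 1 if is_ip else 2,
    }


print(validate(recommended_params("L2")))
print(validate({"index_type": "kbest", "dim": 4096}))
```

The first call checks the recommended L2 configuration; the second shows how a lowercase index type and an out-of-range dimension are reported.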