Modifying the Algorithm Parameters of ANN-Benchmarks

Before running the test command, modify the parameters of the corresponding algorithm to obtain a better recall rate and queries per second (QPS).

  1. The following uses Milvus as an example. Open the config.yml file in the Milvus algorithm directory.
    vim /data/ann-benchmarks-main/ann_benchmarks/algorithms/milvus/config.yml 
    
  2. Modify the Milvus-HNSW algorithm parameters. For details, see Table 1 Parameters.
    Table 1 Parameters

    | Parameter | Description | Recommended Value | Configuration Principle |
    | --- | --- | --- | --- |
    | M | Maximum number of connections (neighbors) that each node keeps in each layer of the HNSW graph. If M is set to 2, each node links to at most two neighbors. If M is set to 3, each node links to at most three neighbors. The others follow the same rule. A larger M value makes the graph denser, which generally improves recall. However, an excessively large M value increases the time and memory consumption for building the index and adds distance computations to each search step. | [24] | Set M based on the data volume, storage space, search speed, and search precision. |
    | efConstruction | Size of the candidate list that the algorithm maintains while searching for the neighbors of each newly inserted node during HNSW graph construction. The construction search descends the layers of the graph from the top entry point and keeps the efConstruction closest candidates it finds. The larger the efConstruction value, the more candidates the algorithm examines during index build, so the neighbors chosen for each node are more accurate. This is advantageous for application scenarios that require high precision, but it increases the index build time. | [200] | Set efConstruction based on the sensitivity to the index build time and requirements for search precision. |

  3. Modify the Milvus-ScaNN algorithm parameters. For details, see Table 2 Parameters.
    Table 2 Parameters

    | Parameter | Description | Recommended Value | Configuration Principle |
    | --- | --- | --- | --- |
    | nlist | Number of buckets (cluster units) into which the ScaNN index partitions the vectors. When building an index, the algorithm clusters the dataset into nlist buckets and assigns each vector to its nearest bucket. A search then scans only the buckets closest to the query vector. A larger nlist value produces smaller buckets, so fewer vectors are scanned in each probed bucket and the search is faster, but index build takes longer and, if the number of probed buckets stays the same, some precision may be lost. A smaller nlist value produces larger buckets, which tends to raise precision at the cost of search speed. | [128] | Set nlist based on the requirements for the storage space, search precision, and search speed. |
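
The trade-offs behind M and nlist can be made concrete with some rough arithmetic. The following is a minimal sketch, not a Milvus API: the function names are hypothetical, the dataset size is a placeholder, and the estimate assumes the common HNSW layout in which each node keeps up to 2 * M neighbors on the bottom layer, with neighbor IDs stored as 4-byte integers. Actual memory use depends on the Milvus version and index implementation.

```python
BYTES_PER_LINK = 4  # assumption: each neighbor ID is a 32-bit integer


def hnsw_link_bytes(num_vectors: int, m: int) -> int:
    """Rough estimate of the memory used by bottom-layer graph links.

    Assumes each node keeps up to 2 * M neighbors on the bottom layer.
    Upper layers are ignored because they hold only a small share of nodes.
    """
    return num_vectors * 2 * m * BYTES_PER_LINK


def avg_bucket_size(num_vectors: int, nlist: int) -> float:
    """Average number of vectors per bucket when data is split into nlist buckets."""
    return num_vectors / nlist


if __name__ == "__main__":
    n = 1_000_000  # hypothetical dataset size, not taken from this guide
    for m in (8, 24, 48):
        mib = hnsw_link_bytes(n, m) / 2**20
        print(f"M={m}: about {mib:.0f} MiB of graph links")
    for nlist in (128, 1024):
        print(f"nlist={nlist}: about {avg_bucket_size(n, nlist):.0f} vectors per bucket")
```

Link memory grows linearly with M, and the average bucket size shrinks as nlist grows, which is why both parameters need to be weighed against the dataset size and the available storage.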

The recommended values of M, efConstruction, and nlist were determined through multiple tests and analyses that weighed query result precision, memory consumption, and time consumption. You can adjust the parameters as required.
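
When comparing several test runs, one simple way to choose a parameter set is to keep the fastest run that still meets a recall target. The sketch below illustrates that selection rule only. The RunResult type and pick_best function are hypothetical helpers, not part of ANN-Benchmarks, and the numbers in the usage section are placeholders, not measurements.

```python
from typing import NamedTuple, Optional, Sequence


class RunResult(NamedTuple):
    params: dict   # e.g. {"M": 24, "efConstruction": 200}
    recall: float  # fraction in the range [0, 1]
    qps: float     # queries per second


def pick_best(results: Sequence[RunResult], min_recall: float) -> Optional[RunResult]:
    """Return the highest-QPS run whose recall is at least min_recall, or None."""
    eligible = [r for r in results if r.recall >= min_recall]
    return max(eligible, key=lambda r: r.qps) if eligible else None


if __name__ == "__main__":
    # Placeholder values for illustration only; replace with your own test results.
    runs = [
        RunResult({"M": 8, "efConstruction": 200}, recall=0.90, qps=5000.0),
        RunResult({"M": 24, "efConstruction": 200}, recall=0.97, qps=3000.0),
        RunResult({"M": 48, "efConstruction": 200}, recall=0.99, qps=1500.0),
    ]
    best = pick_best(runs, min_recall=0.95)
    print(best.params if best else "no run meets the recall target")
```

Memory consumption and index build time could be added as further fields and filters in the same way if they are constraints in your environment.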