Initialize

API Definition

Status ScannInterface::Initialize(ConstSpan<float> dataset, DatapointIndex n_points, const std::string& config, int training_threads);

Function

Builds the index from the base library vectors (consistent with the open source algorithm).

Parameters

| Parameter | Description | Data Type | Value Range |
| --- | --- | --- | --- |
| dataset | Base library vectors. | ConstSpan<float> | The value cannot be null. |
| n_points | Number of vectors in the base library. | DatapointIndex | Must match the number of vectors in dataset. |
| config | Configuration file required for creating the index, containing all configuration parameters. | const std::string& | - |
| training_threads | Number of threads used during index construction. | int | ≥ 1 |
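As a minimal sketch of preparing the dataset and n_points arguments, the helper below packs a set of equal-length vectors into one contiguous float buffer. It assumes, as in open source ScaNN, that dataset is a flat row-major buffer of n_points × dim floats; this page does not state the layout, so verify it against your library version. FlatDataset and FlattenVectors are hypothetical names and are not part of ScannInterface.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical container for the flattened base library (not a library type).
struct FlatDataset {
  std::vector<float> values;  // assumed layout: row-major, n_points * dim floats
  std::size_t n_points = 0;   // number of vectors
  std::size_t dim = 0;        // dimension of each vector
};

// Packs equal-length vectors into one contiguous buffer.
// Throws if the input is empty or the vectors differ in dimension.
inline FlatDataset FlattenVectors(const std::vector<std::vector<float>>& rows) {
  if (rows.empty() || rows.front().empty()) {
    throw std::invalid_argument("dataset must not be empty");
  }
  FlatDataset out;
  out.n_points = rows.size();
  out.dim = rows.front().size();
  out.values.reserve(out.n_points * out.dim);
  for (const auto& row : rows) {
    if (row.size() != out.dim) {
      throw std::invalid_argument("all vectors must have the same dimension");
    }
    out.values.insert(out.values.end(), row.begin(), row.end());
  }
  return out;
}
```

The resulting values buffer and n_points would then be passed as dataset and n_points. How a ConstSpan<float> is constructed from the buffer depends on the library headers and is not shown here.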

config is generated by the create_config.py script from the following parameters.

| Parameter | Description | Data Type | Value Range |
| --- | --- | --- | --- |
| n_leaves | Total number of subspaces in the IVF partition. | int | ≥ 1 |
| nb | Number of vectors in the base library. | int32_t | Must match the number of vectors in dataset. |
| metricType | Distance type of the vectors. | std::string | dot_product or squared_l2 |
| dims_per_block | Number of dimensions combined by PQ. | int | [1, dim], where dim is the dimension of the base library vectors. |
| avq_threshold | Asymmetric bucket parameter. Takes effect only for L2 (squared_l2) datasets. | float | [0, 1] |
| dim | Dimension of the vectors in the base library. | int32_t | Must match the dimension of the vectors in dataset. |
| topK | Number of returned results. | int | ≥ 1 |
| soar_lambda | Controls orthogonality. Takes effect only for IP (dot_product) datasets. | float | > 0. Set the value to -1 to disable the function. |
| overretrieve_factor | Used together with soar_lambda to specify the over-retrieval factor. Takes effect only for IP (dot_product) datasets. | float | [1, 2]. Set the value to -1 to disable the function. |
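The value ranges in the table can be checked before create_config.py is run. The following is a sketch only: ConfigParams and ValidateConfigParams are hypothetical names introduced for illustration, and the checks cover only the ranges listed on this page. Whether nb and dim match the dataset cannot be verified from these values alone.

```cpp
#include <string>
#include <vector>

// Hypothetical holder for the create_config.py inputs (not a library type).
struct ConfigParams {
  int n_leaves;
  int nb;
  std::string metricType;
  int dims_per_block;
  float avq_threshold;
  int dim;
  int topK;
  float soar_lambda;
  float overretrieve_factor;
};

// Returns one message per violated range; an empty result means all checks passed.
inline std::vector<std::string> ValidateConfigParams(const ConfigParams& p) {
  std::vector<std::string> errors;
  if (p.n_leaves < 1) errors.push_back("n_leaves must be >= 1");
  if (p.metricType != "dot_product" && p.metricType != "squared_l2") {
    errors.push_back("metricType must be dot_product or squared_l2");
  }
  if (p.dims_per_block < 1 || p.dims_per_block > p.dim) {
    errors.push_back("dims_per_block must be in [1, dim]");
  }
  if (!(p.avq_threshold >= 0.0f && p.avq_threshold <= 1.0f)) {
    errors.push_back("avq_threshold must be in [0, 1]");
  }
  if (p.topK < 1) errors.push_back("topK must be >= 1");
  // A value of -1 disables soar_lambda and overretrieve_factor.
  if (p.soar_lambda != -1.0f && !(p.soar_lambda > 0.0f)) {
    errors.push_back("soar_lambda must be > 0, or -1 to disable");
  }
  if (p.overretrieve_factor != -1.0f &&
      !(p.overretrieve_factor >= 1.0f && p.overretrieve_factor <= 2.0f)) {
    errors.push_back("overretrieve_factor must be in [1, 2], or -1 to disable");
  }
  return errors;
}
```

Collecting all violations instead of stopping at the first one lets a caller report every out-of-range parameter in a single pass.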

config is generated by create_config.py, which takes the parameters above as space-separated positional arguments. In C++, the command string is assembled as follows:
std::string cmd = "python create_config.py " + std::to_string(n_leaves) + " "
                  + std::to_string(nb) + " "
                  + metricType + " "
                  + std::to_string(dims_per_block) + " "
                  + std::to_string(avq_threshold) + " "
                  + std::to_string(dim) + " "
                  + std::to_string(topK) + " "
                  + std::to_string(soar_lambda) + " "
                  + std::to_string(overretrieve_factor);
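The concatenation above can be wrapped in a small helper so that the argument order is fixed in one place. This is a sketch: BuildCreateConfigCommand is a hypothetical name, and how the script's output is turned into the config string passed to Initialize is not described on this page, so it is not shown. Note that before C++26, std::to_string formats a float with six decimal places (for example, 0.2f becomes 0.200000).

```cpp
#include <string>

// Hypothetical helper: assembles the create_config.py command line with the
// positional arguments in the order documented above.
inline std::string BuildCreateConfigCommand(int n_leaves, int nb,
                                            const std::string& metricType,
                                            int dims_per_block,
                                            float avq_threshold, int dim,
                                            int topK, float soar_lambda,
                                            float overretrieve_factor) {
  return "python create_config.py " + std::to_string(n_leaves) + " " +
         std::to_string(nb) + " " +
         metricType + " " +
         std::to_string(dims_per_block) + " " +
         std::to_string(avq_threshold) + " " +
         std::to_string(dim) + " " +
         std::to_string(topK) + " " +
         std::to_string(soar_lambda) + " " +
         std::to_string(overretrieve_factor);
}
```

The returned string could then be handed to whatever process-launching mechanism the application already uses.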

Return Value

| Data Type | Description |
| --- | --- |
| Status | Execution status of the method. Call status.ok() to determine whether the method executed successfully. |
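The status.ok() check can be factored into a small helper at the call site. The sketch below is written as a template so that it relies only on the ok() member mentioned on this page. CheckOk is a hypothetical name, and DemoStatus is a stand-in used only to exercise the helper; it is not the library's Status class.

```cpp
#include <iostream>
#include <string>

// Returns true if the status reports success; otherwise logs the failed step.
// Works with any type that exposes a const ok() member returning bool.
template <typename StatusT>
bool CheckOk(const StatusT& status, const std::string& step) {
  if (status.ok()) {
    return true;
  }
  std::cerr << step << " failed" << std::endl;
  return false;
}

// Stand-in for demonstration only; NOT the library's Status type.
struct DemoStatus {
  bool success;
  bool ok() const { return success; }
};
```

In real code, the value returned by Initialize would be passed in place of DemoStatus, for example CheckOk(status, "Initialize").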