Initialize

API Definition

Status ScannInterface::Initialize(ConstSpan<float> dataset, DatapointIndex n_points, const std::string& config, int training_threads);

Function

Builds the index. The behavior is consistent with the open source algorithm.

Parameters

| Parameter | Data Type | Description | Value Range |
| --- | --- | --- | --- |
| dataset | ConstSpan<float> | Base library vectors. | Cannot be null. |
| n_points | DatapointIndex | Number of vectors in the base library. | Must match the number of vectors in dataset. |
| config | const std::string& | Configuration file required for creating the index, containing all configuration parameters. | - |
| training_threads | int | Number of threads used during index construction. | ≥ 1 |
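The constraint that n_points must agree with dataset can be checked before calling Initialize. The following is a minimal sketch, not part of the library: it assumes that dataset is a flat, row-major buffer of n_points vectors with dim floats each, uses std::vector<float> and std::uint32_t as stand-ins for ConstSpan<float> and DatapointIndex, and the helper name DatasetShapeMatches is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of the library.
// Assumption: dataset is a flat, row-major buffer that stores n_points
// consecutive vectors of `dim` floats each.
bool DatasetShapeMatches(const std::vector<float>& dataset,
                         std::uint32_t n_points, std::uint32_t dim) {
  // An empty buffer or a zero count can never describe a valid base library.
  if (dataset.empty() || n_points == 0 || dim == 0) {
    return false;
  }
  // Multiply in 64 bits so that n_points * dim cannot overflow.
  const std::uint64_t expected =
      static_cast<std::uint64_t>(n_points) * static_cast<std::uint64_t>(dim);
  return static_cast<std::uint64_t>(dataset.size()) == expected;
}
```

A caller would typically run this check first and skip the Initialize call when it returns false.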

config is generated by the create_config.py script from the following parameters.

| Parameter | Data Type | Description | Value Range |
| --- | --- | --- | --- |
| n_leaves | int | Total number of subspaces in the IVF partition. | ≥ 1 |
| nb | int32_t | Number of vectors in the base library. | Must match the number of vectors in dataset. |
| metricType | std::string | Distance type of the vectors. | dot_product or squared_l2 |
| dims_per_block | int | Number of dimensions combined by PQ. | [1, dim], where dim is the dimension of the base library vectors. |
| avq_threshold | float | Asymmetric bucket parameter. Takes effect only for L2 (squared_l2) datasets. | [0, 1] |
| dim | int32_t | Dimension of the vectors in the base library. | Must match the dimension of the vectors in dataset. |
| topK | int | Number of returned results. | ≥ 1 |
| soar_lambda | float | Controls orthogonality. Takes effect only for IP (dot_product) datasets. | > 0. If the value is set to -1, the feature is disabled. |
| overretrieve_factor | float | Used together with soar_lambda to specify the over-retrieval factor. Takes effect only for IP (dot_product) datasets. | [1, 2]. If the value is set to -1, the feature is disabled. |
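The value ranges above can be checked before the parameters are passed to create_config.py. The following is a minimal sketch under stated assumptions: the ConfigParams struct and the ValidateConfigParams function are hypothetical names introduced for illustration and are not part of the library. The agreement of nb and dim with dataset is not checked here because it requires the dataset itself.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical container for the create_config.py parameters.
struct ConfigParams {
  int n_leaves;
  std::int32_t nb;
  std::string metricType;
  int dims_per_block;
  float avq_threshold;
  std::int32_t dim;
  int topK;
  float soar_lambda;
  float overretrieve_factor;
};

// Returns one message per violated range. An empty result means every
// documented range check passed.
std::vector<std::string> ValidateConfigParams(const ConfigParams& p) {
  std::vector<std::string> errors;
  if (p.n_leaves < 1) errors.push_back("n_leaves must be >= 1");
  if (p.metricType != "dot_product" && p.metricType != "squared_l2") {
    errors.push_back("metricType must be dot_product or squared_l2");
  }
  if (p.dims_per_block < 1 || p.dims_per_block > p.dim) {
    errors.push_back("dims_per_block must be in [1, dim]");
  }
  if (p.avq_threshold < 0.0f || p.avq_threshold > 1.0f) {
    errors.push_back("avq_threshold must be in [0, 1]");
  }
  if (p.topK < 1) errors.push_back("topK must be >= 1");
  // -1 disables the feature; otherwise the value must be positive.
  if (p.soar_lambda != -1.0f && !(p.soar_lambda > 0.0f)) {
    errors.push_back("soar_lambda must be > 0, or -1 to disable");
  }
  // -1 disables the feature; otherwise the value must lie in [1, 2].
  if (p.overretrieve_factor != -1.0f &&
      (p.overretrieve_factor < 1.0f || p.overretrieve_factor > 2.0f)) {
    errors.push_back("overretrieve_factor must be in [1, 2], or -1 to disable");
  }
  return errors;
}
```

Collecting all violations rather than stopping at the first one lets a caller report every invalid parameter in a single pass.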

create_config.py takes these parameters as positional arguments in the order shown. In C++, the command string is assembled as follows:
"python create_config.py " + std::to_string(n_leaves) + " "
                           + std::to_string(nb) + " "
                           + metricType + " "
                           + std::to_string(dims_per_block) + " "
                           + std::to_string(avq_threshold) + " "
                           + std::to_string(dim) + " "
                           + std::to_string(topK) + " "
                           + std::to_string(soar_lambda) + " "
                           + std::to_string(overretrieve_factor)
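The command assembly can be wrapped in a small function. The following is a minimal sketch: the function name BuildCreateConfigCommand is hypothetical, and the sketch only builds the command string. How the command is executed and how the resulting configuration reaches Initialize are not covered here.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: joins the parameters into the create_config.py
// command line, in the positional order documented above.
std::string BuildCreateConfigCommand(int n_leaves, std::int32_t nb,
                                     const std::string& metricType,
                                     int dims_per_block, float avq_threshold,
                                     std::int32_t dim, int topK,
                                     float soar_lambda,
                                     float overretrieve_factor) {
  std::string cmd = "python create_config.py ";
  cmd += std::to_string(n_leaves) + " ";
  cmd += std::to_string(nb) + " ";
  cmd += metricType + " ";
  cmd += std::to_string(dims_per_block) + " ";
  // std::to_string formats floating-point values with six decimal places.
  cmd += std::to_string(avq_threshold) + " ";
  cmd += std::to_string(dim) + " ";
  cmd += std::to_string(topK) + " ";
  cmd += std::to_string(soar_lambda) + " ";
  cmd += std::to_string(overretrieve_factor);
  return cmd;
}
```

For example, calling the function with 100, 10000, "squared_l2", 2, 0.5f, 128, 10, -1.0f, -1.0f yields `python create_config.py 100 10000 squared_l2 2 0.500000 128 10 -1.000000 -1.000000`.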

Return Value

| Data Type | Description |
| --- | --- |
| Status | Execution status of the method. Call status.ok() to check whether the method succeeded. |
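The call-and-check pattern can be sketched as follows. This is an illustration only: the wrapper name InitializeIndex is hypothetical, and the function is templated so that it compiles without the library headers. With the real library, Index would be ScannInterface and Dataset would be ConstSpan<float>. The sketch relies only on the Initialize signature and the status.ok() call documented above.

```cpp
#include <cstdio>
#include <string>

// Hypothetical wrapper: calls Initialize and converts the returned Status
// into a bool, logging a message on failure.
template <typename Index, typename Dataset, typename Count>
bool InitializeIndex(Index& index, const Dataset& dataset, Count n_points,
                     const std::string& config, int training_threads) {
  // The documented range for training_threads is >= 1.
  if (training_threads < 1) {
    std::fprintf(stderr, "training_threads must be >= 1\n");
    return false;
  }
  const auto status =
      index.Initialize(dataset, n_points, config, training_threads);
  if (!status.ok()) {
    std::fprintf(stderr, "Initialize failed\n");
    return false;
  }
  return true;
}
```

Checking the thread count before the call avoids starting index construction with an argument outside the documented range.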