Initialize
API Definition
Status ScannInterface::Initialize(ConstSpan<float> dataset, DatapointIndex n_points, const std::string& config, int training_threads);
Function
Builds the index. The behavior is consistent with the open source algorithm.
Parameters

| Parameter | Description | Data Type | Value Range |
|---|---|---|---|
| dataset | Base library vectors. | ConstSpan<float> | Cannot be null. |
| n_points | Number of vectors in the base library. | DatapointIndex | Must match the number of vectors in dataset. |
| config | Configuration required for creating the index, containing all configuration parameters. | const std::string& | - |
| training_threads | Number of threads used during index construction. | int | ≥ 1 |

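
The value ranges above can be checked before calling Initialize. The following is a minimal sketch, not part of ScannInterface: the helper name CheckInitializeArgs is hypothetical, std::vector<float> stands in for ConstSpan<float>, std::uint32_t stands in for DatapointIndex, and the dataset is assumed to be a flat, row-major buffer holding n_points * dim floats.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical pre-flight check; not part of ScannInterface.
// Assumption: `dataset` is a flat, row-major buffer of n_points * dim floats.
// Returns an empty string when the arguments look usable, otherwise a reason.
std::string CheckInitializeArgs(const std::vector<float>& dataset,
                                std::uint32_t n_points, std::size_t dim,
                                int training_threads) {
  if (dataset.empty()) return "dataset must not be empty";
  if (dim == 0) return "dim must be at least 1";
  if (dataset.size() != static_cast<std::size_t>(n_points) * dim)
    return "n_points does not match the dataset length";
  if (training_threads < 1) return "training_threads must be >= 1";
  return "";
}

// Usage sketch: three 2-dimensional vectors stored back to back.
inline void ExampleCheckInitializeArgs() {
  const std::vector<float> dataset = {0.f, 1.f, 2.f, 3.f, 4.f, 5.f};
  assert(CheckInitializeArgs(dataset, 3, 2, 4).empty());   // consistent sizes
  assert(!CheckInitializeArgs(dataset, 4, 2, 4).empty());  // n_points too large
}
```

Returning a message rather than a bool keeps the failure reason available for logging before the actual Initialize call is attempted.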
config is generated by running the create_config.py script with the following parameters.

| Parameter | Description | Data Type | Value Range |
|---|---|---|---|
| n_leaves | Total number of subspaces in the IVF partition. | int | ≥ 1 |
| nb | Number of vectors in the base library. | int32_t | Must match the number of vectors in dataset. |
| metricType | Distance type of the vectors. | std::string | dot_product or squared_l2 |
| dims_per_block | Number of dimensions combined by PQ. | int | [1, dim], where dim is the dimension of the base library vectors. |
| avq_threshold | Asymmetric bucket parameter. Takes effect only for L2 (squared_l2) datasets. | float | [0, 1] |
| dim | Dimension of the vectors in the base library. | int32_t | Must match the dimension of the vectors in dataset. |
| topK | Number of returned results. | int | ≥ 1 |
| soar_lambda | Controls orthogonality. Takes effect only for IP (dot_product) datasets. | float | > 0. Set the value to -1 to disable the function. |
| overretrieve_factor | Over-retrieval factor, used together with soar_lambda. Takes effect only for IP (dot_product) datasets. | float | [1, 2]. Set the value to -1 to disable the function. |

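
The ranges in the table can be enforced before the script is run. The sketch below is illustrative only: the ConfigParams struct and the ValidateConfigParams function are hypothetical names, not part of the library or of create_config.py, and the checks cover only the ranges stated in the table.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical container for the create_config.py arguments listed above.
struct ConfigParams {
  int n_leaves;
  std::int32_t nb;
  std::string metric_type;    // "dot_product" or "squared_l2"
  int dims_per_block;
  float avq_threshold;
  std::int32_t dim;
  int top_k;
  float soar_lambda;          // -1 disables the function
  float overretrieve_factor;  // -1 disables the function
};

// Returns an empty string if every documented range holds, else a reason.
std::string ValidateConfigParams(const ConfigParams& p) {
  if (p.n_leaves < 1) return "n_leaves must be >= 1";
  if (p.metric_type != "dot_product" && p.metric_type != "squared_l2")
    return "metricType must be dot_product or squared_l2";
  if (p.dims_per_block < 1 || p.dims_per_block > p.dim)
    return "dims_per_block must be in [1, dim]";
  if (p.avq_threshold < 0.0f || p.avq_threshold > 1.0f)
    return "avq_threshold must be in [0, 1]";
  if (p.top_k < 1) return "topK must be >= 1";
  if (p.soar_lambda != -1.0f && !(p.soar_lambda > 0.0f))
    return "soar_lambda must be > 0, or -1 to disable";
  if (p.overretrieve_factor != -1.0f &&
      (p.overretrieve_factor < 1.0f || p.overretrieve_factor > 2.0f))
    return "overretrieve_factor must be in [1, 2], or -1 to disable";
  return "";
}

// Usage sketch: an L2 configuration with the SOAR options disabled.
inline void ExampleValidateConfigParams() {
  const ConfigParams params{100, 10000, "squared_l2", 2, 0.2f,
                            128, 10, -1.0f, -1.0f};
  assert(ValidateConfigParams(params).empty());
}
```

The sentinel -1 is compared exactly, which is safe here because -1 is exactly representable as a float.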
The command is assembled by concatenating the parameters in the order listed above:
"python create_config.py " + std::to_string(n_leaves) + " "
+ std::to_string(nb) + " "
+ metricType + " "
+ std::to_string(dims_per_block) + " "
+ std::to_string(avq_threshold) + " "
+ std::to_string(dim) + " "
+ std::to_string(topK) + " "
+ std::to_string(soar_lambda) + " "
+ std::to_string(overretrieve_factor)
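
The concatenation above can be wrapped in a small function. This is a sketch under assumptions: the function name BuildCreateConfigCommand is hypothetical, and how the resulting command is executed and how its output reaches the config argument of Initialize is left to the caller, since the text above does not specify it.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: assembles the create_config.py command line with the
// arguments in the documented order, separated by single spaces.
std::string BuildCreateConfigCommand(int n_leaves, std::int32_t nb,
                                     const std::string& metric_type,
                                     int dims_per_block, float avq_threshold,
                                     std::int32_t dim, int top_k,
                                     float soar_lambda,
                                     float overretrieve_factor) {
  return "python create_config.py " + std::to_string(n_leaves) + " " +
         std::to_string(nb) + " " +
         metric_type + " " +
         std::to_string(dims_per_block) + " " +
         std::to_string(avq_threshold) + " " +
         std::to_string(dim) + " " +
         std::to_string(top_k) + " " +
         std::to_string(soar_lambda) + " " +
         std::to_string(overretrieve_factor);
}

// Usage sketch: the integer arguments appear verbatim at the front.
inline void ExampleBuildCreateConfigCommand() {
  const std::string cmd = BuildCreateConfigCommand(
      100, 10000, "squared_l2", 2, 0.2f, 128, 10, -1.0f, -1.0f);
  const std::string prefix = "python create_config.py 100 10000 squared_l2 2 ";
  assert(cmd.compare(0, prefix.size(), prefix) == 0);
}
```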
Return Value

| Data Type | Description |
|---|---|
| Status | Execution status of the method. Call status.ok() to determine whether the method executed successfully. |

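
The status.ok() check can be factored into a small helper. The sketch below is illustrative: StandInStatus is a stand-in defined here only so the example compiles on its own and is not the library's Status type, and the helper name Succeeded is hypothetical. The template accepts any type that exposes ok().

```cpp
#include <cassert>
#include <cstdio>

// Stand-in for the library's Status type, for illustration only.
struct StandInStatus {
  bool is_ok;
  bool ok() const { return is_ok; }
};

// Hypothetical helper: returns true on success and logs the failing step
// to stderr otherwise. Works with any type that provides ok().
template <typename StatusT>
bool Succeeded(const StatusT& status, const char* step) {
  if (status.ok()) return true;
  std::fprintf(stderr, "%s failed\n", step);
  return false;
}

// Usage sketch: in real code the first argument would be the value
// returned by Initialize.
inline void ExampleSucceeded() {
  assert(Succeeded(StandInStatus{true}, "Initialize"));
  assert(!Succeeded(StandInStatus{false}, "Initialize"));
}
```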