Initialize
API Definition
Status ScannInterface::Initialize(ConstSpan<float> dataset, DatapointIndex n_points, const std::string& config, int training_threads);
Function
Constructs the index (consistent with the open-source algorithm).
Parameters
| Parameter | Data Type | Description | Value Range |
|---|---|---|---|
| dataset | ConstSpan<float> | Base library vectors. | Cannot be null. |
| n_points | DatapointIndex | Number of vectors in the base library. | Must match the number of vectors in dataset. |
| config | const std::string& | Configuration file required for creating the index, containing all configuration parameters. | - |
| training_threads | int | Number of threads used during index construction. | ≥ 1 |
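The n_points constraint can be checked before Initialize is called. The following is a minimal sketch, assuming that dataset is a flat row-major buffer holding n_points vectors of dim floats each. That layout is an assumption, and the helper name DatasetShapeMatches is hypothetical, not part of the library.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the library.
// Assumes the dataset is a flat row-major buffer: n_points vectors of dim floats each.
bool DatasetShapeMatches(const std::vector<float>& dataset,
                         std::size_t n_points, std::size_t dim) {
  if (dataset.empty() || n_points == 0 || dim == 0) {
    return false;  // an empty dataset is rejected, mirroring "cannot be null"
  }
  // Compare by division so that n_points * dim cannot overflow.
  return dataset.size() % n_points == 0 && dataset.size() / n_points == dim;
}
```

For example, a buffer of 6 floats is consistent with n_points = 3 and dim = 2, but not with n_points = 4.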
config is generated by the create_config.py script from the following parameters.
| Parameter | Data Type | Description | Value Range |
|---|---|---|---|
| n_leaves | int | Total number of subspaces in the IVF partition. | ≥ 1 |
| nb | int32_t | Number of vectors in the base library. | Must match the number of vectors in dataset. |
| metricType | std::string | Distance type of the vectors. | dot_product or squared_l2 |
| dims_per_block | int | Number of dimensions combined by PQ. | [1, dim], where dim is the dimension of the base library vectors. |
| avq_threshold | float | Asymmetric bucket parameter. Takes effect only for L2 (squared_l2) datasets. | [0, 1] |
| dim | int32_t | Dimension of the vectors in the base library. | Must match the dimension of the vectors in dataset. |
| topK | int | Number of returned results. | ≥ 1 |
| soar_lambda | float | Controls orthogonality. Takes effect only for IP (dot_product) datasets. | > 0. If the value is set to -1, the function is disabled. |
| overretrieve_factor | float | Used together with soar_lambda to specify the over-retrieval factor. Takes effect only for IP (dot_product) datasets. | [1, 2]. If the value is set to -1, the function is disabled. |
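The value ranges above can be checked before the configuration is generated. The following is a minimal sketch; the ConfigParams struct and the FirstInvalidParam function are hypothetical names introduced for illustration and are not part of the library. The nb and dim constraints that refer to dataset are not checked here because they need the dataset itself.

```cpp
#include <cstdint>
#include <string>

// Hypothetical container for the create_config.py inputs; field names follow the table.
struct ConfigParams {
  int n_leaves;
  std::int32_t nb;
  std::string metricType;
  int dims_per_block;
  float avq_threshold;
  std::int32_t dim;
  int topK;
  float soar_lambda;
  float overretrieve_factor;
};

// Returns an empty string when every checked value is inside its documented range,
// otherwise the name of the first offending parameter.
std::string FirstInvalidParam(const ConfigParams& p) {
  if (p.n_leaves < 1) return "n_leaves";
  if (p.metricType != "dot_product" && p.metricType != "squared_l2") {
    return "metricType";
  }
  if (p.dims_per_block < 1 || p.dims_per_block > p.dim) return "dims_per_block";
  // Written as a negated range test so that NaN is rejected as well.
  if (!(p.avq_threshold >= 0.0f && p.avq_threshold <= 1.0f)) return "avq_threshold";
  if (p.topK < 1) return "topK";
  // -1 disables the feature; any other value must be strictly positive.
  if (p.soar_lambda != -1.0f && !(p.soar_lambda > 0.0f)) return "soar_lambda";
  // -1 disables the feature; any other value must lie in [1, 2].
  if (p.overretrieve_factor != -1.0f &&
      !(p.overretrieve_factor >= 1.0f && p.overretrieve_factor <= 2.0f)) {
    return "overretrieve_factor";
  }
  return "";
}
```

Returning the parameter name rather than a bool keeps the sketch short while still telling the caller which value to fix.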
std::string command = "python create_config.py " + std::to_string(n_leaves) + " "
    + std::to_string(nb) + " "
    + metricType + " "
    + std::to_string(dims_per_block) + " "
    + std::to_string(avq_threshold) + " "
    + std::to_string(dim) + " "
    + std::to_string(topK) + " "
    + std::to_string(soar_lambda) + " "
    + std::to_string(overretrieve_factor);
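The concatenation above can be wrapped in a small function. The following is a sketch only: the function name BuildCreateConfigCommand is hypothetical, and it only assembles the command string in the documented argument order without running it. Note that in C++11 through C++23, std::to_string formats float values with six decimal places, so 0.2f is emitted as 0.200000.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper, not part of the library.
// Assembles the create_config.py command line in the documented argument order.
// It only builds the string; executing it is left to the caller.
std::string BuildCreateConfigCommand(int n_leaves, std::int32_t nb,
                                     const std::string& metricType,
                                     int dims_per_block, float avq_threshold,
                                     std::int32_t dim, int topK,
                                     float soar_lambda,
                                     float overretrieve_factor) {
  return "python create_config.py " + std::to_string(n_leaves) + " "
      + std::to_string(nb) + " "
      + metricType + " "
      + std::to_string(dims_per_block) + " "
      + std::to_string(avq_threshold) + " "
      + std::to_string(dim) + " "
      + std::to_string(topK) + " "
      + std::to_string(soar_lambda) + " "
      + std::to_string(overretrieve_factor);
}
```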
Return Value
| Data Type | Description |
|---|---|
| Status | Execution status of the method. Call status.ok() to determine whether the method executed successfully. |