Examples
The following uses the sift-128-euclidean.hdf5 dataset and 80 threads as an example.
- Obtain a dataset.
wget http://ann-benchmarks.com/sift-128-euclidean.hdf5 --no-check-certificate
- Obtain a test program: Link. The branch is v1.2.0. Assume that the program runs in the /path/to/kbest_test/test directory. The full directory structure is as follows:
├── build                        // Build files, which are automatically generated after compilation
├── CMakeLists.txt               // Compilation configuration file
├── data                         // Dataset
│   └── sift-128-euclidean.hdf5
├── graph                        // Built graph index, which needs to be manually created
│   └── sift_tsdg.kgn            // Built graph index, which is generated after the run executable file is executed (the value of index_save_or_load is save, and the value of save_types is save_graph in the dataset configuration file)
├── searcher                     // Built searcher, which needs to be manually created
│   └── sift.ksn                 // Built searcher, which is generated after the run executable file is executed (the value of index_save_or_load is save, and the value of save_types is save_searcher in the dataset configuration file)
├── main.cpp                     // File that contains the running functions
├── run                          // Executable file generated after compilation
└── sift.config                  // Dataset configuration file
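The directory structure marks graph and searcher as manually created, so a preparation step along the following lines may help before the first run. This is a minimal sketch, not part of the test program: TEST_DIR is a hypothetical variable that defaults to the current directory, which is assumed to be /path/to/kbest_test/test.

```shell
#!/bin/sh
# Sketch: prepare the folders that the directory structure expects.
# TEST_DIR is a hypothetical variable; it defaults to the current directory,
# which is assumed to be /path/to/kbest_test/test.
TEST_DIR="${TEST_DIR:-$PWD}"

# data holds the dataset; graph and searcher receive the saved index files.
mkdir -p "$TEST_DIR/data" "$TEST_DIR/graph" "$TEST_DIR/searcher"

# If the dataset was downloaded into the current directory, move it into data.
if [ -f sift-128-euclidean.hdf5 ]; then
    mv sift-128-euclidean.hdf5 "$TEST_DIR/data/"
fi
```

The script uses mkdir -p, so it is safe to rerun when the folders already exist.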
Procedure:
- Assume that the program runs in the /path/to/kbest_test/test directory. Store the dataset in the data folder of that directory.
- On the first run, ensure that index_save_or_load in the sift.config file is set to save. On subsequent runs, you can change the value to load to reuse the built graph index or searcher for queries.
- Install the dependencies.
yum install hdf5 hdf5-devel openssl-devel libcurl-devel numactl numactl-devel
- Compile the program.
mkdir build
cd build
cmake ..
make -j
mv run ../
- Run the executable file run.
./run 80 2 -1 sift.config
The test command parameters are described as follows:
./run <threads> <query_mode> <batch_size> <config_name>
- threads indicates the number of running threads.
- query_mode indicates the test mode. 1 indicates the batch query mode, in which batch_size queries are executed at a time. 2 indicates the concurrent single query mode, in which each thread executes a single query concurrently. In mode 2, batch_size is ignored.
- batch_size indicates the number of queries to be executed at a time in batch query mode. If batch_size is set to -1, all queries in the dataset are executed at a time.
- config_name indicates the name of the configuration file corresponding to the test dataset.
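The parameter rules above can be sketched as a small wrapper that checks the arguments before the test program is started. build_run_cmd is a hypothetical helper, not part of the test program. The checks that threads is a positive integer, and that batch_size in mode 1 is either -1 or a positive integer, are assumptions made for this sketch.

```shell
#!/bin/sh
# Hypothetical helper (not part of the test program): validates the four
# arguments and prints the resulting ./run command line.
build_run_cmd() {
    threads=$1; query_mode=$2; batch_size=$3; config_name=$4

    # Assumption: threads must be a positive integer.
    case $threads in
        ''|*[!0-9]*|0) echo "threads must be a positive integer" >&2; return 1 ;;
    esac

    # query_mode: 1 = batch query, 2 = concurrent single query.
    case $query_mode in
        1|2) ;;
        *) echo "query_mode must be 1 or 2" >&2; return 1 ;;
    esac

    # batch_size only matters in batch query mode (query_mode 1).
    # Assumption: it is either -1 (all queries at once) or a positive integer.
    if [ "$query_mode" = 1 ]; then
        case $batch_size in
            -1) ;;
            ''|*[!0-9]*|0) echo "batch_size must be -1 or a positive integer" >&2; return 1 ;;
        esac
    fi

    echo "./run $threads $query_mode $batch_size $config_name"
}

# Prints: ./run 80 2 -1 sift.config
build_run_cmd 80 2 -1 sift.config
```

The helper only prints the command, so the result can be inspected first and then executed from the /path/to/kbest_test/test directory.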
The command output is as follows:
