
Examples

This section shows how to call the KBest algorithm API from C++. The example uses the sift-128-euclidean.hdf5 dataset and runs the program with 80 threads.

Obtaining the Dataset and Test Program

  1. Obtain a dataset.
    wget http://ann-benchmarks.com/sift-128-euclidean.hdf5 --no-check-certificate
    
  2. Obtain a test program.
    Obtain it from this link (branch v1.2.0). This example assumes that the program runs in the directory /path/to/kbest_test/test. The full directory structure is as follows:
    ├── build                          // Build files, which are automatically generated after compilation
    ├── CMakeLists.txt                 // Compilation configuration file
    ├── data                           // Stores the dataset
    │   └── sift-128-euclidean.hdf5
    ├── graph                          // Stores the built graph index; this directory must be created manually
    │   └── sift_tsdg.kgn              // Built graph index, generated after the run executable file is executed (with index_save_or_load set to save and save_types set to save_graph in the dataset configuration file)
    ├── searcher                       // Stores the built searcher; this directory must be created manually
    │   └── sift.ksn                   // Built searcher, generated after the run executable file is executed (with index_save_or_load set to save and save_types set to save_searcher in the dataset configuration file)
    ├── main.cpp                       // File that contains the running functions
    ├── run                            // Executable file generated after compilation
    └── sift.config                    // Dataset configuration file
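
    The graph and searcher directories in the structure above must exist before the index is saved. As a minimal sketch, assuming C++17 and a hypothetical helper named ensure_index_dirs (it is not part of the KBest test program), they could be created programmatically instead of by hand:

```cpp
#include <cassert>
#include <filesystem>
#include <initializer_list>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper (not part of KBest): makes sure the `graph` and
// `searcher` subdirectories exist under `root`.
// Returns the names of the directories that were newly created.
std::vector<std::string> ensure_index_dirs(const fs::path& root) {
    std::vector<std::string> created;
    for (const char* name : {"graph", "searcher"}) {
        const fs::path dir = root / name;
        // create_directories returns true only if it actually created
        // the directory; it returns false if the directory already exists.
        if (fs::create_directories(dir)) {
            created.push_back(name);
        }
        assert(fs::is_directory(dir));
    }
    return created;
}
```

    Calling ensure_index_dirs("/path/to/kbest_test/test") before the first run would have the same effect as creating the two directories manually; calling it again changes nothing.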
    

Procedure

  1. Assume that the program running directory is /path/to/kbest_test/test. Save the dataset to the data folder in this directory.
  2. For the first run, ensure that index_save_or_load in the sift.config file is set to save. For subsequent runs, the value can be changed to load so that the built graph index or searcher is reused for queries.
  3. Install the dependencies.
    yum install hdf5 hdf5-devel openssl-devel libcurl-devel numactl numactl-devel
    
  4. Compile the program.
    mkdir build
    cd build
    cmake ..
    make -j
    mv run ../
    
  5. Run the executable file run.
    ./run 80 2 -1 sift.config
    

    The test command parameters are described as follows:

    ./run <threads> <query_mode> <batch_size> <config_name>
    • threads indicates the number of running threads.
    • query_mode indicates the test mode. If it is set to 1, the batch query mode is used: batch_size queries are executed at a time. If it is set to 2, the concurrent single-query mode is used: each thread executes a single query in parallel with the others. In this mode, batch_size has no effect.
    • batch_size indicates the number of queries executed at a time in batch query mode. If batch_size is set to -1, all queries in the dataset are executed in a single batch.
    • config_name indicates the name of the configuration file corresponding to the test dataset.
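
    The parameter semantics above can be illustrated with a short sketch. It is not the code in main.cpp; RunArgs, parse_run_args, and batch_ranges are hypothetical names introduced only to show how the four positional arguments and the batch_size value of -1 might be interpreted:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container for the four positional arguments of ./run.
struct RunArgs {
    int threads;              // number of running threads
    int query_mode;           // 1 = batch query, 2 = concurrent single query
    long batch_size;          // -1 = all queries at once (batch mode only)
    std::string config_name;  // dataset configuration file
};

// Parses <threads> <query_mode> <batch_size> <config_name> (program name excluded).
RunArgs parse_run_args(const std::vector<std::string>& args) {
    if (args.size() != 4) {
        throw std::invalid_argument(
            "expected: <threads> <query_mode> <batch_size> <config_name>");
    }
    RunArgs r{std::stoi(args[0]), std::stoi(args[1]), std::stol(args[2]), args[3]};
    if (r.threads <= 0) {
        throw std::invalid_argument("threads must be positive");
    }
    if (r.query_mode != 1 && r.query_mode != 2) {
        throw std::invalid_argument("query_mode must be 1 or 2");
    }
    // batch_size is only checked in batch mode; mode 2 does not use it.
    if (r.query_mode == 1 && r.batch_size != -1 && r.batch_size <= 0) {
        throw std::invalid_argument("batch_size must be -1 or positive");
    }
    return r;
}

// Splits num_queries into half-open [begin, end) ranges of at most
// batch_size queries each; -1 yields one range covering every query.
std::vector<std::pair<std::size_t, std::size_t>> batch_ranges(
        std::size_t num_queries, long batch_size) {
    assert(batch_size == -1 || batch_size > 0);
    const std::size_t step =
        (batch_size == -1) ? num_queries : static_cast<std::size_t>(batch_size);
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    if (step == 0) {
        return ranges;  // no queries to run
    }
    for (std::size_t begin = 0; begin < num_queries; begin += step) {
        ranges.emplace_back(begin, std::min(begin + step, num_queries));
    }
    return ranges;
}
```

    With the command shown in step 5, parse_run_args would yield 80 threads, query mode 2, and the configuration file sift.config, and the batch_size of -1 would go unused because mode 2 runs single queries concurrently.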

    The command output is as follows: