
Examples

The following uses the sift-128-euclidean.hdf5 dataset and 80 threads as an example.

  1. Obtain a dataset.
    wget http://ann-benchmarks.com/sift-128-euclidean.hdf5 --no-check-certificate
    
  2. Obtain a test program.
    Link. Use the v1.2.0 branch. This example assumes that the program runs in the /path/to/kbest_test/test directory, which has the following structure:
    ├── build                          // Build files, generated automatically during compilation
    ├── CMakeLists.txt                 // Build configuration file
    ├── data                           // Dataset directory
    │   └── sift-128-euclidean.hdf5
    ├── graph                          // Graph index directory, which must be created manually
    │   └── sift_tsdg.kgn              // Graph index, generated when run is executed with index_save_or_load set to save and save_types set to save_graph in the dataset configuration file
    ├── searcher                       // Searcher directory, which must be created manually
    │   └── sift.ksn                   // Searcher, generated when run is executed with index_save_or_load set to save and save_types set to save_searcher in the dataset configuration file
    ├── main.cpp                       // Source file that contains the test functions
    ├── run                            // Executable file generated by compilation
    └── sift.config                    // Dataset configuration file
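
The graph and searcher directories are not created by the build, so they need to exist before the first run. The following is a minimal sketch of that preparation, not part of the test program: prepare_layout is a hypothetical helper name introduced here, and it assumes the dataset was downloaded to the current directory.

```shell
# prepare_layout is a hypothetical helper, not shipped with the test program.
# It creates the data, graph, and searcher directories under the given root
# and moves the dataset into data/ if it is found in the current directory.
prepare_layout() {
    root="$1"
    mkdir -p "$root/data" "$root/graph" "$root/searcher"
    if [ -f sift-128-euclidean.hdf5 ]; then
        mv sift-128-euclidean.hdf5 "$root/data/"
    fi
}
```

For the example path used in this section, the call would be prepare_layout /path/to/kbest_test/test.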
    

Procedure:

  1. Using /path/to/kbest_test/test as the working directory, store the dataset in its data folder.
  2. For the first run, ensure that index_save_or_load in the sift.config file is set to save. For subsequent runs, the value can be changed to load so that the previously built graph index or searcher is used for queries.
  3. Install the dependencies.
    yum install hdf5 hdf5-devel openssl-devel libcurl-devel numactl numactl-devel
    
  4. Compile the program.
    mkdir build
    cd build
    cmake ..
    make -j
    mv run ../
    cd ..
    
  5. Run the executable file run.
    ./run 80 2 -1 sift.config
    

    The test command parameters are described as follows:

    ./run <threads> <query_mode> <batch_size> <config_name>
    • threads indicates the number of running threads.
    • query_mode indicates the test mode. 1 indicates the batch query mode, in which batch_size queries are executed at a time. 2 indicates the concurrent single query mode, in which each thread executes single queries concurrently and batch_size is ignored.
    • batch_size indicates the number of queries executed at a time in batch query mode. If batch_size is set to -1, all queries in the dataset are executed at once.
    • config_name indicates the name of the configuration file corresponding to the test dataset.
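
The parameter rules above can be checked before a long run is launched. The following is a rough sketch only: check_run_args is a hypothetical helper, and the requirements that threads and batch_size be positive integers are assumptions made here, not documented behavior of run. The function validates the four arguments and prints the resulting command without executing it.

```shell
# check_run_args is a hypothetical helper; it does not execute ./run.
check_run_args() {
    if [ "$#" -ne 4 ]; then
        echo "usage: ./run <threads> <query_mode> <batch_size> <config_name>" >&2
        return 1
    fi
    threads="$1"; query_mode="$2"; batch_size="$3"; config_name="$4"

    # Assumption: threads must be a positive integer.
    case "$threads" in
        ''|*[!0-9]*|0) echo "threads must be a positive integer" >&2; return 1 ;;
    esac

    # query_mode: 1 = batch query, 2 = concurrent single query.
    case "$query_mode" in
        1|2) ;;
        *) echo "query_mode must be 1 or 2" >&2; return 1 ;;
    esac

    # batch_size only matters in batch query mode (query_mode 1).
    # Assumption: it must be -1 (all queries) or a positive integer.
    if [ "$query_mode" = 1 ]; then
        case "$batch_size" in
            -1) ;;
            ''|*[!0-9]*|0) echo "batch_size must be -1 or a positive integer" >&2; return 1 ;;
        esac
    fi

    echo "./run $threads $query_mode $batch_size $config_name"
}
```

For example, check_run_args 80 2 -1 sift.config prints the command used in step 5, while a query_mode of 3 is rejected.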

    The command output is as follows: