Examples

This section shows how to call the KBest algorithm API from Python. The example uses the sift-128-euclidean.hdf5 dataset and runs the program with 80 threads.

Obtaining the Dataset and Test Program

  1. Obtain a dataset.
    wget http://ann-benchmarks.com/sift-128-euclidean.hdf5 --no-check-certificate
    
  2. Obtain a test program.
    Obtain it from this link (branch v1.2.0). This example assumes that the program runs in the directory /path/to/kbest_test/demo. The full directory structure is as follows:
    ├── ann_dataset                                           // Dataset to process
    ├── indices                                               // Directory for the built graph index, created automatically at run time (when save_types is set to save_graph in the corresponding dataset configuration file)
    │   └── sift-128-euclidean_TSDG_R_32_L_300.ksn            // Built graph index, generated automatically at run time (same save_graph setting)
    ├── searcher_indices                                      // Directory for the built searcher, created automatically at run time (when save_types is set to save_searcher in the corresponding dataset configuration file)
    │   └── sift-128-euclidean_TSDG_R_32_L_300.ksn            // Built searcher, generated automatically at run time (same save_searcher setting)
    ├── datasets                                              // Stores the dataset
    │   └── sift-128-euclidean.hdf5
    ├── main.py                                               // File that contains the running functions
    ├── sift_99.json                                          // Dataset configuration file
    └── run.sh                                                // Example script
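
Before the first run, it can help to confirm that the files from the tree above are in place. The following is a minimal sketch, not part of the KBest test program: the helper name missing_entries and the REQUIRED list are hypothetical, and the list assumes that only the dataset, main.py, and sift_99.json are needed up front (indices and searcher_indices are created at run time).

```python
from pathlib import Path

# Assumption: these are the entries needed before the first run, taken from
# the directory tree in this section. indices/ and searcher_indices/ are
# left out because they are created automatically at run time.
REQUIRED = [
    "datasets/sift-128-euclidean.hdf5",
    "main.py",
    "sift_99.json",
]


def missing_entries(demo_dir, required=REQUIRED):
    """Return the required relative paths that do not exist under demo_dir."""
    root = Path(demo_dir)
    return [rel for rel in required if not (root / rel).exists()]


if __name__ == "__main__":
    # /path/to/kbest_test/demo is the placeholder directory used in this section.
    missing = missing_entries("/path/to/kbest_test/demo")
    print("missing:", missing if missing else "none")
```

An empty result means the layout matches the tree. Any path that is printed still has to be downloaded or copied into the demo directory.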
    

Procedure

  1. This procedure assumes that the program running directory is /path/to/kbest_test/demo. Place the dataset in the datasets folder in that directory.
  2. Install the dependencies.
    pip install scikit-learn h5py psutil numpy==1.24.2
    yum install numactl numactl-devel
    
  3. Run main.py.
    python main.py 80 -1 sift_99.json
    

    The test command parameters are described as follows:

    python main.py <threads> <batch_size> <json_name>
    • threads indicates the number of running threads.
    • batch_size indicates the number of queries executed per batch in batch query mode. If batch_size is set to -1, all queries in the dataset are executed in a single batch.
    • json_name indicates the name of the configuration file corresponding to the test dataset.
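
The three positional parameters and the meaning of batch_size = -1 can be illustrated with a short sketch. This is not the code in main.py, whose implementation is not shown here. The function names parse_args and batch_ranges are hypothetical, and the validation rules are assumptions based only on the parameter descriptions above.

```python
def parse_args(argv):
    """Parse <threads> <batch_size> <json_name> from a list of strings."""
    if len(argv) != 3:
        raise SystemExit("usage: python main.py <threads> <batch_size> <json_name>")
    threads, batch_size, json_name = int(argv[0]), int(argv[1]), argv[2]
    if threads < 1:
        raise SystemExit("threads must be a positive integer")
    if batch_size != -1 and batch_size < 1:
        raise SystemExit("batch_size must be -1 or a positive integer")
    return threads, batch_size, json_name


def batch_ranges(num_queries, batch_size):
    """Yield (start, end) index pairs; -1 means one batch holding every query."""
    # max(..., 1) keeps range() valid when there are no queries at all.
    step = max(num_queries, 1) if batch_size == -1 else batch_size
    for start in range(0, num_queries, step):
        yield start, min(start + step, num_queries)


if __name__ == "__main__":
    # Same arguments as the example command: python main.py 80 -1 sift_99.json
    print(parse_args(["80", "-1", "sift_99.json"]))
    print(list(batch_ranges(10, -1)))  # one batch covering all 10 queries
    print(list(batch_ranges(10, 4)))   # batches of 4, 4, and 2 queries
```

With batch_size set to -1 the generator yields a single range that spans the whole query set. Any positive value splits the queries into consecutive batches, the last of which may be smaller.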

    The command output is as follows: