Loading the Dataset

ANN-Benchmarks provides pre-generated datasets in HDF5 format, a Docker container for each algorithm, and test suites for verifying functional integrity. This section describes how to obtain a dataset.

  1. Download the dataset.

    Method 1: Use the wget command to download the dataset. To download other datasets, change the dataset name in the wget command.

    This test uses the gist-960-euclidean.hdf5 dataset as an example.
    mkdir -p /data/ann-benchmarks-main/data
    cd /data/ann-benchmarks-main/data
    wget http://ann-benchmarks.com/gist-960-euclidean.hdf5 --no-check-certificate
    

    Method 2: Download the dataset and upload it to the /data/ann-benchmarks-main/data directory on the server.

    Access this page, scroll down to the Data sets section, and select the required dataset from the Download list.

  2. Change the owner of the dataset file (the milvus user is used as an example).
    chown -R milvus:milvus /data/ann-benchmarks-main/data/gist-960-euclidean.hdf5
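
The two steps above can be combined into a small helper so that switching datasets only requires changing the dataset name. The following is a minimal sketch, not part of ANN-Benchmarks: the function names (dataset_url, dataset_path, fetch_dataset) and the DATA_DIR variable are hypothetical, and the URL pattern is assumed to match the gist-960-euclidean.hdf5 example for other dataset names.

```shell
#!/bin/sh
# Sketch: download a named dataset and hand it to a service user.
# DATA_DIR and all function names are illustrative assumptions.

DATA_DIR="${DATA_DIR:-/data/ann-benchmarks-main/data}"

# Print the download URL for a dataset name such as gist-960-euclidean.
# Assumes every dataset follows the same URL pattern as the example.
dataset_url() {
    printf 'http://ann-benchmarks.com/%s.hdf5\n' "$1"
}

# Print the local path where the dataset file is stored.
dataset_path() {
    printf '%s/%s.hdf5\n' "$DATA_DIR" "$1"
}

# Download the dataset unless a non-empty copy already exists,
# then change its owner and group to the given user.
fetch_dataset() {
    name="$1"
    owner="$2"
    target="$(dataset_path "$name")"
    mkdir -p "$DATA_DIR" || return 1
    if [ ! -s "$target" ]; then
        # Remove the partial file if the download fails.
        wget -O "$target" "$(dataset_url "$name")" || { rm -f "$target"; return 1; }
    fi
    chown "$owner:$owner" "$target"
}

# Example call (not run automatically):
#   fetch_dataset gist-960-euclidean milvus
```

Calling fetch_dataset a second time for the same dataset skips the download and only reapplies the ownership change. The chown call typically requires root privileges, as in the manual step.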