Version: v0.1.1

Benchmark

This document compares Elasticsearch, Qdrant, and Infinity on the following key metrics:

  • QPS
  • Recall
  • Time to insert & build index
  • Time to import & build index
  • Disk usage
  • Peak memory usage
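
This document does not spell out how recall is computed. A common definition for nearest-neighbour benchmarks is the fraction of the true top-k neighbours that appear in the returned top-k results, averaged over all queries. The sketch below assumes that definition; `recall_at_k` and its parameters are hypothetical names for illustration and are not part of any of the benchmarked systems.

```python
from typing import Sequence


def recall_at_k(
    retrieved: Sequence[Sequence[int]],
    ground_truth: Sequence[Sequence[int]],
    k: int,
) -> float:
    """Average fraction of the true top-k neighbours found in the top-k results.

    Assumption: each inner sequence holds vector IDs ordered by rank.
    """
    if len(retrieved) != len(ground_truth):
        raise ValueError("retrieved and ground_truth must cover the same queries")
    if not retrieved or k <= 0:
        raise ValueError("need at least one query and k > 0")

    hits = 0
    for found, truth in zip(retrieved, ground_truth):
        # Count IDs that appear in both the returned and the true top-k sets.
        hits += len(set(found[:k]) & set(truth[:k]))
    return hits / (k * len(retrieved))
```

Under this definition, a recall of 0.97 at k = 100 would mean that, on average, 97 of the 100 true nearest neighbours are returned per query.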

Versions

Database        Version
Elasticsearch   v8.13.0
Qdrant          v1.8.2
Infinity        v0.1.0

Run Benchmark

  1. Install the necessary dependencies:
pip install -r requirements.txt
  2. Download the required benchmark datasets to your /datasets folder.
  3. Start up the databases to compare:
docker compose up -d
  4. Run the benchmark:

    This Python script performs the following tasks:

    • Deletes the original data.
    • Re-inserts the data.
    • Calculates the time to insert data and build the index.
    • Calculates QPS.
    • Calculates query latencies.
python run.py
  5. Navigate to the results folder to view the results and latency of each query.
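
To illustrate how QPS and query latencies can be derived from timed queries, here is a minimal single-client sketch. It is not the code in run.py: `measure`, `nearest_rank_percentile`, and the `run_query` callable are hypothetical names, and the nearest-rank percentile method is an assumption.

```python
import time
from typing import Any, Callable, Dict, List, Sequence


def nearest_rank_percentile(sorted_values: Sequence[float], pct: int) -> float:
    """Return the pct-th percentile of pre-sorted values (nearest-rank method)."""
    n = len(sorted_values)
    if n == 0:
        raise ValueError("no values")
    rank = (pct * n + 99) // 100  # integer ceil(pct * n / 100)
    return sorted_values[max(rank, 1) - 1]


def measure(run_query: Callable[[Any], Any], queries: Sequence[Any]) -> Dict[str, float]:
    """Run every query once, sequentially, and summarise throughput and latency."""
    latencies: List[float] = []
    start = time.perf_counter()
    for query in queries:
        t0 = time.perf_counter()
        run_query(query)  # hypothetical client call to the database under test
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start

    latencies.sort()
    return {
        "queries": len(latencies),
        "qps": len(latencies) / elapsed,
        "mean_latency_s": sum(latencies) / len(latencies),
        "p99_latency_s": nearest_rank_percentile(latencies, 99),
    }
```

A call such as `measure(client_search, query_vectors)` would return the summary for one pass, where `client_search` stands in for whatever client function issues a single query.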

Benchmark Results

SIFT1M

  • Metric: L2
  • 10000 queries
Database       QPS    Recall  Time to insert & build index  Time to import & build index  Disk    Peak memory
Elasticsearch  934    0.992   131 s                         N/A                           874 MB  1.463 GB
Qdrant         1303   0.979   46 s                          N/A                           418 MB  1.6 GB
Infinity       16320  0.973   74 s                          28 s                          792 MB  0.95 GB

GIST1M

  • Metric: L2
  • 1000 queries
Database       QPS   Recall  Time to insert & build index  Time to import & build index  Disk    Peak memory
Elasticsearch  305   0.885   872 s                         N/A                           13 GB   6.9 GB
Qdrant         339   0.947   366 s                         N/A                           4.4 GB  7.3 GB
Infinity       2200  0.946   463 s                         112 s                         4.7 GB  6.0 GB

Dbpedia

  • 4160000 documents
  • 467 queries
Database       QPS  Time to insert & build index  Time to import & build index  Disk    Peak memory
Elasticsearch  777  291 s                         N/A                           2 GB    1.7 GB
Infinity       817  237 s                         123 s                         3.4 GB  0.49 GB

Enwiki

  • 33000000 documents
  • 100 queries
Database       QPS  Time to insert & build index  Time to import & build index  Disk   Peak memory
Elasticsearch  484  2289 s                        N/A                           28 GB  5.3 GB
Infinity       484  2321 s                        944 s                         54 GB  5.1 GB

Deprecated Benchmark

Infinity provides a Python script for benchmarking the SIFT1M and GIST1M datasets.

Build and start Infinity

You have two options for building Infinity. Choose the option that best fits your needs:

Download the Benchmark datasets

Download the benchmark datasets using the wget command:

# Download the SIFT1M benchmark dataset.
wget ftp://ftp.irisa.fr/local/texmex/corpus/sift.tar.gz
# Download the GIST1M benchmark dataset.
wget ftp://ftp.irisa.fr/local/texmex/corpus/gist.tar.gz

Alternatively, you can manually download the benchmark datasets by visiting http://corpus-texmex.irisa.fr/.

# Extract the SIFT1M archive and move its files into place.
# Create the target directory first in case it does not exist yet.
tar -zxvf sift.tar.gz
mkdir -p test/data/benchmark/sift_1m
mv sift/sift_base.fvecs test/data/benchmark/sift_1m/sift_base.fvecs
mv sift/sift_query.fvecs test/data/benchmark/sift_1m/sift_query.fvecs
mv sift/sift_groundtruth.ivecs test/data/benchmark/sift_1m/sift_groundtruth.ivecs

# Extract the GIST1M archive and move its files into place.
# Create the target directory first in case it does not exist yet.
tar -zxvf gist.tar.gz
mkdir -p test/data/benchmark/gist_1m
mv gist/gist_base.fvecs test/data/benchmark/gist_1m/gist_base.fvecs
mv gist/gist_query.fvecs test/data/benchmark/gist_1m/gist_query.fvecs
mv gist/gist_groundtruth.ivecs test/data/benchmark/gist_1m/gist_groundtruth.ivecs
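
The .fvecs and .ivecs files moved above store each vector as a little-endian 32-bit integer giving the dimension, followed by that many 32-bit values (floats in .fvecs, integers in .ivecs). To sanity-check a download, a small reader along the following lines may help. It is a standard-library sketch, not the loader the benchmark scripts use, and `read_vecs` is a hypothetical name.

```python
import struct
from typing import List, Tuple


def read_vecs(path: str, fmt: str) -> List[Tuple]:
    """Read a TEXMEX-style vector file.

    fmt is "f" for .fvecs (float32 components) or "i" for .ivecs (int32 components).
    """
    if fmt not in ("f", "i"):
        raise ValueError('fmt must be "f" or "i"')

    vectors: List[Tuple] = []
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if not header:
                break  # clean end of file
            if len(header) < 4:
                raise ValueError("truncated dimension header")
            (dim,) = struct.unpack("<i", header)
            payload = f.read(4 * dim)
            if len(payload) != 4 * dim:
                raise ValueError("truncated vector payload")
            vectors.append(struct.unpack(f"<{dim}{fmt}", payload))
    return vectors
```

For example, `len(read_vecs("test/data/benchmark/sift_1m/sift_query.fvecs", "f")[0])` gives the dimension of the first SIFT1M query vector. Loading a full base file this way is slow and memory-hungry, so an array library is a better fit for anything beyond a spot check.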

Benchmark dependencies

cd python

pip install -r requirements.txt
pip install .

Import the Benchmark datasets

cd benchmark

# options:
# -h, --help show this help message and exit
# -d DATA_SET, --data DATA_SET

python remote_benchmark_knn_import.py -d sift_1m
python remote_benchmark_knn_import.py -d gist_1m

Run Benchmark

# options:
# -h, --help show this help message and exit
# -t THREADS, --threads THREADS
# -r ROUNDS, --rounds ROUNDS
# -d DATA_SET, --data DATA_SET

# ROUNDS is the number of times the benchmark is executed; the reported result is the average duration per round.

# Perform a latency benchmark on the SIFT1M dataset using a single thread, running it only once.
python remote_benchmark_knn.py -t 1 -r 1 -d sift_1m
# Perform a latency benchmark on the GIST1M dataset using a single thread, running it only once.
python remote_benchmark_knn.py -t 1 -r 1 -d gist_1m

# Perform a QPS benchmark on the SIFT1M dataset using 16 threads, running it only once.
python remote_benchmark_knn.py -t 16 -r 1 -d sift_1m
# Perform a QPS benchmark on the GIST1M dataset using 16 threads, running it only once.
python remote_benchmark_knn.py -t 16 -r 1 -d gist_1m
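
To make the roles of the -t and -r options concrete, the sketch below shows one way a client could issue queries from several threads and average the duration over several rounds. It only illustrates the idea and is not the implementation of remote_benchmark_knn.py; `run_round`, `benchmark`, and `run_query` are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, Sequence


def run_round(run_query: Callable[[Any], Any], queries: Sequence[Any], threads: int) -> float:
    """Issue all queries from a pool of worker threads; return the wall-clock duration."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # list() forces every query to finish before the timer stops.
        list(pool.map(run_query, queries))
    return time.perf_counter() - start


def benchmark(
    run_query: Callable[[Any], Any],
    queries: Sequence[Any],
    threads: int,
    rounds: int,
) -> Dict[str, float]:
    """Repeat the round `rounds` times and report the average duration and QPS."""
    durations = [run_round(run_query, queries, threads) for _ in range(rounds)]
    avg = sum(durations) / rounds
    return {"rounds": rounds, "avg_duration_s": avg, "qps": len(queries) / avg}
```

With `threads=1` the average duration divided by the number of queries approximates per-query latency, while a larger thread count measures throughput under concurrent load. This matches how the commands above pair `-t 1` with latency and `-t 16` with QPS.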

A SIFT1M Benchmark report

  • Hardware: Intel i5-12500H, 16C, 16GB
  • Operating system: Ubuntu 22.04
  • Dataset: SIFT1M; topk: 100; recall: 97%+
  • P99 QPS: 15,688 (16 clients)
  • P99 Latency: 0.36 ms
  • Memory usage: 408 MB