Version: DEV

Benchmark

This document compares Elasticsearch, Qdrant, Quickwit, and Infinity on the following key metrics:

  • Time to insert & build index
  • Time to import & build index
  • Query latency
  • QPS

The benchmark does not record resource usage (persisted index size, peak memory, peak CPU, system load, and so on); you need to monitor it manually.

Keep the environment clean so that the database under test can use all of the system's resources.

Avoid running multiple databases at the same time, because each one consumes significant resources.
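Because resource usage has to be observed by hand, a small poller can help. The following is a minimal sketch, not part of the benchmark suite: the function names are hypothetical, it is Linux-only because it reads /proc/meminfo, and it reports system-wide figures rather than per-container ones, which is only meaningful when a single database is running.

```python
import os
import time


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field: value} (values are mostly in kB)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def used_memory_kib(meminfo: dict[str, int]) -> int:
    """Memory in use, taken here as MemTotal minus MemAvailable."""
    return meminfo["MemTotal"] - meminfo["MemAvailable"]


def watch(duration_s: float, interval_s: float = 1.0) -> dict[str, float]:
    """Poll the system for duration_s seconds and return the peak values seen."""
    peak_used_kib, peak_load = 0, 0.0
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        with open("/proc/meminfo") as f:
            info = parse_meminfo(f.read())
        peak_used_kib = max(peak_used_kib, used_memory_kib(info))
        peak_load = max(peak_load, os.getloadavg()[0])  # 1-minute load average
        time.sleep(interval_s)
    return {"peak_used_gib": peak_used_kib / 1024**2, "peak_load_1m": peak_load}
```

Calling `watch(duration_s=600)` in a second terminal while a benchmark step runs would return the peak memory in use and the peak 1-minute load average over that window.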

Test environment:

  • OS: OpenSUSE Tumbleweed x86_64
  • CPU: Intel CORE i5-13500H 16vCPU
  • RAM: 32GB
  • Disk: 1TB

Versions

Engine          Version
Elasticsearch   v8.13.4
Qdrant          v1.9.2
Quickwit        v0.8.1
Infinity        v0.2.0

Run Benchmark

  1. Install necessary dependencies.
cd python/benchmark
pip install -r requirements.txt
  2. Download the required benchmark datasets to your /datasets folder:

Preprocess the dataset:

sed '1d' datasets/enwiki/enwiki-20120502-lines-1k.txt > datasets/enwiki/enwiki.csv
  3. Start up the databases to compare:
mkdir -p $HOME/elasticsearch/data
docker run -d --name elasticsearch --network host -e "discovery.type=single-node" -e "ES_JAVA_OPTS=-Xms16384m -Xmx32000m" -e "xpack.security.enabled=false" -v $HOME/elasticsearch/data:/usr/share/elasticsearch/data elasticsearch:8.13.4

mkdir -p $HOME/qdrant/storage
docker run -d --name qdrant --network host -v $HOME/qdrant/storage:/qdrant/storage qdrant/qdrant:v1.9.2

mkdir -p $HOME/quickwit
docker run -d --name quickwit --network=host -v $HOME/quickwit/qwdata:/quickwit/qwdata quickwit/quickwit:0.8.1 run

mkdir -p $HOME/infinity
docker run -d --name infinity --network=host -v $HOME/infinity:/var/infinity --ulimit nofile=500000:500000 infiniflow/infinity:nightly
  4. Run the benchmark:

Drop the file cache before running the benchmark.

echo 3 | sudo tee /proc/sys/vm/drop_caches

The Python script run.py performs the following tasks:

  • Generate a full-text query set.
  • Measure the time to import data and build the index.
  • Measure the query latency.
  • Measure the QPS.
$ python run.py -h
usage: run.py [-h] [--generate] [--import] [--query QUERY] [--query-express QUERY_EXPRESS] [--concurrency CONCURRENCY] [--engine ENGINE] [--dataset DATASET]

RAG Database Benchmark

options:
  -h, --help            show this help message and exit
  --generate            Generate fulltext query set based on the dataset (default: False)
  --import              Import dataset into database engine (default: False)
  --query QUERY         Run the query set only once using given number of clients with recording the result and latency. This is for result validation and latency analysis (default: 0)
  --query-express QUERY_EXPRESS
                        Run the query set randomly using given number of clients without recording the result and latency. This is for QPS measurement. (default: 0)
  --concurrency CONCURRENCY
                        Choose concurrency mechanism, one of: mp - multiprocessing(recommended), mt - multithreading. (default: mp)
  --engine ENGINE       Choose database engine to benchmark, one of: infinity, qdrant, elasticsearch, quickwit (default: infinity)
  --dataset DATASET     Choose dataset to benchmark, one of: gist, sift, geonames, enwiki, tantivy (default: enwiki)
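The help text separates a latency mode (--query) from a throughput mode (--query-express). To make the two metrics concrete, here is a minimal sketch of how a P95 latency and a QPS figure could be derived from raw measurements. It is an illustration only and does not reproduce the code in run.py; the function names and the nearest-rank percentile definition are assumptions.

```python
import math


def percentile(latencies_ms: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    ordered = sorted(latencies_ms)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def qps(completed_queries: int, wall_seconds: float) -> float:
    """Throughput over all clients: completed queries divided by wall-clock time."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return completed_queries / wall_seconds
```

Latency needs one timing per query, which is why the latency mode records results, whereas QPS only needs a query count and the total wall-clock time.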

The following commands benchmark the infinity engine on the enwiki dataset:

python run.py --generate --engine infinity --dataset enwiki
python run.py --import --engine infinity --dataset enwiki
python run.py --query=16 --engine infinity --dataset enwiki
python run.py --query-express=16 --engine infinity --dataset enwiki
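The four steps can also be chained from a small driver. The sketch below is a hypothetical helper, not part of the benchmark suite; it assumes only the flags listed in the help output and that it is started from the python/benchmark directory, where run.py lives.

```python
import subprocess
import sys


def benchmark_commands(engine: str, dataset: str, clients: int = 16) -> list[list[str]]:
    """Build the four run.py invocations in execution order."""
    common = ["--engine", engine, "--dataset", dataset]
    steps = [
        ["--generate"],
        ["--import"],
        [f"--query={clients}"],
        [f"--query-express={clients}"],
    ]
    return [[sys.executable, "run.py", *step, *common] for step in steps]


def run_all(engine: str, dataset: str, clients: int = 16) -> None:
    """Run each step in turn; check=True stops at the first failing step."""
    for cmd in benchmark_commands(engine, dataset, clients):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)
```

For example, `run_all("infinity", "enwiki")` would issue the same four commands as above, one after another.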

The following commands issue a single query against each engine so that you can compare the results across engines:

curl -X GET "http://localhost:9200/elasticsearch_enwiki/_search" -H 'Content-Type: application/json' -d'{"size":10,"_source":"doctitle","query": {"match": { "body": "wraysbury istorijos" }}}'

curl -X GET "http://localhost:7280/api/v1/_elastic/quickwit_enwiki/_search" -H 'Content-Type: application/json' -d'{"query": {"query_string": {"query": "wraysbury istorijos", "fields": [ "body" ] } },"sort": ["_score"],"size":10}'

psql -h 0.0.0.0 -p 5432 -c "SELECT doctitle, ROW_ID(), SCORE() FROM infinity_enwiki SEARCH MATCH TEXT ('body', 'wraysbury istorijos', 'topn=10');"
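For scripted comparisons, the Elasticsearch request above can be issued from Python with the standard library alone. This is a minimal sketch with hypothetical function names; the index and field names are taken from the curl command, and it sends POST rather than GET, which Elasticsearch also accepts for _search.

```python
import json
import urllib.request


def build_match_request(index: str, text: str, size: int = 10,
                        host: str = "http://localhost:9200") -> urllib.request.Request:
    """Build the same match query on the body field as the curl command."""
    body = {"size": size, "_source": "doctitle", "query": {"match": {"body": text}}}
    return urllib.request.Request(
        url=f"{host}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def top_titles(response_json: dict) -> list[tuple[str, float]]:
    """Extract (doctitle, score) pairs from an Elasticsearch search response."""
    return [(hit["_source"]["doctitle"], hit["_score"])
            for hit in response_json["hits"]["hits"]]


def search(index: str, text: str) -> list[tuple[str, float]]:
    """Send the query to a running Elasticsearch instance and return the top hits."""
    with urllib.request.urlopen(build_match_request(index, text)) as resp:
        return top_titles(json.load(resp))
```

With the container from step 3 running and the dataset imported, `search("elasticsearch_enwiki", "wraysbury istorijos")` would return the titles and scores to set against the other engines' output.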

Benchmark Results

SIFT1M

  • Metric: L2
  • 10000 queries
Engine         QPS    Recall  Time to insert & build index  Time to import & build index  Disk    Peak memory
Elasticsearch  934    0.992   131 s                         N/A                           874 MB  1.463 GB
Qdrant         1303   0.979   46 s                          N/A                           418 MB  1.6 GB
Infinity       16320  0.973   74 s                          28 s                          792 MB  0.95 GB

GIST1M

  • Metric: L2
  • 1000 queries
Engine         QPS   Recall  Time to insert & build index  Time to import & build index  Disk    Peak memory
Elasticsearch  305   0.885   872 s                         N/A                           13 GB   6.9 GB
Qdrant         339   0.947   366 s                         N/A                           4.4 GB  7.3 GB
Infinity       2200  0.946   463 s                         112 s                         4.7 GB  6.0 GB
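To read the two vector-search tables side by side, the reported QPS figures can be normalized against a baseline engine. The sketch below does only that arithmetic; the numbers are copied from the SIFT1M and GIST1M results above, and the helper name is hypothetical. Recall differs between engines, so the ratios are not a like-for-like comparison.

```python
# QPS figures copied from the SIFT1M and GIST1M results above.
REPORTED_QPS = {
    "SIFT1M": {"Elasticsearch": 934, "Qdrant": 1303, "Infinity": 16320},
    "GIST1M": {"Elasticsearch": 305, "Qdrant": 339, "Infinity": 2200},
}


def relative_qps(results: dict[str, int], baseline: str) -> dict[str, float]:
    """Express each engine's QPS as a multiple of the baseline engine's QPS."""
    base = results[baseline]
    return {engine: value / base for engine, value in results.items()}


def report(baseline: str = "Elasticsearch") -> None:
    """Print the normalized QPS for every dataset."""
    for dataset, results in REPORTED_QPS.items():
        for engine, ratio in relative_qps(results, baseline).items():
            print(f"{dataset:8s} {engine:14s} {ratio:6.2f}x")
```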

Enwiki

  • 33,000,000 documents
  • 100,000 OR queries generated from the dataset. All terms are extracted from the dataset, and very rare terms (fewer than 100 occurrences) are excluded. The number of terms per query follows the weights [0.03, 0.15, 0.25, 0.25, 0.15, 0.08, 0.04, 0.03, 0.02].
Engine         Time to insert & build index  Time to import & build index  P95 latency (ms)  QPS (16 Python clients)  Memory   vCPU
Elasticsearch  2289 s                        N/A                           14.75             1340                     21.0 GB  10.6
Quickwit       3962 s                        N/A                           65.55             179                      1.2 GB   11.3
Infinity       1562 s                        2244 s                        1.37              13731                    10.0 GB  11.0
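The query set described above can be sketched as weighted sampling. This is an illustration under stated assumptions, not the generator used by run.py: it assumes the nine weights correspond to queries of 1 to 9 terms, that terms within a query are distinct, and that the vocabulary holds at least nine terms. All function names are hypothetical.

```python
import random

# Weights from the dataset description; mapping them to 1..9 terms is an assumption.
TERM_COUNT_WEIGHTS = [0.03, 0.15, 0.25, 0.25, 0.15, 0.08, 0.04, 0.03, 0.02]


def frequent_terms(term_counts: dict[str, int], min_occurrence: int = 100) -> list[str]:
    """Drop very rare terms, keeping those that occur at least min_occurrence times."""
    return sorted(t for t, count in term_counts.items() if count >= min_occurrence)


def generate_or_query(vocabulary: list[str], rng: random.Random) -> str:
    """Draw a term count from the weights, then pick that many distinct terms."""
    counts = range(1, len(TERM_COUNT_WEIGHTS) + 1)
    n_terms = rng.choices(counts, weights=TERM_COUNT_WEIGHTS, k=1)[0]
    return " ".join(rng.sample(vocabulary, n_terms))


def generate_query_set(vocabulary: list[str], n_queries: int, seed: int = 0) -> list[str]:
    """Generate a reproducible list of space-separated OR queries."""
    rng = random.Random(seed)
    return [generate_or_query(vocabulary, rng) for _ in range(n_queries)]
```

Seeding the generator keeps the query set identical across engines, which matters when latency and QPS are compared.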

Deprecated Benchmark

Infinity provides a Python script for benchmarking the SIFT1M and GIST1M datasets.

Build and start Infinity

You have two options for building Infinity. Choose the option that best fits your needs:

Download the Benchmark datasets

You can download the benchmark datasets with wget:

# Download the SIFT benchmark.
wget ftp://ftp.irisa.fr/local/texmex/corpus/sift.tar.gz
# Download the GIST benchmark.
wget ftp://ftp.irisa.fr/local/texmex/corpus/gist.tar.gz

Alternatively, you can manually download the benchmark datasets by visiting http://corpus-texmex.irisa.fr/.

# Extract the SIFT1M archive and move the benchmark files.
tar -zxvf sift.tar.gz
mv sift/sift_base.fvecs test/data/benchmark/sift_1m/sift_base.fvecs
mv sift/sift_query.fvecs test/data/benchmark/sift_1m/sift_query.fvecs
mv sift/sift_groundtruth.ivecs test/data/benchmark/sift_1m/sift_groundtruth.ivecs

# Extract the GIST1M archive and move the benchmark files.
tar -zxvf gist.tar.gz
mv gist/gist_base.fvecs test/data/benchmark/gist_1m/gist_base.fvecs
mv gist/gist_query.fvecs test/data/benchmark/gist_1m/gist_query.fvecs
mv gist/gist_groundtruth.ivecs test/data/benchmark/gist_1m/gist_groundtruth.ivecs
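To sanity-check the moved files before importing them, they can be read directly. In the .fvecs and .ivecs layout used by these corpora, each record is a little-endian int32 dimension d followed by d 4-byte components (float32 for .fvecs, int32 for .ivecs). The reader below is a minimal sketch with hypothetical function names; it loads everything into Python tuples, so it suits the query and ground-truth files better than the one-million-vector base files.

```python
import struct
from pathlib import Path


def parse_vecs(data: bytes, component: str = "f") -> list[tuple]:
    """Decode .fvecs bytes (component="f") or .ivecs bytes (component="i")."""
    vectors, offset = [], 0
    while offset < len(data):
        (dim,) = struct.unpack_from("<i", data, offset)  # per-record dimension
        offset += 4
        vectors.append(struct.unpack_from(f"<{dim}{component}", data, offset))
        offset += 4 * dim
    return vectors


def read_vecs(path: str) -> list[tuple]:
    """Read a vector file, choosing the component type from the file extension."""
    component = "i" if path.endswith(".ivecs") else "f"
    return parse_vecs(Path(path).read_bytes(), component)
```

For example, `len(read_vecs("test/data/benchmark/sift_1m/sift_query.fvecs"))` gives the number of query vectors in the file, and the length of the first tuple gives their dimension.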

Benchmark dependencies

cd python

pip install -r requirements.txt
pip install .

Import the Benchmark datasets

cd benchmark

# options:
# -h, --help show this help message and exit
# -d DATA_SET, --data DATA_SET

python remote_benchmark_knn_import.py -d sift_1m
python remote_benchmark_knn_import.py -d gist_1m

Run Benchmark

# options:
# -h, --help show this help message and exit
# -t THREADS, --threads THREADS
# -r ROUNDS, --rounds ROUNDS
# -d DATA_SET, --data DATA_SET

# ROUNDS is the number of times the benchmark is executed; the reported result is the average duration per round.

# Perform a latency benchmark on the SIFT1M dataset using a single thread, running it only once.
python remote_benchmark_knn.py -t 1 -r 1 -d sift_1m
# Perform a latency benchmark on the GIST1M dataset using a single thread, running it only once.
python remote_benchmark_knn.py -t 1 -r 1 -d gist_1m

# Perform a QPS benchmark on the SIFT1M dataset using 16 threads, running it only once.
python remote_benchmark_knn.py -t 16 -r 1 -d sift_1m
# Perform a QPS benchmark on the GIST1M dataset using 16 threads, running it only once.
python remote_benchmark_knn.py -t 16 -r 1 -d gist_1m

A SIFT1M Benchmark report

  • Hardware: Intel i5-12500H, 16C, 16GB
  • Operating system: Ubuntu 22.04
  • Dataset: SIFT1M; topk: 100; recall: 97%+
  • P99 QPS: 15,688 (16 clients)
  • P99 Latency: 0.36 ms
  • Memory usage: 408 MB
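The recall figure in the report compares the returned neighbours with the ground-truth file. A minimal sketch of recall@k follows, assuming each query has a list of returned IDs and a list of true nearest-neighbour IDs such as those in sift_groundtruth.ivecs. The function name is hypothetical, and this is not the benchmark script's own implementation.

```python
def recall_at_k(retrieved: list[list[int]], groundtruth: list[list[int]], k: int = 100) -> float:
    """Average fraction of the true top-k neighbours present in the retrieved top-k."""
    if not retrieved or len(retrieved) != len(groundtruth):
        raise ValueError("need one ground-truth list per query")
    hits = 0
    for got, truth in zip(retrieved, groundtruth):
        hits += len(set(got[:k]) & set(truth[:k]))
    return hits / (k * len(retrieved))
```

With topk set to 100, a result of 0.97 or higher corresponds to the "97%+" recall quoted in the report.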