This page provides additional recommendations to improve the performance of your MongoDB Vector Search queries.
Ensure Enough Memory
The Hierarchical Navigable Small Worlds (HNSW) algorithm works efficiently when vector data is held in memory. You must ensure that the data nodes have enough RAM to hold the vector data and indexes. We recommend deploying separate Search Nodes for workload isolation without data isolation, which enables more efficient use of memory for vector search use cases. The following table shows approximate per-vector space requirements for some common embedding models:
| Embedding Model | Vector Dimension | Space Requirement |
|---|---|---|
| OpenAI | 1536 | 6 kb |
| Google | 768 | 3 kb |
| Cohere | 1024 | 1.07 kb (for `int8`), 0.167 kb (for `int1`) |
The reduced Cohere space requirements assume the quantized vectors are stored as BinData. To learn more, see Ingest Quantized Vectors.
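The space requirements above follow from the number of dimensions times the bytes per stored value (4 bytes for `float32`, 1 byte for `int8`, 1/8 byte for `int1`). A minimal sketch of this arithmetic, ignoring index and metadata overhead (so the quantized figures come out slightly below the table's values):

```python
def raw_vector_bytes(dimensions: int, bytes_per_value: float) -> float:
    """Raw storage for a single vector: dimensions x bytes per value.
    Ignores HNSW index structures and per-document metadata overhead."""
    return dimensions * bytes_per_value

# float32 embeddings use 4 bytes per dimension
openai_bytes = raw_vector_bytes(1536, 4)       # 6144 bytes, roughly 6 kb
google_bytes = raw_vector_bytes(768, 4)        # 3072 bytes, roughly 3 kb

# Quantized embeddings: int8 is 1 byte per dimension, int1 is 1 bit
cohere_int8 = raw_vector_bytes(1024, 1)        # 1024 bytes, roughly 1 kb
cohere_int1 = raw_vector_bytes(1024, 1 / 8)    # 128 bytes

def estimated_ram_gb(num_vectors: int, bytes_per_vector: float) -> float:
    """Rough lower bound on the RAM needed to keep every vector resident."""
    return num_vectors * bytes_per_vector / 1e9

# Example: 10 million OpenAI embeddings need roughly 61 GB for the
# vectors alone, before index overhead.
print(estimated_ram_gb(10_000_000, openai_bytes))
```

Treat this as a lower bound when sizing Search Nodes: the HNSW graph itself and document metadata add further memory on top of the raw vector data.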
Warm up the Filesystem Cache
When you perform vector search without dedicated Search Nodes, your queries initially perform random seeks on disk as they traverse the Hierarchical Navigable Small Worlds graph and read the vector values into memory. To avoid paying this cost on production traffic, warm the filesystem cache by issuing a set of representative queries before serving live requests. When using Search Nodes, this cache warming typically only occurs in the event of an index rebuild, usually during scheduled maintenance windows.
Avoid Indexing Vectors When Running Queries
Vector embeddings consume computational resources during indexing. As a result, indexing and querying at the same time may cause resource bottlenecks. When performing an initial sync, ensure that your Search Node CPU usage drops back to approximately 0%, indicating that segments have been merged and flushed to disk, before issuing test queries.
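One way to automate the wait is to poll a CPU metric until it stays near zero for several consecutive readings. A minimal sketch: `get_cpu_percent` is a caller-supplied function, for example one backed by your Atlas monitoring metrics; the threshold and polling interval are illustrative assumptions:

```python
import time

def wait_for_indexing_to_settle(get_cpu_percent, threshold=5.0,
                                stable_checks=3, interval_s=30.0):
    """Block until get_cpu_percent() stays below `threshold` for
    `stable_checks` consecutive readings, then return True.

    get_cpu_percent: zero-argument callable returning the current
    Search Node CPU usage as a percentage (how you obtain this metric
    is up to your monitoring setup).
    """
    consecutive = 0
    while consecutive < stable_checks:
        if get_cpu_percent() < threshold:
            consecutive += 1
        else:
            consecutive = 0  # a spike resets the stability counter
        time.sleep(interval_s)
    return True
```

Requiring several consecutive low readings guards against declaring the sync finished during a brief lull between segment merges.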