Qdrant vs ChromaDB: Choosing the Right Vector Database for Your Stack

When you start building a RAG pipeline or a semantic search feature, two names come up almost immediately: Qdrant and ChromaDB. Both are open-source vector databases. Both run locally without a cloud subscription. Both can store embeddings and find the nearest neighbors to a query vector. So which one should you actually use?

The short answer is: it depends on your stack and your team’s workflow. The longer answer is this article.

This is a practical comparison, not a marketing table. You will see real commands, real API responses, and real trade-offs. By the end you will know which tool fits your use case, rather than just which one has a longer feature list on its website.

If you have not yet installed either database, see the Qdrant getting started guide or the ChromaDB server installation guide for setup instructions. The rest of this article assumes you can run both and focuses entirely on comparison.

What They Have in Common

Before getting into differences, it is worth grounding yourself in what both tools share:

Both store vectors (arrays of floats) alongside metadata (arbitrary key-value pairs).
Both support nearest-neighbor search: given a query vector, find the stored vectors closest to it.
Both support metadata filtering: restrict search results to records matching certain conditions.
Both run on a single machine without external dependencies and persist data to disk.
Neither enforces authentication in a default install. Qdrant can be configured with an API key, but out of the box both accept unauthenticated requests and need a reverse proxy or firewall for production security.
Both have Docker images and can be deployed in containers.
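
To make the shared core concrete, here is a minimal pure-Python sketch of what both databases do conceptually: rank stored vectors by similarity to a query and optionally restrict the candidates by metadata. It is an exact brute-force search, not the HNSW index either database actually uses, and the record layout and the search function are illustrative assumptions rather than either tool’s API.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(records, query, limit=3, where=None):
    """Exact nearest-neighbor search with optional equality filters on metadata."""
    where = where or {}
    # Keep only records whose metadata matches every requested key/value pair.
    candidates = [
        r for r in records
        if all(r["metadata"].get(k) == v for k, v in where.items())
    ]
    # Most similar first.
    ranked = sorted(
        candidates,
        key=lambda r: cosine_similarity(r["vector"], query),
        reverse=True,
    )
    return ranked[:limit]

records = [
    {"id": "a", "vector": [0.1, 0.9, 0.2, 0.8], "metadata": {"category": "runbook"}},
    {"id": "b", "vector": [0.8, 0.2, 0.9, 0.1], "metadata": {"category": "tutorial"}},
    {"id": "c", "vector": [0.2, 0.8, 0.1, 0.9], "metadata": {"category": "tutorial"}},
]

query = [0.12, 0.88, 0.22, 0.78]
hits = search(records, query, limit=2)
filtered = search(records, query, limit=2, where={"category": "tutorial"})
print([h["id"] for h in hits])      # ['a', 'c']
print([h["id"] for h in filtered])  # ['c', 'b']
```

Everything a vector database adds on top of this (indexes, persistence, an API) exists to make the same operation fast and durable at scale.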

The overlap is real. For a small prototype or a solo project, either tool works fine. The differences become meaningful at the edges: language integration, embedded vs. server-only mode, performance under load, and how you manage collections over time.

Prerequisites

To run the examples in this article you need:

Ubuntu 20.04, 22.04, or 24.04
Docker installed and running
curl and jq for Qdrant examples
Python 3.8+ with pip and venv for ChromaDB examples
At least 2 GB of free RAM

Architecture and Data Model

Qdrant

Qdrant is written in Rust. It runs as a standalone binary or Docker container and exposes a REST API and a gRPC API. There is no embedded library mode: Qdrant is always a separate process.

Its data model has three layers:

Collections: top-level containers, like tables. You define the vector dimension and distance metric when you create a collection. These settings are permanent.
Points: individual records. Each point has an id, a vector, and an optional payload (a JSON object with arbitrary fields).
Payload: metadata stored alongside the vector. Qdrant can index payload fields for fast filtering.

Collection: articles
  └── Point { id: 1, vector: [0.1, 0.9, ...], payload: { category: "runbook" } }
  └── Point { id: 2, vector: [0.8, 0.2, ...], payload: { category: "tutorial" } }

Qdrant uses HNSW (Hierarchical Navigable Small World) as its vector index. This gives approximate nearest-neighbor search that scales to millions of vectors while staying fast: query latency on a dataset of 1 million points is typically in the single-digit milliseconds, depending on hardware, vector dimension, and index settings.

ChromaDB

ChromaDB is written in Python (with parts in Rust and Go for the newer server mode). It can run in two modes:

Embedded mode: ChromaDB runs inside your Python process. No separate service, no ports. Fastest path for prototyping.
Server mode: ChromaDB runs as a standalone HTTP server. Other processes connect over the network. This is what production use looks like.

Its data model uses the same concepts but different names:

Collections: same as Qdrant. You set the distance metric via a metadata field ("hnsw:space": "cosine").
Documents: the text strings you add. ChromaDB can embed them automatically.
Embeddings: the vectors, stored alongside documents.
Metadatas: the payload equivalent.
IDs: string identifiers (Qdrant, by contrast, accepts unsigned integers or UUIDs).

Collection: knowledge_base
  └── { id: "doc1", document: "restart nginx...", embedding: [...], metadata: { category: "nginx" } }
  └── { id: "doc2", document: "kubernetes pods...", embedding: [...], metadata: { category: "k8s" } }

ChromaDB also uses HNSW internally via the hnswlib library.
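
The two data models map onto each other almost one to one. As a rough sketch, the function below reshapes Qdrant-style points into the parallel lists that ChromaDB’s add() expects. The function name and the assumption that each payload keeps its source text under a "text" key are illustrative and not part of either API.

```python
def points_to_chroma_kwargs(points, text_field="text"):
    """Reshape Qdrant-style points into keyword arguments for ChromaDB's add().

    Assumes (hypothetically) that each payload stores its source text under
    `text_field`; that value becomes the ChromaDB document.
    """
    ids, embeddings, metadatas, documents = [], [], [], []
    for point in points:
        payload = dict(point.get("payload") or {})      # copy, leave the input intact
        ids.append(str(point["id"]))                    # ChromaDB IDs are strings
        embeddings.append(list(point["vector"]))
        documents.append(payload.pop(text_field, ""))   # text moves out of the payload
        metadatas.append(payload)                       # the rest becomes metadata
    return {
        "ids": ids,
        "embeddings": embeddings,
        "metadatas": metadatas,
        "documents": documents,
    }

points = [
    {"id": 1, "vector": [0.1, 0.9, 0.2, 0.8],
     "payload": {"text": "restart nginx service", "category": "runbook"}},
]
kwargs = points_to_chroma_kwargs(points)
print(kwargs["ids"], kwargs["documents"], kwargs["metadatas"])
```

With a ChromaDB collection at hand, the result could be passed along as collection.add(**kwargs).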

Key architectural difference: Qdrant is a first-class server with no embedded option. ChromaDB’s embedded mode is a first-class feature used in most tutorials. The embedded mode is convenient but creates an invisible wall: the moment two separate processes need to share the same vector store, you must switch to server mode and update your client code.

Installation: Side-by-Side

Starting both databases with Docker is the quickest path for comparison. Both containers run detached (-d), so a single terminal is enough.

Qdrant:

mkdir -p ~/qdrant/storage
docker run -d \
  --name qdrant \
  --restart unless-stopped \
  -p 6333:6333 \
  -v ~/qdrant/storage:/qdrant/storage \
  qdrant/qdrant:latest

Verify:

curl -s http://localhost:6333/healthz
# healthz check passed

ChromaDB:

mkdir -p ~/chromadb/data
docker run -d \
  --name chromadb \
  --restart unless-stopped \
  -p 8000:8000 \
  -v ~/chromadb/data:/chroma/chroma \
  chromadb/chroma:latest

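Verify (this example uses the v1 API path; recent ChromaDB releases serve the heartbeat at /api/v2/heartbeat instead):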

curl -s http://localhost:8000/api/v1/heartbeat
# {"nanosecond heartbeat": 1717000000000000000}

Both are now running. Qdrant uses port 6333, ChromaDB uses port 8000.

API Design: REST vs. Python-First

This is one of the most practical differences.

Qdrant: Language-Neutral REST API

Qdrant’s primary interface is REST. You can operate it entirely with curl: no SDK and no language runtime required. This means:

Any language that can make HTTP calls can use Qdrant.
Shell scripts, Go services, Node.js applications, and Python scripts all talk to the same API in the same way.
The official Qdrant clients for Python, Go, and TypeScript are thin wrappers around the same REST and gRPC APIs.

Create a collection and insert a point with plain curl:

# Create collection
curl -s -X PUT http://localhost:6333/collections/docs \
  -H "Content-Type: application/json" \
  -d '{"vectors": {"size": 4, "distance": "Cosine"}}' | jq

# Insert a point
curl -s -X PUT http://localhost:6333/collections/docs/points \
  -H "Content-Type: application/json" \
  -d '{
    "points": [{
      "id": 1,
      "vector": [0.1, 0.9, 0.2, 0.8],
      "payload": {"text": "restart nginx service", "category": "runbook"}
    }]
  }' | jq

This works from a Bash script on a CI server, from a Go microservice, or from Postman. Nothing special required.
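
To illustrate the "no SDK required" point, here is the same upsert expressed with nothing but Python’s standard library. It is a sketch that only builds the HTTP request so you can inspect it; the helper name build_upsert_request is hypothetical, and the URL assumes the Docker setup shown earlier.

```python
import json
import urllib.request

QDRANT_URL = "http://localhost:6333"  # same host and port as the Docker example

def build_upsert_request(collection, points, base_url=QDRANT_URL):
    """Build (but do not send) the PUT request that upserts points."""
    body = json.dumps({"points": points}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/collections/{collection}/points",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )

request = build_upsert_request("docs", [{
    "id": 1,
    "vector": [0.1, 0.9, 0.2, 0.8],
    "payload": {"text": "restart nginx service", "category": "runbook"},
}])
print(request.get_method(), request.full_url)

# To send it against a running Qdrant instance:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

The request body is byte-for-byte the same JSON the curl command sends, which is what makes the API language-neutral in practice.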

ChromaDB: Python-First, HTTP Second

ChromaDB’s primary interface is its Python client. The HTTP API exists and works, but it is lower-level and not what ChromaDB documents as the normal path. If you are building in Python, the API is excellent:

import chromadb

client = chromadb.HttpClient(host="localhost", port=8000)
collection = client.get_or_create_collection("docs", metadata={"hnsw:space": "cosine"})

collection.add(
    documents=["restart nginx service"],
    metadatas=[{"category": "runbook"}],
    ids=["doc1"]
)

If you are building in Node.js or Go, ChromaDB has clients for those languages, but they are less mature than the Python client and the REST API documentation is sparse compared to Qdrant’s.

Verdict: If your application is Python, ChromaDB’s API is more natural and requires less boilerplate. If your application is anything else (Node.js, Go, Rust, or a mixed-language system), Qdrant’s REST-first design means you never get a second-class experience.

Built-In Embedding: ChromaDB’s Biggest Differentiator

ChromaDB’s most frequently cited advantage is built-in automatic embedding. When you add documents, ChromaDB can embed them for you using the bundled all-MiniLM-L6-v2 model from SentenceTransformers:

collection.add(
    documents=["How to restart nginx", "Kubernetes scaling with HPA"],
    ids=["d1", "d2"]
)

results = collection.query(query_texts=["restart the web server"], n_results=1)
print(results["documents"][0])
# ['How to restart nginx']

You pass plain text strings, with no embedding step in your code. ChromaDB does it automatically on both .add() and .query(). This is the reason ChromaDB dominates beginner RAG tutorials: the pipeline from text to semantic search is only a few lines of code.

Qdrant has no built-in embedding. You must generate vectors yourself and pass them as floats. That means calling an external embedding model (Ollama, OpenAI, or a local SentenceTransformers script, for example) before every insert and query.

# With Qdrant you always deal with raw vectors
curl -s -X POST http://localhost:6333/collections/docs/points/search \
  -H "Content-Type: application/json" \
  -d '{"vector": [0.12, 0.88, 0.22, 0.78], "limit": 3, "with_payload": true}' | jq

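The trade-off: ChromaDB’s built-in embedding is a shortcut that works great up to a point. The problem appears in production when you need consistency between indexing time and query time. Vectors produced by different models are not comparable, so if you index with all-MiniLM-L6-v2 and later want to switch to nomic-embed-text because it is more accurate, you must re-embed and re-index every collection, whichever database you use. The difference is where the model choice lives. With ChromaDB’s default embedding function it is implicit in the client, which makes it easy for an indexing job and a query service to drift apart without anyone noticing. With Qdrant the embedding model is decoupled from the database: the database only stores and retrieves float arrays, so the model is always an explicit part of your own pipeline.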
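
One way to guard against an indexing-time and query-time mismatch, in either database, is to record which model produced a collection’s vectors and check it before every query. The sketch below assumes you keep that record in a plain dictionary (for example in collection metadata or a config file); the function names and metadata keys are hypothetical.

```python
def embedding_fingerprint(model_name, dimension):
    """Metadata describing which model produced a collection's vectors."""
    return {"embedding_model": model_name, "embedding_dim": dimension}

def assert_same_model(collection_metadata, model_name, dimension):
    """Raise if the query-time model differs from the indexing-time model."""
    expected = embedding_fingerprint(model_name, dimension)
    stored = {key: collection_metadata.get(key) for key in expected}
    if stored != expected:
        raise ValueError(
            f"collection was indexed with {stored}, but the query uses {expected}"
        )

# Recorded once, when the collection is first indexed.
metadata = embedding_fingerprint("all-MiniLM-L6-v2", 384)

assert_same_model(metadata, "all-MiniLM-L6-v2", 384)  # same model: passes silently

try:
    assert_same_model(metadata, "nomic-embed-text", 768)
except ValueError as error:
    print(f"refusing to query: {error}")
```

A check like this turns a silent relevance problem into an immediate, readable error.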

For a quick prototype: ChromaDB’s auto-embedding saves time. For a production system with a controlled embedding pipeline: Qdrant’s separation of concerns is cleaner.

Filtering Capabilities

Both databases support filtering search results by metadata. The syntax differs.

Qdrant filtering uses a structured JSON filter in the search request. The operators (must, should, must_not) map directly to AND, OR, and NOT:

curl -s -X POST http://localhost:6333/collections/docs/points/search \
  -H "Content-Type: application/json" \
  -d '{
    "vector": [0.12, 0.88, 0.22, 0.78],
    "limit": 3,
    "with_payload": true,
    "filter": {
      "must": [{"key": "category", "match": {"value": "runbook"}}]
    }
  }' | jq

Filters work without any extra setup, but Qdrant can also back them with a dedicated payload index (you create one with a PUT request). On large collections an indexed filter is significantly faster than scanning all records.

ChromaDB filtering uses a where clause with MongoDB-style operators:

results = collection.query(
    query_texts=["restart the web server"],
    n_results=3,
    where={"category": "runbook"}
)

For compound conditions:

results = collection.query(
    query_texts=["restart the web server"],
    n_results=3,
    where={"$and": [{"category": "runbook"}, {"reviewed": True}]}
)

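ChromaDB also supports where_document to filter by substring match in the document text itself. Qdrant has no separate document field; its closest equivalent is a full-text match condition on a payload field, which means storing the text in the payload and ideally creating a full-text index for it.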

Verdict: Both get the job done. Qdrant’s explicit payload indexes make filtering faster on large datasets. ChromaDB’s where_document is useful when you want to combine semantic and keyword search in one query without extra code.
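
If you need to support both databases, or migrate from one to the other, the two filter syntaxes can be generated from the same input. The helpers below are a minimal sketch covering only equality conditions combined with AND; the function names are hypothetical, and range or OR conditions would need more work.

```python
def to_qdrant_filter(conditions):
    """Equality conditions -> Qdrant filter where every condition must match."""
    return {
        "must": [
            {"key": key, "match": {"value": value}}
            for key, value in conditions.items()
        ]
    }

def to_chroma_where(conditions):
    """Equality conditions -> ChromaDB where clause.

    A single condition is passed as a plain dict; $and is only used
    when there are two or more conditions.
    """
    clauses = [{key: value} for key, value in conditions.items()]
    if not clauses:
        return None
    if len(clauses) == 1:
        return clauses[0]
    return {"$and": clauses}

conditions = {"category": "runbook", "reviewed": True}
print(to_qdrant_filter(conditions))
print(to_chroma_where(conditions))
```

The Qdrant result goes into the "filter" field of a search request, and the ChromaDB result is what you would pass as where= to collection.query().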

Performance and Scalability

Qdrant’s Rust implementation gives it an edge in raw throughput and memory efficiency. Published benchmarks generally show Qdrant ahead of Python-based alternatives in queries per second and memory per vector at scale, although results vary with hardware, dataset, and index settings, so treat them as a starting point and measure your own workload.

For a practical sense of the difference:

On a machine with 4 GB of RAM, Qdrant can handle roughly 1–2 million vectors in a single collection, provided the vectors are small (a few hundred dimensions) or you enable on-disk storage or quantization.
ChromaDB’s server mode is more memory-hungry because the Python process, the hnswlib index, and any active embedding models compete for memory. As a rough rule of thumb, expect ChromaDB to use 2–3x more RAM than Qdrant for the same dataset.
ChromaDB’s built-in embedding (all-MiniLM-L6-v2) keeps the model loaded in memory, which is RAM that Qdrant never uses because it has no embedding model at all.

For datasets under 100,000 vectors, the performance difference is rarely noticeable in practice: both typically respond within a few milliseconds. The gap becomes material at 500,000+ vectors under concurrent query load.
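
How far a given amount of RAM goes depends heavily on vector dimension, so it helps to do the arithmetic before trusting any round number. The sketch below estimates only the raw vector data, assuming 4-byte floats; the HNSW graph, payloads, and process overhead come on top, and the function name is illustrative.

```python
BYTES_PER_FLOAT32 = 4

def raw_vector_gib(num_vectors, dimension):
    """Size of the raw float32 vectors alone, in GiB (no index or payload)."""
    return num_vectors * dimension * BYTES_PER_FLOAT32 / 2**30

for dim in (384, 768, 1536):
    print(f"1M vectors x {dim} dims: {raw_vector_gib(1_000_000, dim):.2f} GiB")
# 1M vectors x 384 dims: 1.43 GiB
# 1M vectors x 768 dims: 2.86 GiB
# 1M vectors x 1536 dims: 5.72 GiB
```

A million 384-dimensional vectors fit comfortably in a few gigabytes, while the same count at 1536 dimensions does not fit in 4 GB without on-disk storage or compression.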

Production Readiness

Feature	Qdrant	ChromaDB
Authentication	Optional API key, off by default	None built-in
TLS/HTTPS	Via reverse proxy	Via reverse proxy
Backup	Built-in snapshot API	Manual directory copy
Horizontal scaling	Distributed mode (self-hosted)	Not supported (server is single-node)
Language clients	Python, TypeScript, Go, Rust, .NET, Java	Python, TypeScript, Go (less mature)
gRPC support	Yes	No
Dashboard / UI	Built-in web UI at :6333/dashboard	No built-in UI

Qdrant’s built-in snapshot API is a notable advantage. You can trigger a backup over HTTP without touching the filesystem:

curl -s -X POST http://localhost:6333/collections/docs/snapshots | jq

The resulting file is a self-contained snapshot you can copy to another machine and restore with a single API call.

ChromaDB has no equivalent. Backup means copying the data directory while the server is idle, or accepting the risk of data loss during the copy.

Qdrant also ships a distributed mode for running across multiple nodes. ChromaDB’s server mode is single-node only; horizontal scaling requires putting it behind a load balancer with shared persistent storage, which is not officially supported.

When to Use Each

Choose ChromaDB when:

Your application is Python-only and you want the fastest path from text to semantic search.
You are building a prototype, a research tool, or a personal assistant where every extra service to deploy is a cost you would rather avoid.
You want built-in embedding to reduce the number of moving parts you manage.
Your dataset is under 100,000 documents and you do not expect it to grow significantly.

Choose Qdrant when:

Your application is not Python, or you have a mixed-language stack.
You are building for production and need predictable performance under load.
You want the embedding model decoupled from the vector store, so you can upgrade or swap models independently.
You need a built-in UI for debugging, backup snapshots via API, or a gRPC interface.
Your dataset is large (500,000+ vectors) or will grow over time.
You need multi-node scaling or high availability.

Common Mistakes and Troubleshooting

Using ChromaDB embedded mode and later needing multi-process access

The embedded PersistentClient locks the data directory to one process. If a second Python script or a Node.js service tries to open the same path simultaneously, you will get file lock errors. Switch to HttpClient pointing at a running ChromaDB server from the start.

Qdrant collection dimension mismatch

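Once a collection is created with "size": 768, every inserted vector must have exactly 768 dimensions. Switching to an embedding model with a different output size without dropping and recreating the collection causes insert failures. A model with the same output size is worse: inserts succeed, but old and new vectors are not comparable, so search quality degrades silently. Always drop and recreate when changing embedding models.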
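
A cheap way to catch a dimension problem early is to check a batch of vectors before sending it, so the error names the offending records instead of surfacing as a failed request. This is a minimal sketch; the function name is hypothetical and expected_dim is whatever "size" you created the collection with.

```python
def find_dimension_mismatches(vectors, expected_dim):
    """Return (index, actual_length) for every vector with the wrong size."""
    return [
        (index, len(vector))
        for index, vector in enumerate(vectors)
        if len(vector) != expected_dim
    ]

batch = [
    [0.1, 0.9, 0.2, 0.8],   # correct: 4 dimensions
    [0.1, 0.9, 0.2],        # wrong: 3 dimensions
]
problems = find_dimension_mismatches(batch, expected_dim=4)
print(problems)  # [(1, 3)]
```

Running this check in the indexing job costs almost nothing and fails before any data reaches the database.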

ChromaDB distance metric cannot be changed after creation

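The "hnsw:space" metadata field is set once at collection creation. Creating a collection without it uses squared Euclidean distance (L2). For text embeddings cosine is the more common choice: L2 is sensitive to vector length, so it can rank results differently when embeddings are not normalized, and its distances are harder to interpret. Use "hnsw:space": "cosine" from the start. If you forget, drop and recreate the collection.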

Comparing similarity scores between Qdrant and ChromaDB

Qdrant returns scores where 1.0 is a perfect cosine match. ChromaDB returns distances where 0.0 is a perfect cosine match (because it returns the raw distance, not the similarity). Do not compare score values across the two databases.
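
The two conventions are related by a single subtraction: cosine distance is 1 minus cosine similarity. The sketch below computes both from scratch to show the relationship; the helper names are illustrative, and the conversion applies only to collections that use the cosine metric on both sides.

```python
import math

def cosine_similarity(a, b):
    """Similarity in [-1, 1]; 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cosine_distance(a, b):
    """Distance form of the same measure; 0.0 means a perfect match."""
    return 1.0 - cosine_similarity(a, b)

def distance_to_similarity(distance):
    """Convert a cosine distance into a similarity-style score."""
    return 1.0 - distance

stored = [0.1, 0.9, 0.2, 0.8]
query = [0.12, 0.88, 0.22, 0.78]

similarity = cosine_similarity(stored, query)  # higher is better
distance = cosine_distance(stored, query)      # lower is better
print(f"similarity={similarity:.4f} distance={distance:.4f}")
```

If you log or threshold on scores, convert to one convention first and keep it consistent across your code.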

Best Practices

Decouple the embedding model from the database in any serious project. Whether you use Qdrant or ChromaDB, generate embeddings with a dedicated model (Ollama’s nomic-embed-text, for example) rather than relying on the database’s built-in function. This gives you portability: the same vectors work in either database, and you can migrate between them without re-embedding.

Create payload indexes before querying large collections in Qdrant. Without an index, every filtered query scans all points. Add indexes for fields you filter on frequently:

curl -s -X PUT http://localhost:6333/collections/docs/index \
  -H "Content-Type: application/json" \
  -d '{"field_name": "category", "field_schema": "keyword"}' | jq

In ChromaDB, always use get_or_create_collection in application code. create_collection fails if the collection already exists. get_or_create_collection is idempotent and safe to call at application startup.

Do not expose either database directly to the internet. Neither enforces authentication in a default install. Put Qdrant behind a firewall rule (ufw) and ChromaDB behind Nginx with HTTP Basic Auth. See the ChromaDB server guide for the Nginx setup, and apply the same approach to Qdrant port 6333.

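Use stable, predictable IDs. Both databases can replace a record in place when you write an existing ID again: in Qdrant, the points PUT request shown earlier overwrites the point, and in ChromaDB you call collection.upsert() rather than collection.add(), which does not overwrite existing IDs. Use IDs derived from your content (a hash of the document path, for example) so re-indexing the same document replaces the old version cleanly without duplicates.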
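
One detail to watch: Qdrant accepts only unsigned integers or UUIDs as point IDs, so a raw hex digest will not work there. A name-based UUID is a simple way to get a content-derived ID that is valid in both databases. The sketch below uses Python’s standard uuid module; the function name and the choice of the document path as input are assumptions for illustration.

```python
import uuid

def stable_point_id(document_path):
    """Deterministic UUID derived from the document path.

    The same path always yields the same ID, so re-indexing a document
    targets the existing record instead of creating a duplicate.
    """
    return str(uuid.uuid5(uuid.NAMESPACE_URL, document_path))

first = stable_point_id("docs/runbooks/restart-nginx.md")
again = stable_point_id("docs/runbooks/restart-nginx.md")
other = stable_point_id("docs/runbooks/scale-k8s.md")

print(first == again)  # True
print(first == other)  # False
```

The resulting string can be used as the "id" of a Qdrant point or as a ChromaDB ID without any further conversion.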

Conclusion

Qdrant and ChromaDB solve the same core problem but optimize for different things.

ChromaDB optimizes for developer ergonomics in Python. You can go from a text corpus to semantic search results in a few lines of code. The built-in embedding, the simple collection API, and the embedded mode make it the fastest tool to get started with, especially for prototyping RAG applications.

Qdrant optimizes for production reliability and language neutrality. Its REST-first API works from any language without compromise. Its Rust core is memory-efficient and fast at scale. Its snapshot API, built-in dashboard, and distributed mode give you the operational tooling a production service needs.

If you are prototyping in Python and want to move fast, start with ChromaDB. If you are building something that will run in production, serve traffic from a non-Python backend, or grow beyond 100,000 vectors, Qdrant is the more durable foundation.

For a practical end-to-end example of building a RAG API on top of ChromaDB, see the Fastify RAG API tutorial. The vectorstore.js module in that tutorial is the layer you would replace with a Qdrant client if you decide to switch, and the rest of the application stays identical.