What vector databases do and why they matter
Traditional databases search by exact or pattern match. Vector databases search by semantic similarity — finding content that means the same thing even when phrased differently. This is what enables RAG systems to retrieve "our policy on refunds after 30 days" in response to "can I get my money back next month?" — even though no words match.
The core operation: every piece of text is converted to a high-dimensional vector (typically 384–1536 numbers) by an embedding model. Texts with similar meaning cluster together in this vector space. A query vector is compared against the stored vectors (exhaustively for small collections, or via an approximate index at scale) using cosine similarity or dot product, and the closest matches are returned. The search itself takes milliseconds even across millions of vectors.
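The clustering idea is easy to see with a toy example: cosine similarity ranks semantically close vectors highest. A minimal sketch, using hand-made 4-dimensional vectors standing in for real embedding-model output:

```python
import numpy as np

# Toy "embeddings" standing in for real model output
# (real models produce 384-1536 dimensions; 4 here for readability).
docs = {
    "refund policy": np.array([0.9, 0.1, 0.0, 0.1]),
    "shipping times": np.array([0.1, 0.9, 0.2, 0.0]),
    "money back":    np.array([0.7, 0.3, 0.1, 0.1]),
}
query = np.array([0.85, 0.15, 0.05, 0.1])  # e.g. "can I get my money back?"

def cosine(a, b):
    # Cosine similarity: dot product of the two vectors, length-normalised
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rank documents by similarity to the query, best first
ranked = sorted(docs, key=lambda name: cosine(query, docs[name]), reverse=True)
print(ranked)  # refund-related texts rank above "shipping times"
```

A real vector database performs exactly this ranking, just with an index structure that avoids comparing against every stored vector.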
Vector database options: comparison for AI automation
| Database | Type | Free tier | Best for | Setup complexity |
|---|---|---|---|---|
| FAISS (Meta) | In-memory library | Fully free | Development, single-server production under 1M vectors | None (Python library) |
| Chroma | Embedded / server | Fully free OSS | Local development, LangChain integration | Very low |
| Pinecone | Managed cloud | 1 index, 100K vectors | Production, managed, scales to billions | Low (API only) |
| Supabase pgvector | PostgreSQL extension | 500MB storage | Teams already using Supabase/PostgreSQL | Low |
| Qdrant | Open-source / cloud | Self-hosted free | Self-hosted production, advanced filtering | Medium |
| Weaviate | Open-source / cloud | Self-hosted free | Multi-modal (text + images), complex schemas | Medium |
FAISS: the free in-memory option for most use cases
FAISS (Facebook AI Similarity Search) is a Python library for efficient vector similarity search: no server, no subscription, no API calls. It runs entirely in your Python process. For knowledge bases up to approximately 1 million vectors on a server with 4+ GB of RAM, FAISS delivers excellent performance, with sub-millisecond search times on typical business knowledge bases.
```python
import faiss, numpy as np, pickle

# Build index (all_vectors: list of embeddings; chunks_metadata: parallel list of chunk dicts)
embeddings = np.array(all_vectors, dtype=np.float32)
faiss.normalize_L2(embeddings)  # Normalise so inner product == cosine similarity
index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(embeddings)

# Save to disk
faiss.write_index(index, "knowledge_base.faiss")
with open("metadata.pkl", "wb") as f:
    pickle.dump(chunks_metadata, f)

# Load later
index = faiss.read_index("knowledge_base.faiss")
with open("metadata.pkl", "rb") as f:
    metadata = pickle.load(f)

# Search (embed() is your embedding function, returning one vector per text)
query_vec = np.array([embed(query)], dtype=np.float32)
faiss.normalize_L2(query_vec)
scores, indices = index.search(query_vec, k=5)
results = [metadata[i] for i in indices[0] if i != -1]
```

Pinecone: managed production vector search
Pinecone is fully managed — no infrastructure to maintain, automatic scaling, and a simple REST API. The free tier (1 index, 100K vectors) handles most small business RAG applications. Pinecone shines for production deployments where managed infrastructure is worth the cost and you need to scale beyond what a single server can handle.
```python
from pinecone import Pinecone, ServerlessSpec
import os

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

# Create index (once)
if "knowledge-base" not in pc.list_indexes().names():
    pc.create_index(
        name="knowledge-base", dimension=1536, metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )
idx = pc.Index("knowledge-base")

# Upsert vectors with metadata (vector and text come from your embedding step)
idx.upsert([
    {"id": "chunk_001", "values": vector, "metadata": {"text": text, "source": "FAQ", "category": "refunds"}},
    # ... more vectors
])

# Query with metadata filter
results = idx.query(
    vector=query_vec, top_k=5,
    filter={"category": {"$eq": "refunds"}},  # Only search refund-related chunks
    include_metadata=True,
)
for match in results.matches:
    print(f"Score: {match.score:.3f} | {match.metadata['text'][:100]}")
```

Supabase pgvector: SQL + vectors together
pgvector extends PostgreSQL with vector storage and similarity search, and Supabase provides managed PostgreSQL with pgvector available. The key advantage: combine vector search with SQL filters and joins in a single query — "find the 5 most relevant support articles that are also tagged as applying to Enterprise plan customers." This kind of relational hybrid filtering is hard to replicate cleanly in dedicated vector databases, whose metadata filters are more limited than full SQL.
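On the application side, the query embedding is passed to such a query as a pgvector string literal (a bracketed, comma-separated list of numbers). A minimal sketch, assuming a psycopg2 connection and a `knowledge_chunks` table like the one defined below:

```python
def to_pgvector(vec):
    # pgvector accepts string literals of the form "[0.1,0.2,0.3]"
    return "[" + ",".join(str(float(x)) for x in vec) + "]"

# Hybrid query: SQL filter plus vector ordering, fully parameterised
HYBRID_SQL = """
SELECT content, source, 1 - (embedding <=> %s::vector) AS similarity
FROM knowledge_chunks
WHERE category = %s
ORDER BY embedding <=> %s::vector
LIMIT %s
"""

def search_category(conn, query_embedding, category, k=5):
    # conn: a psycopg2 connection to your Supabase/PostgreSQL database
    vec = to_pgvector(query_embedding)
    with conn.cursor() as cur:
        cur.execute(HYBRID_SQL, (vec, category, vec, k))
        return cur.fetchall()

print(to_pgvector([0.25, 0.5]))  # -> [0.25,0.5]
```

Parameterising the vector literal this way avoids string-concatenating user input into SQL; the connection setup itself (`psycopg2.connect` with your Supabase connection string) is left to your application.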
```sql
-- Enable pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create table with vector column
CREATE TABLE knowledge_chunks (
    id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
    content TEXT NOT NULL,
    source TEXT,
    category TEXT,
    embedding VECTOR(1536)
);

-- Create index for fast similarity search
CREATE INDEX ON knowledge_chunks USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);

-- Hybrid search: semantic + filter
SELECT content, source, 1 - (embedding <=> $1) AS similarity
FROM knowledge_chunks
WHERE category = 'enterprise'  -- SQL filter
ORDER BY embedding <=> $1      -- Vector similarity
LIMIT 5;
```

Choosing the right vector database for your use case
For learning and development: FAISS or Chroma. No setup, no cost, runs locally. Perfect for building and testing RAG systems before production deployment.
For small production deployments (under 500K vectors, single server): FAISS persisted to disk or Chroma with SQLite backend. Still free, adequate performance, minimal operational overhead.
For production needing SQL integration: Supabase pgvector if you are already using Supabase/PostgreSQL, or neon.tech for a serverless PostgreSQL-with-pgvector option.
For managed production at any scale: Pinecone. The free tier gets you to 100K vectors; the Starter plan ($70/month) scales further. Worth the cost when infrastructure management time is more expensive than the subscription.
For self-hosted production with advanced features: Qdrant provides the best combination of performance, advanced filtering, and open-source licensing for teams comfortable with self-hosting.
Full RAG implementation: RAG pipeline automation guide — covers the complete indexing and query pipeline using these vector databases.
Frequently asked questions
How many vectors can the free tiers hold?
FAISS and Chroma are open-source with no vector limits — you are bounded only by your server RAM (approximately 6MB per 1,000 1536-dimension vectors). Pinecone free tier: 100,000 vectors in 1 index. Supabase free tier: 500MB database storage, roughly 80,000 1536-dimension vectors at ~6KB each. For most small business knowledge bases (FAQ pages, product documentation, policy documents), these free tiers are more than sufficient.
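The RAM figure above is simple arithmetic — a float32 value is 4 bytes, so storage scales linearly with vector count and dimension. A quick sanity check:

```python
def index_memory_bytes(num_vectors, dimensions, bytes_per_value=4):
    # Flat (uncompressed) index: float32 = 4 bytes per dimension
    return num_vectors * dimensions * bytes_per_value

# 1,000 vectors at 1536 dimensions
mb = index_memory_bytes(1_000, 1536) / 1_000_000
print(f"{mb:.1f} MB per 1,000 vectors")  # ~6.1 MB

# How many 1536-d vectors fit in 4 GB of RAM (index only, no metadata)?
print(4_000_000_000 // index_memory_bytes(1, 1536))
```

Quantised or compressed index types (e.g. FAISS product quantisation) reduce this substantially, at some cost in recall.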
Which embedding model should I use?
OpenAI text-embedding-3-small ($0.02/million tokens) provides excellent quality at low cost for most business text. OpenAI text-embedding-3-large provides higher quality at higher cost ($0.13/million tokens) — worth it for domains with highly technical or specialised vocabulary. For zero API cost, Sentence Transformers all-MiniLM-L6-v2 runs locally and provides good quality for general business text. Always use the same embedding model for indexing and querying — mixing models destroys retrieval quality.
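To put those per-token prices in context, here is a quick cost estimate for indexing a knowledge base (prices per million tokens as quoted above; the chunk counts are illustrative assumptions):

```python
PRICE_PER_M_TOKENS = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
    "all-MiniLM-L6-v2 (local)": 0.00,
}

def embedding_cost(num_chunks, avg_tokens_per_chunk, model):
    # One-time cost to embed the whole knowledge base
    tokens = num_chunks * avg_tokens_per_chunk
    return tokens / 1_000_000 * PRICE_PER_M_TOKENS[model]

# Indexing 10,000 chunks of ~500 tokens each (5M tokens total):
for model in PRICE_PER_M_TOKENS:
    print(f"{model}: ${embedding_cost(10_000, 500, model):.2f}")
```

Even for the larger model the one-time indexing cost is typically under a dollar at this scale; query-time embedding costs are smaller still, since each query is only a few dozen tokens.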
Keep building expertise
The complete guide covers every tool and strategy.
Complete AI Automation Guide →
ThinkForAI Editorial Team
Updated November 2024.

