What vector databases do and why they matter
Traditional databases search by exact or pattern match. Vector databases search by semantic similarity — finding content that means the same thing even when phrased differently. This is what enables RAG systems to retrieve "our policy on refunds after 30 days" in response to "can I get my money back next month?" — even though no words match.
The core operation: every piece of text is converted to a high-dimensional vector (typically 384–1536 numbers) by an embedding model. Texts with similar meaning cluster together in this vector space. A query vector is compared against the stored vectors (exhaustively for small collections, or via an approximate index at scale) using cosine similarity or dot product, and the closest matches are returned. The search itself takes milliseconds even across millions of vectors.
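The clustering idea is easy to see with a toy example: cosine similarity ranks semantically close vectors highest. A minimal sketch, using hand-made 4-dimensional vectors standing in for real embedding-model output:

```python
import numpy as np

# Toy "embeddings" standing in for real model output
# (real models produce 384-1536 dimensions; 4 here for readability).
docs = {
    "refund policy": np.array([0.9, 0.1, 0.0, 0.1]),
    "shipping times": np.array([0.1, 0.9, 0.2, 0.0]),
    "money back":    np.array([0.7, 0.3, 0.1, 0.1]),
}
query = np.array([0.85, 0.15, 0.05, 0.1])  # e.g. "can I get my money back?"

def cosine(a, b):
    # Cosine similarity: dot product of the two vectors, length-normalised
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rank documents by similarity to the query, best first
ranked = sorted(docs, key=lambda name: cosine(query, docs[name]), reverse=True)
print(ranked)  # refund-related texts rank above "shipping times"
```

A real vector database performs exactly this ranking, just with an index structure that avoids comparing against every stored vector.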
Vector database options: comparison for AI automation
| Database | Type | Free tier | Best for | Setup complexity |
|---|---|---|---|---|
| FAISS (Meta) | In-memory library | Fully free | Development, single-server production under 1M vectors | None (Python library) |
| Chroma | Embedded / server | Fully free OSS | Local development, LangChain integration | Very low |
| Pinecone | Managed cloud | 1 index, 100K vectors | Production, managed, scales to billions | Low (API only) |
| Supabase pgvector | PostgreSQL extension | 500MB storage | Teams already using Supabase/PostgreSQL | Low |
| Qdrant | Open-source / cloud | Self-hosted free | Self-hosted production, advanced filtering | Medium |
| Weaviate | Open-source / cloud | Self-hosted free | Multi-modal (text + images), complex schemas | Medium |
FAISS: the free in-memory option for most use cases
FAISS (Facebook AI Similarity Search) is a Python library for efficient vector similarity search: no server, no subscription, no API calls. It runs entirely in your Python process. For knowledge bases up to approximately 1 million vectors on a server with 4+ GB of RAM, FAISS delivers excellent performance, with sub-millisecond search times on typical business knowledge bases.
```python
import faiss, numpy as np, pickle

# Build index (all_vectors: list of embeddings; chunks_metadata: parallel list of chunk dicts)
embeddings = np.array(all_vectors, dtype=np.float32)
faiss.normalize_L2(embeddings)  # Normalise so inner product == cosine similarity
index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(embeddings)

# Save to disk
faiss.write_index(index, "knowledge_base.faiss")
with open("metadata.pkl", "wb") as f:
    pickle.dump(chunks_metadata, f)

# Load later
index = faiss.read_index("knowledge_base.faiss")
with open("metadata.pkl", "rb") as f:
    metadata = pickle.load(f)

# Search (embed() is your embedding function, returning one vector per text)
query_vec = np.array([embed(query)], dtype=np.float32)
faiss.normalize_L2(query_vec)
scores, indices = index.search(query_vec, k=5)
results = [metadata[i] for i in indices[0] if i != -1]
```

Pinecone: managed production vector search
Pinecone is fully managed — no infrastructure to maintain, automatic scaling, and a simple REST API. The free tier (1 index, 100K vectors) handles most small business RAG applications. Pinecone shines for production deployments where managed infrastructure is worth the cost and you need to scale beyond what a single server can handle.
```python
from pinecone import Pinecone, ServerlessSpec
import os

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

# Create index (once)
if "knowledge-base" not in pc.list_indexes().names():
    pc.create_index(
        name="knowledge-base", dimension=1536, metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )
idx = pc.Index("knowledge-base")

# Upsert vectors with metadata (vector and text come from your embedding step)
idx.upsert([
    {"id": "chunk_001", "values": vector, "metadata": {"text": text, "source": "FAQ", "category": "refunds"}},
    # ... more vectors
])

# Query with metadata filter
results = idx.query(
    vector=query_vec, top_k=5,
    filter={"category": {"$eq": "refunds"}},  # Only search refund-related chunks
    include_metadata=True,
)
for match in results.matches:
    print(f"Score: {match.score:.3f} | {match.metadata['text'][:100]}")
```

Supabase pgvector: SQL + vectors together
pgvector extends PostgreSQL with vector storage and similarity search, and Supabase provides managed PostgreSQL with pgvector available. The key advantage: combine vector search with SQL filters and joins in a single query — "find the 5 most relevant support articles that are also tagged as applying to Enterprise plan customers." This kind of relational hybrid filtering is hard to replicate cleanly in dedicated vector databases, whose metadata filters are more limited than full SQL.
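On the application side, the query embedding is passed to such a query as a pgvector string literal (a bracketed, comma-separated list of numbers). A minimal sketch, assuming a psycopg2 connection and a `knowledge_chunks` table like the one defined below:

```python
def to_pgvector(vec):
    # pgvector accepts string literals of the form "[0.1,0.2,0.3]"
    return "[" + ",".join(str(float(x)) for x in vec) + "]"

# Hybrid query: SQL filter plus vector ordering, fully parameterised
HYBRID_SQL = """
SELECT content, source, 1 - (embedding <=> %s::vector) AS similarity
FROM knowledge_chunks
WHERE category = %s
ORDER BY embedding <=> %s::vector
LIMIT %s
"""

def search_category(conn, query_embedding, category, k=5):
    # conn: a psycopg2 connection to your Supabase/PostgreSQL database
    vec = to_pgvector(query_embedding)
    with conn.cursor() as cur:
        cur.execute(HYBRID_SQL, (vec, category, vec, k))
        return cur.fetchall()

print(to_pgvector([0.25, 0.5]))  # -> [0.25,0.5]
```

Parameterising the vector literal this way avoids string-concatenating user input into SQL; the connection setup itself (`psycopg2.connect` with your Supabase connection string) is left to your application.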
```sql
-- Enable pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create table with vector column
CREATE TABLE knowledge_chunks (
    id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
    content TEXT NOT NULL,
    source TEXT,
    category TEXT,
    embedding VECTOR(1536)
);

-- Create index for fast similarity search
CREATE INDEX ON knowledge_chunks USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);

-- Hybrid search: semantic + filter
SELECT content, source, 1 - (embedding <=> $1) AS similarity
FROM knowledge_chunks
WHERE category = 'enterprise'  -- SQL filter
ORDER BY embedding <=> $1      -- Vector similarity
LIMIT 5;
```

Choosing the right vector database for your use case
For learning and development: FAISS or Chroma. No setup, no cost, runs locally. Perfect for building and testing RAG systems before production deployment.
For small production deployments (under 500K vectors, single server): FAISS persisted to disk or Chroma with SQLite backend. Still free, adequate performance, minimal operational overhead.
For production needing SQL integration: Supabase pgvector if you are already using Supabase/PostgreSQL, or neon.tech for a serverless PostgreSQL-with-pgvector option.
For managed production at any scale: Pinecone. The free tier gets you to 100K vectors; the Starter plan ($70/month) scales further. Worth the cost when infrastructure management time is more expensive than the subscription.
For self-hosted production with advanced features: Qdrant provides the best combination of performance, advanced filtering, and open-source licensing for teams comfortable with self-hosting.
Full RAG implementation: RAG pipeline automation guide — covers the complete indexing and query pipeline using these vector databases.
Frequently asked questions
How many vectors can the free tiers hold?
FAISS and Chroma are open-source with no vector limits — you are bounded only by your server RAM (approximately 6MB per 1,000 1536-dimension vectors). Pinecone free tier: 100,000 vectors in 1 index. Supabase free tier: 500MB database storage, roughly 80,000 1536-dimension vectors at ~6KB each. For most small business knowledge bases (FAQ pages, product documentation, policy documents), these free tiers are more than sufficient.
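The RAM figure above is simple arithmetic — a float32 value is 4 bytes, so storage scales linearly with vector count and dimension. A quick sanity check:

```python
def index_memory_bytes(num_vectors, dimensions, bytes_per_value=4):
    # Flat (uncompressed) index: float32 = 4 bytes per dimension
    return num_vectors * dimensions * bytes_per_value

# 1,000 vectors at 1536 dimensions
mb = index_memory_bytes(1_000, 1536) / 1_000_000
print(f"{mb:.1f} MB per 1,000 vectors")  # ~6.1 MB

# How many 1536-d vectors fit in 4 GB of RAM (index only, no metadata)?
print(4_000_000_000 // index_memory_bytes(1, 1536))
```

Quantised or compressed index types (e.g. FAISS product quantisation) reduce this substantially, at some cost in recall.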
Which embedding model should I use?
OpenAI text-embedding-3-small ($0.02/million tokens) provides excellent quality at low cost for most business text. OpenAI text-embedding-3-large provides higher quality at higher cost ($0.13/million tokens) — worth it for domains with highly technical or specialised vocabulary. For zero API cost, Sentence Transformers all-MiniLM-L6-v2 runs locally and provides good quality for general business text. Always use the same embedding model for indexing and querying — mixing models destroys retrieval quality.
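To put those per-token prices in context, here is a quick cost estimate for indexing a knowledge base (prices per million tokens as quoted above; the chunk counts are illustrative assumptions):

```python
PRICE_PER_M_TOKENS = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
    "all-MiniLM-L6-v2 (local)": 0.00,
}

def embedding_cost(num_chunks, avg_tokens_per_chunk, model):
    # One-time cost to embed the whole knowledge base
    tokens = num_chunks * avg_tokens_per_chunk
    return tokens / 1_000_000 * PRICE_PER_M_TOKENS[model]

# Indexing 10,000 chunks of ~500 tokens each (5M tokens total):
for model in PRICE_PER_M_TOKENS:
    print(f"{model}: ${embedding_cost(10_000, 500, model):.2f}")
```

Even for the larger model the one-time indexing cost is typically under a dollar at this scale; query-time embedding costs are smaller still, since each query is only a few dozen tokens.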
Keep building expertise
The complete guide covers every tool and strategy.
Complete AI Automation Guide →
ThinkForAI Editorial Team
Updated November 2024.

