⚙️ Technical Depth

Vector Databases for AI Automation:
FAISS, Pinecone, and pgvector

Vector databases are the memory layer of AI automation — enabling semantic search across your knowledge base. This guide compares FAISS, Chroma, Pinecone, and Supabase pgvector with implementation code and clear selection criteria for each use case.

Technical · ThinkForAI Editorial Team · November 2024
By storing text as mathematical vectors that capture semantic meaning, vector databases let your AI systems find relevant information in milliseconds, even in knowledge bases with millions of documents. This guide covers every major vector database option, when to use each, and how to implement production-grade semantic search.

What vector databases do and why they matter

Traditional databases search by exact or pattern match. Vector databases search by semantic similarity — finding content that means the same thing even when phrased differently. This is what enables RAG systems to retrieve "our policy on refunds after 30 days" in response to "can I get my money back next month?" — even though no words match.

The core operation: every piece of text is converted to a high-dimensional vector (typically 384–1536 numbers) by an embedding model. Texts with similar meaning cluster together in this vector space. A query vector is compared to all stored vectors using cosine similarity or dot product, and the closest matches are returned. This entire search process takes milliseconds even across millions of vectors.
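The comparison step can be sketched in a few lines of NumPy. The vectors below are toy 4-dimensional stand-ins for real embeddings, chosen so that two "similar meaning" texts score close to 1 and an unrelated one scores near 0:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: dot product of the two vectors after L2 normalisation."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors standing in for real embedding output
refund_policy = np.array([0.9, 0.1, 0.3, 0.0])
money_back    = np.array([0.8, 0.2, 0.4, 0.1])  # similar meaning, different words
office_hours  = np.array([0.0, 0.9, 0.0, 0.8])  # unrelated topic

print(cosine_similarity(refund_policy, money_back))    # high, close to 1
print(cosine_similarity(refund_policy, office_hours))  # low, close to 0
```

In production the only differences are scale (thousands to millions of stored vectors) and the index structure used to avoid comparing against every vector one by one.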

Vector database options: comparison for AI automation

| Database          | Type                 | Free tier             | Best for                                            | Setup complexity      |
|-------------------|----------------------|-----------------------|-----------------------------------------------------|-----------------------|
| FAISS (Meta)      | In-memory library    | Fully free            | Development; single-server production under 1M vectors | None (Python library) |
| Chroma            | Embedded / server    | Fully free OSS        | Local development, LangChain integration            | Very low              |
| Pinecone          | Managed cloud        | 1 index, 100K vectors | Production; managed; scales to billions             | Low (API only)        |
| Supabase pgvector | PostgreSQL extension | 500MB storage         | Teams already using Supabase/PostgreSQL             | Low                   |
| Qdrant            | Open-source / cloud  | Self-hosted free      | Self-hosted production, advanced filtering          | Medium                |
| Weaviate          | Open-source / cloud  | Self-hosted free      | Multi-modal (text + images), complex schemas        | Medium                |

FAISS: the free in-memory option for most use cases

FAISS (Facebook AI Similarity Search) is a Python library for efficient vector similarity search. No server, no subscription, no API calls. Runs entirely in your Python process. For knowledge bases up to approximately 1 million vectors on a server with 4+ GB RAM, FAISS delivers excellent performance — sub-millisecond search times on typical business knowledge bases.

import pickle

import faiss
import numpy as np

# `all_vectors` is a list of embedding vectors and `chunks_metadata` holds the
# matching text chunks in the same order; `embed()` calls your embedding model.

# Build index
embeddings = np.array(all_vectors, dtype=np.float32)
faiss.normalize_L2(embeddings)                  # Normalise so inner product = cosine similarity
index = faiss.IndexFlatIP(embeddings.shape[1])  # Exact inner-product index
index.add(embeddings)

# Save to disk
faiss.write_index(index, "knowledge_base.faiss")
with open("metadata.pkl", "wb") as f:
    pickle.dump(chunks_metadata, f)

# Load later
index = faiss.read_index("knowledge_base.faiss")
with open("metadata.pkl", "rb") as f:
    metadata = pickle.load(f)

# Search: normalise the query the same way, take the top 5 matches
query_vec = np.array([embed(query)], dtype=np.float32)
faiss.normalize_L2(query_vec)
scores, indices = index.search(query_vec, k=5)
results = [metadata[i] for i in indices[0] if i != -1]  # -1 = no match found

Pinecone: managed production vector search

Pinecone is fully managed — no infrastructure to maintain, automatic scaling, and a simple REST API. The free tier (1 index, 100K vectors) handles most small business RAG applications. Pinecone shines for production deployments where managed infrastructure is worth the cost and you need to scale beyond what a single server can handle.

from pinecone import Pinecone, ServerlessSpec
import os

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

# Create index (once)
if "knowledge-base" not in pc.list_indexes().names():
    pc.create_index(
        "knowledge-base", dimension=1536, metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1")
    )
idx = pc.Index("knowledge-base")

# Upsert vectors with metadata
idx.upsert([
    {"id": "chunk_001", "values": vector, "metadata": {"text": text, "source": "FAQ", "category": "refunds"}},
    # ... more vectors
])

# Query with metadata filter
results = idx.query(
    vector=query_vec, top_k=5,
    filter={"category": {"$eq": "refunds"}},  # Only search refund-related chunks
    include_metadata=True
)
for match in results.matches:
    print(f"Score: {match.score:.3f} | {match.metadata['text'][:100]}")

Supabase pgvector: SQL + vectors together

pgvector extends PostgreSQL with vector storage and similarity search. Supabase provides managed PostgreSQL with pgvector enabled. The key advantage: combine vector search with SQL filters in a single query — "find the 5 most relevant support articles that are also tagged as applying to Enterprise plan customers." This hybrid filtering is difficult to achieve cleanly in dedicated vector databases.

-- Enable pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create table with vector column
CREATE TABLE knowledge_chunks (
    id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
    content TEXT NOT NULL,
    source TEXT,
    category TEXT,
    embedding VECTOR(1536)
);

-- Create index for fast similarity search
CREATE INDEX ON knowledge_chunks USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);

-- Hybrid search: semantic + filter
SELECT content, source, 1 - (embedding <=> $1) AS similarity
FROM knowledge_chunks
WHERE category = 'enterprise'  -- SQL filter
ORDER BY embedding <=> $1         -- Vector similarity
LIMIT 5;

Choosing the right vector database for your use case

For learning and development: FAISS or Chroma. No setup, no cost, runs locally. Perfect for building and testing RAG systems before production deployment.

For small production deployments (under 500K vectors, single server): FAISS persisted to disk or Chroma with SQLite backend. Still free, adequate performance, minimal operational overhead.

For production needing SQL integration: Supabase pgvector if you are already using Supabase/PostgreSQL, or neon.tech for a serverless PostgreSQL-with-pgvector option.

For managed production at any scale: Pinecone. The free tier gets you to 100K vectors; the Starter plan ($70/month) scales further. Worth the cost when infrastructure management time is more expensive than the subscription.

For self-hosted production with advanced features: Qdrant provides the best combination of performance, advanced filtering, and open-source licensing for teams comfortable with self-hosting.
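The selection criteria above can be condensed into a rough decision helper. This is purely illustrative — the function name, flags, and the 500K threshold mirror the guidance in this article, not any library API:

```python
def recommend_vector_db(
    n_vectors: int,
    managed: bool = False,
    needs_sql: bool = False,
    self_hosted_advanced: bool = False,
) -> str:
    """Map the selection criteria above to a recommendation."""
    if needs_sql:
        return "Supabase pgvector"      # hybrid SQL + vector queries
    if self_hosted_advanced:
        return "Qdrant"                 # self-hosted, advanced filtering
    if managed:
        return "Pinecone"               # fully managed at any scale
    if n_vectors < 500_000:
        return "FAISS or Chroma"        # free, single-server, minimal ops
    return "Pinecone"                   # beyond one server, managed scaling wins

print(recommend_vector_db(50_000))                   # FAISS or Chroma
print(recommend_vector_db(2_000_000, managed=True))  # Pinecone
```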

Full RAG implementation: RAG pipeline automation guide — covers the complete indexing and query pipeline using these vector databases.

Frequently asked questions

How many vectors can I store for free?

FAISS and Chroma are open-source with no vector limits — you are bounded only by your server RAM (approximately 6MB per 1,000 1536-dimension vectors). Pinecone free tier: 100,000 vectors in 1 index. Supabase free tier: 500MB of database storage, roughly 80,000 1536-dimension vectors before index overhead. For most small business knowledge bases (FAQ pages, product documentation, policy documents), these free tiers are more than sufficient.
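These capacity figures follow from back-of-envelope storage math, assuming each vector component is a 4-byte float32 and ignoring index overhead:

```python
DIMENSIONS = 1536       # e.g. OpenAI text-embedding-3-small output size
BYTES_PER_FLOAT32 = 4

# Raw storage per vector and per 1,000 vectors
bytes_per_vector = DIMENSIONS * BYTES_PER_FLOAT32     # 6,144 bytes ≈ 6KB
mb_per_1000 = bytes_per_vector * 1000 / 1_000_000     # ≈ 6.1MB

# Raw embedding storage that fits in a 500MB tier
vectors_in_500mb = 500_000_000 // bytes_per_vector

print(f"{mb_per_1000:.1f} MB per 1,000 vectors; ~{vectors_in_500mb:,} vectors in 500MB")
```

Lower-dimension embeddings shift these numbers proportionally — a 384-dimension model like all-MiniLM-L6-v2 needs a quarter of the storage per vector.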

What embedding model should I use for best retrieval quality?

OpenAI text-embedding-3-small ($0.02/million tokens) provides excellent quality at low cost for most business text. OpenAI text-embedding-3-large provides higher quality at higher cost ($0.13/million tokens) — worth it for domains with highly technical or specialised vocabulary. For zero API cost, Sentence Transformers all-MiniLM-L6-v2 runs locally and provides good quality for general business text. Always use the same embedding model for indexing and querying — mixing models destroys retrieval quality.



Updated November 2024.