Vector Databases Explained: Pinecone, Weaviate, and pgvector Compared

Vector databases are the backbone of modern RAG systems, semantic search engines, and AI-powered recommendation systems. But choosing the right one depends on your scale, latency requirements, and operational constraints. Here's how Pinecone, Weaviate, and pgvector compare in real production scenarios.

Architecture Overview

Each solution takes a fundamentally different approach:

pgvector is a PostgreSQL extension. You add vector columns to existing tables and use SQL queries. There's no new infrastructure to manage—it runs inside your existing database.

Pinecone is a fully managed vector database as a service. You get a REST API, automatic scaling, and zero infrastructure management. You never touch a server.

Weaviate supports both deployment models: you can self-host it or use their cloud service. It's a purpose-built vector database with native hybrid (dense + sparse) search, multi-tenancy, and a built-in GraphQL API.

Search Quality and Features

| Feature | pgvector | Pinecone | Weaviate |
|---------|----------|----------|----------|
| Distance metrics | L2, cosine, inner product | cosine, dot product, euclidean | cosine, dot product, L2, hamming, manhattan |
| Hybrid search | Manual (Postgres full-text search + vector) | Manual (via sparse-dense) | Native (vector + BM25) |
| Multi-tenancy | Row-level (SQL filtering) | Namespaces | Class-level (built-in) |
| Filtering | Any SQL WHERE clause | Metadata filter (pre-filter) | Where filter (pre/post filtering) |
| Vector dimensions | Up to 16,000 stored; up to 2,000 indexable (`vector` type) | Up to 20,000 | Up to 65,535 |
| Index types | IVFFlat, HNSW | Managed (not user-selectable) | HNSW, flat, dynamic |
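The three metrics every product in the table supports rank results differently, so it helps to see them side by side. Here is a minimal pure-Python sketch; the function names are illustrative and not taken from any of the three products. Note that cosine is expressed as a distance (1 minus similarity), which is the form pgvector's `<=>` operator returns.

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inner_product(a, b):
    """Dot product: larger means more similar."""
    return sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    """1 - cosine similarity: 0 for the same direction, 2 for opposite."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)

a = [1.0, 0.0]
b = [0.0, 1.0]
c = [2.0, 0.0]

print(l2_distance(a, b))      # orthogonal unit vectors
print(cosine_distance(a, c))  # same direction, different length
print(inner_product(a, c))    # sensitive to vector length
```

Cosine ignores vector length while inner product does not, so the two only agree on ranking when embeddings are normalized to unit length.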

Winner by feature: Weaviate for hybrid search and multi-tenancy. pgvector for flexible SQL filtering.
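"Manual" hybrid search usually means running a keyword query and a vector query separately, then merging the two ranked lists in application code. One common way to merge them is reciprocal rank fusion (RRF). The sketch below is an assumption about how you might do this yourself; it is not a description of what Weaviate or Pinecone do internally, and the document ids are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# Hypothetical results from a keyword query and a vector query
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

RRF only needs ranks, not scores, which sidesteps the problem that keyword scores and vector distances live on different scales.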

Performance Benchmarks

Based on public benchmarks and our production testing (ANN-Benchmarks, 1M vectors, 768 dimensions):

| Query Type | pgvector (HNSW) | Pinecone (p2) | Weaviate |
|------------|----------------|---------------|----------|
| P99 latency, exact match | 8ms | 5ms | 7ms |
| P99 latency, 90% recall | 3ms | 2ms | 3ms |
| Throughput (QPS, 10 concurrent) | 1,200 | 3,500 | 2,100 |
| Index build time (1M vectors) | 45s | <1s (auto) | 30s |
| Filtered search (10% filter) | 5ms | 12ms (pre-filter) | 8ms |
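Numbers like "P99 latency" and "90% recall" depend on how they are measured, so it is worth running the same measurement on your own data. The sketch below shows one way to compute both, assuming you already have the ids returned by the index, the ids from an exact (brute-force) search, and a list of per-query latencies. The sample data is invented for illustration.

```python
import math

def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true top-k neighbours that the ANN index returned."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k

def percentile(samples, pct):
    """Nearest-rank percentile: the value at rank ceil(pct% of n)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Illustrative data: the index missed one of the ten true neighbours
exact = [7, 3, 9, 1, 4, 8, 2, 6, 5, 0]
approx = [7, 3, 9, 1, 4, 8, 2, 6, 5, 11]
print(recall_at_k(approx, exact, k=10))

# Illustrative latencies: mostly fast, with two slow outliers
latencies_ms = [2.0] * 98 + [3.0, 8.0]
print(percentile(latencies_ms, 50))
print(percentile(latencies_ms, 99))
```

Reporting the median next to P99 shows how much of the latency budget goes to tail queries rather than typical ones.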

pgvector with HNSW indexing is competitive on latency for most workloads. Pinecone leads on throughput, at nearly 3x pgvector's QPS in this test. For filtered search, both pgvector (5ms) and Weaviate (8ms) are faster than Pinecone's pre-filter approach (12ms).
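Where the filter runs affects results as well as latency. The toy sketch below uses brute-force search in place of a real ANN index (an assumption made to keep it self-contained) to show the difference: post-filtering can return fewer than k rows when the nearest neighbours fail the filter, while pre-filtering searches only the rows that pass. The item data is made up.

```python
import math

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def post_filter_search(items, query, k, predicate):
    """Take the k nearest first, then drop rows that fail the filter."""
    nearest = sorted(items, key=lambda it: l2(it["vec"], query))[:k]
    return [it["id"] for it in nearest if predicate(it)]

def pre_filter_search(items, query, k, predicate):
    """Drop rows that fail the filter first, then take the k nearest."""
    allowed = [it for it in items if predicate(it)]
    nearest = sorted(allowed, key=lambda it: l2(it["vec"], query))[:k]
    return [it["id"] for it in nearest]

def is_docs(item):
    return item["category"] == "documentation"

items = [
    {"id": 1, "vec": [0.0, 0.1], "category": "blog"},
    {"id": 2, "vec": [0.0, 0.2], "category": "blog"},
    {"id": 3, "vec": [0.0, 0.3], "category": "documentation"},
    {"id": 4, "vec": [0.0, 0.9], "category": "documentation"},
]
query = [0.0, 0.0]

print(post_filter_search(items, query, 2, is_docs))  # both nearest rows are blogs
print(pre_filter_search(items, query, 2, is_docs))
```

Real engines use more refined strategies than either extreme, so treat this as a mental model rather than a description of any product's internals.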

Cost Analysis for 1M Vectors, 768 Dimensions

pgvector: $0/month extra (you already pay for PostgreSQL). You need a decent instance though—at least 4GB RAM for HNSW indexing.
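The 4GB figure follows from simple arithmetic: 1M vectors at 768 float32 dimensions is about 3 GB before any index structure. A rough sketch of that estimate follows. The graph overhead function is a loose assumption (the `bytes_per_link` value is hypothetical and the real layout differs per engine), so treat its result as an order of magnitude only.

```python
def raw_vector_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Storage for the vectors alone, assuming float32 components."""
    return num_vectors * dimensions * bytes_per_value

def hnsw_link_bytes(num_vectors, m=16, bytes_per_link=8):
    """Rough graph overhead: up to 2*m links per node on the base layer.

    bytes_per_link is a hypothetical value chosen for illustration.
    """
    return num_vectors * 2 * m * bytes_per_link

vectors = raw_vector_bytes(1_000_000, 768)
links = hnsw_link_bytes(1_000_000, m=16)

print(f"vectors: {vectors / 1e9:.2f} GB")
print(f"links:   {links / 1e9:.2f} GB")
```

Since the vectors dominate, halving the dimension count or the bytes per component roughly halves the memory you need to keep the index hot.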

Pinecone: Serverless pricing is usage-based, billed on storage plus read and write units. For 1M vectors, expect ~$70-150/month for moderate query volumes. Pod-based starts at ~$500/month.

Weaviate Cloud: $25/month starter plan (~500K vectors). $175/month for production (2M vectors). Self-hosted: cost of infrastructure.

-- pgvector: Create index (one-time cost, ~45 seconds for 1M vectors)
CREATE INDEX ON embeddings USING hnsw (vector vector_cosine_ops) WITH (m = 16, ef_construction = 200);

-- Query with metadata filter in the same query.
-- $1 is the query embedding, bound as a vector parameter.
-- <=> is cosine distance, so 1 - distance gives cosine similarity.
-- With an HNSW index the WHERE clause is applied after the approximate scan,
-- so the query can return fewer than 10 rows.
SELECT content, metadata, 1 - (vector <=> $1) AS similarity
FROM embeddings
WHERE metadata->>'category' = 'documentation'
  AND vector <=> $1 < 0.5
ORDER BY vector <=> $1
LIMIT 10;
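From application code, the query embedding has to be passed as a parameter. Below is a minimal sketch of how that might look from Python, assuming a driver that accepts psycopg-style named placeholders. The helper names are hypothetical; the table and column names follow the query above. The embedding is sent in pgvector's text form and cast to `vector` on the server.

```python
def to_vector_literal(values):
    """Format floats in pgvector's text form, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"

# Named placeholders in the style accepted by psycopg (an assumption)
SEARCH_SQL = """
SELECT content, metadata, 1 - (vector <=> %(q)s::vector) AS similarity
FROM embeddings
WHERE metadata->>'category' = %(category)s
ORDER BY vector <=> %(q)s::vector
LIMIT %(limit)s
"""

def build_search_params(query_embedding, category, limit=10):
    """Parameters for SEARCH_SQL; pass both to cursor.execute(sql, params)."""
    return {
        "q": to_vector_literal(query_embedding),
        "category": category,
        "limit": limit,
    }

params = build_search_params([0.1, 0.2, 0.3], "documentation")
print(params["q"])
```

Binding the category and limit as parameters, rather than formatting them into the SQL string, keeps the query safe from injection and lets the server reuse the plan.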

Operational Considerations

pgvector: If you already run PostgreSQL, this is the obvious starting point. No new dependencies, and no extra network hop to a separate vector service, because your vectors live next to the rest of your data. The trade-off is that vector search competes with your OLTP workload for resources.

Pinecone: Zero operations, but you accept vendor lock-in and data egress costs. Ideal when you want to focus on application logic rather than infrastructure. The serverless tier makes it easy to start.

Weaviate: Best of both worlds if you need hybrid search or multi-tenancy. Self-hosting gives you control; Weaviate Cloud gives you convenience. The GraphQL API is a nice developer experience bonus.

Decision Framework

| Your Situation | Recommended Choice |
|---|---|
| Already on PostgreSQL, <5M vectors | pgvector |
| Already on PostgreSQL, 5-50M vectors | pgvector + partitioning |
| Want zero ops, variable scale | Pinecone serverless |
| Need hybrid search or graphs | Weaviate |
| Building a SaaS with multi-tenant data | Weaviate |
| Highest throughput requirements | Pinecone |
| Cost-sensitive, moderate scale | pgvector self-hosted |

Integration Patterns

All three integrate cleanly with LangChain and LlamaIndex:

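# Swap the import to change stores; constructor arguments differ per store
from langchain_community.vectorstores import PGVector
# from langchain_pinecone import PineconeVectorStore
# from langchain_weaviate import WeaviateVectorStore

# connection_string and embeddings are assumed to be defined elsewhere:
# a SQLAlchemy-style Postgres URL and a LangChain Embeddings instance
vector_store = PGVector(
    connection_string=connection_string,
    embedding_function=embeddings,
    collection_name="documents",
)

# The retriever interface is the same regardless of the underlying store
retriever = vector_store.as_retriever(search_kwargs={"k": 5})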

At SoniNow, we help clients choose and implement the right vector database for their use case. Our AI automation services include architecture reviews, performance testing, and production deployment.

The right vector database is the one that fits your operational reality. Start simple, measure what matters, and scale up when your data demands it. Talk to our team about your infrastructure needs.