Install pgvector on PostgreSQL 17 (Rocky Linux 10 / Ubuntu 24.04)

pgvector turns PostgreSQL into a vector database. If you already run PostgreSQL for your applications, you can store and query vector embeddings without adding another service to your stack. That means semantic search, recommendation engines, and RAG pipelines all running inside the database you already manage and back up.

Original content from computingforgeeks.com - post 74809

This guide covers installing pgvector on PostgreSQL 17 for both Rocky Linux 10 and Ubuntu 24.04, creating vector columns, building indexes (HNSW and IVFFlat), running similarity searches, and practical demos you can adapt for your own projects.

Tested March 2026 on Rocky Linux 10.1 (SELinux enforcing) and Ubuntu 24.04 LTS with PostgreSQL 17.9, pgvector 0.8.2

What You Need

  • Rocky Linux 10 or Ubuntu 24.04 with PostgreSQL 17 installed
  • Root or sudo access
  • PGDG repository configured (the PostgreSQL install guide covers this)

pgvector itself is lightweight. It adds no background processes and uses memory only when you query or build indexes.

Install pgvector

pgvector is available as a prebuilt package from the official PostgreSQL repository (PGDG) on both OS families. No compilation needed.

Rocky Linux 10 / AlmaLinux 10 / RHEL 10

Install pgvector from the PGDG repository:

sudo dnf install -y pgvector_17

The package name follows the pattern pgvector_<pg-version>. For PostgreSQL 16, that would be pgvector_16.

Ubuntu 24.04 / Debian 13

On Debian-based systems, install the pgvector package:

sudo apt install -y postgresql-17-pgvector

On Debian/Ubuntu the pattern is postgresql-<pg-version>-pgvector.

Enable the Extension

After installing the package, enable the extension inside the database where you need vectors:

sudo -u postgres psql -c "CREATE EXTENSION vector;"

Verify the extension is active and check the version:

sudo -u postgres psql -c "SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';"

The output confirms pgvector is loaded:

 extname | extversion 
---------+------------
 vector  | 0.8.2
(1 row)

Vector Data Types

pgvector 0.8.2 supports four data types for different use cases:

Type      | Storage       | Max Dimensions | Use Case
----------|---------------|----------------|----------------------------------------
vector    | 32-bit float  | 16,000         | Standard embeddings (most common)
halfvec   | 16-bit float  | 16,000         | Half the memory, slight precision loss
sparsevec | Sparse format | 16,000         | Mostly-zero vectors (BM25, TF-IDF)
bit       | 1 bit per dim | 64,000         | Binary embeddings, Hamming distance

For most AI embedding models (OpenAI, Ollama, Cohere), use vector. Switch to halfvec when memory is tight and you can tolerate minor precision differences.
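To make the halfvec tradeoff concrete: per the pgvector docs, a vector value takes 4 bytes per dimension plus 8 bytes of overhead, while halfvec takes 2 bytes per dimension plus 8. A quick sketch of per-row storage for a 768-dimension embedding (the function names are ours, for illustration):

```python
# Per-row storage for pgvector column types (pgvector docs:
# vector = 4 bytes/dim + 8 bytes overhead, halfvec = 2 bytes/dim + 8).
def vector_bytes(dims: int) -> int:
    return 4 * dims + 8

def halfvec_bytes(dims: int) -> int:
    return 2 * dims + 8

dims = 768  # e.g. nomic-embed-text output
print(vector_bytes(dims), "bytes per row as vector")    # 3080
print(halfvec_bytes(dims), "bytes per row as halfvec")  # 1544
```

On a million rows, that is roughly 3 GB versus 1.5 GB for the column data alone, before indexes.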

Create Tables with Vector Columns

A vector column stores fixed-length arrays of floats. Specify the number of dimensions when creating the column. Common embedding dimensions are 384 (all-MiniLM-L6-v2), 768 (BERT, nomic-embed-text), 1536 (OpenAI text-embedding-ada-002), and 3072 (OpenAI text-embedding-3-large).

Connect to PostgreSQL:

sudo -u postgres psql

Create a table with a 384-dimension vector column (matches the all-MiniLM-L6-v2 model from Sentence Transformers):

CREATE TABLE documents (
  id SERIAL PRIMARY KEY,
  title TEXT NOT NULL,
  content TEXT,
  embedding vector(384)
);

Distance Operators

pgvector provides operators for different distance metrics. The choice depends on how your embedding model was trained:

Operator | Distance Metric        | Index Ops Class   | When to Use
---------|------------------------|-------------------|------------------------------------------------
<=>      | Cosine distance        | vector_cosine_ops | Most embedding models (default choice)
<->      | L2 (Euclidean)         | vector_l2_ops     | When magnitude matters (image features)
<#>      | Negative inner product | vector_ip_ops     | Normalized vectors, maximum inner product search

Cosine distance (<=>) is the safest default. Most text embedding models are optimized for cosine similarity. Here is a quick demo:

CREATE TABLE items (
  id SERIAL PRIMARY KEY,
  name TEXT,
  embedding vector(3)
);

INSERT INTO items (name, embedding) VALUES
  ('cat', '[0.1, 0.2, 0.8]'),
  ('dog', '[0.15, 0.25, 0.75]'),
  ('fish', '[0.9, 0.1, 0.05]'),
  ('bird', '[0.3, 0.7, 0.4]'),
  ('snake', '[0.85, 0.05, 0.1]');

Find items most similar to “cat” using cosine distance:

SELECT name, 1 - (embedding <=> '[0.1, 0.2, 0.8]') AS similarity
FROM items
ORDER BY embedding <=> '[0.1, 0.2, 0.8]'
LIMIT 3;

“Cat” matches itself perfectly, and “dog” comes next because its vector points in nearly the same direction:

 name |     similarity     
------+--------------------
 cat  |                  1
 dog  | 0.9948949708177414
 bird | 0.6857339969219595
(3 rows)

All three distance operators side by side on the same data:

SELECT name,
  embedding <-> '[0.1, 0.2, 0.8]' AS l2_distance,
  embedding <=> '[0.1, 0.2, 0.8]' AS cosine_distance,
  embedding <#> '[0.1, 0.2, 0.8]' AS neg_inner_product
FROM items
ORDER BY embedding <=> '[0.1, 0.2, 0.8]';

The results show how each metric measures distance differently. L2 uses absolute magnitude, cosine ignores magnitude and compares direction only, and inner product combines both:

 name  |   l2_distance   | cosine_distance |  neg_inner_product   
-------+-----------------+-----------------+----------------------
 cat   |               0 |               0 |   -0.690000057220459
 dog   | 0.0866025474761 |  0.005105029182 |  -0.6650000214576721
 bird  | 0.6708204287913 |  0.314266003078 | -0.49000000953674316
 snake | 1.0368220331737 |  0.754263412560 | -0.17500001192092896
 fish  |  1.101135712321 |  0.800887375587 | -0.15000000596046448
(5 rows)
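The three operators are simple formulas, and the same arithmetic can be reproduced in plain Python for the “dog” row against the query vector. (The psql output above differs in the last digits because pgvector stores components as single-precision floats.)

```python
import math

query = [0.1, 0.2, 0.8]   # the 'cat' vector used as the query above
dog = [0.15, 0.25, 0.75]

# <-> : L2 (Euclidean) distance
l2 = math.sqrt(sum((a - b) ** 2 for a, b in zip(dog, query)))

# <#> : negative inner product
dot = sum(a * b for a, b in zip(dog, query))
neg_ip = -dot

# <=> : cosine distance = 1 - cosine similarity
norm = math.sqrt(sum(a * a for a in dog)) * math.sqrt(sum(b * b for b in query))
cosine_dist = 1 - dot / norm

print(l2, cosine_dist, neg_ip)
```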

Build Vector Indexes

Without an index, pgvector performs exact (brute-force) nearest neighbor search by scanning every row. This is accurate but slow on large datasets. Indexes enable approximate nearest neighbor (ANN) search, which is dramatically faster at the cost of slightly reduced recall.

HNSW Index (Recommended)

HNSW (Hierarchical Navigable Small World) is the preferred index type for most workloads. It builds a multi-layer graph and delivers consistently fast queries without the training step that IVFFlat requires.

CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);

Key parameters:

Parameter       | Default | Effect
----------------|---------|-------------------------------------------------------------------------
m               | 16      | Connections per node. Higher = better recall, more memory, slower builds
ef_construction | 64      | Build-time search depth. Higher = better index quality, slower builds

At query time, control the search accuracy vs speed tradeoff with hnsw.ef_search:

SET hnsw.ef_search = 100;  -- higher = better recall, slower

IVFFlat Index

IVFFlat divides vectors into clusters (lists) and searches only the nearest clusters at query time. It requires data in the table before building (unlike HNSW which can be built on an empty table). Use IVFFlat when memory is constrained because it uses less memory than HNSW.

CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);

The lists parameter controls the number of clusters. The pgvector README suggests rows / 1000 for up to one million rows and the square root of the row count beyond that; for 1 million rows, both rules give lists = 1000. At query time, ivfflat.probes controls how many clusters to search:

SET ivfflat.probes = 10;  -- search 10 clusters (default is 1)
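The lists sizing guideline (rows / 1000 up to one million rows, sqrt(rows) beyond that, per the pgvector README) can be encoded in a few lines. The helper name suggested_lists is ours, not a pgvector function:

```python
import math

def suggested_lists(rows: int) -> int:
    # pgvector README guideline: rows/1000 up to 1M rows, sqrt(rows) above
    if rows <= 1_000_000:
        return max(1, rows // 1000)
    return int(math.sqrt(rows))

print(suggested_lists(100_000))    # 100
print(suggested_lists(1_000_000))  # 1000
print(suggested_lists(4_000_000))  # 2000
```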

Which Index to Choose?

Criteria                    | HNSW           | IVFFlat
----------------------------|----------------|-------------------------------------
Build speed                 | Slower         | Faster
Query speed                 | Faster         | Slightly slower
Memory usage                | Higher         | Lower
Recall at default settings  | Better         | Lower (needs tuning)
Requires data before build? | No             | Yes
Best for                    | Most workloads | Memory-constrained, batch scenarios

Start with HNSW unless you have a specific reason to use IVFFlat.

Performance Tuning

pgvector benefits from the same PostgreSQL tuning you would apply for any analytics workload, plus a few vector-specific settings.

Edit the PostgreSQL configuration:

sudo vi /var/lib/pgsql/17/data/postgresql.conf

On Ubuntu, the path is /etc/postgresql/17/main/postgresql.conf.

Key settings to adjust:

# Memory for sorting and index operations
work_mem = '256MB'
maintenance_work_mem = '1GB'

# Parallel workers for index building (speeds up HNSW/IVFFlat creation)
max_parallel_maintenance_workers = 4
max_parallel_workers_per_gather = 4

# Shared memory (25% of total RAM is a good starting point)
shared_buffers = '1GB'
effective_cache_size = '3GB'

Restart PostgreSQL after changes:

sudo systemctl restart postgresql-17

On Ubuntu, the service name is postgresql instead of postgresql-17.
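The memory figures above follow common rules of thumb rather than fixed values: shared_buffers around 25% of total RAM and effective_cache_size around 75%. A sketch of that sizing under those assumed percentages (suggest_memory_settings is a made-up helper, not a PostgreSQL API):

```python
def suggest_memory_settings(total_ram_gb: float) -> dict:
    # Rule-of-thumb sizing used in the config above:
    # shared_buffers ~25% of RAM, effective_cache_size ~75%.
    return {
        "shared_buffers": f"{total_ram_gb * 0.25:.0f}GB",
        "effective_cache_size": f"{total_ram_gb * 0.75:.0f}GB",
    }

print(suggest_memory_settings(4))  # the 1GB / 3GB example above
```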

Practical Demo: Semantic Search with 1,000 Documents

This demo creates a table with 1,000 documents, builds an HNSW index, and runs similarity queries to show real query plans and timing.

sudo -u postgres psql

Create and populate the table with random 384-dimension vectors (in production these come from an embedding model; all-MiniLM-L6-v2 is one that outputs 384 dimensions):

CREATE TABLE docs (
  id SERIAL PRIMARY KEY,
  title TEXT,
  embedding vector(384)
);

INSERT INTO docs (title, embedding)
SELECT 
  'Document ' || i,
  ('[' || string_agg((random() * 2 - 1)::text, ',') || ']')::vector(384)
FROM generate_series(1, 1000) AS i,
     generate_series(1, 384) AS j
GROUP BY i;
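Application code loading vectors builds the same bracketed literal that the SQL above assembles with string_agg. A minimal sketch of the text format only, with no database connection (to_pgvector_literal is our helper name):

```python
import random

def to_pgvector_literal(values: list[float]) -> str:
    # pgvector's text input format: '[v1,v2,...]'
    return "[" + ",".join(str(v) for v in values) + "]"

embedding = [random.uniform(-1, 1) for _ in range(384)]
literal = to_pgvector_literal(embedding)

# Pass `literal` as an ordinary query parameter for a vector(384) column:
# cur.execute("INSERT INTO docs (title, embedding) VALUES (%s, %s)",
#             ("Document 1", literal))
print(literal[:40], "...")
```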

Build the HNSW index:

CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);

Check the index size:

SELECT indexname, pg_size_pretty(pg_relation_size(indexname::regclass)) AS size
FROM pg_indexes WHERE tablename = 'docs';

The HNSW index on 1,000 rows with 384 dimensions takes about 2 MB:

      indexname      |  size
---------------------+---------
 docs_pkey           | 40 kB
 docs_embedding_idx  | 2008 kB
(2 rows)

Run a similarity search with EXPLAIN ANALYZE to see the query plan:

SET hnsw.ef_search = 40;
EXPLAIN ANALYZE
SELECT id, title, 1 - (embedding <=> (SELECT embedding FROM docs WHERE id = 1)) AS similarity
FROM docs
ORDER BY embedding <=> (SELECT embedding FROM docs WHERE id = 1)
LIMIT 5;

The query completes in under 1ms on 1,000 rows:

 Planning Time: 0.466 ms
 Execution Time: 0.551 ms

Generate Embeddings with Ollama

To store real embeddings (not random vectors), you need an embedding model. Ollama runs embedding models locally. The nomic-embed-text model produces 768-dimension vectors and is one of the most popular choices.

Pull the embedding model (assumes Ollama is installed):

ollama pull nomic-embed-text

Generate an embedding via the API:

curl -s http://localhost:11434/api/embed \
  -d '{"model":"nomic-embed-text","input":"PostgreSQL is a powerful open-source database"}' \
  | python3 -c "import sys,json; d=json.load(sys.stdin); print(f'Dimensions: {len(d[\"embeddings\"][0])}')"

This returns Dimensions: 768. To insert these embeddings into PostgreSQL, use a script that calls the Ollama API and inserts the result. Here is a Python example using psycopg2:

import psycopg2
import requests
import json

conn = psycopg2.connect("dbname=postgres user=postgres")
cur = conn.cursor()

# Create table for real embeddings
cur.execute("""
  CREATE TABLE IF NOT EXISTS knowledge_base (
    id SERIAL PRIMARY KEY,
    content TEXT NOT NULL,
    embedding vector(768)
  )
""")

# Documents to embed
docs = [
    "PostgreSQL supports advanced indexing including GiST and GIN",
    "Linux kernel 6.12 introduced real-time scheduling improvements",
    "Nginx can be configured as a reverse proxy with load balancing",
    "Docker containers share the host kernel unlike virtual machines",
    "SELinux enforcing mode prevents unauthorized process access"
]

for doc in docs:
    # Get embedding from Ollama
    resp = requests.post("http://localhost:11434/api/embed",
        json={"model": "nomic-embed-text", "input": doc})
    embedding = resp.json()["embeddings"][0]
    
    # Insert into PostgreSQL
    cur.execute(
        "INSERT INTO knowledge_base (content, embedding) VALUES (%s, %s)",
        (doc, json.dumps(embedding))
    )

conn.commit()
cur.close()
conn.close()
print("Inserted", len(docs), "documents with embeddings")

This pattern (embed with Ollama, store in pgvector) is the foundation of a self-hosted RAG pipeline. No data leaves your network, no API costs, and your existing PostgreSQL backup strategy covers the vectors too.
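The retrieval half of that pipeline is just the cosine-distance ordering shown earlier. Here is a self-contained in-memory sketch of what ORDER BY embedding <=> query LIMIT 2 computes, using toy 3-dimension vectors in place of real 768-dimension embeddings (no database or Ollama required):

```python
import math

def cosine_distance(a, b):
    # Same metric as pgvector's <=> operator
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1 - dot / norm

# Toy "embeddings" standing in for model output
corpus = {
    "postgres indexing": [0.9, 0.1, 0.2],
    "kernel scheduling": [0.1, 0.9, 0.3],
    "nginx proxying":    [0.2, 0.2, 0.9],
}
query = [0.85, 0.15, 0.25]  # embedding of the user's question

# Equivalent of: SELECT content FROM knowledge_base
#                ORDER BY embedding <=> %s LIMIT 2;
top2 = sorted(corpus, key=lambda k: cosine_distance(corpus[k], query))[:2]
print(top2)
```

In a real RAG pipeline, the two retrieved documents would be passed to a language model as context for answering the question.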

Embedding Model Dimensions Reference

When creating your vector column, the dimension must match the model you use. Here are the common embedding models and their output dimensions:

Model                  | Provider              | Dimensions | Notes
-----------------------|-----------------------|------------|--------------------------
nomic-embed-text       | Ollama (local)        | 768        | Open source, no API cost
mxbai-embed-large      | Ollama (local)        | 1024       | Higher quality, larger
all-MiniLM-L6-v2       | Sentence Transformers | 384        | Fast, lightweight
text-embedding-ada-002 | OpenAI                | 1536       | Legacy model
text-embedding-3-small | OpenAI                | 1536       | Current recommended
text-embedding-3-large | OpenAI                | 3072       | Highest quality (OpenAI)
embed-english-v3.0     | Cohere                | 1024       | Multilingual available

Rocky Linux 10 vs Ubuntu 24.04

The pgvector extension works identically on both platforms. The differences are in package names, config paths, and service management:

Item              | Rocky Linux 10                         | Ubuntu 24.04
------------------|----------------------------------------|-----------------------------------------
pgvector package  | pgvector_17                            | postgresql-17-pgvector
PostgreSQL config | /var/lib/pgsql/17/data/postgresql.conf | /etc/postgresql/17/main/postgresql.conf
Service name      | postgresql-17                          | postgresql
Data directory    | /var/lib/pgsql/17/data/                | /var/lib/postgresql/17/main/
SELinux/AppArmor  | SELinux enforcing (no pgvector issues) | AppArmor (no action needed)

Production Considerations

A few things to keep in mind when running pgvector in production:

  • Backup: pgvector data is stored in regular PostgreSQL tables. Your existing pg_dump or streaming replication setup covers vectors automatically
  • Index rebuilds: HNSW indexes can be built with parallel workers. Set max_parallel_maintenance_workers to match your CPU cores for faster builds on large datasets
  • Memory: HNSW indexes are memory-mapped. For best performance, ensure effective_cache_size is large enough to keep the index in the OS page cache
  • Dimension limit: pgvector supports up to 16,000 dimensions for vector and halfvec types. Use halfvec to cut memory usage in half when you can tolerate minor precision differences
  • Vacuuming: Run VACUUM after large bulk deletes to reclaim space in HNSW indexes

Going Further

With pgvector installed, you have the building blocks for several AI-powered workflows:

  • Self-hosted RAG: Combine pgvector with Ollama to build a document Q&A system that runs entirely on your infrastructure
  • Semantic search: Replace keyword-based search in your application with vector similarity for more relevant results
  • Recommendation engines: Store user preference vectors and find similar items or users
  • Image similarity: Store CLIP embeddings and find visually similar images
  • Anomaly detection: Flag items that are distant from their expected cluster centroid
