PostgreSQL Full-Text Search Is Underrated

The first time a product manager asks for a search feature, the engineering team reaches for Elasticsearch. It's the obvious choice — it's what you've heard about, it's the dedicated search engine, it scales. What nobody mentions is that Elasticsearch is also a distributed system with its own operational complexity, eventual consistency semantics, index management overhead, and a steep learning curve. For most applications at most stages of growth, it's massive overkill.

PostgreSQL has had full-text search since version 8.3. It is not a replacement for a dedicated search engine at scale, but it will serve the vast majority of applications well — and it requires zero additional infrastructure.

The core types: tsvector and tsquery

Full-text search in PostgreSQL revolves around two types. A tsvector is a preprocessed, lexeme-normalized representation of a document. A tsquery is a normalized query expression. The @@ operator checks whether a query matches a document.

SELECT to_tsvector('english', 'The quick brown fox jumped over the lazy dog')
       @@ to_tsquery('english', 'fox & dog');
-- true

SELECT to_tsvector('english', 'The quick brown fox jumped over the lazy dog')
       @@ to_tsquery('english', 'cat');
-- false

to_tsvector tokenizes the text, removes stop words, and reduces words to their lexemes (stems). "jumped" becomes "jump". "running" becomes "run". This means a search for "run" matches rows containing "running", "runs", or "ran" — the behavior you'd expect from a real search engine.

Adding full-text search to a table

The idiomatic approach is to store a precomputed tsvector column and keep it updated with a trigger:

ALTER TABLE articles
  ADD COLUMN search_vector tsvector
  GENERATED ALWAYS AS (
    setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
    setweight(to_tsvector('english', coalesce(body, '')),  'B')
  ) STORED;

CREATE INDEX idx_articles_search ON articles USING GIN (search_vector);

Note the use of setweight — this gives title matches higher relevance than body matches. The GIN index makes searches fast; without it, Postgres would do a sequential scan.

Querying is then straightforward:

SELECT id, title,
       ts_rank(search_vector, query) AS rank
FROM articles, to_tsquery('english', 'postgres & search') query
WHERE search_vector @@ query
ORDER BY rank DESC
LIMIT 20;

Ranking and highlighting

PostgreSQL provides two ranking functions:

ts_rank(vector, query) — ranks by term frequency relative to document length
ts_rank_cd(vector, query) — ranks by cover density (proximity of terms to each other)

For building a search results UI, ts_headline generates highlighted snippets:

SELECT title,
       ts_headline('english', body, query,
                   'MaxWords=30, MinWords=10, ShortWord=3') AS snippet
FROM articles, to_tsquery('english', 'full-text & search') query
WHERE search_vector @@ query;

Handling user input safely

Never pass raw user input directly to to_tsquery — it will error on malformed input. Use websearch_to_tsquery (added in PostgreSQL 11) or plainto_tsquery instead:

-- websearch_to_tsquery understands +, -, "", OR
SELECT * FROM articles
WHERE search_vector @@ websearch_to_tsquery('english', $1)
-- $1 = user input, passed as a parameter

Always use parameterised queries. Never interpolate user input into SQL strings — that's an injection vulnerability regardless of which search function you're using.

When PostgreSQL FTS is enough

PostgreSQL full-text search handles well:

Simple keyword search with stemming and stop words
Phrase search and boolean queries (AND, OR, NOT)
Weighted multi-column search (title vs. body vs. tags)
Relevance ranking and result highlighting
Tables up to tens of millions of rows with proper indexing

When to reach for something else

There are cases where PostgreSQL FTS genuinely won't cut it:

Fuzzy / typo-tolerant search — "postgress" should match "postgres". PostgreSQL has pg_trgm for basic trigram similarity, but dedicated engines do this much better.
Synonyms and semantic search — requires a thesaurus configuration or a vector embedding approach that Postgres alone doesn't handle.
Extremely high write volume with concurrent reads — GIN index updates can become a bottleneck at very high write rates.
Faceted search with complex aggregations — Elasticsearch's aggregation pipeline is far ahead here.

For most applications — a blog, a SaaS product knowledge base, a marketplace — you won't hit these limits for a long time, if ever. Start with PostgreSQL. When you hit a concrete limit that PostgreSQL can't clear, add a dedicated search layer. Don't add the operational complexity upfront for a problem you might never have.