The first time a product manager asks for a search feature, the engineering team reaches for Elasticsearch. It's the obvious choice — it's what you've heard about, it's the dedicated search engine, it scales. What nobody mentions is that Elasticsearch is also a distributed system with its own operational complexity, eventual consistency semantics, index management overhead, and a steep learning curve. For most applications at most stages of growth, it's massive overkill.
PostgreSQL has had full-text search since version 8.3. It is not a replacement for a dedicated search engine at scale, but it will serve the vast majority of applications well — and it requires zero additional infrastructure.
The core types: tsvector and tsquery
Full-text search in PostgreSQL revolves around two types. A tsvector is a preprocessed, lexeme-normalized representation of a document. A tsquery is a normalized query expression. The @@ operator checks whether a query matches a document.
SELECT to_tsvector('english', 'The quick brown fox jumped over the lazy dog')
@@ to_tsquery('english', 'fox & dog');
-- true
SELECT to_tsvector('english', 'The quick brown fox jumped over the lazy dog')
@@ to_tsquery('english', 'cat');
-- false
to_tsvector tokenizes the text, removes stop words, and reduces words to their lexemes (stems). "jumped" becomes "jump". "running" becomes "run". This means a search for "run" matches rows containing "running", "runs", or "ran" — the behavior you'd expect from a real search engine.
Adding full-text search to a table
The idiomatic approach is to store a precomputed tsvector column and keep it updated with a trigger:
ALTER TABLE articles
ADD COLUMN search_vector tsvector
GENERATED ALWAYS AS (
setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
setweight(to_tsvector('english', coalesce(body, '')), 'B')
) STORED;
CREATE INDEX idx_articles_search ON articles USING GIN (search_vector);
Note the use of setweight — this gives title matches higher relevance than body matches. The GIN index makes searches fast; without it, Postgres would do a sequential scan.
Querying is then straightforward:
SELECT id, title,
ts_rank(search_vector, query) AS rank
FROM articles, to_tsquery('english', 'postgres & search') query
WHERE search_vector @@ query
ORDER BY rank DESC
LIMIT 20;
Ranking and highlighting
PostgreSQL provides two ranking functions:
ts_rank(vector, query)— ranks by term frequency relative to document lengthts_rank_cd(vector, query)— ranks by cover density (proximity of terms to each other)
For building a search results UI, ts_headline generates highlighted snippets:
SELECT title,
ts_headline('english', body, query,
'MaxWords=30, MinWords=10, ShortWord=3') AS snippet
FROM articles, to_tsquery('english', 'full-text & search') query
WHERE search_vector @@ query;
Handling user input safely
Never pass raw user input directly to to_tsquery — it will error on malformed input. Use websearch_to_tsquery (added in PostgreSQL 11) or plainto_tsquery instead:
-- websearch_to_tsquery understands +, -, "", OR
SELECT * FROM articles
WHERE search_vector @@ websearch_to_tsquery('english', $1)
-- $1 = user input, passed as a parameter
Always use parameterised queries. Never interpolate user input into SQL strings — that's an injection vulnerability regardless of which search function you're using.
When PostgreSQL FTS is enough
PostgreSQL full-text search handles well:
- Simple keyword search with stemming and stop words
- Phrase search and boolean queries (
AND,OR,NOT) - Weighted multi-column search (title vs. body vs. tags)
- Relevance ranking and result highlighting
- Tables up to tens of millions of rows with proper indexing
When to reach for something else
There are cases where PostgreSQL FTS genuinely won't cut it:
- Fuzzy / typo-tolerant search — "postgress" should match "postgres". PostgreSQL has
pg_trgmfor basic trigram similarity, but dedicated engines do this much better. - Synonyms and semantic search — requires a thesaurus configuration or a vector embedding approach that Postgres alone doesn't handle.
- Extremely high write volume with concurrent reads — GIN index updates can become a bottleneck at very high write rates.
- Faceted search with complex aggregations — Elasticsearch's aggregation pipeline is far ahead here.
For most applications — a blog, a SaaS product knowledge base, a marketplace — you won't hit these limits for a long time, if ever. Start with PostgreSQL. When you hit a concrete limit that PostgreSQL can't clear, add a dedicated search layer. Don't add the operational complexity upfront for a problem you might never have.