Indexes Are the Single Biggest Performance Win

A missing index can turn a 10ms query into a 10-second full table scan. Proper indexing is the most impactful performance optimization for database-backed applications. At Nexis Limited, we performance-tune databases across all SaaS products — finding and fixing missing indexes is typically the first and highest-impact optimization.

How Indexes Work

An index is a data structure (typically a B-tree) that allows the database to find rows quickly without scanning the entire table. Think of it as a book index — instead of reading every page to find "PostgreSQL," you look up "PostgreSQL" in the index and go directly to the relevant page.

Without an index on the "email" column, finding a user by email requires scanning every row in the table. With an index, the database jumps directly to the matching row using O(log n) lookups instead of O(n) scans.

Index Types

B-Tree Index (Default)

The standard index type. Supports equality (=), range (<, >, BETWEEN), ordering (ORDER BY), and prefix matching (LIKE 'prefix%'). Use for most columns used in WHERE clauses, JOINs, or ORDER BY clauses.

Hash Index

Optimized for exact equality lookups (=) only. Faster than B-tree for equality but does not support range queries or ordering. Use only when equality lookups are the sole use case and performance is critical.

GIN Index

Generalized Inverted Index — designed for values that contain multiple elements. Essential for full-text search (tsvector), JSONB querying, and array containment operations. If you query JSONB fields or use full-text search, you need GIN indexes.

GiST Index

Generalized Search Tree — supports geometric and range data types. Used for geographic queries (PostGIS), range types, and nearest-neighbor searches.

Composite Indexes

A composite index covers multiple columns. The column order matters significantly:

  • A composite index on (tenant_id, created_at) supports queries filtering on tenant_id alone, and queries filtering on both tenant_id and created_at.
  • It does NOT efficiently support queries filtering only on created_at. The leftmost column must be present in the query for the index to be used.
  • Order columns from most selective (most unique values) to least selective, or match the order of your most common query patterns.

Covering Indexes

A covering index includes all columns referenced in a query. The database can answer the query entirely from the index without accessing the table data (an "index-only scan"). In PostgreSQL, use the INCLUDE clause to add non-indexed columns to the index for covering.

Query Analysis with EXPLAIN

Use EXPLAIN ANALYZE to understand how the database executes a query:

  • Seq Scan: Full table scan — the database reads every row. Often indicates a missing index.
  • Index Scan: Uses an index to find matching rows, then fetches row data from the table.
  • Index Only Scan: Answers the query entirely from the index. The fastest scan type.
  • Bitmap Index Scan: Combines multiple index results. Common when multiple conditions use different indexes.

Look at the "actual time" and "rows" columns. High actual time with many rows scanned indicates an optimization opportunity.

Common Anti-Patterns

Over-Indexing

Every index has a write cost — inserts and updates must maintain all indexes on the table. Too many indexes slow down write operations and increase storage. Index only columns that are actually queried. Remove indexes that are not used — query pg_stat_user_indexes to find unused indexes.

Indexing Low-Cardinality Columns

An index on a boolean column (is_active) or a status column with few distinct values provides little benefit — the database may still choose a full table scan because the index does not narrow the results significantly. Combine low-cardinality columns with high-cardinality columns in composite indexes instead.

Functions on Indexed Columns

WHERE LOWER(email) = '[email protected]' cannot use an index on "email" because LOWER() transforms the value. Create a functional index on LOWER(email) or store normalized values in a separate column.

LIKE with Leading Wildcards

WHERE name LIKE '%smith' cannot use a B-tree index because the wildcard is at the beginning. Use full-text search (GIN index on tsvector) for substring searching, or use trigram indexes (pg_trgm extension).

Practical Optimization Workflow

  1. Enable slow query logging (log_min_duration_statement = 100ms in PostgreSQL).
  2. Identify the slowest and most frequent queries from logs.
  3. Run EXPLAIN ANALYZE on each slow query.
  4. Add missing indexes based on WHERE, JOIN, and ORDER BY columns.
  5. Verify improvement with EXPLAIN ANALYZE after adding indexes.
  6. Monitor pg_stat_user_indexes to identify and remove unused indexes.

Conclusion

Database indexing is the highest-impact performance optimization available. Understand your query patterns, use EXPLAIN ANALYZE to diagnose issues, create targeted indexes, and monitor for unused indexes. A well-indexed database handles orders of magnitude more traffic than an unindexed one with the same hardware.

Optimizing database performance? Our team conducts database performance tuning and optimization.