API with NestJS #164. Improving the performance with indexes using Drizzle ORM

September 2, 2024

This entry is part 164 of 174 in the API with NestJS

As our database grows, focusing more on its performance is important. SQL indexes act as guides within the database, helping to speed up data retrieval. In this article, we cover how to create indexes using PostgreSQL, Drizzle ORM, and NestJS and explain what indexes are.

Why we might need indexes

In the previous parts of this series, we’ve created the following database schema:

database-schema.ts

import { serial, text, integer, pgTable } from 'drizzle-orm/pg-core';

export const users = pgTable('users', {

id: serial('id').primaryKey(),

email: text('email').unique(),

name: text('name'),

password: text('password'),

});

export const articles = pgTable('articles', {

id: serial('id').primaryKey(),

title: text('title'),

content: text('content'),

authorId: integer('author_id').references(() => users.id),

});

// ...

export const databaseSchema = {

articles,

users,

};

There is a good chance that we might want to find all the articles written by a particular author.

articles.service.ts

import { Injectable } from '@nestjs/common';

import { DrizzleService } from '../database/drizzle.service';

import { databaseSchema } from '../database/database-schema';

import { eq } from 'drizzle-orm';

@Injectable()

export class ArticlesService {

constructor(private readonly drizzleService: DrizzleService) {}

getByAuthorId(authorId: number) {

return this.drizzleService.db

.select()

.from(databaseSchema.articles)

.where(eq(databaseSchema.articles.authorId, authorId));

}

When we use where(eq(databaseSchema.articles.authorId, authorId)), PostgreSQL needs to scan the whole articles table to find the appropriate rows. We can verify that with the EXPLAIN ANALYZE query.

EXPLAIN ANALYZE

SELECT * FROM articles

WHERE author_id = 1

We can see that PostgreSQL performs a sequential scan, reading each row in the database to find the data that matches the filters. This approach can be slow, especially with large datasets. Let’s improve this by adding an index.

Introducing indexes

An SQL index works much like a book’s index, allowing the database to find information quickly. Let’s add an index on the author_id column to speed up our query.

database-schema.ts

import { serial, text, integer, pgTable, index } from 'drizzle-orm/pg-core';

export const articles = pgTable(

'articles',

{

id: serial('id').primaryKey(),

title: text('title'),

content: text('content'),

authorId: integer('author_id').references(() => users.id),

(table) => ({

authorIdIndex: index('author_id_index').on(table.authorId),

}),

);

// ...

export const databaseSchema = {

articles,

// ...

};

Let’s take a look at our query now and see if there is a change

EXPLAIN ANALYZE

SELECT * FROM articles

WHERE author_id = 1

We can see that now PostgreSQL does the index scan instead of a sequential scan that uses the index and speeds up the execution time a lot.

When we create an index, PostgreSQL begins maintaining a data structure organized around a certain database column. We can visualize this as key-value pairs.

author_id	article_id
1	1
1	2
1	3
2	4
3	5
3	6

PostgreSQL uses the B-tree data structure with each leaf pointing to a specific row.

With the index in place, PostgreSQL can quickly locate all articles written by a specific author because it has a structure sorted by the author’s ID. However, there are some downsides to using indexes.

Maintaining this additional data structure requires extra space in the database. Although indexes speed up data retrieval, maintaining an additional data structure introduces extra work for the database when inserting, updating, or deleting records. This added overhead means that we must carefully plan our indexes to avoid negatively impacting overall performance.

Unique indexes

In the previous parts of this series, we created a table to store each user where each person has a unique email.

database-schema.ts

import { serial, text, integer, pgTable } from 'drizzle-orm/pg-core';

export const addresses = pgTable('addresses', {

id: serial('id').primaryKey(),

street: text('street'),

city: text('city'),

country: text('country'),

});

export const users = pgTable('users', {

id: serial('id').primaryKey(),

email: text('email').unique(),

name: text('name'),

password: text('password'),

addressId: integer('address_id')

.unique()

.references(() => addresses.id),

});

export const databaseSchema = {

addresses,

users,

};

Since we marked the email column as unique, PostgreSQL prevents us from creating two users with the same email. The database automatically creates an index to improve performance when we mark a column as unique. Let’s verify that.

We also mark the address id as unique, because a particular address should belong to only one user. If you want to know more about how to define one-to-one relationships like that, check out API with NestJS #150. One-to-one relationships with the Drizzle ORM

SELECT tablename, indexname

FROM pg_indexes

WHERE tablename='users';

Above, we can see that when we define a primary key, we also get a unique index.

Unique indexes not only help the database quickly search through existing users to determine if a particular email is unique but also improve the performance of various SELECT queries.

EXPLAIN ANALYZE

SELECT * FROM users

WHERE email = 'john@smith.com'

Indexes with multiple columns

Our queries can include multiple filters. For example, we might want to find an article written by a specific user and includes a particular title.

const articles = await this.drizzleService.db

.select()

.from(databaseSchema.articles)

.where(

and(

eq(databaseSchema.articles.authorId, 1),

eq(databaseSchema.articles.title, 'My first article'),

);

Creating separate indexes on the author_id and title columns would speed up the above query. However, we can take it further and create a multi-column index.

database-schema.ts

import { serial, text, integer, pgTable, index } from 'drizzle-orm/pg-core';

export const articles = pgTable(

'articles',

{

id: serial('id').primaryKey(),

title: text('title'),

content: text('content'),

authorId: integer('author_id').references(() => users.id),

(table) => ({

authorIdIndex: index('author_id_index').on(table.authorId),

authorIdTitleIndex: index('author_id_title_index').on(

table.authorId,

table.title,

}),

);

// ...

export const databaseSchema = {

articles,

// ...

};

Types of indexes

So far, all of the indexes we created in this article use a B-tree structure. While it works well in most situations, there are other options available.

Generalized Inverted Indexes (GIN)

GIN indexes are particularly useful when dealing with complex data types like arrays or JSON. They can also be quite handy when implementing text search functionality.

database-schema.ts

import { serial, text, pgTable, customType, index } from 'drizzle-orm/pg-core';

import { sql, SQL } from 'drizzle-orm';

const tsvector = customType<{ data: unknown }>({

dataType() {

return 'tsvector';

});

export const articles = pgTable(

'articles',

{

id: serial('id').primaryKey(),

title: text('title'),

content: text('content'),

textTsvector: tsvector('text_tsvector').generatedAlwaysAs(

(): SQL => sql`

setweight(to_tsvector('english', ${articles.title}), 'A') ||

setweight(to_tsvector('english', ${articles.content}), 'B')

(table) => {

return {

textTsvectorIndex: index('text_tsvector_index').using(

'gin',

table.textTsvector,

};

);

export const databaseSchema = {

articles,

};

If you want to know more about implementing the full-text search functionality with the Drizzle ORM, take a look at API with NestJS #163. Full-text search with the Drizzle ORM and PostgreSQL

To make sure that the GIN index is available in our database, we may need to enable the pg_trgm and btree_gin extensions.

1 2	CREATE EXTENSION pg_trgm; CREATE EXTENSION btree_gin;

Generalized Search Tree (GIST)

GIST indexes are useful for indexing geometric data and can also be used to implement the full-text search feature. In certain situations, they might be preferable over GIN indexes.

database-schema.ts

import { serial, text, pgTable, customType, index } from 'drizzle-orm/pg-core';

import { sql, SQL } from 'drizzle-orm';

const tsvector = customType<{ data: unknown }>({

dataType() {

return 'tsvector';

});

export const articles = pgTable(

'articles',

{

id: serial('id').primaryKey(),

title: text('title'),

content: text('content'),

textTsvector: tsvector('text_tsvector').generatedAlwaysAs(

(): SQL => sql`

setweight(to_tsvector('english', ${articles.title}), 'A') ||

setweight(to_tsvector('english', ${articles.content}), 'B')

(table) => {

return {

textTsvectorIndex: index('text_tsvector_index').using(

'gist',

table.textTsvector,

};

);

export const databaseSchema = {

articles,

};

We may need to enable the btree_gist extension in our database to use GIST indexes.

1	CREATE EXTENSION btree_gist;

Hash indexes

Hash indexes in PostgreSQL are based on the hash table data structure and might work well in some use cases.

export const articles = pgTable(

'articles',

{

id: serial('id').primaryKey(),

title: text('title'),

content: text('content'),

authorId: integer('author_id').references(() => users.id),

(table) => ({

titleIndex: index('title_index').using('hash', table.title),

}),

);

Block Range Indexes (BRIN)

The Block Range Indexes (BRIN) can be useful for indexing very large datasets.

export const articles = pgTable(

'articles',

{

id: serial('id').primaryKey(),

title: text('title'),

content: text('content'),

authorId: integer('author_id').references(() => users.id),

(table) => ({

titleIndex: index('title_index').using('brin', table.title),

}),

);

Summary

In this article, we explored the fundamentals of indexes and implemented examples that enhanced the performance of our SELECT queries. We also discussed the potential drawbacks of misusing indexes. Additionally, we highlighted that the default index type in PostgreSQL is the B-tree index, which is highly effective for various queries and is the most commonly used. Additionally, we touched on other index types like GIN and hash indexes.

With this knowledge, we can now effectively manage indexes when working with the Drizzle ORM.

Series Navigation<< API with NestJS #163. Full-text search with the Drizzle ORM and PostgreSQLAPI with NestJS #165. Time intervals with the Drizzle ORM and PostgreSQL >>