API with NestJS #163. Full-text search with the Drizzle ORM and PostgreSQL

NestJS SQL

This entry is part 163 of 168 in the API with NestJS

With PostgreSQL’s full-text search feature, we can quickly find documents that contain a particular word or phrase. It can also sort the results to show the most relevant matches first. In this article, we learn how to implement it with the Drizzle ORM, PostgreSQL, and NestJS.

Column types used with the text-search feature

To implement the full-text search with PostgreSQL, we need two data types. They allow us to search through a set of texts and find the ones that best match a given query.

tsvector

With the column, we can store the text in a format optimized for searching. Unfortunately, the Drizzle ORM does not support it natively yet. Therefore, we must understand how it works from the ground up and use some raw SQL in our code.

To convert a regular string to the format, we need to use the function.

The quick brown fox jumps over the lazy dog” is a common sentence that contains all the letters of the alphabet.

When we examine the result of the above query, several optimizations become apparent. The most apparent is duplicate grouping. Using the English dictionary, PostgreSQL recognized that “quick” and “quickly” are different forms of the same word.

Besides that, the type filters the stop words. These common words appear in almost every sentence but don’t add much value when searching through text. In the example above, since we used the English dictionary, PostgreSQL automatically filtered out words like “the” and “over.” This is a test.

tsquery

The data type is designed to store the text we want to search for. To easily convert a string into the format, we must use the function.

To verify if the  data matches a , we must use the operator.

We can use the boolean operators such as , , and . For example, the operator helps ensure that a particular word is not included in the text.

Check out the official documentation for a description of all operators.

Another helpful function is . It converts an unformatted phrase to a query by inserting the operator between words, making it a great option for handling the user’s input.

Working with existing data

In the previous parts of this series, we’ve worked with the following database schema.

database-schema.ts

Since our table does not contain a column, we must find another solution. The most straightforward approach is to convert our text to on the fly.

articles.service.ts

To use this approach in our NestJS application, we should use an optional query param that allows the user to provide the search query.

articles.controller.ts

We can combine the and columns to search through them both.

articles.service.ts

The key problem with this approach is that it forces PostgreSQL to convert the text from every record in the table, which can be very time-consuming. Instead, we can define a generated column that automatically transforms the data into the format.

If you want to know more about generated colums, check out API with NestJS #161. Generated columns with the Drizzle ORM and PostgreSQL

Unfortunately, the Drizzle ORM does not support the columns. To deal with that, we have to define a custom type.

database-schema.ts

Above, we create a stored generated column. PostgreSQL automatically updates it whenever the or columns change.

We also create a Generalized Inverted Index (GIN). This index type is highly effective for text searching and is ideal when a column contains multiple values. Implementing a GIN index can significantly enhance the speed of our queries.

Thanks to this approach, we can now simplify our query.

articles.service.ts

Ordering our results

Until now, we haven’t focused on the order of the results in our query. Sorting the search results by relevance can significantly improve the user experience.

For instance, we can prioritize the text from the column over the column. To achieve this, we can modify how we create the column and use the function.

database-schema.ts

With the function, we assign a weight to each term in the tsvector column. The A indicates the highest weight, and the D indicates the lowest.

Thanks to this, we can use the function to order the results based on the weight of each column.

With this approach, if our query matches the title of one article and the content of another, the article with the title match will be ranked higher.

Summary

In this article, we implemented the full-text search functionality in an application that uses PostgreSQL, NestJS, and the Drizzle ORM. To do that, we had to understand the and data types and learn how to use them in the Drizzle ORM. To boost the performance, we created a generated column and set up a Generalized Inverted Index. Thanks to this, we achieved a fast and efficient search mechanism, even though the Drizzle ORM does not natively support some of the necessary features.

Series Navigation<< API with NestJS #162. Identity columns with the Drizzle ORM and PostgreSQLAPI with NestJS #164. Improving the performance with indexes using Drizzle ORM >>
Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments