- API with NestJS #1. Controllers, routing and the module structure
- API with NestJS #2. Setting up a PostgreSQL database with TypeORM
- API with NestJS #3. Authenticating users with bcrypt, Passport, JWT, and cookies
- API with NestJS #4. Error handling and data validation
- API with NestJS #5. Serializing the response with interceptors
- API with NestJS #6. Looking into dependency injection and modules
- API with NestJS #7. Creating relationships with Postgres and TypeORM
- API with NestJS #8. Writing unit tests
- API with NestJS #9. Testing services and controllers with integration tests
- API with NestJS #10. Uploading public files to Amazon S3
- API with NestJS #11. Managing private files with Amazon S3
- API with NestJS #12. Introduction to Elasticsearch
- API with NestJS #13. Implementing refresh tokens using JWT
- API with NestJS #14. Improving performance of our Postgres database with indexes
- API with NestJS #15. Defining transactions with PostgreSQL and TypeORM
- API with NestJS #16. Using the array data type with PostgreSQL and TypeORM
- API with NestJS #17. Offset and keyset pagination with PostgreSQL and TypeORM
- API with NestJS #18. Exploring the idea of microservices
- API with NestJS #19. Using RabbitMQ to communicate with microservices
- API with NestJS #20. Communicating with microservices using the gRPC framework
- API with NestJS #21. An introduction to CQRS
- API with NestJS #22. Storing JSON with PostgreSQL and TypeORM
- API with NestJS #23. Implementing in-memory cache to increase the performance
- API with NestJS #24. Cache with Redis. Running the app in a Node.js cluster
- API with NestJS #25. Sending scheduled emails with cron and Nodemailer
Many web applications include some form of search functionality. While iterating through a small data set might be fine, performance can become an issue with more extensive databases. Relational databases can prove relatively slow when searching through a lot of data.
A solution to the above problem might be Elasticsearch. It is a search engine that highly focuses on performance. When using it, we maintain a separate document-oriented database.
If you are familiar with MongoDB, document-oriented databases will ring a bell for you. In theory, we might use Elasticsearch as a general-purpose database. It wasn’t designed for this purpose, though. If you would like to read more about it, check out this question on Stack Overflow.
Running Elasticsearch
Running Elasticsearch means maintaining a separate, search-optimized database. Because of that, we need to choose a way to fire it up.
In the second part of this series, we’ve started using Docker Compose. Therefore, a fitting way to start using Elasticsearch would be to do so through Docker. When we go to the official Elasticsearch documentation, we can see an example using Docker Compose. It includes three nodes.
An Elasticsearch cluster is a group of one or more Elasticsearch nodes connected. Each node is an instance of Elasticsearch.
Let’s add the above official configuration to our existing file.
docker-compose.yml
```yaml
version: "3"
services:
  postgres:
    container_name: postgres
    image: postgres:latest
    ports:
      - "5432:5432"
    volumes:
      - /data/postgres:/data/postgres
    env_file:
      - docker.env
    networks:
      - postgres
  pgadmin:
    links:
      - postgres:postgres
    container_name: pgadmin
    image: dpage/pgadmin4
    ports:
      - "8080:80"
    volumes:
      - /data/pgadmin:/root/.pgadmin
    env_file:
      - docker.env
    networks:
      - postgres
  es01:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.9.1
    container_name: es01
    environment:
      - node.name=es01
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es02,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - data01:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
    networks:
      - elastic
  es02:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.9.1
    container_name: es02
    environment:
      - node.name=es02
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es01,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - data02:/usr/share/elasticsearch/data
    networks:
      - elastic
  es03:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.9.1
    container_name: es03
    environment:
      - node.name=es03
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es01,es02
      - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - data03:/usr/share/elasticsearch/data
    networks:
      - elastic

volumes:
  data01:
    driver: local
  data02:
    driver: local
  data03:
    driver: local

networks:
  postgres:
    driver: bridge
  elastic:
    driver: bridge
```
You might run into an issue when doing the above: es01 exited with code 78. There is a high chance that increasing the vm.max_map_count will help, as described here.
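On Linux, one way to raise that limit is through `sysctl`. To persist the value the Elasticsearch documentation suggests, we can add the following line to `/etc/sysctl.conf` (to apply it immediately without a reboot, we can run `sudo sysctl -w vm.max_map_count=262144`):

```
vm.max_map_count=262144
```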
By default, the password for Elasticsearch is changeme. To set up a password, we can add it to our docker.env file:
docker.env

```
(...)
ELASTIC_PASSWORD=admin
```
The default username is "elastic".
Connecting to Elasticsearch in NestJS
To use Elasticsearch within our NestJS project, we can use the official @nestjs/elasticsearch library.
It wraps the @elastic/elasticsearch client. Since it is a peer dependency of @nestjs/elasticsearch, we need to install it.
Don’t confuse it with the “elasticsearch” client that will soon be deprecated.
```bash
npm install @nestjs/elasticsearch @elastic/elasticsearch
```
Due to how we set up Elasticsearch, our cluster is available at http://localhost:9200. The username is elastic, and the password is admin. We need to add all of the above to our environment variables.
.env
```
(...)
ELASTICSEARCH_NODE=http://localhost:9200
ELASTICSEARCH_USERNAME=elastic
ELASTICSEARCH_PASSWORD=admin
```
Now we can create our module that uses the above configuration.
/src/search/search.module.ts
```typescript
import { Module } from '@nestjs/common';
import { ConfigModule, ConfigService } from '@nestjs/config';
import { ElasticsearchModule } from '@nestjs/elasticsearch';

@Module({
  imports: [
    ConfigModule,
    ElasticsearchModule.registerAsync({
      imports: [ConfigModule],
      useFactory: async (configService: ConfigService) => ({
        node: configService.get('ELASTICSEARCH_NODE'),
        auth: {
          username: configService.get('ELASTICSEARCH_USERNAME'),
          password: configService.get('ELASTICSEARCH_PASSWORD'),
        }
      }),
      inject: [ConfigService],
    }),
  ],
  exports: [ElasticsearchModule]
})
export class SearchModule {}
```
We export the ElasticsearchModule above so that we can use some of its features when importing SearchModule, as suggested here.
Populating Elasticsearch with data
The first thing to consider when populating Elasticsearch with data is the concept of the index. In the context of Elasticsearch, we group similar documents by assigning them the same index.
In previous versions of Elasticsearch, we also used types to group documents, but this concept is being abandoned.
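For example, indexing a document into a posts index through the Elasticsearch REST API could look like the request below (a hypothetical document, shown only to illustrate the concept of an index; our NestJS service performs the equivalent call through the client library):

```
POST /posts/_doc
{
  "id": 1,
  "title": "Hello world",
  "content": "A post about Elasticsearch",
  "authorId": 1
}
```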
When populating the Elasticsearch database with data, we throw in only the parts that we later use when searching. Let’s create an interface for that purpose.
/src/posts/types/postSearchBody.interface.ts
```typescript
interface PostSearchBody {
  id: number,
  title: string,
  content: string,
  authorId: number
}

export default PostSearchBody;
```
The TypeScript support with Elasticsearch is not that good, unfortunately. Following the official documentation, we can create a search response type for our posts.
/src/posts/types/postSearchResponse.interface.ts

```typescript
import PostSearchBody from './postSearchBody.interface';

interface PostSearchResult {
  hits: {
    total: number;
    hits: Array<{
      _source: PostSearchBody;
    }>;
  };
}

export default PostSearchResult;
```
When we’re done with the above, we can create a service that takes care of interacting with our Elasticsearch cluster.
/src/posts/postsSearch.service.ts
```typescript
import { Injectable } from '@nestjs/common';
import { ElasticsearchService } from '@nestjs/elasticsearch';
import Post from './post.entity';
import PostSearchResult from './types/postSearchResponse.interface';
import PostSearchBody from './types/postSearchBody.interface';

@Injectable()
export default class PostsSearchService {
  index = 'posts'

  constructor(
    private readonly elasticsearchService: ElasticsearchService
  ) {}

  async indexPost(post: Post) {
    return this.elasticsearchService.index<PostSearchResult, PostSearchBody>({
      index: this.index,
      body: {
        id: post.id,
        title: post.title,
        content: post.content,
        authorId: post.author.id
      }
    })
  }

  async search(text: string) {
    const { body } = await this.elasticsearchService.search<PostSearchResult>({
      index: this.index,
      body: {
        query: {
          multi_match: {
            query: text,
            fields: ['title', 'content']
          }
        }
      }
    })
    const hits = body.hits.hits;
    return hits.map((item) => item._source);
  }
}
```
Above, we use multi_match because we want to search through both the title and the content of the posts.
The crucial thing to acknowledge about elasticsearchService.search is that it returns just the properties that we’ve put into the Elasticsearch database. Since we save the ids of the posts, we can now get the whole documents from our Postgres database. Let’s put this logic into PostsService.
/src/posts/posts.service.ts
```typescript
import { Injectable } from '@nestjs/common';
import CreatePostDto from './dto/createPost.dto';
import Post from './post.entity';
import { InjectRepository } from '@nestjs/typeorm';
import { Repository, In } from 'typeorm';
import User from '../users/user.entity';
import PostsSearchService from './postsSearch.service';

@Injectable()
export default class PostsService {
  constructor(
    @InjectRepository(Post)
    private postsRepository: Repository<Post>,
    private postsSearchService: PostsSearchService
  ) {}

  // (...)

  async createPost(post: CreatePostDto, user: User) {
    const newPost = this.postsRepository.create({
      ...post,
      author: user
    });
    await this.postsRepository.save(newPost);
    this.postsSearchService.indexPost(newPost);
    return newPost;
  }

  async searchForPosts(text: string) {
    const results = await this.postsSearchService.search(text);
    const ids = results.map(result => result.id);
    if (!ids.length) {
      return [];
    }
    return this.postsRepository
      .find({
        where: { id: In(ids) }
      });
  }
}
```
The last thing to do is to modify the controller so that it accepts a query parameter.
/src/posts/posts.controller.ts
```typescript
import {
  Controller,
  Get,
  UseInterceptors,
  ClassSerializerInterceptor,
  Query,
} from '@nestjs/common';
import PostsService from './posts.service';

@Controller('posts')
@UseInterceptors(ClassSerializerInterceptor)
export default class PostsController {
  constructor(
    private readonly postsService: PostsService
  ) {}

  @Get()
  async getPosts(@Query('search') search: string) {
    if (search) {
      return this.postsService.searchForPosts(search);
    }
    return this.postsService.getAllPosts();
  }

  // (...)
}
```
Don’t forget to import the SearchModule in the PostsModule.
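A sketch of what that wiring could look like (the exact contents of PostsModule depend on the previous parts of this series, so treat the other imports as assumptions):

```typescript
import { Module } from '@nestjs/common';
import { TypeOrmModule } from '@nestjs/typeorm';
import Post from './post.entity';
import PostsController from './posts.controller';
import PostsService from './posts.service';
import PostsSearchService from './postsSearch.service';
import { SearchModule } from '../search/search.module';

@Module({
  imports: [
    // SearchModule re-exports ElasticsearchModule, so PostsSearchService
    // can inject ElasticsearchService here
    SearchModule,
    TypeOrmModule.forFeature([Post]),
  ],
  controllers: [PostsController],
  providers: [PostsService, PostsSearchService],
})
export class PostsModule {}
```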
Keeping Elasticsearch consistent with our database
Through our API, we can also edit and delete posts. Therefore, we need to put some effort into keeping the Elasticsearch database consistent with our Postgres instance.
Deleting documents
Since we save the id of the post in our Elasticsearch database, we can use it to find and delete the matching document. To do so, we can use the deleteByQuery function.
/src/posts/postsSearch.service.ts
```typescript
import { Injectable } from '@nestjs/common';
import { ElasticsearchService } from '@nestjs/elasticsearch';

@Injectable()
export default class PostsSearchService {
  index = 'posts'

  constructor(
    private readonly elasticsearchService: ElasticsearchService
  ) {}

  // (...)

  async remove(postId: number) {
    return this.elasticsearchService.deleteByQuery({
      index: this.index,
      body: {
        query: {
          match: {
            id: postId,
          }
        }
      }
    })
  }
}
```
Let’s call the above method in PostsService every time we delete a post.
/src/posts/posts.service.ts
```typescript
import { Injectable } from '@nestjs/common';
import Post from './post.entity';
import { InjectRepository } from '@nestjs/typeorm';
import { Repository, In } from 'typeorm';
import PostNotFoundException from './exceptions/postNotFound.exception';
import PostsSearchService from './postsSearch.service';

@Injectable()
export default class PostsService {
  constructor(
    @InjectRepository(Post)
    private postsRepository: Repository<Post>,
    private postsSearchService: PostsSearchService
  ) {}

  // (...)

  async deletePost(id: number) {
    const deleteResponse = await this.postsRepository.delete(id);
    if (!deleteResponse.affected) {
      throw new PostNotFoundException(id);
    }
    await this.postsSearchService.remove(id);
  }
}
```
Modifying documents
The other part of keeping the Elasticsearch database consistent with our main database is updating existing documents. To do that, we can use the updateByQuery function.
Unfortunately, we need to write a script that updates all of the necessary fields. For example, to update the title and the content, we need:
```
ctx._source.title='New title'; ctx._source.content='New content';
```
We can create the above script dynamically.
/src/posts/postsSearch.service.ts
```typescript
import { Injectable } from '@nestjs/common';
import { ElasticsearchService } from '@nestjs/elasticsearch';
import Post from './post.entity';
import PostSearchBody from './types/postSearchBody.interface';

@Injectable()
export default class PostsSearchService {
  index = 'posts'

  constructor(
    private readonly elasticsearchService: ElasticsearchService
  ) {}

  // (...)

  async update(post: Post) {
    const newBody: PostSearchBody = {
      id: post.id,
      title: post.title,
      content: post.content,
      authorId: post.author.id
    }

    const script = Object.entries(newBody).reduce((result, [key, value]) => {
      return `${result} ctx._source.${key}='${value}';`;
    }, '');

    return this.elasticsearchService.updateByQuery({
      index: this.index,
      body: {
        query: {
          match: {
            id: post.id,
          }
        },
        script: {
          source: script
        }
      }
    })
  }
}
```
Now we need to use the above method whenever we modify existing posts.
/src/posts/posts.service.ts
```typescript
import { Injectable } from '@nestjs/common';
import Post from './post.entity';
import UpdatePostDto from './dto/updatePost.dto';
import { InjectRepository } from '@nestjs/typeorm';
import { Repository } from 'typeorm';
import PostNotFoundException from './exceptions/postNotFound.exception';
import PostsSearchService from './postsSearch.service';

@Injectable()
export default class PostsService {
  constructor(
    @InjectRepository(Post)
    private postsRepository: Repository<Post>,
    private postsSearchService: PostsSearchService
  ) {}

  async updatePost(id: number, post: UpdatePostDto) {
    await this.postsRepository.update(id, post);
    const updatedPost = await this.postsRepository.findOne(id, {
      relations: ['author']
    });
    if (updatedPost) {
      await this.postsSearchService.update(updatedPost);
      return updatedPost;
    }
    throw new PostNotFoundException(id);
  }
}
```
The Elasticsearch documents also have ids. An alternative to the above deletes and updates would be to store the Elasticsearch id in our Postgres database and use it when deleting and updating.
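A sketch of that alternative: assuming we stored the Elasticsearch document id (here called elasticsearchId, a name made up for this example) in Postgres when first indexing the post, we could address documents directly instead of querying by the post id:

```typescript
import { Injectable } from '@nestjs/common';
import { ElasticsearchService } from '@nestjs/elasticsearch';
import PostSearchBody from './types/postSearchBody.interface';

// A sketch only: it assumes the Elasticsearch document id was saved
// alongside the post in Postgres when the post was first indexed
@Injectable()
export default class PostsSearchService {
  index = 'posts'

  constructor(
    private readonly elasticsearchService: ElasticsearchService
  ) {}

  async removeById(elasticsearchId: string) {
    // Deletes a single document by its Elasticsearch id
    return this.elasticsearchService.delete({
      index: this.index,
      id: elasticsearchId,
    });
  }

  async updateById(elasticsearchId: string, body: Partial<PostSearchBody>) {
    // Partially updates a single document, no painless script needed
    return this.elasticsearchService.update({
      index: this.index,
      id: elasticsearchId,
      body: { doc: body },
    });
  }
}
```

With direct ids, we also avoid generating a painless script by hand, since a partial update can pass the changed fields as a plain document.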
Summary
Today we’ve learned the very basics of Elasticsearch. When doing so, we’ve added it to our NestJS API. We’ve also created our documents and searched through them. All of that is the tip of the Elasticsearch iceberg. There is a lot more to learn here, so stay tuned!