December 21, 2024

RavenDB full text search – blazing fast to implement

Full text search is an out-of-the-box feature of RavenDB. It does the job for you, taking care of splitting text into terms, term vectors and so on, so that you can focus on the application logic and not waste time on implementation details.

For the record, RavenDB uses analyzers to extract search terms; if you want to find out more, please have a look at this documentation. The setup used for this tutorial, as well as for the others in the series, can be found in this GitHub repo, on the tutorial8-search branch. There you can also find a RavenDB backup which you can restore in order to follow along with this article.

Setting up an index with the search fields

So let’s dive in! First, we need an index with a custom search field which encapsulates the searchable data for an entity. We used indexes a lot in the previous tutorials of the series and, compared to those, this index may look very primitive:

export class MovieSearchIndex extends AbstractJavaScriptIndexCreationTask<
  MovieEntity,
  MovieSearchMap
> {
  constructor() {
    super();
    this.map(new MovieEntity().collectionName, (doc) => {
      const searchTerms =
        doc.name + ' ' +
        doc.year + ' ' +
        doc.tags.join(' ');

      return {
        movieId: doc['@metadata']['@id'],
        name: doc.name,
        year: doc.year,
        tags: doc.tags,
        searchTerms,
      };
    });

    this.index('searchTerms', 'Search');
    this.store('searchTerms', 'Yes');
  }
}

In the index definition, notice that we build the searchTerms field from the fields we want to make searchable: name, year and, of course, tags. They are simply concatenated and separated by whitespace. The RavenDB search engine then takes care of indexing the result properly. We just need to configure searchTerms with the ‘Search’ indexing option and also store its value inside the index (it might also work without storing it; I didn’t check). Also, don’t forget to add the index to the index creation list in PersistenceService.
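Going back to the searchTerms field for a moment, here is a minimal standalone sketch of what the concatenation produces for a single document. It runs in plain TypeScript, outside of RavenDB. The MovieDoc interface, the buildSearchTerms helper and the sample values are assumptions made for illustration only; in the real index this logic lives inside the map function and runs on the server.

```typescript
// Hypothetical shape of a movie document, for illustration only.
interface MovieDoc {
  name: string;
  year: number;
  tags: string[];
}

// Mirrors the concatenation done inside the index map function:
// name, year and tags joined into one whitespace-separated string.
function buildSearchTerms(doc: MovieDoc): string {
  return doc.name + ' ' + doc.year + ' ' + doc.tags.join(' ');
}

// Sample data (made up for this sketch).
const sample: MovieDoc = { name: 'Heat', year: 1995, tags: ['crime', 'thriller'] };

console.log(buildSearchTerms(sample)); // "Heat 1995 crime thriller"
```

Everything the user should be able to search for ends up in one string, so a single search clause on searchTerms covers name, year and tags at once.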

Once this is in place, we can start testing it directly in the Studio:

As you can observe, we can now search for movies by year, tag or name, and all of these searches are fast and efficient!

Exposing this feature through API

Of course, we are not going to settle for this alone; let’s build a proper way to retrieve the results by creating an API endpoint. First, we need a method in the repository that queries our search index. It looks like this:

public async searchMovies(term: string): Promise<MovieEntity[]> {
  const session = this.documentStore.openSession();

  try {
    const results = await session
      .query({
        indexName: MovieSearchIndex.name,
        documentType: MovieSearchMap,
      })
      // Wrap the term in wildcards so that partial words match as well
      .search('searchTerms', `*${term}*`)
      .all();

    // To remove @metadata which is unnecessary
    return results.map(this.metadataRemove);
  } finally {
    // Release the session even if the query throws
    session.dispose();
  }
}

Then all that is left is to expose an API endpoint which calls this method and returns the results to the user:

@Get('search/movies')
async searchMovies(@Query('term') term: string) {
  // A missing query parameter arrives as undefined, so reject any empty value
  if (!term || term.trim() === '') {
    throw new BadRequestException();
  }
  return await this.movieRepo.searchMovies(term);
}

Good, our final step is to test it with Postman and ensure that the proper results are retrieved from storage:

Yes! It works as expected, so we did a great job today 😉

Conclusion

The search is very fast and works without much intervention from the developer’s side. Of course, it can be tuned a bit, and it will certainly not compare to Google. I see two directions of improvement in our scenario:

  1. Sanitize the term coming in through the query string. This is a necessity because any character can flow in and be disruptive, such as a wildcard, which RavenDB interprets as part of the query. Let me know if you need some advice here, I can provide you with useful info
  2. A complementary feature for full text search is a suggestions mechanism which helps the user in case of typos. RavenDB again has out-of-the-box support for this, and demonstrating it will be the subject of the next tutorial
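For the first point, a minimal sketch of such a sanitizer could look like the one below. It assumes that the characters to neutralize are the ones with special meaning in Lucene-style query syntax; the exact list, the SPECIAL_CHARS constant and the sanitizeSearchTerm name are assumptions for illustration, not something RavenDB provides.

```typescript
// Characters assumed to have special meaning in Lucene-style query syntax.
// Adjust the list to whatever the search engine in use actually interprets.
const SPECIAL_CHARS = /[+\-&|!(){}\[\]^"~*?:\\\/]/g;

// Hypothetical helper: replaces special characters with spaces,
// collapses repeated whitespace and trims the result.
function sanitizeSearchTerm(raw: string): string {
  return raw
    .replace(SPECIAL_CHARS, ' ')
    .replace(/\s+/g, ' ')
    .trim();
}

console.log(sanitizeSearchTerm('  sci-fi* (1999)  ')); // "sci fi 1999"
```

The sanitized value would then be passed to the repository method instead of the raw term. A term made up only of special characters collapses to an empty string, so it makes sense to run the empty-term check in the controller after sanitizing.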

Still, the way it is now, it’s very solid and you can build your app on it knowing that it will scale well and performance will be good. Hence, users will be happy to have such a good search feature.

Thanks for reading, I hope you found this article useful and interesting. If you have any suggestions don’t hesitate to contact me. If you found my content useful please consider a small donation. Any support is greatly appreciated! Cheers  😉

afivan
