December 3, 2024

How to create an index which counts documents

In this tutorial I want to show you how to create an index with a map-reduce operation using RavenDB and NestJS (JavaScript/TypeScript). This becomes important once you need more complex operations in your persistence model. In the last tutorial about RavenDB indexes, I explained how to correctly define an index and query it. This time we are taking things further to include a reduce operation as well. This article is part of the series dedicated to the RavenDB + NestJS development stack. Should you be unfamiliar with the concepts, you can start off with the first tutorial, which will help you along the way.

You can also find the source code in my GitHub repo, with a separate branch for each tutorial. For this particular tutorial, you can find the code in this branch.

What is a reduce operation

In my own words, a reduce is the equivalent of SQL’s GROUP BY, written in a more complex way that gains versatility by using, in our case, pure JavaScript functions. Common SQL aggregate functions like COUNT, MAX, MIN, SUM or AVG can be reproduced easily, and much more besides. With a RavenDB reduce, we can shape our data according to our needs in a flexible way.

It’s important to note that while a RavenDB index can have multiple map statements, it can have only one reduce. This is much like SQL, where the maps correspond to a SELECT with possible JOINs, but only one GROUP BY can be present in the query.

You can read more about map-reduce indexes in their documentation.

Let’s create a simple COUNT index

To keep things simple, I’m going to demonstrate how to create an index that counts the number of documents grouped by a field. I’m going to use the same database as in the previous articles, so the goal will be to get the number of movies per year.


  {
      "name": "Teambuilding",
      "year": 2022,
      "tags": [
          "comedy",
          "romanian"
      ],
      "id": "29c61fd0-08c6-4cdd-ba94-71848f372dda"
  },
  {
      "name": "Raven’s Hollow",
      "year": 2022,
      "tags": [
          "drama",
          "horror"
      ],
      "id": "e62e8ecf-632e-48ac-aedc-4a654cf6108d"
  }

Suppose we have movie data like the above. We want to group by year and use the COUNT aggregation to return the number of movies per year, as in the SQL equivalent below:


SELECT year, COUNT(*)
FROM movies
GROUP BY year  
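For comparison, the same grouping can be sketched in plain TypeScript, with no database involved. The `Movie` interface and the `countByYear` helper are illustrative names only, not part of the tutorial's code base.

```typescript
// Illustrative shape of a movie document (only the fields used here).
interface Movie {
  name: string;
  year: number;
}

// In-memory equivalent of: SELECT year, COUNT(*) FROM movies GROUP BY year
function countByYear(movies: Movie[]): Map<number, number> {
  const counts = new Map<number, number>();
  for (const movie of movies) {
    // Start each year at 0 and add 1 per movie seen.
    counts.set(movie.year, (counts.get(movie.year) ?? 0) + 1);
  }
  return counts;
}

const sampleMovies: Movie[] = [
  { name: 'Teambuilding', year: 2022 },
  { name: 'Raven’s Hollow', year: 2022 },
];

const countsByYear = countByYear(sampleMovies);
console.log(countsByYear.get(2022)); // 2
```

The index built in the rest of this tutorial produces the same kind of result, but RavenDB computes and maintains it on the server instead of the application doing it on every request.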

First we need a Map class definition

We define a basic class which will represent the data returned from the index. In this simple case, we only require the year and the number of movies:


export class MovieCountByYearMap {
  public year: number;
  public count: number;
}

Now to the index definition part

In our indexes folder, we are going to create this movie-count-by-year.index.ts file:

The class extends AbstractJavaScriptIndexCreationTask, which takes two generic type arguments: first the entity type, then the map type. In the constructor we define a map which returns an object of type MovieCountByYearMap for each entity.


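import { AbstractJavaScriptIndexCreationTask } from 'ravendb';
// MovieEntity and MovieCountByYearMap are imported from their own files in the project.

export class MovieCountByYearIndex extends AbstractJavaScriptIndexCreationTask<
  MovieEntity,
  MovieCountByYearMap
> {
  constructor() {
    super();

    // Map: emit one entry per movie, carrying the grouping key and a count of 1.
    this.map(new MovieEntity().collectionName, (doc) => {
      return {
        year: doc.year,
        count: 1,
      };
    });
  }
}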

In RavenDB, after the map operation finishes, the results are then passed to the reduce operation which aggregates the final results. Hence in the map class we need to have the property by which we are grouping (in this case the year) and any other properties necessary for the aggregation to happen. In our case, we have the count property which is initialized with 1 for each Movie entity that goes through the map operation.

The aggregation itself may be harder to understand. In short, we pass to groupBy the field (or fields) we want to group by. Then we call aggregate, which builds the required value for each group; in our case it does so by calling JavaScript's reduce on the group's values. The reduce function takes two arguments:

  1. A function callback with the current aggregated value and the current element which is to be aggregated
  2. The initial value of the aggregation
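These two arguments are those of JavaScript's standard `Array.prototype.reduce`. A small standalone sketch, using made-up sample values (the `ratings` array is purely hypothetical), shows how the same pattern yields a count and a maximum:

```typescript
// Hypothetical map outputs belonging to one group (not taken from the tutorial's data).
const values = [
  { year: 2022, count: 1 },
  { year: 2022, count: 1 },
  { year: 2022, count: 1 },
];

// First argument: callback(accumulatedValue, currentElement).
// Second argument: the initial accumulated value (0 here).
const total = values.reduce((count, val) => val.count + count, 0);

// The same pattern covers other aggregations, e.g. a maximum over numbers.
const ratings = [6.1, 7.4, 5.9];
const maxRating = ratings.reduce((max, r) => (r > max ? r : max), -Infinity);

console.log(total, maxRating); // 3 7.4
```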

Let’s see how it looks in our case. This call goes in the index constructor, right after the map:


this.reduce((res) => {
  return res
    .groupBy((x) => x.year)
    .aggregate((g) => ({
      year: g.key,
      count: g.values.reduce((count, val) => val.count + count, 0),
    }));
});
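To see why the callback adds `val.count` instead of simply counting elements, here is a plain TypeScript simulation of the two stages. It is only a sketch of the idea and not RavenDB's actual engine; `mapStage`, `reduceStage` and the `Entry` type are hypothetical names. The point, as I understand RavenDB's map-reduce model, is that the reduce output has the same shape as the map output, so the reduce can be applied again to its own partial results and still give correct totals.

```typescript
// Shape shared by the map output and the reduce output (mirrors MovieCountByYearMap).
interface Entry {
  year: number;
  count: number;
}

// Map stage: one entry per document, each contributing a count of 1.
function mapStage(docs: { year: number }[]): Entry[] {
  return docs.map((doc) => ({ year: doc.year, count: 1 }));
}

// Reduce stage: group by year, then sum the counts inside each group.
function reduceStage(entries: Entry[]): Entry[] {
  const groups = new Map<number, Entry[]>();
  for (const entry of entries) {
    const group = groups.get(entry.year) ?? [];
    group.push(entry);
    groups.set(entry.year, group);
  }
  const result: Entry[] = [];
  groups.forEach((vals, key) => {
    result.push({
      year: key,
      count: vals.reduce((count, val) => val.count + count, 0),
    });
  });
  return result;
}

const docs = [{ year: 2022 }, { year: 2022 }, { year: 2021 }];

// Reduce everything in one pass...
const direct = reduceStage(mapStage(docs));

// ...or reduce two batches separately, then reduce the partial results again.
const partial = [
  ...reduceStage(mapStage(docs.slice(0, 1))),
  ...reduceStage(mapStage(docs.slice(1))),
];
const reReduced = reduceStage(partial);

console.log(direct);    // [ { year: 2022, count: 2 }, { year: 2021, count: 1 } ]
console.log(reReduced); // same totals as above
```

Had the callback used `count + 1`, the second pass would count partial results rather than movies and report the wrong number for 2022.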

🌟 Golden knowledge: the reduce operation is actually a JavaScript function that can get really complex and cause confusion, so I recommend reading more about it here.

Testing it out

OK, now it’s time to put it to the test. Like last time, we are going to create an endpoint to return these results. Before jumping into the creation of the endpoint, we need to make sure the index is created in RavenDB at app startup so please add this to the PersistenceService call:

Then we need to add the code that retrieves data from this index inside our MovieRepo class:


public async retrieveMoviesCountByYear(): Promise<MovieCountByYearMap[]> {
  const session = this.documentStore.openSession();

  try {
    // Query the map-reduce index and project the results from the index.
    return await session.advanced
      .rawQuery<MovieCountByYearMap>(
        `from index '${MovieCountByYearIndex.name}'`,
      )
      .projection('FromIndex')
      .all();
  } finally {
    // Release the session even if the query throws.
    session.dispose();
  }
}

Finally, add the endpoint implementation inside the Movie controller:


@Get('by/yearCount')
async getMoviesCountByYear() {
  return await this.movieRepo.retrieveMoviesCountByYear();
}

The final step is to check the functionality by calling the endpoint in Postman.

Conclusion

Map-reduce operations can be quite confusing at first. There is no reason to be intimidated though, and I think this article clears up some of the doubts. The topic can certainly get trickier, and I already have some scenarios in mind which I will try to explain in future articles. I hope you find this useful, and please let me know in the comment section should you have any questions. Also please subscribe to my newsletter to support the page!


afivan
