November 21, 2024

Must know: easy and elegant way to create RavenDB indexes (NestJS + RavenDB tutorial 3)

Hello folks, this article explains everything you need to know about indexes in RavenDB. I’m going to show you how to create an index code-first and afterwards explain some caveats you may encounter.

The advantage (elegance) of my approach is that we get type completion in our index definition, so runtime errors are less likely. Indexes in RavenDB are quite often misunderstood, but they are really important in our daily work with RavenDB, so grasping them helps a lot with design and architecture.

This article is part of a tutorial series, so if you feel uncomfortable with the concepts, I advise you to start with the first article in the series. You can also find the source code in my GitHub repo, with one separate branch for each tutorial.

RavenDB indexes

In RavenDB, an index is a data model derived from stored JSON documents which can be queried, searched and even have its results stored as a separate collection. An index is built in the background, and any data change triggers re-indexing of the affected documents (this happens in an efficient way). Indexes can be queried in the RavenDB Studio using RQL (Raven Query Language), or with a client library like the one we are using in Node.js.

While most indexes are defined by the user, there are times when RavenDB automatically creates an index under the hood when you query documents. You may notice that the query takes longer at first while the index is being created, but then runs smoothly afterwards. An example can be found here:

More than that, there are a couple of ways to write an index:

  • With the help of the LINQ-like syntax
  • If you use the C# client library, directly in C# (on the server side I think it gets converted to LINQ format)
  • By writing JavaScript code directly

While there’s no officially recommended way to write them, I prefer JavaScript as it offers more freedom and versatility. Plus, my app is written in TypeScript/JavaScript, so I use the same lingua franca everywhere 🙂

Before we get to coding, I think it’s important to understand the structure of an index:

  1. At least one Map. A map picks up the documents of a collection and projects each one into an index entry. There can be more than one map, but the output needs to have the same shape for all maps.
  2. Zero or one Reduce. A reduce groups the output of the maps by a key and then aggregates each group. This can be confusing at first, but it’s more or less the equivalent of the GROUP BY clause in SQL queries.
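
To make the Map/Reduce structure a bit more tangible, here is a minimal in-memory sketch in plain TypeScript. It is not the RavenDB API: the names Movie, YearCount, mapMovie and reduceByYear, as well as the sample data, are made up for illustration. The map projects every document into an entry of a fixed shape, and the reduce groups those entries by a key and aggregates each group, much like GROUP BY would.

```typescript
// Simplified in-memory model of a map/reduce index (NOT the RavenDB API).
interface Movie {
  name: string;
  year: number;
}

interface YearCount {
  year: number;
  count: number;
}

// Map: project one document into one index entry of a fixed shape.
function mapMovie(movie: Movie): YearCount {
  return { year: movie.year, count: 1 };
}

// Reduce: group the entries by year, then sum the counts of each group.
function reduceByYear(entries: YearCount[]): YearCount[] {
  const groups = new Map<number, number>();
  for (const entry of entries) {
    groups.set(entry.year, (groups.get(entry.year) || 0) + entry.count);
  }
  return Array.from(groups, ([year, count]) => ({ year, count }));
}

// Illustrative data only.
const sampleMovies: Movie[] = [
  { name: 'First', year: 2022 },
  { name: 'Second', year: 2022 },
  { name: 'Third', year: 2021 },
];

const yearCounts = reduceByYear(sampleMovies.map(mapMovie));
```

Here yearCounts ends up with one entry per year. Notice that the reduce returns entries of the same shape as the map, which is also what a real map-reduce index expects.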

Coding an index

Ok, let’s now code an index in our RavenNest application. I will work in the tutorial3-index branch which has been started from tutorial2-repository. In our app we have a collection called Movies and the entities are defined as such:

export class MovieEntity extends BaseEntity {
  get collectionName(): string {
    return 'Movies';
  }

  public name: string;
  public year: number;
  public tags: string[];
}

And inside the database I have created this data:

[
  {
    "name": "Teambuilding",
    "year": 2022,
    "tags": [
      "comedy",
      "romanian"
    ],
    "id": "29c61fd0-08c6-4cdd-ba94-71848f372dda"
  },
  {
    "name": "Raven’s Hollow",
    "year": 2022,
    "tags": [
      "drama",
      "horror"
    ],
    "id": "e62e8ecf-632e-48ac-aedc-4a654cf6108d"
  },
  {
    "name": "Dune",
    "year": 2021,
    "tags": [
      "epic",
      "science fiction"
    ],
    "id": "9b72eaa9-d6f1-47fc-9bfd-49517e25bfb4"
  },
  {
    "name": "Pup-o, mă!",
    "year": 2018,
    "tags": [
      "comedy",
      "romanian"
    ],
    "id": "f0eaa60f-b4e5-4386-bf2f-731687fce01a"
  }
]

As you can see, it’s a simple collection, so to start I would like to create a simple map index which just returns all the movie properties plus a generic description generated from the other fields. A minimal working index definition looks like this:

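import { AbstractJavaScriptIndexCreationTask } from 'ravendb';

// MovieEntity and MovieDescriptionMap come from the application's own modules.
export class MovieDescriptionIndex extends AbstractJavaScriptIndexCreationTask<
  MovieEntity,
  MovieDescriptionMap
> {
  constructor() {
    super();
    // Map: project every document of the Movies collection into an index entry.
    this.map(new MovieEntity().collectionName, (doc) => {
      // Plain string concatenation is used here on purpose.
      const description =
        doc.name +
        ' from year ' +
        doc.year +
        ' classified as: ' +
        doc.tags.join(', ');

      return {
        movieId: doc['@metadata']['@id'],
        name: doc.name,
        year: doc.year,
        tags: doc.tags,
        description,
      };
    });
  }
}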

MovieDescriptionMap has the same definition as the movie, with an extra string called description. This is the actual type of the result we want to get out of the index. Inside the index, we tell RavenDB where to take the documents from (in our case the Movies collection) and then define a projection for each entity. Inside the projection, we build the description from the movie itself: a string with the name of the movie, the year and the tags joined together, separated by commas.
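
Since MovieDescriptionMap is only described in words, here is a rough sketch of how it might look, together with the projection written as a plain function so it can be tried without a database. The class shape (including movieId, which I assume because the map returns it), the MovieDoc interface and the toIndexEntry function are assumptions for illustration, not the code from the repo.

```typescript
// Hypothetical shape of the index result type; the real class may differ.
class MovieDescriptionMap {
  public movieId = '';
  public name = '';
  public year = 0;
  public tags: string[] = [];
  public description = '';
}

// Stand-in for a stored movie document (illustration only).
interface MovieDoc {
  id: string;
  name: string;
  year: number;
  tags: string[];
}

// Same projection logic as in the index map, applied to a plain object.
function toIndexEntry(doc: MovieDoc): MovieDescriptionMap {
  const description =
    doc.name +
    ' from year ' +
    doc.year +
    ' classified as: ' +
    doc.tags.join(', ');

  return {
    movieId: doc.id,
    name: doc.name,
    year: doc.year,
    tags: doc.tags,
    description,
  };
}

const duneEntry = toIndexEntry({
  id: '9b72eaa9-d6f1-47fc-9bfd-49517e25bfb4',
  name: 'Dune',
  year: 2021,
  tags: ['epic', 'science fiction'],
});
```

For the Dune document from our data set, duneEntry.description comes out as "Dune from year 2021 classified as: epic, science fiction".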

Important: bear in mind that template literals do not work by default in RavenDB. We could have written the description as:

description = `${doc.name} from year ${doc.year} classified as: ${doc.tags.join(', ')}`

But that is not supported by default, and other newer JavaScript language features may not be available either. The idea is that inside a projection you can leverage nice JS calls to define your index, but don’t add anything fancy there, like business logic or API calls 😄.

How it looks in the RavenDB Studio

You can check the index inside RavenDB Studio. Just head over to the Indexes section and query the MovieDescriptionIndex. Your view should look similar to this:

Now a few quirks come up if we take a closer look. Notice that the results are missing the description field which is defined in the index. This happens because Raven associates each result of our index with the document and this is returned automatically instead of the raw index entry. There is an option in the UI to change that:

Now we can see the exact index entry, including the description field defined in our index. We can notice another weird thing: the name field contains only lowercase entries, although the names in our Movies collection contain uppercase letters. This happens for a good reason though: by default RavenDB runs indexed values through an analyzer that lowercases them, so lookups are fast and case-insensitive. If you need the exact value, the field indexing type should be set to Exact. This can be achieved in our index JavaScript definition like so:

this.index('name', 'Exact');

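Note that with Exact the value is indexed as a single, unanalyzed term, so it only matches queries that use exactly the same casing. If you also need to read the original value back from the index entry, the Store option needs to be activated for the field: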

this.store('name', 'Yes');
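
To picture the lowercasing we saw in the Studio, here is a deliberately simplified model in plain TypeScript. It is not RavenDB code: FieldIndexing, toTerm and matches are hypothetical names, and the model only assumes that the default behaviour lowercases the whole value while Exact keeps it untouched.

```typescript
// Simplified model of how a field value becomes an index term (NOT RavenDB code).
type FieldIndexing = 'Default' | 'Exact';

function toTerm(value: string, indexing: FieldIndexing): string {
  // Default: the whole value is lowercased. Exact: the value is kept as-is.
  return indexing === 'Exact' ? value : value.toLowerCase();
}

function matches(
  stored: string,
  queried: string,
  indexing: FieldIndexing,
): boolean {
  // In this model the queried value gets the same treatment as the stored one.
  return toTerm(stored, indexing) === toTerm(queried, indexing);
}

const defaultTerm = toTerm('Dune', 'Default');
const exactTerm = toTerm('Dune', 'Exact');
const looseMatch = matches('Dune', 'dune', 'Default');
const strictMatch = matches('Dune', 'dune', 'Exact');
```

In this model the default term is lowercased, which is what the Studio showed us, while the Exact term keeps the original casing and therefore no longer matches a lowercase search value.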

Query the index programmatically

So far so good, we have an understanding of indexes now! It’s time to query the index programmatically and send the results to the client by creating an API endpoint. Inside the MovieRepo class we can add this method to query our index:

  public async retrieveMoviesWithDescriptions(): Promise<
    MovieDescriptionMap[]
  > {
    const session = this.documentStore.openSession();

    try {
      // Query the index by name and type the results as MovieDescriptionMap.
      return await session
        .query({
          indexName: MovieDescriptionIndex.name,
          documentType: MovieDescriptionMap,
        })
        .all();
    } finally {
      // Release the session even if the query throws.
      session.dispose();
    }
  }

And we just need to add another simple endpoint to return this data in the MovieController class:

  @Get('by/descriptions')
  async getMoviesWithDescriptions() {
    return await this.movieRepo.retrieveMoviesWithDescriptions();
  }

OK, let’s see that in action by calling the endpoint with Postman. We will actually come across a surprise:

Notice that we have the same quirk as in the Studio, with the query actually returning the document instead of its raw entries. Theoretically, it should be fixed by changing the query a bit:

    const query = session.advanced
      .rawQuery<MovieDescriptionMap>(
        `from index '${MovieDescriptionIndex.name}'`,
      )
      .projection('FromIndex');

    const results = await query.all();

But at the time of writing it seems that this does not work at all, most likely because of a bug inside the client library. I tried to overcome this, but the only workaround I found was to add a dumb reduce operation in the index like this:

    this.reduce((res) => {
      return res
        .groupBy((x) => x.movieId)
        .aggregate((g) => {
          return g.values[0];
        });
    });
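
To see why this reduce does not change our data, here is a small in-memory model of it in plain TypeScript. Again, this is not the RavenDB API: IndexEntry and identityReduce are hypothetical names used only for illustration. Since every movieId is unique, each group holds exactly one entry, so taking the first value of the group simply hands back the original entry.

```typescript
// Simplified in-memory model of the "group by movieId, keep first" reduce
// (NOT the RavenDB API).
interface IndexEntry {
  movieId: string;
  name: string;
  description: string;
}

function identityReduce(entries: IndexEntry[]): IndexEntry[] {
  // groupBy: collect the entries under their movieId.
  const groups = new Map<string, IndexEntry[]>();
  for (const entry of entries) {
    const group = groups.get(entry.movieId);
    if (group) {
      group.push(entry);
    } else {
      groups.set(entry.movieId, [entry]);
    }
  }

  // aggregate: keep the first (and here the only) value of each group.
  return Array.from(groups.values(), (values) => values[0]);
}

const mapOutput: IndexEntry[] = [
  {
    movieId: '9b72eaa9-d6f1-47fc-9bfd-49517e25bfb4',
    name: 'Dune',
    description: 'Dune from year 2021 classified as: epic, science fiction',
  },
  {
    movieId: '29c61fd0-08c6-4cdd-ba94-71848f372dda',
    name: 'Teambuilding',
    description: 'Teambuilding from year 2022 classified as: comedy, romanian',
  },
];

const reduced = identityReduce(mapOutput);
```

The reduced array contains the same entries as the map output, description included, which is exactly what we want the query to return.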

This will do the trick for now and will return expected results 😉 The next article will explain more about the Reduce operations, so stay tuned.

Conclusion

As you can observe, developing software is not straightforward at all, and sometimes we need to resort to workarounds. On the bright side, we acquired some knowledge about the key topic of indexes in RavenDB, with its quirks and advantages. I believe the flexibility of indexes opens many doors to developing your app in an elegant, concise and maintainable way. So I hope this introductory article really helps you, and don’t hesitate to contact me if you have any concerns/questions. Wish you happy coding!

afivan
