Hello there, hope you are fine and you are in good shape to learn something new. Lately, I’ve been dealing around with NestJS and RavenDB for one small project of mine and I encountered a situation where many documents had to be removed by a CRON job. It was working fine, but sometimes the number of documents was quite big, so the impact on performance was visible. Let’s go through together and see the better way to do this 😉
I’m going to use this repository as a codebase for the samples provided here, so feel free to clone it. You can also check this article which has the same codebase. It uses NestJS and NodeJS.
How to delete a document in RavenDB
This is a trivial operation in RavenDB. The delete function can receive either a whole document or just the id of the document you plan to delete. It’s important to mind the fact that if you want to provide a whole document, this must be previously loaded into the session, otherwise you’ll get a runtime error. The code to delete a document looks like this:
const session = this.documentStore.openSession();
await session.delete(doc.id); // Or simply doc if the document is loaded into the session previously
await session.saveChanges();
session.dispose();
As you can see, we open a session, we call the delete method and we conclude the call by saving the changes to the database. Lastly, we dispose the session from the memory.
How would we handle the deletion of more documents in one go
Basically, to delete more documents, we just modify the method a little bit to look like this:
public async deleteDocuments(docs: TestDto[]): Promise<void> {
const session = this.documentStore.openSession();
for (const d of docs) {
await session.delete(d.id);
}
await session.saveChanges();
session.dispose();
}
Cool, it looks alright, now it’s time to put it to work! To test it, I created a controller method to generate a given number of documents and I called it with Postman to generate 5000 documents inside the database. The controller method in NestJS looks like this:
@Post('generateMajorDocuments/:nr')
@HttpCode(HttpStatus.CREATED)
async generateMajorDocuments(@Param('nr') nr: number, ) {
for (let i = 0; i < nr; i++) {
await this.persistence.storeDocument({
birth: new Date(),
cucu: `${i}`,
id: null,
major: true,
});
}
}
Time to proceed to deletion, since we have the major documents created. As a sample, we will delete documents that have the major property set to true. The first step is to create a query that will return the documents corresponding to the condition, then call the actual delete method inside a controller:
public async getMajorDocuments(): Promise<TestDto[]> {
const session = this.documentStore.openSession();
const results = await session.query({
collection: "TestDtos",
documentType: TestDto,
})
.whereEquals('major', true)
.all();
session.dispose();
return results;
}
Inside the controller we expose the method to delete major documents:
@Delete('deleteMajorDocuments')
async deleteMajorDocuments() {
const documents = await this.persistence.getMajorDocuments();
await this.persistence.deleteDocuments(documents);
}
Deletion works but …
Our deletion mechanism works with no major hurdles for less than 1000 records. It’s no big deal, we load them into memory and we instruct RavenDB to delete them one by one. But for larger sets of data we need to have the following in mind:
- Server will be overloaded with lots of records into the memory
- RavenDB will work a lot to bring those records and then to execute the one by one deletion
- Performance will suffer drastically on mission critical apps used by lots of clients
You can notice in Raven Studio that the database itself complains about too many records retrieved in one go. I remember older versions of Raven were paginating the results by default and the max page size was 1000. But it seems newer versions no longer do this automatically.
The solution
The solution is simple but can be a bit tricky to see because in RavenDB there’s no such thing like “DELETE FROM” as SQL does. Instead we have database operations in place and one operation is called DeleteByQueryOperation. This is imported from the ravendb npm package itself and it’s used like this in our case:
public async deleteMajorDocuments(): Promise<void> {
await this.documentStore.operations.send(new DeleteByQueryOperation('from TestDtos where major = true'));
}
And our controller efficient method looks like this:
@Delete('deleteMajorDocumentsEfficient')
async deleteMajorDocumentsEfficient() {
await this.persistence.deleteMajorDocuments();
}
Not only the code is more “breathable”, but it’s also efficient:
- The records are not loaded into memory at all, we just specify the conditions for it to be deleted
- The database system takes care itself of the deletion in its own efficient way
- App performance is not impacted in any way
RavenDB Studio also proves the efficient deletion of respective documents:
Conclusion
As always, I end with a conclusion but this time it seems obvious: any bulk deletion should be done by the DeleteByQueryOperation. This is how we can be sure performance and user experience is not impacted. Moreover, RavenDB will be able to work at full capacity and be reliable 😀
Thanks for reading, I hope you found this article useful and interesting. If you have any suggestions don’t hesitate to contact me. If you found my content useful please consider a small donation. Any support is greatly appreciated! Cheers 😉