Building Microservices On .NET Core – Part 5: Marten, An Ideal Repository For Your Domain Aggregates

This is the fifth article in our series about building microservices on .NET Core. In the first article we introduced the series and laid out the plan: the business case and solution architecture. In the second article we described how you can structure the internal architecture of a single microservice using the CQRS pattern and the MediatR library. In the third article we described the importance of service discovery in a microservice-based architecture and presented a practical implementation with Eureka. In the fourth part we showed how you can build API gateways for your microservices with Ocelot.

In this article we are going to step back a little bit and talk about data access and how to persist data effectively.

Source code for the complete solution can be found on our GitHub.

Persistence is a solved issue, isn’t it?

When the first versions of the .NET Framework came out around 2002, we had two main APIs for data access: data sets and data readers. Data sets were in-memory representations of tables in your database; data readers, on the other hand, promised to let you read your data fast, but you had to manually push it into your objects. Many developers discovered the power of reflection, and almost everyone developed their own ORM. Back in those days my team evaluated a few such frameworks, but none of them seemed to be an adequate solution for us, as we were building complex apps for the insurance industry. So we decided to use DataReaders and hand-coded SQL for each insert, update and search. A few years later we built something that today would be called a micro ORM. Even with our simple home-grown tool we were able to eliminate about 70% of data access code.
Then came the era of NHibernate. As a developer with Java experience I was jealous that my Java colleagues had such a great library, and when early versions of NHibernate became available I was eager to try it. I think it was version 2.0 when we started using NHibernate in production. For many years NHibernate was our first choice and helped us in many projects. It is an incredibly flexible and feature-rich library. But Microsoft decided to implement their own solution – Entity Framework. As it was heavily promoted, many .NET shops decided to switch to EF, and NHibernate's popularity and community started to shrink.
Then Microsoft introduced .NET Core and again everything changed. They did not port the existing Entity Framework but instead decided to develop a new version from scratch. As a result, the first versions of .NET Core didn't really have an enterprise-grade solution for data access. We are now close to .NET Core 3, and EF Core is still lacking many features you would expect from a mature ORM. NHibernate finally arrived on .NET Core, but I don't think it will regain its popularity, as the community around it is much smaller. Compared to today's ORMs, NHibernate also feels quite invasive – for example, it forces you to make all your properties virtual so they can be proxied by the ORM.
The arrival of .NET Core and the increasing popularity of microservices totally changed the .NET architectural landscape. You can now develop and deploy on Linux, and using databases other than MS SQL is getting more and more popular among .NET devs.
Microservices also increased the popularity of polyglot persistence. Developers realized that they can use different data stores for different kinds of services. There are document databases, graph databases, event stores and other kinds of database-related solutions.

As you can see, there are plenty of options to choose from, and the one I would like to talk about in this post is using a relational database as a document database, to take the best of both worlds. You can achieve this with a little help from Marten.

What is Marten?

Marten is a client library that allows .NET developers to use PostgreSQL as a document database and as an event store. It was started by Jeremy Miller as a replacement for the RavenDB database sometime around October 2015, but it has grown into much more than that.
If you have ever worked with a document database like MongoDB or RavenDB, you know that it gives you a great developer experience, especially ease of use and development speed, but there are certain issues related to performance and data consistency.
With Marten you can easily use a relational database as a document database, with full ACID compliance and extensive LINQ support.

From my perspective, there is a certain use case that seems ideal for this kind of approach. If you are practising domain-driven design and partition your domain model into small aggregates, you can treat your aggregates as documents. If you take this approach and combine it with a library like Marten, then persisting, loading and finding your aggregates requires almost no code. Thanks to ACID compliance, you can modify and save many aggregates in the same transaction, which would not be possible in many document databases. Using a relational database also simplifies your infrastructure management, as you can still rely on familiar tools for backups and monitoring.
What's more important, your domain model is not restricted by your ORM's limitations.
The idea of saving aggregates as JSON in a relational database was described in Vaughn Vernon's article "The Ideal Domain Driven Design Aggregate Store", which is where the inspiration for the title of this post comes from.

Using Marten

Adding Marten to a project

As usual, we start by adding the Marten dependency to our project using NuGet.

Install-Package Marten
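
If you prefer the .NET CLI, the equivalent command is:

dotnet add package Marten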

The next thing we need to do is add a connection string for our PostgreSQL database to appsettings.json.

{
    "ConnectionStrings": {
        "PgConnection": "User ID=lab_user;Password=*****;Database=lab_netmicro_payments;Host=localhost;Port=5432"
    }
}

We also need a PostgreSQL 9.5+ database server.
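
If you do not have a local PostgreSQL instance at hand, one quick way to get one (assuming Docker is available; the password below is just a placeholder) is to run the official postgres image with settings matching the connection string above:

docker run -d --name payments-postgres -p 5432:5432 -e POSTGRES_USER=lab_user -e POSTGRES_PASSWORD=secret -e POSTGRES_DB=lab_netmicro_payments postgres:9.6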

Setting Marten up

Now we can set up Marten. We will look at example code taken from the PaymentService.
In our solution we decided to separate domain logic from persistence details, and we introduce two interfaces for this purpose.

public interface IPolicyAccountRepository
{
    void Add(PolicyAccount policyAccount);

    Task<PolicyAccount> FindByNumber(string accountNumber);
}

The first interface represents a repository for the PolicyAccount aggregate. Here we use the repository pattern as described in the DDD Blue Book by Eric Evans. Our repository provides an interface over stored data, so that we can use it like a simple collection. Please note that we do not create a generic repository. If we are doing domain-driven design, the repository should be part of our domain language and should only expose operations needed by our domain code.

The second interface represents the unit of work pattern – a service that keeps track of loaded objects and lets us persist changes.

public interface IDataStore : IDisposable
{
    IPolicyAccountRepository PolicyAccounts { get; }

    Task CommitChanges();
}

The data store interface gives us access to our repository, so we can add and retrieve policy account objects, and it allows us to commit changes to persistent storage.

Let's see how we can implement these interfaces using Marten. But before we get to that, we must learn about two fundamental concepts in Marten: DocumentStore and DocumentSession.
The first one represents the configuration of our document store. It holds configuration data such as connection strings, serialization settings, schema customization and mapping information.
DocumentSession represents our unit of work. It is responsible for opening and managing the database connection, executing SQL statements against the database, loading documents, keeping track of loaded documents and, finally, saving changes back to the database. First you create an instance of DocumentStore, then you ask it to create an instance of DocumentSession, and finally you use the DocumentSession to create, load, modify and store documents in the database.

There are three implementations of DocumentSession:

  • Lightweight Session – one that does not track changes,
  • Standard Session – one with an identity map, but no change tracking,
  • Dirty Tracked Session – one with an identity map and dirty tracking.

Dirty tracking is implemented as a comparison between the JSON initially loaded from the database and the JSON generated from your aggregate, so you must be aware of the performance and memory cost. In our code we will use a lightweight document session.
You can learn more about sessions in Marten's official documentation.
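
Here is a minimal sketch of how each session type can be obtained from a document store (the store variable is an IDocumentStore instance, created as shown in the next section):

using (var lightweight = store.LightweightSession())    // no identity map, no dirty tracking
using (var identityMap = store.OpenSession())           // identity map, no dirty tracking
using (var dirtyTracked = store.DirtyTrackedSession())  // identity map + dirty tracking
{
    // load, modify and save documents here
}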

Now that we know the basics, we can create a MartenInstaller class that we will use in the Startup class to initialize and wire up all the required pieces.

public static class MartenInstaller
{
    public static void AddMarten(this IServiceCollection services, string cnnString)
    {
        services.AddSingleton(CreateDocumentStore(cnnString));

        services.AddScoped<Domain.IDataStore, MartenDataStore>();
    }

    private static IDocumentStore CreateDocumentStore(string cn)
    {
        return DocumentStore.For(_ =>
        {
            _.Connection(cn);
            _.DatabaseSchemaName = "payment_service";
            _.Serializer(CustomizeJsonSerializer());
            _.Schema.For<PolicyAccount>().Duplicate(t => t.PolicyNumber, pgType: "varchar(50)", configure: idx => idx.IsUnique = true);
        });
    }

    private static JsonNetSerializer CustomizeJsonSerializer()
    {
        var serializer = new JsonNetSerializer();

        serializer.Customize(_ =>
        {
            _.ContractResolver = new ProtectedSettersContractResolver();
            _.PreserveReferencesHandling = PreserveReferencesHandling.Objects;
        });

        return serializer;
    }
}

The key method here is CreateDocumentStore. It creates an instance of the document store and configures it.

DocumentStore.For(_ =>
{
    _.Connection(cn); (1)
    _.DatabaseSchemaName = "payment_service"; (2)
    _.Serializer(CustomizeJsonSerializer()); (3)
    _.Schema.For<PolicyAccount>().Duplicate(t => t.PolicyNumber, pgType: "varchar(50)", configure: idx => idx.IsUnique = true); (4)
});

Here we:

  1. Provide the connection string to the PostgreSQL database.
  2. Customize the schema name (if we don't do this, tables for storing our documents will be created in the public schema).
  3. Customize the JsonSerializer so that it can serialize protected properties (this is important, as we are trying to design aggregates according to DDD rules and do not want to expose internal state with public setters) and handle circular references between objects.
  4. Add a "duplicate" field for the policy number. Here we tell Marten not only to store the policy number as part of the serialized JSON document, but also to create a separate column and a unique index for faster searches. We do this because we want to find accounts for a given policy number fast.

More details about schema and mapping customization will be provided later in this post.
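
For completeness, here is a sketch of how the installer could be wired up in Startup.ConfigureServices, using the connection string we added to appsettings.json earlier (Configuration is the standard IConfiguration injected into Startup):

public void ConfigureServices(IServiceCollection services)
{
    // register Marten's DocumentStore and our IDataStore implementation
    services.AddMarten(Configuration.GetConnectionString("PgConnection"));

    // ... other registrations (MVC, MediatR, etc.)
}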

Let's have a look at the IDataStore interface implementation:

public class MartenDataStore : IDataStore
{
    private readonly IDocumentSession session;

    public MartenDataStore(IDocumentStore documentStore)
    {
        session = documentStore.LightweightSession();
        PolicyAccounts = new MartenPolicyAccountRepository(session);
    }

    public IPolicyAccountRepository PolicyAccounts { get; }

    public async Task CommitChanges()
    {
        await session.SaveChangesAsync();
    }

    ...
}

In the constructor we open a document session. We close it when the instance of our class is disposed (the IDisposable implementation is omitted here, but you can check the complete code on GitHub). The CommitChanges method uses the SaveChangesAsync method of the DocumentSession class. Here we use Marten's async API, but you can also use the synchronous version if that is your preference.
The implementation of IPolicyAccountRepository is very simple.

public class MartenPolicyAccountRepository : IPolicyAccountRepository
{
    private readonly IDocumentSession documentSession;

    public MartenPolicyAccountRepository(IDocumentSession documentSession)
    {
        this.documentSession = documentSession;
    }

    public void Add(PolicyAccount policyAccount)
    {
        this.documentSession.Insert(policyAccount);
    }

    public async Task<PolicyAccount> FindByNumber(string accountNumber)
    {
        return await this.documentSession
            .Query<PolicyAccount>()
            .FirstOrDefaultAsync(p => p.PolicyNumber == accountNumber);
    }
}

We accept a reference to an open document session in the constructor. The Add method uses the Insert method of DocumentSession to register the document as new in the unit of work. The document will be saved in the database when CommitChanges is called on our DataStore, which in turn calls SaveChanges on the underlying document session.

FindByNumber is more interesting, as it shows that you can use LINQ to construct queries against documents stored in the database. In our case this is a very simple query that looks for a policy account with a given number. We will describe Marten's LINQ capabilities in more detail later in this post.

Customizing schema and mapping

Marten will create a table for each .NET type of aggregate that you want to save to the database. Marten will also generate an "upsert" database function for each table.
By default, tables are created in the public schema and named by concatenating the "mt_doc_" prefix with your class name.
In our case we have the PolicyAccount class, so Marten created the mt_doc_policyaccount table.
There are various customization options that you can use.

You can specify the database schema to be used. In our case we wanted all tables to be created as part of the "payment_service" schema.

var store = DocumentStore.For(_ =>
{
    _.DatabaseSchemaName = "payment_service";
});

You can also specify the schema per table.

_.Storage.MappingFor(typeof(BillingPeriod)).DatabaseSchemaName = "billing";

By default Marten creates a table with columns for the id and the data serialized as JSON, and adds some metadata columns: last modification date, .NET type name, version (used for optimistic concurrency) and a soft-deletion mark.

Marten requires your class to have a property that will be mapped to the primary key. By default Marten will look for a property named id, Id or ID. You can change this by annotating one of the properties of your class with the [Identity] attribute, or by customizing the mapping in the document store initialization code.

var store = DocumentStore.For(_ =>
{
    // MyDocument and MyId are placeholders for your own document type and its id property
    _.Schema.For<MyDocument>().Identity(x => x.MyId);
});
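
Alternatively, a sketch of the attribute-based approach mentioned above (the [Identity] attribute comes from the Marten.Schema namespace; the class and property names are again just placeholders):

public class MyDocument
{
    [Identity]
    public Guid MyId { get; set; }
}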

You can also customize the id generation strategy. For example, you may choose to use CombGuid (sequential guid).

_.Schema.For<PolicyAccount>().IdStrategy(new CombGuidIdGeneration());

If you want to improve query performance, you can instruct Marten to create indexes and duplicate fields.
Your first option is a computed index. In the example below we create indexes on the first and last name of the owner, so searching by these two fields should be faster.

_.Schema.For<PolicyAccount>().Index(x => x.Owner.FirstName);
_.Schema.For<PolicyAccount>().Index(x => x.Owner.LastName);

Note that computed indexes won’t work with DateTime and DateTimeOffset fields.

The second option is to introduce so-called duplicate fields. We use this approach to optimize finding accounts by the corresponding policy number.

_.Schema.For<PolicyAccount>().Duplicate(t => t.PolicyNumber, pgType: "varchar(50)", configure: idx => idx.IsUnique = true);

Here we tell Marten to add an additional column for the policy number, of type varchar(50), with a unique index. This way Marten saves the policy number not only as part of the JSON data but also into a separate, uniquely indexed column, so searching on it should be very fast.

You can enable optimistic concurrency for a given type.

_.Schema.For<PolicyAccount>().UseOptimisticConcurrency(true);

There are many other options, like full-text indexes, foreign keys that let us link two aggregates, and gin/gist indexes.
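
As a rough sketch, a gin index over the whole JSON document and a foreign key to another document type can be configured inside the same DocumentStore.For block (OtherAggregate and the OtherAggregateId property are hypothetical, used only for illustration):

_.Schema.For<PolicyAccount>().GinIndexJsonData();
_.Schema.For<PolicyAccount>().ForeignKey<OtherAggregate>(x => x.OtherAggregateId);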

Saving aggregates

With IDataStore and IPolicyAccountRepository, storing our PolicyAccount aggregate is easy.
Here is example code that creates a new account and saves it in the database.

public async Task Handle(PolicyCreated notification, CancellationToken cancellationToken)
{
    var policy = new PolicyAccount(notification.PolicyNumber, policyAccountNumberGenerator.Generate());

    using (dataStore)
    {
        dataStore.PolicyAccounts.Add(policy);
        await dataStore.CommitChanges();
    }
}

As you can see, no mapping in any form (code or attributes) or configuration is required to save and load your domain objects.

There are only a few requirements that your classes must meet: your class must be serializable to JSON (you can tweak the JSON serializer configuration to make your classes serialize/deserialize properly), and it must expose an identifier field or property. The identifier property will be used as the source of the value for the primary key. The field/property name must be id, Id or ID, but you can override this rule using the [Identity] attribute or by customizing the mapping in code. The following data types can be used as identifiers: string, Guid, CombGuid (sequential guid), int, long or a custom class. For int and long Marten uses a HiLo generator. Marten ensures the identifier is set during IDocumentSession.Store.

Marten also supports optimistic concurrency. This feature can be activated on a per-document-type basis. In order to enable optimistic concurrency for your class, you can add the [UseOptimisticConcurrency] attribute to it or customize the schema configuration.
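
The attribute-based variant is a one-liner on the class itself (the attribute lives in the Marten.Schema namespace):

[UseOptimisticConcurrency]
public class PolicyAccount
{
    // ...
}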

Loading aggregates

Loading aggregates is also trivial.

public async Task<GetAccountBalanceQueryResult> Handle(GetAccountBalanceQuery request, CancellationToken cancellationToken)
{
    var policyAccount = await dataStore.PolicyAccounts.FindByNumber(request.PolicyNumber);

    if (policyAccount == null)
    {
        throw new PolicyAccountNotFound(request.PolicyNumber);
    }

    return new GetAccountBalanceQueryResult
    {
        Balance = new PolicyAccountBalanceDto
        {
            PolicyNumber = policyAccount.PolicyNumber,
            PolicyAccountNumber = policyAccount.PolicyAccountNumber,
            Balance = policyAccount.BalanceAt(DateTimeOffset.Now)
        }
    };
}

Querying

Marten provides extensive LINQ support. Here is a simple example query that looks for the policy account of a policy with a given number:

session.Query<PolicyAccount>().Where(p => p.PolicyNumber == "12121212")

An example query that combines multiple criteria with logical operators:

session.Query<PolicyAccount>().Where(p => p.PolicyNumber == "12121212" && p.PolicyAccountNumber != "32323232323")

An example of searching through child collections of your aggregate, looking for accounts that have an entry with an amount equal to 200:

var accounts = session.Query<PolicyAccount>()
    .Where(p => p.Entries.Any(_ => _.Amount == 200.0M)).ToList();

Please note that support for searching through child collections is limited to checking equality of members of the child collection (however, this may change in future releases of Marten).

You can also search deep within your object hierarchy. For example, if we had account owner data stored on the policy account, we could search for the policy accounts of a given person like this:

var accounts = session.Query<PolicyAccount>()
    .Where(p => p.Owner.Name.LastName == "Jones" && p.Owner.Name.FirstName == "Tim").ToList();

You can search string fields using StartsWith, EndsWith and Contains.

session.Query<PolicyAccount>().Where(p => p.PolicyNumber.EndsWith("009898"))

You can calculate Count, Min, Max, Average and Sum on your aggregate properties.

session.Query<PolicyAccount>().Max(p => p.PolicyAccountNumber)

You can order results and use Take/Skip for paging.

session.Query<PolicyAccount>().OrderBy(p => p.PolicyAccountNumber).Skip(10).Take(10)

There is also a handy shortcut method ToPagedList that combines skip and take.
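
A sketch of how paging could look with ToPagedList (the extension method comes from the Marten.Pagination namespace):

var page = session.Query<PolicyAccount>()
    .OrderBy(p => p.PolicyAccountNumber)
    .ToPagedList(2, 10); // page number 2, 10 items per page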

If you have trouble figuring out why your query is not doing what you expected, Marten offers the ability to preview the SQL generated for a LINQ query.

var query = session.Query<PolicyAccount>()
    .Where(p => p.PolicyNumber == "1223");

var cmd = query.ToCommand(FetchType.FetchMany);
var sql = cmd.CommandText;

The code snippet above converts the LINQ query into an ADO.NET command, so you can inspect the query text and parameter values.

A full list of supported operators can be found here.

Apart from LINQ you can use SQL to query for documents.

var account =
    session.Query<PolicyAccount>("select data from payment_service.mt_doc_policyaccount where data ->> 'PolicyAccountNumber' = '1221212'")
           .Single();

You can also opt to retrieve raw JSON from the database.

var json = session.Json.FindById<PolicyAccount>(id);

Compiled queries

There is also an advanced feature called compiled queries. LINQ is very cool and useful for constructing queries, but it comes with a certain performance and memory usage overhead.
If you have a query that is complex and often executed, it is a good candidate for a compiled query. With compiled queries you avoid the cost of parsing the LINQ expression tree on each query execution.

Compiled queries are classes that implement the ICompiledQuery<TDoc, TResult> interface.
Here is an example query class that searches for a policy account with a given number.

public class FindAccountByNumberQuery : ICompiledQuery<PolicyAccount, PolicyAccount>
{
    public string AccountNumber { get; set; }

    public Expression<Func<IQueryable<PolicyAccount>, PolicyAccount>> QueryIs()
    {
        return q => q.FirstOrDefault(p => p.PolicyAccountNumber == AccountNumber);
    }
}

The key method here is QueryIs. It returns an expression that defines the query.

This class can be used like this:

var account = session.Query(new FindAccountByNumberQuery {AccountNumber = "11121212"});

You can read more about compiled queries here.

Patching data

Marten's patching API can be used to update existing documents in your database. For some scenarios this can be more efficient than loading the whole document into memory, deserializing it, changing it, serializing it again and saving it back to the database.
The patching API can also be very useful for fixing errors in data and for handling changes in your class structure.
Our design won't stay fixed forever. Over time we will add new properties to our classes, change a simple reference into a collection or the other way around. Some properties may be extracted and refactored into a new class, some properties may be dropped.
When working with tables in a relational database, we have a set of well-known SQL DDL commands like ALTER TABLE ADD/DROP COLUMN.
When working with JSON documents, we have to deal with all these changes somehow, so that existing documents can still be loaded and queried after the corresponding classes change.

Let's try to modify our PolicyAccount class and migrate the existing data in the database so it stays consistent.
We begin with PolicyAccount having two properties that represent the account owner's first and last name.

public class PolicyAccount
{
    public Guid Id { get; protected set; }
    public string PolicyAccountNumber { get; protected set; }
    public string PolicyNumber { get; protected set; }
    public string OwnerFirstName { get; protected set; }
    public string OwnerName { get; protected set; }
    public ICollection Entries { get; protected set; }
    …

In the database our data looks like this:

{
    "Id": "51d43842-896d-4d92-b1b9-b4c6512d3cf7", 
    "$id": "2", 
    "Entries": [], 
    "OwnerName": "Jones", 
    "PolicyNumber": "POLICY_1", 
    "OwnerFirstName": "Tim", 
    "PolicyAccountNumber": "231232132131"
}

As we can see, OwnerName is not the best name and we'd like to rename it to OwnerLastName.
On the C# side this is extremely easy, as most IDEs provide rename refactoring out of the box. Let's do it and then use the Patch API to fix the data in the database.

public void RenameProperty()
{
    using (var session = SessionProvider.OpenSession())
    {
        session
            .Patch<PolicyAccount>(x => x.OwnerLastName == null)
            .Rename("OwnerName", x => x.OwnerLastName);

        session.SaveChanges();
    }
}

If you run this method, the data in the database now looks like this:

{
    "Id": "51d43842-896d-4d92-b1b9-b4c6512d3cf7", 
    "$id": "2", 
    "Entries": [], 
    "PolicyNumber": "POLICY_1", 
    "OwnerLastName": "Jones", 
    "OwnerFirstName": "Tim", 
    "PolicyAccountNumber": "231232132131"
}

Let’s try something more complex. We decided to extract OwnerFirstName and OwnerLastName into a class. Now our C# code looks like this:

public class PolicyAccount
{
    public Guid Id { get; protected set; }
    public string PolicyAccountNumber { get; protected set; }
    public string PolicyNumber { get; protected set; }
    public string OwnerFirstName { get; protected set; }
    public string OwnerLastName { get; protected set; }
    public Owner Owner { get; protected set; }
    public ICollection Entries { get; protected set; }
}

We added a new Owner class with FirstName and LastName properties. Now we will use the Patch API to fix the data in the database.

public void AddANewProperty()
{
    using (var session = SessionProvider.OpenSession())
    {
        session
            .Patch<PolicyAccount>(x => x.Owner.LastName == null)
            .Duplicate(x => x.OwnerLastName, w => w.Owner.LastName);

        session
            .Patch<PolicyAccount>(x => x.Owner.FirstName == null)
            .Duplicate(x => x.OwnerFirstName, w => w.Owner.FirstName);

        session.SaveChanges();
    }
}

And the data in our database:

{
    "Id": "51d43842-896d-4d92-b1b9-b4c6512d3cf7", 
    "$id": "2", 
    "Owner": {
        "LastName": "Jones", 
        "FirstName": "Tim"
    }, 
    "Entries": [], 
    "PolicyNumber": "POLICY_1", 
    "OwnerLastName": "Jones", 
    "OwnerFirstName": "Tim", 
    "PolicyAccountNumber": "231232132131"
}

Now it is time to clean up. We have to remove the unused OwnerFirstName and OwnerLastName properties from the C# code and from the data in the database.

public void RemoveProperty()
{
    using (var session = SessionProvider.OpenSession())
    {
        session
            .Patch<PolicyAccount>(x => x.Owner != null)
            .Delete("OwnerLastName");

        session
            .Patch<PolicyAccount>(x => x.Owner != null)
            .Delete("OwnerFirstName");

        session.SaveChanges();
    }
}

The data in the database now looks like this. OwnerFirstName and OwnerLastName are gone.

{
    "Id": "51d43842-896d-4d92-b1b9-b4c6512d3cf7", 
    "$id": "2", 
    "Owner": {
        "LastName": "Jones", 
        "FirstName": "Tim"
     }, 
    "Entries": [], 
    "PolicyNumber": "POLICY_1", 
    "PolicyAccountNumber": "231232132131"
}

The Patch API offers more operations out of the box. You can read more about it here.
The patching API requires that you install the PLV8 engine for PostgreSQL.
Apart from Marten's patching API, you can always use the full power of PostgreSQL, which gives you a set of functions for working with the JSON type, combined with the ability to use JavaScript as the language for database functions/procedures, provided by the PLV8 engine. In fact, what the Patch API generates are functions written in JavaScript and executed in the database using the PLV8 engine.
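
Assuming the PLV8 binaries are installed on the database server, the extension can then be enabled per database with a single SQL command:

CREATE EXTENSION IF NOT EXISTS plv8;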

Marten Pros & Cons

Pros

  • Best of both worlds: the ACID guarantees and SQL support of relational databases with the ease of use and development speed of document databases.
  • ACID support allows you to save many documents from many different tables in one transaction, which is not supported by most document databases.
  • Working with documents allows you to save and load your objects without having to define mappings between your object model and database model, which is required when using a relational database directly. This results in faster development, especially in the early stages when you do not have to worry about schema changes.
  • Extensive support for LINQ queries gives EF and NHibernate users a familiar experience.
  • Ability to use as document store and as an event store.
  • Ability to quickly set up / tear down data for your unit/integration tests.
  • Bulk operations support.
  • API for patching existing documents.
  • Ability to use SQL when LINQ query is not possible or not performant.
  • Easily use a real database in your integration tests. Setting up a database, filling it with initial data and then cleaning it up is very simple and fast (see the sketch after this list).
  • Multitenancy support.
  • Compiled and batch queries support.
  • DocumentSession's ability to participate in a TransactionScope-managed transaction.
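
As an example of the test setup/teardown point above, a sketch of a clean-up step using Marten's document cleaner could look like this (store is the IDocumentStore used by the tests):

public void ResetDatabase(IDocumentStore store)
{
    // removes all documents while keeping the generated schema objects in place
    store.Advanced.Clean.DeleteAllDocuments();
}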

Cons

  • Works only with PostgreSQL.
  • Data migrations with Patch API require more work and learning new things.
  • Limited support for searching within child collections.
  • Not really good fit for reporting and ad-hoc querying.

Summary

There are multiple options to choose from when designing a data access strategy for your microservice. There are options other than Entity Framework or hand-crafted SQL.

Marten is a mature library with many useful features and good LINQ support. If you are targeting a PostgreSQL database and using a domain-driven design approach to divide your domain model into small aggregates, Marten is worth a try.
It can also be a very useful tool in the design and exploration phase, or when building prototypes, allowing you to quickly evolve your domain model while still being able to persist and query your data.

You can check out complete solution source code at: https://github.com/asc-lab/dotnetcore-microservices-poc.