Avoid In-Memory Databases for Tests

A controversial GitHub issue came to my attention a couple of weeks ago around ditching the in-memory provider for Entity Framework Core. This seemed like a no-brainer to me - these database providers are far from trivial to maintain, even the in-memory ones. It's something our teams learned nearly a decade ago: trying to swap in an in-memory strategy for unit testing simply doesn't provide the value folks hope for.

It seems rather simple at first - especially in the .NET world and EF Core. EF Core's primary read API is LINQ. LINQ has two flavors - IEnumerable and IQueryable. With IQueryable, an IQueryProvider translates expression trees to...well, whatever makes sense for the underlying store. There's a neat trick, however: IEnumerable has an AsQueryable method, which lets complex expression trees evaluate directly against an in-memory IEnumerable.
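
As a rough sketch of that trick (assuming a simple Blog class with a Url property, plus the usual System.Linq and System.Collections.Generic imports):

var blogs = new List<Blog>
{
    new Blog { Url = "https://blog.example.com/dotnet" },
    new Blog { Url = "https://blog.example.com/java" }
};

// AsQueryable wraps the list in an EnumerableQuery, whose query provider
// compiles and runs the expression tree with LINQ to Objects - no database involved.
IQueryable<Blog> queryable = blogs.AsQueryable();

var dotnetBlogs = queryable
    .Where(b => b.Url.Contains("dotnet"))
    .ToList();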

Thus, in-memory queryables were born. So why not take advantage of this possibility for unit tests? Why not allow us to swap the implementation of some queryable to the in-memory equivalent, and allow us to write unit tests against in-memory stores?

It all seems so simple, but unfortunately, the devil is in the details.

Simple Ain't Easy

LINQ providers aren't easy. They're not merely difficult; they're some of the most complex pieces of code you'll see. Why is that?

A LINQ provider is a compiler, a lexer, and a parser, but doesn't own any of those pieces. It's a transpiler, but instead of the output being text, it's API calls. You have to do the same kinds of operations as an actual compiler: dealing with ASTs, (often) building your own AST, and figuring out how to cover the very wide and complex API surface that is IQueryable.
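
To get a feel for what a provider is up against, here's a small sketch (using only System.Linq.Expressions, with the same assumed Blog class) of what even a trivial predicate looks like by the time it reaches the provider:

Expression<Func<Blog, bool>> predicate = b => b.Url.Contains("dotnet");

// The provider never sees your lambda as executable code - it sees an AST:
// a lambda node wrapping a method-call node (string.Contains) whose target is a
// member-access node (b.Url) on a parameter node (b).
var call = (MethodCallExpression)predicate.Body;
Console.WriteLine(call.Method.Name);                             // Contains
Console.WriteLine(((MemberExpression)call.Object).Member.Name);  // Url

// Every node type, method, and overload you want to support needs an explicit
// translation to the underlying store's query language.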

Any ORM maintainer can tell you - the code in the query provider can dwarf the code in the rest of the codebase. This is unlike other ORMs that provide a specialized API or language, such as NHibernate or the MongoDB C# driver.

LINQ surfaces a great amount of flexibility, but that flexibility makes it quite difficult to translate into SQL or any other database API that's already been specifically designed for that database. We're trying to wrap a fit-for-purpose API with a general-purpose API, one that can run either in-memory or translated against a database.

On top of that, a good deal of the LINQ surface area can't be translated into SQL, and there are Very Important things that don't translate into LINQ. So you have to extend IQueryable to do fun things like:

using (var context = new BloggingContext())
{
    var blogs = context.Blogs
        .Include(blog => blog.Posts)
            .ThenInclude(post => post.Author)
                .ThenInclude(author => author.Photo)
        .ToList();
}

Yikes! We also have async in the mix, so at this point the IQueryable surface isn't remotely the same as plain in-memory LINQ.
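
For example (a hedged sketch): EF Core's async operators need cooperation from the query provider, so a plain in-memory IQueryable doesn't even get off the ground.

// Fine against EF Core's provider, which supports async enumeration:
var fromDb = await context.Blogs.ToListAsync();

// A plain LINQ-to-Objects queryable has no async query provider;
// ToListAsync throws at runtime instead of running the query.
var inMemory = new List<Blog>().AsQueryable();
var boom = await inMemory.ToListAsync();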

But that won't stop us from trying!

Pitfalls of In-Memory

Our teams tried a number of years ago to speed up integration tests by swapping in in-memory providers, but we found a number of problems with this approach that led us to abandon it altogether.

You MUST write a real integration test anyway

First and foremost, an in-memory provider is a pale imitation of the real thing. Even when writing in-memory tests, we still absolutely wrote integration tests against a real database. Those unit tests with in-memory looked exactly like the integration tests, just with a different provider.

Which led us to wonder - if we were writing the tests twice, what was the value of having two tests?

You could write a single test codebase and run it twice - once with in-memory and once with the real thing - but that has other problems.
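
The mechanics of running one suite twice aren't hard - here's a hedged sketch with xUnit, where DbProvider and TestFixture are purely illustrative names, not real types:

public static IEnumerable<object[]> Providers => new[]
{
    new object[] { DbProvider.InMemory },
    new object[] { DbProvider.SqlServer }
};

[Theory]
[MemberData(nameof(Providers))]
public async Task Should_edit_department(DbProvider provider)
{
    // Hypothetical factory and reset method, wiring up whichever provider this run uses.
    using var fixture = TestFixture.Create(provider);
    await fixture.ResetAsync();

    // ...the single test body runs against whichever provider was supplied...
}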

You MUST allow raw access to the underlying data API

ORMs allow you to encapsulate your data access, which is good: it allows us to be more productive by focusing on the business problem at hand. But it's also bad, because it abstracts your data access, leaving developers to assume they don't actually need to understand what's going on behind the scenes.

In our projects, we take a pragmatic approach - use the ORM's API when it works, and drop down to the database API when it becomes painful. ORMs these days make it quite easy, such as EF Core's raw SQL capabilities:

var blogs = context.Blogs
    .FromSqlRaw("SELECT * FROM dbo.Blogs")
    .ToList();

There are numerous limitations, many more than with EF6, which is why we often bring in a tool like Dapper to do complex SQL:

var employeeHierarchy = connection.Query<EmployeeDto>(@"WITH cte_org AS (
    SELECT       
        staff_id, 
        first_name,
        manager_id
        
    FROM       
        sales.staffs
    WHERE manager_id IS NULL
    UNION ALL
    SELECT 
        e.staff_id, 
        e.first_name,
        e.manager_id
    FROM 
        sales.staffs e
        INNER JOIN cte_org o 
            ON o.staff_id = e.manager_id
)
SELECT * FROM cte_org;");

So how do we handle this scenario in our tests? Don't write the unit test (this assumes we're actually writing tests twice)? Somehow exclude it?

What I tend to find is that instead of dropping down to SQL, developers avoid SQL just to satisfy the tool. This is unacceptable.

The APIs don't match

The vanilla in-memory IQueryProvider doesn't match a real database LINQ provider. This means you'll have methods that make no sense in-memory, that are no-ops, or that behave nothing like they do against a database.

The most obvious example is Include, which instructs the LINQ provider to basically do a join to eagerly fetch some child records. This is to avoid multiple round trips. However, this means nothing to in-memory. You can keep it, remove it, add more, remove more, doesn't matter.
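
A sketch of what that looks like once a repository has been swapped to a list-backed queryable (seededBlogs is just an assumed List<Blog>, and Include comes from Microsoft.EntityFrameworkCore):

IQueryable<Blog> blogs = seededBlogs.AsQueryable();

var results = blogs
    // Against the real provider this shapes a JOIN to load Posts eagerly.
    // Against LINQ to Objects it's effectively a pass-through: keep it, drop it,
    // or Include the wrong navigation entirely - the in-memory test can't tell.
    .Include(b => b.Posts)
    .Where(b => b.Url.Contains("dotnet"))
    .ToList();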

It gets worse on the flip side - when LINQ allows things the query provider can't support. Since LINQ can run in-memory, it can execute anything on the client side. But try to run arbitrary code on the server, and it won't work:

// StandardizeUrl is an ordinary C# method. LINQ to Objects happily invokes it,
// but a real provider has no way to translate it to SQL and fails at runtime.
var blogs = context.Blogs
    .Where(blog => StandardizeUrl(blog.Url).Contains("dotnet"))
    .ToList();

Instead, LINQ providers allow a narrow subset of methods, and beyond that, only a limited set of core .NET methods that they know how to translate on the server. Not all obvious methods are supported, and not even all overloads of the ones that are. You don't find out until you actually run the query against the real, enormous LINQ provider.
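
A concrete example of the overload problem (hedged - exact translations vary by provider and EF Core version): the plain string.Contains overload translates, but the one taking a StringComparison generally does not, and only the real provider will tell you.

// Translates to SQL (a LIKE) against relational providers:
var plain = context.Blogs
    .Where(blog => blog.Url.Contains("dotnet"))
    .ToList();

// Runs fine in-memory, but a relational provider has no translation for the
// StringComparison overload and throws at query time:
var withComparison = context.Blogs
    .Where(blog => blog.Url.Contains("dotnet", StringComparison.OrdinalIgnoreCase))
    .ToList();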

Databases Mean Transactions

If I look at a typical integration test we write, we're using both the public and non-public API in a series of transactions to interact with the system under test. Here's a typical example:

[Fact]
public async Task Should_edit_department()
{
    var adminId = await SendAsync(new CreateEdit.Command
    {
        FirstMidName = "George",
        LastName = "Costanza",
        HireDate = DateTime.Today
    });

    var admin2Id = await SendAsync(new CreateEdit.Command
    {
        FirstMidName = "George",
        LastName = "Costanza",
        HireDate = DateTime.Today
    });

    var dept = new Department
    {
        Name = "History",
        InstructorId = adminId,
        Budget = 123m,
        StartDate = DateTime.Today
    };
    await InsertAsync(dept);

    Edit.Command command = null;
    await ExecuteDbContextAsync(async (ctxt, mediator) =>
    {
        var admin2 = await FindAsync<Instructor>(admin2Id);

        command = new Edit.Command
        {
            Id = dept.Id,
            Name = "English",
            Administrator = admin2,
            StartDate = DateTime.Today.AddDays(-1),
            Budget = 456m
        };

        await mediator.Send(command);
    });

    var result = await ExecuteDbContextAsync(db => db.Departments
        .Where(d => d.Id == dept.Id)
        .Include(d => d.Administrator)
        .SingleOrDefaultAsync());

    result.Name.ShouldBe(command.Name);
    result.Administrator.Id.ShouldBe(command.Administrator.Id);
    result.StartDate.ShouldBe(command.StartDate.GetValueOrDefault());
    result.Budget.ShouldBe(command.Budget.GetValueOrDefault());
}

It's long, but it combines both public APIs (sending commands to create items) and non-public APIs (interacting directly with the DbContext to insert rows), executing an individual command for the test, then finally querying to pull an item out.

In integration tests of long ago, we'd put this entire set of operations in a single transaction/unit of work. That's not at all how the application behaves, however, and we'd see many false positives: tests that passed inside the one big transaction but broke once each operation ran distinctly, the way the application actually runs. This is because ORMs use patterns like Unit of Work and Identity Map to determine what to persist and when.
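
The fixture helpers in the test above (SendAsync, ExecuteDbContextAsync, and friends) exist to keep each operation distinct. A rough sketch of the idea - not the actual fixture - where ServiceProvider and SchoolContext stand in for the real composition root and DbContext:

// Hypothetical helper, for illustration only. Every call builds a fresh scope and
// DbContext, just like a real request, so no identity map or change-tracker state
// leaks from one operation into the next.
public static async Task ExecuteDbContextAsync(Func<SchoolContext, Task> action)
{
    using var scope = ServiceProvider.CreateScope();
    var db = scope.ServiceProvider.GetRequiredService<SchoolContext>();

    await action(db);
    await db.SaveChangesAsync();
}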

With in-memory providers, there is no ACID; everything is immediately durable. Each operation is immediately performed, and transactions do nothing! It might seem like a trivial thing - who cares if everything is always immediately durable? The problem, just like with an integration test that wraps everything in a single transaction, is that real-life behavior is much different and more complex, and will break in ways you can't predict. Enough false positives, and you wind up distrusting these unit tests.
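
To make that concrete, a hedged sketch with the EF Core InMemory provider (assuming BloggingContext has the usual DbContextOptions constructor):

var options = new DbContextOptionsBuilder<BloggingContext>()
    .UseInMemoryDatabase("tests")
    // By default the provider refuses transactions outright; the common workaround
    // is to silence the warning, at which point Begin/Commit/Rollback do nothing.
    .ConfigureWarnings(w => w.Ignore(InMemoryEventId.TransactionIgnoredWarning))
    .Options;

using var context = new BloggingContext(options);
using var tx = context.Database.BeginTransaction();

context.Blogs.Add(new Blog { Url = "https://example.com" });
context.SaveChanges();

tx.Rollback(); // no-op: the blog is already durable in the in-memory store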

The database enforces constraints, visibility, and isolation levels that these in-memory attempts can't, and inevitably you'll hit problems.

But it's working for me!

Great! You're one of the lucky few. Your usage is trivial enough to fit within the constraints of an in-memory provider. We've tried this (and other in-memory options, like SQLite's in-memory mode), and it's always failed.

Unfortunately for the EF team, maintaining this provider for public consumption is a cost, and a tradeoff - they're coding that instead of coding something else. The question becomes: is the value of an (always flawed) in-memory provider worth the effort for the team?

For me, no - it's not worth the negative effects on our team.