Making history explicit

Introduction

In one product of our solution we needed to record the full history of some entities. What does this mean? It means we create a history entry whenever the state of the corresponding entity changes.

There are basically two ways to do this: generate the history records implicitly or create them explicitly. To go the implicit route we could have used an interceptor mechanism to automatically and transparently create a history record whenever the entity changes and the change is persisted to the database.

Since we chose to go the explicit route, creating a history for an entity became a domain concept.

The domain

To show the concept I chose the sample of a Zoo whose management wants to keep track of the cages used throughout the facility as well as of the animals hosted in the cages. Let’s start with the simplified model of a cage and its cage history. In this model the cage entity represents the status quo, that is, the current state of the cage. Each cage has a collection of cage history entities which represent the full history of the cage from the moment it first appeared in the Zoo until the present day. Each cage history entity represents a snapshot of the cage’s state at a specific time in the past.

The current state (i.e. the cage) is important whenever the Zoo management wants to make an inventory (or census) of all cages and animals. On the other hand the history records (i.e. cage history entities) are important whenever the management wants to do retrospective reporting and/or billing.

[Image: diagram of the Cage and CageHistory model]

As we can see in the above image, a cage is of a certain cage type and is located at a specific location. The cage has many other properties which I do not show here to keep the sample as simple as possible. Now a cage is installed at some location, but this location may change over time (imagine a fish tank that is moved from floor A of building 1 to floor B). In such a situation a history record has to be generated.

To simplify things a little bit and to remain as DRY as possible Cage and CageHistory both derive from VersionedEntity<T>. VersionedEntity<T> in turn inherits from Entity<T>. Each entity in our domain inherits directly or indirectly from Entity<T> which provides the unique ID for the entity and implements equality based on this ID. If you want to know more about the implementation details of such a base class then please refer to this post.

The VersionedEntity<T> simply provides a Version (int) property to the corresponding entity. We have chosen to represent the flow through time of the entities needing to be historized with this version property.
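The two base classes are not shown in this post. A minimal sketch of how they might look is given below. It assumes that the ID is a Guid created in the constructor and that Version has a protected setter; both details, as well as the SampleEntity class and its Touch method, are assumptions made for illustration and not our actual implementation.

```csharp
using System;

// Hypothetical base class: provides the unique ID and ID-based equality.
public abstract class Entity<T> where T : Entity<T>
{
    protected Entity()
    {
        // Assumption: the ID is generated on the client when the entity is created.
        Id = Guid.NewGuid();
    }

    public virtual Guid Id { get; protected set; }

    public override bool Equals(object obj)
    {
        var other = obj as Entity<T>;
        return other != null && Id.Equals(other.Id);
    }

    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }
}

// Hypothetical base class: adds the version counter to an entity.
public abstract class VersionedEntity<T> : Entity<T> where T : VersionedEntity<T>
{
    public virtual int Version { get; protected set; }
}

// Illustration only: a trivial entity that can bump its own version.
public class SampleEntity : VersionedEntity<SampleEntity>
{
    public void Touch() { Version++; }
}

public static class BaseClassDemo
{
    public static void Main()
    {
        var a = new SampleEntity();
        var b = new SampleEntity();
        a.Touch();

        Console.WriteLine(a.Equals(a)); // True: same ID
        Console.WriteLine(a.Equals(b)); // False: different IDs
        Console.WriteLine(a.Version);   // 1
    }
}
```

Because the setter of Version is protected in this sketch, only the entity itself can advance its version, which fits the idea of changing state through methods rather than through public setters.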

Implementation of the model


public class Cage : VersionedEntity<Cage>
{
    private int lastVersion;
    private CageHistory currentHistory;
 
    public Cage(CageType cageType, Location location)
    {
        lastVersion = int.MinValue;
        CageType = cageType;
        Location = location;
        cageHistories = new List<CageHistory>();
 
        CreateOrUpdateHistoryRecord();
    }
 
    protected Cage() { } // default constructor only needed to satisfy NHibernate
 
    public virtual CageType CageType { get; private set; }
    public virtual Location Location { get; private set; }
 
    private IList<CageHistory> cageHistories;
    public virtual IEnumerable<CageHistory> CageHistories { get { return cageHistories; } }
 
    public virtual void ChangeLocation(Location newLocation)
    {
        Location = newLocation;
        CreateOrUpdateHistoryRecord();
    }
 
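    private void CreateOrUpdateHistoryRecord()
    {
        // lastVersion differs from Version only for the first change in a unit of work:
        // it is int.MinValue for a new cage and 0 (the field default) for a cage
        // loaded by NHibernate, whose Version is at least 1.
        if (lastVersion != Version)
        {
            Version++;
            currentHistory = new CageHistory();
            cageHistories.Add(currentHistory);
            lastVersion = Version;
        }
        // Further changes in the same unit of work only refresh the current snapshot.
        currentHistory.CreateFromCage(this);
    }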
}

We only want to deal with cage entities that are always in a valid state (but for simplicity I do not show the validation code). A cage object can be instantiated in exactly two ways: either through the constructor (if it’s a new cage), which expects a cage type and a location as arguments, or by NHibernate, which populates it with data from the database (if it already exists). We assume that the data in the database is valid.

Creating or updating a history record

The method CreateOrUpdateHistoryRecord is responsible for either creating a new history record or updating an existing one. Note that we create exactly one history record per unit of work if the cage entity is changed, independent of how many times the cage is changed.

In our sample code we have exactly two places where the state of a cage changes, namely during creation (constructor) and when the location changes. Each time we call the CreateOrUpdateHistoryRecord method.
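To illustrate the "one history record per unit of work" behavior without a database, here is a stripped-down sketch of the same logic in plain C#. The classes SketchCage and CageSnapshot, the use of strings for locations, and the SimulateReload method (a stand-in for flushing and loading the cage again through NHibernate) are hypothetical simplifications and not part of our model.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for CageHistory.
public class CageSnapshot
{
    public int Version;
    public string Location;
}

// Hypothetical, simplified stand-in for Cage (no NHibernate involved).
public class SketchCage
{
    private int lastVersion = int.MinValue;
    private CageSnapshot current;
    private readonly List<CageSnapshot> histories = new List<CageSnapshot>();

    public SketchCage(string location)
    {
        Location = location;
        CreateOrUpdateHistoryRecord();
    }

    public int Version { get; private set; }
    public string Location { get; private set; }
    public IList<CageSnapshot> Histories { get { return histories; } }

    public void ChangeLocation(string newLocation)
    {
        Location = newLocation;
        CreateOrUpdateHistoryRecord();
    }

    // Assumption: mimics a cage loaded in a new unit of work, where the
    // unmapped field lastVersion has its default value 0.
    public void SimulateReload()
    {
        lastVersion = 0;
        current = null;
    }

    private void CreateOrUpdateHistoryRecord()
    {
        if (lastVersion != Version)
        {
            // First change in this unit of work: start a new snapshot.
            Version++;
            current = new CageSnapshot();
            histories.Add(current);
            lastVersion = Version;
        }
        // Every change refreshes the snapshot of the current unit of work.
        current.Version = Version;
        current.Location = Location;
    }
}

public static class HistoryDemo
{
    public static void Main()
    {
        var cage = new SketchCage("Floor A");
        cage.ChangeLocation("Floor B");            // same unit of work
        Console.WriteLine(cage.Histories.Count);   // 1

        cage.SimulateReload();                     // next unit of work
        cage.ChangeLocation("Floor C");
        cage.ChangeLocation("Floor D");
        Console.WriteLine(cage.Histories.Count);   // 2
        Console.WriteLine(cage.Histories[1].Location); // Floor D
    }
}
```

The snapshot of a unit of work always reflects the last change made in it, so intermediate states within one unit of work are deliberately not recorded.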

Why are all properties read-only?

In our applications we chose not to use property setters in the domain. Assigning values to properties of an entity is in many cases not expressive enough: the assignment of a specific value to a given property does not reveal the intention behind it. The more complex a domain gets, the more important this becomes. Thus we have methods to change the state of an entity. In our simple sample this might be overkill, specifically since the method ChangeLocation only changes one property. But often we encounter situations where we have to change several interrelated properties in one atomic operation.
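As a hedged illustration of such an atomic operation, consider the fish tank mentioned earlier. The FishTank class, its Building and Floor properties, and the Relocate method are invented for this example only; the point is that both values are validated first and then changed together, so a caller can never leave the object half updated.

```csharp
using System;

// Hypothetical entity, for illustration only.
public class FishTank
{
    public FishTank(string building, string floor)
    {
        Relocate(building, floor);
    }

    public string Building { get; private set; }
    public string Floor { get; private set; }

    // Intention-revealing method: building and floor always change together.
    public void Relocate(string building, string floor)
    {
        // Validate everything before touching any state.
        if (string.IsNullOrEmpty(building))
            throw new ArgumentException("A building is required.", "building");
        if (string.IsNullOrEmpty(floor))
            throw new ArgumentException("A floor is required.", "floor");

        Building = building;
        Floor = floor;
    }
}

public static class RelocateDemo
{
    public static void Main()
    {
        var tank = new FishTank("Building 1", "Floor A");
        tank.Relocate("Building 1", "Floor B");
        Console.WriteLine(tank.Building + ", " + tank.Floor); // Building 1, Floor B
    }
}
```

With public setters the same change would take two separate assignments, and nothing would stop a caller from performing only one of them.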

Let’s now have a look at the CageHistory class:

public class CageHistory : VersionedEntity<CageHistory>
{
    public virtual Cage Cage { get; private set; }
    public virtual CageType CageType { get; private set; }
    public virtual Location Location { get; private set; }
 
    public virtual void CreateFromCage(Cage cage)
    {
        Cage = cage;
        Version = cage.Version;
        CageType = cage.CageType;
        Location = cage.Location;
    }
}

We have made the CageHistory responsible for populating itself as a snapshot of the current state of the cage. The method CreateFromCage does this and can be considered a kind of factory method.

Other than that the CageHistory entity is a mirror of the Cage entity regarding its properties. It also contains a reference back to the Cage itself.

Mapping of the model

To map our entities to the underlying database we chose Fluent NHibernate. To be more precise, we do not use the auto mapping functionality of Fluent NHibernate but rather use fluent mappings.

public class CageMap : ClassMap<Cage>
{
    public CageMap()
    {
        Id(x => x.Id).GeneratedBy.Assigned();
        Map(x => x.Version).Not.Nullable();
        References(x => x.CageType).Not.Nullable();
        References(x => x.Location).Not.Nullable();
        HasMany(x => x.CageHistories)
            .Cascade.AllDeleteOrphan()
            .Inverse()
            .Access.CamelCaseField();
    }
}

In this sample we use client-side generated IDs (Guids), which gives us a lot of flexibility; hence the mapping GeneratedBy.Assigned(). Everything else should be straightforward.
 
public class CageHistoryMap : ClassMap<CageHistory>
{
    public CageHistoryMap()
    {
        Id(x => x.Id).GeneratedBy.Assigned();
        Map(x => x.Version).Not.Nullable();
        References(x => x.Cage).Not.Nullable();
        References(x => x.CageType).Not.Nullable();
        References(x => x.Location).Not.Nullable();
    }
}

Again, the above mapping should be straightforward.

Testing

In this post I do not want to show the basic tests, e.g. mapping tests. These kinds of tests are well described in the Fluent NHibernate wiki. What I want to show is how we can test whether the creation of history records is working as expected. To do the tests we use SQLite in InMemory mode as our database.

For all database-related tests we can use a fixture base class like this:

public abstract class database_fixture_base
{
    private ISessionFactory sessionFactory;
    protected ISession session;
    private Configuration configuration;
 
    [TestFixtureSetUp]
    public void TestFixtureSetup()
    {
        sessionFactory = Fluently.Configure()
            .Database(SQLiteConfiguration.Standard.ShowSql().InMemory())
            .Mappings(m => m.FluentMappings.AddFromAssemblyOf<Cage>())
            .ExposeConfiguration(c => configuration = c)
            .BuildSessionFactory();
    }
 
    [SetUp]
    public void Setup()
    {
        session = sessionFactory.OpenSession();
        new SchemaExport(configuration).Execute(false, true, false, session.Connection, null);
    }
 
    [TearDown]
    public void TearDown()
    {
        session.Dispose();
    }
}

Note how easy and expressive the configuration of NHibernate becomes when using Fluent NHibernate. Before each test we create a new session and use this session to create the database schema by using the NHibernate SchemaExport class. After each test we dispose the session. With SQLite in InMemory mode the database exists only as long as its connection is open, so disposing the session also discards the schema and all data. This is exactly what we need to avoid any side effects from test to test.

[TestFixture]
public class cage_mapping_specs : database_fixture_base
{
    [Test]
    public void creating_a_cage_creates_a_cage_history()
    {
        var cageType = new CageType { Name = "Small lion cage" };
        var location = new Location { Name = "House of Africa" };
        var cage = new Cage(cageType, location);
 
        session.Save(cageType);
        session.Save(location);
        session.Save(cage);
        session.Flush();
 
        session.Clear();
 
        var fromDb = session.Get<Cage>(cage.Id);
 
        Assert.That(fromDb.CageHistories.Count(), Is.EqualTo(1));
    }
 
    [Test]
    public void changing_location_of_a_cage_creates_a_cage_history()
    {
        var cageType = new CageType { Name = "Small lion cage" };
        var location = new Location { Name = "House of Africa" };
        var location2 = new Location { Name = "House of Madagascar" };
        var cage = new Cage(cageType, location);
 
        session.Save(cageType);
        session.Save(location);
        session.Save(location2);
        session.Save(cage);
        session.Flush();
 
        session.Clear();
 
        var fromDb = session.Get<Cage>(cage.Id);
        fromDb.ChangeLocation(location2);
        session.Flush();
        session.Clear();
 
        var fromDb2 = session.Get<Cage>(cage.Id);
        Assert.That(fromDb2.CageHistories.Count(), Is.EqualTo(2));
        Assert.That(fromDb2.Version, Is.EqualTo(2));
    }
}

The two tests above should be pretty much self-explanatory.

Summary

In this post I have shown how the history of an entity can be made explicit by making it a domain concept. This approach is much more discoverable than an implicit approach where history is generated automagically, e.g. by using an interceptor mechanism at the database or ORM framework level.

In a following post I will introduce animals to populate the cages and discuss the implications for the model and the mapping of the model.


About Gabriel Schenker

Gabriel N. Schenker started his career as a physicist. Following his passion and interest in stars and the universe he chose to write his Ph.D. thesis in astrophysics. Soon after this he dedicated all his time to his second passion, writing and architecting software. Gabriel has since been working for over 12 years as an independent consultant, trainer, and mentor mainly on the .NET platform. He is currently working as chief software architect in a mid-size US company based in Austin TX providing software and services to the pharmaceutical industry as well as to many well-known hospitals and universities throughout the US and in many other countries around the world. Gabriel is passionate about software development and tries to make the life of developers easier by providing guidelines and frameworks to reduce friction in the software development process. Gabriel is married and father of four children and during his spare time likes hiking in the mountains, cooking and reading.
This entry was posted in Fluent NHibernate, How To, NHibernate.
  • JeroenH

    I’ve done something very similar in the project I’m currently working on.

    This seemed like a good idea, but I’m not sure I would model it again like this. In the future, I would probably opt for a simple model for the current situation, and a daily export to a separate table (probably even a separate database). Of course, if you really need the most granular history, this wouldn’t work but in most circumstances it should suffice…

  • http://craniometrics.blogspot.com J Healy

    Looks like a reasonable approach, but my response to such history requirements these days would likely have me leaning more towards a CQRS solution.

  • http://jonathan-oliver.blogspot.com Jonathan Oliver

    +1 for CQRS–tracking history and state changes is where it really shines.

    Also, how can you 100% guarantee that your audit trail is correct? What happens if someone goes in and edits the table manually (to fix some data that was incorrectly computed because of a bug)?

  • http://www.lostechies.com/members/gnschenker/default.aspx Gabriel N. Schenker

    @J Healy, @Jonathan: we are actually using CQRS in our solution. But you guys are probably talking about event sourcing, aren’t you?
    @Jonathan: you can never avoid that a “malicious” DBA manually alters your data but you can at least try to make it as unlikely as possible by declaring in the contract/license that manually changing data is not allowed and violates the contract.

  • Josh

    Did you ever explore using inheritance for your history object (CageHistory : Cage) so you don’t have to worry about forgetting to maintain new properties that are essentially the same, on two different classes?

  • http://devcarl.posterous.com Carl Hörberg

    Why didn’t you go with event sourcing?

    How will you handle schema changes?

  • http://www.lostechies.com/members/gnschenker/default.aspx Gabriel N. Schenker

    @Josh: we did consider using inheritance but then preferred not to use it. There were various reasons for this, one of them being not to have to deal with deep hierarchies which tend to be brittle.
    @Josh and Carl: Whenever the schema changes we have to change not only the entity but also the UI amongst other things. Such a change usually affects all layers of an application.
    Why didn’t we go with event sourcing? Well our solution is very large and has a long history. We have to consider a lot of “legacy stuff”. It is the typical slow migration to a “better” architectur…

  • Josh

    Event sourcing is good, but I hate how everyone jumps to it immediately as “the” only temporal solution.

  • http://ampgt.com Scott Bellware

    Just musing to myself that if you guys had gone with a compositional language, that this problem would have been much easier to solve. Just wondering why we constrain ourselves to tools that often have stark limits and then struggle to do things with them that point to tools without these limits.

    In a compositional language, so much more is available as standard work. That’s because more generalized solutions are possible. That’s because there’s no tripping on inheritance of templated types, which are difficult to share effectively across an entire ecosystem.

    Here’s how much work this would be in a compositional language, for example, in Ruby:

    class Cage < ActiveRecord::Base
    versioned
    end

    (see: http://github.com/laserlemon/vestal_versions/blob/master/README.rdoc)

    Because compositional languages allow many avenues to generalization, these problems get solved early and are made available as ecosystem standard work, eg: http://ruby-toolbox.com/categories/activerecord_versioning.html

    What is it that is compelling about the platform you’re using that would see to the re-creation and re-learning of things that are essentially standard work that has been commoditized as plugins and microframeworks?

    Is Silverlight really so unique in its capabilities that it is uniquely able to address the line of business user interface issues for your app? It seems like a hell of a lot to trade off to eschew the incredible amount of off-the-shelf components in compositional language ecosystems.

    The cost of rework of so much standardized work would have to be offset by some incredible productivity and competitive advantages. Based on what we see regularly in terms of capabilities of contemporary web apps, what is it that you guys are doing with Silverlight – that justifies so much re-invention – that can’t be done with contemporary web development?