Analyzing historical data and playing with interactive extensions

21 September, 2010. It was a Tuesday.

Introduction

In my last post I showed you how we make history an explicit domain concept. This time I want to show you how we use the history of the cages to generate bills. In our zoo each cage has an associated per diem rate (dollars per day). This rate reflects the cost of the space occupied by the cage as well as the cost of hosting the animals in it (feeding, cleaning, etc.).

Calculate Time Intervals

To provide the zoo management with a monthly bill per cage type, we have to retrieve the history data collected over the billing period and aggregate the cost over that time span.

In our history records we only store the creation date of each record. To calculate costs we need the time ranges during which a specific state is valid. The start date of an interval corresponds to the creation date of the record itself. The end date of the interval corresponds to the creation date of the next history record in the sequence. We can easily calculate these intervals using a for loop:

    var list = cageHistoryList.ToArray();
    for (var i = 0; i < list.Length - 1; i++)
    {
        list[i].EndDate = list[i + 1].StartDate;
    }

but this is not a very declarative way of expressing a set-based operation, and it quickly becomes difficult to read once we add more logic. On the other hand, we have all come to love LINQ. So let’s use what the new interactive extensions (which are part of the Rx framework) provide.

We can use the Zip function to build pairs from two source streams. In our case both source streams are the cage history records, with the second one shifted by one element; that is, we want a series of pairs (h1, h2), (h2, h3), (h3, h4), …

Assuming that our source stream is cageHistoryList and that it only contains the history of one cage, we can then write:

    var list = cageHistoryList.Zip(cageHistoryList.Skip(1),
                                   (h1, h2) => new {H1 = h1, H2 = h2})
             .Do(pair => pair.H1.EndDate = pair.H2.StartDate)
             .Select(pair => pair.H1);

The Do function introduces side effects into the stream of data but remains in the monad: it returns another sequence, so we can keep chaining operators after it.
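One consequence of staying in the monad is worth seeing in isolation: the side effect is deferred until the sequence is actually enumerated. The following is a minimal sketch of that behaviour, not the Ix implementation. It assumes a modern C# compiler (top-level statements) and uses a hypothetical local helper `Tap` as a stand-in for Do so that it runs with standard LINQ only; the dates and the `endDates` dictionary are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical creation dates of three history records for one cage.
var starts = new[]
{
    new DateTime(2010, 9, 1),
    new DateTime(2010, 9, 8),
    new DateTime(2010, 9, 15),
};

// Side-effect target: maps a record's start date to its computed end date.
var endDates = new Dictionary<DateTime, DateTime>();

// Build the query. Tap only wraps the sequence; the action has not run yet.
var query = Tap(
    starts.Zip(starts.Skip(1), (start, next) => (Start: start, Next: next)),
    pair => endDates[pair.Start] = pair.Next);

var countBeforeEnumeration = endDates.Count;   // 0: nothing has been enumerated

// Enumerating the query is what triggers the side effects.
var pairs = query.ToList();
var countAfterEnumeration = endDates.Count;    // 2: one entry per pair

Console.WriteLine($"{countBeforeEnumeration} -> {countAfterEnumeration}");  // 0 -> 2

// Hypothetical stand-in for Do: runs the action for each element as it
// passes through, then yields the element unchanged.
static IEnumerable<T> Tap<T>(IEnumerable<T> source, Action<T> action)
{
    foreach (var item in source)
    {
        action(item);
        yield return item;
    }
}
```

In practice this means a query that sets EndDate inside Do should be materialized (for example with ToList or ToArray) before the end dates are read, and enumerating it twice runs the side effects twice.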

The problem with the Zip expression is that the resulting list does not contain the last element of the source stream. To fix this we can combine Concat with another handy new function of the interactive extensions, namely TakeLast.

    var list2 = list.Concat(cageHistoryList.TakeLast(1));

Please note that the TakeLast function is very different from the well-known Last function: TakeLast remains in the monad and returns part of the tail of the source stream, whereas Last() exits the monad and returns a single element (the last one).
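To make the missing tail and the TakeLast/Last difference concrete, here is a small sketch with plain strings standing in for the history records. It is only an illustration under a few assumptions: the labels h1 to h4 are hypothetical, and it uses the TakeLast that ships with standard LINQ in .NET Core 2.0 and later rather than the one from the interactive extensions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-ins for four history records of a single cage (labels are hypothetical).
var history = new[] { "h1", "h2", "h3", "h4" };

// Zip stops at the shorter input, so four records give only three pairs.
var pairs = history.Zip(history.Skip(1), (a, b) => (H1: a, H2: b)).ToList();

// Projecting back to the first half of each pair loses the final record.
var withoutTail = pairs.Select(pair => pair.H1).ToList();

// TakeLast(1) is still a sequence, so it can be appended with Concat.
IEnumerable<string> tail = history.TakeLast(1);
var restored = withoutTail.Concat(tail).ToList();

// Last() leaves the sequence world and hands back the element itself.
string lastRecord = history.Last();

Console.WriteLine(string.Join(", ", withoutTail));  // h1, h2, h3
Console.WriteLine(string.Join(", ", restored));     // h1, h2, h3, h4
Console.WriteLine(lastRecord);                      // h4
```

Because `tail` is an `IEnumerable<string>` and `lastRecord` is a plain `string`, only the former can be passed to Concat without wrapping it in a collection first.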

Now if we have more than one cage, the end date of the last record of each cage has to be equal to the end date of the billing period. Provided the records are ordered by cage and then by creation date, we can easily adjust our expression to handle this:

    // The very last record has no successor, so close it at the end of the billing period.
    cageHistoryList.Last().EndDate = billingIntervalEndDate;

    var list = cageHistoryList.Zip(cageHistoryList.Skip(1), (h1, h2) => new {H1 = h1, H2 = h2})
        // A change of CageId means H1 is the last record of its cage.
        .Do(pair => pair.H1.EndDate = pair.H1.CageId != pair.H2.CageId ? billingIntervalEndDate : pair.H2.StartDate)
        .Select(pair => pair.H1)
        .Concat(cageHistoryList.TakeLast(1))
        .Do(x => x.NumberOfDays = (x.EndDate - x.StartDate).Days);

Also note that I have added another Do function at the end of the expression to calculate the number of days in the interval. Of course we could also filter, sort or group the resulting stream…
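As one example of such grouping, the sketch below aggregates the cost per cage type, which is what the monthly bill needs. It is a hedged illustration only: the tuple fields `CageType` and `PerDiemRate`, the rates and the day counts are all assumptions made up for this example, and the intervals are hard-coded instead of coming from the pipeline above.

```csharp
using System;
using System.Linq;

// Hypothetical intervals as they might look once NumberOfDays has been filled in.
var intervals = new[]
{
    (CageType: "Lion",   PerDiemRate: 40m, NumberOfDays: 7),
    (CageType: "Monkey", PerDiemRate: 25m, NumberOfDays: 10),
    (CageType: "Lion",   PerDiemRate: 40m, NumberOfDays: 3),
};

// One bill line per cage type: sum of (days in interval * per diem rate).
var bill = intervals
    .GroupBy(interval => interval.CageType)
    .Select(cageGroup => (
        CageType: cageGroup.Key,
        Total: cageGroup.Sum(interval => interval.NumberOfDays * interval.PerDiemRate)))
    .OrderBy(line => line.CageType)
    .ToList();

foreach (var line in bill)
{
    Console.WriteLine($"{line.CageType}: {line.Total}");
}
// Lion: 400
// Monkey: 250
```

Using decimal for the rate keeps the totals exact, which is usually what you want for money.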
