A Scrum Of One: Classes and Memoization

This post talks about a set of utilities provided in the BrightSword SwissKnife library (nuget, codeplex).

When programming in the real world (as opposed to programming for an academic exercise), we can (and should) combine concepts to synergistically improve our solution.

We have visited Memoization as a technique, and have seen the use of functional programming to provide more succinct, maintainable code in discussing the Reflection Utility.

In this post, we will combine those two concepts (and derive a programming rule-of-thumb) to provide a more efficient utility for Reflection.

You will find this utility here, and can access it as BrightSword.SwissKnife.TypeMemberDiscoverer<T>

The Simple Approach

The Reflection Utility extension method surface is simple to use and highly maintainable, because there is minimal code-duplication.

To recap, our extension method surface for properties looks like:

public static IEnumerable<PropertyInfo> GetAllProperties(this Type _this, BindingFlags bindingFlags = DefaultBindingFlags)
{
    return _this.GetAllMembers((_type, _bindingFlags) => _type.GetProperties(_bindingFlags), bindingFlags);
}

which allows for usage like this:

var readonlyProperties = typeof (T).GetAllProperties().Where(_ => !_.CanWrite || _.GetSetMethod() == null);

One immediately evident drawback is that each invocation of the extension method reflects over the type’s members again, so the total cost of reflection grows linearly with the number of calls.

One approach might be to consider Memoization to store and reuse the result of the reflection utilities, and pay the price for reflection just once. This post discusses the approach SwissKnife takes in order to do this.

The real problem with simple memoization is the BindingFlags parameter, which may select different sets of members each time. In order to optimally reduce the required computation, we could restrict the BindingFlags value to some reasonable default (such as ‘all public members’) and memoize the result-set of members by Type.

This line of thought leads to something like this:

public static class TypeMemberDiscoverer
{
    private const BindingFlags DefaultBindingFlags = BindingFlags.Default | BindingFlags.Instance | BindingFlags.Public;

    // The cache is an implementation detail, so it is private and readonly.
    private static readonly ConcurrentDictionary<Type, IEnumerable<PropertyInfo>> _propertySetCache = 
        new ConcurrentDictionary<Type, IEnumerable<PropertyInfo>>();

    public static IEnumerable<PropertyInfo> GetAllProperties(this Type type)
    {
        return _propertySetCache.GetOrAdd(type,
                                          _ => _.GetAllMembers((_type,
                                                                _bindingFlags) => _type.GetProperties(_bindingFlags),
                                                               DefaultBindingFlags));
    }
}

This approach still provides an extension method, and now the default property set is cached and the GetAllMembers call is only performed once.
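The caching behaviour described above can be observed with a small self-contained sketch. It is not the SwissKnife code: the class name PropertyCacheSketch and the ReflectionCalls counter are hypothetical, and it calls Type.GetProperties directly in place of GetAllMembers, which is defined in the earlier Reflection Utility post.

```csharp
using System;
using System.Collections.Concurrent;
using System.Reflection;
using System.Threading;

// Hypothetical stand-in for the dictionary-based cache; not part of SwissKnife.
public static class PropertyCacheSketch
{
    private const BindingFlags DefaultBindingFlags = BindingFlags.Instance | BindingFlags.Public;

    private static readonly ConcurrentDictionary<Type, PropertyInfo[]> Cache =
        new ConcurrentDictionary<Type, PropertyInfo[]>();

    // Counts how often reflection actually runs (for the demonstration only).
    public static int ReflectionCalls;

    public static PropertyInfo[] GetCachedProperties(Type type)
    {
        return Cache.GetOrAdd(type, t =>
        {
            Interlocked.Increment(ref ReflectionCalls);
            return t.GetProperties(DefaultBindingFlags);
        });
    }
}

public static class Program
{
    public static void Main()
    {
        var first = PropertyCacheSketch.GetCachedProperties(typeof(string));
        var second = PropertyCacheSketch.GetCachedProperties(typeof(string));

        // Both calls return the same cached array instance.
        Console.WriteLine(object.ReferenceEquals(first, second)); // True

        // In this single-threaded run, reflection ran once for string.
        Console.WriteLine(PropertyCacheSketch.ReflectionCalls);   // 1
    }
}
```

One caveat worth keeping in mind: under concurrent first access, ConcurrentDictionary.GetOrAdd may invoke the value factory more than once for the same key, although only one result is stored.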

However, we can use a feature of the C# language to get a slightly more elegant solution, and simultaneously derive a programming rule-of-thumb.

Static Fields in Generic Types

Because each closed generic type in C# is a distinct type at runtime, a static field defined in an open generic class is actually a separate static field in each corresponding closed generic class.

Consider the open generic type:

public class Foo<T>
{
    private static int _bar;
}

Now consider Foo<int> and Foo<string>.

Both of these closed generic types will have a static field of type int called _bar.

As expected, all instances of Foo<int> will share a static _bar field, and all instances of Foo<string> will share a static _bar field.

What you might not expect, however, is that the _bar field of Foo<int> is not the same field as the _bar field of Foo<string>. In fact, ReSharper and Microsoft Code Analysis both warn that this behaviour may be unexpected: you might expect the field to be shared across all types that close the open generic type, but it is not!
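A short sketch makes the per-closed-type behaviour concrete. Counter<T> is a hypothetical class invented for this illustration; it plays the same role as Foo<T> above, with a method added so the field can be observed.

```csharp
using System;

// Hypothetical illustration class; mirrors Foo<T> with an observable field.
public static class Counter<T>
{
    // One copy of this field exists per closed type: Counter<int>, Counter<string>, ...
    private static int _count;

    public static int Increment()
    {
        return ++_count;
    }
}

public static class Program
{
    public static void Main()
    {
        Counter<int>.Increment();
        Counter<int>.Increment();

        Console.WriteLine(Counter<int>.Increment());    // 3: third call on Counter<int>
        Console.WriteLine(Counter<string>.Increment()); // 1: Counter<string> has its own field
    }
}
```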

However, in our case, this is exactly the behaviour we want! Because static fields in generic classes are actually only shared by instances of the corresponding closed generic type, we can use a generic class itself as a kind of memoization container, where we can cache something by Type!

So the general rule-of-thumb is: If you find yourself trying to cache against a type for whatever reason, consider using a generic class as a cache container. The runtime guarantees that the static initializers of a type run only once, in a thread-safe manner, so you can write more readable code and let the language provide the cache!
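The thread-safety part of this rule can be checked with a rough sketch. The class PropertyNames<T> and its Initializations counter are hypothetical names made up for this example; the sketch only assumes the runtime's guarantee that a type's static initializers run once.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical generic cache container; the closed type itself acts as the cache key.
public static class PropertyNames<T>
{
    // Incremented by the initializer below; one counter per closed type.
    public static int Initializations;

    // The runtime runs this initializer once per closed type,
    // even when many threads touch the type for the first time together.
    public static readonly string[] Value = Compute();

    private static string[] Compute()
    {
        Interlocked.Increment(ref Initializations);
        return Array.ConvertAll(typeof(T).GetProperties(), p => p.Name);
    }
}

public static class Program
{
    public static void Main()
    {
        // Read the cached value from many threads at once.
        Parallel.For(0, 1000, i => GC.KeepAlive(PropertyNames<string>.Value));

        Console.WriteLine(PropertyNames<string>.Initializations); // 1
    }
}
```

Note that this technique needs the type as a compile-time type argument; when only a runtime Type instance is available, a dictionary keyed by Type is still the more direct fit.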

So we can write a generic type with one type parameter, as follows:

public static class TypeMemberDiscoverer<T>
{
    private const BindingFlags DefaultBindingFlags = BindingFlags.Default | BindingFlags.Instance | BindingFlags.Public;

// ReSharper disable StaticFieldInGenericType
    private static readonly IEnumerable<PropertyInfo> _properties = 
        typeof(T).GetAllMembers((_type,
                                 _bindingFlags) => _type.GetProperties(_bindingFlags), DefaultBindingFlags);
// ReSharper restore StaticFieldInGenericType

    public static IEnumerable<PropertyInfo> GetAllProperties()
    {
        return _properties;
    }
}

The usage is:

var readonlyProperties = TypeMemberDiscoverer<int>.GetAllProperties()
                                                  .Where(_ => !_.CanWrite || _.GetSetMethod() == null);

In the class above, we pay the price to discover the properties exactly once – when the TypeMemberDiscoverer<int> type is initialized. We can use the Lazy<T> pattern to further defer the evaluation to when the GetAllProperties() call is first invoked.

public static class TypeMemberDiscoverer<T>
{
    private const BindingFlags DefaultBindingFlags = BindingFlags.Default | BindingFlags.Instance | BindingFlags.Public;

// ReSharper disable StaticFieldInGenericType
    private static readonly Lazy<IEnumerable<PropertyInfo>> _propertiesL = 
        new Lazy<IEnumerable<PropertyInfo>>(
            () => typeof(T).GetAllMembers((_type,
                                           _bindingFlags) => _type.GetProperties(_bindingFlags),
                                          DefaultBindingFlags));
// ReSharper restore StaticFieldInGenericType

    public static IEnumerable<PropertyInfo> GetAllProperties()
    {
        return _propertiesL.Value;
    }
}

This is both efficient and easier-to-read, and it doesn't need a ConcurrentDictionary to work!
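To see the deferral in action, here is a minimal sketch under stated assumptions: LazyProperties<T>, IsEvaluated and Evaluations are hypothetical names added for observation, and Type.GetProperties stands in for GetAllMembers, which is not defined in this post.

```csharp
using System;
using System.Reflection;

// Hypothetical variant of the lazy discoverer, instrumented for the demonstration.
public static class LazyProperties<T>
{
    // Counts how often the factory delegate runs; one counter per closed type.
    public static int Evaluations;

    private static readonly Lazy<PropertyInfo[]> Properties =
        new Lazy<PropertyInfo[]>(() =>
        {
            Evaluations++;
            return typeof(T).GetProperties(BindingFlags.Instance | BindingFlags.Public);
        });

    // True once the factory has produced a value.
    public static bool IsEvaluated
    {
        get { return Properties.IsValueCreated; }
    }

    public static PropertyInfo[] GetAllProperties()
    {
        return Properties.Value;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(LazyProperties<string>.IsEvaluated); // False: nothing computed yet

        LazyProperties<string>.GetAllProperties();
        LazyProperties<string>.GetAllProperties();

        Console.WriteLine(LazyProperties<string>.IsEvaluated); // True
        Console.WriteLine(LazyProperties<string>.Evaluations); // 1: the factory ran once
    }
}
```

The Lazy<T> constructor used here defaults to a thread-safe mode in which only one thread executes the factory, so the plain increment inside the delegate is sufficient for this sketch.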

Summary

We can use the functional programming principle of memoization to cache the result of an expensive operation (recursive reflection). This eliminates repeated computation at the expense of memory.
We can do this elegantly by using a feature of the C# language which allows static fields in a generic type to act as a cache keyed by the type parameters.
We can further reduce wasted effort by using the Lazy<T> pattern provided by the runtime, and only compute (and cache) results when first used.

Sunday, November 3, 2013

Classes and Memoization - A Neat C# Trick
