When we write out classes and variables, we are pretty good at naming them for what they represent. When we declare a variable for a customer’s first name, we don’t call it firstNameString or cfnString, and we have also lost the bad habit of firstNameStr or strFirstName. The implementation detail simply doesn’t matter. Semantically, it represents the customer’s first name, so we use customerFirstName. Or, if we are clearly in the context of the customer, we can simply use firstName or customer.FirstName.
By describing things by what they represent and not how they are represented, we make them more resistant to implementation changes, and we also make it clearer what it is we expect from their use. Why then do we not treat types the same way?
Well, we do, you might say. Sure, we name our classes well. We have Customer, a class whose name matches its semantic meaning. But why does it have a constructor signature like this?
public Customer(string firstName, string surname, string preferredName, int age, decimal weight, decimal height) {...}
Can you tell from this constructor what each positional parameter represents? What if we did the following? Is it still obvious?
public Customer(string a, string b, string c, int d, decimal e, decimal f) {...}
OK, so my example is contrived, but my point is simple. Positional parameters that share a primitive type are indistinguishable to the compiler, which neither knows nor cares what they mean, just that the types match so it can compile. If you called this constructor using new Customer("Jones", "Bobby", "Robert", 24, 1.68m, 91.3m), the compiler doesn’t care that you created the lightest giant in the world, called Jones Bobby with a nickname of Robert. How is it to know that’s not what you meant? What we need is a way to attach semantic meaning to the positional parameters, so that the data being passed around carries that meaning with it and we avoid these transposition errors, among other things.
I have come across several names for this technique: Tiny Types, Value Types, Micro Types and Semantic Types. My personal favourite is Semantic Types, but in discussion with others you may come across all of these and more. The point is that we create new types in our system that might be domain specific, or may be valid across various disparate systems. These types encapsulate the implementation types used to represent the data and stand in their place as immutable value objects.
It is worth noting that enumerations are a kind of tiny type, but you sometimes get interesting behaviour when the underlying int implementation leaks out, especially when casting and serialisation are involved.
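To make the enum leak concrete, here is a minimal sketch. The Colour enum and the EnumLeakDemo helper are hypothetical names invented for illustration; the point is only that C# lets any int be cast to an enum, whether or not a member with that value exists.

```csharp
using System;

// Hypothetical enum used only for this illustration.
public enum Colour
{
    Red = 1,
    Green = 2,
    Blue = 3
}

public static class EnumLeakDemo
{
    // Any int can be cast to the enum, whether or not a member is defined for it.
    public static Colour FromRaw(int raw)
    {
        return (Colour)raw;
    }

    // Enum.IsDefined is one way to detect values that leaked in from outside.
    public static bool IsKnown(Colour colour)
    {
        return Enum.IsDefined(typeof(Colour), colour);
    }
}
```

Calling EnumLeakDemo.FromRaw(42) compiles and runs without complaint, even though no Colour member has that value, which is the kind of surprise a deserialised or cast value can bring with it.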
Let’s go back to our example and see what the end result looks like, and work backwards to an implementation. Here is our constructor, in more detail this time.
public class Customer
{
    ... [Auto properties probably here]
    public Customer(FirstName firstName, FamilyName familyName, PreferredName preferredName,
        Age age, Weight weight, Height height)
    {
        ...
    }
}
OK, now let’s look at what we have here. The types of the parameters are objects that encapsulate the data they represent. Whether you split your names like this or have a Name type that represents them as a whole is up to you, but the point is that you can’t mix up the height with the weight, since the types won’t match and it won’t compile.
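As a rough sketch of what that looks like at the call site, the following uses cut-down stand-ins for Height and Weight (no equality members) and a hypothetical Measurements class in place of the full Customer, purely so the example compiles on its own.

```csharp
// Cut-down stand-ins for the semantic types; equality members omitted for brevity.
public sealed class Height
{
    public Height(decimal value) { Value = value; }
    public decimal Value { get; private set; }
}

public sealed class Weight
{
    public Weight(decimal value) { Value = value; }
    public decimal Value { get; private set; }
}

// Hypothetical class standing in for the relevant part of Customer.
public sealed class Measurements
{
    public Measurements(Weight weight, Height height)
    {
        Weight = weight;
        Height = height;
    }

    public Weight Weight { get; private set; }
    public Height Height { get; private set; }
}

public static class CallSiteDemo
{
    public static Measurements Create()
    {
        // Compiles: each argument's type matches its parameter.
        return new Measurements(new Weight(91.3m), new Height(1.68m));

        // Would not compile (argument type mismatch), so the swap is caught early:
        // return new Measurements(new Height(1.68m), new Weight(91.3m));
    }
}
```

The raw decimals still have to be wrapped somewhere, so a mistake is possible at the point of wrapping, but it is now confined to the one place where the name of the type sits right next to the value.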
There are other benefits beyond catching transposed parameters. Overloads that expect different kinds of data can share a name, instead of needing unique method names just to distinguish what a string represents. If you perform calculations on the data, you won’t accidentally add or subtract things that are not compatible, like our height and our weight. Even something as simple as wrapping your entity keys in semantic types means you can’t pass a CustomerKey into a method like public Document[] GetForms(AccountKey key), because the compiler will catch it for you. Changing your representation is also easier, say if you need to add precision by switching from int to decimal, since you no longer have to update every reference to the implementation type.
Let’s take a look now at what implementing a tiny type might look like. Basically, it’s an object. It will have a property to expose the primitive implementation, since at some point this will have to interface with something outside your control (see the suggested exercises below for more on this). You will also need to override the equality members so you can compare these objects by value without any issues.
using System;

public class Height : IEquatable<Height>
{
    private readonly decimal _value;

    public Height(decimal dataValue)
    {
        _value = dataValue;
    }

    public decimal Value { get { return _value; } }

    public override bool Equals(System.Object obj)
    {
        // If parameter is null return false.
        if (obj == null)
        {
            return false;
        }
        // If parameter cannot be cast to Height return false.
        Height h = obj as Height;
        if ((System.Object)h == null)
        {
            return false;
        }
        // Return true if the fields match:
        return Value == h.Value;
    }

    public bool Equals(Height h)
    {
        // If parameter is null return false:
        if ((object)h == null)
        {
            return false;
        }
        // Return true if the fields match:
        return Value == h.Value;
    }

    public override int GetHashCode()
    {
        return Value.GetHashCode();
    }

    public static bool operator ==(Height a, Height b)
    {
        // If both are null, or both are same instance, return true.
        if (System.Object.ReferenceEquals(a, b))
        {
            return true;
        }
        // If one is null, but not both, return false.
        if (((object)a == null) || ((object)b == null))
        {
            return false;
        }
        // Return true if the fields match:
        return a.Value == b.Value;
    }

    public static bool operator !=(Height a, Height b)
    {
        return !(a == b);
    }
}
(If you’re interested, my implementation is based on this MSDN guide.)
There’s a lot of code here, so let’s look at it in pieces. First, it’s a class with a Value property that we set in the constructor. We make it getter-only with a readonly backing field, making it immutable.
public class Height : IEquatable<Height>
{
    private readonly decimal _value;

    public Height(decimal dataValue)
    {
        _value = dataValue;
    }

    public decimal Value { get { return _value; } }
    ...
To make sure we can compare our tiny types in a standard way, we implement the usual equality members. Specifically, we override GetHashCode and Equals. We also overload the == and != operators to make the type even easier to use correctly, and lastly we implement IEquatable<Height> so we can compare two instances of the same type without casting through object each time.
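To see what those members buy us, here is a sketch that repeats Height in a more compact form (so that it compiles on its own) and adds a hypothetical EqualityDemo helper. The compact Equals(object) simply delegates to Equals(Height), which is a stylistic variation on the longer version shown earlier rather than a change in behaviour.

```csharp
using System;
using System.Collections.Generic;

// Compact variant of the Height class, repeated so this sketch is self-contained.
public sealed class Height : IEquatable<Height>
{
    private readonly decimal _value;

    public Height(decimal dataValue) { _value = dataValue; }

    public decimal Value { get { return _value; } }

    public bool Equals(Height other)
    {
        return !ReferenceEquals(other, null) && Value == other.Value;
    }

    public override bool Equals(object obj) { return Equals(obj as Height); }

    public override int GetHashCode() { return Value.GetHashCode(); }

    public static bool operator ==(Height a, Height b)
    {
        if (ReferenceEquals(a, b)) return true;
        if (ReferenceEquals(a, null) || ReferenceEquals(b, null)) return false;
        return a.Value == b.Value;
    }

    public static bool operator !=(Height a, Height b) { return !(a == b); }
}

// Hypothetical helper for illustration.
public static class EqualityDemo
{
    // Counts distinct heights by value; HashSet<T> relies on Equals and GetHashCode.
    public static int CountDistinct(IEnumerable<Height> heights)
    {
        return new HashSet<Height>(heights).Count;
    }
}
```

Two separately constructed instances wrapping the same decimal now compare equal through ==, through Equals, and inside hash-based collections. Without the overrides, each of those would fall back to reference equality and treat them as different.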
Obviously you could take this approach and build out a generic base class with most of the common logic. Here is a list of things you might want to try out to get your head around the approach of building these types of objects on your own.
- Implement a generic base class representing a single primitive value wrapped in a Semantic Type.
- Create a T4 template that can read in settings about the name and type and produce classes like these without generics.
- Implement domain-specific validations, so your semantic types accept only a subset of the values the underlying type allows (think about email addresses and the string type, for instance).
- Make a more complex Semantic Type that has more than one backing field (represent a fraction, or a Point in 3d space for instance)
- Can you get your object to automatically expose its internal value for you using custom cast operators, instead of using a Value property?
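As a starting point for the first, third and last exercises, here is one possible sketch, not a finished solution. SemanticType, EmailAddress and the Validate helper are hypothetical names, the email check is deliberately crude, and the == and != operators are left out to keep it short.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical base class: TValue is the wrapped primitive, TSelf the concrete semantic type.
public abstract class SemanticType<TValue, TSelf> : IEquatable<TSelf>
    where TSelf : SemanticType<TValue, TSelf>
{
    private readonly TValue _value;

    protected SemanticType(TValue value) { _value = value; }

    public TValue Value { get { return _value; } }

    public bool Equals(TSelf other)
    {
        return !ReferenceEquals(other, null)
            && EqualityComparer<TValue>.Default.Equals(Value, other.Value);
    }

    public override bool Equals(object obj) { return Equals(obj as TSelf); }

    public override int GetHashCode()
    {
        return EqualityComparer<TValue>.Default.GetHashCode(Value);
    }
}

public sealed class EmailAddress : SemanticType<string, EmailAddress>
{
    public EmailAddress(string value) : base(Validate(value)) { }

    // Deliberately crude check, just to show where domain validation could live.
    private static string Validate(string value)
    {
        if (string.IsNullOrEmpty(value) || value.IndexOf('@') <= 0)
        {
            throw new ArgumentException("Not a plausible email address.", "value");
        }
        return value;
    }

    // Explicit cast as one alternative to reading the Value property.
    public static explicit operator string(EmailAddress email)
    {
        return email.Value;
    }
}
```

The self-referencing TSelf parameter is there so that an EmailAddress can only be compared with another EmailAddress, not with some other type that happens to wrap a string. Whether the cast should be explicit or implicit is a design choice: an implicit cast is more convenient, but it lets the primitive leak back out without anyone noticing.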
I may look at creating demos on these points, but for now enjoy playing with how these things come together and I hope to see more people using these in projects in the future.
(See it compare here)