Extension methods and primitive obsession
In another water-cooler argument today, a couple of coworkers didn’t like my extension method example. One main problem is that it violates instance semantics, where you expect that a method call off an instance won’t work if the instance is null. However, extension methods break that convention, leading the developer to question every method call and wonder if it’s an extension method or not. For example, you can run into these types of scenarios:
string nullString = null; bool isNull = nullString.IsNullOrEmpty();
In normal circumstances, the call to IsNullOrEmpty would throw a NullReferenceException. Since we’re using an extension method, we leave it up to the developer of the extension method to determine what to do with null references.
Since there’s no way to describe to the user of the API whether or not the extension method handles nulls, or how it handles null references, this can lead to quite a bit of confusion to clients of that API, or later, those maintaining code using extension methods.
In addition to problems with dealing with null references (which Elton pointed out, could be better handled with design-by-contract), some examples of extension methods online propose examples that show more than a whiff of the “Primitive Obsession” code smell:
- http://weblogs.asp.net/scottgu/archive/2007/03/13/new-orcas-language-feature-extension-methods.aspx
- http://davidhayden.com/blog/dave/archive/2006/11/30/ExtensionMethodsCSharp.aspx</ul>
Dealing with primitive obsession
In both of the examples above (Scott cites David’s example), an extension method is used to determine if a string is an email:
string email = txtEmailAddress.Text;
- http://davidhayden.com/blog/dave/archive/2006/11/30/ExtensionMethodsCSharp.aspx</ul>
if (! email.IsValidEmailAddress()) { // oh noes! } </pre> </div>
It’s something I’ve done a hundred times, taking raw text from user input and performing some validation to make sure it’s the “right” kind of string I want. But where do you stop with validation? Do you assume all throughout the application that this string is the correct kind of string, or do you duplicate the validation?
An alternative approach is accept that classes are your friend, and create a small class to represent your “special” primitive. Convert back and forth at the boundaries between your system and customer-facing layers. Here’s the new Email class:
<div class="CodeFormatContainer">
<pre><span class="kwrd">public</span> <span class="kwrd">class</span> Email {
<span class="kwrd">private</span> <span class="kwrd">readonly</span> <span class="kwrd">string</span> _value;
<span class="kwrd">private</span> <span class="kwrd">static</span> <span class="kwrd">readonly</span> Regex _regex = <span class="kwrd">new</span> Regex(<span class="str">@"^[w-.]+@([w-]+.)+[w-]{2,4}$"</span>);
<span class="kwrd">public</span> Email(<span class="kwrd">string</span> <span class="kwrd">value</span>)
{
<span class="kwrd">if</span> (!_regex.IsMatch(<span class="kwrd">value</span>))
<span class="kwrd">throw</span> <span class="kwrd">new</span> ArgumentException(<span class="str">"Invalid email format."</span>, <span class="str">"value"</span>);
_value = <span class="kwrd">value</span>;
}
<span class="kwrd">public</span> <span class="kwrd">string</span> Value
{
get { <span class="kwrd">return</span> _value; }
}
<span class="kwrd">public</span> <span class="kwrd">static</span> <span class="kwrd">implicit</span> <span class="kwrd">operator</span> <span class="kwrd">string</span>(Email email)
{
<span class="kwrd">return</span> email.Value;
}
<span class="kwrd">public</span> <span class="kwrd">static</span> <span class="kwrd">explicit</span> <span class="kwrd">operator</span> Email(<span class="kwrd">string</span> <span class="kwrd">value</span>)
{
<span class="kwrd">return</span> <span class="kwrd">new</span> Email(<span class="kwrd">value</span>);
}
<span class="kwrd">public</span> <span class="kwrd">static</span> Email Parse(<span class="kwrd">string</span> email)
{
<span class="kwrd">if</span> (email == <span class="kwrd">null</span>)
<span class="kwrd">throw</span> <span class="kwrd">new</span> ArgumentNullException(<span class="str">"email"</span>);
Email result = <span class="kwrd">null</span>;
<span class="kwrd">if</span> (!TryParse(email, <span class="kwrd">out</span> result))
<span class="kwrd">throw</span> <span class="kwrd">new</span> FormatException(<span class="str">"Invalid email format."</span>);
<span class="kwrd">return</span> result;
}
<span class="kwrd">public</span> <span class="kwrd">static</span> <span class="kwrd">bool</span> TryParse(<span class="kwrd">string</span> email, <span class="kwrd">out</span> Email result)
{
<span class="kwrd">if</span> (!_regex.IsMatch(email))
{
result = <span class="kwrd">null</span>;
<span class="kwrd">return</span> <span class="kwrd">false</span>;
}
result = <span class="kwrd">new</span> Email(email);
<span class="kwrd">return</span> <span class="kwrd">true</span>;
} }
</pre> </div>
I do a few things to make it easy on developers to use an email class that can play well with strings as well as other use cases:
* Made Email immutable
* Defined conversion operators to and from string
* Added the [Try-Parse pattern](http://msdn2.microsoft.com/en-us/library/ms229009.aspx)</ul>
The usage of the Email class closely resembles usage for other string-friendly types, such as DateTime:
<div class="CodeFormatContainer">
<pre><span class="kwrd">string</span> inputEmail = txtEmailAddress.Text;
Email email;
if (! Email.TryParse(inputEmail, out email)) { // oh noes! }
txtEmailAddress.Text = email; </pre> </div>
Now I can go back and forth from strings and my Email class, plus I provided a way to convert without throwing exceptions. This looks very similar to code dealing with textual date representations.
### Yes, but
The final Email class takes more code to write than the original extension method. However, now that we have a single class that plays nice with primitives, additional Email behavior has a nice home. With a class in place, I can now model more expressive emails, such as ones that include names like “Ricky Bobby <[email protected]>”. Once the home is created, behavior can start moving in. Otherwise, validation would be sprinkled throughout the system at each user boundary, such as importing data, GUIs, etc.
If you find yourself adding logic to primitives to the point of obsession, it’s a strong indicator you’re suffering from primitive obsession and a nice, small, specialized class can help eliminate a lot of the duplication primitive obsession tends to create.