How to unit test and refactor legacy code?

When you first start writing unit tests, you probably already have quite a large working codebase. Now, how can you start writing tests for this code?

Usually this code isn’t written with testing in mind and it’s probably very hard to test. Your team probably has a rule that says “if it works, don’t touch it”.
And indeed, you can’t just go into your code, start refactoring things to make them testable and then write your tests, because you will break things.

So you’re basically trapped in a loop: you can’t refactor because you don’t have tests and you can’t write tests because you can’t refactor.

These are a few steps you can use to get out of the loop:

  1. Make the smallest possible refactoring in order to get your code to a testable state. When you do this, choose the safest refactoring method possible. Some types of refactoring are safer than others. Choose the ones that can be done by an automatic tool such as resharper. These tools will ensure that no behavior is changed.
  2. Once you have made a small refactoring, write an exhaustive test for the part you just made testable.
  3. Rinse and repeat.

Every next loop you do you can refactor a bit more and you can take on higher risk types of refactoring. There’s a small risk involved in the first loops, but if you chose the safest refactoring you can reduce this risk.

When refactoring, always take small steps, certainly in the beginning. When you have a good suite of tests, you can tackle bigger refactorings because you have your safety net in place. When you try to refactor more riskier things, always make sure you have that part of your code under test.

When starting with tests, the goal of the refactorings shouldn’t be to improve the design of your code, it should be to make your code testable. In most cases it will also improve your design, but even if it makes the design a little less clear, it’s a fair trade-off. Later, when you have a bigger part under test, you can always go back and refactor it the way you want.

How do I make my code testable?

Making your code testable is often quite tricky. Code that hasn’t been developed to be testable usually has a lot of dependencies: it talks to a database, it sends e-mails and it writes to files. The key point is that you have to break dependencies. This doesn’t mean that your code can’t access these things (that would be ridiculous), it just means that we have to flex our codebase a little bit.

While the main goal is making your code testable, we’re also improving the overall design of our code. If you develop your code so it doesn’t have direct dependency on the file system, it’s a lot easier to later swap it out for something else.

While this maybe hardly ever happens, you get another benefit: there’s no bleed of responsibilities between different units. Because units are independent, a change in one unit won’t affect another unit. This way you prevent changes from provoking errors in other parts of your code.

Now, obviously the question is: how can I break dependencies and more importantly, how can I break them without breaking the code in the process?

Technique 1: Extract an interface

Suppose you have a class that has a dependency on a second class. In the next example we have a class InvoiceGenerator that uses a MailSender-class to send e-mails:

   1:  public class InvoiceGenerator
   2:  {
   3:      public void CreateInvoices()
   4:      {
   5:          // Lots of invoice calculation code here
   6:          MailSender sender = new MailSender();
   7:          sender.SendMail("PAYDAY, please pay this");
   8:      }
   9:  }
  10:   
  11:  public class MailSender
  12:  {
  13:      public void SendMail(string Message)
  14:      {
  15:          // Send the mail here
  16:      }
  17:  }

If you want to unit test the InvoiceGenerator that means mails will be sent.
How can we prevent this from happening in our unit tests? Here are the basic steps to get your InvoiceGenerator under test:

  1. Create an Interface that exposes the same methods as the MailSender-class does. Let’s call it INotificationSender.
  2. Modify the MailSender so it implements the INotificationSender
  3. Let the InvoiceGenerator talk to the interface instead of the MailSender directly
  4. In your test-project, create a fake that implements the INotificationSender interface
  5. While testing, let the InvoiceGenerator talk to the fake object. This can be done in various ways:
    1. You can expose a property on the InvoiceGenerator-class. The disadvantage is that it breaks encapsulation.
    2. You can create a constructor which accepts an INotificationSender. The problem here is that clients of the class will have to provide the correct instance.
    3. Create an overloaded constructor. The first one is the same constructor as in option B. The second one is an empty constructor. In the empty constructor you create the MailSender-object and pass it to the first constructor. This way, encapsulation is not broken and clients don’t need to provide any data to instantiate the InvoiceGenerator. As a bonus, your system is now more flexible, because you can easily swap out the notification system.

The modifications listed here are not very complicated modifications and with a little bit of care you can make them with relative certainty that you are not breaking anything. Here’s the modified version of the code:

   1:  public class InvoiceGenerator
   2:  {
   3:      private INotificationSender _notificationSender;
   4:      public InvoiceGenerator() : this(new MailSender())
   5:      {
   6:      }
   7:      public InvoiceGenerator(INotificationSender NotificationSender)
   8:      {
   9:          this._notificationSender = NotificationSender;
  10:      }
  11:   
  12:      public void CreateInvoices()
  13:      {
  14:          // Lots of invoice calculation code here
  15:          _notificationSender.SendMail("PAYDAY, please pay this");
  16:      }
  17:  }
  18:   
  19:  public interface INotificationSender
  20:  {
  21:      void SendMail(string Message);
  22:  }
  23:   
  24:  public class MailSender : INotificationSender
  25:  {
  26:      public void SendMail(string Message)
  27:      {
  28:          // Send the mail here
  29:      }
  30:  }

The next step is to write a fake for the mailsender and unit tests for the InvoiceGenerator. This would look like this:

   1:  [TestClass()]
   2:  public class TestInvoiceGenerator
   3:  {
   4:      [TestMethod()]
   5:      public void TestCreateInvoice()
   6:      {
   7:          FakeMailSender fakesender = new FakeMailSender();
   8:          InvoiceGenerator target = new InvoiceGenerator(fakesender);
   9:   
  10:          target.CreateInvoices();
  11:          // Do assertions here
  12:      }
  13:  }
  14:   
  15:  public class FakeMailSender : INotificationSender
  16:  {
  17:      public void SendMail(string Message)
  18:      {
  19:          // Do Nothing
  20:      }
  21:  }

Of course you could extend the fake class so you can check some things in there too, but I think the main thing to take away here is that you have successfully decoupled the InvoiceGenerator from the MailSender-class, which is already an improvement.
Once this is in place, you can start doing bigger refactorings for the InvoiceGenerator.

This technique is my personal favorite for breaking dependencies, because it’s very clean and it will almost always improve your design.
However, there are exceptions. When you have loads of these dependencies, your codebase will become cluttered with a lot of interfaces. If you always apply this technique, you will basically have one interface per class, which is a bit over the top. Therefor, sometimes other types of dependency breaking techniques are more appropriate.

Technique 2: Subclass and override method(s)

When you don’t want to create an interface, either because it’s not appropriate or becuase it bloats your codebase, this is a simple technique to make sure you can get something else in place. The concept is very similar:

  1. Make sure you can control the MailSender instance that the InvoiceGenerator holds. (for example with constructor overloading such as with the extract an Interface-method)
  2. In your test project, make a a fake that inherits from MailSender and overrides its methods (or just the ones that are necessary)
  3. Swap the MailSender-instance for the fake object

If we apply these modifications instead of the Extract Interface-method we end up with the following:

   1:  public class InvoiceGenerator
   2:  {
   3:      private MailSender _mailsender;
   4:      public InvoiceGenerator() : this(new MailSender())
   5:      {
   6:      }
   7:      public InvoiceGenerator(MailSender MailSender)
   8:      {
   9:          this._mailsender = MailSender;
  10:      }
  11:   
  12:      public void CreateInvoices()
  13:      {
  14:          // Lots of invoice calculation code here
  15:          _mailsender.SendMail("PAYDAY, please pay this");
  16:      }
  17:  }
  18:   
  19:   
  20:  public class MailSender
  21:  {
  22:      public virtual void SendMail(string Message)
  23:      {
  24:          // Send the mail here
  25:      }
  26:  }

This looks a lot like the first code listing but here we haven’t decoupled the MailSender-class. Now in the tests, we will override the SendMail-method and get a fake object in place. The result is the following:

   1:  [TestClass()]
   2:  public class TestInvoiceGenerator
   3:  {
   4:      [TestMethod()]
   5:      public void TestCreateInvoice()
   6:      {
   7:          FakeMailSender fakesender = new FakeMailSender();
   8:          InvoiceGenerator target = new InvoiceGenerator(fakesender);
   9:   
  10:          target.CreateInvoices();
  11:          // Do assertions here
  12:      }
  13:  }
  14:   
  15:  public class FakeMailSender : MailSender
  16:  {
  17:      public override void SendMail(string Message)
  18:      {
  19:          // Do Nothing
  20:      }
  21:  }

Again very similar, but now we’re not implementing an interface but inheriting a class and overriding a method.

The advantage of this technique is its low impact on existing code. Use this technique if you really want to change as little as possible in your codebase, or when an Interface just isn’t appropriate.

Conclusion

The basic steps to get your code under test and refactor it are:

  • Get it testable by breaking dependencies. Make the least risky changes while doing this. 
  • Write your unit tests

  • Refactor

In order to get your code under test, there are two important techniques that you can use:

  • Extracting interfaces
  • Subclassing and overriding

With these two techniques, you can resolve a lot of dependencies and get your code ready to be tested.
In a following post, I’ll show a few more techniques that can come in handy when dealing with specific dependency problems:

  • Breaking out classes
  • Simplify parameters
  • Extract and override construction to factory

Comments are closed.