Working with large systems that don’t have automated validation frameworks is a pain. So much so that many developers label them legacy even on their maiden voyage. The reality is that many such systems are in production today, and many more go into production all the time. Because of this, it’s not uncommon to find oneself dropped into such a project. We grumble and groan and hope to get off it as quickly as possible. Well, buck up: greenfield projects are a rarity, so this is where we practice our trade.

The first hurdle to cleaning up (or even just modifying) a big system is having to worry about non-regression. There are no magic bullets here; the only way to add tests to the behemoth is to roll up your sleeves and dive in. Well, sort of. It’s useful to find a seam through which to inspect the system. One such seam is the gigabytes of XML that many systems produce for message exchanges. These XML messages are useful for non-regression testing: they are structured, they are either input to or output from the system, and it is possible to validate them. For more on the subject of testing with legacy systems and seams, refer to Michael Feathers’ treatise “Working Effectively with Legacy Code”.

Even though it’s been around for a while, XML validation is often limited to schemas and DTDs. I’ve also seen XML validation done by scanning the XML for some expected sub-strings, or by brute force comparison of the resulting XML against the expected XML. These options each have their merits, but all fall short in some way. The main irritant is not having all the differences highlighted upon assertion failure.

I find that I need a better XML comparison/assertion. To that end I went looking for tools that can be easily used within JUnit tests. The library I settled on has been around long enough to have stagnated, but it continues to save my hide on projects. I’m talking about XMLUnit. With the following helper method in place, I forge ahead with integration tests on my side.

    import org.custommonkey.xmlunit.DetailedDiff;
    import org.custommonkey.xmlunit.Diff;
    import org.custommonkey.xmlunit.XMLUnit;

    public class AssertXml {

        // Fails with a report of every difference when the two documents are not similar.
        public static void areSimilar(String expected, String result) {
            // Differences in indentation and line breaks should not fail the test.
            XMLUnit.setIgnoreWhitespace(true);
            XMLUnit.setNormalizeWhitespace(true);

            DetailedDiff diff;
            try {
                diff = new DetailedDiff(new Diff(expected, result));
            } catch (Exception e) {
                // Parsing problems (malformed XML, I/O) surface as unchecked exceptions.
                throw new RuntimeException(e);
            }

            if (!diff.similar()) {
                StringBuilder differenceDescription = new StringBuilder();
                differenceDescription.append(diff.getAllDifferences().size()).append(" differences\n");
                differenceDescription.append(diff.toString());
                differenceDescription.append("\nexpected =>\n").append(expected);
                differenceDescription.append("\nresult =>\n").append(result);
                // Raise the full description so the test runner shows all differences at once.
                throw new AssertionError(differenceDescription.toString());
            }
        }
    }

This is useful when refactoring legacy systems that produce XML. It allows me to capture XML output from a system, validate it manually, and use the captured XML as the expected output for a given test. I then compare the output of each execution against that expected output to make sure that I haven’t changed any behaviour in the system while refactoring.

    @Test
    public void testSimilar() {
        AssertXml.areSimilar("", "");
    }
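The capture step described above can be sketched with nothing but the JDK. This is only an illustration under my own assumptions: the class name `CapturedXml`, the method `expectedOrCapture`, and the idea of recording the output on the first run are hypothetical and not part of XMLUnit or of the helper above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: stores captured XML on disk so later runs can compare against it.
final class CapturedXml {

    // Returns the XML previously captured in goldenFile.
    // On the first run the file does not exist yet, so the current output is
    // written there (to be reviewed by hand) and returned unchanged.
    static String expectedOrCapture(Path goldenFile, String currentOutput) throws IOException {
        if (Files.notExists(goldenFile)) {
            Path parent = goldenFile.getParent();
            if (parent != null) {
                Files.createDirectories(parent);
            }
            Files.write(goldenFile, currentOutput.getBytes(StandardCharsets.UTF_8));
            return currentOutput;
        }
        return new String(Files.readAllBytes(goldenFile), StandardCharsets.UTF_8);
    }
}
```

A test would pass the returned string as the expected argument of `AssertXml.areSimilar`. Note that the first run passes trivially, so the captured file still has to be validated manually before it is trusted as a reference.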

For best results, capture a variety of outputs to get broader non-regression coverage. This often runs into problems with temporal data (as simple as the current system date and time), which gives a first target for refactoring: breaking that dependency out. Some dependencies are easier to break than others; the simplest are static contexts like the current system time.

Because testing a large XML document typically involves varied data processed by varied rules, this is not a viable long-term solution for integration testing: the tests are fragile and not adequately targeted. It does give me a safety net to start breaking apart the tight core of code and splitting out smaller, unit-tested portions while ensuring the whole essentially behaves the same. This strategy can still be used with small, targeted bits of generated XML as part of a longer-term test strategy, but the code needs to be adapted to generate isolated fragments of the XML.

XMLUnit comes in Java and .NET flavours (though I’ve not used the .NET version) and ranks as one of my favourite old, dead, but very useful libraries.
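As a footnote on the temporal data problem mentioned above, here is one way breaking out the system time might look. It is a minimal sketch assuming Java 8's `java.time.Clock`; the `MessageBuilder` class and its XML layout are made up for illustration and do not come from any real system.

```java
import java.time.Clock;
import java.time.Instant;

// Hypothetical message producer: the clock is injected instead of read from a static context.
final class MessageBuilder {

    private final Clock clock;

    MessageBuilder(Clock clock) {
        this.clock = clock;
    }

    // Builds a small XML message stamped with the time reported by the injected clock.
    String build(String orderId) {
        Instant created = Instant.now(clock);
        return "<order id=\"" + orderId + "\"><created>" + created + "</created></order>";
    }
}
```

Production code would construct it with `Clock.systemUTC()`, while a test can pass `Clock.fixed(Instant.parse("2020-01-01T00:00:00Z"), ZoneOffset.UTC)` so that the generated XML is identical on every run and can be compared against a captured reference.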
