Validating XML content when testing

Working with large systems that don’t have automated validation frameworks is a pain. So much so that many developers label them legacy even on their maiden voyage. The reality is that many such systems are in production today, and many more are going into production hourly. Because of this, it’s not uncommon to find oneself dropped into such a project. We grumble and groan and hope to get off it as quickly as possible. Well, buck up: greenfield projects are a rarity, so this is where we practice our trade.

The first hurdle to cleaning up (or even just modifying) a big system is having to worry about non-regression. There are no magic bullets here: the only way to add tests to the behemoth is to roll up your sleeves and dive in. Well, sort of. It’s useful to find a seam through which to inspect the system. One such seam is the gigabytes of XML that many systems produce for message exchanges. These XML messages are useful for non-regression testing: they are structured, they are either input to or output from the system, and they can be validated. For more on testing legacy systems through seams, refer to Michael Feathers’ treatise “Working Effectively with Legacy Code”.

Even though XML has been around for a while, XML validation is often limited to schemas and DTDs. I’ve also seen XML validated by scanning it for expected substrings, or by brute-force comparison of the resulting XML against the expected XML. These options each have their merits, but all fall short in some way. The main irritant is not having all the differences highlighted when an assertion fails. I find that I need a better XML comparison/assertion.

To that end I went looking for tools that can easily be used within JUnit tests. The library I settled on has been around long enough to have stagnated, but it continues to save my hide on projects. I’m talking about XMLUnit (XMLUnit.sourceforge.net). With the following helper method in place, I forge ahead with integration tests on my side.

    import org.custommonkey.xmlunit.DetailedDiff;
    import org.custommonkey.xmlunit.Diff;
    import org.custommonkey.xmlunit.XMLUnit;

    public class AssertXml {

        // Fails the test, listing every difference, unless the two XML documents are similar.
        public static void areSimilar(String expected, String result) {
            // Whitespace-only differences should not count as regressions.
            XMLUnit.setIgnoreWhitespace(true);
            XMLUnit.setNormalizeWhitespace(true);

            DetailedDiff diff;
            try {
                diff = new DetailedDiff(new Diff(expected, result));
            } catch (Exception e) {
                // Parsing problems surface as unchecked exceptions so tests need no throws clause.
                throw new RuntimeException(e);
            }

            if (!diff.similar()) {
                StringBuilder differenceDescription = new StringBuilder();
                differenceDescription.append(diff.getAllDifferences().size()).append(" differences");
                differenceDescription.append(diff.toString());
                differenceDescription.append("expected => \n").append(expected);
                differenceDescription.append("result => \n").append(result);
                org.junit.Assert.fail(differenceDescription.toString());
            }
        }
    }

This is useful when refactoring legacy systems that produce XML. It allows me to capture XML output from a system, validate it manually, and use the captured XML as the expected output for a given test. I then compare the output of each run against that expected output to make sure that I haven’t changed any behaviour in the system while refactoring.

    @Test
    public void testSimilar() {
        // Arguments: the expected (captured) XML, then the XML produced by the system.
        AssertXml.areSimilar("", "");
    }
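The capture-then-compare workflow described above might be wired up along these lines. This is only a sketch using the JDK: the `CapturedXml` class, its `load` method, the `.xml` naming scheme and the UTF-8 encoding are all assumptions made for illustration, not part of XMLUnit or of the helper shown earlier.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: reads a manually validated XML capture from disk
// so that it can serve as the expected value in a test.
class CapturedXml {
    private final Path directory;

    CapturedXml(Path directory) {
        this.directory = directory;
    }

    // Loads the capture stored as <name>.xml under the capture directory.
    // Assumes the captures were saved as UTF-8.
    String load(String name) {
        Path file = directory.resolve(name + ".xml");
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            // A missing capture is a test setup problem, so fail loudly.
            throw new UncheckedIOException("Cannot read captured XML: " + file, e);
        }
    }
}
```

A test could then pass `captures.load("invoice")` as the first argument to `AssertXml.areSimilar`, where `captures` and the capture name `invoice` are again placeholders.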

For best results, capture a variety of outputs to widen the non-regression coverage. This approach often runs into trouble with temporal data (as simple as the current system date and time), since such values differ on every run. That gives a first target for refactoring: breaking these dependencies out. Some are easier to break than others; the simplest are static contexts such as the current system time.

Because testing a large XML document typically involves varied data processed by varied rules, this is not a viable long-term solution for integration testing: the tests are fragile and not adequately targeted. It does give me a safety net to start breaking apart the tight core of code and splitting out smaller, unit-tested portions while ensuring the whole essentially behaves the same. The strategy can still be used with small, targeted bits of generated XML as part of a longer-term test strategy, but the code needs to be adapted to generate isolated fragments of the XML.

XMLUnit comes in Java and .NET flavors (though I’ve not used the .NET version) and ranks as one of my favourite old, dead, but very useful libraries.
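Breaking out the current system time, mentioned above as the simplest dependency to isolate, might look like the following sketch. It relies on `java.time.Clock` from the JDK (Java 8 or later); the `OrderMessageBuilder` class and its XML layout are hypothetical stand-ins for whatever legacy code stamps the time into its output.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;

// Hypothetical message builder, standing in for legacy code that
// writes the current time into the XML it produces.
class OrderMessageBuilder {
    private final Clock clock;

    // Production code would pass Clock.systemUTC(); tests pass a fixed clock.
    OrderMessageBuilder(Clock clock) {
        this.clock = clock;
    }

    String build(String orderId) {
        // Instant.now(clock) replaces a direct read of the system time.
        Instant createdAt = Instant.now(clock);
        return "<order id=\"" + orderId + "\">"
                + "<createdAt>" + createdAt + "</createdAt>"
                + "</order>";
    }
}

class FixedClockExample {
    public static void main(String[] args) {
        // With a fixed clock the generated XML is identical on every run,
        // so a captured document stays valid as the expected output.
        Clock fixed = Clock.fixed(Instant.parse("2020-01-15T10:30:00Z"), ZoneOffset.UTC);
        System.out.println(new OrderMessageBuilder(fixed).build("42"));
    }
}
```

Once the time comes from an injected clock, the captured XML no longer goes stale between runs, and the comparison can stay strict instead of masking out date fields.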
