Automated Acceptance Testing – the next holy war in software development?
There have been several issues in software development that I have relegated to the category of holy war. To qualify for this category, a topic has to generate much passion and sharply differing views within a team or the software community as a whole. To date I have included in this category such topics as:-
1. The brace war in Java (do we start a new line for an open brace or not)
2. Unit testing private methods
3. Comments in your code
4. Mocks vs. Stubs
5. Naming conventions for interfaces
6. Spaces vs. tabs
7. Indent size
Recently I have also experienced so many heated debates on the topic of Automated Acceptance Testing that I feel it now qualifies for the ranks of software holy wars, as the feelings on both sides are so strong and so opposed.
I really don’t want to go into my view on this topic at this point. Rather, I want to share my observation that many of the terms used in these discussions are understood differently by the people involved, which only adds to the aggression and confusion. If you are going into a discussion on this topic, I feel it is necessary to begin with:-
1. an agreed glossary of terms
2. an understanding of the history around this topic and how the terms evolved
To this end, here is my understanding of the terms, some references and finally a suggested glossary. The glossary is of course open to debate and should be used as a guideline or starting point for creating your own glossary for your company. (I personally like the use of the perfection game for this exercise as outlined here or search Google books for “Playing and Perfecting” and read the chapter from Jim McCarthy’s “Software for Your Head” book.)
I have tried to follow a chronological order in the development of these terms.
In the early XP (Extreme Programming) days there evolved two types of testing: unit and functional. Unit tests I do not need to go into, as their definition is clear and consistent within the development community. The C2 wiki was/is the birthplace of many of the XP terms and concepts, so I refer to this site for a definition of the term functional test.
Functional Tests are programs or scripts configured to test that packages (groups of clusters of classes) meet external requirements and achieve goals, such as performance. They include screen-driving programs that test GUIs from without.
Functional test scripts were typically written with frameworks that extended xUnit, e.g. HttpUnit, DbUnit, JWebUnit.
Examples of functional testing programs are :- SoapUI, PureTest, Selenium IDE, iMacros
In the Functional Testing C2 wiki page referenced above - scroll down the page to see a comment from Kent Beck with the title “Name discussion: functional vs. acceptance test”. This was the birth of the term acceptance test and I will again refer to the C2 wiki site for a definition here.
A formal test conducted to determine whether or not a system satisfies its acceptance criteria and to enable the customer to determine whether or not to accept the system.
Originally called Functional Tests because each acceptance test tries to test the functionality of a user story.
The point of clarity to be made here is the intent of using the word “acceptance” instead of “functional”: the goal of a functional test was to prove via software (scripts or programs) that a story has met its acceptance criteria, as defined by the customer. So it seems that a refinement of functional testing was made, such that the purpose of a given suite of tests was to test only the new functionality provided to a running system by the successful delivery/implementation of a story or stories.
When this concept came to light to the team I was with at the time, we created a suite of HttpUnit tests at the beginning of (or prior to) each iteration. We would run this suite after the completion of a story, and its passing would indicate that the story was indeed complete. The suite itself became similar in function to a burn down chart: you could see progress through the iteration as more and more of the suite passed, with the goal that by the end of the iteration the entire suite would pass. These tests were then added to our suite of regression tests. (The definition of the term regression tests is to follow.)
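To make the burn down idea concrete, here is a minimal sketch in Python rather than the HttpUnit we actually used. Everything in it is invented for illustration: the `ShoppingCart` class stands in for the real system, and the two story names are hypothetical. It only shows the shape of the idea, one acceptance check per story and a count of how many currently pass.

```python
class ShoppingCart:
    """Hypothetical system under test, standing in for the real application."""

    def __init__(self):
        self.items = []

    def add(self, name, price):
        self.items.append((name, price))

    def total(self):
        return sum(price for _, price in self.items)


def story_add_item_to_cart():
    # Acceptance check for a story that has already been delivered.
    cart = ShoppingCart()
    cart.add("book", 10)
    assert cart.total() == 10


def story_apply_discount_code():
    # Acceptance check for a story that has not been implemented yet.
    cart = ShoppingCart()
    assert hasattr(cart, "apply_discount"), "discount story not implemented"


# The suite is written at the start of the iteration, one check per story.
ACCEPTANCE_SUITE = [story_add_item_to_cart, story_apply_discount_code]


def accepted_stories(suite):
    """Run every check and return the names of the ones that pass."""
    passed = []
    for check in suite:
        try:
            check()
        except AssertionError:
            continue
        passed.append(check.__name__)
    return passed


passed = accepted_stories(ACCEPTANCE_SUITE)
print(f"{len(passed)} of {len(ACCEPTANCE_SUITE)} stories accepted")
```

Run after each completed story, the count climbs towards the size of the suite, which is the progress signal described above.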
As an aside – we did not continue with this approach for very long for a number of reasons outside of the scope of this blog.
Referring to the Acceptance Testing wiki page above – you will see where the term Customer Testing came from.
Acceptance tests are different from Unit Tests in that Unit Tests are modeled and written by the developer of each class, while the acceptance test is at least modeled and possibly even written by the customer. ...Hence the even-newer name, Customer Test. C2 wiki definition of customer test ...
This is to clarify that Acceptance Tests are owned and defined (with assistance) by the Customer.
From the IEEE Standard Computer Dictionary: A Compilation of IEEE Standard Computer Glossaries; IEEE; New York, NY.; 1990, System Testing is defined as :-
System testing of software or hardware is testing conducted on a complete, integrated system to evaluate the system's compliance with its specified requirements. System testing falls within the scope of black box testing, and as such, should require no knowledge of the inner design of the code or logic.
On the Customer Test C2 wiki page mentioned above you will see the statement:-
As for System Test vs Acceptance Test, these two are essentially the same. There is absolutely no difference in their scope. Only that they are done by different groups of people - usually with vastly different competencies, and with somewhat different vested interest.
So, to summarise where we have got to: functional tests were renamed acceptance tests, which were again renamed customer tests, which are also known as system tests when run by a testing or deployment team (as opposed to a development team). You wonder why there is confusion?!
At the Wikipedia System Testing page you will see included in their definition : -
System testing falls within the scope of black box testing, and as such, should require no knowledge of the inner design of the code or logic.
As a rule, system testing takes, as its input, all of the "integrated" software components that have successfully passed integration testing and also the software system itself integrated with any applicable hardware system(s).
This now introduces two more testing terms: Black Box Testing and Integration Testing. When will this end?
Black Box Testing
Testing a system without any knowledge of the inner design of the code or logic.
http://c2.com/cgi/wiki?IntegrationTesting : -
[Integration testing is] validating that the boundaries of a system are functioning properly and will deliver the required results. Specifically, all components within the system should cooperate with each other properly. Integration Testing is performed by Integration Testers or Software Developers.
Integration tests typically fall in the category of system tests, with the express intention of testing the interface points of the system.
Maven’s Integration-Test Phase
I want to say something about the term integration-test as it is used by the Maven build tool. My concern is that people whose only exposure to the term Integration Test is through using Maven may not have a clear understanding of the broader meaning.
As part of the Maven build life cycle, there is an integration-test phase which is run after your artifact is built. The idea is that from that point on you are able to test the artifact from the outside, without knowledge of the inner code (i.e. as a black box). This would run a separate suite of tests from your unit test suite. However, the name integration-test phase may cause confusion, as the suite you run at this point can contain black box tests other than just integration tests. For example, I have used this testing phase to verify that the artifact itself is built correctly after a complex build process. This is essentially testing the build code, in the hope that any change to it that would cause an incorrect artifact to be built (e.g. missing the images folder) would be caught. It treats the build scripts themselves as testable.
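As a rough sketch of that kind of artifact check, the Python below inspects a zipped artifact from the outside and reports any required entries that are missing. It is not Maven code, and the artifact layout (`index.html`, `images/logo.png`) and the `build_artifact` helper are assumptions made up for the example; a real check would open the file your build actually produced.

```python
import io
import zipfile

# Hypothetical list of paths the finished artifact must contain.
REQUIRED_ENTRIES = ["index.html", "images/logo.png"]


def missing_entries(artifact_bytes, required):
    """Return the required paths that are absent from the zipped artifact."""
    with zipfile.ZipFile(io.BytesIO(artifact_bytes)) as artifact:
        present = set(artifact.namelist())
    return [path for path in required if path not in present]


def build_artifact(paths):
    """Stand-in for the real build: zip the given paths with dummy content."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as artifact:
        for path in paths:
            artifact.writestr(path, "placeholder")
    return buffer.getvalue()


good_build = build_artifact(["index.html", "images/logo.png"])
broken_build = build_artifact(["index.html"])  # the images folder was dropped

print(missing_entries(good_build, REQUIRED_ENTRIES))
print(missing_entries(broken_build, REQUIRED_ENTRIES))
```

The check knows nothing about how the artifact was assembled, only what it should contain, which is what makes it a black box test of the build.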
Regression Testing is testing that a program has not regressed: that is, in a commonly used sense, that the functionality that was working yesterday is still working today.
In other words, making sure you did not break anything that was previously working or re-introduce previously fixed bugs.
There are several approaches to this, including keeping a suite of all previous acceptance suites (yes, they can be refactored).
In the spirit of TDD, whenever we find a bug, we should write one or more failing tests. This may include a system test, particularly when the bug involves GUI functionality that we are unable to unit test. Once the bug is fixed and the test passes, it should be added to the regression test suite to ensure that we do not re-introduce this particular bug. This is also in the spirit of tests as documentation, as hopefully the new test will indicate that it was written to resolve a bug and what the bug was.
There are strategies as to when these tests should be run, but these are outside the scope of this document.
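A small illustration of a bug turned into a regression test, again in Python and with an invented bug: assume a hypothetical `format_price` function once displayed 105 cents as "$1.5". The test is written first, fails against the buggy code, passes once the fix is in, and its name and docstring record what the bug was.

```python
def format_price(amount_in_cents):
    """Hypothetical function under test; shown here with the fix applied."""
    dollars, cents = divmod(amount_in_cents, 100)
    # The fix: always pad the cents to two digits.
    return f"${dollars}.{cents:02d}"


def regression_single_digit_cents_lose_leading_zero():
    """Bug: 105 cents was displayed as '$1.5' instead of '$1.05'."""
    assert format_price(105) == "$1.05"
    assert format_price(5) == "$0.05"


# Once it passes, the check joins the regression suite.
REGRESSION_SUITE = [regression_single_digit_cents_lose_leading_zero]

for check in REGRESSION_SUITE:
    check()
print("regression suite passed")
```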
Business Readable Domain-specific language
There are Acceptance Test tools that aim to make it simple for the customer to write acceptance tests without (or with minimal) help from a developer or tester. A subgroup of these tools has the tests written in plain text, with the aim that the language be as close to human readable as possible and expose words for domain specific concepts. The language in these text files is referred to as a Business Readable Domain-specific language. Some example languages are Gherkin and Selenese.
Behavior Driven Development (BDD)
This is the concept of writing acceptance tests prior to starting on a story/feature. This name for the process became popular with developers of testing frameworks that differentiate themselves from other frameworks by their use of natural language for writing tests. Some examples are RSpec, STIQ and Cucumber. So the common understanding of this term today is:-
Writing (and running) acceptance tests written in a Business Readable Domain-specific language prior to any other code.
BDD Acceptance tests are typically written using the standard agile framework of a User story: "As a [role] I want [feature] so that [benefit]". Acceptance criteria are written in terms of scenarios and implemented as classes: Given [initial context], when [event occurs], then [ensure some outcomes].
Outside-in Development
Another name for Behavior Driven Development, popular with Ruby developers (typically using RSpec and/or Cucumber).
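To show the Given/When/Then shape, here is a sketch of one scenario implemented as a class. It is plain Python, not the API of Cucumber, RSpec or any other BDD tool, and the `Account` domain object and the story text are made up for the example. A real tool would map plain text steps onto methods like these.

```python
class Account:
    """Hypothetical domain object used only for this illustration."""

    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


class WithdrawCashScenario:
    """As an account holder I want to withdraw cash so that I can spend it."""

    def given_an_account_with_balance(self, balance):
        # Given [initial context]
        self.account = Account(balance)
        return self

    def when_the_holder_withdraws(self, amount):
        # When [event occurs]
        self.account.withdraw(amount)
        return self

    def then_the_balance_is(self, expected):
        # Then [ensure some outcomes]
        assert self.account.balance == expected
        return self


scenario = (
    WithdrawCashScenario()
    .given_an_account_with_balance(100)
    .when_the_holder_withdraws(30)
    .then_the_balance_is(70)
)
print(scenario.account.balance)
```

Each step returns the scenario so the three steps chain in the same order as the written criteria.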
User Acceptance Testing (UAT)
UAT is a subcategory of acceptance testing insofar as it applies only to acceptance tests that relate to testing the user interface. Typically this is the GUI.
Automated Testing
The ability to test a system using software to run the tests.
The opposite of automated testing is manual testing.
Automated Acceptance Tests
Writing acceptance tests in a way that they can be run by software.
How They All Fit Together
Functional Test - the first name for what is now known as acceptance testing. The term and concept were created by the Extreme Programming community, which later renamed it Acceptance Test because they felt this name better described the intent. See Acceptance Test for the definition.
Acceptance Test - A formal test conducted to determine whether or not a system satisfies its acceptance criteria and to enable the customer to determine whether or not to accept the system. This is currently the widest accepted term for the concept.
Customer Test - a further suggested refinement on the name Acceptance Test to indicate that these tests are owned and defined by the customer. The definition is the same as Acceptance Test.
System Test - System testing of software is testing conducted on a complete, integrated system to evaluate the system's compliance with its specified requirements. This term is used by the broader IT community. The agile term Acceptance Test can be considered a type of System Test.
Black Box Testing - a concept or category of testing. It can be thought of as the opposite of unit testing. Its name suggests that you are testing a system without any knowledge of the business logic or code inside the system. An Acceptance Test fits in the category of Black Box Testing.
Integration Test - as the name suggests, your goal is to test the interfaces of your system. Because the GUI is often one of the interfaces, an acceptance test that is testing GUI functionality is interpreted by some teams as a type of integration test. The term is used in place of Acceptance Test by some teams.
Regression Test - a test designed to prove the stability of the system. Often an acceptance test becomes a regression test once the story has been accepted by the customer.
Business Readable Domain-specific language - a syntax and set of key words used by BDD (Behavior Driven Development) testing tools. E.g. Gherkin, Selenese
Behaviour Driven Development (BDD) - The concept is the same as Acceptance Testing by definition, but the term has recently become popular with the developers and users of testing tools that use a Business Readable Domain-specific language in plain text documents. The term is used to highlight that the test is written prior to code and in a language that highlights expected behaviour.
Outside-in Development - in development circles, this term is used (typically amongst the Ruby community) to describe a development practice that uses Behaviour Driven Development.
User Acceptance Test (UAT) - a test of the interface that the end user is using. Typically a GUI.
Automated Testing - the opposite of manual testing. You are using software to test a system.