Can extreme programming be as rigorous as traditional software engineering?
December 12, 2003
In theory, extreme programming is a highly dynamic and beautiful process. It is about people (programmers and customers) communicating with one another, working bit by bit to develop a working, error-free system that meets the customer's needs. At the same time, extreme programming (XP) relieves developer headaches like extended overtime. Kent Beck, the father of XP, outlines twelve basic rules (Miller & Collins 2001):
- The Planning Game
- Pair Programming
- Testing
- Refactoring
- Simple Design
- Collective Code Ownership
- Continuous Integration
- On-site Customer
- Small Releases
- 40-hour Week
- Coding Standards
- System Metaphor
But can extreme programming really do all of this while being as rigorous as traditional development methods? I think that in many cases it can be just as rigorous, and in some cases even more so. XP delivers superior software in small, dynamic environments when programmers understand the system they are developing and make a conscientious effort to test their code thoroughly. XP has not been significantly tested in large-scale development, however, and it is unlikely to be suitable for that sort of situation. In addition, it does not make sense to use XP for critical systems (nuclear reactors and the like).
How does XP promote rigorous software development?
Pair Programming
The adage “two heads are better than one” is one of the reigning principles of XP, and there is much evidence to suggest that this practice of pair programming is indeed beneficial. The idea is that two people can talk through the design of the code as they type it, working out the best solution before they implement it. (This is crucial since design occurs as code is being written, and there is no up-front design to work from.) If one person is typing out code, the other will be constantly evaluating the effectiveness of the code, looking for syntax errors and design flaws. When two people are working together on a project, both are forced to remain actively engaged with each other and with the code: there is less chance that one or the other will give in to a sudden urge to check e-mail or surf the web (Williams & Kessler 2000).
Laurie Williams, a researcher at North Carolina State University, has conducted considerable research into pair programming. In one case study involving university students, she found that students who programmed in pairs consistently produced code that was less buggy than students who programmed alone. The students were asked to write several programs, which were then tested thoroughly for errors. Pair programmers found that their projects passed about 15% more of the tests than individual programmers (Williams, Cunningham & Jeffries 2000).
A 1999 survey of pair programmers reported that 100% of pair programmers were more confident with their code when they had programmed with another person as compared to when they had programmed alone (Williams & Kessler 2000).
Indeed, testimonials from pair programmers suggest that on the whole they have found pair programming to be a beneficial exercise that keeps them producing quality code. Whether it be on the XP Usenet group or in a tech magazine, people are excited about how pair programming improves their code. Ron Jeffries writes about his experience programming Smalltalk with a novice programmer: though Jeffries was confident about his coding abilities, he found that his partner, even without a comprehensive knowledge of the language, was able to ask him provocative questions about his design choices. Jeffries states simply, “Having a partner makes me a better programmer” (Williams, Cunningham & Jeffries 2000).
Pair programming has these direct benefits, but it also promotes long-term quality development. This technique spreads overall knowledge of the system, since programmers usually rotate through pairs many times during the course of development. Programmers are also given more chances to learn from one another, so that good programming techniques are more quickly learned by the whole team. Pairing is especially useful for novice programmers who may not be entirely confident in their coding skills; XP gives them a chance to apprentice with more skilled programmers. Apprenticing allows the novice to learn quickly. This can only aid the overall team effort, as more people will have more knowledge about good coding practices (Cockburn & Williams 2001).
Test Driven Development
Another principle of XP is test driven development (TDD). TDD asks programmers to write tests before they write the code that passes those tests. Writing tests first makes the programmers more aware of what an object or method needs to accomplish. These unit tests are stored in a common repository and are executed constantly during development to ensure that the system is performing properly. If someone makes a change to the system and causes a test to fail, it is up to that person (or pair) to fix the error so that the system passes 100% of the tests (Beck & Gamma).
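To make the test-first idea concrete, here is a minimal sketch in Python using the standard unittest module. The ShoppingCart class and its methods are hypothetical, invented purely for illustration; they do not come from any of the sources cited here. In a TDD workflow the three tests would be written first, and the class would then be filled in until they all pass.

```python
import unittest


class ShoppingCart:
    """Hypothetical class, written only after the tests below existed."""

    def __init__(self):
        self._prices = []

    def add_item(self, price):
        # Reject bad input; this rule was pinned down by a test first.
        if price < 0:
            raise ValueError("price must be non-negative")
        self._prices.append(price)

    def total(self):
        return sum(self._prices)


class ShoppingCartTest(unittest.TestCase):
    """Unit tests that state what the class must do."""

    def test_empty_cart_totals_zero(self):
        self.assertEqual(ShoppingCart().total(), 0)

    def test_total_sums_item_prices(self):
        cart = ShoppingCart()
        cart.add_item(3)
        cart.add_item(4)
        self.assertEqual(cart.total(), 7)

    def test_negative_price_is_rejected(self):
        with self.assertRaises(ValueError):
            ShoppingCart().add_item(-1)


def run_suite():
    """Run every test, as a shared build would after each change."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ShoppingCartTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_suite() after every change mirrors the practice described above: a change that breaks any test is caught immediately, and the person who made it is expected to restore a fully passing suite.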
In addition to verifying the system with unit tests written by programmers, the system is also checked against acceptance tests. These are tests that are defined by the customer, and are intended to ensure compliance with the customer's needs. At the beginning of a project, the customer writes out onto cards all the “stories” that they would like a user to be able to carry out within the software. An acceptance test is written for each user story, to make certain that the story is properly implemented, and these tests are executed regularly during development (Lange 2002).
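The link between story cards and acceptance tests can be sketched in the same way. The example below is an assumption-laden illustration, not a description of any particular XP tool: the OrderSystem class, the two stories, and the run_acceptance_tests helper are all hypothetical. Each story on a card is paired with one check that exercises the system the way a user would.

```python
from dataclasses import dataclass, field


@dataclass
class OrderSystem:
    """Hypothetical system under test; stands in for the real application."""
    stock: dict = field(default_factory=dict)
    orders: list = field(default_factory=list)

    def place_order(self, item, quantity):
        # An order succeeds only if enough stock is on hand.
        if self.stock.get(item, 0) < quantity:
            return False
        self.stock[item] -= quantity
        self.orders.append((item, quantity))
        return True


def story_customer_can_order_in_stock_item():
    system = OrderSystem(stock={"flour": 10})
    return system.place_order("flour", 4) and system.stock["flour"] == 6


def story_customer_cannot_order_more_than_stock():
    system = OrderSystem(stock={"flour": 2})
    return not system.place_order("flour", 5) and system.orders == []


# One acceptance check per story card (the story wording is hypothetical).
ACCEPTANCE_TESTS = {
    "Customer can order an in-stock item":
        story_customer_can_order_in_stock_item,
    "Customer cannot order more than is in stock":
        story_customer_cannot_order_more_than_stock,
}


def run_acceptance_tests():
    """Return a pass/fail report keyed by story, readable by the customer."""
    return {story: bool(check()) for story, check in ACCEPTANCE_TESTS.items()}
```

Keying the report by the story text keeps the results in the customer's own language, which is the point of an acceptance test: the customer, not the programmer, decides whether a story is done.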
How are traditional software engineering practices rigorous?
Traditional software engineering processes such as the waterfall model, formal systems development, and cleanroom software development ensure rigorous software development through a variety of mechanisms, some of which I shall outline now. Traditional software development cycles rely heavily on the production of a requirements document. This document, though its precise contents may vary, usually includes plain-language descriptions of the system to be developed (suitable for end-users) and a detailed description of the system functions (suitable for developers). The requirements document is prepared with considerable attention to customer needs, and is approved by the customer before any code is written (Sommerville 2001, pp. 109-118).
The technical specifications in the requirements document are enhanced by detailed design specifications and models later in the software lifecycle. Whether these are UML diagrams outlining class hierarchies or state diagrams showing the flow of control within a program, they provide the software engineers with valuable information about how the end system is going to function. If the system is safety-critical, such as a nuclear reactor monitoring system, the requirements document is transformed to usable code using mathematical proofs (p. 48). When all these documents have been created, the engineers know precisely how all the parts of the system fit together, and they know that the end product will work and meet the requirements established at the beginning of the process.
The sum result of all this preparation is that the programmers who sit down to write code know pretty much exactly what they need to do and how they need to do it. Furthermore, the plethora of diagrams showing interfaces between objects lets the developers know which methods need testing, which ones are the most critical, and how they should best be tested. An intense testing phase (sometimes longer than the initial development phase) ensures that no critical holes exist in the software. With the requirements document in hand, testers can start prodding the newly developed system to check for inconsistencies between the implemented system and the requirements specification. This defect testing usually includes a walkthrough of the code performed by a team of four engineers who check for common coding errors and design flaws (pp. 427-431). Alongside defect testing, testers can perform statistical tests to measure emergent properties such as overall system performance, reliability, safety, and security (p. 421).
In other words, there is a lot of testing happening during the traditional software engineering processes. In formal systems development, it can even be guaranteed that the software is error-free. The whole lifecycle has been nothing but rigorous: from the requirements documentation to formal specifications to defect testing, each step in the process has passed intense scrutiny.
How does XP compare with traditional methods?
XP certainly has its share of opponents. It produces a hodge-podge of code, with programmers designing on the fly. Unit tests can be improperly coded and fail to perform their duties. The on-site customer may not be familiar with what the end-user's needs are. The lack of documentation makes it difficult to view the development process. Constant refactoring can introduce significant errors into the code (Stephens 2003). Indeed, it does seem that XP has the potential to deliver software that contains some faults.
Kent Beck seems to suggest this very thing in a recent interview. He says, “suppose there are two styles of interaction, one of which costs one one-hundredth of the other and is 95 percent as effective” (Nelson 2002). From context, it is clear that he is discussing extreme programming. XP relies on the individual initiatives of programmers to come up with unit tests. If a programmer forgets to include a specific test, that portion of the program may not ever be tested during development, and only after deployment will an error be found. To help combat this, XP relies on both pair programming and customer-based acceptance tests. So some bugs may remain, but the system has been checked by many different people. In some ways, this is an example of “good enough” software (Bach 1997). The software performs well, but there might be some critical flaw lurking beneath the surface. For this reason, it does not seem advisable to use XP for critical systems; in this case, it is appropriate to use formal systems development.
XP requires that programmers be knowledgeable about the environment in which the system is to be deployed. A personal example: my father worked for several years with an e-commerce company that was using XP techniques to deliver software to businesses in the food sector. The programmers they had hired, however, knew very little about the food industry. As a result, business managers found themselves constantly taking finished code back to the programmers asking them to rework it so that it accurately modeled the business practices. The programmers were not willing to engage themselves with their customer to learn exactly what was needed, so the product delivered was often faulty.
I think that part of the reason traditional development models can deliver robust code is that they have been around for a long time and people have, over time, learned ways to cope with the things that do not go as planned. For example, when traditional methodologies had individual programmers coding in distinctly different styles, people developed CASE (computer-aided software engineering) tools to establish coding standards and improve the overall system reliability (Gilles 1997, pp. 84-92).
Until now, XP has not been tested with large groups of programmers. “XP is set up for small groups of programmers. Between 2 and 12, though larger projects of 30 have reported success” (Wells 1999). No one really knows what will happen when XP is tried with larger groups and larger systems. “[Just] as expertise in woodcarving does not automatically imply expertise in cathedral architecture, expertise in small-team software production does not automatically imply expertise in building very large systems” (Bollinger 2000). Traditional software engineering has long since passed this hurdle and can competently deliver cathedral-like software.
XP clearly has its limits, but it can also outperform traditional software engineering in certain situations. Perhaps the greatest feature of XP is that it assumes that requirements will change over time, and therefore it is able to keep pace with a customer's changing needs. For example, fast and dynamic development is crucial to projects at the HP Middleware Labs, where requirements can easily change every hundred days. Traditional software engineering could not keep pace (Cowen 2002). Instead of producing pages and pages of design documentation, XP programmers write the code to document itself and proceed immediately with implementation.
Conclusion
XP is not suitable for large or critical systems, but it delivers robust, reliable code for small, dynamic systems. It can outperform traditional software engineering when software needs to be developed quickly with a high measure of quality. XP is at its finest when programmers are working in pairs, constantly checking the code written by their partners; when programmers are able to understand the system and are able to have constant access to a knowledgeable customer; and when programmers perform thorough unit tests and customers perform thorough acceptance tests. This is when extreme programming is happening the way it was meant to happen, and this is when reliable software is built.
References
Bach, James (1997), Good Enough Quality: Beyond the Buzzword, [online], IEEE, Available from: http://www.satisfice.com/articles/good_enough_quality.pdf [11 December 2003].
Beck, Kent, & Gamma, Erich, Test Infected: Programmers Love Writing Tests, [online], Available from: http://members.pingnet.ch/gamma/junit.htm [11 December 2003].
Bollinger, Terry (2000), XP: Two Concerns, [online], IEEE Computer Society Dynabook: eXtreme Programming, Available from: http://www.computer.org/SEweb/Dynabook/BollingerCom.htm [11 December 2003].
Cockburn, Alistair, & Williams, Laurie (2001), The Costs and Benefits of Pair Programming, [online], Available from: http://collaboration.csc.ncsu.edu/laurie/Papers/XPSardinia.PDF [11 December 2003].
Cowen, Amy (March 2002), Extreme Programming: Coding on the Edge, [online], mpulse, Available from: http://cooltown.hp.com/mpulse/0302-extreme.asp [11 December 2003].
Gilles, Alan C. 1997, Software Quality: Theory and Management, 4th edn., International Thomson Computer Press, London.
Lange, Manfred (2002), Practical Tips for XP, [online], The XP Exchange, Available from: http://www.xpexchange.net/en/articles/practicalXP.html [11 December 2003].
Miller, Roy W., & Collins, Christopher T. (1 March 2001), XP Distilled, [online], Available from: http://www-106.ibm.com/developerworks/java/library/j-xp/ [11 December 2003].
Nelson, Eldon (15 January 2002), Extreme Programming vs. Interaction Design, [online], FTPOnline, Available from: http://www.fawcette.com/interviews/beck_cooper/default.asp [11 December 2003].
Sommerville, Ian 2001, Software Engineering, 6th edn., Pearson Education Limited, Harlow, England.
Stephens, Matt (26 January 2003), The Case Against Extreme Programming, [online], Software Reality, Available from: http://www.softwarereality.com/lifecycle/xp/case_against_xp.jsp [11 December 2003].
Wells, Don (1999), When Should Extreme Programming Be Used?, [online], Available from: http://www.extremeprogramming.org/when.html [11 December 2003].
Williams, Laurie A., & Kessler, Robert R. (May 2000), All I Really Need to Know about Pair Programming I Learned in Kindergarten , [online], Available from: http://collaboration.csc.ncsu.edu/laurie/Papers/Kindergarten.PDF [11 December 2003].
Williams, Laurie, Kessler, Robert R., Cunningham, Ward, & Jeffries, Ron (July/Aug 2000), Strengthening the Case for Pair-Programming, [online], Available from: http://collaboration.csc.ncsu.edu/laurie/Papers/ieeeSoftware.PDF [11 December 2003].