9.11.13

Done-done or always beta?

Some development teams advocate a delivery strategy called done-done-with-high-quality, which means engineers won't deliver their code to the main branch until the feature the code relates to has been tested thoroughly, including unit tests, integration tests, end-to-end tests, and scale tests.

I know a team that has been using this strategy for three years and has released several versions of the software. The team is, quite obviously, not famous for good velocity. But they were quite confident in their quality until a customer crisis broke out suddenly. Many customers ran into serious issues, all kinds of issues, so the team had to stop developing new features and focus on fixing problems on the customer side.

Why does a team that has been advocating quality end up facing quality issues? There are many reasons, technical and non-technical. If we set the technical reasons aside for a while, I would blame the done-done strategy most.

The done-done strategy is based on the belief that quality can be acquired through enough testing: if we sacrifice something like development time and spend more of it on tests, we can achieve a certain level of quality.

This is not true at all. Software development is very dynamic and affected by many factors. "Building software is by its very nature unpredictable and unrepetitive." It is not possible to set up criteria for a feature, have it pass all of them, and then call it done. We simply don't know the real criteria yet.

I am not saying we shouldn't test the system. What I am saying is: don't obsessively test the system. Unfortunately, there is no clear boundary for non-obsessive testing. One thing we do know is that we should not ask engineers to raise the bar to the so-called done-done-with-high-quality level. Doing so is asking for a promise. To make sure we keep the promise, we have to test the system obsessively until the managers give up and ask us to stop. That way, nobody can be blamed for quality. Then the morale of the team continuously goes down, the velocity goes down, and because we don't have enough iterations or enough time for final verification before we deliver the software to customers, the quality goes down too.

There is a better way, called always beta. We emphasize the importance of velocity and call for dynamic development. We drive development based on risks instead of static quality criteria. We review the software's functionality without requiring a perfect implementation. But we pay more attention to architecture and code quality, to make sure that if we want to change or improve something, it is easy to do. Simplicity is one of the most important features of the system. Since we don't expect a perfect system, it is more likely that we can implement it in the simplest way and avoid most of the complex logic.
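To make the simplicity point concrete, here is a hypothetical sketch in Python (the function names and discount rules are invented for this illustration, not taken from any real system). The first version tries to anticipate every case upfront, which is where the complex logic creeps in; the always-beta version implements only what is needed today and stays easy to change tomorrow.

    # A "done-done" style implementation: it tries to handle every
    # imaginable case upfront, so the logic is hard to read and change.
    def discount_v1(price, customer_type, coupon, season, loyalty_years):
        if customer_type == "vip":
            if season == "holiday":
                rate = 0.30 if loyalty_years > 5 else 0.25
            else:
                rate = 0.20
        elif coupon:
            rate = 0.15 if season == "holiday" else 0.10
        else:
            rate = 0.05 if loyalty_years > 5 else 0.0
        return price * (1 - rate)

    # An "always beta" style implementation: only the rule we actually
    # need now, in the simplest readable form; extend it when a real
    # need appears, relying on cheap refactoring rather than foresight.
    def discount_v2(price, is_vip):
        rate = 0.20 if is_vip else 0.0
        return price * (1 - rate)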

Also, we have the team deliver code more frequently, so everyone on the team sees the system, feels the system, and runs the system. With more eyes, we can find more issues before they reach the customers. We always expect integration problems and fix them very quickly. We also take seemingly simple errors seriously if our engineers think they are important and quality-related; yes, let the engineers tell us whether something is important or not. We keep up the refactoring work, continuously improve the design and code quality, and keep the system as simple and as readable as possible.
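As a sketch of what "more eyes, more often" can look like in practice, here is a minimal smoke check in Python that could run on every delivery to the main branch. The base URL and the endpoints are hypothetical placeholders; the point is only that a handful of fast probes surface integration problems within minutes, without requiring an exhaustive done-done test pass.

    import sys
    import urllib.request

    # Hypothetical location of the freshly delivered system and a few
    # endpoints that exercise its main paths end to end.
    BASE_URL = "http://localhost:8080"
    SMOKE_ENDPOINTS = ["/health", "/login", "/api/items"]

    def smoke_test():
        failures = []
        for path in SMOKE_ENDPOINTS:
            try:
                with urllib.request.urlopen(BASE_URL + path, timeout=5) as resp:
                    if resp.status != 200:
                        failures.append((path, resp.status))
            except Exception as exc:  # connection refused, timeout, HTTP error
                failures.append((path, exc))
        return failures

    if __name__ == "__main__":
        failures = smoke_test()
        for path, problem in failures:
            print("SMOKE FAIL %s: %s" % (path, problem))
        # A non-zero exit blocks the delivery, so integration problems
        # surface within minutes instead of at the customer site.
        sys.exit(1 if failures else 0)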

With the always-beta mindset, we have more time to focus on the things we don't know upfront. And usually, that is where the real quality issues emerge.
