Agile Architecture Revisited

I have written about agile architecture before, but since I have been working with a group of architects recently (the kind that build software, not the kind that build buildings), I figured it's time to revisit the topic. The question that kept on coming up was "how do you do proper architecture in agile?" It's a good question. Agile is all about just in time rather than up front planning, and traditional architecture looks a heck of a lot like a type of up front planning. We even have a special term for what we want in agile environments - emergent architecture: architecture that emerges just in time from the team. The problem is that while emergent architecture works fine in some problem domains, there are others where it just isn't enough. If you're designing banking systems, or safety critical healthcare systems, or even just regular old big complex systems, relying on emergent architecture simply doesn't cut it. You need some level of up front thinking (or at least thinking longer term than a sprint or two ahead) to make sure your product doesn't fall in a heap.

Some of the scaled frameworks recognise this and introduce the concept of "intentional architecture" for the upfront stuff. The amount of intentional vs emergent architecture you do is a function of the type of system you are building. That's great but it still doesn't tell us much about how to do architecture (emergent, intentional or otherwise) in an agile environment. Before we look at how to do architecture, we should start by understanding what architecture is, and more specifically, what it isn't. Let me start by saying something really important. Remember this, there will be a test later - architecture is not the same as design. Many organisations, actually all organisations that I have worked for so far, have been getting architecture wrong. In these organisations, the architects didn't actually do any architecture. They produced detailed design documents. That's design. Not architecture. Detailed design absolutely should not be done up front.

Organisations have mixed up the two because they have tried to concentrate design skills into a few specialists, who then farm out the designs to lower skilled (and therefore cheaper) coders to implement. They already had highly skilled architects doing architecture so they added design to the architect's role and over the years it has become seen as an integral part (or even the only part) of the architect's job. This is wrong. Detailed design is something that absolutely should emerge from the teams doing the work. Architecture though, real architecture (rather than design) does have a necessary up front component.

So, if architecture isn't detailed design, then what is it? The best way to think about architecture is as a set of constraints that any design must adhere to.

If you think about a problem, any problem, there are always multiple solutions to it. Each will have slightly different attributes. Some will be easier, some will be faster, some will use existing infrastructure, others need new things to be developed, some will scale, some won't. I like to refer to this set of all possible solutions to a problem as the solution space. Normally the solution space for any real world problem is very large. The role of architecture is to constrain the solution space.

So if we make the architectural decision to develop the solution using our existing .NET stack rather than build a new Java stack, then we have just divided our solution set in two - the set of .NET based solutions that we will continue to consider and the set of other solutions that we can immediately discard. We can then start to put more constraints on the solution - it must use our existing data warehouse, it must interface with the enterprise service bus, it should be based on microservices, it will follow the MVC design pattern, and so on. Each of these decisions constrains the solution set a little (or a lot) more.

We can build our technology strategy into the constraints - it must interface with the existing database and also with the new database under development, the solution must include a replacement for the accounting system as that is old and high risk, and so on.

We should definitely build in our non-functional requirements - the solution must pass security testing, it must scale to 10,000 visits/second, it must be accessible to the visually impaired, no identifiable customer information can be stored in the cloud, and so on.
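To make the idea concrete, here is a rough sketch in Python of architecture as a filter over the solution space. It is only an illustration, not a real tool: the Candidate class, its attributes and the four example candidates are all made up for this post, and a real solution space is far too large to list out like this. The constraints themselves are taken from the examples above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Candidate:
    """A hypothetical candidate design, reduced to a few coarse attributes."""
    name: str
    stack: str
    uses_data_warehouse: bool
    stores_customer_data_in_cloud: bool
    peak_visits_per_second: int


# An architectural constraint is a named yes/no test on a candidate design.
Constraint = Callable[[Candidate], bool]

CONSTRAINTS: dict[str, Constraint] = {
    "existing .NET stack": lambda c: c.stack == ".NET",
    "uses existing data warehouse": lambda c: c.uses_data_warehouse,
    "no customer data in the cloud": lambda c: not c.stores_customer_data_in_cloud,
    "scales to 10,000 visits/second": lambda c: c.peak_visits_per_second >= 10_000,
}


def violations(candidate: Candidate) -> list[str]:
    """Return the names of the constraints this candidate breaks."""
    return [name for name, is_met in CONSTRAINTS.items() if not is_met(candidate)]


def constrain(solution_space: list[Candidate]) -> list[Candidate]:
    """Keep only the candidates that satisfy every constraint."""
    return [c for c in solution_space if not violations(c)]


# Four made-up candidates standing in for a much larger solution space.
solution_space = [
    Candidate("A", ".NET", True, False, 12_000),
    Candidate("B", "Java", True, False, 20_000),
    Candidate("C", ".NET", True, True, 15_000),
    Candidate("D", ".NET", False, False, 8_000),
]

for candidate in solution_space:
    print(candidate.name, violations(candidate) or "within the constraints")
```

Each constraint added to the dictionary shrinks the list that constrain() returns. Note that nothing here says how a surviving candidate is designed internally; that part is left to the teams.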

With an unconstrained solution set, emergent design is hard to do as there are too many options and it is too hard to coordinate multiple teams. With unconstrained design you end up with bad design. Assumptions are made that are correct in the short term but lead to major design changes in the long term. The once clean code becomes more and more unstructured as change after change is needed and you end up with an unmaintainable mess. The role of the architect is to allow good, emergent design to happen by constraining the solution set into something smaller and more manageable. Architecture sets boundaries and guidelines for the teams to operate within.

Not all these decisions need to be made at once. Architecture doesn't have to be done all up front. Make the big decisions (the ones with major impacts if you change them later) early in the project. Other decisions can be made later. Most decisions can be made when the need arises. This is the lean principle of delaying decisions until the last responsible moment. Architectural decisions are great examples of decisions that should be made this way. Don't make the teams wait until all the decisions are made. Make a few key ones up front, enough to get things started, then make the rest of the decisions when the team needs guidance in that area.

The classic example of delayed decision making is Toyota's process for making the steel dies for car doors. These are huge blocks of specially cut steel that have very long lead times. So rather than do a fully detailed design up front and then order the steel, they make a few key decisions - it's going to be a mid-sized car so the doors will be below this size, it's a 4 door car so we will need 4 dies. That's enough to get the steel blocks ordered. Other design decisions can be made during the long delivery lead time. That saves them months in the design process because they can start earlier. Even when they start cutting the steel into the dies, they don't have a full design. They start with the big cuts first - the overall shape of the doors - then fill in the details (the exact profile of the wing mirror mount) later.

Software architecture should work the same way. Make the big decisions up front so the teams can get started, then make the next set of decisions once the solution starts to emerge and fill in the fine details when the teams get to them.

A famous French author and aviator, the amazingly named Antoine Marie Jean-Baptiste Roger, Comte de Saint-Exupéry, said -

"Perfection in design is achieved not when there is nothing more to add, but when there is nothing more to take away".

He was writing about simplicity and elegance in aircraft design back in the 1930s (the line comes from his 1939 memoir Wind, Sand and Stars) but he could just as well have been talking about the process of designing software. This simplicity and elegance in design is something that we all strive for. One of the core agile principles is -

"Simplicity - the art of maximising the amount of work not done"

Having a good, well defined, well constrained solution set is key to allowing that sort of elegant simplicity to emerge. Too loosely constrained and the design becomes messy and complex, too tightly constrained (like handing a team a complete design) and you lock yourself into a single, probably sub-optimal, solution.

This is the real art of software architecture: constraining the solution set enough to allow elegant simplicity to flourish, but not so much that you kill creativity.