Maintaining software consistency is a sticky task: it’s only indirectly tied to the dollar value of work, and the side effects of neglecting it are slow and sometimes hard to see. In a lot of ways, the reasons it is difficult are the same reasons tech debt itself is difficult.
There are a ton of ways to attack the problem, and not much of it is canonical. What I hope to do here is provide a very narrow frame for tackling the issue: consistency is a byproduct of two functions:
A.) Mechanisms by which patterns and practices are propagated,
B.) Mechanisms by which they are guided.
For this piece, I’m defining consistency as the reduction in variance for patterns and practices created by multiple teams towards the same value stream. If Team A uses, say, Kubernetes to deploy their infrastructure, it’s preferable that Team B does as well. If Team C uses, say, Redux to manage their state, the business is better served if Team D does too.
There are a ton of reasons why consistency matters, but a lot of them can be chalked up to the importance of making sure things survive the short life of organizational memory:
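To make “reduction in variance” a little more concrete, here is a minimal sketch of one way it could be measured: for each concern, the share of teams that use the most common tool. The team data, the extra tool names, and the `agreement_by_concern` helper are all made up for illustration; this is not a standard metric.

```python
from collections import Counter

# Hypothetical survey data: which tool each team uses for each concern.
team_choices = {
    "Team A": {"deployment": "Kubernetes", "state": "Redux"},
    "Team B": {"deployment": "Kubernetes", "state": "Redux"},
    "Team C": {"deployment": "Kubernetes", "state": "Redux"},
    "Team D": {"deployment": "Nomad", "state": "MobX"},
}


def agreement_by_concern(choices):
    """For each concern, return the share of teams using the most common tool."""
    tools_per_concern = {}
    for picks in choices.values():
        for concern, tool in picks.items():
            tools_per_concern.setdefault(concern, []).append(tool)
    return {
        concern: Counter(tools).most_common(1)[0][1] / len(tools)
        for concern, tools in tools_per_concern.items()
    }


print(agreement_by_concern(team_choices))
# → {'deployment': 0.75, 'state': 0.75}
```

A score of 1.0 would mean every team made the same choice for that concern; lower scores mean more variance to carry.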
- Having common patterns means that more people are able to support a system,
- Which puts the system in a more maintainable state should staffing changes occur,
- …but also means that improvements made to one system can be easily transferred to another — making global improvements cheap.
Why is consistency hard?
In the world of let’s-just-deliver-value-to-customers-now, consistency work often takes a back seat to making feature sets available. Why? Well, it’s complicated:
Immediacy of impact. There are different lifecycles for the value created by feature work vs. consistency work: a feature may start paying off as soon as it ships, whereas the negative impact of a tactical let’s-just-do-it-this-way-for-now solution may not rear its head until months (perhaps years) later.
Visibility (and attribution). Issues caused by a lack of consistency often surface stripped of their attribution: in the name of timely issue response, messages get truncated. A team may say:
- “Service X is down because System Y fell over”
And not: “Service X is down because System Y fell over because we didn’t prioritize the tech debt items 3 months ago.”
And maybe also not: “Service X is down because System Y fell over because we didn’t prioritize the tech debt items 3 months ago, because our sizing exercises placed it at 1 month because the only person who understood how it was working left six months ago.”
Personal vs. Organizational memory. Organizations as a whole have a memory of which of the solutions they tried actually worked, but people coming into a system also bring solutions that have worked for them in other contexts. Those solutions are not necessarily (or often) incorrect, but it takes tremendous effort to recontextualize organizational memory against a person’s own biases and interpretations: “Why would I do it that way? This way has worked for me so many times in the past.” or “When I did it at Company Y it worked really well.”
Of the three, I’m going to focus on the last one, as the first two (immediacy and attribution) speak to the problem of consistency after the fact.
How practices move through an org.
Without architecture or technical leadership groups, pattern propagation looks like it moves a bit like this:
The view I prefer is a bit more zoomed in towards the people: people have different toolkits in their head. As people move through organizations, they will index towards certain toolkits more than others — perhaps because of personal experience, perhaps because of the opinion of their immediate peers, perhaps because of an authoritative voice they trust.
In the end, it’s all just bias, something we can all fall victim to.
An exercise at a previous workplace sharpened this further: we tried to work out why people didn’t stick to an organization-sponsored pattern. In the end, the reasons fell into three categories:
A.) There were people who didn’t know what the patterns were, or that they existed in the first place.
B.) There were people who didn’t know why the pattern existed, or how important it was to stick to it.
C.) There were people who didn’t agree with the pattern and chose not to follow it.
From these three answers, we can derive two concerns:
- Making sure people understood what the patterns were and why they were important (Propagation, for categories A and B)
- Making sure people stuck to the pattern (Guidance, for category C)
The task of propagation is basically making sure that everybody coming into a system has the same block of information in their head, which gives people the choice to use the patterns endorsed by an organization. There are a couple of mechanisms that can be used:
In smaller organizations, it might be preferable to have someone embed and actively talk about what the patterns are and why they’re important. Empathy is a superconductor of information; having someone actively there to be the go-to person and actively answer questions provides the least friction. Expect a lot of whiteboarding.
As the project or organization scales, the amount of effort required for information transfer sharply increases. When this happens, it might be helpful to have passively-consumable sources of repeated information (documentation, implementation examples, etc.). There are two things to keep in mind:
- Information is way easier to transfer than context. What to do is easier to relay than Why we do it. While you can choose to document, say, criteria for selection and architectural choices, you will more likely need to reiterate these than if you were just documenting implementation details.
- You should be selective about the information you’re propagating. People retain less the more they need to absorb. Focus on recurring questions and pain points, instead of documenting the world.
Lastly, it’s important that you make sure that every person coming into a system of work has the information in their head. An onboarding process that is regularly inspected and improved upon makes sure that people come into a system with the least amount of friction.
This sounds like a lot of effort, I know — but I also believe that people will always do their best within the bounds of what they know, understand, and have access to. Working on knowledge systems makes sure that everybody has everything they need to be successful.
But sometimes, people just disagree. That’s okay. If knowledge propagation systems are well set up, this usually amounts to a minority. However, just as you can architect systems that make sure people have the right knowledge going into a project (input), you also need to put systems in place to inspect the output.
Software systems can be set up in a way that guides people towards consistency. Maybe you can have convention tests that check for the structure of code; maybe your infrastructure definitions come with safe defaults.
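As a rough idea of what a convention test could look like, here is a minimal sketch that uses Python’s standard `ast` module to flag imports an organization has agreed to avoid. The `DISALLOWED_IMPORTS` list and the function names are hypothetical; a real check would encode whatever conventions your own teams have settled on.

```python
import ast

# Hypothetical policy: modules the organization has agreed not to import directly.
DISALLOWED_IMPORTS = {"pickle", "telnetlib"}


def find_disallowed_imports(source: str) -> list[str]:
    """Return the disallowed top-level modules imported by a piece of Python source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports such as "from . import x".
            names = [node.module or ""]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level in DISALLOWED_IMPORTS:
                found.add(top_level)
    return sorted(found)


def test_no_disallowed_imports():
    # In a real suite this would loop over every source file in the repository.
    sample = "import json\nfrom os import path\n"
    assert find_disallowed_imports(sample) == []
```

Because the check only parses the source and never runs it, it is cheap enough to sit alongside the ordinary unit tests.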
I tend to index towards prescription driven by tooling, as this removes the human element. It also puts your criteria for “good software” in code, where they can be inspected and argued over a bit more cleanly.
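One way to picture safe defaults living in code is the sketch below. The `ServiceDefinition` class, its fields, and the default values are all invented for illustration and don’t correspond to any particular infrastructure tool. The point is only that a team gets the endorsed settings without doing anything, and any deviation shows up as an explicit, reviewable override.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ServiceDefinition:
    """Hypothetical infrastructure definition; every field here is illustrative."""
    name: str
    replicas: int = 2                   # more than one instance unless overridden
    publicly_accessible: bool = False   # private unless a team opts in
    encrypt_at_rest: bool = True
    log_retention_days: int = 30


def overrides(service: ServiceDefinition) -> dict:
    """Return the settings where a service deviates from the defaults."""
    baseline = ServiceDefinition(name=service.name)
    return {
        f.name: getattr(service, f.name)
        for f in fields(service)
        if getattr(service, f.name) != getattr(baseline, f.name)
    }


web = ServiceDefinition(name="web", publicly_accessible=True)
print(overrides(web))
# → {'publicly_accessible': True}
```

A list of overrides like this gives reviewers something specific to discuss, instead of a general sense that a service “looks different”.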
Aaaand this is the yucky part — the place where the word governance rears its head. Here’s an important thing to take into account with autonomous teams: the bigger the effective radius of the concern, the less empathetic teams tend to be.
This is the area where rules need to exist, and need to be acted upon. You’re going to want people who are accountable for the outcome of consistency, and those people will need support. They’ll need priority, they’ll need time. But any open system needs ‘gardeners’ at the end to verify the quality of output.
A note about tech leadership groups: it’s critical that a portion (if not a majority) of your pattern decision groups be composed of people embedded within teams, but not as critical executors of work. Too far from developers and you risk having the decision groups be seen as an “ivory tower”; too close, and it gets hard to see the big picture.
In a nutshell, the two vectors of approach (propagation and guidance) are basically akin to seeding and gardening, respectively. We’re making sure that everyone is equipped to develop along the patterns that an organization endorses, while also making sure that the output is up to par.
We do this to make sure that adopting the right things is as easy for people as possible. We do this so that people are equipped with everything that they need to execute well.
Note that this is only a slice of the many issues around practice consistency and tech debt. There are a ton of other possible problems and questions (Shouldn’t business leaders understand the value of plumbing? What does moving technical standards forward look like?) but that, my friend, is another story.