Building Solution Architecture Teams - A Software Approach

What actually is Solution Architecture? How do you build and manage an Architecture team? Can you take a software approach to that?

Solution Architecture - Some Background

Much of what I have done these past ten years is a mix of Software Engineering and Cloud System Engineering: in other words, Solution Architecture (SA).

Today’s cloud software is based on the concepts of distributed systems: smaller, atomic units that communicate with each other over the network. They either accept data as an input and return some answer, or accept a request for data that returns some answer. What happens in the middle is either some process on the data or some process on stored data. That’s a vast oversimplification, of course. But it’s directionally accurate.

We often call the software modules “micro-services” and a working solution to a problem is a collection of such services. You can be a pure Software Engineer and only work on one small service, but if you are more senior you are almost certainly designing how a set of services interacts. One of the goals of micro-services architecture is to decompose a large problem into smaller problems, each solved with a service. The Engineers that do that work are doing the job of a Solutions Architect. Now, that can be at the platform/infrastructure level, the application level, or both. You might be creating scripts to build out your whole infra-environment, or writing system code to process data, or GUI code that actually runs in a user’s browser. But it’s all basically nodes in a system that “talk” to other nodes to do stuff. All of it requires software development. And I have strong opinions on that. But it also requires System Design.

In a nutshell, Solution Architecture is the intersection of Systems and Software Engineering. A given SA may be skewed more towards Systems Engineering (perhaps more customer-engaged) or more towards Software Engineering (more of a coder), but the role of Solution Architecture lives in the common area of that Venn diagram.

It’s well worth pointing out that good Solutions Architecture is good Software Design. Think SOLID in everything you do. The difference between a system that is highly reliable, secure, and operable and one that is breaking, getting hacked, and making someone on your Ops team quit after each outage is whether you have good architecture. And in today’s world, it’s all software.

Now, I could opine on that topic but I won’t. AWS has written the book on that: the AWS Well-Architected Framework. Go read it. Seriously. If you are building software in the cloud you NEED to know this stuff.

Solution Architects not only need to know it, they need to be able to explain it. This means becoming good at presentations, teaching, explaining, and even showing others how to do it. There’s a people-facing part to the job that is a whole new dimension to the computer-facing part of the job that most Engineers are already comfortable with.

Climbing Up 5000 Feet

But this post is not about Solutions Architecture, per se. It’s about building and managing TEAMS of SA. One can recruit and hire one team fairly easily. If you are staying true to the “two-pizza team” notion then you probably are looking at 7-9 folks. Given money, reputation and access to the right people-networks you can hire that many Engineers and seed a fantastic team. But over time attrition (good and bad) will leak folks away. More on that later. And what if you don’t have the money to woo the top talent to your team? What if you need more than one team? What if you are scaling your systems so fast that you know you are going to need to scale more teams over time? There’s no “rice-a-roni” or “hamburger helper” for that meal. You need a real solution.

This calls for a process - something repeatable. Measurable. Replicable. Something that does not “rely on the heroics of individuals” but instead can be manufactured. You want a factory that turns out great Solution Architects.

Someone once told me that the role of management is to get things to a place where you can run the company with only mediocre staff. I don’t agree. I still think you should hire and develop the best talent you can find. But the idea behind that statement is solid. Don’t base your success on whether you are lucky enough to hire all 10x Engineers. We could argue whether there is such a thing as a 10x Engineer. I can certainly argue that there is absolutely such a thing as a 0.1x Engineer. You can also have fractured processes that can cripple even the best teams.

You do need good Engineering Management. I have written about that already. You also need solid software Development practices (code reviews, automated testing, Cloudy DevOps, etc.). That’s all basic block and tackle for the task. But what you need is a process to find motivated, fundamentally strong Engineers with the right attitude - and then you need a process to mold them, over time, into what you need.

Just like building a Leadership Pipeline for your management teams, you need to think about building a Solution Architect Pipeline.

Taking a Software Approach

Data pipelines are a standard solution pattern for some kinds of problems. Our need for SA talent maps to this pattern pretty well, if you think about it in software terms. You need to recruit, train on your solution, and then mentor and grow Engineers. Yes, they are people. But if you think about this as a process you can apply Engineering thinking to the solution.

If you read my posts you know that I’m generally in favor of taking a software approach to solving problems - the so-called SRE solution. This means THINKING like a Software Engineer at a very minimum. If we could write code to automate all the work that Solution Architects do that would be great! Wait… to some extent that is already happening, with things like the AWS Well-Architected Tool. ML/AI is going to have a huge impact on Solution Architecture, even sooner than you think. But in the meantime, we need to build teams of Solution Architects and a pipeline approach seems to be appropriate. Here’s how I would lay out the pipeline: Recruit, then Train, then Mature, then Promote.

I think about this as if each team were a kanban board with those four columns, and each team member is a row. In this model, the “work items” are actually the SA Engineers themselves. The columns are defined in reverse order:

Promote

This term is the one I still struggle with. I’m not sure “promote” is the right word for this, but I cannot think of a better one yet. These Engineers have been “in the trenches” on the team “maturing” in their abilities. They have extensive hands-on experience and are seen as the technical leaders. Other SA seek them out. They probably are conducting training, both internally and externally. They are inventing solutions to problems, developing advanced new software, perhaps even getting patents. In the guild days these would be the Masters. They are the rock-stars that can be parachuted into a mystery problem and they solve it quickly. They are the dragon-slayers. Teams should aspire to have one or two, but not count on them. Their next step is to move on to a larger challenge. Often these kinds of folks form the nucleus of a new product team or move into some other role where they can keep learning and growing. You must assume that they will rotate away at some point (even as you selfishly hope they never do). They need to, for their own growth. But these are the folks you are actively “promoting” to customers, other teams, and internally as the expert. And yes, it often follows that these are the folks that get promoted to ranks like “Principal Engineer.”

Mature

This is where most of a team will be if it’s firing on all cylinders. Not all the team will be at the same level of maturity, but perhaps 60% of your team should be maturing. These folks are well trained, know their jobs, and execute. In the guild days they would be the journeymen/women. SA may stay in this zone for quite some time, depending. This is the wheelhouse. The folks in this zone are delivering value, and it’s the manager’s job to ensure that they are mentored, coached and constantly growing. A smart manager will use their “Promote” folks to help here.

Train

My observation is that a newly hired Engineer - no matter how strong - takes a few months on average before they are actually delivering value. I call this the “training zone” for lack of a better term. This may be training on processes, on low-level details of the code base, on the tech-stack in use, the cloud provider details, or whatever. Every situation has its particulars. Sometimes you totally luck out and it takes a lot less time, but if you plan on it taking some time you are going to be a lot more predictable about how much value your team can deliver.

When writing software, if you sit down and code without thinking about the problem you can LOOK productive and be totally wasting your time. Being in the training zone is a lot like that. Unless you plan and structure your training process you will get ad-hoc results. Setting up training for your SA Engineers is worth a whole post on its own, but it needs to include a structured review of your architecture, the structure of the code, how to build and test as a developer, how your CI and CD works, and a review of your automated testing. This includes getting access to all the tools, and ensuring that everyone understands how to use the tools (and why).

Recruit

This is controversial. Some teams only recruit when they actually get an open position approved, but that makes you totally flat-footed when you start recruiting. In my experience, it takes two to four months on average from listing a position to someone’s first day (“butt in seat”, as I say). This can be longer if you are seeking a more specialized skill set or competition for that talent is high. This is also assuming that you can be competitive for talent in some way (great pay, benefits, work from home, reputation). If you can’t then it’s going to take you longer.

I prefer the “always be recruiting” model. This means that I am constantly looking, and I’m honest with everyone about my ability to make an offer. I’d rather go to my VP and beg for a req because I have someone amazing ready to change jobs than not even know that person was ever looking because I’m not paying attention. Yes, this takes time and energy. Yes, this means watching forums, GitHub projects, and listening to new speakers at conferences. You want to find the best? You have to look. You cannot assume that your recruiters will magically just find the top talent for you.
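To make the lead time concrete, here is a back-of-the-envelope sketch. The two to four months of hiring time comes from my estimate above; reading "a few months" of training as two to four months is an assumption for illustration, and the function name months_to_value is just something I made up. Plug in your own numbers.

```python
# Hiring lead time from the text: two to four months from listing to first day.
HIRE_MONTHS = (2, 4)
# Assumption for illustration: "a few months" of training read as 2 to 4 months.
TRAIN_MONTHS = (2, 4)


def months_to_value(hire=HIRE_MONTHS, train=TRAIN_MONTHS):
    """Return (best, worst) months from opening a req to delivering value."""
    best = hire[0] + train[0]
    worst = hire[1] + train[1]
    return best, worst


best, worst = months_to_value()
print(f"Req opened to delivering value: {best} to {worst} months")
```

Passing a shorter hire range is a rough way to see what always recruiting could buy you, since the candidate is already known when the req is approved.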

Overlap

The reality is that it’s not really a pipeline. It, too, is a Venn diagram:

You can recruit folks with more of the specific skills you need, reducing the needed training - perhaps Certified Solution Architect, or CKA. You can put folks in training doing real work, starting to mature. Your folks maturing are probably deepening in areas that you will want to promote them around. The pipeline idea is close, but reality is more fuzzy. It’s really a blend. We are talking about humans, after all.

Attrition

You will get attrition. Don’t fear it, embrace it. In Cloudy thinking you assume a VM or container can evaporate and you design around that. Treat attrition the same way. Plan to replace folks. This is not mean or inhuman. Of course you want to keep teams together as long as you can. Teams are composed of groups of people who trust each other. Trust takes time and work. Losing a team member can hurt. But it’s going to happen. You actually want your Promote folks to move on to more challenging projects for their own career development. But you also may lose folks in the Mature zone who get excited about a different technology or even leave for a different company. Too many managers assume they won’t lose staff. That’s just like assuming a server will never fail. And we know how that works out. Plan to lose people.

Using Kanban Thinking

A simple kanban board with these columns - which could be a spreadsheet, actually - lets you see where you are at a glance.

This is a well-balanced team. You have a rock star being groomed for the next big adventure, a solid team, some folks in training and some folks you have your eyes on for recruiting. This is something close to what you aspire to.

But if you inherited a team, or are forming a new team by seeding it with a few strong, mature Engineers then it might look like this:

This team is relying on three folks to carry the load, and you have four folks in training. You don’t have a rock star to lean on, so the manager is hopefully jumping in to help. But he or she really should be making sure the four in training are progressing along nicely. This is a team that may not be struggling, but they have some work to do in order to be a high performance team.
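If you would rather keep the board in code than in a spreadsheet, a minimal sketch might look like the following. The names Stage, Engineer, stage_counts and mature_share are my own inventions for illustration, the engineer names are placeholders, and leaving recruits out of the Mature share is an assumption (they are not on the team yet). The sample data is the seeded team just described: three mature, four in training, no one in Promote.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    # Declared in pipeline order, left to right on the board.
    RECRUIT = "Recruit"
    TRAIN = "Train"
    MATURE = "Mature"
    PROMOTE = "Promote"


@dataclass(frozen=True)
class Engineer:
    name: str  # placeholder names, not real people
    stage: Stage


def stage_counts(team):
    """Count engineers in each column, in pipeline order."""
    counts = Counter(e.stage for e in team)
    return {stage: counts.get(stage, 0) for stage in Stage}


def mature_share(team):
    """Fraction of the current team (recruits excluded) in the Mature column."""
    on_team = [e for e in team if e.stage is not Stage.RECRUIT]
    if not on_team:
        return 0.0
    return sum(e.stage is Stage.MATURE for e in on_team) / len(on_team)


seeded_team = (
    [Engineer(f"mature-{i}", Stage.MATURE) for i in range(3)]
    + [Engineer(f"trainee-{i}", Stage.TRAIN) for i in range(4)]
)

for stage, count in stage_counts(seeded_team).items():
    print(f"{stage.value:8} {count}")
print(f"Mature share: {mature_share(seeded_team):.0%}")
```

For this team the Mature share works out to three of seven, about 43%, which is short of the rough 60% I suggested earlier.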

But What About Multiple Teams? How to Scale?

Anyone can manage one team. How do you do multiple teams? How do you even know where you are? You still use kanban. Consider this:

The Orange team has a minor problem: they have four SA training and only three to carry the workload. This team could be risking burnout or potentially under-delivering to the customer. There’s hope on the horizon though, since there are four SA training. The Green team might be the highest performing team of the bunch with two SA in the Promote region. That’s a heavy hitting team. But there’s a potential risk lurking: there’s no one training yet. Good news is that there’s a pipeline of eight Engineers being recruited, so there’s a path to healthy. The easiest risk to miss is on the Yellow team. They look fine, at a glance, but there’s no one in the recruiting pipeline.
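The same reading of the boards can be sketched as a few simple checks. This is only an illustration: the counts marked "assumed" are placeholders I filled in where the description above gives no number, and the three rules in the hypothetical risks function are just the three risks called out above, not a complete health model.

```python
# Column counts per team. Figures given in the text are used as-is; counts
# marked "assumed" are placeholders because the text gives no number.
boards = {
    # recruit and promote assumed
    "Orange": {"recruit": 2, "train": 4, "mature": 3, "promote": 0},
    # mature assumed
    "Green": {"recruit": 8, "train": 0, "mature": 4, "promote": 2},
    # train, mature and promote assumed
    "Yellow": {"recruit": 0, "train": 2, "mature": 4, "promote": 1},
}


def risks(board):
    """Return human-readable warnings for one team's board."""
    found = []
    if board["train"] > board["mature"] + board["promote"]:
        found.append("more people training than carrying the workload")
    if board["train"] == 0:
        found.append("no one in training")
    if board["recruit"] == 0:
        found.append("empty recruiting pipeline")
    return found


for team, board in boards.items():
    print(f"{team}: {'; '.join(risks(board)) or 'no flags'}")
```

The point of writing the rules down is that the Yellow team's empty recruiting column gets flagged even though the rest of its board looks healthy.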

Conclusion (For Now)

There is no right answer: only data and the actions you take as a result of the data. This approach gives you some of that data.

Each of the zones deserves its own deep dive. I’ll do those in follow-on posts. But thinking of the problem in software terms, specifically using kanban (which is not just for software, I know), can really help you structure your thinking about this problem.

The astute reader might point out that this approach could work for general software development teams too. In fact, if you add a Z-axis for skill set mapping you’ve basically got my approach to staffing. That’s fodder for a whole series of additional posts!

Finally, this post assumes a little bit that your problem space can be solved by adding Solution Architects. At some point of scaling that does not hold true. Better approaches are needed, which will include automation - and maybe some of that ML/AI that I alluded to in my introduction. The future will be full of innovation and advancement. It’s not going to get boring any time soon.
