“A cloud is made of billows upon billows upon billows that look like clouds.”
-- Benoit Mandelbrot
The Scaled Agile Framework provides a model for large enterprises to harness the power of an individual agile team and replicate its key benefits at higher levels of abstraction (such as Program and Portfolio). This process of repeatedly replicating a certain set of properties from a lower level to a higher one is a defining characteristic of fractals: complex objects that look similar at different scales. This fractal perspective will help us identify the fundamental causes of delivery problems within an organization and ways to address them.
Fractal Pattern: Value, Timebox and Self-organized Team
If we take a quick look at the Big Picture of the agile enterprise as suggested by the Scaled Agile Framework (hereafter referred to as SAFe), we will see that all three levels (Team, Program and Portfolio) have a few things in common:
Value. All three deliver value. Agile teams (which we also call Define-Build-Test teams, or DBT-teams for short) deliver user stories; programs (Agile Release Trains, or ARTs) deliver features; and the portfolio level delivers epics. Unlike in pre-agile times, every unit pushes value to the next level of abstraction. It is important to emphasize that such a focus on value implies WIP limits at all three levels of abstraction, which protects an organization from the trap of using unfinished work as a measure of progress.
Timebox. Agile teams stay on a synchronized sprint schedule, ARTs stay on a firm PSI (Potentially Shippable Increment) schedule, and the portfolio level delivers epics within a larger timebox, typically a horizon of 2-3 PSIs.
Self-organized Team. Agile teams self-organize around the delivery of stories; ARTs self-organize to land the features that are in the scope of the PSI; the Program Portfolio Management Team, Enterprise Architects and Business Epic Owners self-organize around facilitating the delivery of business epics.
The three components described above constitute the fractal pattern of an agile organization (see Figure 1).
Figure 1. Agile organization as a fractal structure.
In the same way that fractals are used in various practical domains to expose a simple pattern underlying a complex system, a fractal view helps us tame the complexity of an enterprise in a systematic way.
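To make the repeating pattern concrete, here is a minimal sketch that models each level as the same three-part record (value, timebox, self-organized team). The class and field names are illustrative assumptions, not part of SAFe; only the values filled in come from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Level:
    """One scale of the fractal. Field names are hypothetical."""
    name: str
    value_unit: str          # what this level delivers
    timebox: str             # the cadence it delivers on
    team: str                # who self-organizes around delivery
    parent: Optional["Level"] = None  # the next level of abstraction


portfolio = Level(
    "Portfolio", "epic", "2-3 PSIs",
    "Program Portfolio Management Team, Enterprise Architects "
    "and Business Epic Owners",
)
program = Level("Program", "feature", "PSI", "Agile Release Train",
                parent=portfolio)
team = Level("Team", "user story", "sprint", "agile team", parent=program)


def chain(level: Optional[Level]) -> List[str]:
    """Walk upward from a level, collecting the names of all scales."""
    names = []
    while level is not None:
        names.append(level.name)
        level = level.parent
    return names


if __name__ == "__main__":
    print(" -> ".join(chain(team)))
```

Every level is an instance of the same structure, which is all the "fractal" claim amounts to in this sketch: value flows from each level to its parent.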
The "Ground Level" of Scaling
There is no feasible way to reflect all the possible aspects of a complex system in a single plain view: you can't describe all the nuances of human physiology or the behavior of a stock market in just one view. Thus, every time we build such a view, we have to neglect certain aspects. Such is the case with the agile enterprise and its representation as the SAFe Big Picture. One of the aspects that is not reflected in the (already crowded) Big Picture is what I call the zero level, or ground level. This level will play an important role in our journey to the roots of the value delivery issues in an enterprise. So let's see what that level is...
We define it as a DBT-formation (see my previous post: DBT Framework for Story Implementation): a temporary group of a few members of an agile team who self-organize around the delivery of a specific user story. Typically it consists of 1-2 developers, a tester, and the PO, who is shared across all DBT-formations within the same team. After finishing a user story, a DBT-formation may dissolve (and typically does). The unit of value it delivers is a vertical slice of a user story, the result of running one DBT-cycle. Such cycles typically take no more than 1-3 days to complete. DBT-formations thus fully satisfy the fractal pattern and can be considered the starting point of the enterprise fractal structure (Figure 2).
Figure 2. DBT-formation is the ground level of scaling.
This is the scale of the fractal where the actual work is being done. As we will see in the next section, the "strength" of these units is extremely important for the entire organization.
Absorbing the Vibration
We are now going to consider a typical problem in any development at scale: the organization fails to deliver according to its commitments. The typical symptom is either a failure to deliver at all (an extreme case, but it still happens) or a failure to achieve some of the priorities, which most readers have surely witnessed many times in their tech careers. So how does this problem occur? The fractal structure gives us a good hint: if we fail to deliver an epic, we can trace the failure all the way down to the minimum quantum of value, which in our case is a vertical slice of a user story that was not delivered as expected. It is important to describe in more detail what "not delivered as expected" really means. Some typical problems are as follows:
• The slice/story/feature/epic is not delivered at all
• It is delivered but it's not what was intended by the PO
• It is delivered but causes further logical or physical integration problems with other pieces
In any of these cases, we would like to see what impact such a disorder, which we will call a vibration, can have on the entire system.
Figure 3. Vibration in the system.
First of all, if the vibration occurs within a DBT-formation that is currently working on a lower-priority item, the impact is clearly manageable: we can always de-scope the item without materially compromising the final business value. In the opposite case, however, the impact can be huge: it will propagate all the way to the top level of our fractal organization and result in a large-scale delivery issue (Figure 3).
The following example shows how easily this can happen. A DBT-formation fails to deliver the first slice of a user login story using SSO, which other team members and other agile teams depend on. All the members of this team now begin to help the DBT-formation finish its story on time. They only partially succeed: they deliver just the first vertical slice and postpone their own work entirely. This delays the other agile teams in the program and forces them to re-organize their own work. That in itself is all right, but they have already lost time and cannot deliver all the functionality that works under SSO, because they received the first story too late in the PSI. As a result, the program falls short on a few of its priorities and considerably slows down the entire SSO epic, which cuts across all programs in the program portfolio...
Now let's see what a good scenario would look like. To do so, let's analyze the different scales of the fractal with regard to their capability of absorbing the vibration, starting from the top.
Portfolio Level. If Program X in our portfolio underdelivers its part of an epic, it is highly unlikely that Program Y, which delivers the other part, could somehow absorb the vibration. The explanation is quite natural: ARTs have their own, fairly isolated concerns (products and solutions), and other ARTs can't help with them because they don't know the code base.
Program Level. If agile team X causes a vibration within a program, it is a little more likely that one or more other teams could absorb the shock. A program composed mostly of feature teams has a particularly good chance, because people on those teams quite possibly know the code base of concern well enough to help. But let's keep in mind that this is not just about being able to help in principle, but about being able to help and still deliver your own priorities; otherwise the vibration will simply be propagated further in the fractal, both horizontally and vertically.
Team Level. This is the level most likely to absorb the vibration and prevent it from extending outside the team boundaries. Indeed, most good agile teams benefit from collective code ownership, and thus any other DBT-formation could help.
The first important implication of this analysis is the following:
The ability to absorb vibration does not scale. It is highest at the team level and lowest at the portfolio level.
Figure 4. The ability to absorb the vibration at different levels of an enterprise.
The simple takeaway is to try to identify the vibration as early as possible and absorb it at the team level, thereby preventing its further propagation and a large-scale failure (Figure 4).
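The reasoning above can be illustrated with a toy model. This is only a sketch under stated assumptions: the vibration is treated as a single number (say, days of slippage), each level has a spare capacity to absorb it, and whatever a level cannot absorb moves up to the next one. The capacity figures are invented for illustration; the only thing taken from the analysis is their ordering (team highest, portfolio lowest).

```python
from typing import Dict, List, Tuple

# Hypothetical absorption capacities, ordered from the bottom of the
# fractal to the top. The numbers are assumptions, not measurements.
LEVELS: List[Tuple[str, float]] = [
    ("Team", 5.0),
    ("Program", 2.0),
    ("Portfolio", 0.5),
]


def propagate(
    vibration: float,
    levels: List[Tuple[str, float]] = LEVELS,
) -> Tuple[Dict[str, float], float]:
    """Push a vibration upward through the levels.

    Returns how much each level absorbed and the residual that no level
    could absorb (that is, the large-scale delivery failure).
    """
    absorbed: Dict[str, float] = {}
    remaining = vibration
    for name, capacity in levels:
        taken = min(remaining, capacity)
        absorbed[name] = taken
        remaining -= taken
    return absorbed, remaining


if __name__ == "__main__":
    for size in (4.0, 9.0):
        absorbed, residual = propagate(size)
        print(f"vibration={size}: absorbed={absorbed}, residual={residual}")
```

In this model a small vibration never leaves the team, while a large one passes through levels that can do progressively less about it. That is why early detection at the team level matters.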
Below we provide a list of practices and conditions that contribute to the enterprise’s ability to absorb the vibration and to facilitate reliable delivery:
• The practical ability to de-scope but preserve the essential value (at all levels, including the ground level). Each time a problem occurs, the unit can prevent its further propagation by delivering the key priorities and de-scoping the low-priority work. This applies to both external and internal “consumers” of the functionality. So, just as a feature can be de-scoped but still effectively fulfill the user need, a user story or its tiny slice can be de-scoped to satisfy another team’s dependency and keep the Agile Release Train on track.
• Collective code ownership at the team level and basic elements of it at the program level. Peers are the first to absorb the vibration, but they can only do so when they are familiar with the code where the problem occurred. It is impossible for everyone to know everything in a large organization, but it is entirely feasible to achieve 100% collective ownership at the agile team level and partial collective ownership at the program level. The former is achieved through collective work on user stories (where having more than one developer work on a story is a team ground rule), while the latter is achieved by periodically taking part of another team’s scope into your team’s backlog whenever it is beneficial for the program.
• Feature teams rather than component teams. Collective code ownership at the portfolio level is close to 0% because each agile program deals with its own products; this is natural and expected. What is neither natural nor expected is the same kind of isolation between agile teams within the same program. Yet this is exactly the case for programs composed of component-oriented teams, as each team only knows its own component. Feature teams, by contrast, imply a much broader outlook for each team and foster a fair degree of collective code ownership at the program level.
• Smaller user stories (good splitting skills). Everything happens faster with smaller user stories, including the detection of vibration by peers and by the next immediate level.
• Multiple DBT cycles for each user story involving the PO. A user story is still too big a container for the team to be able to react instantly to unexpected turns. Vertical slices of a story provide a much better chance of a quick, low-impact corrective action.
• Effective continuous integration at team and program levels. CI prevents the false sense of progress that a team or program may fall victim to when a slice of work is finished but not fully integrated.
• Actively functioning Communities of Practice (CoPs) in different aspects of engineering. CoPs essentially support the necessary learning process that lays the foundation for collective code ownership, shared practices, and solution domain knowledge.
Large-scale software development is always hard, not least because a large organization is a very complex system. Scaling agility to the enterprise level by using SAFe effectively creates a simple pattern (Value / Timebox / Self-organized Team), which allows us to look at the organization as a fractal. This view shows that some important characteristics that influence the success of delivery do not scale across the levels of the fractal. It follows that the organization has to develop its ability to absorb the vibration (delivery failure) at the lowest possible levels.