Chapter 7
IN THIS CHAPTER
Organizing your people into squads, tribes, chapters, and guilds
Getting friendly with failure
Considering two approaches to product development
Figuring out whether the Spotify approach is best for you
Spotify is a digital streaming service that gives its subscribers access to a huge selection of music, podcasts, and videos via their smartphones and other electronic devices. The company developed its own approach to enterprise agility that borrows from numerous agile methodologies and practices, including Agile, Scrum, Lean Software Development, and Kanban. Spotify refers to its approach as the “Spotify Engineering Culture.”
Like other enterprise agile frameworks, Spotify’s approach is centered on self-organizing, cross-functional teams (called squads) collaborating to deliver value to customers. However, it’s less structured than most frameworks, such as Scaled Agile Framework® (SAFe®), Large-Scale Scrum (LeSS), and Disciplined Agile Delivery (DAD). In fact, it’s kind of messy. It may just be the most adaptive of all the enterprise agile frameworks.
In this chapter, I bring you up to speed on the Spotify Engineering Culture so you can determine whether it is the framework you want to use in your organization. If it is, you’ll gather enough information and insight along the way to start moving your organization in that direction.
In the Spotify Engineering Culture, employees organize into four different group types, as shown in Figure 7-1:
In the following sections, I describe all of these groups and their purpose and function in greater detail, and I offer guidance on how to get the most out of each.
In Spotify, the smallest functioning unit is a squad, which has the following characteristics:
Spotify doesn’t endorse any one type of agile team, nor does it prescribe a methodology. Squads are similar to feature teams in LeSS, in that they focus on a feature in a product. (See Chapter 5 for more about LeSS.) The differences are that Spotify teams are generally smaller, and they’re not required to use Scrum. Squads can use Scrum, Kanban, Lean Software Development, Extreme Programming, or any other approach, or they can mix and match agile team models or create their own model. Whatever works.
The only limit on squads is that they must align their activities with the squad’s mission, the company’s product strategy, and the squad’s short-term goals, which are reviewed monthly.
Not all squads work on product delivery. Squads may work in operations, on infrastructure, or in other functional areas of the company. Whatever the squad’s mission, it is expected to communicate and collaborate with other squads as needed. Spotify encourages a “we’re all in this together” culture. Although squads are generally self-sufficient, they’re not silos.
Although squads are autonomous, you don’t want every squad going off in its own direction. Spotify prevents the fall into chaos by requiring that teams be aligned with product strategy, company priorities, and other squads. As Spotify puts it, each squad is expected to “be a good citizen in the Spotify ecosystem.” The company’s mission is more important than that of any of the squads. Ideally, squads are autonomous but aligned.
Spotify sees autonomy and alignment as two different dimensions (see Figure 7-2). It charts the two with autonomy along the x-axis and alignment along the y-axis to create the following four quadrants:
At the squad level, alignment is primarily the responsibility of the system owner, who keeps the squad aligned with its mission and creates and maintains a prioritized list of work items to guide the team’s work. She also communicates with system owners in other squads to help coordinate the work and manage any dependencies. (For more about the system owner role, see the later section “Having a system owner.”)
Instead of mandating that squads adopt certain processes and practices, Spotify gives squads freedom to use whatever processes and practices work best for them. Instead of setting standards, it relies on squads to cross-pollinate by learning from one another. When enough squads are following a certain process, engaging in a certain practice, or using a certain tool and having great success with it, word spreads through the community. Squads start supporting it and helping one another use it. Over time, it may become a de facto standard. Cross-pollination gives squads flexibility, but it also promotes consistency across squads.
One way Spotify supported squad autonomy as it was scaling Scrum was to change its architecture to decouple its systems. Its product consists of over 100 interacting systems, with each system dedicated to one specific need, such as playlist management or search. Each squad “owns” one or more of these systems, so each squad can work on its system(s) independently. Internally, Spotify’s software development follows an open source model, which promotes squad innovation and sharing among squads.
Prior to changing its architecture, Spotify was a single application. The company noticed that teams were spending a lot of time and effort synchronizing for each new release. Instead of creating numerous processes and rules to govern synchronization, Spotify changed the architecture to create a product that functions more like a website with different frames. Each squad can update its frame with little or no effect on the other frames.
If a squad needs something done to a system it doesn’t own, it asks the squad that owns it to do the work. If that squad is too busy to do it, however, the squad that needs it done is free to edit that system. The squad that owns the system must then review the changes. This approach reduces wait times, increases quality, and spreads knowledge. Minimal standards are in place to improve efficiency (reduce friction).
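The ownership rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this section, not Spotify’s actual tooling. The class names, squad names, and the `owner_is_busy` flag are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeRequest:
    """A proposed change to a system, tracked until the owning squad reviews it."""
    system: str
    requested_by: str
    implemented_by: str
    reviewed_by_owner: bool = False


@dataclass
class SystemRegistry:
    """Maps each system to the squad that owns it (all names are illustrative)."""
    owners: dict = field(default_factory=dict)

    def request_change(self, system, requester, owner_is_busy):
        owner = self.owners[system]
        # Default path: the owning squad does the work itself.
        # Fallback path: if the owner is too busy, the requesting squad edits the system.
        implementer = requester if owner_is_busy else owner
        return ChangeRequest(system, requester, implementer)

    def review(self, change, reviewer):
        # Only the squad that owns the system can sign off on the change.
        if reviewer != self.owners[change.system]:
            raise PermissionError(f"{reviewer} does not own {change.system}")
        change.reviewed_by_owner = True


registry = SystemRegistry(owners={"search": "squad-a", "playlists": "squad-b"})

# squad-b needs a change to search, but squad-a is busy, so squad-b makes the edit...
change = registry.request_change("search", requester="squad-b", owner_is_busy=True)
# ...and squad-a, as owner, still reviews it.
registry.review(change, reviewer="squad-a")
```

The point of the sketch is that the requesting squad never waits on a busy owner, yet the owner keeps the final say through review.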
For the squad concept to work well, you must nurture a culture of trust and mutual respect. Keeping the focus on the product is a great start. When your people are focused on making a great product, ego, authority, and politics take a back seat to product quality. People are more willing to share knowledge, ask for help, and give in and collaborate with one another. People are more likely to give credit to others than to seek it for themselves.
Much like Disciplined Agile Delivery (DAD) teams (see Chapter 6), squads like to be “sticky.” They stick together and they stick with one or more features throughout the life of those features. This stickiness makes squads different from typical Scrum or LeSS teams, where team members work where they’re needed most.
One of the downsides of using sticky teams is that you may have trouble divvying up the workload. Spotify’s approach to decoupling the systems that make up its product works for Spotify, but it may not work as well for you, especially if teams are required to work on different products or when they need specialized expertise. If you have one squad working on the most important features of a product, you can end up with a lot of dependencies and, as a result, bottlenecks and roadblocks.
You don’t want to create a squad that could be a potential bottleneck. A lot of software development organizations have a separate team for testing. If you create a squad that focuses solely on testing, you may cause a backup when several other squads finish their development at the same time. This team would go from being overworked at certain times to being underworked at other times.
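To see why a dedicated testing squad swings between overload and idleness, consider a minimal simulation. The numbers here are made up for illustration: three squads each hand over four work items in the same week, and the hypothetical testing squad can test four items per week.

```python
def simulate_test_queue(arrivals, capacity):
    """Return (backlog, idle) per period for a single testing squad.

    arrivals: work items handed to the testing squad in each period.
    capacity: items the testing squad can test per period.
    """
    backlog_history, idle_history = [], []
    backlog = 0
    for arrived in arrivals:
        backlog += arrived
        tested = min(backlog, capacity)
        backlog -= tested
        backlog_history.append(backlog)        # work still waiting at period end
        idle_history.append(capacity - tested)  # unused testing capacity
    return backlog_history, idle_history


# Hypothetical workload: nothing for two weeks, then 12 items arrive at once.
bursty = [0, 0, 12, 0, 0, 0]
backlog, idle = simulate_test_queue(bursty, capacity=4)
print(backlog)  # [0, 0, 8, 4, 0, 0]
print(idle)     # [4, 4, 0, 0, 0, 4]
```

The testing squad sits idle in weeks when nothing arrives, then holds up other squads’ work for two further weeks after the burst.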
Spotify addresses this problem through validated learning, a common term used in Lean Startup: it tests, measures, and adjusts. In this testing scenario, the squad would quickly realize that it’s creating a bottleneck for the rest of the organization. It would probably break up the squad and distribute testers to different squads. Then they would test this out and see whether there’s an improvement in how work flows through the system.
Each squad meets regularly (every few weeks at Spotify) to discuss what’s working well and what it needs to improve. These informal meetings, referred to as “retrospectives,” are less about planning and more about improving the product and the process and learning from failure.
For large products, such as Spotify, squads are grouped into tribes (see Figure 7-3), with each tribe focusing on a specific area in product development. For example, Spotify has three tribes:
The tribes are built on a self-service model. Tribes don’t push what they have on other tribes; they simply make their services available for other tribes to use when needed. The client app tribe enables and supports the feature tribe, and the infrastructure tribe enables and supports both the client app and feature tribes.
When forming tribes, consider the following (very loose) guidelines:
To keep squads in sync and maintain flow, Spotify uses release trains and feature toggles:
Within each tribe are chapters that cut across the squads (see Figure 7-4). Each chapter has a chapter lead and is composed of people with a shared competency, such as software development, web development, data management, or testing. The chapter lead acts as line manager, but without some of the traditional management responsibilities. The chapter lead is more of a service position, coaching and mentoring the chapter members. She doesn’t assign work or supervise employees, but she may set salaries, schedule vacations, or help squad members obtain additional training.
One of the key benefits to having chapters is that people can move from one squad to another without losing their chapter lead. In that way, a chapter is more like a traditional department, although Spotify would never think of calling it that.
To encourage knowledge sharing, promote professional development, and support the wider community (beyond squads and tribes), the Spotify approach uses guilds — informal groups that form around a shared interest. A guild is basically a community of practice (CoP) (see Figure 7-5), which is common in other enterprise agile frameworks, such as SAFe and LeSS.
Each guild has its own coordinator, who sets the time and location of meetings and may create a loose agenda based on the guild members’ needs. These meetings, often referred to as “unconferences,” are a time for different parts of the organization to get together and share ideas in an informal setting.
Guilds offer numerous benefits beyond knowledge sharing among a group of people with shared interests, including the following:
To keep your guilds thriving, make them a priority by doing the following:
The vitality of a guild depends a great deal on its coordinator. Generally, the person or people who form the guild are the most passionate about their shared interest, but someone else in the organization may be better suited for that role. You want someone with passion, knowledge, and charisma. Encourage your guilds to choose the best person for the role and look past ego and politics. The best coordinator is the best for everyone in the guild.
If you choose the Spotify model for your enterprise agile framework, you’re committing your organization to a culture of creativity that’s failure-friendly. As Spotify founder Daniel Ek said about his company, “We aim to make mistakes faster than anyone else.” Spotify sees failure as a learning opportunity. Squads release new features quickly and frequently, test those features, conduct retrospectives to discuss their successes and failures, and then tweak the product and the process in the spirit of continuous improvement. Perhaps most important, when they discuss failure, they don’t try to figure out whose fault it was; they’re more concerned with finding out what happened and why and then using that insight to make changes.
This experiment-fail-learn-improve approach is reflected in how the company handles incident reports. Instead of closing an incident after it’s been resolved, the company closes it only after the squads have captured the learning via a post-mortem. Making mistakes is fine; repeating them isn’t.
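If your incident tracker is scriptable, the “close only after the learning is captured” rule can be enforced rather than left to habit. The following is a minimal sketch under that assumption. The `Incident` class and its method names are hypothetical and don’t come from any particular tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Incident:
    title: str
    resolved: bool = False
    post_mortem: Optional[str] = None
    closed: bool = False

    def resolve(self):
        """Mark the immediate problem as fixed."""
        self.resolved = True

    def record_post_mortem(self, learning):
        """Capture what happened and why (not whose fault it was)."""
        if not learning.strip():
            raise ValueError("a post-mortem needs to capture some learning")
        self.post_mortem = learning

    def close(self):
        # Resolution alone is not enough; the learning must be captured first.
        if not self.resolved:
            raise RuntimeError("incident is not resolved yet")
        if self.post_mortem is None:
            raise RuntimeError("resolved, but the learning has not been captured")
        self.closed = True


incident = Incident("Playlist sync outage")
incident.resolve()
try:
    incident.close()  # rejected: no post-mortem yet
except RuntimeError as err:
    print(err)

incident.record_post_mortem("Cache settings differed between regions; added a config check.")
incident.close()  # now allowed
```

The design choice is that `close()` checks for the post-mortem itself, so an incident can’t be quietly closed once the immediate fire is out.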
In this section, I discuss other ways you can nurture a creative, failure-friendly culture by following Spotify’s lead.
Agile frameworks tend to communicate culture through manifestos and principles more than anything else. Although Spotify doesn’t exactly list its principles, the creators of its approach highlight what the organization values in terms of inequalities. As you read through this list of Spotify’s “values,” note how nearly every one of them supports squad autonomy and innovation over methodologies and practices:
Innovation > Predictability: Spotify sees innovation and predictability as opposite ends of a spectrum. It wants some predictability but prefers to be more innovative than predictable. The focus is on delivering value, not on meeting deadlines.
When higher predictability is needed (for example, to coordinate a release with a planned marketing activity), Spotify may slide closer to the predictability end of the spectrum and use standard agile planning techniques, such as epics and user stories.
To prevent experimentation and failure from negatively impacting customers (or to minimize that impact), Spotify introduced the concept of having a “limited blast radius.” That is, if failure occurs, it negatively impacts only one feature or feature set and very few customers. Spotify limits its blast radius in two ways:
Throughout the year, Spotify encourages squads to spend ten percent of their time to invent and build whatever they want with whomever they want. The squads set aside one of every ten days as a hack day — innovative play time. Twice a year, the entire company joins in a hack week at the end of which the company has a party and reveals the inventions developed over the course of the week.
The hack day is nothing new. Several companies, including Sun Microsystems, have allocated play time to give programmers the opportunity to engage in exploratory programming and explore ways to improve the product. Programmers could rework the code or even research new ideas. At Spotify, squads are encouraged to create anything they like; the idea is to get the creative juices flowing.
One way Spotify promotes creativity while improving efficiency is by eliminating waste — anything that doesn’t add value to the product, such as time reports, handoffs, separate test teams, task estimates, useless meetings, and corporate nonsense.
Like all other enterprise agile frameworks, Spotify’s approach stresses the need for continuous improvement. The company drives continuous improvement in the following ways:
The very foundation of the Spotify approach is community and culture. According to the creators of this approach, “Healthy culture heals broken process.” Spotify nurtures community and culture with the following:
Spotify’s approach to product development/planning is deeply rooted in the Lean Startup approach of “Think it, build it, ship it, tweak it.” Here’s how it works at Spotify:
While Spotify’s usual approach works for small products, such as a feature, it’s less successful for large, complex products. Spotify tries its best to steer clear of larger products by breaking the product down into smaller products, so it can use the “Think it, build it, ship it, tweak it” approach described in the previous section. However, if that’s not an option, it takes a more traditional approach to product development.
For big products, Spotify adds a small, tightly knit leadership group, which typically includes a tech lead, product lead, and (sometimes) a design lead. The leadership group isn’t exactly management. They don’t assign and supervise work. Their function is to communicate the vision and maintain alignment among the squads and tribes.
The squads track progress visually using a Kanban board. They conduct daily syncs to facilitate coordination and collaboration and ensure alignment. And they produce a weekly demo to check how the product is coming along. The idea is to increase collaboration and reduce risks by keeping the feedback loops as short as possible.
For large enterprises and large products, Spotify recommends adding a role called a “system owner.” Even though “system owner” sounds like one person, it’s actually the role of two people — a developer and an operations expert, each of whom works in her own squad. Every so often (on “system owner day”), they get together to make system-level decisions.
The system owner role is based heavily on the DevOps (development and operations) role, which is an attempt to combine two roles that may seem incompatible at first glance:
Having a system owner is an attempt to strike a healthy balance between predictability and innovation on large products. The system owner should always be pushing to achieve a healthy balance between frequent changes and the stability of the whole.
On its surface, Spotify’s approach to enterprise agility is attractive, but it may be more like a utopian dream for larger enterprises. Its success relies entirely on the ability of people to bond and to respect and trust one another. While that often works on a small scale, it’s often less effective on a larger scale. Just as smaller countries can typically function well with less bureaucracy, and larger countries crumble when they don’t have enough central control, a small business may thrive with less management, while a large enterprise can fall into chaos without a more structured framework.
To decide whether the Spotify approach is best for your organization, ask yourself and your organization’s leadership the following questions:
Are we comfortable adopting a relatively unproven approach? If you answer “yes,” the Spotify approach may be worth trying. However, while it has proven effective in scaling one organization up to a medium-sized company, the jury is still out on whether it can work for a large enterprise. If you’re a large enterprise with thousands of developers, then you may test the limits of this approach.
Other enterprise frameworks, including SAFe, LeSS, and DAD, are all currently being used in very large companies. As of this writing, there haven’t been any public demonstrations of really large organizations using the Spotify approach.
The good news is that if you follow the Spotify approach, you’re likely to end up with an agile framework that’s tailored to your organization. You will adopt what works, toss what doesn’t, adopt elements of other agile frameworks, and create your own principles, processes, and practices. As a result, your organization would probably be much more agile than if you had chosen a more substantive framework, such as SAFe, LeSS, or DAD.
In some ways the Spotify approach is like flipping through a weightlifting magazine. Some organizations may find it inspirational, while others look at the shiny mounds of muscle and think to themselves that there’s no way they could (or would even want to) do that to themselves!