Assessing an engineering team’s performance


What is the right metric for a startup CEO to assess an engineering team's performance?

Charlotte recently joined Kash, a post-Series A fintech, as Head of Engineering, managing 20 engineers. Three months after joining, she received a request from her CEO asking her to assess the engineering team’s performance.

The CEO wants to know if engineers are delivering features fast enough for the needs of the business. To answer this request, Charlotte initially thought of using some of the Accelerate four key metrics, but these metrics measure efficiency rather than output, and they did not convince the CEO.

What should Charlotte do?

What Charlotte is experiencing is a classic case of engineering performance vs business performance. As a Head of Engineering, her initial choice of the four key metrics (Deployment Frequency, Lead Time for Changes, Change Failure Rate, Time to Restore Service) was sound. At the engineering level, Deployment Frequency and Lead Time for Changes measure velocity, while Change Failure Rate and Time to Restore Service measure stability. Velocity and stability are the pillars of good engineering practice. But at the business level, unfortunately, they don’t provide much information.

In a high-growth startup, the goal of an engineering team is to deliver as much value as possible to users as fast as possible. The value generated by the engineering team (new features, improved performance, etc.) should translate into either acquisition of new users (growth) or retention of existing users (profit). So an engineering team that is providing high velocity and stability could well be providing medium to low value if they’re shipping the “wrong” features or not improving the “right” performances.

What makes this situation even more complex is that even if a delivered feature seems “right”, it might take several iterations to reach product-market fit. So, for the startup CEO, the actual performance to measure is how long it takes to go from a greenlighted idea to positive user feedback.

Practically speaking, the phrase “green light to positive user feedback” breaks down into the following components:

  • How fast can a Product Manager spec out a greenlighted idea?
  • How fast can a developer (or squad) push a spec into production (dev + review + test + release)?
  • How long does it take to get meaningful user feedback?
  • How many iterations does it take to reach positive user feedback?

I call this metric TPUF, as in Time to Positive User Feedback. The lower the TPUF, the more successful a startup is, since positive user feedback directly impacts growth and profit. For the CEO, the advantage of looking at TPUF is that they get a complete picture of the company’s idea-building system and can work with the relevant managers to improve a given sub-system. If TPUF is not low enough, maybe the product managers don’t spend enough time analysing the market and are coming up with half-baked features. Or maybe the engineering time is too high, in which case the Head of Engineering can investigate what is slowing the system down (code, code review, deploy, debug).
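To make the components concrete, here is a minimal sketch of how they could be combined into a rough TPUF estimate. The article does not define a formula, so this is a hypothetical illustration: the names `FeatureCycle` and `tpuf_days` are mine, and the assumption that each iteration repeats the engineering and feedback phases (while the spec is written once) is mine as well.

```python
from dataclasses import dataclass

@dataclass
class FeatureCycle:
    spec_days: float      # PM specs out the greenlighted idea
    eng_days: float       # dev + review + test + release
    feedback_days: float  # time to gather meaningful user feedback
    iterations: int       # iterations until feedback turns positive

def tpuf_days(cycle: FeatureCycle) -> float:
    """Rough Time to Positive User Feedback, in days.

    Assumes the spec is written once, and each iteration repeats the
    engineering and feedback phases until feedback turns positive.
    """
    return cycle.spec_days + cycle.iterations * (cycle.eng_days + cycle.feedback_days)

# Example: 5 days of spec work, 10-day engineering cycles, 7 days to
# collect feedback, and feedback turns positive on the 3rd iteration.
feature = FeatureCycle(spec_days=5, eng_days=10, feedback_days=7, iterations=3)
print(tpuf_days(feature))  # 5 + 3 * (10 + 7) = 56.0 days
```

Even as a back-of-the-envelope model, this makes the trade-offs visible: halving the engineering cycle helps, but so does cutting one iteration by speccing the feature better up front.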

Another advantage of TPUF is that it forces managers to cooperate since their success is highly dependent on another team’s success. For the engineering team to succeed (code and push to production), they need to have a good relationship with the product team (prioritise and spec’ing out) and the QA team (test).

What the CEO doesn’t get with TPUF, though, is individual performance. If they want to know how each team member contributes to the overall effort, this is not the way to go. And in any case, a manager should measure individual performance independently of team performance: an individual contributor can have a fantastic quarter even though the team underperformed for reasons outside of their control. In this case, Charlotte should protect her team members by relying on objective performance reviews to reassure the CEO.

One last point: when looking at TPUF or any other metric, keep in mind that “when a measure becomes a target, it ceases to be a good measure” (Goodhart’s law). If the CEO or Charlotte becomes too obsessed with efficiency, team members will inevitably burn out in the long term. Too much efficiency also eliminates slack, the little bit of wiggle room that allows us to respond to changing circumstances, experiment, and do things that might not work.

To go further

📖 Accelerate - The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations - Nicole Forsgren, PhD, Jez Humble, and Gene Kim

📝 What Should We Measure? - Tim Ottinger

📝 Introduction to systems thinking - Will Larson

📝 Are you an Elite DevOps performer? Find out with the Four Keys Project - Dina Graves Portman

📝 Efficiency is the Enemy - Farnam Street