How management by metrics leads us astray

Let’s say my goal this year is to get to 10k monthly recurring revenue (MRR).

MRR first, everything else second. I would ruthlessly cut out all activities that don’t lead to a measurable MRR increase. For example, I would stop writing blog posts like the one you’re currently reading. I certainly wouldn’t create fun projects like What to Tweet or take the time for non-transactional Zoom calls.

I would focus on my revenue generating projects, crank up the advertising machinery and do everything I can think of that potentially increases my MRR. I’m fairly confident that I would reach 10k MRR by the end of the year. But at what price?

Let’s take a step back for a second.

You know who manages by metrics? Big companies like Google, Amazon and LinkedIn.

And what do they have in common?

Their core product got notably worse over time.

Google’s search results are dominated by ads and many users now use workarounds to find what they’re looking for (“Best headphone reddit”). LinkedIn looks like Minesweeper. Facebook was a fun place to meet friends. And if you search for an electronics product on Amazon, you immediately feel like you’re at a flea market in the middle of Shenzhen. This is the result of hundreds of decisions that were motivated by a short-term focus on specific metrics like revenue and click rates. And while these decisions most likely optimized the metrics, they made the user experience worse.

The problem is that we don’t have the technology to measure the right thing. Or maybe the “right thing” is inherently immeasurable.

Or, and that’s my personal favorite, we’re dealing with a Schrödinger’s cat like situation. The “right thing” is only the right thing as long as we’re not trying to optimize it. This is known as Goodhart’s law. “When a measure becomes a target, it ceases to be a good measure.“

Let’s consider another example.

Who else manages by metrics? Politicians!

Instead of some boring historic example (there are plenty, just read “Seeing like a State”), I’m going to spice things up by talking about something that’s happening right now. It’s not hard to anticipate what I want to talk about since there’s precisely one thing that’s currently happening: Covid-19.

Politicians try to manage the situation by focusing on metrics like the “R number”. At least here in Germany the motto is: let’s lower the R number at all costs. Turns out the cost of this strategy is incredibly high. It’s just not immediately measurable.

For example, everyone’s getting fatter since all gyms and sport clubs are closed. Most people are developing unhealthy eating habits because there’s simply nothing else to do. Social connections are rapidly degregating even though everyone knows how important they’re for the human wellbeing.

I’m not a doctor and don’t play one on the internet, but I can easily imagine that there are lots of further unintended consequences of the current lockdown policy. For example, since most people now spend most of their time indoors alone and wear masks when they leave their homes, their immune system isn’t stressed regularly like it usual is. Hence I wouldn’t be surprised if many people will get sick of other diseases once the lockdown ends as a result of weakened immune systems.

Just to clarify, I’m all for taking Covid seriously. In fact, I’ve been hoarding masks since 2009. I spent the summer of 2009 with a bunch of friends in Lloret de Mar (a party hotspot in Spain) just when news of the swine flu started to emerge. Long story short, everyone in our group got sick (20+ people) except for my girlfriend and me because we’ve been wearing masks. Now the most famous example how focusing on metrics can have unhealthy consequences is the cobra effect. I’ll just quote from Wikipedia:

“The term cobra effect originated in an anecdote that describes an occurrence during India under British rule. The British government was concerned about the number of venomous cobras in Delhi. The government therefore offered a bounty for every dead cobra. Initially, this was a successful strategy; large numbers of snakes were killed for the reward. Eventually, however, enterprising people began to breed cobras for the income. When the government became aware of this, the reward program was scrapped. When cobra breeders set their now-worthless snakes free, the wild cobra population further increased.”

There is one more example I always have to talk about when the topic comes up: academia.

Stagnation apologists love to argue that the lack of progress in science is simply a result of the fact that all low-hanging fruit are gone.

That’s complete nonsense. There’s still plenty of low-hanging fruit. It’s just that everyone is so busy chasing meaningless metrics that they can’t see them.

In academia the metric of choice usually has something to do with the number of citations. But just like Google’s backlink-driven algorithm (which was in fact inspired by academia’s citation metrics) was quickly gamed by savvy webmasters, academia’s algorithm is gamed by careerists.

And I’m not talking about some tiny minority here. If you want to survive in academia, you have to play the citation game. And if everyone is cheating, at the very least you have to do the same to stand a chance.

You have to partner up with others because five people can write five times as many papers. You agree to constantly cite each other. Likewise, you focus on incremental additions to established ideas because that’s the safest way to new publications regularly. You work on the stuff everyone else is working on because how else are you going to get citations?

And one thing you absolutely have to avoid like the plague is the risky and deep kind of research that leads to real progress in the field.

The problem isn’t that citations are a bad metric, and we need to come up with smarter ones. Instead, it’s just Goodhart’s law in action. Every attempt to manage academia makes it worse.

As soon a new metric is introduced it will misdirect the attention of scientists towards playing the game, instead of progressing science.

I talked about three areas, but the same pattern occurs almost everywhere. For example, just remember when everyone started minimizing fat in their diet or how grades are used to measure “learning”.

Crucially the problem is not the specific metric that’s being used. Whatever new metric gets introduced will soon again be made useless by Goodhart’s law. The metrics game will always be akin to Whac-A-Mole.

Now where does this leave us? If management by metrics doesn’t work, what else should we do?

First of all, I think that looking at metrics can be helpful. They just shouldn’t be the sole yardstick decisions are measured against.

And while the map never will be the territory, the picture you’re looking at gets more accurate the more metrics you consider. For example, if politicians would not just consider the R number but hundreds of metrics that take different aspects of a population’s wellbeing into account, the decisions would be better ones.

The obvious problem is that decisions become much harder if you consider more than one metric.

But more importantly there are so many factors we can’t measure that nevertheless should be taken into account.

The more I think about it the more I become convinced that it all boils down to talking to people.

And I’m not talking about surveys that distill thousands of standardized “conversations” into a few numbers or analytics tools that do similar things for behavior. I’m talking about real one-on-one interactions.

Just imagine if Jeff Bezos would talk to hundreds of real (non pre-vetted) Amazon customers each month and take what he hears in these conversations seriously. Or if politicians would have long conversations with regular people from all wakes of life. Or if academic committees would actually talk to applicants before filtering them out based on citation metrics.

In our hyper rational world it’s extremely hard to justify decisions based on the mere gut feeling you got from a few conversations. But this is exactly what people should be doing.

Engagement metrics will never tell you if users are genuinely happy when they use your product. But a few genuine conversations will.

A resume will never tell you if a person is genuinely interested in uncovering unknown truths about nature. But a 30-minute conversation will.

Yes, it’s so much easier to reach a consensus if you just hire the candidate with the highest citation metrics or implement the feature that will lead to the largest revenue increase in the next quarter. But while it’s the easiest decision it’s usually not the best one.

And I think the same applies not just to managing organizations but also to managing ourselves.

Just by listening to my own body I’ll always know better than any fitness tracker if I slept well, better than any blood test if my diet is healthy, and better than my bank account if I’m happy. That doesn’t mean that I’ll not use a fitness tracker, make regular blood tests, or check my bank account. But I certainly won’t allow them to dictate my decisions.

I’ll look at all the facts and take everything in, even the stuff that can’t be measured, and then I’ll just go with my gut.

Written on January 17, 2021
P.S. I'm now on Twitter too if you'd like to follow my adventures. Alternatively you can enter your email address below and I will send you occasionally a short email whenever I publish something new.