A Calculator for Predicting Cash Flow and Breiman's Two Cultures of Statistical Modeling

Or another episode of “You know we can just copy practices from Amazon, right?”. Oh, and this time, I’m reviewing a paper too!

Hello there!

Today’s observation is about a calculator (but from Amazon!) and a legendary paper that I regret not reading sooner. Let’s go!

Juan Mateos Garcia on X: "It was fun to re-read Leo Breiman's classic paper about statistical and machine learning modeling cultures. Published in 2001 as ML was starting its renaissance, its calls


“Amazon Free Cash Flow Unit Economic Model”

What is so interesting about it?

Many people know that Amazon, one of the internet giants, made “Free Cash Flow” its North Star in the early days. This came directly from Jeff Bezos’s belief that free cash flow, rather than earnings, is what matters most for the Seattle giant. Few people, however, know how committed the company was to this strategy: it went as far as building a complex unit economic model that can take almost any input and churn out cash flow numbers.

Taken from “The Amazon Way: Amazon’s 14 Leadership Principles” book:

“This philosophy and the need to practice it successfully drove the creation of other capabilities, such as Amazon’s robust, extremely accurate unit economic model. This tool allows merchants, finance analysts, and optimization modelers (known at Amazon as quant-heads) to understand how different buying decisions, process flows, fulfillment paths, and demand scenarios would affect a product’s contribution profit. This, in turn, gives Amazon the ability to understand how changes in these variables would impact FCF.” (emphasis added)

A calculator. They built a calculator. What’s so important about it?

I shared this observation with my friend yesterday. We joked that this story might be real (unlike most stories from business books) because, well, Jeff Bezos. The same man who ditched PowerPoint for narrative documents wouldn’t hesitate to give another unconventional order like this to his Finance and Data teams.

But having worked in the analytics field for almost all of my career, I can tell you that this project, while quite simple on the surface, is hard to pull off.

Think about this for a bit. To build this “calculator”:

You need to know what the top exec or CEO wants to achieve for the company. Amazon is quite fortunate that Jeff has a pretty clear idea of what he wants to focus on (FCF per share), which solves the alignment issue. During my career, I’ve seen companies, in Indonesia and abroad, that don’t have this level of clarity.
Not only that, you also need to assess whether everyone in the company knows what ‘mission accomplished!’ looks like. There’s a difference between setting a target and “enforcing” that target. Employees also need clarity on what they are trying to achieve because, fundamentally, they’re the ones who will take the actions to achieve it.
And the underrated part: now that you have a target, and everyone knows about it and is trying to achieve it, how do you know whether what you do brings you closer to it? Imagine you want to go from your home to your favorite Thai restaurant. You know where you are now, and you know what the restaurant looks like. But how do you know you are heading in the right direction? (“Oh, I know. The restaurant is close to the local bookstore.”) What decision criteria do you use when you pick your route? (“I’ll get stuck in traffic at this hour if I take the usual route. Might as well take another one.”) What Google Maps does for these two questions is what Amazon’s unit economic model does for the two points above. It’s a good way to distribute understanding, and it makes the feedback loop much faster by telling people, “If I take action X, FCF will end up right here.”
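To make the “If I take action X, FCF will end up right here” idea concrete, here is a toy sketch of what such a calculator could look like. To be clear, this is not Amazon’s model. Every name and number below (UnitScenario, simplified_cash_flow, the prices and costs) is hypothetical and made up for illustration, and the “cash flow” here is heavily simplified: it ignores capital expenditure, working capital, and taxes.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class UnitScenario:
    # All fields are hypothetical inputs, not Amazon's real variables.
    price: float             # selling price per unit
    unit_cost: float         # purchase cost per unit
    fulfillment_cost: float  # pick, pack, and ship cost per unit
    units_sold: int          # demand in the period
    fixed_costs: float       # cash costs that do not vary with volume


def contribution_profit_per_unit(s: UnitScenario) -> float:
    """What one unit contributes after its variable costs."""
    return s.price - s.unit_cost - s.fulfillment_cost


def simplified_cash_flow(s: UnitScenario) -> float:
    """Toy stand-in for FCF: total contribution profit minus fixed cash costs."""
    return contribution_profit_per_unit(s) * s.units_sold - s.fixed_costs


# Baseline scenario with made-up numbers.
baseline = UnitScenario(
    price=20.0, unit_cost=12.0, fulfillment_cost=3.0,
    units_sold=1000, fixed_costs=2000.0,
)

# "Action X": a cheaper fulfillment path, everything else unchanged.
cheaper_path = replace(baseline, fulfillment_cost=2.5)

delta = simplified_cash_flow(cheaper_path) - simplified_cash_flow(baseline)
print(f"Baseline cash flow: {simplified_cash_flow(baseline):.2f}")
print(f"Effect of cheaper fulfillment path: {delta:+.2f}")
```

The interesting part is not the arithmetic. It is that anyone can change one input, rerun the calculator, and immediately see how the number everyone cares about moves.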

Who knew talking about a calculator could be this interesting?

Subtitle: Statistical Modeling: The Two Cultures

What Is the Paper About?

I was late in reading this legendary paper by Leo Breiman. Published in 2001, it was a revolutionary paper for its time. Breiman critiques the “data model first” approach that most statisticians held at the time, an approach that, in his view, ignored the then-recent development of “algorithmic modeling” (what we now call “machine learning”). Breiman’s point, paraphrased: why limit ourselves to one culture when newer (and better) cultures arrive with new tools?


Leo Breiman. Taken from statistics.berkeley.edu

Here’s a quote from the abstract that explains Breiman’s point:

“There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown. The statistical community has been committed to the almost exclusive use of data models. [...] If our goal as a field is to use data to solve problems, then we need to move away from exclusive dependence on data models and adopt a more diverse set of tools.” (emphasis added)
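Here is how I picture the two cultures in code. This is my own minimal sketch, not an example from the paper. The data model culture assumes a form for how the data were generated (below, a straight line plus noise) and fits its parameters. The algorithmic culture treats the mechanism as unknown and only asks how well the predictions hold up on data the model has not seen (below, a simple k-nearest-neighbors average). The synthetic data, the function names, and the choice of k are all assumptions I picked for illustration.

```python
import math
import random

random.seed(0)


def make_data(n):
    # Synthetic data: the true mechanism is a sine curve plus noise.
    # Neither model below is told this.
    xs = [random.uniform(0, 6) for _ in range(n)]
    ys = [math.sin(x) + random.gauss(0, 0.1) for x in xs]
    return xs, ys


train_x, train_y = make_data(200)
test_x, test_y = make_data(100)


# Culture 1 (data model): assume y = intercept + slope * x + noise,
# then estimate the two parameters by ordinary least squares.
def fit_line(xs, ys):
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope


intercept, slope = fit_line(train_x, train_y)


def predict_line(x):
    return intercept + slope * x


# Culture 2 (algorithmic model): no assumed mechanism, just average
# the outcomes of the k training points closest to x.
def predict_knn(x, k=5):
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mse(predict, xs, ys):
    # Mean squared error on held-out data.
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


line_mse = mse(predict_line, test_x, test_y)
knn_mse = mse(predict_knn, test_x, test_y)
print(f"Straight-line data model, test MSE: {line_mse:.3f}")
print(f"k-nearest neighbors, test MSE: {knn_mse:.3f}")
```

I rigged this toy so that the assumed straight line is the wrong shape, which is why the nearest-neighbors model predicts better here. When the assumed model is close to the truth, the data model does fine, and its slope is something you can actually explain to a stakeholder.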

Wait, Breiman argues there are two cultures of statistical modeling?

I’ll be the first to admit that although I’ve finished reading this paper, I can only understand about 10% of it. It contains so much new information that I still need to wrap my head around it and its implications (a deep-dive post is coming soon).

What piqued my interest was the context in which Breiman wrote this paper. It was very different from what I see in the industry nowadays. The “We can use ML for anything” cult is growing, and I’m afraid we’re leaving the “data model” world behind without recognizing its usefulness. This is exactly the reverse of what Breiman observed in his time.

I believe there might be a lesson here.