
AMZN: Forecast 'Spyglass' Tool

Summary:

In my role as a product manager at AMZN, most of my work focused on forecasting systems and happened behind the scenes. The 'Spyglass' tool was one of the ad-hoc things I built. Its purpose was to bridge the gap between the automated forecasting systems and models and the needs of the businesses relying on accurate forecasts.

 

Each category had thousands upon thousands of products in its catalog, so it was impossible for a category management team to operate at the product level unless the product was in the head of the catalog (top 0.5%). For those products, the ROI of manual attention and forecast intervention was positive. 

So the prior state: 

  • Business users (CM teams) maintained a list of top-selling products. 

  • Every week (or so), they manually pulled up the forecast for each product. 

  • They compared the forecast to sales velocity and category knowledge, and inspected the forecasting curves. 

  • They then made an accuracy assessment and, if required, intervened in the forecast by overriding or adjusting the curves. 

This was sub-optimal for several reasons. Chiefly, it was a poor use of time. Second, it was highly manual and therefore neither precise nor replicable. Third, it introduced manual user bias into a highly instrumented data model. This bias was sometimes good, sometimes bad, but it persisted until manually removed (at least at the time; that changed in the coming years), creating more work, more error, or sometimes both.

In response to this, I built the 'Spyglass' tool, shown below. 

Step by Step:

  • I created 6 different SQL scripts, combining catalog, sales, forecasting, and traffic tables. 

  • I replicated the SQL scripts for each category (30-40). 

  • I connected a spreadsheet template to the Data Warehouse UI. 

  • The data automatically flowed into the spreadsheet template once a week. 

  • The raw data was transformed inside the spreadsheet to create 6 summary tables. 

  • These summary tables were then 'printed' as a PDF document.

  • The output was sent via email to each category team (using email alias wildcards); a simplified sketch of the whole pipeline follows this list. 
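
To make the shape of this concrete, here is a minimal Python sketch of such a weekly run, assuming pandas is available. The table and column names, the category list, and the stubbed helpers are all hypothetical stand-ins for the internal Data Warehouse, PDF, and email plumbing; this illustrates the flow, not the original implementation.

```python
# Hypothetical sketch of the weekly Spyglass pipeline. Table/column
# names, the category list, and the stubbed helpers are illustrative
# stand-ins, not the actual internals.
import pandas as pd

CATEGORIES = ["toys", "kitchen", "electronics"]  # 30-40 in the real run

# One of the ~6 per-category pulls: join catalog, sales, forecast,
# and traffic into a single ASIN-level frame (schema is assumed).
ASIN_QUERY = """
SELECT c.asin, c.product_name, c.manufacturer, c.price,
       s.units_ly, s.units_ytd, f.forecast_units,
       t.glance_views, t.conversion_rate
FROM catalog c
JOIN sales s    ON s.asin = c.asin
JOIN forecast f ON f.asin = c.asin
JOIN traffic t  ON t.asin = c.asin
WHERE c.category = :category
"""

def run_category_query(conn, category: str) -> pd.DataFrame:
    return pd.read_sql(ASIN_QUERY, conn, params={"category": category})

def build_summary_tables(raw: pd.DataFrame) -> dict:
    """Stand-in for the in-spreadsheet transforms that produced
    the 6 summary tables."""
    raise NotImplementedError

def render_pdf(tables: dict, path: str) -> None:
    """Stand-in for 'printing' the summary tables to a PDF report."""
    raise NotImplementedError

def email_report(category: str, path: str) -> None:
    """Stand-in for mailing the PDF to the category team's alias."""
    raise NotImplementedError

def weekly_run(conn) -> None:
    for category in CATEGORIES:
        raw = run_category_query(conn, category)
        tables = build_summary_tables(raw)
        pdf_path = f"spyglass_{category}.pdf"
        render_pdf(tables, pdf_path)
        email_report(category, pdf_path)
```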

The first page shows the top 20 under-bias and over-bias risk products (ASINs). Importantly, it shows the dollar impact (grayed out), expressed in revenue (OPS) and profit (CP). This made prioritization easy and brought the tangible benefit/loss to the forefront for each category manager. The summary also shows crucial context alongside, such as product name, manufacturer, price data, and last year's sales. A rough sketch of that ranking is below. 
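
As a rough illustration, and assuming a per-ASIN frame with a `bias_units` column (computed as described in the next section) plus hypothetical price and contribution-profit columns, the page-one ranking might look like this:

```python
# Illustrative page-one ranking: translate per-ASIN forecast bias into
# dollar impact, then surface the top 20 in each direction. All column
# names are assumptions.
import pandas as pd

def page_one_tables(df: pd.DataFrame) -> dict:
    df = df.copy()
    # Revenue (OPS) impact: biased units times selling price.
    df["ops_impact"] = df["bias_units"] * df["price"]
    # Profit (CP) impact: biased units times contribution profit per unit.
    df["cp_impact"] = df["bias_units"] * df["cp_per_unit"]
    # Positive bias_units = forecast above expected demand (over-bias);
    # negative = forecast below expected demand (under-bias).
    over = df.nlargest(20, "ops_impact")
    under = df.nsmallest(20, "ops_impact")
    cols = ["asin", "product_name", "manufacturer", "price",
            "units_ly", "ops_impact", "cp_impact"]
    return {"over_bias": over[cols], "under_bias": under[cols]}
```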

 

The logic behind the bias calculation was simple: last year's actual unit demand vs. this year's forecast. Getting to that calculation required a few complicated adjustments (sketched in code after this list): 

  • Remove markdown sales.

  • Add sales lost due to out of stock (calculated with traffic and conversion data).

  • Calculate price elasticity and apply it to this year's price change relative to last year's. 

  • Assign a growth rate using a blend of the product's growth rate and the subcategory's growth rate. 
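
Here is a sketch of those adjustments in Python. The exact formulas, especially the elasticity application and the 50/50 growth-blend weight, are my reconstruction of the logic described above, not the original spreadsheet math; every column name and constant is illustrative.

```python
# Illustrative reconstruction of the bias baseline: adjust last year's
# actuals, then compare against this year's forecast. Column names and
# the equal growth-blend weights are assumptions.
import pandas as pd

def expected_demand(df: pd.DataFrame) -> pd.Series:
    # 1. Start from last year's unit demand with markdown sales removed,
    #    so clearance spikes don't inflate the baseline.
    base = df["units_ly"] - df["markdown_units_ly"]

    # 2. Add back demand lost to out-of-stock: traffic that arrived while
    #    the ASIN was unbuyable, times its normal conversion rate.
    base = base + df["oos_glance_views_ly"] * df["conversion_rate"]

    # 3. Apply price elasticity to the year-over-year price change:
    #    % change in units = elasticity * % change in price.
    pct_price_change = (df["price_ty"] - df["price_ly"]) / df["price_ly"]
    base = base * (1 + df["elasticity"] * pct_price_change)

    # 4. Grow the baseline by a blend of the product's own growth rate
    #    and its subcategory's growth rate (equal weights assumed here).
    growth = 0.5 * df["asin_growth"] + 0.5 * df["subcat_growth"]
    return base * (1 + growth)

def forecast_bias(df: pd.DataFrame) -> pd.Series:
    # Positive = forecast above expected demand (over-bias);
    # negative = forecast below it (under-bias).
    return df["forecast_units"] - expected_demand(df)
```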

The second page builds on the same foundation as the first page, showing further breakdowns to triangulate and troubleshoot. It applies the same calculations to produce the top 20 over/under-bias vendors and subcategories, illuminating any larger or systemic problems and directing teams to the problem area. A sketch of that rollup follows. 
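
Under the same assumptions as the earlier sketches, the page-two rollup is just the per-ASIN dollar impact aggregated to vendor and subcategory level, for example:

```python
# Illustrative page-two rollup: aggregate per-ASIN dollar impact to
# vendor and subcategory level to expose systemic bias. Column names
# are hypothetical.
import pandas as pd

def page_two_tables(df: pd.DataFrame) -> dict:
    out = {}
    for level in ("manufacturer", "subcategory"):
        agg = (df.groupby(level, as_index=False)[["ops_impact", "cp_impact"]]
                 .sum())
        out[f"{level}_over"] = agg.nlargest(20, "ops_impact")
        out[f"{level}_under"] = agg.nsmallest(20, "ops_impact")
    return out
```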
