## Forecasting advertising channel revenue with Google Sheets

This is the fourth lesson of my Google Sheets for Marketers mini-course. A typical task every marketer has to face at one point or the other is to forecast things such as sales, revenue, conversions or something similar. Often this comes up when you have to create a robust marketing plan. Or when you have to present forecasts to superiors and stakeholders to secure budgets.

One challenge that usually comes up during the forecasting process is how you can predict the numbers as accurate as possible without using too complicated and time intensive methods.

This tutorial describes how to use simple linear regression (no worries, no math needed. There is a simple formula in Google Sheets for that) and formulas to forecast marketing channel revenue. As usual I’ll go through everything step-by-step and explain it all. But I recommend having open the mini-course workbook (please make a copy) to work along to really understand everything.

#### What you’ll learn:

The basics of the Google Sheets QUERY function, FORECAST formula.

## The task

In the past lessons of this course we analyzed the data from three advertising channel and learned that Twitter Ads are performing quite well. You think it would be a good idea to raise the budget of those ads to improve revenue. However you first have to convince your stakeholders and superiors of this to secure the budget. Part of this you have to forecast the potential additional revenue to prove the impact of more Twitter Ads budget. As such the task is the following:

If we rise our investments by 20% compared to last year, what will be the revenue from Twitter Ads on a monthly basis for the next year?

There are two implications from this question:

• It won’t be enough to give a total estimate for the next year, but we have to give a forecast for each individual month.
• We have to forecast at least 12 months into the future.

## Preparing the data

Again I highly recommend to work along in the workbook, you’ll also find a solution sheet to this lesson there. As a first step we want to get the data we are going to work with into our worksheet. In our case the Twitter Ads data.

Of course we could simply copy the data from the Raw Data sheet and delete everything besides the Twitter Ads data. However this approach would get messy very fast, when working with large Data Sets. Instead we are going to work with the Google Sheets Query function. This function is probably one of the most powerful functions overall. It is kind of a pseudo SQL, which let’s you query, manipulate and analyze very large data sets in Google Sheets.

We won’t go into depth here and you’ll only learn a simple case on how to use the formula. However believe me, knowing this function will probably make you stand out among 97% of all marketers and it will seem like magic for those who don’t know it.

Put the following formula into cell A1 in your new sheet:

=QUERY(‘Worksheet – Raw Data’!A1:J118, “SELECT A, B, H, I WHERE A = ‘Twitter Ads'”,1)

The function =QUERY() has three input parameters. The first one ‘Worksheet – Raw Data’!A1:J118 defines the dataset you want to work with and which you want to query. In our case all the raw data. “SELECT A, B, H, I WHERE A = ‘Twitter Ads'” is the actual query. What you are doing here is you are selecting columns A, B,H and I. The WHERE statement introduces a condition and lets you only select those rows, where the cell in column A equals Twitter Ads. The last parameter 1 simply indicates that there is a header row, which we want to display as well.

As such we build a simple very fast filter to only select the data we need. Feel free to play around a little bit with the Query function. You can add more conditions with an AND behind the WHAT the statement. E.g. SELECT A, B, H, I WHERE A = ‘Twitter Ads’ AND H >1000 would give you the same as above, however only those rows, which have a revenue higher than \$1000.

We are almost done with preparing the sheet. Just add Cost Forecast and Revenue Forecast as headers in cell E1 and F1. Write Twitter Ads in A41 and =B40+31 in B41. Copy both down until row 52.

## The actual forecasting

We will have to do two things here. Plug-in the future potential budgets and forecast the resulting revenues.

The first one is quite easy. As the task states we will have a 20% higher budget compared to the last 12 months. So we will simply multiply the costs of each of the last 12 months with 1.2 in order to get the investments for the future 12 months. As such write the following into E41 and drag it down for the other months:

=D29*1.2

Next we’ll do the revenue forecasting itself with linear regression. It is easier than you might expect as there is a formula in Sheets exactly for that. Write this into F41 and again drag it down for the other future months:

=FORECAST(E41,C\$2:C\$40,D\$2:D\$40)

What the FORECAST formula does is it predicts through linear regression a future value by using existing values.

In our case the predicted value is revenue (y-value) for a given cost (x-value). The existing revenue values (y-values) are in cells C2:C40 and existing cost values (y-values ) in cells D2:D40.

I added \$ dollar signs in above formula in order create an absolute references. So when you drag the formula down it won’t change the referenced existing values.

## Visualization

You’re done! Or at least almost. Technically you are done already – you predicted revenues for Twitter Ads for the next 12 months. However nobody wants to read a boring table and drawing insights from that table is very difficult. So the last step to take is to build a chart on top of your analysis.

Insert a chart from the menu: Insert > Chart and change the chart type to Linear chart (Need a refresher on charts? Check out lesson 3):

Add the following two series:

Series 1 – C1:C52

Series 2 – F1:F52

and as X-axis: B1:B52.

You’ll probably notice now that there is an ugly and confusing gap between the lines of the historic data and the forecast. In order to get rid of that add the following to cells F40:

=C40

This will level the start of the forecasts to the actual revenue.

That’s it! Obviously you can change the colors of the lines (which I did in above screenshot and the example sheet) to make it look prettier and more organized.

## Limitations

Certainly above is a very simple solution for forecasting revenue as there are a lot more advanced techniques such as the moving average forecasting method. We are also not counting in any other influencing factors such as seasonality, competition etc..

Nonetheless above is a valid demonstration on how a simple data science technique (linear regression in this case) can be applied to real life marketing challenges by everyone. You could use above methods for all kind of marketing forecasting, e.g.:

• How will the the next days outside temperature affect my sales?
• How will the number of blog posts affect my website traffic?
• Will conversions go up or down during winter months?

In the last lesson of this mini-course we will build a dynamic reporting dashboard, which pulls its data automatically from Google Analytics. So be sure to don’t miss out lesson number 5.

## A Sales Forecast Template for Google Sheets

With the new year ahead many marketers and analysts will be tasked with creating a marketing plan for the year. Imagine you are one of those and are asked to present the plan next week. Sooner or later during the preparation you will have to address one major pain point usually every analyst or data driven marketer has to go through: How to realistically forecast sales, revenue, conversions or something similar for the next year?

Obviously during the presentation you would have to defend those numbers and as such you have to have a robust model for predicting them. On the other hand it can’t be too complex as you have to present already in a couple of days.

So what if you had a simple technique which allows you to accurately predict metrics, while it can be implemented very easily.

That’s what this tutorial is about. It teaches you the moving average forecasting method for forecasting future sales, revenue, etc. in Google Sheets. If you are reading this last minute and you need forecasts right away you can just plug-in your numbers in below template, but I highly recommend working through the guide to understand everything.

## The Template

First of all, for those of you, who only need a sales template for predicting revenue or other metrics, please find it above. Make a copy, open the sheet Data Input and copy your monthly revenue or conversion numbers from the last two years into cells D3:D26 and E3:E26 respectively.

You’ll find the output, i.e. the actual projections in the sheet Forecast.

You can simply change the column headers (D2 + E2), if you want to name your forecasts differently.

However in order to understand the techniques behind it and to potentially tweak and change the forecast I highly recommend working along the following guide. I’ll show how the predictions were modeled, including the ratio to moving average forecasting method.

## The Ratio to Moving Average Forecasting Method

In order to work along please also access above Google Sheets Workbook. In addition to the Data Input and Forecast sheets you’ll find two additional Worksheets – the first one containing the example data, which you can use to work along and the second one the solution to the example data. Even though conversions as well as revenue is forecasted in the template we will only work on revenue prediction in the following. However the forecasting technique used is the same for both.

The example data has already been cleaned and prepared, so you can start right away.

The ratio to moving average forecasting method uses trend and seasonal indices to accurately forecast future sales, revenue, conversions or whatever other time series you decide you want to forecast. It is an very easy-to-use four step method. We will use it in our example to forecast sales revenue. As such we’ll have the following four steps:

1. Estimate the deseasonalized level of sales during each month (using centered moving averages).
2. Define a trend line to the the deseasonalized estimates.
3. Determine the seasonal index for each month and estimate the future sales by extrapolating the trend line.
4. Predict future sales by adding seasonality to the trend line estimate.

## Calculating Moving Averages and Centered Moving Averages

First you’ll have to create a full year moving average for each month by averaging the current month, the six prior months and the next five months respectively. By creating a full year average seasonality will be eliminated. To do so copy the following formula in cell F8 and drag it down to cell F20: =AVERAGE(D2:D13).

This means for example that the moving average for month no 13 (January-18) is \$18.7k. The moving average for month no 14 averages months no 7 to 18. Adding these up (7+8+9+10… +18) and averaging those month numbers will give you 12.5. As such the moving average for month no 13 is centered at month no 12.5. Similarly the moving average for month no 14 is centered at month no 13.5. Averaging those two moving averages will give you a centered moving average that estimates the actual centered moving average at the end of month no 13. As such to estimate the sales revenue during each month (de-seaonalized), copy the formula =AVERAGE(F8:F9)down from cell G8.

## Defining the Trend Line to Centered Moving Averages

We’ll now use the centered moving averages to define a trend line that can be used to estimate future sales revenue. We’ll need to find an intercept and slope to do this

Luckily there are two functions, which will do exactly this for us. In cell L3 put =SLOPE(G8:G20,A8:A20to find the slope of the trendline and in cell L4 write =INTERCEPT(G8:G20,A8:A20)to find the intercept of the trendline.

Now copy the formula =A26*L\$3+L\$4 from G26 to G37. This will give you the estimated revenue (without seasonality) for the future months.

## Calculating the Seasonal Indexes

Start by calculating for each past month Sales / Centered Moving Average. So simply put =D8/G8 into H8 and copy it down to H20.

E.g. for July you’ll get 0.9 (2017) and 1.02 (2018) respectively. This means in July 2017 sales have been at 90% of an average month and in July 2018 at 102% of an average month. Averaging those two numbers will give you the seasonal index for July, which is 96%. So July usually generates 96% of the sales a average months would generate. In order to calculate the seasonal index estimates for all month we can work with the AVERAGEIF formula. The AVERAGEIF formula will only average values, which fulfill certain criteria.

Put the numbers 1 to 12 in the cells K7 to K18 respectively. This numbers represent the individual months. The formula =AVERAGEIF(B:B,K7,H:H)in L7 will average all revenue numbers for January. Copy that formula down to L18 to do the same for the remaining months.

We have to ensure that the seasonal indices average exactly to 1 to normalize them. This is actually quite easy. Put =L7/AVERAGE(L\$7:L\$17) into M7 and copy it down to M18.

## Forecasting future months

In order to forecast the revenue for future months you have to multiply the trend line estimate for each month’s revenue with the the appropriate seasonal index. Copy the formula =VLOOKUP(B26,K\$6:M\$18,3)*G26 from I26 to I37 to predict revenue for the next 12 months.

That’s it you are done!

The model is adjustable for more recent trends as well, if you believe the recent trend of the series has changed significantly. As such you don’t have to take all Centered Moving Average for calculating the slope but could take only more recent months (e.g. the last half year in our case) in order to calculate the slope with months closer to the current date.

## Limitations

As usual the disclaimer that this is model won’t predict the future to 100% as it is based on historical data.The model will obviously be more accurate the more past data you have and furthermore the less volatile your time series development is.

Nevertheless above is an very easy to follow accurate method for forecasting sales and other metrics.

## Predicting the impact of promotions on sales with Google Sheets

Sales season is on. Black Friday and Christmas are traditionally some of the promotion strongest dates during the year. However for many marketing analysts and data driven marketers a lot of uncertainty develops on how to predict the outcome of promotional efforts. Obviously most promotions will probably lead to more orders, but to how much incremental revenue will it lead? Can it even have a negative effect by cannibalizing sales in the time following the promotions? Or is the overall effect positive nonetheless?

So imagine if you had a predictive model with which you can forecast the effects of your planned promotions. Giving you more certainty on how big your sales revenue will be.

This guide provides step-by-step instructions on how develop such a forecasting model for your business based on past promotions. All you need is Google Sheets and a free solver add-on. I recommend working along in the provided sheet to truly understand everything.

## Defining the task

For this example you are given the historical daily revenue for an online retailer for the last two and a half years up until 2018-11-28 (obviously the techniques in this model can be used for other products such as online courses, software and others as well). In the past years the retailer ran 20% and 30% promotions reducing the price of all sold products respectively. Promotions were run on random days as well as during Black Friday and before Christmas.

We now want to find out a) what the impact of those promotions is, b) if it makes more sense to run a 20% or 30% promotion for overall revenue and c) if a promotion should be run during upcoming Christmas time.

## Preparing the data

Formatting and cleaning data is usually the first step you have to take for every data analysis. As we are dealing with quite a simple data set here (revenue and dates) and the data has been prepared upfront there is not much to do in terms of cleaning and formatting.

However we still have to do some preparations for our upcoming analysis. As we want to investigate the impact of promotions we have to indicate the dates on which promotions happened. Furthermore we need to  indicate if the promotions had any visible impact on the days after the promotions, i.e. negatively due to the cannibalization of revenue.

Both has been done already in the provided sheet. In column E – H a “1” indicates that a event has occurred respectively. E.g. a “1” In cell E2 would mean a 30% promotion has happened on 2016-01-01, while a 1 in cells E3 – E5 means a negative after-promotion effect was observed on those days.

Days of the week and seasonality usually have an impact on sales as well (e.g. weekdays vs. weekends, summer month vs December), which is why we want to factor in those as well.

There are two simple formulas we can use here: MONTH() and WEEKDAY() which will give you month and the day of the week for a reference cell respectively. Those formulas have been used in  column B and C.

That’s it for preparation. Obviously even though everything above is already provided in our example data you would still have to do it for your own datasets.

## Building a basic model

What we are going to do is build is a model, which forecasts sales based on the impact of seasonality and promotions. We will have the following five factors influencing our sales prediction:

1. A constant as a basis
2. Influencing effects for the day of the week
3. Influencing effects for the month
4. Influencing effects for a promotion day
5. Influencing effects for the days after a promotion

This means for each day the forecast of revenue will be calculated by the following equation:

Predicted revenue =  Constant + [day of the week effect] +  [month effect] + [20% promotion effect] + [days after 20% promotion effect] +[30% promotion effect] + [days after 30% promotion effect]

Put those individual parameters in cell Q2:Q27 and a 100 in the respective cells in column R. Those are only trial values and will act as coefficients.

We will use the solver to estimate the coefficients, which will make the model fit best to our data, i.e. minimizing the (squared) error between our forecasts and the historical data. But more on that later on .

Next step is to write the actual formula, which corresponds to above equation. In cell M2 put the following formula and drag it down the whole column:

=\$R\$2 + VLOOKUP(B2,\$Q\$4:\$R\$15,2,FALSE) + VLOOKUP(C2,\$Q\$17:\$R\$23,2,FALSE) + E2*\$R\$24 + F2*\$R\$26 + G2*\$R\$25 + H2*\$R\$27

R2 represents the constant. VLOOKUP(B2,\$Q\$4:\$R\$15,2,FALSE) picks up the coefficient for each month while VLOOKUP(C2,\$Q\$17:\$R\$23,2,FALSE) picks up the coefficient for the day of the week. If a promotion occurred on a given day E2*\$R\$24 and G2*\$R\$25 will pick up the effects of a 20% and 30% promotion respectively, while F2*\$R\$26 and H2*\$R\$27 will do the same for after promotion day effects.

Now you’ll calculate squared error for each day. This can be done simply by putting =(D2M2)^2 into N2 and dragging it down the column. Sum all squared errors up in cell Y4.

One last thing you have to do before using the solver is to average all values of the day of the week and months values. You can do that by putting =AVERAGE(R17:R23) in into V4 and =AVERAGE(R4:R15) into V5. We are doing this as we will add constraints to the Solver model, which constrain the average day of the week and the average month to equal 0. We are doing this as those constraints make sure that a day of the week or month with a positive effect has a higher than average revenue and a day of the week or month with a negative effect has a lower than average revenue.

If you haven’t so far: No it’s time to download the solver. After doing so click on Add-ons -> Solver -> Start to set it up:

We want to minimize objective cell Y4 by changing values R2:R27, while constraining V4:V5 = 0.

So basically what we are doing is we are minimizing the squared error between the historical revenue data and our fictional forecast by changing the coefficients for each of our above defined five factors. When you are done setting up the Solver click on “solve”.

The outcome tells you a couple of things. E.g. a 30% promotion day raises (all else being equal) the revenue by \$6116. Weekends are revenue wise a lot better than other week days and December accounts for a major sales raise as well. Days after a promotion usually lead to a drop in revenue.

### Evaluating the model

You can evaluate the model by calculating the R2 value between the forecasts and the actual historical revenue. Plug =RSQ(D2:D1064,M2:M1064) into cell Y5 for this. The formula calculates the percentage of the variation in revenue, which is explained by the forecasting model. In our case we should get a 0.59, which means that the independent variables of the five factors explain 59% of the daily variation of the revenue. Obviously as such the model has some room for improvement.

In order to do so we’ll try to spot any outliers in the daily forecast accuracy. As a first step calculate the error between the forecast and the historical daily revenue. Do so by putting =D2M2 into cell O2 and copying it down the column. Calculate a close approximation to the standard error of the forecast with the standard deviation of the errors with the formula =STDEV(O2:O1064) in cell Y6.

We define outliers if the absolute value of the forecast error exceeds two times the standard error. So as a next step we want to identify the outliers fitting into this definition. For this we will use conditional formatting. Select range O2:O1064 and click on Format -> Conditional Formatting -> Add new rule. In the “Format if…” dropdown menu choose “Custom formula is” and put =abs(O2)>=2*\$Y\$6 in the box.

This will highlight all outliers within above definition.

And, surprise surprise we will find clusters of outliers around Black Friday and before Christmas, both we did not include in our original model. However, obviously people are shopping a lot more during these dates (e.g. 12/01 – 12/17) and less after (e.g. 12/18 – 12/31).

## Improving the model

Now that we know what caused the outliers and thus a lower forecasting accuracy we can improve our model by adding those factors to our original five factors.

As such we’ll add Black Friday, After Black Friday, Christmas sales, After Christmas into the headers of the columns I:L in order to be able to mark the occurrence of those events for each day similar to what we did before for the promotions. As such we’ll add the coefficients for those new factors in Q28:R31 as well.

Last adjustment we have to do now is to add the factors in our forecasting calculation. Put the following formula into Cell M2 and drag it to M1064 (we added the Black Friday, Christmas and the days after those events respectively as factors):

=\$R\$2 + VLOOKUP(B2,\$Q\$4:\$R\$15,2,FALSE) + VLOOKUP(C2,\$Q\$17:\$R\$23,2,FALSE) + E2*\$R\$24 F2*\$R\$26 G2*\$R\$25 H2*\$R\$27 I2*\$R\$28 J2*\$R\$29 K2*\$R\$30 L2*\$R\$31

Now run the solver again with the added variables:

Adding those variables improved R2 drastically to 0.88, which means the variables explain 88% of the daily variation of the revenue

## The actual forecasting

So far we have only worked on the prediction model. Next step is obviously to forecast sales revenue with this model until the end of the year and to examine what impact a 20% or 30% promotion has. This actually the easy part.

First copy the formula from cell M1064 to all cells M1065:M1097. Next mark the Christmas sales time and days after the Christmas sales in column K and L (this is quite easy here as the change in sales revenue is quite obvious. However it might be more difficult to spot in real life data.)

This will give you a prediction on what revenue to expect for December without any promotions: An overall revenue of \$175,734.

Let’s try now what will happen, if we have a 20% promotion on 2018-12-01. Plug in the indicators into G1067 and for the after promotion effects into H1068:H10670 respectively. The overall revenue for December would be \$175,082 now.

Delete the indicators for the 20% promotion and try the same for a 30% promotion by writing a “1” into E1067 and F1068:F10670. Overall revenue for December would be now \$177,291. Not much of a difference but still it is better to go with a 30% promotion.

That’s it, we are done! We are now able to answer all of our initial questions: We know what revenue impact the promotions have respectively and which promotion we should run during Christmas time

## Limitations

You might wonder, why we did not solve the problem with a regression analysis tool. The answer to this is actually quite simple:

For a regression analysis with a typical spreadsheet analysis add-on the dependent variable would have been revenue. In addition we would have needed 25 independent variables to account for the different factors (months, day of the week, promotions, Black Friday, Christmas, after event days).

Most analysis add-ons simply can’t handle that many independent variables (including the Excel Analysis ToolPak, which can handle up to 15 independent variables). That’s why we use the Solver, which allows a lot more changing cells) to estimate the coefficients of the independent variables in order to minimize the sum of squared errors.

In addition as with all models above created model is only a simplification of reality and can never give a 100% accurate forecast. Furthermore in this case the revenue numbers were simplified as well. As such revenue growth wasn’t included. In addition events and after event days were very easily identifiable and could be clearly distinguished from normal days. As mentioned this might be more difficult for a real life data sets.

Nevertheless the above guide provides a good basis to develop your own forecasting models and to fit a model to your revenue data and events. So next step for you is to fit it to your own data set and to predict if promotions for Christmas will make sense for you!