This is the second lesson from the Google Sheets for Marketers mini course. In this lesson I’ll walk you through the most important Google Sheets formulas for analyzing marketing data as well as some useful functions such as conditional formatting and filters. You’ll even build a small tool for comparing performance metrics for different advertising channels.
What you’ll learn in this lesson:
Conditional Formatting, Filters, Data Validation, AVERAGE, MEDIAN, MODE, MAX, MIN, COUNTIF[S], AVERAGEIF[S], SUMIF[S], TRANSPOSE, VLOOKUP, INDEX, MATCH, calculating ROI.
Preparing the data
In our last lesson we generated some first insights from the ad channel data set with the help of pivot tables. One of the insights was that Facebook ads were not performing as well, which is why we are going to focus on those for this lesson to analyze them further.
First make a copy of the worksheet (it’s always a good idea to keep a backup of the original data when manipulating it). Next select all data in the new sheet and in the menu click on Data → Create a filter. You can now click on the three bars next to Channel in cell A1. De-select Twitter Ads and Google Ads by clicking on them respectively. This will allow us to focus on the Facebook Ads data only without distraction from the data from the other advertising channels.
As you learned in the first lesson, revenue and profitability of Facebook Ads are going down. Since costs generally stayed the same, you are guessing it might have something to do with the Average Order Value. To be sure, we are going to apply a coloured heatmap to the values to see if there is a downward trend. Select all cells with Average Order Value data and click on Format → Conditional Formatting in the menu.
While you could colour the cells depending on whether they are not empty or contain a certain value, we want to colour them on a scale from min to max value. As such choose Colour Scale on the right side. Under Preview choose the scale from green to yellow to red, which will colour the lowest values green, mid-range values yellow and high values red (you could choose different colours here, but we are going to leave them as they are for now).
As you can clearly see, Average Order Values dropped significantly at the end of 2017, and as such we got our first insight here (in a real-life scenario I would now e.g. talk to one of the Social Media PPC managers to see what might have happened at the end of the year, if I didn’t know it myself).
Calculating the typical value of the data set
When working with marketing data you will often have to deal with large data sets. It’s often difficult to make sense of those data sets due to the sheer amount of metrics. As such it’s helpful to summarize the data by getting the typical values, min/max values and others to get a feel for the data.
There are three common ways to summarize the typical value of a data set.
The mean or average is simply the sum of the numbers in the data set divided by the number of values in the data set. Write =AVERAGE(G41:G79) into N41 to get the average number of Conversions per month.
The median is the 50th percentile of the data set. This means one half of the data is below the median and the other half is above the median. Write =MEDIAN(G41:G79) into N42 to get the median for Conversions.
Last is the mode of the data, which is simply the most frequently occurring value in the data set. Write =MODE(G41:G79) into N43 to get the mode for Conversions.
As such, on average there were 53.2 conversions via Facebook Ads each month, around half of the time there were fewer than 52 conversions, and the most frequently occurring number of conversions per month was 49.
None of the three is the best per se. In most cases you would take either the mean or the median. Extreme values tend to distort the mean, in which case the median is a better summary of a typical data value. However the median might throw out important information in other situations. So as a general rule of thumb use the mean if no extreme values are present and the median otherwise. In our case no extreme values are present and we can focus on the mean.
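To see why extreme values matter, here is a minimal Python sketch of the same three summaries. The conversion numbers are made up for illustration and are not taken from the workbook.

```python
from statistics import mean, median, mode

# Hypothetical monthly conversion counts (not the workbook's data).
conversions = [49, 52, 49, 55, 61, 53]

typical = {
    "mean": mean(conversions),      # like =AVERAGE(range)
    "median": median(conversions),  # like =MEDIAN(range)
    "mode": mode(conversions),      # like =MODE(range)
}

# Add one extreme month and compare how far each summary moves.
with_outlier = conversions + [400]
mean_shift = mean(with_outlier) - mean(conversions)
median_shift = median(with_outlier) - median(conversions)
```

With a single extreme month the mean moves by dozens of conversions while the median barely changes, which is the reasoning behind the rule of thumb above.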
Finding the smallest and largest values
This is actually an easy one. Simply type in =MAX(H41:H79) in N46 and =MIN(H41:H79) in N47 to get the largest and smallest values respectively.
First we’ll calculate ROI for each month. As Return on Investment = (Revenue - Investment) / Investment, you can put =(H41-I41-J41)/(I41+J41) in K41 and drag it all the way down to K79. You now have the ROI for each month.
To get the total ROI for all months, write =(SUM(H41:H79)-SUM(I41:J79))/SUM(I41:J79) into cell N48. This formula sums the revenue first, subtracts the sum of all costs and then divides the result by the sum of all costs.
To count the number of times we had a positive ROI you can use the COUNTIF formula. This formula counts the values that meet a certain criterion. In our case the criterion is that the ROI is larger than 0. So write =COUNTIF(K41:K79,">0") into a free cell below (N48 already holds the total ROI), e.g. N49.
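The ROI steps above can be sketched in a few lines of Python. The figures and the helper name roi are assumptions for illustration only.

```python
# Hypothetical monthly figures: (revenue, advertising costs, other costs).
months = [
    (1200.0, 600.0, 200.0),
    (900.0, 700.0, 300.0),
    (1500.0, 500.0, 250.0),
]

def roi(revenue, ad_costs, other_costs):
    """ROI = (revenue - total costs) / total costs."""
    costs = ad_costs + other_costs
    return (revenue - costs) / costs

monthly_roi = [roi(*m) for m in months]

# Total ROI: sum first, then divide (not the average of the monthly ROIs).
total_revenue = sum(m[0] for m in months)
total_costs = sum(m[1] + m[2] for m in months)
total_roi = (total_revenue - total_costs) / total_costs

# Like =COUNTIF(range, ">0")
positive_months = sum(1 for r in monthly_roi if r > 0)
```

Note that the total ROI is computed from the summed revenue and costs. Averaging the monthly ROIs would weight a small month the same as a large one and generally gives a different number.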
Putting the metrics into context
Above we calculated several performance metrics for Facebook ads. However they are quite useless, if we don’t put them into context (a general rule for marketing analysis: Never just dump metrics out there, always put them into context and make them actionable). In our case the context would be to compare the Facebook ad metrics with Twitter and Google ad metrics. With the exercises we did above we have the tools to do exactly that.
First select Google Ads and Twitter Ads in the filter in A1 as well so you can see the data for all advertising channels. Next calculate ROI for those two channels by dragging the ROI formula into the empty cells in column K.
As preparation write the following headlines into the corresponding cells:
Facebook Ads in cell M3
Google Ads in cell M4
Twitter Ads in cell M5
Average CPC: in cell N2
Max: in cell O2
Number of positive ROI months: in cell P2
Total ROI: in cell Q2
We’ll calculate the average CPC for each channel first. You can use the AVERAGEIF formula for this. The AVERAGEIF formula checks if a cell in the criterion range matches a certain criteria and will only average the values of the row with matching criteria. E.g. put =AVERAGEIF(A2:A118,M3,F2:F118) into N3. Sheets now checks if the cell in specified range A2 to A118 matches the value of M3 (Facebook Ads) and will only calculate the average of the values in range F2:F118 of the rows with matching criteria.
Put $ in front of the range row specifiers and drag the formula into N4 and N5 to do the same for the other channels (the $ will keep the range the same).
We’ll now do something similar to find the maximum Average Order Value of each channel. As such put =MAXIFS(C$2:C$118,A$2:A$118,M3) into O3 and drag it into O4 and O5. Keep in mind that the order of the arguments is different here: the range to evaluate comes first. Unfortunately the order varies between the *IF formulas. So always pay attention to the hints in the upper left corner, which give specific instructions here.
Next we’ll use the COUNTIFS formula to calculate the number of positive ROI months. COUNTIFS allows several criteria (as opposed to COUNTIF). Put =COUNTIFS(A$2:A$118,M3,K$2:K$118,">0") into P3. This formula only counts rows which match M3 (Facebook Ads) in column A and have a value >0 in column K.
Drag the formula down to get the counts for the other channels.
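Last we’ll calculate total ROI. For this we’ll use SUMIF, which works similarly to the AVERAGEIF formula: it only sums the values of the rows in which the criteria range matches the criterion. The formula to calculate the ROIs is pretty long. However you are basically summing up Revenue, subtracting the sums of Advertising Costs and Other Costs, and then dividing that by the sums of Advertising Costs and Other Costs. Following the column layout used above (H = Revenue, I = Advertising Costs, J = Other Costs), such a formula could look like this: =(SUMIF(A$2:A$118,M3,H$2:H$118)-SUMIF(A$2:A$118,M3,I$2:I$118)-SUMIF(A$2:A$118,M3,J$2:J$118))/(SUMIF(A$2:A$118,M3,I$2:I$118)+SUMIF(A$2:A$118,M3,J$2:J$118)). Put it into Q3 and drag it down to Q4 and Q5.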
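As a rough sketch of what the conditional formulas do, the following Python mirrors AVERAGEIF and SUMIF on a tiny table. The rows, the column positions and the function names sum_if, average_if and total_roi are hypothetical and only serve the illustration.

```python
# Hypothetical rows: (channel, cpc, revenue, ad_costs, other_costs).
rows = [
    ("Facebook Ads", 0.50, 1000.0, 600.0, 200.0),
    ("Google Ads",   0.80, 3000.0, 900.0, 300.0),
    ("Facebook Ads", 0.70,  800.0, 650.0, 250.0),
    ("Twitter Ads",  0.40, 1500.0, 400.0, 100.0),
]

def sum_if(rows, channel, col):
    """Like =SUMIF(criteria_range, channel, sum_range)."""
    return sum(r[col] for r in rows if r[0] == channel)

def average_if(rows, channel, col):
    """Like =AVERAGEIF(criteria_range, channel, average_range)."""
    values = [r[col] for r in rows if r[0] == channel]
    return sum(values) / len(values)

def total_roi(rows, channel):
    """(revenue - all costs) / all costs, restricted to one channel."""
    costs = sum_if(rows, channel, 3) + sum_if(rows, channel, 4)
    return (sum_if(rows, channel, 2) - costs) / costs

facebook_cpc = average_if(rows, "Facebook Ads", 1)
facebook_roi = total_roi(rows, "Facebook Ads")
google_roi = total_roi(rows, "Google Ads")
```

Each helper first filters the rows by channel and only then aggregates, which is the same idea as the criteria range in the sheet formulas.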
Your new table comparing performance metrics from the different advertising channels is technically done. However it is kind of hard to read. It would be better to have the metric titles as rows and the different channels as columns. This can easily be done with TRANSPOSE, which interchanges rows and columns. Put =TRANSPOSE(M2:Q5) into M8 (you could also use the paste option of the same name instead, however that would mess up the formulas).
Even though the format is better now, it is still hard to compare the performance metrics at first sight. Some colour coding would be nice… Luckily you already learned how to do heat maps in the beginning. Select cells N9 to P9 and click in the menu on Format → Conditional Formatting. Choose the colour scale on the right side with green to yellow to red. Do the same for cells N10 to P10, N11 to P11 as well as N12 to P12. However for those three interchange green and red, as we want green to indicate values where the respective channel is better than the other channels.
The final result (and insight) shows us that Twitter Ads actually compare quite well to the others in all metrics, even though the Average Order Value is quite low. Facebook Ads on the other hand perform quite badly in all metrics compared to the other channels. This might indicate that you should shift some budget from Facebook Ads to Twitter Ads.
Preparing the data for charts
The last part of this lesson will prepare the data for building some charts (and a simple report which you could send out to other stakeholders or clients). As such we will work in the sheet Solution – Charts.
There is actually another new sheet called Worksheet – Budgets/Costs, which contains the budgets and actual costs of several 2019 advertising channels.
As we are analyzing Facebook Ads, Twitter Ads and Google Ads more closely you obviously don’t want to have all of the Budget/Costs data in your Solution – Charts sheet.
You could just copy the relevant data from the former sheet to the latter one. However in very long lists it can be very tedious to find the relevant data. There is a smarter way called VLOOKUP, which will find relevant data for you based on a key.
Prepare your sheet by writing the following in the cells:
Facebook Ads in cell A2
Google Ads in cell A3
Twitter Ads in cell A4
Budget 2018 in cell B1
Actual Cost in cell C1
=VLOOKUP($A2,'Worksheet – Budgets/Costs'!$A$2:$C$10,2,FALSE) in cell B2
VLOOKUP searches for the key in A2 (Facebook Ads) in the first column of the range A2 to C10 in Worksheet – Budgets/Costs and returns the cell in the 2nd column of the row where it finds the key. FALSE tells Sheets that the range is not sorted, so it looks for an exact match. Paste the formula into cells B3, B4, C2, C3 and C4 as well. Since we are looking for the actual costs in column C, you have to replace the 2 in the formulas of C2, C3 and C4 with a 3 to return a cell from the third column.
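Conceptually, an exact-match VLOOKUP is a scan down the first column followed by picking a column from the matching row. A minimal Python sketch, with a made-up budget table and a hypothetical helper named vlookup:

```python
# Hypothetical budget table: (channel, budget, actual cost).
budget_table = [
    ("Bing Ads", 5000, 4100),
    ("Facebook Ads", 20000, 21500),
    ("Google Ads", 30000, 28900),
    ("Twitter Ads", 10000, 9800),
]

def vlookup(key, table, index):
    """Return column `index` (1-based) of the first row whose first column
    equals `key`, like =VLOOKUP(key, range, index, FALSE)."""
    for row in table:
        if row[0] == key:
            return row[index - 1]
    return None  # Sheets would show #N/A here

channels = ["Facebook Ads", "Google Ads", "Twitter Ads"]
report = [
    (c, vlookup(c, budget_table, 2), vlookup(c, budget_table, 3))
    for c in channels
]
```

The index argument plays the same role as the 2 and 3 in the sheet formulas: it selects the budget or the actual cost column of the matching row.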
Building a performance metric comparison tool
Two other useful formulas to find data are INDEX and MATCH. Those two combined are a powerful tool to find data in large data sets. INDEX gets a value at a specified location in a range of cells based on the numeric position. E.g. putting =INDEX(A1:C4,2,3) in any cell in the sheet Solution – Charts will get you the cell in the second row and third column of the range A1 to C4 (in this case that would be $24,310).
MATCH will find the numeric position of an item in a list. E.g. putting =MATCH("Google Afs",A2:A4,1) in any cell in the sheet Solution – Charts will get you the position of Google Ads in the list A2 to A4. The last 1 indicates that we are looking for an approximate match rather than an exact match (in which case we would use 0). Note that the approximate match is not a spell checker: it assumes the list is sorted in ascending order and returns the position of the largest value that is less than or equal to the search key. The typo is only ignored here because “Google Afs” sorts between Google Ads and Twitter Ads.
We will use those two formulas to build a small dynamic performance metric comparison tool. First write
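The difference between the two MATCH search types can be sketched with Python's bisect module. The function names are hypothetical, and Python's plain string ordering is only a stand-in for how Sheets compares text.

```python
from bisect import bisect_right

channels = ["Facebook Ads", "Google Ads", "Twitter Ads"]  # sorted ascending

def match_exact(key, values):
    """Like =MATCH(key, range, 0): 1-based position or None."""
    return values.index(key) + 1 if key in values else None

def match_sorted(key, values):
    """Like =MATCH(key, range, 1) on an ascending list: 1-based position of
    the largest value <= key, or None if every value is greater."""
    pos = bisect_right(values, key)
    return pos if pos > 0 else None

approx = match_sorted("Google Afs", channels)
exact = match_exact("Google Afs", channels)
```

Here approx points at Google Ads only because the misspelled key happens to sort right after it, while exact finds nothing. On an unsorted list the sorted variant can return a misleading position, so the exact search is usually the safer default.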
Total ROI: in cell A9
Twitter Ads in cell B8
Google Ads in cell C8
Copy this formula into B9 and paste it into C9 as well:
=INDEX('Solution – Functions'!$N3:$Q5,MATCH(B8,'Solution – Functions'!$M3:$M5,0),MATCH($A9,'Solution – Functions'!$N2:$Q2,0))
It is actually a simple INDEX function, however the row and column indicators are replaced by MATCH functions. So MATCH(B8,'Solution – Functions'!$M3:$M5,0) looks for the value in B8 (=Twitter Ads) in range M3 to M5 of the Solution – Functions sheet and gives back its position (=3), while MATCH($A9,'Solution – Functions'!$N2:$Q2,0) looks for the value in A9 (=Total ROI:) in range N2 to Q2 of the Solution – Functions sheet and gives back that position (=4). The INDEX function takes the positions and uses them as row and column indicators for the specified range respectively.
The cool thing now is that if you e.g. change Twitter Ads in cell B8 to Facebook Ads, the value in B9 updates automatically!
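The INDEX and MATCH combination boils down to a two-dimensional lookup by labels. A small Python sketch with made-up metric values (only the labels follow the headers used earlier):

```python
channels = ["Facebook Ads", "Google Ads", "Twitter Ads"]  # like M3:M5
metrics = [                                               # like N2:Q2
    "Average CPC:",
    "Max:",
    "Number of positive ROI months:",
    "Total ROI:",
]

# Hypothetical values, one row per channel, one column per metric (like N3:Q5).
values = [
    [0.60, 48.0, 20, 0.06],
    [0.80, 75.0, 35, 1.50],
    [0.40, 45.0, 33, 2.00],
]

def index_match(channel, metric):
    row = channels.index(channel)  # MATCH(channel, M3:M5, 0), zero-based here
    col = metrics.index(metric)    # MATCH(metric, N2:Q2, 0), zero-based here
    return values[row][col]        # INDEX(N3:Q5, row, col)

twitter_roi = index_match("Twitter Ads", "Total ROI:")
```

Changing either label changes the looked-up cell, which is what makes the comparison tool dynamic.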
However every proper tool has some dropdown menus. We can add those with data validation. Data validation tells Sheets that only certain values are allowed in a cell. Select cells B8 and C8 and right click on them. Choose Data validation… In the empty field next to List from range paste this: 'Solution – Functions'!M3:M5. That is a list of the three advertising channels we are analyzing. Click on Save.
Do the same for cell A9 by right clicking on it, choosing Data validation… and pasting 'Solution – Functions'!M9:M12 into the empty field. Save.
You can now use the dropdown menus to choose the comparison metric as well as the channels you want to compare. We prepared everything in this sheet for the next charts lesson. Based on the data we will create some charts, modify them to look better and I’ll show you how they can be updated dynamically to build some simple beautiful reports.
Pivot tables are one of the easiest and quickest tools to analyze marketing data and to draw some first actionable insights. As such they shouldn’t be missing in the basic skill set of every marketer. This is an introductory session to pivot tables.
What you’ll learn in this lesson:
Basics of pivot tables, different aggregation options (SUM, AVERAGE, %, etc.), pivot groupings, calculated fields.
Examining marketing channel performance with pivot tables
First of all make a copy of this workbook. It contains the raw data the example below is based on as well as the solution sheets. Obviously you can just read through this guide. However I highly recommend making a copy and working along!
For this case we will use the reported data for three paid advertising channels (Google Ads, Facebook, Twitter) from a t-shirt ecommerce store as a basis for the analysis. Obviously this is only an example and you could use pivot tables for analyzing other data such as sales revenue from different regions, customer orders or cost by location.
The example data contains the last three years and includes: Average Order Value, Impressions, Clicks, CPC, Conversions, Revenue, Advertising Costs and Other Costs segmented by month and advertising channel.
Disclaimer: The values for the advertising channels are completely random and should not be seen as representative for one channel or the other.
In this lesson you’ll do the following things:
- Examine absolute revenue and revenue share by channel and month
- Describe the influence of seasonality and overall trend
- Analyze profitability and order values by advertising channel
Analyzing absolute revenue and revenue shares
In this first part you’ll learn how to get some first insights on how each advertising channel is performing during the respective months. We’ll start by looking into absolute revenue numbers and then analyze what months and channels drive most of the revenue.
How much revenue does each channel generate?
Click anywhere in one of the cells containing data in the Worksheet – Raw Data sheet. Afterwards click in the menu on Data and then Pivot table… to prompt the pivot table pop-up. Google Sheets should have correctly guessed the range which contains data. So simply click on create to create the pivot table in a new sheet.
In the new sheet you’ll see the pivot table as well as the table editor on the right side, which you can use to build the table. We’ll start by clicking on Add next to Rows and adding Channel there. Now click on Values to add Revenue. We already know now what the lifetime revenue of each channel is:
However it would also be interesting to know how much revenue on average each channel does each month. For this simply click on the dropdown SUM below Summarize by on the right side and choose AVERAGE instead.
Next we want to know what the total revenue is per month. So set AVERAGE back to SUM, click on Add next to Columns and add Month.
What’s the revenue share of each channel compared with advertising costs?
Even though knowing the absolute revenue of each channel is already helpful to get a general idea of the channel performance, looking at shares or percentages is often more insightful.
So remove Month by clicking on the X next to it. In the Revenue tab click on the Show as dropdown next to the Summarize by dropdown and choose % of column instead of Default. Next add another Value called Advertising Costs and do the same.
You just unlocked your first small marketing insight!
While Facebook ads account for roughly 30% of advertising costs, they only account for 18% of revenue. The other two channels do a lot better here and as such there is definitely room for optimization or even a shift of budget. But we’ll look more into this in the later sessions.
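The % of column option simply divides each value by its column total. A minimal Python sketch, using made-up totals chosen to mirror the shares mentioned above rather than the workbook's actual numbers:

```python
# Hypothetical lifetime totals per channel: (revenue, advertising costs).
totals = {
    "Facebook Ads": (18000.0, 15000.0),
    "Google Ads": (52000.0, 20000.0),
    "Twitter Ads": (30000.0, 15000.0),
}

def column_shares(totals, col):
    """Each channel's value divided by the column total."""
    column_total = sum(v[col] for v in totals.values())
    return {name: v[col] / column_total for name, v in totals.items()}

revenue_share = column_shares(totals, 0)
cost_share = column_shares(totals, 1)
```

Comparing revenue_share with cost_share per channel shows at a glance which channel takes a larger slice of the costs than it returns in revenue.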
The influence of seasonality and overall time trends
While we looked into channel performance above we’ll now examine the performance of individual years and months more closely. As preparation switch the pivot table back to Month as Columns and Revenue as Values.
Which months are on average the highest grossing?
Right click on any month in the month header and choose Create pivot date group… –> Month. This will group the months of each year (e.g. February ‘19 with February ‘16, February ‘17 and February ‘18).
Next in the Revenue tab switch Summarize by from SUM to AVERAGE. This gives us the average revenue for each kind of month. Obviously this would already be enough to answer the question above. However it’s a lot easier if we sort the months by revenue in descending order. This can easily be done by choosing Descending in the dropdown menu below Month and Order as well as AVERAGE of Revenue in the Sort by dropdown.
This gives us our second little marketing insight: Not surprisingly for a t-shirt retailer, summer months are the strongest revenue wise.
How is the revenue performance year-over-year?
Ungroup the months and create a pivot group by Year instead. Also switch the fields back to default (Columns: sorted by Month, Values: SUM of Revenue):
Since we want to look at year-over-year growth and 2019 is not done yet, we are going to filter it out. Simply click on Add next to Filter, choose Month and un-select all 2019 months (January, February, March, April).
Write =C3/B3-1 in cell C4 and =D3/C3-1 into cell D4 respectively to calculate the growth rates.
This results in our next insight: revenue growth rates are actually dropping!
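The growth formulas divide each year by the previous one and subtract 1. A short Python sketch with hypothetical yearly totals:

```python
# Hypothetical yearly revenue totals (not the workbook's data).
revenue_by_year = {2016: 100000.0, 2017: 130000.0, 2018: 149500.0}

def yoy_growth(by_year):
    """Growth of each year over the previous one: current / previous - 1."""
    years = sorted(by_year)
    return {y: by_year[y] / by_year[p] - 1 for p, y in zip(years, years[1:])}

growth = yoy_growth(revenue_by_year)
```

In this made-up series revenue still rises every year, but the growth rate falls from 30% to 15%, the same kind of pattern as in the insight above.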
Examining profitability and order values
Last thing we want to do is to look at the profits and average order values of each advertising channel.
What is the profit per channel for each year?
First unfilter the 2019 months since we also want to have a look at the most recent months. Also click the X next to Revenue to delete the field. Luckily Google Sheets pivot tables allow us to add calculated fields. And since profit = revenue - cost, we can simply click on the Add button next to Values, choose Calculated field and add the following formula (each item in the formula equals a column name of the raw data):
='Revenue'-'Advertising Costs'-'Other Costs'
This will give us a profit field and another insight: In addition to the year-over-year drop in revenue, Facebook Ads are dropping in profitability.
How are the average order values per advertising channel distributed?
First of all delete the calculated profit field and add Average Order Value as row as well as Value and Channel as column. For the latter one choose % of column as Show as.
Above we already grouped by date, however it is actually also possible to choose custom groupings. Just click on one of the Average Order Value values and click Create pivot group rule… . Set Interval size to $10. This divides Average Order Value into $10 buckets. E.g. in our case this means that in 23% of all months Twitter Ads had an average order value of $40 – $50.
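Grouping by a fixed interval size can be sketched as follows in Python. The order values are invented, and how Sheets treats values lying exactly on a bucket boundary is not verified here.

```python
from collections import Counter

# Hypothetical average order values for one channel, one per month.
order_values = [42.5, 47.0, 51.2, 38.9, 44.1, 58.0, 49.99, 40.0]

def bucket(value, size=10):
    """Lower bound of the interval the value falls into, e.g. 47.0 -> 40."""
    return int(value // size) * size

counts = Counter(bucket(v) for v in order_values)

# Share of months per bucket, comparable to "% of column" in the pivot table.
shares = {b: n / len(order_values) for b, n in sorted(counts.items())}
```

In this example the $40 to $50 bucket holds five of the eight months, so its share is 62.5%.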
As such our last insight is, that Facebook as well as Twitter Ads have a significantly lower average order value than Google Ads.
That’s it, you are done! You learned all important pivot table functions and how to use them to gain some first insights from marketing raw data. In the next lesson we will further analyze the data with Google Sheets formulas and functions.
You are working as a digital marketer or digital marketing analyst in a company. The company has several thousand online customers, but beyond some top level metrics they don’t have any customer-focused insights. As such you might have the typical Google Analytics reports (e.g. which sources are customers coming from), know what products are purchased most often and what the average order value is. But what your stakeholders lack is a better understanding of the customers in order to drive marketing & content decisions and strategies for acquisition, growth and retention. That’s why they come to you asking to share some general “who are the customers?” insights with them.
Even though this is probably one of THE typical asks for marketers and analysts, it might raise some inner stress levels due to its vagueness. The question above can mean anything from rather top level customer personas to in-depth customer and purchasing behavior reports.
So imagine you had an analysis that would not only group all customers into different clusters but also allow you to develop and present to your stakeholders targeted strategies for each cluster based on its characteristics. All leading to an optimized marketing & sales approach and in-depth customer insights for improved conversion rates. And the best of all is all you need is the transactions data of your company.
That’s where the RFM analysis comes into play. It’s a simple to understand and easy to apply data analysis model to segment your customers. The following is a step-by-step tutorial on how to create such a model in Google Sheets. Furthermore it shows you specific strategy recommendations for each of the key customer clusters (and if you want to get started quickly you can plug your data into the provided workbook to use it as a template in order to segment your clients right away).
The recency frequency monetary (RFM) analysis
The recency frequency monetary analysis (RFM analysis) is a classic analysis model for behavior based consumer segmentation. It segments customers by scoring them on a 1-5 scale in regards to how recently, how often, and how much they have bought (different scales may be used, however the 1-5 is usually the one used in a commercial context). Those three factors can then be used to predict how likely it is that a customer will purchase (or for some business models engage, e.g. apps) again.
Furthermore those segments can be grouped into clusters, allowing you to develop targeted individualized content and promotion strategies which are more likely to convert with the customers in each cluster. Because each cluster is assigned a monetary value, high-value customers can be identified easily and marketing spend can be allocated accordingly.
This guide gives an RFM analysis example and shows step by step how to conduct a recency frequency monetary analysis with your data in Google Sheets. Afterwards it explains the content and promotion strategies which can be applied to each individual cluster.
As usual I recommend working along in the above provided Google Sheet in order to understand everything. The workbook contains two sheets: Sample Data includes dummy data representing a transaction list (this could be anything from software sales to ecommerce store orders) and RFM Model which includes the solution to the RFM analysis example this guide is working towards.
If you are in a hurry: It is possible to replace the dummy data with your own data to use the RFM Model as a ready-made template.
Creating a RFM analysis example step by step
First step is to prepare the data and to calculate the following metrics for each customer:
- The most recent transaction
- The number of transactions per month for each customer
- The average amount purchased each month by each customer
In the following we will work in the sheet Tutorial of the provided workbook.
We’ll start by finding out the number of transactions each individual customer had. This is easily done by copying the formula =COUNTIF('Sample Data'!B:B,A2) from B2 down to B3403. It will count how often the value from the referenced cell occurs in column B from the sheet Sample Data.
Next step is to identify the most recent transaction for each customer. You can do so by copying the following formula from C2 down to C3403:
=MAXIFS('Sample Data'!C$2:C,'Sample Data'!B$2:B,$A2)
The formula uses cell A2 as a reference to filter the corresponding rows in the sheet Sample Data with column B and returns the highest (=most recent) date from column C respectively.
Similar to this we use the formula =MINIFS('Sample Data'!C$2:C,'Sample Data'!B$2:B,$A2) in cells D2 to D3403 to get the date of the first transaction of each customer.
For the final RFM model you’ll need the amount of time the customer has been with the business. In our example we’ll use months for this. As such put =DATEDIF(D2,NOW(),"M") into E2 and drag it down to E3403.
Next we want to know how much each customer spends on average each month. Plug in =SUMIF('Sample Data'!B:B,A2,'Sample Data'!D:D)/E2 from F2 to F3403.
The last step is to calculate the average number of transactions per month for each customer. So simply write =B2/E2 into G2 and copy it down.
For all of the above you could also use "Y" or "D" instead of "M" to set the time unit to years or days respectively. It doesn’t really matter what you choose as we’ll be coding each data point into a 1-5 scale for the RFM analysis later on anyway. However to make your data more tangible, choose a unit which makes sense for your business model, e.g. if you are selling cars it would make sense to choose years, while it might make more sense to choose days when you are selling coffee.
Calculating R, F and M
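The preparation steps above can be sketched in Python as one pass over a transaction list. The transactions, the fixed "today" date and the function names are assumptions for illustration. The helper full_months only approximates DATEDIF with "M", and the one-month floor is an addition the sheet formulas do not have.

```python
from datetime import date

# Hypothetical transactions: (customer id, date, amount).
transactions = [
    ("C1", date(2018, 1, 15), 40.0),
    ("C1", date(2018, 3, 2), 60.0),
    ("C2", date(2018, 2, 20), 25.0),
    ("C1", date(2018, 6, 30), 20.0),
]

def full_months(start, end):
    """Complete months between two dates, roughly DATEDIF(start, end, "M")."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    return months - 1 if end.day < start.day else months

def customer_metrics(transactions, today):
    metrics = {}
    for customer, day, amount in transactions:
        m = metrics.setdefault(
            customer, {"count": 0, "first": day, "last": day, "total": 0.0}
        )
        m["count"] += 1                    # COUNTIF
        m["first"] = min(m["first"], day)  # MINIFS
        m["last"] = max(m["last"], day)    # MAXIFS
        m["total"] += amount               # SUMIF
    for m in metrics.values():
        # Assumption: treat customers younger than one month as one month old
        # to avoid dividing by zero; the sheet formulas do not do this.
        m["months"] = max(full_months(m["first"], today), 1)
        m["amount_per_month"] = m["total"] / m["months"]
        m["transactions_per_month"] = m["count"] / m["months"]
    return metrics

metrics = customer_metrics(transactions, today=date(2018, 7, 15))
```

Passing today explicitly instead of using the current date keeps the result reproducible, whereas the sheet recalculates with NOW() every time it is opened.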
Two steps are necessary to calculate the R, F and M scores:
- Determine how each customer ranks for recency, frequency and monetary
- Assign a score to each recency, frequency and monetary rank
Luckily Google Sheets has a handy formula for returning the rank of a specified value in a dataset.
As such plug in and copy down the following formulas.
For the recency rank: =RANK(C2,C$2:C$4296,1) in H2 to H3403
For the frequency rank: =RANK(G2,G$2:G$4296,1) in I2 to I3403
For the monetary rank: =RANK(F2,F$2:F$4296,1) in J2 to J3403
The last argument (1) in the formulas ensures that the highest values in the respective dataset get a higher rank and vice versa. E.g. a customer with an average order value of $10 would get a higher rank than a customer with an order value of $5.
Next we’ll create an RFM rank matrix to convert a customer’s ranks on recency, frequency and monetary into the wanted 1-5 rating. For this you will use the formula PERCENTILE to get the minimum rank a customer has to have on each of the three factors to get a certain rating. Since we have ratings from 1-5 we’ll need five percentiles, which equal 20% steps.
Put =PERCENTILE(H$2:H$4296,0.8) into Q2 to get the lowest possible rank for the highest R score, similarly put =PERCENTILE(H$2:H$4296,0.6) into Q3 to get the lowest possible rank for an R score of 4 and so on. Do the same for the F score in column R and for the M score in column S.
Now we’ll do the actual conversion for each customer. Put the formula =if(H2>=Q$2,$T$2,if(H2>=Q$3,$T$3,if(H2>=Q$4,$T$4,if(H2>=Q$5,$T$5,$T$6)))) in K2 and copy it down to K3403 to determine the R score. The conditional statements determine whether the respective rank is above one of the thresholds of the RFM rank matrix and then assign the corresponding score. Do the same for the F score by copying =if(I2>=R$2,$T$2,if(I2>=R$3,$T$3,if(I2>=R$4,$T$4,if(I2>=R$5,$T$5,$T$6)))) from L2 to L3403 and for the M score by copying =if(J2>=S$2,$T$2,if(J2>=S$3,$T$3,if(J2>=S$4,$T$4,if(J2>=S$5,$T$5,$T$6)))) from M2 to M3403.
As a last step we want to count how often each of the RFM combinations (there are 5x5x5 = 125 combinations) occurred. Again we will use the COUNTIF formula for this. Write =COUNTIF(N:N,Q10) into R10 and copy it down to R134. This assumes that column N holds each customer’s combined three-digit RFM score.
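The rank, threshold and score steps can be sketched together in Python. The input values are made up, the function names are hypothetical, and the percentile helper uses linear interpolation, which is assumed here to mirror what PERCENTILE does in Sheets.

```python
def rank_ascending(values):
    """Like =RANK(value, range, 1): 1 + number of strictly smaller values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def percentile(values, p):
    """Linear-interpolation percentile (assumed to mirror PERCENTILE)."""
    ordered = sorted(values)
    pos = p * (len(ordered) - 1)
    low = int(pos)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (pos - low)

def scores(values):
    """Convert raw values to 1-5 scores via ranks and percentile thresholds."""
    ranks = rank_ascending(values)
    # Thresholds for scores 5, 4, 3, 2 (like Q2:Q5); everything below gets 1.
    thresholds = [
        (percentile(ranks, p), s)
        for p, s in [(0.8, 5), (0.6, 4), (0.4, 3), (0.2, 2)]
    ]
    # Same cascade as the nested IF formula: first threshold met wins.
    return [next((s for t, s in thresholds if r >= t), 1) for r in ranks]

# Hypothetical average monthly spend of ten customers.
monthly_spend = [30, 5, 50, 20, 10, 45, 15, 40, 25, 35]
m_scores = scores(monthly_spend)
```

With ten distinct values each score is assigned to exactly two customers, which is the intended effect of the 20% steps.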
You are done! …at least technically. You gave each of your customers an RFM score based on the recency, frequency and monetary value of past purchases. However in order to make your analysis actionable you should group your 125 segments into more meaningful clusters and define strategies for each of them.
Clustering the RFM segments
Even though many others suggest having ten or more clusters I would recommend focusing on having only six key clusters (at least in the beginning). As such the implications for each cluster are still manageable and content and promotion strategies for each can be created.
The following describes content strategies for your email list and consequently customer segmentation for each of the defined clusters.
High Value Customers
Description: These customers are your most valuable customers. They buy frequently, are spending a high amount on each transaction and are still very active (=bought something recently).
Strategy: Obviously these customers have proven that they are willing to pay and to buy often from you. So don’t use price incentives (e.g. discounts) to generate incremental sales. They love engaging with you, so rather reward them by testing new product (or feature) launches with them first. As they are your most loyal customers, the probability that they give valuable feedback on the new products and recommend them to others is highest with them. If possible try implementing loyalty programs (as well as advocacy and review programs) to reward and keep their loyalty. In addition these are the customers you should target your most expensive products at.
Loyal Customers
Segments: 34X, 35X, 44X, 45X, 54X, 551, 552, 553, 554
Description: Your core group of loyal customers. While they might be spending less frequently or lower amounts than your High Value Customers, they are still very valuable as they are regular and recent purchasers of your products.
Strategy: Although probably not as effective as with your High Value Customers, you can still look into upsell opportunities. As such, if you are selling several products (e.g. as an ecommerce business) you can add value for those clients by recommending products based on previous purchases. Advocacy and review programs can help you spread word of mouth for your business through this cluster.
New Customers
Segments: 51X, 52X
Description: Your newest customers. They recently had their first transaction with you. As such they obviously have a low frequency score, even though they can already have a high monetary score if they are high spenders.
Strategy: Most first time buyers will never graduate to promising and finally loyal customers. It is important to have an optimized onboarding with clear strategies in place (such as a triggered welcome email sequence) to encourage repeat purchases.
Promising Customers
Segments: 33X, 43X, 53X
Description: Customers, who finished the onboarding process but aren’t in the Loyal Customer cluster yet. They buy fairly often, but haven’t reached the frequency levels of the Loyal Customers or High Value Customers yet.
Strategy: You have already accomplished an initial relationship with the customers. Now you should focus on increasing monetization and frequency depending on what they are currently lacking. You can test personalized product recommendations based on past purchases and special offers based on spending thresholds. Increase brand awareness to stay top of mind and to increase frequency.
Need Attention Customers
Segments: 24X, 25X
Description: Customers who once purchased from you with a medium to high frequency but stopped for some reason a while ago.
Strategy: The goal for this customer cluster is to reactivate them before they are lost altogether. Part of this can be finding out why they left, by analyzing their behavior or surveying them. Try limited-time offers as well as individualized recommendations based on past transactions. Price incentives can also be tested.
Description: These customers have not purchased from you in a long while. Some of them might have been high frequency and big spenders, but stopped buying at some point.
Strategy: As with the “Need Attention” segment, you should try to reconnect with these customers. You can try to reactivate them with even more aggressive offers and price incentives. However, it can make sense to segment this cluster even further by F and M in order to identify low value customers.
There are still some gaps left, as not every customer segment is covered. The clusters above are only the key ones you should focus on in the beginning. You can group all remaining segments into a General Population group and send it more generalized, less segment-specific content. Once you have mastered the key groups above, you can start clustering the rest of your segments.
In addition, the thresholds for the clusters above aren’t set in stone. As soon as you fully understand the implications of the model and the clusters for your business, you can start moving the segments around to form your own clusters that fit your business best.
You can add the following formula to cell O2 and copy to O3403 to group the segments with above clusters in the Google Sheets workbook:
IF(AND(K2=2,L2>3),"Need Attention","General"))))))
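The grouping logic for the clusters whose segment codes are listed above can also be sketched outside the sheet. The following Python function is only an illustration, not the workbook formula. It assumes that the three digits of a segment code are the R, F and M scores in that order (as the K2=2, L2>3 test for 24X and 25X suggests), and the labels "Loyal", "New" and "Promising" are placeholder names chosen for this sketch. Segment codes that this section does not list fall through to "General".

```python
def cluster(r, f, m):
    """Map R, F, M scores (1 to 5 each) to a cluster label.

    Only the segment codes listed in this section are covered;
    the labels other than "Need Attention" and "General" are assumptions.
    """
    # 34X, 35X, 44X, 45X, 54X and 551 to 554
    if (r, f) in {(3, 4), (3, 5), (4, 4), (4, 5), (5, 4)}:
        return "Loyal"
    if r == 5 and f == 5 and m <= 4:
        return "Loyal"
    # 51X, 52X
    if r == 5 and f in (1, 2):
        return "New"
    # 33X, 43X, 53X
    if f == 3 and r in (3, 4, 5):
        return "Promising"
    # 24X, 25X (the condition K2=2, L2>3 in the sheet formula)
    if r == 2 and f > 3:
        return "Need Attention"
    # Everything not listed in this section
    return "General"


for code in ("245", "553", "512", "431", "111"):
    r, f, m = (int(digit) for digit in code)
    print(code, cluster(r, f, m))
```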
Did You Enjoy This?
Then feel free to sign up for my newsletter. Get my newest articles on career & skill development for marketers and guides on technical marketing every week.
With the new year ahead, many marketers and analysts will be tasked with creating a marketing plan for the year. Imagine you are one of them and have been asked to present the plan next week. Sooner or later during the preparation you will have to address a pain point that nearly every analyst or data-driven marketer runs into: how do you realistically forecast sales, revenue, conversions or a similar metric for the next year?
During the presentation you will have to defend those numbers, so you need a robust model for predicting them. On the other hand, it can’t be too complex, as you have to present in just a couple of days.
So what if you had a simple technique that lets you accurately predict metrics and is very easy to implement?
That’s what this tutorial is about. It teaches you the moving average forecasting method for forecasting future sales, revenue, etc. in Google Sheets. If you are reading this at the last minute and need forecasts right away, you can just plug your numbers into the template below, but I highly recommend working through the guide to understand everything.
First of all, for those of you who only need a sales template for predicting revenue or other metrics: please find it above. Make a copy, open the sheet Data Input and copy your monthly revenue and conversion numbers from the last two years into cells D3:D26 and E3:E26 respectively.
You’ll find the output, i.e. the actual projections in the sheet Forecast.
You can simply change the column headers (D2 and E2) if you want to name your forecasts differently.
However, to understand the techniques behind it and to be able to tweak the forecast, I highly recommend working through the following guide. I’ll show how the predictions were modeled with the ratio to moving average forecasting method.
The Ratio to Moving Average Forecasting Method
To work along, please also open the Google Sheets workbook above. In addition to the Data Input and Forecast sheets you’ll find two more worksheets: the first contains the example data you can use to work along, and the second contains the solution for the example data. Even though the template forecasts conversions as well as revenue, we will only work on revenue prediction in the following. The forecasting technique is the same for both.
The example data has already been cleaned and prepared, so you can start right away.
The ratio to moving average forecasting method uses trend and seasonal indices to forecast future sales, revenue, conversions or whatever other time series you want to forecast. It is a very easy-to-use four-step method. We will use it in our example to forecast sales revenue with the following four steps:
- Estimate the deseasonalized level of sales during each month (using centered moving averages).
- Fit a trend line to the deseasonalized estimates.
- Determine the seasonal index for each month and estimate the future sales by extrapolating the trend line.
- Predict future sales by adding seasonality to the trend line estimate.
Calculating Moving Averages and Centered Moving Averages
First you’ll have to create a full-year moving average for each month by averaging the current month, the six prior months and the next five months. Averaging over a full year eliminates seasonality. To do so, enter the following formula in cell F8 and drag it down to cell F20: =AVERAGE(D2:D13).
This means, for example, that the moving average for month no 13 (January-18) is $18.7k. The moving average for month no 13 averages months no 7 to 18. Adding those month numbers up (7+8+9+10…+18) and averaging them gives 12.5, so the moving average for month no 13 is centered at month no 12.5. Similarly, the moving average for month no 14 is centered at month no 13.5. Averaging those two moving averages gives a centered moving average that is centered exactly on month no 13. So, to estimate the deseasonalized sales revenue for each month, copy the formula =AVERAGE(F8:F9) down from cell G8.
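To check the centering logic outside the sheet, the two averages can be sketched in Python. This is a minimal illustration, not the workbook itself: the function names and the revenue numbers are made up, and months are numbered from 1 as in column A of the example.

```python
from statistics import mean


def moving_average(values, month):
    """12-month average around `month` (1-based): the six prior months,
    the month itself and the five following months."""
    if month < 7 or month + 5 > len(values):
        raise ValueError("need six prior and five following months")
    return mean(values[month - 7 : month + 5])


def centered_moving_average(values, month):
    """Average of the moving averages for `month` and `month + 1`,
    which is centered on `month` itself."""
    return mean([moving_average(values, month),
                 moving_average(values, month + 1)])


# Made-up revenue for months 1..24 (not the workbook data):
# a linear trend plus a bump every December.
revenue = [10 + 0.5 * m + (4 if m % 12 == 0 else 0) for m in range(1, 25)]

# Months 7..18 have a full window on both sides.
for month in range(7, 19):
    print(month, round(centered_moving_average(revenue, month), 2))
```

A quick sanity check of the 12.5 and 13.5 argument: if the "revenue" is just the month numbers 1 to 24, the moving average for month 13 comes out as 12.5 and the centered moving average as 13.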
Defining the Trend Line to Centered Moving Averages
We’ll now use the centered moving averages to define a trend line that can be used to estimate future sales revenue. We’ll need to find an intercept and a slope to do this.
Luckily there are two functions that will do exactly this for us. In cell L3 put =SLOPE(G8:G20,A8:A20) to find the slope of the trend line and in cell L4 write =INTERCEPT(G8:G20,A8:A20) to find the intercept of the trend line.
Now copy the formula =A26*L$3+L$4 from G26 to G37. This will give you the estimated revenue (without seasonality) for the future months.
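SLOPE and INTERCEPT return the coefficients of an ordinary least-squares line through the points. The following Python sketch computes the same two numbers by hand and extrapolates the line to months 25 to 36. The helper name fit_trend and the centered moving averages are invented for illustration.

```python
def fit_trend(months, values):
    """Least-squares slope and intercept of values against months."""
    n = len(months)
    mean_x = sum(months) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(months, values))
    sxx = sum((x - mean_x) ** 2 for x in months)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up centered moving averages for months 7..19 (rows 8..20 in the sheet).
months = list(range(7, 20))
cma = [12.0 + 0.4 * m for m in months]

slope, intercept = fit_trend(months, cma)

# Trend line estimate without seasonality for the next 12 months,
# the same calculation as =A26*L$3+L$4.
future_trend = {m: slope * m + intercept for m in range(25, 37)}
print(round(slope, 3), round(intercept, 3))
```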
Calculating the Seasonal Indexes
Start by calculating for each past month Sales / Centered Moving Average. So simply put =D8/G8 into H8 and copy it down to H20.
E.g. for July you’ll get 0.9 (2017) and 1.02 (2018) respectively. This means that in July 2017 sales were at 90% of an average month and in July 2018 at 102% of an average month. Averaging those two numbers gives you the seasonal index for July, which is 96%. So July usually generates 96% of the sales an average month would generate. To calculate the seasonal index estimates for all months we can work with the AVERAGEIF formula, which only averages values that fulfill a certain criterion.
Put the numbers 1 to 12 in the cells K7 to K18. These numbers represent the individual months. The formula =AVERAGEIF(B:B,K7,H:H) in L7 will average all sales-to-centered-moving-average ratios for January. Copy that formula down to L18 to do the same for the remaining months.
We have to normalize the seasonal indices so that they average exactly to 1. This is actually quite easy: put =L7/AVERAGE(L$7:L$18) into M7 and copy it down to M18.
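The AVERAGEIF step and the normalization can be sketched in Python as follows. This is only an illustration with assumed names. Of the ratios below, only the two July values (0.90 and 1.02) come from the text; all other ratios are made up.

```python
from collections import defaultdict


def raw_seasonal_indices(months_of_year, ratios):
    """Average the sales / centered-moving-average ratios per calendar month."""
    grouped = defaultdict(list)
    for month_of_year, ratio in zip(months_of_year, ratios):
        grouped[month_of_year].append(ratio)
    return {moy: sum(vals) / len(vals) for moy, vals in grouped.items()}


def normalize(indices):
    """Rescale the indices so that they average exactly to 1."""
    overall = sum(indices.values()) / len(indices)
    return {moy: idx / overall for moy, idx in indices.items()}


# Ratios for months 7..19, i.e. July of year one to July of year two.
months_of_year = [7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7]
ratios = [0.90, 0.95, 1.00, 1.05, 1.20, 1.30,
          0.85, 0.90, 0.95, 1.00, 1.00, 0.98, 1.02]

raw = raw_seasonal_indices(months_of_year, ratios)
indices = normalize(raw)

print(round(raw[7], 2))  # 0.96, the July index from the text
```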
Forecasting Future Months
In order to forecast the revenue for future months you have to multiply the trend line estimate for each month’s revenue with the appropriate seasonal index. Copy the formula =VLOOKUP(B26,K$6:M$18,3,FALSE)*G26 from I26 to I37 to predict revenue for the next 12 months; the final FALSE makes VLOOKUP match the month number exactly.
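The final multiplication can be sketched in Python as well. The slope, intercept and seasonal indices below are hypothetical placeholders rather than values from the workbook. The sketch also assumes that month no 1 is a January, as in the example data where month no 13 is January-18.

```python
def forecast_month(month_no, slope, intercept, seasonal_index):
    """Trend line estimate for `month_no` multiplied by its seasonal index."""
    month_of_year = (month_no - 1) % 12 + 1  # 25 -> 1 (January), 31 -> 7 (July)
    trend = slope * month_no + intercept
    return trend * seasonal_index[month_of_year]


# Hypothetical inputs, not taken from the workbook.
slope, intercept = 0.5, 10.0
seasonal_index = {m: 1.0 for m in range(1, 13)}
seasonal_index[7] = 0.96   # a weaker July
seasonal_index[12] = 1.04  # a stronger December, so the indices still average 1

# Forecast for months 25..36, i.e. the next 12 months.
forecast = {m: forecast_month(m, slope, intercept, seasonal_index)
            for m in range(25, 37)}

for month_no, value in forecast.items():
    print(month_no, round(value, 2))
```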
That’s it, you are done!
The model can also be adjusted to more recent trends if you believe the trend of the series has changed significantly. In that case you don’t have to use all centered moving averages for calculating the slope, but can use only the more recent months (e.g. the last half year in our case), so that the slope is based on months closer to the current date.
As usual, the disclaimer: this model won’t predict the future with 100% accuracy, as it is based on historical data. The model will be more accurate the more past data you have and the less volatile your time series is.
Nevertheless, the above is an accurate and very easy-to-follow method for forecasting sales and other metrics.