# Google Sheets formulas for analyzing marketing data

This is the second lesson from the Google Sheets for Marketers mini course. In this lesson I’ll walk you through the most important Google Sheets formulas for analyzing marketing data as well as some useful functions such as conditional formatting and filters. You’ll even build a small tool for comparing performance metrics for different advertising channels.

#### What you’ll learn in this lesson:

Conditional Formatting, Filters, Data Validation, AVERAGE, MEDIAN, MODE, MAX, MIN, COUNTIF[S], AVERAGEIF[S], SUMIF[S], TRANSPOSE, VLOOKUP, INDEX, MATCH, calculating ROI.

## Preparing the data

This is the Google sheet with the practice data.

In our last lesson we generated some first insights from the ad channel data set with the help of pivot tables. One of the insights was that Facebook ads were not performing as well, which is why we are going to focus on those for this lesson to analyze them further.

First make a copy of the worksheet (it’s always a good idea to keep a backup of the original data when manipulating it). Next select all data in the new sheet and in the menu click on DataCreate a filter. You can now click on the three bars next to Channel in cell A1. De-select Twitter Ads and Google Ads by clicking on them respectively. This will allow us to focus on the Facebook Ads data only without distraction from the data from the other advertising channels.

As you learned in the first lesson revenue and profitability of Facebook Ads are going down. Since costs generally stayed the same you are guessing it might have something to with the Average Order Value. In order to be sure we are going to apply a coloured heatmap to the values to see if there is a downward trend. Select all cells with Average Order Value data and click on Format Conditional Formatting in the menu.

While you could colour the cells depending on if the are not empty or contain a certain value we want to colour them on a scale from min to max value. As such choose Colour Scale on the right site. Under Preview choose the scale from green to yellow to red, which will colour the lowest values green, mid range values yellow and high values red (you could choose different colours here, but we are going to leave them as they are for now).

As you can clearly see, Average Order Values dropped significantly at the end of 2017 and as such we got our first insight here (in a real life scenario I would e.g. talk to one of the Social Media PPC managers now to see what might have happened at the the end of the year., if I don’t know it myself).

## Calculating the typical value of the data set

When working with marketing data you will often have to deal with large data sets. It’s often difficult to make sense of those data sets due to the sheer amount of metrics. As such it’s helpful to summarize the data by getting the typical values, min/max values and others to get a feel for the data.

There are three different ways you usually use to summarize the typical value for a data set.

The mean or average is simply the sum of the numbers in in the data set divided by the number of values in the data set. Write =AVERAGE(G41:G79) into N41 to get the average number of Conversions per month.

The median is the 50th percentile of the data set. This means one half of the data is below the median and the other half is above the median. Write =MEDIAN(G41:G79) into N42 to get the median for Conversions.

Last is the mode of the data, which is simply the most frequently occurring value in the data set. Write =MODE(G41:G79) into M43 to get the median for Conversions.

As such on average there were 53.2 conversions via Facebook Ads each month, around one-half the time there were fewer than 52 conversions and the most frequently occurring number of conversions per month were 49.

None of the three is the best per se. In most cases you would take either the mean or the median. If you have extreme values they tend to distort the mean and the median is a better choice as a summary of a typical data value. However the median might throw out important important in other situation. So as a general rule of thumb use the mean, if no extreme values are present and the median otherwise. In our case no extreme values are present and we can focus on the mean

### Finding the smallest and largest values

This is actually an easy one. Simply type in =MAX(H41:H79) in N46 and =MIN(H41:H79) in N47 to get the largest and smallest values respectively.

## Calculating ROI

First we’ll calculate ROI for each month. As Return on investment = Revenue / Investment you can put =((H41-I41-J41)/(I42+J42)) in K41 and drag it all the way down to K79. You now have the ROI for each month.

In order to get the total ROI for all month write =(SUM(H41:H79)-SUM(I41:J79))/SUM(I41:J79) into Cell N48. This formula summarizes the revenue first and then subtracts the sum of all costs.

In order to count the number of times we had a positive ROI you can use the COUNTIF formula. This formula will count values depending on a certain criteria. In our case the criteria will be that the ROI is larger than 0. So write =COUNTIF(K41:K79,”>0″) into N48.

## Putting the metrics into context

Above we calculated several performance metrics for Facebook ads. However they are quite useless, if we don’t put them into context (a general rule for marketing analysis: Never just dump metrics out there, always put them into context and make them actionable). In our case the context would be to compare the Facebook ad metrics with Twitter and Google ad metrics. With the exercises we did above we have the tools to do exactly that.

First select the Google ads and Twitter ads from the filter in A1 as well so you can see the data for all advertising channels. Next calculate ROI for those two channel by dragging the ROI formula into the empty cells in column K.

As preparation write the following headlines into the corresponding cells:

Average CPC: in cell N2

Max: in cell O2

Number of positive ROI months: in cell P2

Total ROI: in cell Q2

We’ll calculate the average CPC for each channel first. You can use the AVERAGEIF formula for this. The AVERAGEIF formula checks if a cell in the criterion range matches a certain criteria and will only average the values of the row with matching criteria. E.g. put =AVERAGEIF(A2:A118,M3,F2:F118) into N3.  Sheets now checks if the cell in specified range A2 to A118 matches the value of M3 (Facebook Ads) and will only calculate the average of the values in range F2:F118 of the rows with matching criteria.

Put \$ in front of the range row specifiers and drag the formula into N4 and N5 to do the same for the other channels (the \$ will keep the range the same).

Well now do something similar to find each maximum Average Order Value. As such put =MAXIFS(C\$2:C\$118,A\$2:A\$118,M3) into O3 and drag it into O4 and O5. Keep in mind that the order inside the formula is different. Unfortunately that’s the case for most *IF formulas. So alway pay attention to the hints in the upper left corner, which give specific instructions here.

Next we’ll use the COUNTIFS formula to calculate the number of positive ROI months. COUNTIFS allows several criteria (as opposed to COUNTIF). Put =COUNTIFS(A\$2:A\$118,M3,K\$2:K\$118,”>0″) into P3. This formula will only count rows which match criteria M3 (Facebook Ads) in column A as well as has values >0 in column K.

Drag the formula down to get the counts for the other channels.

Last we’ll calculate total ROI. For this we’ll use the SUMIF, which works similar to the AVERAGEIF formula. As such it will only sum the values in a range if a certain criteria matches the criteria range in the same row. It is a pretty long formula to calculate the ROIs. However you are basically summing up Revenue and subtracting the sums of Advertising Costs and Other Costs first and then dividing that by the sums of Advertising Costs and Other Costs. Put

=(SUMIF(A\$2:A\$118,M3,H\$2:H\$118)-SUMIF(A\$2:A\$118,M3,I2:I\$118)-SUMIF(A\$2:A\$118,M3,J\$2:J\$118))/(SUMIF(A\$2:A\$118,M3,I\$2:I\$118)+SUMIF(A\$2:A\$118,M3,J\$2:J\$118))

into Q3 and drag it down to Q4 and Q5.

Your new table comparing performance metrics from the different advertising channels is technically done. However it is kind of hard to read. It would would be better to have the different channels as rows and the metric titles as columns. The can be easily done with TRANSPOSE, which will interchange rows and columns. Put =TRANSPOSE(M2:Q5) into M8 (you could also use the paste function of the same name instead, however that would mess up the formulas).

Even though the format is better now it is still hard to compare the performance metrics on first sight. Some colour coding would be nice… Luckily you already learned how to do heat maps in the beginning. Select cells N9 to P9 and click in the menu on Format Conditional Formatting. Choose the color scale on the right side with green to yellow to red. Do the same for cells N10 to P10, N11 to P11 as well N12 to P12. However for those three interchange green and red as we want green to indicate value where the respective channel is better than the other channels.

The final result (and insight) shows us that Twitter Ads are actually comparing quite well in all metrics compared to the others even though the Average Order Value is quite low. Facebook Ads on the other hands perform quite bad in all metrics compared to the other channels. This might indicate that you should shift some budget from Facebook Ads to Twitter Ads.

## Preparing the data for charts

The last part of this lesson will prepare the data for building some charts (and a simple reporting which you could use to send out to other stakeholders or clients). As such we will work in the sheet Solution – Charts.

There is actually another new sheet called Worksheet – Budgets/Costs, which contains the budgets and actual costs of several 2019 advertising channel.

You could just copy the relevant data from the former sheet to the later one. However in very long list it can be very toilsome to find relevant data. There is a smarter way called VLOOKUP, which will find relevant data for you based on a key.

Prepare your sheet by writing the following in the cells:

Budget 2018 in cell B1

Actual Cost in cell C1

Next write

=VLOOKUP(\$A2,’Worksheet – Budgets/Costs’!\$A\$2:\$C\$10,2,FALSE)

in Cell B2.

What this does is that VLOOKUP searches for Key A2 (Facebook Ads) in range A2 to C10 in Worksheet – Budgets/Costs and returns the cell of the 2nd column of the row where it finds the key. FALSE only says that the range is not ordered in any particular way. Past the formula in cell B4, B6, C2,C4 and C6 as well. Since we are looking for the actual costs in column C you have to replace the 2 in the formula of C2,C4 and C6 with a 3 to return a cell from the third column.

### Building a performance metric comparison tool

Two other useful formulas to find data are INDEX and MATCH. Those two combined are a powerful tool to find data in large data set. INDEX gets a value at a specified location in a range of cells based on the  numeric position. E.g. putting =INDEX(A1:C4,2,3) in any cell in the sheet Solution – Charts will get you the cell in the second row and third column of the range A1 to C4 (in this case that would be \$24,310).

MATCH will find the numeric position of an item in a list.  E.g. putting=MATCH(“Google Afs”,A2:A4,1) in any cell in the sheet Solution – Charts will get you the position of Google Ads in the list A2 to A4. The last 1 indicates that we are looking for an approximate match (which is why it ignores the typo) rather than an approximate match (in which case we would use  0).

We will use those two formulas two build a small dynamic performance metric comparison tool. First write

Total ROI: in cell A9

Copy this formula into B9 and paste it into C9 as well:

=INDEX(‘Solution – Functions’!\$N3:\$Q5,MATCH(B8,’Solution – Functions’!\$M3:\$M5,0),MATCH(\$A9,’Solution – Functions’!\$N2:\$Q2,0))

It is actually a simple index function, however row and column indicators are replaced by match functions. So MATCH(B8,’Solution – Functions’!\$M3:\$M5,0) looks for the value in B8 (=Twitter Ads) in range M3 to M5 of the Solution – Functions sheet and gives back its position (=3) while MATCH(\$A9,’Solution – Functions’!\$N2:\$Q2,0) looks for the value in A9 (=Total ROI:) in range N2 to Q2 of the Solution – Functions sheet and gives back that position (=4). The INDEX function takes the positions and uses them as row and column indicators for the specified range respectively.

The cool thing is now, that if you would e.g. change Twitter Ads in cell B8 to Facebook Ads it would update the value in C9 automatically!

However every proper tool has some dropdown menus. We can add those with data validation. Data validation tells Sheets that only certain values are allowed in a cell. Select cells B8 and C8 and right click on them. Choose Data validation… In the empty field next to List from range paste this: ‘Solution – Functions’!M3:M5. That is a list of the three advertising channels we are analyzing. Click on Save.

Do the same for cell A9 by right clicking on it, choosing Data validation… and pasting ‘Solution – Functions’!M9:M12 into the empty field. Save.

You can now use the dropdown menus to choose the comparison metric as well as the channels you want to compare. We prepared everything in this sheet for the next charts lesson. Based on the data we will create some charts, modify them to look better and I’ll show you how they can be updated dynamically to build some simple beautiful reports.