Navigating the Google Ad Data Transfer for BigQuery
So, you got the BigQuery Google Ads data transfer enabled, and you’re ready to do some advanced analysis. You pop into your BigQuery project and find more than 100 tables and views. Then you realize none of them allow you to view a report as simple as how much you spent on a given keyword. Suddenly, you realize this is more than you bargained for. Sound familiar?
It’s ok, we’re here to help!
With a little bit of background and a few examples, hopefully, you’ll see why all these tables give us a lot of versatility and freedom in how we report out of Google Ads.
Before we get started, this guide is not a description of every table and view created for Google Ads. Nor is it a Google Ads guide. Instead, it’s designed to help you understand the structure of the BigQuery data transfer for Google Ads which will enable you to complete some basic reporting tasks. Let’s get started with a discussion of structure.
What is the Google Ads Data Transfer Service for BigQuery?
Working in the same marketing tech stack has its benefits. There are several ways to regularly schedule data imports to BigQuery, but most require some additional set up or code. However, for many of its own products (Ads, YouTube, Display and Video 360, etc.), Google has created a simple solution to set up regular imports with just a couple clicks. For the convenience of fresh reliable Ads data in BigQuery, Google charges a monthly fee for each External Customer ID imported with the transfer. It’s just five easy steps to enable the transfer.
Data Transfer Structure
Before we dive into queries and joining tables, it’s important to understand the overall table and view structure.
Views vs. Tables
One of the first things you’ll notice is repeated names in the format of a view and then a similarly-named table. The table always has the same name but with
p_ at the beginning. All Google Ads data is supplied in both a view and again as a partitioned table (
p_ version). The options ultimately give us more flexibility in how we query data over time.
From an efficiency and cost perspective, it makes sense to use the partitioned tables, because we can select and process only the dates that are relevant for the reporting task. The views require us to process all dates since the transfer was enabled, which adds up pretty quickly.
Table and View Naming Convention
Look a little closer and you will see that there is a formula for many of the tables. For example, the terms “Stats,” “BasicStats,” and “ConversionStats” show up repeatedly. The basic structure for all tables is:
Decoded, that translates to:
p_: Present if it’s a partitioned table. Otherwise, it’s a view.
AttributeCategory: One of many values signifying the primary attribute the report represents. These are things like campaigns, ad groups, ads, and keywords.
[StatDetail]: You’ll see this additional modifier on some tables. Each attribute category can be queried with different stats. Some stats tables have basic info that’s commonly used. Others are more detailed. We’ll go over the options below.
_123456789: All tables and views end with the account ID or MCC ID.
The first challenge of dealing with Ads data lies in this structure. For each attribute, there are stats tables which contain metrics and separate tables that contain only information about your campaign attributions. For example, our keywords live in a different table than the metrics associated with the keyword. Most groups of tables follow the same basic format:
p_AttributeCategory_12345789: Table that contains rich information about campaigns like ad group names, keywords, ad text, and audience descriptions.
p_AttributeCategoryBasicStats_12345789: Basic stats give you high-level metrics segmented by some ID (e.g. ad group ID, ad ID, etc.) associated with the data.
p_AttributeCategoryStats_12345789: All the same metrics as BasicStats but with the addition of the computed metrics (like CTR, CPC) and a lot of additional segmentation. The table includes things like time-based segments and custom attribution model information.
p_AttributeCategoryConversionStats_12345789: Have multiple conversions? This is where we get conversion metrics segmented by each goal defined in the ads account.
p_AttributeCategoryCrossDeviceConversionStats_12345789: Two key additions are added to this report. Google’s “All Conversions” metric as well as the total number of conversions believed to be cross-device.
p_AttributeCategoryCrossDeviceStats_12345789: This is possibly the most interestingly named of all tables in the transfer. Here, you get the impression assisted data you might expect to see, but all video metrics and viewability metrics also appear.
p_AttributeCategoryNonClickStats_12345789: What we mean by non-click stats is non-click engagement – primarily video views, but also lightbox expansions and showcase shopping ad interactions.
To develop any human-readable report, we see that we will need to join a least one Stats table to one base Attributes table.
Each of the attributes tables includes the human-readable information about the attribute (the ad group names, ad text, keywords) as well as advanced attributes like bidding types and bidding rules and restrictions set at that level. The most important attribute table options include:
- Ad Groups
- Age Ranges
- Big Goal
- Criteria (all other targeting options keyword, audience, interest, etc.)
- Geography (Geo)
- Parental Status
While not all attributes have outlined stats tables, they all have some. So, if the initial question is, “Why are there so many tables and views in the Google Ads data transfer?” then the answer is:
( (attribute table * several stats tables) * number of attribute tables ) * 2 to include views = a lot of tables and views
Now that we’re starting to grasp the structure, let’s go over some common examples of joining this data.
Basic Keyword Report
In this example, we are simply joining some keyword information from
p_Keyword_000000000 with some basic statistics from, you guessed it,
But once again, it’s not quite that simple. You’ll notice we do a bit more than just grab some columns for a quick join. More specifically, this query gets attribution with various segments – slot, device, date, position, etc. So, we will expect to see CriteriaId (our join key) repeated here. We’re just adding in the text value for every occurrence of the ID.
WITH stats as (SELECT CampaignId, AdGroupId, CriterionID, AdNetworkType1, AveragePosition, Clicks, ConversionValue Conversions, Cost, Date, Device, Impressions, Slot FROM `your-project.googleads_ads.p_Keyword_[ACCOUNT_ID]` WHERE _PARTITIONTIME BETWEEN "2019-03-01" AND "2019-03-01"), keywords as ( SELECT CriterionId, MAX(Criteria) as Criteria FROM `your-project.googleads_ads.p_Keyword_[ACCOUNT_ID]` WHERE _PARTITIONTIME BETWEEN "2019-03-01" AND "2019-03-01" GROUP BY CriterionId) select * from stats left join (select * from keywords) USING(CriterionId)
For an accurate keyword report, we have to filter out a few things with this WHERE clause:
isNegative is FALSE
AND status = 'ENABLED'
This table also includes our negative and disabled keywords, which we don’t typically want for a basic keyword report. Let’s get rid of those.
We also have to make sure we group to a single CriteriaId before joining with the stats table. Intuitively, we think about this data in terms of keywords, but Ads thinks about it as a CriteriaId. All of our cost and bidding information rolls back up to this value.
Furthermore, consider an account with the same keyword in multiple campaigns? Maybe you have redundant geographic campaigns where you’re bidding differently or complex audiences bidding on the same term. In most cases, we want to roll up all impression, click, and cost data for each instance of the term. To do this, we group by and join on all IDs from the top (ExternalCustomerId – our Ads Customer ID) down to the most specific (in this case the CriterionId associated with the keyword).
Search Ad Text Report
The ad report includes all of the same nuances as the keyword report, but we introduce some additional complexities. Most importantly, the wide range of various ad types available in Google Ads. Even beyond the differences in search and display, we have several options with unique fields for search ad types (Extended Text Ads, Responsive Search ads, Call Only Ads, Expanded dynamic search ad, and legacy Text Ads). Each has its own columns associated with ad text.
In this example let’s create an ad text report for all of our most common search ad types:
WITH stats as (SELECT ExternalCustomerId, CampaignId, AdGroupId, CreativeId, AdNetworkType1, AdNetworkType2, AveragePosition, Clicks, ConversionValue Conversions, Cost, Date, Device, Impressions, Slot FROM `your-project.googleads_ads.p_Keyword_[ACCOUNT_ID]` WHERE _PARTITIONTIME BETWEEN "2019-03-01" AND "2019-03-01"), ads AS (SELECT AdType, ExternalCustomerId, CampaignId, AdGroupId, CreativeId, CASE WHEN AdType = 'EXPANDED_TEXT_AD' THEN CONCAT(headlinepart1, ' ',headlinepart2) WHEN AdType = 'TEXT_AD' THEN headline ELSE null END as textAdHeadline, CASE WHEN AdType = 'EXPANDED_TEXT_AD' THEN description WHEN AdType = 'EXPANDED_DYNAMIC_SEARCH_AD' THEN description WHEN AdType = 'TEXT_AD' THEN CONCAT(headlinepart1, ' ',headlinepart2) ELSE null END as textAdDescription, displayUrl, REGEXP_REPLACE(CreativeFinalURLs, r'\[?"\]?', '') as destinationUrl FROM (SELECT AdType, ExternalCustomerId, CampaignId, AdGroupId, CreativeId, headline, headlinePart1, headlinePart2, description, description1, description2, displayUrl, CreativeFinalURLs FROM `your-project.googleads_ads.p_Keyword_[ACCOUNT_ID]` WHERE _PARTITIONTIME BETWEEN "2019-03-01" AND "2019-03-01" AND status = "ENABLED" AND REGEXP_CONTAINS(AdType, r'^(EXPANDED_)?(TEXT|DYNAMIC_SEARCH)_AD$') GROUP BY AdType, ExternalCustomerId, CampaignId, AdGroupId, CreativeId, headline, headlinePart1, headlinePart2, description, description1, description2, CreativeFinalURLs, displayUrl)) select * from stats left join (select * from ads) USING(ExternalCustomerId, CampaignId, AdGroupId, CreativeId)
Our ads data table has an updated WHERE clause here. We still want enabled ads, but now we need to grab our common search ad types:
AND REGEXP_CONTAINS(AdType, r'^(EXPANDED_)?(TEXT|DYNAMIC_SEARCH)_AD$')
This regex will give us the ad types
TEXT_AD, EXPANDED_TEXT_AD, DYNAMIC_SEARCH_AD, and
EXPANDED_DYNAMIC_SEARCH_AD. Other ad formats are different enough that they won’t lend themselves to this kind of text analysis.
We also included a couple of CASE WHEN Statements to pull ad text into broad headline and description fields regardless of ad type. There are several ways to do this depending on your reporting, but the key is remembering that different ad types store text differently:
WHEN AdType = 'EXPANDED_TEXT_AD' THEN CONCAT(headlinepart1, ' ',headlinepart2)
WHEN AdType = 'TEXT_AD' THEN headline
END as textAdHeadline,
WHEN AdType = 'EXPANDED_TEXT_AD' THEN description
WHEN AdType = 'EXPANDED_DYNAMIC_SEARCH_AD' THEN description
WHEN AdType = 'TEXT_AD' THEN CONCAT(headlinepart1, ' ',headlinepart2)
END as textAdDescription
The last big change here is the treatment of the destination URL. Because there can be multiple destination URLs per ad, BigQuery puts them in an array (in brackets). If you are like most users, who have one URL per ad, you can use this bit of regex to replace the brackets:
REGEXP_REPLACE(CreativeFinalURLs, r'\[?"\]?', '') as destinationURL
The Power of Google Ads Data Transfer for BigQuery
These two queries only scratch the surface of the power of the Google Ads data transfer for BigQuery. The foundations outlined here are the recipe for high-level reporting, but hopefully, as you use these tables, you find more opportunities for joining and expanding reporting.