Google Analytics Checkup With R And Management API

March 8, 2018

Using the combined power of R and the Google Analytics Management API, we can efficiently and programmatically check that our accounts are functioning properly. In the past, we’ve talked about using R (a statistical programming language) to help you download and analyze your Google Analytics data. This tool can also help you monitor the quality of your data and audit some of your Google Analytics settings.

A Little Bit of Setup

If you haven’t used the Google Analytics API with R before, there are a few extra steps including downloading R and RStudio as well as enabling the Google Analytics API.

To pull configuration data, we are going to use Mark Edmondson’s googleAnalyticsR package. You will need to install this package through the Packages tab in RStudio or by running install.packages("googleAnalyticsR") in the console.

In order to use the googleAnalyticsR package, you will need to create your own Google Developer Project API Key. You can find a detailed video with instructions on how to do this on the documentation page for googleAnalyticsR.

Export Your Configurations!

googleAnalyticsR provides a number of functions that allow you to download configuration information.

Listed below are a few common “GA checkup” to-do items and which functions you can use to efficiently complete those tasks. For a complete list of functions, check out the googleAnalyticsR help.

Account To-dos

  • Document all of your accounts, properties, and views. With a little help from the google_analytics_4 function, you can see which properties are still receiving data and which could be renamed as Historical properties or views.
  • Create a list of account, property, and view ids for use in other reporting automation tasks.
  • Check that your time zones are set consistently across all views.
  • Check that bots are filtered out of all appropriate views.
  • Check that site search is consistently and correctly set up on all views.

Use the ga_account_list, ga_webproperty_list, and ga_view_list functions to download a list of all account, property, and view names and ids. You will also see some meta-data such as creation and modified dates as well as various other configuration information such as time zones, excluding bots, site search settings, default page, exclude query parameters, and industry vertical.
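As a rough sketch of the time zone and bot-filtering checks, the snippet below works on a small mock data frame rather than a live API call. The column names (id, name, timezone, botFilteringEnabled) are assumptions modeled on the Management API view fields, so adjust them to match what ga_view_list actually returns for your account.

```r
# Mock data standing in for the output of ga_view_list().
# Column names are assumptions based on the Management API view fields.
views <- data.frame(
  id = c("101", "102", "103"),
  name = c("Main", "Test", "Raw"),
  timezone = c("America/New_York", "America/New_York", "America/Chicago"),
  botFilteringEnabled = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Treat the most common time zone as the expected one and flag the rest
expected_tz <- names(which.max(table(views$timezone)))
tz_mismatch <- views[views$timezone != expected_tz, c("id", "name", "timezone")]

# Views where bot filtering is switched off
bots_unfiltered <- views[!views$botFilteringEnabled, c("id", "name")]

print(tz_mismatch)
print(bots_unfiltered)
```

Using the most common time zone as the baseline is only one possible convention; you may prefer to hard-code the time zone your organization reports in.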

Filter To-dos

  • Identify all of the filters you have in place. Ensure that all best practice filters are correctly set up and are being applied appropriately across all views in your account.
  • Identify any duplicate or unused filters that could be cleaned up or deleted.
  • Double check that your internal traffic filters are up to date.

Use the ga_filter_list and ga_filter_view_list functions to view a list of the filters you have created and applied in your Google Analytics account. The ga_filter_list function returns all of the filters in your account along with configuration information, such as the filter type, the field you are filtering on, and the filter criteria being used. You can also see some meta-data such as when the filter was created and last updated. The ga_filter_view_list function shows you which views these filters are applied to.
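One way to spot unused and duplicate filters is to compare the two lists. The sketch below uses mock data frames in place of the ga_filter_list and ga_filter_view_list results; the column names (id, name, viewId, filterId) are hypothetical and will likely need renaming to match your real output.

```r
# Mock data: `filters` stands in for ga_filter_list() output and
# `filter_links` for ga_filter_view_list() output (column names are assumed).
filters <- data.frame(
  id = c("f1", "f2", "f3", "f4"),
  name = c("Exclude office IP", "Lowercase hostname",
           "Exclude office IP", "Old agency IP"),
  stringsAsFactors = FALSE
)
filter_links <- data.frame(
  viewId = c("101", "102", "101"),
  filterId = c("f1", "f1", "f2"),
  stringsAsFactors = FALSE
)

# Filters that exist in the account but are not applied to any view
unused <- filters[!filters$id %in% filter_links$filterId, ]

# Filter names that appear more than once (candidates for clean-up)
duplicated_names <- unique(filters$name[duplicated(filters$name)])

print(unused)
print(duplicated_names)
```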

Goals To-dos

  • Check that your goals are set up consistently across all views.
  • Check that your destination and event-based goals are up to date and still functioning correctly.
  • Document your goals and what they mean for business partners and colleagues.
  • Check that all of your main business objectives are being measured by goals.

Use the ga_goal_list function to quickly view all of the goals set up in your view. You can see basic meta-data like name, goal number, goal type, and creation and last modified dates. In addition, you can view info about how your goals are configured – things like which URLs and events you are matching on.
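To check goal consistency across views, it can help to lay the goals out as a grid with one row per goal slot and one column per view. This is a minimal sketch on mock data; the viewId, goalId, and name columns are assumptions about how you have combined the ga_goal_list results for several views.

```r
# Mock data: goals pulled from three views and stacked together.
# Column names are assumptions, not guaranteed API output.
goals <- data.frame(
  viewId = c("101", "101", "102", "102", "103"),
  goalId = c("1", "2", "1", "2", "1"),
  name = c("Purchase", "Newsletter signup",
           "Purchase", "Newsletter Signup", "Purchase"),
  stringsAsFactors = FALSE
)

# One row per goal slot, one column per view; missing goals show up as NA
goal_grid <- tapply(goals$name, list(goals$goalId, goals$viewId),
                    FUN = function(x) x[1])

# A slot is inconsistent if it is missing in some view or named differently
inconsistent <- apply(goal_grid, 1,
                      function(x) anyNA(x) || length(unique(x)) > 1)

print(goal_grid)
print(inconsistent)
```

Here goal slot 2 is flagged because its name differs in capitalization between two views and it is missing from the third.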

Custom Dimension and Custom Metrics To-dos

  • Check that the scope is appropriate for each custom dimension and metric.
  • Document your custom dimensions and metrics for business partners and colleagues.
  • With a little help from the Core Reporting API (see the google_analytics_4 function), check that the right data is flowing into all of your custom dimensions and metrics.

Use the ga_custom_vars_list function to download a list of all of your custom dimensions and metrics. You can see some additional information as well, like the index, scope, creation date, and last modified date.
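A scope check could be sketched as a comparison against your own documentation. Below, custom_dims is mock data standing in for the ga_custom_vars_list result (the index, name, and scope columns are assumed), and expected is a hypothetical lookup table you would maintain yourself.

```r
# Mock data standing in for ga_custom_vars_list() output (columns assumed)
custom_dims <- data.frame(
  index = 1:3,
  name = c("User Type", "Author", "Page Category"),
  scope = c("USER", "HIT", "SESSION"),
  stringsAsFactors = FALSE
)

# Hypothetical documentation table listing the scope each dimension should have
expected <- data.frame(
  index = 1:3,
  expected_scope = c("USER", "HIT", "HIT"),
  stringsAsFactors = FALSE
)

# Join the two tables on index and keep the rows where the scopes disagree
check <- merge(custom_dims, expected, by = "index")
scope_problems <- check[check$scope != check$expected_scope, ]

print(scope_problems)
```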

User Access To-do

  • Monitor who has access to your Google Analytics data and remove access when people leave the company or no longer use Google Analytics. Be especially careful with who has Edit access!

Use the ga_users_list function to see who has access to your Google Analytics accounts.

Tidy Your Data!

When you pull this data through the Google Analytics API, you might notice that the data is not always the nice, clean table of tidy data that we like to work with in R. Unfortunately, this data sometimes needs a little extra work before we can easily evaluate our to-dos or export to csv (with a write.csv command).

Here are a few tips for working with this data:

Why Do I See a List and Not a Data Frame?

These functions often return a list with some extra (often uninteresting) information from the API. Usually, we just want to focus on the items part of that list, so adding $items to the end of your function call can often surface the data you are interested in. Here’s an example with ga_goal_list.

ga_goal_list(account_id, property_id, view_id)$items
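To make the shape of such a response concrete, here is a mock list built by hand. The kind, totalResults, and items elements mirror the general layout of a Management API response, but the values are invented for illustration.

```r
# Mock response shaped like a Management API list result (values invented)
response <- list(
  kind = "analytics#goals",
  totalResults = 2L,
  items = data.frame(
    id = c("1", "2"),
    name = c("Purchase", "Newsletter signup"),
    type = c("URL_DESTINATION", "EVENT"),
    stringsAsFactors = FALSE
  )
)

# Look at the top level only: a list with a data frame tucked inside
str(response, max.level = 1)

# Pull out just the data frame of goals
goals <- response$items
print(goals)
```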

Why Can’t I Write This Data Frame to a CSV File?

Unfortunately, some of these datasets can get a little complex. For example, information about goal funnels is buried deep in the urlDestinationDetails column. Sometimes, the way these complicated columns are structured makes it impossible to force them into a nice, clean csv export. It can also make it hard to quickly interpret your configuration data. To clean this up, I recommend taking a look at the purrr package and especially the map family of functions.
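As a small illustration of the map family, the sketch below collapses a list column into plain text so that write.csv can handle it. The goals data frame and its funnel_steps column are made up for the example; the real goal data is nested differently and may need more than one pass.

```r
library(purrr)

# Mock goals table with a list column (structure invented for illustration)
goals <- data.frame(
  id = c("1", "2"),
  name = c("Purchase", "Signup"),
  stringsAsFactors = FALSE
)
goals$funnel_steps <- list(c("/cart", "/checkout"), character(0))

# write.csv() cannot write list columns, so collapse each element
# into a single string first
goals$funnel_steps <- map_chr(goals$funnel_steps, paste, collapse = " > ")

# Write to a temporary file; swap in your own path as needed
csv_path <- tempfile(fileext = ".csv")
write.csv(goals, csv_path, row.names = FALSE)
```

map_chr is used here rather than map because it is guaranteed to return a character vector, which is what a csv column needs.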

Why Is the Tidyverse Angry at Me?

When using tidyverse functions, you may run into some errors like:

Error: Columns `parentLink`, `urlDestinationDetails`, `eventDetails` must be 1d atomic vectors or lists

To understand this error, take a look at the structure of your data frame. (Try the str function.) You will probably see a few columns that have the type of ‘data.frame’. This error is the tidyverse telling us to not be so lazy and clean up these columns. Below is a function I’ve been using to clean up these pesky data frame columns.

library(dplyr)
library(purrr)

# columns that are formatted as a data frame cause trouble for tibbles, so we clean that up here
clean_df_cols <- function(data) {

  # find the problematic columns
  is_df_col <- map_lgl(data, is.data.frame)

  # nothing to clean up if there are no data frame columns
  if (!any(is_df_col)) {
    return(data)
  }

  # break out the data frame columns into their own columns,
  # adding the parent column name as a prefix to keep the naming conventions
  broken_out_cols <- data[is_df_col] %>%
    imap(function(df_col, col_name) {
      # drop any data frames nested one level deeper
      df_col <- df_col[!map_lgl(df_col, is.data.frame)]
      names(df_col) <- sprintf("%s_%s", col_name, names(df_col))
      df_col
    }) %>%
    unname() %>%
    bind_cols()

  # remove the data frame columns and add back in the broken out columns
  clean_data <- data[!is_df_col] %>%
    bind_cols(broken_out_cols)

  return(clean_data)
}
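If you would rather confirm which columns are the culprits before cleaning anything, a quick check like the one below can help. The goals data frame is mock data with one data frame column attached by hand; with real API output you would run the same vapply call on whatever the function returned.

```r
# Mock goals table with a data frame stored inside a column
goals <- data.frame(
  id = c("1", "2"),
  name = c("Purchase", "Video play"),
  stringsAsFactors = FALSE
)
goals$eventDetails <- data.frame(useEventValue = c(NA, TRUE))

# TRUE for every column that is itself a data frame
is_df_col <- vapply(goals, is.data.frame, logical(1))

# Names of the columns that will need to be broken out
print(names(goals)[is_df_col])
```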

Why Don’t My Datasets Look the Same Across Different Views or Properties?

This happens when your views are configured differently. For example, if you don’t have any event-based goals in a certain view, you will not see the eventDetails column.
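When you stack results from views with different columns, dplyr’s bind_rows fills the missing columns with NA instead of failing. The two data frames below are mock, already-flattened goal tables with invented column names.

```r
library(dplyr)

# Mock, already-flattened goal tables from two views (columns invented)
goals_view_a <- data.frame(
  id = "1", name = "Purchase",
  urlDestinationDetails_url = "/thanks",
  stringsAsFactors = FALSE
)
goals_view_b <- data.frame(
  id = "1", name = "Video play",
  eventDetails_useEventValue = TRUE,
  stringsAsFactors = FALSE
)

# Columns missing from one view are filled with NA;
# .id records which view each row came from
all_goals <- bind_rows(
  list(view_a = goals_view_a, view_b = goals_view_b),
  .id = "view"
)

print(all_goals)
```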

This Just Isn’t Working, and I Need Some Additional Help.

There is a link to an R Markdown file in the next section with some code that might help you clean up and use the data from these functions. Or, reach out to the Analytics and Insights team for more help.

Get Fancy!

Having programmatic access to this data opens up a lot of possibilities for creative ways to monitor your Google Analytics data. As an example of what you can do, I’ve created an RMarkdown file that helps you monitor the status of your goals and custom dimensions and can print status information out in a human-readable Word doc.

To create the document, I would recommend first creating an R project then downloading the RMarkdown file and moving it into your project folder.

Then, check that you have all of the appropriate packages installed (see the packages “chunk,” or section of code). Update the authentication chunk with the client id and secret for your API project.

Then run both the packages and authentication sections of code using the green play button. You should be prompted to authenticate in a browser window, and then you should see a “.httr-oauth” file appear in your project folder. This file stores your authentication settings.

Update the params section at the top of the page with your account, property, and view id.

Before sending the data to Word, you may want to run each of the chunks of R code to see what they are doing. This will also create a few datasets (views, goals, and cd) in your environment that you may decide to export directly to csv, Excel, or Google Sheets for further analysis.

Then, finally, run this report and export the results to Word using a process called Knitting. You can do this by clicking on the Knit button above your code. If you click the Knit drop-down, you can also choose to Knit with Parameters, which will prompt an interactive window that allows you to easily change the account, property, and view ids as well as the date ranges being used to collect data about your goals and custom dimensions.

Once you click the Knit button, you should see an RMarkdown tab open beside the console where the progress of your report will be logged. Once the report is complete, a Word file will open with your Check-Up results!

A Note to Enterprise Organizations

Maintaining a large number of Google Analytics accounts can be a time-consuming and difficult job. Fortunately, the googleAnalyticsR package can really help us out here. You can use the functions described above to print out a list of your GA configurations across all of your accounts into a csv file or other reporting tool. Here are a few tips for doing this:

  • Use the ga_account_list function to pull a list of your account structure and the appropriate ids. You will need those ids to run the other functions. You may want to use dplyr’s filter function to narrow that list down to just the accounts that you are interested in.
  • Use purrr’s map function to apply the same function to a number of different accounts in one fell swoop. You may want to follow this up with a handy
    reduce(bind_rows)

    to consolidate down to a single data frame. Note, you will probably want to apply the map function to a “helper” function that cleans up your data, rather than applying it to the googleAnalyticsR function directly.
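The helper-function pattern from the tips above might look something like this. Note that fetch_filters is a hypothetical stand-in that returns mock data; in practice you would replace its body with a real googleAnalyticsR call and whatever clean-up your data needs.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-in for a googleAnalyticsR call; returns mock data
fetch_filters <- function(account_id) {
  mock <- list(
    "1001" = data.frame(
      id = c("f1", "f2"),
      name = c("Exclude office IP", "Lowercase hostname"),
      stringsAsFactors = FALSE
    ),
    "1002" = data.frame(
      id = "f9", name = "Include hostname",
      stringsAsFactors = FALSE
    )
  )
  mock[[account_id]]
}

# Helper that fetches the data and tags each row with its account id
get_filters_clean <- function(account_id) {
  fetch_filters(account_id) %>%
    mutate(accountId = account_id)
}

account_ids <- c("1001", "1002")

# Apply the helper to every account, then stack into one data frame
all_filters <- account_ids %>%
  map(get_filters_clean) %>%
  reduce(bind_rows)

print(all_filters)
```

Tagging each row with its account id inside the helper keeps the combined table traceable once everything is stacked together.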


I hope that with the use of this checklist you are able to use the power of R to responsibly and programmatically work with your Google Analytics data!