Supercharge your Adobe Analytics Classifications with Google Sheets and Automation

Classifications are one of the best features of Adobe Analytics. They allow us to enrich and translate tracked values by uploading classification files. One of the most common use cases is handling marketing campaign tracking codes, which can be translated from technical IDs into understandable details about the campaign. This can be automated to a great extent, which is what this article is about.

We are going to look at the architecture of our solution and plan our implementation. Right after that, we will start building our spreadsheet in Google Sheets and create an automatic upload to Adobe Analytics using the Python programming language. If you are just interested in the final script, you can find it on GitHub.

What we love and hate about Classifications

When you are using Adobe Analytics, chances are pretty high that you are already using some form of classifications. Rightfully so, because they are pretty awesome! For example, whenever I have to track how users interact with the different menus on a website, I like to put both the name of the menu and the clicked item in the same variable like this:

menu=Main Menu;link=About us

Then I would set up a classification rule with a regular expression to take that value and split it into separate classification fields:
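As a rough sketch (the pattern and variable names here are just an example, not a real rule set), the split works like this:

import re

# Example only: split the combined value into its two classification fields
value = "menu=Main Menu;link=About us"
match = re.match(r"menu=(.*);link=(.*)", value)
if match:
    menu_name, clicked_item = match.groups()  # "Main Menu", "About us"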

With this I can save on dimension usage, since I can put two (or even more!) values into just one dimension and split them afterwards. Also, if something goes wrong in the implementation, I can correct the classifications manually. That’s pretty awesome!

But this is not even the most common use case. That title is reserved for classifying marketing campaigns, where the tracked value might be something like “sea1”, but the report should tell us that this is a SEA code from the Summer Sale campaign. To achieve this we would set up a classification structure with fields like “Channel” or “Campaign” to translate those tracking codes into understandable values.

And then things tend to get complicated, because of the way Adobe Analytics requires us to handle classification data and the way big companies usually handle marketing campaigns. To upload classification data we need to create an upload file in a very specific format regarding file headers, separators, column order, and so on, like below:
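To give a rough idea (this is a simplified sketch; the real template Adobe generates also contains a number of “## SC” header lines), such a tab-separated file could look like this:

Key	Channel	Campaign
sea1	SEA	Summer Sale
sea2	SEA	Winter Sale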

This is already quite a challenge if you don’t handle technical files every day. But then those files need to be uploaded to Analytics, which can be done either in the frontend (if your file is rather small) or via FTP, which is yet another challenge regarding technical skills and rights management. And since we are dealing with plain files, we commonly run into issues where an agency messes up a file or doesn’t use the most recent version. Can’t we have this in a simpler way?

Google Sheets to the rescue

A way to tackle the problems outlined above is to use Google Sheets. It allows us to keep only one version of the classification file and collaborate with others on it. Thanks to the version history, we can see who changed the file and why, and revert it to a previous state. We can also allow only certain users to change certain fields, preventing accidental changes. And the best part is: Google Sheets has a comprehensive API, just like Adobe Analytics has for classifications. Let’s use both to 1UP our classification game!

But before we start, we should think about what we need to do to get data from Sheets to Analytics:

  1. Call the Google Sheets API to get the classification data we want to upload
  2. Create an import job for Adobe Analytics
  3. Add the data we received from Sheets to our import job
  4. Commit the import job

But while we are at it, why not build some nice extras into our classification file? For example, we could include a switch to avoid uploading incomplete or unapproved data. Also, why not have an indicator of when the file was last uploaded and how many rows were included? The file we are going to use for our little example looks like this:

You will notice that we have our approval toggle as a dropdown in cell C1. In production, we might limit who can set this field from “No” to “Yes” once the data has been approved and is actually ready to be uploaded. In C2 to C4 we will put some meta information about our uploads. The other notable area is A7 to C10, where I included some sample campaigns.
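For orientation, here is one possible layout that is consistent with the cell ranges used in the script below (the labels and sample values are purely illustrative):

     A                     B          C
1    Approved for upload?             No
2    Last checked                     2019-07-01 10:00:00
3    Last uploaded                    2019-07-01 10:00:00
4    Rows uploaded                    4
...
7    sea1                  SEA        Summer Sale
8    sea2                  SEA        Winter Sale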

Calling the Google Sheets API with Python

We are going to use the same script from Importing Organic Google Search data to Adobe Analytics with a single script as a basis here, since it already includes everything we need to call both Google’s and Adobe’s APIs. I’m assuming you have already set up your Adobe I/O and Google API projects as described in that article.

Let’s start coding! First, we need to import the modules we need:

import datetime
import requests
import sys
import jwt
import re
import httplib2

from apiclient.discovery import build
from apiclient.errors import HttpError
from oauth2client.client import flow_from_clientsecrets
from oauth2client.file import Storage
from oauth2client.tools import argparser, run_flow

Now we put our configuration in one big object like in the previous post:

config = {
"apiKey":"3ec174hj4k58be87e8fdivk49hkdfl396",
"technicalAccountId":"6CH54KFL43DM65FD2@techacct.adobe.com",
"orgId":"7JDB342JJJMOP78@AdobeOrg",
"secret":"dfjk3ik31-4bd2-4kf8-9df5-ff5j3834kb",
"metascopes":"ent_analytics_bulk_ingest_sdk",
"imsHost":"ims-na1.adobelogin.com",
"imsExchange":"https://ims-na1.adobelogin.com/ims/exchange/jwt",
"discoveryUrl":"https://analytics.adobe.io/discovery/me",
"key":b'-----BEGIN PRIVATE KEY-----\nMIIEv3yjqGA==\n-----END PRIVATE KEY-----',
"sheet_id":"1SfJ4k5k456lJKL4m7fkh3-KTs8",
"data_range":"Campaigns!A7:C",
"approval_range":"Campaigns!C1",
"meta_range":"Campaigns!C1:C4",
"description": "Automated Google Sheets import",
"report_suite_id": ["REPORT SUITE ID"],
"variable_id":"trackingcode",
"notification_email":"NOTIFICATION ADDRESS",
"upload_time": datetime.datetime.now().strftime("%Y-%m-%d  %H:%M:%S")
}

Everything up to the “key” entry should be familiar from Importing Organic Google Search data to Adobe Analytics with a single script. “sheet_id” holds the spreadsheet ID from Google Sheets (you can find it in the Sheet URL). “data_range”, “approval_range” and “meta_range” tell the script where the classification data lives, where the approval toggle is, and where the meta information should be written to. The remaining entries define the information needed for the Classifications API: the file description, which Report Suite and variable (here: “trackingcode”, which is the campaign variable eVar0) the data should be imported into, which email address should receive notifications, and when the file has been generated.

Now we need to authenticate to and call the Google Sheets API, receive the data we need and put it in the right structure for Adobe Analytics:

def get_authenticated_google_service():
    flow = flow_from_clientsecrets("client_secrets.json", scope="https://www.googleapis.com/auth/spreadsheets",
    message="MISSING_CLIENT_SECRETS_MESSAGE")
    storage = Storage("oauth2.json")
    credentials = storage.get()
    if credentials is None or credentials.invalid:
        credentials = run_flow(flow, storage)
    return build("sheets", "v4", http=credentials.authorize(httplib2.Http()))

google_sheets = get_authenticated_google_service().spreadsheets()
data_rows = google_sheets.values().get(spreadsheetId=config["sheet_id"], range=config["data_range"]).execute()
upload_flag = google_sheets.values().get(spreadsheetId=config["sheet_id"], range=config["approval_range"]).execute()

if upload_flag["values"][0][0]=="No":
    print("File not ready for upload. Exiting...")
    google_sheets.values().update(spreadsheetId=config["sheet_id"], range="Campaigns!C2", valueInputOption="USER_ENTERED", body={"values":[[config["upload_time"]]]}).execute()
    sys.exit()

classification_rows = []
for row in data_rows["values"]:
    classification_rows.append({"row":row})

if len(classification_rows) < 1:
    print("No data to upload. Exiting...")
    sys.exit()

The get_authenticated_google_service function takes care of authentication with the Google API. The two values().get() calls then fetch the actual rows to upload and the value of the approval toggle. If we are not allowed to upload the data because of the approval field, we just update the “last checked” field in C2 and exit the script. Otherwise we iterate through the rows we received and put them into a list in the format we need for Analytics. We can do it like this because my Sheet has the same column structure we need for the upload; if your file is different, you can map the fields in that loop yourself. If there are no rows at all, we exit as well.
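For reference, here is roughly what the involved structures look like, assuming the sample campaigns from above (the concrete values are illustrative):

# Shape of the Sheets response and of the list we build for Analytics (values are examples)
# data_rows["values"]  -> [["sea1", "SEA", "Summer Sale"], ["sea2", "SEA", "Winter Sale"], ...]
# classification_rows  -> [{"row": ["sea1", "SEA", "Summer Sale"]}, {"row": ["sea2", "SEA", "Winter Sale"]}, ...]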

Like we did before, now we need to authenticate to the Adobe API:

def get_jwt_token(config):
    return jwt.encode({
        "exp": datetime.datetime.utcnow() + datetime.timedelta(seconds=30),
        "iss": config["orgId"],
        "sub": config["technicalAccountId"],
        "https://{}/s/{}".format(config["imsHost"], config["metascopes"]): True,
        "aud": "https://{}/c/{}".format(config["imsHost"], config["apiKey"])
    }, config["key"], algorithm='RS256')

def get_access_token(config, jwt_token):
    post_body = {
        "client_id": config["apiKey"],
        "client_secret": config["secret"],
        "jwt_token": jwt_token
    }

    response = requests.post(config["imsExchange"], data=post_body)
    return response.json()["access_token"]

def get_first_global_company_id(config, access_token):
    response = requests.get(
        config["discoveryUrl"],
        headers={
            "Authorization": "Bearer {}".format(access_token),
            "x-api-key": config["apiKey"]
        }
    )
    return response.json().get("imsOrgs")[0].get("companies")[0].get("globalCompanyId")

jwt_token = get_jwt_token(config)
access_token = get_access_token(config, jwt_token)
global_company_id = get_first_global_company_id(config, access_token)

Next up is the interesting part. We are creating our import job with the Adobe Analytics Classification API like this:

jobid = requests.post(
        "https://api.omniture.com/admin/1.4/rest/?method=Classifications.CreateImport",
        headers={
            "Authorization": "Bearer {}".format(access_token),
            "x-api-key": config["apiKey"],
            "x-proxy-global-company-id": global_company_id
        }, 
        json={
            'rsid_list':config["report_suite_id"],
            "element":config["variable_id"],
            "check_divisions":0,
            "description":config["description"],
            "email_address":config["notification_email"],
            "export_results":0,
            "header":["Key","Channel","Campaign"],
            "overwrite_conflicts":1
        }
    ).json()["job_id"]

The “header” and “overwrite_conflicts” fields are the only ones we might need to adapt here. “header” defines which columns we want to import and in which order; make sure the order and number of fields matches what has been collected from the Google Sheets API. “overwrite_conflicts” defines whether we want to overwrite existing data, which we most certainly want to do.

With the job ID we receive as a result, we now populate that job with data. If you have to import more than 25,000 rows, you need to split them across multiple of these calls with increasing page numbers (there is a small sketch of that below), but for now we keep it simple:

result = requests.post(
        "https://api.omniture.com/admin/1.4/rest/?method=Classifications.PopulateImport",
        headers={
            "Authorization": "Bearer {}".format(access_token),
            "x-api-key": config["apiKey"],
            "x-proxy-global-company-id": global_company_id
        }, 
        json={
            'job_id':jobid,
            'page':1,
            'rows':classification_rows
        }
    ).json()

There is not much going on here. The “rows” field is the only interesting one, because this is where we add the rows we built from the Google Sheets result earlier.
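In case your classification list ever grows beyond the 25,000-row limit mentioned above, a minimal sketch of how the rows could be chunked across multiple PopulateImport calls within the same job might look like this (the PAGE_SIZE constant and the loop are my addition, not part of the original script):

PAGE_SIZE = 25000  # assumed maximum number of rows per Classifications.PopulateImport call

for page_number, start in enumerate(range(0, len(classification_rows), PAGE_SIZE), start=1):
    chunk = classification_rows[start:start + PAGE_SIZE]
    requests.post(
        "https://api.omniture.com/admin/1.4/rest/?method=Classifications.PopulateImport",
        headers={
            "Authorization": "Bearer {}".format(access_token),
            "x-api-key": config["apiKey"],
            "x-proxy-global-company-id": global_company_id
        },
        json={
            "job_id": jobid,
            "page": page_number,  # pages are 1-based
            "rows": chunk
        }
    ).json()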

Now all we need to do is submit the import job by committing it like this:

result = requests.post(
        "https://api.omniture.com/admin/1.4/rest/?method=Classifications.CommitImport",
        headers={
            "Authorization": "Bearer {}".format(access_token),
            "x-api-key": config["apiKey"],
            "x-proxy-global-company-id": global_company_id
        }, 
        json={
            'job_id':jobid
        }
    ).json()

And that’s it! Adobe will send you an email with the result of the import. To finish things up, we update our Sheet with the import time and number of rows uploaded. Also, we reset the approval toggle to “No”:

google_sheets.values().update(spreadsheetId=config["sheet_id"], range=config["meta_range"], valueInputOption="USER_ENTERED", body={"values":[
    ["No"],
    [config["upload_time"]],
    [config["upload_time"]],
    [len(classification_rows)]
    ]}).execute()

The last block changes the meta information like this:

That’s all we need to do! As always, you can find the whole script on GitHub.

Making your life easy with automation

This little script allows us to streamline our campaign classification process in a meaningful way. Now there is only one file for our internal team and agency to use. Once the agency has finished setting up a campaign, the marketing manager can check the sheet and approve it for upload. We can run our script through a cron job every day (or hour) and give our marketers a quick, self-service workflow.
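As an example, a crontab entry like the following could trigger the upload every morning (the Python path and script name are placeholders for wherever you store the script):

# Run the classification upload every day at 6 AM
0 6 * * * /usr/bin/python /path/to/sheets_classification_upload.py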

But even outside of marketing, setting up highly automated processes like this can change how you work with classifications. When there is no tedious upload process or file format to manage, the threshold for using classifications to correct or translate data drops drastically.

Let me know what you think of this approach and what other cool use cases come to your mind!