Creating Marketing Channel Stacking and Pathing Reports from Adobe Analytics with Power BI

Ever since my very first attendance, Adobe Summit is my number-one source for inspiration for new things to try out in Adobe Analytics. When the world’s leading practitioners and product experts from Adobe come together to share their knowledge, there is a lot to learn for everyone.

This year, Eric Matisoff invited me to share a visualization I created in Analysis Workspace as part of the Analytics Rockstars session. However, the true Rockstar content in that session was the Tips & Tricks shared by Jenn Kunz using Excel with the Flow viz in Workspace. A followup conversation on Measure Slack then unveiled some improvements using Data Warehouse and reminded me of an approach of my own that I want to share today.

Some years back I used Adobe Analytics’ Data Feeds with Elasticsearch and Grafana to analyze marketing performance beyond what Adobe Analytics has to offer. While that was a very powerful approach at the time, it required a lot of expertise with Data Feeds and Big Data systems that not every company can maintain. Other approaches try to maintain path information in the frontend as Jan described here (in German) but, in my personal experience, are not as reliable as using the already captured data.

Today we are going to use a fun combination of Adobe Analytics’ Data Warehouse API with Python in Power BI to create what marketers know and love as Channel Stacking Reports. Those reports help marketers understand which combinations of marketing channels were involved in a user journey and in which order. Channel stacking reports commonly look like this:

Marketing Channel Stacking Report from Adobe Analytics data

As you might notice, those reports look quite familiar to any other type of pathing report, and you are right! Because of this similarity, the approach from this post works for any type of pathing use case, like on-site pathes, conversion funnels, engagement journeys, etc. Let’s get started!

Step 1: Choose pathing dimension and scope

Before we start with the actual setup, we first need to choose which dimension we want to use for our path. Some common examples are:

  • Use a Page dimension to analyze how people traverse your page during a Visit
  • Use a Marketing Channel dimension (as shown above) to see which channels contribute to converting user journeys
  • Use a Product Feature dimension to visualize which features of the product are used in succession in a Visit
  • Use an Article Topic dimension to understand how a certain topic contributes to other topic’s success in a Visit

Depending on which dimension you choose there are many different insights to discover. Don’t worry if there is more than one dimension you would like to try out, you can repeat the process described below as often as you like!

Next, we need to think of the scope that we want to analyze. If we go with the Page dimension, it probably makes most sense to only analyze the Page paths during a Visit, but not for everything a Visitor has ever done. That view on the full user journey might be more interesting for Marketing Channel use cases, where multiple Visits may be involved in the longer journey. Once we have decided, we can go on and…

Step 2: Create a Segment to reduce the amount of data

Technically speaking, there is no limit on how much data we can extract through Adobe Analytics’ Data Warehouse feature. However, we should keep in mind that we need to process all the data that we are going to receive, so it is probably a good idea to only extract what we really need.

Think of the Page path example mentioned above. For that example, we only need to consider individual Hits that have a Page to analyze, right? To filter out all other Hits we just need a simple Segment like this:

Filtering for Hits with a set page in Adobe Analytics

This simple trick will make our processing much easier because we don’t need to filter through Gigabytes of irrelevant data. On the top right of the screenshot above, you can see that Data Warehouse is listed under Product compatibility. This is an important information to watch out for because Data Warehouse does not support all metrics and dimensions or very complex Segments.

Another type of Segment that can be very helpful for marketing use cases looks like this:

Filtering for Hits with new Marketing Channels in Adobe Analytics

We are doing something similar to the Page filter from before, but this time we are filtering for Marketing Channel Instances. Since Marketing Channels are usually configured to be valid for a long time, we may end up with duplicates in our path if we don’t filter for the Hits where a new channel was spotted.

Besides those very simple Segments we could add some more logic. For example, we can only filter for journeys where the Visit Number equals 1, making sure the full journey is covered from the start. Similar to that, we could segment for journeys or Visits with at least one purchase to identify successful paths. Once the Segment is built, we can…

Step 3: Create the Python Script in Power BI

If you’ve read any of my previous posts involving Python, Adobe Analytics, and Power BI, you probably know how we are going to pull data from Adobe Analytics: Using Julien’s awesome aanalytics2 Python package! You can check out this post for a detailed setup guide or this post for another example.

Once we have all packages in place and the Adobe IO account created, we can start with the actual Python script using the Python data source in Power BI. First, we need to authenticate and create a connection to the API like this:

import aanalytics2 as api2
import pandas as pd
import time

api2.importConfigFile('\Path\to\config.json')
login = api2.Login()
companyids = login.getCompanyId()
companyid = companyids[0]['companyName']
aaapi = api2.LegacyAnalytics(company_name=companyid)

Once we are authenticated, we then need to define the data request to send to the API. For my Marketing Channel Stack example, a request might look like this:

reportDescription = {
    "reportDescription": {
        "reportSuiteID":"reportsuiteid",
        "dateFrom":"2022-01-01",
        "dateTo":"2022-04-01",
        "source":"warehouse",
        "segments":[
            {"id":"s1427_62534df5gd3afs4gdb0ds36a422a9"}
        ],
        "metrics":[
            {"id":"visits"}
        ],
        "elements":[
            {"id":"visitorid"},
            {"id":"visitnumber"},
            {"id":"clickstopage"},
            {"id" : "lasttouchchannel"}
        ]
    }
}

There is a bunch going on here, so let’s go through it together:

  • In line 3, we have to put the ID of the Report Suite from which we want to get our data
  • Lines 4 and 5 define the date range of our extract
  • Line 6 specifies that we want to have data from Data Warehouse
  • Lines 7 to 9 let us define the ID of a Segment (or multiple) to use to reduce the amount of data we need to process. This ID can be found in the URL if you open a Segment in the editor
  • Line 11 defines the metric we want to use. Visits is a good catch-all to cover practically all use cases
  • Lines 13 and following now define the actual data we want to pull. The first three elements allow us to pull the user id (usally the Experience Cloud ID), the visit number, and the hit number, all of wich we need to bring all touchpoints in the right order. The fourth element, lasttouchchannel in my example, contains the actual dimension we want to use for pathing. You may want to change this to something like page for other use cases. You can find the names of the dimensions and their corresponding elements here.

To reduce the amount of data you are querying for testing you can also add a time to the dateFrom and dateTo parameters, so they look something like “dateFrom”:”2022-04-01 12:00″, “dateTo”:”2022-04-01 12:10″ to only request 10 minutes of data.

Next, we need to send our query to the Adobe Analytics API. This needs to be done in two steps: First, we are requesting the report like this:

report = aaapi.postData(method="Report.Queue",data=reportDescription)
reportID = report["reportID"]

This will give us a report ID back. With that report ID, we then need to periodically check if our data is ready to be retrieved. This example below checks every 30 seconds for the report to be ready:

error = "report_not_ready"
while(error=="report_not_ready"):
    response = aaapi.postData(method="Report.Get",data={"reportID":reportID})
    if "error" in response:
        error = response["error"]
        time.sleep(30)
    else:
        error = "none"

This loop will continue until the data is available. Once that is the case, we can create an empty Pandas data frame (which Power BI expects) and loop through the nested data from the API:

df = pd.DataFrame()
for visitor in response["report"]["data"]:
    visitorID= visitor["name"]
    for visit in visitor["breakdown"]:
        visitNr = visit["name"]
        for hit in visit["breakdown"]:
            hitID = hit["name"]
            page = hit["breakdown"][0]["name"]
            df=df.append({"Visitor":visitorID,"Visit":int(visitNr),"Hit":int(hitID),"Page":page}, ignore_index = True)

That’s already all we need! Power BI can now further process the Pandas data frame in the next script. The full script looks like this:

import aanalytics2 as api2
import pandas as pd
import time

api2.importConfigFile('\Path\to\config.json')
login = api2.Login()
companyids = login.getCompanyId()
companyid = companyids[0]['companyName']
aaapi = api2.LegacyAnalytics(company_name=companyid)

reportDescription = {
    "reportDescription": {
        "reportSuiteID":"reportsuiteid",
        "dateFrom":"2022-01-01",
        "dateTo":"2022-04-01",
        "source":"warehouse",
        "segments":[
            {"id":"s1427_62534df5gd3afs4gdb0ds36a422a9"}
        ],
        "metrics":[
            {"id":"visits"}
        ],
        "elements":[
            {"id":"visitorid"},
            {"id":"visitnumber"},
            {"id":"clickstopage"},
            {"id" : "lasttouchchannel"}
        ]
    }
}

report = aaapi.postData(method="Report.Queue",data=reportDescription)
reportID = report["reportID"]

error = "report_not_ready"
while(error=="report_not_ready"):
    response = aaapi.postData(method="Report.Get",data={"reportID":reportID})
    if "error" in response:
        error = response["error"]
        time.sleep(30)
    else:
        error = "none"

df = pd.DataFrame()
for visitor in response["report"]["data"]:
    visitorID= visitor["name"]
    for visit in visitor["breakdown"]:
        visitNr = visit["name"]
        for hit in visit["breakdown"]:
            hitID = hit["name"]
            channel= hit["breakdown"][0]["name"]
            df=df.append({"Visitor":visitorID,"Visit":int(visitNr),"Hit":int(hitID),"Channel":channel}, ignore_index = True)

Now that our data is ready, we can start…

Step 4: Processing the data in Power BI

As a first step in the actual Power BI Query Editor, we should check that our columns are correctly typed as shown below, with Hit and Visit as a number. The data contains decimal places with a dot as separator, so you might need to change your file’s region settings to US:

Checking data types in Power BI

Next, we need to sort the data to ensure the correct order of events in our pathing. You want to order by Visitor first, then Visit, and Hit last as shown here. If you want some backwards paths instead you could also define that logic here:

Sorting by three columns in Power BI

Depending on the scope you have chosen for your use case we now need to group by either the Visitor column or the Visitor and Visit column. If you want a Visit scope, you would do it like this in the Advanced Editor:

#"Grouped Rows" = Table.Group(#"Sorted Rows", {"Visitor", "Visit"}, {{"Count", each Table.RowCount(_), Int64.Type},{"Path", each Text.Combine( [Channel] , " > "), type text}})

On a Visitor level, you would do it like this:

#"Grouped Rows" = Table.Group(#"Sorted Rows", {"Visitor"}, {{"Count", each Table.RowCount(_), Int64.Type},{"Path", each Text.Combine( [Channel] , " > "), type text}})

The grouping will combine the individual pathing items, like the Channel or Page, and combine it by a ” > ” separator. Your table should now look something like this, where you can see the full path on the right:

Processed data in Power BI

You can see a second new column in here, called “Count.” In that column we are counting how many pathing items (Channels in our case) are part of this specific journey. Now all that is left is to add some…

Part 5: Visualizations in Power BI

A very basic way to visualize this data is by adding a Table visualization that shows the individual Paths along with the number of Touchpoints and how often a Journey occurred:

As you can see, I also added two big Card visualizations on the right that show the total number of Journeys and the average number of Touchpoints per Journey. Quite nice!

Closing thoughts

This was a fun post! Pathing reports are something that is sadly still missing from both Adobe Analytics and Customer Journey Analytics (I have high hopes for CJA on this) but Power BI can help us quite a bit! In case you didn’t know: Power BI is free for the Desktop version, so there is nothing stopping you from downloading it right away and start analyzing!

As an added bonus, Power BI is using Power Query under the hood, so the same process should work just as well for Excel. Of course you could also manually download Data Warehouse files and use that for Power BI, but I like the easily repeatable API route more. I hope I was also able to show how you could change the pathing dimension and scope to anything that fits your use case.

In theory, we could also bring those reports back to Adobe Analytics, allowing us to analyze them in Analysis Workspace. Let me know if this would be of interest to you! As always, also let me know of the cool other use cases you can think of and have a great rest of your day!