Adobe Customer Journey Analytics’ Summarize Function is Awesome
Just last week, I was celebrating Customer Journey Analytics’ new “Next or Previous” and Math features in Derived Fields. While I originally intended to update that post once the new Summarize function was released, I found this feature so useful and versatile that I decided to write an entire post about it. If you want a longer introduction, feel free to check out last week’s post. If not, just enjoy this one!
Summarize Derived Fields function
The Summarize function does pretty much what the name says: It reduces a larger set of values, like a user’s entire website journey, into a (for now) single value. For example, a user may view a bunch of pages during their session but you might only be interested in the very first page. The Summarize function can return that for you with ease!
Now, you may think that this example sounds somewhat familiar. And you would be right! The Allocation settings within Data Views can do some of those things too, but as they run after Derived Fields, their result isn’t available for further processing in Derived Fields. So, if you want a simple Entry Page report, just use the Allocation settings. However, if you want to concatenate the Entry Page with the Exit Page for a report, Summarize is your friend. Let’s continue with this use case as our first example:
String summarization
Continuing this train of thought, let’s look at how we can summarize string values. To start our exploration, let’s create a fresh Derived Field:

As with any Derived Field function, we start by finding the function in the left rail (1) and dragging it onto the empty canvas (2). Once we’ve selected a string-type field (3), we can explore the available summarization methods (4). Let’s see what those do:
- The “Count” and “Count Distinct” methods return how many (distinct) string values were seen for a Session or Person. If someone goes from the home page to a product page and back to the home page, the “Count” method returns 3 while the “Count Distinct” method returns 2. For some reason, the counts are returned as strings, so they can’t be used for calculations within Derived Fields, which currently lack a type conversion function.
- “Most Common” and “Least Common” return the respective value for a given Session or Person. In the previous example, “Most Common” returns “home page”, while “Least Common” returns “product page”. Those are super helpful to identify potential friction points in user journeys, where someone has to return to the same page multiple times within a Session.
- “First” and “Last” return exactly what they say, the very first and last item for a Session or Person. Currently, there is no way to control how empty fields/Unspecified values should be handled, which would be a neat addition.
- The documentation mentions a mysterious “Distinct” function that should return a set of all values, which seems to either have been dropped from development or postponed to a future release.
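To make these methods a bit more tangible, here is a minimal Python sketch of what they compute for a single Session. It is not how CJA implements Summarize. The function name summarize_strings, the dictionary keys, and the way ties are broken are my own assumptions for illustration.

```python
from collections import Counter


def summarize_strings(values):
    """Emulate the string summarization methods for one Session.

    Assumes `values` is a non-empty list in the order the values were seen.
    """
    counts = Counter(values)
    # most_common() sorts by frequency; how CJA breaks ties is unknown,
    # so the tie handling here is only an assumption.
    ranked = counts.most_common()
    return {
        # The post observes that counts come back as strings.
        "count": str(len(values)),
        "count_distinct": str(len(counts)),
        "most_common": ranked[0][0],
        "least_common": ranked[-1][0],
        "first": values[0],
        "last": values[-1],
    }


session = ["home page", "product page", "home page"]
summary = summarize_strings(session)
print(summary)
```

Running this on the home page, product page, home page journey from above gives a count of "3" and a distinct count of "2", matching the example in the list.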
The counting and “commonality” methods are pretty cool on their own, but the first and last methods got me thinking a bit more creatively. For example, why not give our UX team an easy-to-work-with summary of someone’s website journey in a Session? To do that, we could use a field like this:

In this example, I’m first pulling the very first page of the Session (1), then the last page of the Session (2). I’m then merging them together (3), but with some very neat flavor text to make the values easier to digest (4). In the end, I get some neatly formatted values that say things like “Started on Homepage, ended up on Product Page” (5), which is super easy to understand!
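The logic of that Derived Field can be sketched in a few lines of Python. This only mirrors the steps described above (first page, last page, concatenation with flavor text). The function name journey_summary and the sample page names are assumptions, not anything CJA exposes.

```python
def journey_summary(pages):
    """Build a readable label from the first and last page of a Session.

    Assumes `pages` is a non-empty list of page names in chronological order.
    """
    first_page = pages[0]   # Summarize with the "First" method
    last_page = pages[-1]   # Summarize with the "Last" method
    # Concatenate both values with some flavor text.
    return f"Started on {first_page}, ended up on {last_page}"


label = journey_summary(["Homepage", "Category Page", "Product Page"])
print(label)
```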
However, some things need a bit more work from Adobe. For example, I tried to concatenate the current page name to the very first page name, which currently gives me this weird backend error with a not-very-helpful error message:

While I’d love to do some fun calculations based on the number of pages, as mentioned above, the function always returns a string, making it impossible to do any calculations within Derived Fields. Bummer! While we are wishing for numerical outputs, let’s see what we can do with…
Numeric summarization
For numerical values, we can do some pretty straightforward things:
- Max and Min: Return the highest and lowest value for a Session or Person
- Median, Mean, and Sum: Similarly, the median, mean (average), and sum of all values for a Session or Person
- Count: The number of values for a Session or Person
- Again, the documentation teases a Distinct function, which is not in the product today
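As a rough illustration of what these methods return, here is a small Python sketch using the standard library. The function name summarize_numbers and the sample prices are made up for this example, and I am not assuming anything about the output types CJA uses.

```python
import statistics


def summarize_numbers(values):
    """Emulate the numeric summarization methods for one Session.

    Assumes `values` is a non-empty list of numbers.
    """
    return {
        "max": max(values),
        "min": min(values),
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "sum": sum(values),
        "count": len(values),
    }


# Hypothetical product prices seen in one Session.
prices = [19.0, 49.0, 25.0, 99.0]
numeric_summary = summarize_numbers(prices)
print(numeric_summary)
```

Note how median (37.0) and mean (48.0) differ for these prices, which is why it is nice to have both available.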
Those functions are pretty simple. For example, if we wanted to know which price of a product was the lowest in a Session, all we need is a Derived Field like this:

That’s quite cool! Finding a value like this would be impossible in Adobe Analytics and becomes very easy now in CJA. In Analysis Workspace, we could easily format this field as a string and use it as a breakdown on the journey summarization from earlier:

With this report, we now see that people who started on the home page and ended up on the order confirmation page were seeing higher-value products than those who just made it to the product page. When optimizing our page, we could try and show everyone products of a higher value. Neat!
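For readers who think in code, the breakdown behind this kind of report could look roughly like the Python sketch below. The event rows, the function name lowest_price_by_journey, and the choice to average the lowest prices per journey are all assumptions for illustration. Workspace does this for you without any code.

```python
import statistics
from collections import defaultdict

# Hypothetical event rows: (session_id, page_name, product_price or None)
events = [
    ("s1", "Home Page", None),
    ("s1", "Product Page", 120.0),
    ("s1", "Order Confirmation", 150.0),
    ("s2", "Home Page", None),
    ("s2", "Product Page", 40.0),
    ("s3", "Home Page", None),
    ("s3", "Product Page", 60.0),
]


def lowest_price_by_journey(rows):
    """Average the lowest price per Session, grouped by journey summary."""
    pages = defaultdict(list)
    prices = defaultdict(list)
    for session_id, page, price in rows:
        pages[session_id].append(page)
        if price is not None:
            prices[session_id].append(price)

    lowest = defaultdict(list)
    for session_id, seen in pages.items():
        journey = f"Started on {seen[0]}, ended up on {seen[-1]}"
        if prices[session_id]:
            lowest[journey].append(min(prices[session_id]))

    return {journey: statistics.mean(vals) for journey, vals in lowest.items()}


report = lowest_price_by_journey(events)
for journey, value in report.items():
    print(journey, value)
```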
However, there are some disappointing shortcomings of this function. For example, I tried to compare the current price of a product to the max price, only to see another weird error:

Also, for some weird reason, there is no way to get the first or last numerical value of a Session. But why not? I’d love to do calculations like “first price – last price” to see if people went to see higher-value products towards the beginning or end of their Sessions. This addition would be very much appreciated!
Of course, converting the numerical value to a string would allow us to create some neat, self-explanatory values for our users. Hopefully, we get something like this in the future. Talking about the future, let’s look at the last category of functions for…
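Just to spell out the calculation I have in mind, here is a tiny Python sketch. Summarize does not offer this today, so the function name price_trend and the sample prices are purely hypothetical.

```python
def price_trend(prices):
    """Return first price minus last price for one Session.

    A positive result means the higher-value product was seen at the start,
    a negative one means it was seen at the end. Assumes a non-empty list
    in chronological order.
    """
    return prices[0] - prices[-1]


trend = price_trend([120.0, 80.0, 60.0])
print(trend)
```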
Date summarization
If we have a field in datetime format in our data, we can summarize it like this:
- Count and Count Distinct: Returns the number of (unique) values in the selected scope
- Most Common and Least Common: Again return the values received most and least often in a scope
- First and Last: Exactly the same as with String summarization
- Earliest and Latest: Oooh, a new one! Similar to Min and Max in numerical summarization, this gives us the lowest and highest date values
- The obligatory, mysterious Distinct function, which is not there
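The difference between First/Last and Earliest/Latest is easiest to see with a small example. The Python sketch below is only an illustration. The function name summarize_dates and the sample flight dates are assumptions.

```python
from datetime import datetime


def summarize_dates(values):
    """Emulate some date summarization methods for one Session.

    Assumes `values` is a non-empty list in the order the values were seen.
    """
    return {
        "first": values[0],       # first value seen in the Session
        "last": values[-1],       # last value seen in the Session
        "earliest": min(values),  # lowest date, regardless of order seen
        "latest": max(values),    # highest date, regardless of order seen
    }


# Hypothetical flight dates a user looked at, in the order they were viewed.
flights = [datetime(2024, 12, 24), datetime(2024, 11, 2), datetime(2025, 1, 15)]
date_summary = summarize_dates(flights)
print(date_summary)
```

Here the first flight date viewed is December 24, but the earliest one is November 2.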
Those are pretty straightforward. The Earliest and Latest methods are especially cool. If you are a travel or entertainment company, you could easily get the earliest flight date or concert date a user looked at with a function like this:

Neat! Again, if we had type conversion, we could do some nifty calculations with the returned dates, like analyzing how far in the future a travel date is as a string value (like “flights between 10 and 14 days from today”) or adding it to the name of the travel destination, like “Flight to Hawaii on December 24”. But before we complain too much, let’s get to the…
Conclusion and wish list
I’ve been looking forward to this feature for a very long time. Ever since I first started using CJA some years ago, something like Derived Fields and the Summarize function has been very high on my wish list. And I like Summarize a lot!
However, I can’t stop myself from wishing for more and seeing some missed potential. Particularly, I want to ask for these enhancements:
- For numerical summarization, I really want to have the first and last value of a Session. I went over some potential use cases above, and I really think it would be both easy to build and helpful to many more users.
- Similarly, the count of values should return a number, not a string.
- First and Last summarizations should have an option to treat empty fields/Unspecified in a defined way.
- Derived Fields desperately need a type conversion feature. Since fields in AEP are strongly typed, being able to concatenate a string with a number or getting just the day of the month as a number from a datetime field would be amazing.
- I don’t really know what’s up with those errors I’ve been seeing. The documentation mentions “only applicable for the session & event tables” but never explains what that means. If this means that we can’t use event-scoped fields with session-scoped fields, I’d ask Adobe to please make that work and think about some more helpful error messages in the future (or prevent me from doing things wrong in the first place)!
- While playing around with the feature, I had the idea of using some returned values as comparison values in the “Case When” function. However, I was disappointed to find that we cannot use other fields as comparison values. Specifically, I tried to find the page name of the page with the highest product price on it, but couldn’t use the highest price as a comparison value. Bummer!
- I wish we had the mysterious Distinct method available already. On a similar note, being able to concatenate all page names of a Session would be really powerful. Maybe we get this type of chronological concatenation in the future, so we don’t have to use Query Service for it.
- Doing summarizations with arrays would be fun, too!
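To illustrate two of the wishes above, here is a short Python sketch of the results I was hoping to get: the page with the highest product price and a chronological concatenation of all page names. None of this exists in Derived Fields today. The function names, the separator, and the sample rows are all hypothetical.

```python
def page_with_highest_price(rows):
    """Return the page name that carries the highest product price.

    `rows` is a list of (page_name, price or None) in chronological order.
    Assumes at least one row has a price.
    """
    priced = [(page, price) for page, price in rows if price is not None]
    page, _ = max(priced, key=lambda item: item[1])
    return page


def concatenated_path(rows, separator=" > "):
    """Join all page names of a Session in chronological order."""
    return separator.join(page for page, _ in rows)


# Hypothetical Session: (page_name, product_price or None)
session_rows = [
    ("Home Page", None),
    ("Product A", 40.0),
    ("Product B", 99.0),
    ("Cart", None),
]
top_page = page_with_highest_price(session_rows)
path = concatenated_path(session_rows)
print(top_page)
print(path)
```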
Needless to say, I still like the feature quite a bit. Once I can cross those items off my wish list, I will be even happier!
How about you? Do you agree? What else do you wish for? Feel free to let me know!

German Analyst and Data Scientist working in and writing about (Web) Analytics and Online Marketing Tech.
2021 – current Adobe Analytics Champion
EMEA Adobe Analytics User Group Lead
Adobe Analytics Community Advisor