Building your own Web Analytics from Log Files – Part 1: Motivation

This is the first part of the six-part series “Building your own Web Analytics from Log Files”.

What is Web Analytics

As the owner or administrator of a website, you will go through different phases of maturity. When you are just starting with a hobby or web project, you will most likely care about the technical setup and gaining traction. Once everything is up and running, you will start asking yourself questions like

  • How many people are using my website?
  • How many of those are new Visitors?
  • Which page on my website attracts the most (new) Visitors?

Those questions are Web Analytics questions. They are what Web Analysts spend their time on to deliver value to the business behind a website. To answer them, we most commonly use tools like Piwik (Matomo), Google Analytics, or Adobe Analytics. Those tools rely on some JavaScript code that needs to be integrated into a website to collect data about the Visitors. This is the mature phase, where time and money are spent on Analysts and their tools.

Delivering value fast and precisely

While it is definitely helpful to have some experts with you, I don’t think their absence should stop you from answering questions like the ones above. You can get there with a few hints on how to amp up the log file processing that is already happening.
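To make this a bit more concrete, here is a minimal sketch of how two of the questions above could be approximated from plain log files. It is not the setup built later in this series. It assumes the web server writes the Common Log Format and it treats each client IP address as one Visitor, which is only a rough proxy. The function name `summarize` and the sample lines are made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: Common Log Format, e.g.
# host ident user [time] "METHOD path PROTOCOL" status size
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+)[^"]*" (\d{3}) \S+')


def summarize(lines):
    """Return (number of distinct client IPs, page seen by the most distinct IPs)."""
    visitors = set()
    visitors_per_page = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not look like the assumed format
        host, _method, path, _status = match.groups()
        visitors.add(host)  # rough proxy: one IP address = one Visitor
        visitors_per_page[path].add(host)
    top_page = max(
        visitors_per_page,
        key=lambda page: len(visitors_per_page[page]),
        default=None,
    )
    return len(visitors), top_page


# Made-up sample lines using documentation-only IP ranges
sample = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '203.0.113.5 - - [10/Oct/2023:13:56:01 +0000] "GET /about.html HTTP/1.1" 200 1200',
    '198.51.100.7 - - [10/Oct/2023:14:02:10 +0000] "GET /index.html HTTP/1.1" 200 2326',
]
print(summarize(sample))  # (2, '/index.html')
```

Counting IP addresses cannot tell new Visitors from returning ones, and several people may share one address. That gap is why the web server needs to learn how to identify users and sessions, which later parts of this series cover.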

This series is intended for those between the first phase, where log files are used for initial technical diagnosis, and the mature phase, where dedicated tools are built and used by specialized experts. It aims to keep the cost and technical footprint as low as possible by using tools commonly found in those phases.

The expectation is that this approach will enable more people to dive into user behavior analysis and therefore help them put the customer first and iterate with more precision based on meaningful data instead of log files alone.

While progressing through this series, we will also look at the concepts behind the big Web Analytics tools, since we need to build something similar. To round things off, we will then take a peek at common questions relevant in a mature Analytics practice.

In the next part of the series, we will look at the architectural considerations when starting with log-file-based analytics. After that, we will teach our web server how to identify users and sessions, so that we can finally collect data and build some dashboards with it.