Tag: Nginx

Building an Enterprise Grade OpenSource Web Analytics System – Part 3: Data Collection

This is the third part of a seven-part-series explaining how to build an Enterprise Grade OpenSource Web Analytics System. In this post we are setting up the tracking backend with Nginx and Filebeat. In the last post we took care of the client side implementation of Snowplow Analytics. If you are new to this series it might help to start with the first post. Now that we have a lot of data that is being sent from our clients, we need to build a backend to take care of all the events we want. Since we are sending our requests unencoded via GET, we can just configure our web server to write all requests to a logfile and send them off to the processing layer. Configuring Nginx with Filebeat In our last project we used a configuration just like the one we need. As web server, we used and will […]

Building an Enterprise Grade OpenSource Web Analytics System – Part 1: Architecture

Some time ago I wrote a litte series on how to amp up your log analytics activities. Ever since then I wanted to start another project building a fully fledged Analytics system with client side tracking and unlimited scalability out of OpenSource components. This is what this series is about, since I had some time to kill during Easter in isolation 😊 This time, we will be using a tracker on the browser or mobile app of our users instead of logfiles alone, which is called client side tracking. That will give us a lot more information about our visitors and allow for some cool new use cases. It also is similar to how tools like Adobe Analytics or Google Analytics work. The data we collect has then to be processed and stored for analysis and future use. As a client side tracker, we will be using the Snowplow tracker. […]

Building your own Web Analytics from Log Files – Part 4: Data Collection and Processing

This is the fourth part of the six-part-series “Building your own Web Analytics from Log Files”. Legal Disclaimer: This post describes how to identify and track the users on your website using cookies, IP adresses and browser fingerprinting. The information and process described here may be subject to data privacy regulations under your legislation. It is your responsibility to comply with all regulations. Please educate yourself if things like GDPR apply to your use case (which is very likely), and act responsibly. In the last part we have built a configuration for OpenResty to generate user and session IDs and store them in browser cookies. Now we need a way to actually log and collect those IDs together with the requests our web server handles. OpenResty Configuration To be able to log our custom variables we need to announce them to Nginx. This is done right in the server-part of […]

Building your own Web Analytics from Log Files – Part 3: Setting up Nginx with OpenResty

This is the third part of the six-part-series “Building your own Web Analytics from Log Files”. Legal Disclaimer: This post describes how to identify and track the users on your website using cookies and browser fingerprinting. The information and process described here may be subject to data privacy regulations under your legislation. It is your responsibility to comply with all regulations. Please educate yourself if things like GDPR apply to your use case (which is very likely), and act responsibly. Identifying Users and Sessions One of our goals for this project is to be able to tell how many people are using our site. This means we need a way to differentiate between the users on our site. One approach would be to look at the IP addresses of our users. This is not very precise since all devices with the same internet connection share an IP address. Especially for […]

Building your own Web Analytics from Log Files – Part 2: Architecture

This is the second part of the six-part-series “Building your own Web Analytics from Log Files”. Architecture Overview To start of this series, let’s remember what we want to achieve: We want to enable a deeper understanding of our website users by enriching and processing the log files we already collect. This article looks at the components we need for this and how to make our life as easy as possible. To achieve our goal, we need to teach our web server to identify our users, store information about the activity in the log files, ship those files to storage and make it actionable with a way of visualizing it. Because I believe in Open Source Software, we will look at our options among that category. Another requirement is to introduce as less components as possible and keep scalability in mind. Choosing our Web Server The first part of our […]