October 17, 2013

Still using 3rd party web analytics providers? Build your own using Splunk!

By Splunk

Why Build Your Own (BYO) Client-Side Analytics?

There are many 3rd party web analytics providers, such as Google Analytics and Omniture SiteCatalyst. However, given the flexibility of Splunk as a general-purpose analytics tool, many site owners opt to build their own client-side analytics powered by Splunk. Last month we talked about how the jQuery Foundation had its conference website use Splunk to collect and analyze all client-side events.

Compared to off-the-shelf web analytics tools, building your own client-side analytics gives you significant advantages:

Avoid giving away your users’ data to 3rd party providers
Own the complete raw client-side data (as opposed to an aggregation or a sampling), and access it securely – and for free
Unlimited tracking and customization: no collection limits or caps on custom dimensions/variables, such as those imposed by leading web analytics providers
Correlate client-side data with your already existing server-side logs or offline metadata

To learn more about the difference between server-side and client-side data, check out the first part of this previous blog post.

Let’s show you how you can easily instrument your own sites:

1) Tracking

Going through the 3 blue stages from right to left in the above diagram, the first step, tracking, is achieved by pasting a JavaScript snippet into your page to load a small analytics library. To help you with that, we’re providing an easy-to-use analytics library, sp.js, that gives you:

Page-level tracking such as unique visitors and pageviews data out of the box
Event-level tracking such as user interactions with an easy-to-use API

Simply add this script tag before the closing </head> tag on your page. This will asynchronously fetch the JavaScript library sp.js from a global CDN without impacting the page load time:

In the last line of the above script, make sure to replace https://www.example.com with the address of your data collector, discussed in the following section.
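Snippets of this kind usually work by defining a small stub that queues calls until the full library has loaded, so that tracking calls made early are not lost. The following is a minimal sketch of that general pattern, not the actual sp.js snippet: the names createStub and replay, the method names pageview and track, and the _endpoint field are all assumptions made for illustration.

```javascript
// Hypothetical stub: method names and the `_endpoint` field are assumptions
// for illustration, not the documented sp.js API.
function createStub(methodNames) {
  const queue = [];
  for (const name of methodNames) {
    // Each stub method records its own name and arguments for later replay.
    queue[name] = (...args) => {
      queue.push([name, ...args]);
    };
  }
  // Remember the collector address. A real snippet would also inject an
  // async <script> element at this point to fetch the library.
  queue.load = (endpoint) => {
    queue._endpoint = endpoint;
  };
  return queue;
}

// Replays the queued calls against the real implementation once it is loaded.
function replay(queue, impl) {
  for (const [name, ...args] of queue) {
    impl[name](...args);
  }
  queue.length = 0; // the queue is drained after replay
}

const sp = createStub(["pageview", "track"]);
sp.load("https://www.example.com"); // replace with your collector address
sp.pageview();
sp.track("Click Program Description", { expand: true });
```

Because the stub only records calls, the page can start tracking immediately while the library downloads in the background.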

2) Collection

To use sp.js, you must specify an endpoint to which tracking calls are sent. Behind that endpoint, a single collection server (or a distributed collection tier) responds to these calls and writes the tracked events to a log file, say events.log.
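To make the collector's job concrete, here is a minimal Node.js sketch of such an endpoint. It is not the sample collector mentioned in this post, and it rests on assumptions: that each event arrives as a URL-encoded JSON string in a query parameter named data (the real sp.js wire format may differ), and that port 3000 and the helper names toLogLine and createCollector are free choices made for illustration.

```javascript
const http = require("http");
const fs = require("fs");

// Turns one tracking request URL into a single JSON log line.
// Assumption: the event is a URL-encoded JSON string in a `data` query
// parameter; the real sp.js wire format may differ.
function toLogLine(requestUrl) {
  const params = new URL(requestUrl, "http://localhost").searchParams;
  const raw = params.get("data");
  if (raw === null) return null;
  try {
    // Re-serialize so that every event occupies exactly one line.
    return JSON.stringify(JSON.parse(raw));
  } catch (err) {
    return null; // malformed payloads are rejected
  }
}

// Creates (but does not start) a collector that appends events to logPath.
function createCollector(logPath) {
  return http.createServer((req, res) => {
    const line = toLogLine(req.url);
    if (line === null) {
      res.writeHead(400);
      res.end();
      return;
    }
    fs.appendFile(logPath, line + "\n", (err) => {
      // Pages and collector usually live on different hosts, so allow
      // cross-origin calls.
      res.writeHead(err ? 500 : 204, { "Access-Control-Allow-Origin": "*" });
      res.end();
    });
  });
}

// To run it (hypothetical port): createCollector("events.log").listen(3000);
```

Writing one JSON object per line keeps the log easy to tail and easy for a downstream tool to split into individual events.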

Again, to help you with this BYO project, we’re providing sample code on GitHub for a Node.js-based backend collector server, with instructions on how to run it.

Once deployed, copy the collector server address and use it in the last line of the script tag as mentioned above.

3) Analytics & Visualization

Finally, the file events.log can be ingested into Splunk, either by using a Splunk forwarder to send data to your existing Splunk deployment or by running a local Splunk instance that continuously monitors that file.

Once the data is in Splunk, the sky is the limit: set up Splunk monitoring & alerts, analyze with Splunk dashboards, or build your own custom visualizations for traffic segmentation, A/B testing, funnel analysis, etc.

Client-side tracking in action:

Consider the following website showing a program schedule that consists of sessions. In this particular case, a call was made by sp.js to track a user’s mouse click that’s expanding a session description. Note that, as with many client-side interactions, this mouse click cannot be tracked from web server logs as it doesn’t trigger a web server request.

Notice the tracked data consists of:

Event e – custom user event such as ‘Click Program Description’
Properties kv – set of key-value pairs representing properties associated with the event, such as the name of the speaker clicked, the title of the talk, and whether expand is true (as opposed to false for collapse). Properties also contain an automatically generated id field: a universally unique identifier for the visitor.
Timestamp t – automatically generated field specifying the exact client-side timestamp
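The three fields above can be sketched as a small record builder. This is an illustration only, not the sp.js implementation: the helper name buildEvent, the property names speaker and title, the placeholder values, and the use of a millisecond timestamp are all assumptions.

```javascript
const { randomUUID } = require("crypto");

// Builds one tracked event in the e / kv / t shape described above.
// Assumptions: the helper name, the property names, and the millisecond
// timestamp are illustrative; sp.js may encode these differently.
function buildEvent(name, properties, visitorId, now = Date.now()) {
  return {
    e: name, // custom event name
    kv: { ...properties, id: visitorId }, // event properties plus visitor id
    t: now, // client-side timestamp
  };
}

// A library would generate this once per visitor and persist it between
// page views; here it is generated fresh for the example.
const visitorId = randomUUID();

const event = buildEvent(
  "Click Program Description",
  { speaker: "Jane Doe", title: "Example Talk", expand: true },
  visitorId
);
console.log(JSON.stringify(event));
```

Keeping the visitor id inside kv means every event carries the key needed to group activity by visitor at analysis time.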

Finally, the following snapshot shows how this tracked event is monitored in real-time in Splunk as it gets collected and logged:

Code References – Available for you to use!

Publicly available sp.js JavaScript Library for Tracking:

https://github.com/splunk/splunk-demo-collector-for-analyticsjs#setup

Sample Node.js based Backend Server for Collection:

https://github.com/splunk/splunk-demo-collector-for-analyticsjs

----------------------------------------------------
Thanks!
Roy Arsan
