TIPS & TRICKS TIPS & TRICKS

Quick N’ Dirty: Funnels

I recently had a customer ask me how to calculate funnels in Splunk. His source data consisted of custom application logs, but this method will work with any logs that have a field representing a unique visitorID.

In this context, a “funnel” is a calculation that shows what percentage of visitors progressed through each step in a process, usually a purchase process. So, for example, a classic funnel would show how many people visited a site, clicked on a product page, added the item to their shopping cart, and then purchased the item.

In Splunk, of course, this is simple, as long as you are familiar with the appendcols function. The appendcols function allows you to “glue” two independent searches together into the same table of results. In the case of a funnel, you would use the first search:

sourcetype="funnel" event_type="page_view" earliest="01/09/2014:23:00:00" latest="01/09/2014:24:00:00" | stats dc(VisitorID) as total_visitors

funnel_1

And then you would “glue” the second search onto it using the appendcols command.

sourcetype="funnel" event_type="page_view" earliest="01/09/2014:23:00:00" latest="01/09/2014:24:00:00" | stats dc(VisitorID) as total_visitors | 
appendcols [search sourcetype="funnel" event_type="product_view" earliest="01/09/2014:23:00:00" latest="01/09/2014:24:00:00" | stats dc(VisitorID) as visitors_who_viewed_products] |

funnel_2

Note that I’ve set time parameters for each search in the query itself. It’s actually not necessary for the first search — the time parameters in the interface would handle that if I hadn’t explicitly set them in the query. However, the second search triggered by the colspan function does require time parameters to be set.

Now, You could continue the “gluing” until you’ve included every step in the process:

sourcetype="funnel" event_type="page_view" earliest="01/09/2014:23:00:00" latest="01/09/2014:24:00:00"| stats dc(VisitorID) as total_visitors | 
appendcols [search sourcetype="funnel" event_type="product_view" earliest="01/09/2014:23:00:00" latest="01/09/2014:24:00:00"| stats dc(VisitorID) as visitors_who_viewed_products] | 
appendcols [search sourcetype="funnel" event_type="add_to_cart" earliest="01/09/2014:23:00:00" latest="01/09/2014:24:00:00" | stats dc(VisitorID) as visitors_who_added_items_to_carts] | 
appendcols [search sourcetype="funnel" event_type="check_out" earliest="01/09/2014:23:00:00" latest="01/09/2014:24:00:00"| stats dc(VisitorID) as visitors_who_checked_out]

funnel_3

If you want the results to be percentages, add an “eval” for each value:

sourcetype="funnel" event_type="page_view" earliest="01/09/2014:23:00:00" latest="01/09/2014:24:00:00"| stats dc(VisitorID) as total_visitors | 
appendcols [search sourcetype="funnel" event_type="product_view" earliest="01/09/2014:23:00:00" latest="01/09/2014:24:00:00"| stats dc(VisitorID) as visitors_who_viewed_products] | 
appendcols [search sourcetype="funnel" event_type="add_to_cart" earliest="01/09/2014:23:00:00" latest="01/09/2014:24:00:00"| stats dc(VisitorID) as visitors_who_added_items_to_carts] | 
appendcols [search sourcetype="funnel" event_type="check_out" earliest="01/09/2014:23:00:00" latest="01/09/2014:24:00:00" | stats dc(VisitorID) as visitors_who_checked_out] | 
eval visitors_who_viewed_products = round((visitors_who_viewed_products/total_visitors) * 100) . "%" |
eval visitors_who_added_items_to_carts = round((visitors_who_added_items_to_carts/total_visitors) * 100) . "%" |
eval visitors_who_checked_out = round((visitors_who_checked_out/total_visitors) * 100) . "%"

funnel_4

And if you want to clean it up in a few other ways, play with the “fields” command (to set the order), “transpose” command (to turn it vertical), and “rename” command (to rename the newly created “row1” field to something pretty).

sourcetype="funnel" event_type="page_view" earliest="01/09/2014:23:00:00" latest="01/09/2014:24:00:00"| stats dc(VisitorID) as total_visitors | 
appendcols [search sourcetype="funnel" event_type="product_view" earliest="01/09/2014:23:00:00" latest="01/09/2014:24:00:00"| stats dc(VisitorID) as visitors_who_viewed_products] | 
appendcols [search sourcetype="funnel" event_type="add_to_cart" earliest="01/09/2014:23:00:00" latest="01/09/2014:24:00:00"| stats dc(VisitorID) as visitors_who_added_items_to_carts] | 
appendcols [search sourcetype="funnel" event_type="check_out" earliest="01/09/2014:23:00:00" latest="01/09/2014:24:00:00" | stats dc(VisitorID) as visitors_who_checked_out] | 
eval visitors_who_viewed_products = round((visitors_who_viewed_products/total_visitors) * 100) . "%" |
eval visitors_who_added_items_to_carts = round((visitors_who_added_items_to_carts/total_visitors) * 100) . "%" |
eval visitors_who_checked_out = round((visitors_who_checked_out/total_visitors) * 100) . "%" |
fields total_visitors visitors_who_viewed_products visitors_who_added_items_to_carts visitors_who_checked_out |
rename total_visitors AS "All Visitors" visitors_who_viewed_products AS "Viewed Products" visitors_who_added_items_to_carts AS "Added Items" visitors_who_checked_out AS "Checked Out" |
transpose |
rename "row 1" AS "Purchase Funnel"

funnel_5

And Now For Something Completely Different

My brilliant colleague Gilberto Castillo noted that there is a much more efficient way to do this, as long as your data allows you to define each step in your funnel based on a single field like “event_type” (If you wanna get really fancy, you could create a single field by using the eval function to concatenate the multiple fields into one field.)

His approach bypasses the use of the appendcols function entirely and just counts unique visitors by event_type.

sourcetype="funnel" event_type="*"
| rex "(?<VisitorID>[0-9a-zA-Z]+)\s+event"
| eventstats dc(VisitorID) AS all_visitors
| search (event_type="product_view" OR event_type="add_to_cart" OR event_type="check_out")
| stats values(all_visitors) AS all_visitors dc(VisitorID) AS dc_event_type by event_type
| eval percentage=round(dc_event_type/all_visitors*100,2)
| fields - all_visitors dc_event_type

funnel_6
Try it out with your own data, or use the sample data I used to create this dashboard. You’ll find it below.

Thanks!
-S.

SAMPLE DATA: funnel.log (index as sourcetype “funnel” and extract the VisitorID field with the regex “(?<VisitorID>[0-9a-zA-Z]+)\s+event”)

 

Sondra Russell
Posted by Sondra Russell

Join the Discussion