
Your Data-Driven Christmas Playlist

I hope yule all forgive me for another festive blog post. As I write this, I’m sat in an airport on my way to San Francisco and the Christmas songs are in full flow. Inspired by the tune selection, I thought I’d use the 10-hour flight to see if I could work out the ultimate Christmas playlist - with a little help from Splunk.

I started with the Million Song Dataset. This is a pretty amazing collection of data on (unsurprisingly) a million songs. The full dataset is 280GB and the airport wifi wasn’t really up to that. Luckily there is a 10,000-song sample of the data, so I downloaded that and ingested it into Splunk instead.

My first step was figuring out what fields the dataset contains. Splunk extracted these for me. I then searched all 10,000 songs for “Christmas” and got back the following:

You can see the extracted fields (song name, duration, year, “hotness” of song etc) on the left, and the raw data on the right. “Christmas” is highlighted in yellow.

I wanted to start with the “hottest” of all the Christmas songs, so I created a quick table and sorted it by the song hotness factor, which is the column on the far right, below. You can see this brings up “This Christmas” from the album “A Classic Soul Christmas” by Donny Hathaway.
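If you don’t have Splunk to hand, the same search-then-rank step can be sketched in a few lines of Python. This is only an illustration, not the search I ran: the field names (`title`, `release`, `artist_name`, `song_hotttnesss`) are assumed to follow the Million Song Dataset’s naming, the function name is hypothetical, and the rows are made-up placeholders rather than real values from the dataset.

```python
# Placeholder rows: NOT real values from the dataset, just enough to run the sketch.
# Field names are an assumption based on the Million Song Dataset's naming.
songs = [
    {"title": "Placeholder Christmas Tune", "release": "Sample Album",
     "artist_name": "Artist A", "song_hotttnesss": 0.61},
    {"title": "Summer Song", "release": "Sample Beach Album",
     "artist_name": "Artist B", "song_hotttnesss": 0.90},
    {"title": "Winter Walk", "release": "A Sample Christmas",
     "artist_name": "Artist C", "song_hotttnesss": 0.75},
    {"title": "Christmas Placeholder", "release": "Another Album",
     "artist_name": "Artist D", "song_hotttnesss": None},
]


def christmas_songs_by_hotness(rows, keyword="christmas"):
    """Keep rows that mention the keyword in the title or album, hottest first."""
    keyword = keyword.lower()
    matches = [
        row for row in rows
        if keyword in row["title"].lower() or keyword in row["release"].lower()
    ]
    # Rows with no hotness score sort to the bottom instead of raising an error.
    return sorted(
        matches,
        key=lambda row: row["song_hotttnesss"] if row["song_hotttnesss"] is not None else -1.0,
        reverse=True,
    )


for row in christmas_songs_by_hotness(songs):
    print(row["title"], "|", row["release"], "|", row["song_hotttnesss"])
```

Swapping the sort key for an artist-level score would give the “by artist” view in the same way.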

Being totally honest, I’d not heard of Donny Hathaway, so I thought I’d sort the data by “Artist Hotness” to see if there was someone I recognized. This brought up the following order:

This put Snow Patrol at the top of the table in Splunk. That’s OK – I’ve heard of them, but I didn’t realise they’d made a Christmas song and that it was on the Christmas Dance Sensation album.

Next, I wanted to see when the hottest Christmas songs were recorded, so I plotted this in Splunk. The larger the circle, the hotter the song. The year is on the X axis and the duration of the song is on the Y axis. The colour of the circle shows how familiar people are with the song; the most familiar is the light orange circle on the right, “Away In A Manger”.

The hottest song is Chris Rea’s “Driving Home For Christmas” from 1986, lasting 241 festive seconds.

Finally, I wanted to see the most upbeat and the most relaxing of all the Christmas songs, so I looked at the data by song tempo. The charts below show the most upbeat, followed by the most relaxing:
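For anyone who wants to try the tempo ranking outside Splunk, here is a rough Python sketch of the idea. It assumes each song has a `tempo` field in beats per minute (as in the Million Song Dataset), treats tempo as a stand-in for “upbeat” versus “relaxing” just as I did above, and uses invented placeholder rows and a hypothetical function name rather than anything from the real data.

```python
import heapq

# Placeholder rows: titles and tempos (beats per minute) are invented for illustration.
christmas_songs = [
    {"title": "Placeholder Carol A", "tempo": 92.0},
    {"title": "Placeholder Carol B", "tempo": 168.5},
    {"title": "Placeholder Carol C", "tempo": 61.2},
    {"title": "Placeholder Carol D", "tempo": 140.0},
    {"title": "Placeholder Carol E", "tempo": 75.8},
]


def tempo_extremes(rows, n=2):
    """Return (most upbeat, most relaxing) as two lists of up to n rows each."""
    upbeat = heapq.nlargest(n, rows, key=lambda row: row["tempo"])      # fastest first
    relaxing = heapq.nsmallest(n, rows, key=lambda row: row["tempo"])   # slowest first
    return upbeat, relaxing


upbeat, relaxing = tempo_extremes(christmas_songs)
print("Most upbeat:", [row["title"] for row in upbeat])
print("Most relaxing:", [row["title"] for row in relaxing])
```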

So there you have it – the start of your Christmas playlist:

  • This Christmas by Donny Hathaway
  • We Wish You A Merry Christmas by Snow Patrol
  • Little Drummer Boy
  • Away In A Manger
  • An Old Fashioned Christmas
  • Adeste Fideles
  • Carol Of The Bells
  • Baby’s First Christmas
  • Up On The House Top
  • The Burden Of Hope

Have a lovely holiday season, some excellent tunes and as always, thank you for reading.

Matt

Posted by Matt Davies

Matt is Splunk's Head of Marketing for EMEA (and part-time Chief Colouring-In Officer). He's responsible for developing and executing marketing strategy for all of Splunk's core platforms in EMEA, working closely with Splunk customers to help them understand the value that new insights from machine data can deliver to their business. Matt is also one of Splunk's technical evangelists and communicates Splunk's go-to-market strategy in the region. Previously, Matt worked at Cordys, Oracle/BEA, Elata, Broadquay Consulting, iPlanet/Sun, Netscape and IBM. With nearly 20 years in the software industry, Matt has extensive knowledge of enterprise IT systems.
