Splunk Your Star Wars Data – Episode I. May the Fourth be with you.


So we’re getting more Star Wars films – I have to admit both my 7-year-old son and I are quite excited (I guess me maybe slightly more). Splunk is one of those pieces of software where you love showing people the kinds of things it can do. On the back of the Star Wars announcement, a bedtime conversation with my son about how many planets Luke Skywalker went to after Return of the Jedi led me to discover some Star Wars data sets, so I set about seeing if I could answer my son’s questions with Splunk.

Thanks to the power of the Force – I was able to take the downloadable Star Wars encyclopedia database and turn it into a collection of CSVs.  I uploaded those to Splunk and started asking some questions.

First up – I added four CSV files to separate indexes:

  • The entries (all the major entries in the Star Wars DB): about 53000 rows
  • The planets: about 4000 rows
  • The arcana (rumours/myths and behind the scenes stuff about Star Wars): 500 rows
  • The categories (helps join the data together – categories such as ship, planet, person etc.)


The Planets 

We decided to look at the different planets first.

First up I started searching the entries for Luke, Skywalker and Planet  – 407 events. Turns out that the dataset includes all the books, comics and “expanded Universe” set after Return of the Jedi.

index="star_wars_entries" luke skywalker planet

To narrow it down – I added the Planet ID and a limit of top 20 (to get to the most visited planets)

index="star_wars_entries" luke skywalker planet | top limit=20 PlanetID
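If you don't have Splunk to hand, here is a rough Python sketch of what that `top limit=20 PlanetID` step works out: count how often each value of a field appears, keep the most common ones, and show each as a share of the total. The `top_values` helper and the sample rows are made up for illustration (only the `PlanetID` field name and planet 19172 come from the search above), and Splunk's own `top` command may differ in details such as tie-breaking.

```python
from collections import Counter


def top_values(events, field, limit=20):
    """Return (value, count, percent) rows for the most common values of `field`."""
    # Only events that actually carry the field are counted.
    values = [event[field] for event in events if field in event]
    counts = Counter(values)
    total = len(values)
    return [
        (value, count, round(100.0 * count / total, 2))
        for value, count in counts.most_common(limit)
    ]


# Made-up sample rows, not taken from the real data set.
sample_events = [
    {"PlanetID": "19172"},
    {"PlanetID": "0"},
    {"PlanetID": "19172"},
    {"PlanetID": "0"},
    {"PlanetID": "0"},
    {"PlanetID": "4242"},
]

rows = top_values(sample_events, "PlanetID", limit=2)
for value, count, percent in rows:
    print(value, count, percent)
```

In this toy sample the zero ID comes out on top, much as it did in the real search, which is why the zero gets ignored when reading the results.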


Luke Planet Search

Turns out Planet 19172 is the most mentioned (ignoring the zero) when searching for Luke Skywalker and the planets he visited. Anyone like to guess what planet 19172 is?



I’m sure you guessed it right – Tatooine. That got us looking at the different planets. We created a Pivot table in Splunk with all the planets (their IDs and names), along with the sentient species, the other species that lived on each planet and its environment.


Planet Pivot

We then added a filter to help us find planet 19172 (Tatooine):

Planet Filter


We wanted to find out who lived on the planet, what the environment was like and so on, so we published this to a new dashboard. Next up we decided to see how many planets had what kind of environment – the search was pretty simple:

index="star_wars_planets" | top limit=20 Environment

Then it was just a case of choosing a pie chart:


Planet Pie

Endor came up in one of the searches we did – this led to the question (after much discussion about Ewoks not being baby Wookiees) about where Ewoks and Wookiees live. So we searched for all planets where the species living there was Wookiee or Ewok:

index="star_wars_planets" Wookiee OR Ewok | chart count by PlanetName
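For the curious, the idea behind that search can be sketched in a few lines of Python: keep any row that mentions either term (ignoring case), then count the matches per planet name. This is only an approximation. Splunk matches on indexed terms rather than plain substrings, the `count_by` helper is made up, and so are the sample rows and the `Species` field name (only `PlanetName` comes from the search above).

```python
from collections import Counter


def count_by(events, terms, field):
    """Count events mentioning any of `terms`, grouped by the value of `field`."""
    lowered_terms = [term.lower() for term in terms]
    counts = Counter()
    for event in events:
        # Treat the whole row as one lower-cased blob of text, a rough stand-in
        # for Splunk searching the raw event.
        raw_text = " ".join(str(value) for value in event.values()).lower()
        if any(term in raw_text for term in lowered_terms):
            counts[event.get(field)] += 1
    return dict(counts)


# Made-up sample rows; the Species field name is an assumption.
sample_planets = [
    {"PlanetName": "Endor", "Species": "Ewok"},
    {"PlanetName": "Kashyyyk", "Species": "Wookiee"},
    {"PlanetName": "Tatooine", "Species": "Jawa, Tusken Raider"},
]

species_counts = count_by(sample_planets, ["Wookiee", "Ewok"], "PlanetName")
print(species_counts)
```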

We had a quick look to see what the biggest planets were and you can see that below:

Planet Size

The last thing we did was to try and find out how many planets there were in each “region” (how many planets are in the Outer Rim etc.)

We put all that together on one dashboard that you can see below:



Luke Skywalker

As the original members of the Star Wars cast are back and Luke was my favourite (I was never cool enough to be Han) we decided to see what else we could find out about Luke. My son wanted to know the code numbers of every single battle droid but after some negotiation we decided to skip that.

The first question we asked the data was who else Luke visited planets with. This gave us a list of people ranging from R2-D2 through to Wedge Antilles and someone called Jaina Solo (who, it turns out, is the daughter of Han and Leia in the books).

We then got into a bit more detail to find out some more about Luke, his X-Wing and Tatooine (as the announcement of the new film said they are going to be filming in the desert again). There was a lot of information, as you can see below, so we changed the layout of the table to make it easier to put on a dashboard.

Lots of Luke


We then had a quick look to see what other spaceships (apart from the Millennium Falcon) Luke and Han had been on together:

index="star_wars_entries" CategoryID=34 Skywalker Solo | top limit=20 EntryName Description

Finally, we wrote a new search that pulled in the Category data, selected just the weapons of Star Wars, then picked out any entry matching *sabre* along with some details about the weapon. There are more types of lightsabre than I thought.
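The shape of that last search (join the entries to the categories, keep one category, then wildcard-match the name) can be sketched in Python as well. This is not the search we actually ran: the `entries_in_category` helper, the `CategoryName` field, the category IDs and the sample rows are all made up, and only `CategoryID`, `EntryName` and `Description` appear in the searches above.

```python
from fnmatch import fnmatchcase


def entries_in_category(entries, categories, category_name, pattern):
    """Join entries to categories on CategoryID, then wildcard-filter EntryName."""
    # Lookup from category ID to lower-cased category name.
    names_by_id = {c["CategoryID"]: c["CategoryName"].lower() for c in categories}
    wanted = category_name.lower()
    matches = []
    for entry in entries:
        if names_by_id.get(entry["CategoryID"]) != wanted:
            continue
        # Lower-case both sides so the wildcard match ignores case.
        if fnmatchcase(entry["EntryName"].lower(), pattern.lower()):
            matches.append((entry["EntryName"], entry["Description"]))
    return matches


# Made-up sample data; the IDs and CategoryName field are assumptions.
sample_categories = [
    {"CategoryID": "900", "CategoryName": "weapon"},
    {"CategoryID": "901", "CategoryName": "ship"},
]
sample_entries = [
    {"CategoryID": "900", "EntryName": "Lightsabre", "Description": "made-up description"},
    {"CategoryID": "900", "EntryName": "Blaster", "Description": "made-up description"},
    {"CategoryID": "901", "EntryName": "X-Wing", "Description": "made-up description"},
]

sabres = entries_in_category(sample_entries, sample_categories, "weapon", "*sabre*")
print(sabres)
```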


We put all of that data onto a “360 degree view of Luke Skywalker” dashboard (I know, I’m sorry for such an awful title). You can see the dashboard below:


Luke Dash

We had some fun looking through the information. I’ll see what else I can put together for Episodes 2 & 3 (or should that be Episodes 8 & 9?).


Happy Star Wars Day – May the Fourth Be With You

