TIPS & TRICKS

Order Up! | Custom Sort Orders

By default, Splunk search results are sorted in lexicographical order...lexico-WHAT?!?

Lexicographical order explained

In Splunk software, lexicographical order almost always sorts based on UTF-8 encoding, which is a superset of ASCII.

What does this really mean?

  • Numbers are sorted before letters.
  • Numbers are sorted based on the first digit. For example, the numbers 10, 9, 70, 100 are sorted lexicographically as 10, 100, 70, 9. 
  • Uppercase letters are sorted before lowercase letters. 
  • Symbols are not standard. Some symbols are sorted before numeric values. Other symbols are sorted before or after letters.

You can specify ascending or descending order in Splunk Web, which is the Splunk UI, and also in some search commands.

You can find out if a command uses lexicographical order by looking in the Usage section of the documentation for those commands. The most common commands that use lexicographical order are searchsort and timechart.

But what if you want to use a custom sort order?

If you have a small set of unique field values, then it's easy to create a custom sort order.  "Small" is subjective here—it really depends on how much typing you want to do. By small, I mean 15 or fewer values. One example is the 12 months of the year (which I use in an example in this blog). 

If you have a large set of unique field values, you can create a custom sort order if you can identify categories that the values can fit into.  For example, you can sort a set of numeric values that fit into categories like High, Medium and Low.

The basic steps to create a custom sort order are:

  1. Use the eval command to create a new field, which we'll call sort_field.
  2. Use the case function to assign a number to each unique value and place those values in the sort_field.
  3. Use the sort command to sort the results based on the numbers in the sort_field.
     

Here's an example:

You want to sort your search results by the values in the status field. The values in the status field are critical, high, medium, and low.

If you sorted these values using the lexicographical order, the values are sorted alphabetically. The ascending sort order would be critical, high, low and medium. Which is not that order that you want. Low needs to be last.

You create a custom sort order by assigning numeric values to each status level. You assign 1 to the level you want to appear first in your sorted list. Then, assign 2 to the level you want to appear next, and so forth.

Here is the eval command that you'd add to your search:

... | eval sort_field=case(status="critical",1,     
      status="high",2, status="medium",3, status="low",4)
    | sort sort_field 


By the way, the name of the sort field can be any name; I'm just using sort_field in these examples.

Another example

Here's another example. You want to sort the date_month field using a fiscal year order starting from July and ending with June (of the next year). You create a sort_field, identify the months, and assign a ranking value to each month.


This one requires a bit more typing. For example:

... | eval sort_field=case(date_month="july",1,
      date_month="august",2, date_month="september",3, 
      date_month="october",4, date_month="november",5, 
      date_month="december",6, date_month="january",7,
      date_month="february",8, date_month="march",9, 
      date_month="april",10, date_month="may",11, date_month="june",12)
    | sort sort_field 

If you don't want the sort_field field to appear in your search results, add the fields command at the end of your search. Use the minus sign ( - ) before the field name to exclude the sort_field from the results. For example:

... | fields - sort_field


Dealing with mixed case values

If the values in your field are sometimes in uppercase and sometimes in lowercase, you can change how the values appear in the search results by using the eval command with the upper or lower function before the case function in your search.

The following example uses the lower function on the date_month field and places the results into a new field called month. Then the eval command and case function are used to create the custom sort order.

... | eval month=lower(date_month)  
    | eval sort_field=case(month="july",1, month="august",2, month="september",3, 
       month="october",4, month="november",5, month="december",6, month="january",7, 
       month="february",8, month="march",9, month="april",10, month="may",11, month="june",12)
    | sort sort_field 
... | fields - sort_field

NOTE: If your values are dates, ensure that the dates are in UNIX time before performing the sort. See the Resources section at the end of this blog for links to information about sorting with dates.

Creating categories

If there are a large number of unique values in the field that you want to sort, you can use the case function to create categories based on ranges of values.  Consider the first example that sorted the search results by the values critical, high, medium, and low. That example used values from a field called status. But what if you don't have a field that categorizes the values?  This is Splunk software—you can create one!

Suppose that your data is in a field called data_field that contains values between 0 and 100. You want to create the value ranges and associated status levels shown in this table:

 Values    Status Level 
 0-34 Low
 35-69 Medium
 70-89 High
 90-100 Critical


To create the categories for each range of values, add this to your search:

... | eval status=case(data_field>=90, "Critical", data_field>=70 AND data_field<=89, "High",
       data_field>=35 AND data_field<=69, "Medium",data_field>=34 AND data_field<=0, "Low")

Then you can create the custom sort order based on the status field that you just created:

... | eval sort_field=case(status="Critical",1, status="High",2, status="Medium",3, status="Low",4)
    | sort sort_field 


Thanks for reading, and...

SPL it like you mean it! - Laura


Resources

Sorting dates:

  • If you are dealing with dates where some entries include leading zeros (04/03/19) and some do not (4/3/19), check out this Splunk Answers post.
  • If your dates appear in dd/mm/yyyy order and the sort is not working as you expect, see this Splunk Answers post.
     

Other examples:

Laura Stewart
Posted by

Laura Stewart

Laura unravels the SPL maze, bringing clarity to the murky. She has been a software instructor, wrote books on Excel, PowerPoint, and Project, and spent some very interesting time working at the Defense Intelligence Agency in DC. Laura is a Principal Technical Writer at Splunk, with a focus on the Splunk SPL documentation.

TAGS

Order Up! | Custom Sort Orders

Show All Tags
Show Less Tags

Join the Discussion