Has anyone found an interesting way to do this within Splunk?
Topic: translating IP to country code from logs
There's one included by default in Splunk: just pipe your search to "iplocation" (e.g. sourcetype=foo | head 5 | iplocation). Additional fields for CityN and CountryN will then be available from the "Fields" drop-down menu.
The N represents the position of the IP on your event if you have more than one per event.
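To make the "N" suffix concrete, here is a rough Python sketch of numbering addresses by their position in an event. It only illustrates the idea and is not how Splunk does it internally; the regex, the number_ips helper and the ip1/ip2 key names are all assumptions made up for this example.

```python
import re

# Rough IPv4 matcher; an assumption for illustration, not Splunk's actual pattern.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def number_ips(raw_event):
    """Label each address found in the event text by its position: ip1, ip2, ..."""
    found = IPV4.findall(raw_event)
    return {f"ip{n}": ip for n, ip in enumerate(found, start=1)}

# Hypothetical event containing two addresses.
event = "10.0.0.5 - - GET /index.html via proxy 192.168.1.9"
print(number_ips(event))
```

In this picture, the first address would be the one behind City1 and Country1, and the second the one behind City2 and Country2.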
Hope that helps!
Thanks, it wasn't obvious to me from the documentation either how iplocation worked.
I've been trying this:
source="/my/webserver/access_log" | top limit=30 clientip | iplocation
That doesn't seem to work.
When I do it the other way:
source="/my/webserver/access_log" | iplocation | top limit=30 clientip
It runs for about 10 minutes and blows up with one of several errors.
Is there another way to display the most common cities? (My understanding is the "head" command you use in your example only returns the first few in the logs, not the most common ones.)
--
ps:
I took a look at ip2location, thinking maybe I could hit an internal database instead of making all those thousands of HTTP queries. Unfortunately it depends on something from http://www.maxmind.com, which is blocked by our proxy servers as a "Malicious Site".
You can try:
source="/my/webserver/access_log" | status count by clientip | sort -count | head 30 | iplocation
That should produce the same top 30 and only look up those IPs.
On the http://www.maxmind.com/ issue, that's strange. I would recommend that you contact your proxy admins and have them check it. There's nothing wrong with that site; they actually offer fraud protection services in addition to the GeoIP databases.
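The reason for ordering it this way (count first, look up last) can be sketched in Python. This is only an illustration of the idea, not Splunk code; locate_top and the lookup callable are hypothetical names standing in for whatever geolocation service is used.

```python
from collections import Counter

def locate_top(client_ips, lookup, limit=30):
    """Count hits per IP, keep the most frequent ones, then geolocate only those."""
    top = Counter(client_ips).most_common(limit)  # count per IP, highest first
    # lookup() is a hypothetical geolocation callable; it runs once per surviving IP.
    return [(ip, count, lookup(ip)) for ip, count in top]

# Hypothetical usage with a fake lookup that records which IPs it was asked about.
calls = []
def fake_lookup(ip):
    calls.append(ip)
    return "somewhere"

hits = ["10.0.0.5", "10.0.0.9", "10.0.0.5", "10.0.0.7", "10.0.0.5", "10.0.0.9"]
result = locate_top(hits, fake_lookup, limit=2)
print(result)
print(calls)
```

Six log lines go in, but only the two most frequent addresses are ever looked up.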
Hope that helps!
Thanks! That did the trick. I just had to change the word "status" to "stats".
Cool graphs. Folks here will love them.
Yes, that was a typo, my apologies!
Sadly it still isn't working right. (argh) I was premature in my elation over this feature. (And now my management is salivating over the potential.)
This works to sort the top 20 IP addresses:
source="/my/webserver/access_log" | stats count by clientip | sort -count | head 20
However, piping it into iplocation doesn't return City or Country data. iplocation is a kind of report, I guess, so I need to have the City data already in the Fields list before I get to the report service, and since it isn't there, I can't report on it (with "top" or whatever).
source="/my/webserver/access_log" | stats count by clientip | sort -count | head 20 | iplocation
In fact the graphs I was looking at yesterday were being mangled in the charting application and the "stats count by clientip" was being dropped (I don't know why). It was ignoring the sort (since "count" wasn't a defined field), and just giving me the first 20 records in the source as if I was doing this:
source="/my/webserver/access_log" | head 20 | iplocation | top limit=20 City
Actually I think it is the "stats" which breaks iplocation.
You can try this:
source=yoursource | stats count by clientip | sort -count | head 20 | eval _raw=clientip | iplocation
or simpler:
source=yoursource | top limit=20 clientip | eval _raw=clientip | iplocation
This works because "iplocation" looks through your _raw data field (not the clientip field) for addresses. When you use "stats" or "top" (or other reporting commands like "chart" or "timechart"), the _raw field gets removed. The eval above just creates a new field with the name _raw, holding the clientip value, so that iplocation has somewhere to look.
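If it helps to picture it, here is a small Python model of that behaviour. It is a sketch under stated assumptions, not Splunk's implementation: events are plain dicts, stats_count_by and find_ips_in_raw are made-up helpers, and the regex is only a rough IPv4 matcher.

```python
import re

# Rough IPv4 matcher; an assumption for illustration only.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def stats_count_by(events, field):
    """Model of a reporting command: output rows hold only the group field and a count."""
    counts = {}
    for event in events:
        counts[event[field]] = counts.get(event[field], 0) + 1
    return [{field: value, "count": n} for value, n in counts.items()]

def find_ips_in_raw(row):
    """Model of a command that scans only _raw for addresses."""
    return IPV4.findall(row.get("_raw", ""))

# Hypothetical events with the original log text in _raw.
events = [
    {"_raw": "10.0.0.5 GET /a", "clientip": "10.0.0.5"},
    {"_raw": "10.0.0.5 GET /b", "clientip": "10.0.0.5"},
    {"_raw": "172.16.0.7 GET /a", "clientip": "172.16.0.7"},
]

rows = stats_count_by(events, "clientip")
before = [find_ips_in_raw(r) for r in rows]              # _raw is gone, nothing to scan
patched = [{**r, "_raw": r["clientip"]} for r in rows]   # same idea as eval _raw=clientip
after = [find_ips_in_raw(r) for r in patched]
print(before)
print(after)
```

After the aggregation step the rows have no _raw, so the scan finds nothing; copying clientip into _raw gives it something to match again.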
Thanks! The data is more believable now.