It has been an interesting exercise. We were able to get access to Cisco’s product labs where I could (remotely) access some of their high-end hardware, and I was able to test the SNMP collector against the Nexus series 3000, 5000, and 7000 switches.
Lots of devices use SNMP, and the MIBs which I’ll discuss below are relatively universal among switch/router devices, so consider this a practical example of working with SNMP data in Splunk, and some lessons we learned along the way.
In Part 1 (which is this post), I’ll walk through what I had to do to get the data about the switches into Splunk using the SNMP Modular Input. In Part 2 I’ll explain just a few of the things you can do with that data once it’s in Splunk!
Just to be clear: the Cisco Nexus switches have syslog forwarding capabilities and even support Netflow (IPFIX), so there are plenty of ways to get information about what’s happening on them into Splunk — the one piece we hadn’t experimented with was getting configuration information. In fact, we considered gathering data via Netconf instead of SNMP, but determined that since our goals were read-only, and SNMP is everywhere (not to mention that the modular input was already written), we would go that route.
Step 1: enable SNMP on the Cisco switches.
This isn’t so much about turning on SNMP, as it is about making sure that you know how to query it. You have two choices: you can set a “community” string (which lets anyone query the device if they know the string), or you can set up SNMP v3 usernames and passwords. The configuration is fully documented (with examples) in the configuration guide (this is the Nexus 7000 one), including how to use v3 users with passwords and groups for authorization.
When I started out, we didn’t have SNMP v3 support in the Modular Input, so I went the “community” authorization route with SNMP v2C. Of course, while I was writing this, v3 support arrived in the latest release of the Splunk SNMP Modular Input, so all you have to do is get the latest version, and then download the appropriate pyCrypto package for your Splunk server to enable it.
To find an existing SNMP community name, log into your Nexus switch remotely (via SSH). You can review all the SNMP configuration at once using the command “show snmp”, or get just the community strings by running “show snmp community”. Any SNMP community string will do, since we’ll only need read access for now. If there are none configured, you’ll need to use SNMP v3 or create a community. Since I haven’t had a chance to try this the v3 way, here’s how to create a read-only SNMP community (let’s call it “nexus_stats”) in your SSH session:
config
snmp-server community nexus_stats ro
While you’re in there, you might want to make sure that the server’s contact and location information are correct:
snmp-server contact The IT guy!
snmp-server location San Jose, CA
After exiting config, you can verify that it’s all correct by running the “show snmp” command again.
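It can also be worth confirming that the community string answers from the machine that will run the modular input. Below is a minimal sketch, assuming the Net-SNMP command-line tools (snmpget) are installed on that machine; the helper names are made up for illustration, and the host and community are the ones used in this post. It asks for sysDescr.0, which any SNMP agent should answer.

```python
import shutil
import subprocess

SYS_DESCR_OID = "1.3.6.1.2.1.1.1.0"  # SNMPv2-MIB::sysDescr.0


def snmpget_command(host, community, oid=SYS_DESCR_OID):
    """Build a Net-SNMP snmpget command line for an SNMP v2c read."""
    return ["snmpget", "-v2c", "-c", community, host, oid]


def check_snmp(host, community, timeout=10):
    """Return snmpget's output, or None if the tool is missing or the query fails."""
    cmd = snmpget_command(host, community)
    if shutil.which(cmd[0]) is None:
        return None  # Net-SNMP tools are not installed (assumption not met)
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return None
    return result.stdout.strip() if result.returncode == 0 else None
```

Calling check_snmp("192.168.0.71", "nexus_stats") returns the switch's description string when the community is readable, and None otherwise.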
Step 2: get Cisco MIB definitions for pySNMP.
The SNMP modular input uses pySNMP, which requires that the MIB definition files be converted to Python. It ships with the core SNMP MIBs pre-defined, of course, but in order to get Cisco-specific information, you’ll need Cisco’s MIB definitions. You can download the Cisco MIBs from their SNMP Object Navigator and compile them yourself using commands like this:
build-pysnmp-mib -o ./py/CISCO-CALLHOME-MIB.py CISCO-CALLHOME-MIB.my
If you like, you can download my converted copies.
NOTE: The pySNMP project is working on allowing the .my MIB definitions to be dropped in directly without first converting them, so this requirement should go away eventually. Also, please note that I did this conversion in November 2013 — the older these get, the more likely it is that you should update them from Cisco’s source. Regardless of where you get them, you can either compile them into an egg for your particular platform, or just drop the loose .py files into the snmp_ta/bin/mibs folder.
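If you have a whole folder of downloaded .my files, running the conversion by hand gets tedious. Here is a rough sketch of batching it, assuming build-pysnmp-mib is on your PATH and that every MIB a file imports is available to it; the function name and directory layout are only illustrative.

```python
from pathlib import Path


def build_commands(src_dir, out_dir):
    """Return one build-pysnmp-mib command per .my file found in src_dir."""
    src, out = Path(src_dir), Path(out_dir)
    return [
        ["build-pysnmp-mib", "-o", str(out / (mib.stem + ".py")), str(mib)]
        for mib in sorted(src.glob("*.my"))
    ]
```

Each returned list can be passed to subprocess.run(cmd, check=True) once the output directory exists; the resulting .py files are what gets dropped into snmp_ta/bin/mibs.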
Step 3: Configure Input Stanzas.
You can configure SNMP inputs directly in the Modular Input’s management pages, or you can write config stanzas. In my case, because I hadn’t worked with this Modular Input before, I configured one via the UI, and then copied it and edited it to configure all the other devices I needed to monitor.
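Copying and editing stanzas by hand works, but it is easy to script. The following is a small sketch, not part of the Modular Input itself, that renders one “info” stanza per device using the same keys as the stanzas shown later in this post. The second device name and address are hypothetical, and so are the function names.

```python
# Settings shared by every "info" stanza, copied from the example in this post.
COMMON = {
    "communitystring": "nexus_stats",
    "mib_names": "SNMPv2-MIB",
    "object_names": "iso.org.dod.internet.mgmt.mib-2.system",
    "do_bulk_get": "1",
    "split_bulk_output": "1",
    "ipv6": "0",
    "listen_traps": "0",
    "snmp_version": "2C",
    "snmpinterval": "3600",
    "interval": "3600",
    "index": "nexus",
    "sourcetype": "nexus_snmp_info",
}

# Device name -> management address. The second entry is a made-up example.
DEVICES = {"nexus 7k1": "192.168.0.71", "nexus 5k1": "192.168.0.51"}


def render_stanza(name, destination, settings=COMMON):
    """Render a single inputs.conf stanza for the SNMP modular input."""
    lines = [f"[snmp://{name}]", f"destination = {destination}"]
    lines += [f"{key} = {value}" for key, value in settings.items()]
    return "\n".join(lines)


def render_inputs(devices):
    """Render one 'info' stanza per device, separated by blank lines."""
    return "\n\n".join(
        render_stanza(f"{name} info", address) for name, address in devices.items()
    )
```

Printing render_inputs(DEVICES) gives text you can paste into inputs.conf; a second settings dictionary would cover the interface statistics stanza in the same way.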
Here is the configuration I used for each device: two stanzas, one for the config data, collected every hour, and one for the performance statistics, collected more frequently (in the examples below, every 600 seconds, or 10 minutes). Note that you have to list the MIBs that will be loaded for parsing the data, and the community string, and give each stanza a name that will help you identify it when you see it in the logs.
As I worked on what information I needed to query, this list grew gradually. I needed names and software versions, so I started with “system” (from SNMPv2-MIB). When I needed interface performance statistics, I had to search around the internet and Cisco’s Object Navigator before hitting on the interfaces.ifTable and ifMIB.ifMIBObjects.ifXTable etc.
One thing that ended up being very helpful was a full snmpwalk of the devices (dumping it to file), and then looking through the log to see where the interesting information was. All told, determining which OIDs to query for the information you need is something that I, as a developer, never quite felt I had a handle on, and it’s clearly a steep learning curve (as you’ll see below, I still have several more mib_names listed than I’m actually querying in the object_names).
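To make a walk dump easier to skim, you can count how many values each object contributes and look at the biggest tables first. This is a minimal sketch, assuming the dump uses Net-SNMP’s default symbolic output (lines such as IF-MIB::ifDescr.1 = STRING: mgmt0); the function name is made up, and lines in any other format are simply skipped.

```python
import re
from collections import Counter

# Matches the "MIB::object.index =" prefix of a symbolic snmpwalk line.
WALK_LINE = re.compile(r"^(?P<mib>[\w-]+)::(?P<obj>[\w-]+)(?:\.\S+)?\s+=")


def summarize_walk(lines):
    """Count how many values each (MIB, object) pair has in a walk dump."""
    counts = Counter()
    for line in lines:
        match = WALK_LINE.match(line)
        if match:
            counts[(match.group("mib"), match.group("obj"))] += 1
    return counts
```

Passing an open file to summarize_walk and calling most_common(20) on the result lists the twenty busiest objects, which is a reasonable place to start looking for candidates for object_names.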
[snmp://nexus 7k1 – info]
destination = 192.168.0.71
communitystring = nexus_stats
mib_names = SNMPv2-MIB
object_names = iso.org.dod.internet.mgmt.mib-2.system
do_bulk_get = 1
split_bulk_output = 1
ipv6 = 0
listen_traps = 0
snmp_version = 2C
snmpinterval = 3600
interval = 3600
index = nexus
sourcetype = nexus_snmp_info

[snmp://nexus 7k1 – ifStats]
destination = 192.168.0.71
communitystring = nexus_stats
mib_names = IF-MIB, EtherLike-MIB
object_names = 220.127.116.11.18.104.22.168.1.1, 22.214.171.124.126.96.36.199.1, 188.8.131.52.184.108.40.206.2.1
do_bulk_get = 1
split_bulk_output = 1
ipv6 = 0
listen_traps = 0
snmp_version = 2C
snmpinterval = 600
interval = 600
index = nexus
sourcetype = nexus_snmp
There are lots of choices you can make when collecting SNMP data. In the examples above I am doing bulk get queries of tables using SNMP v2C, and splitting the output. This results in a raw event in Splunk for each field of data, which (as you’ll see) meant I had to pull the events back together in my queries.
I’ve configured the index and sourcetypes, and I’m going to use that sourcetype in all my queries. In the listing above I’ve shortened the list of MIB names and shown the object names as OID values (for the sake of formatting on the blog); if you check out the GitHub project you’ll find them spelled out as text, but either way works.
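To illustrate what “pulling the events back together” means with split output, here is a small sketch that regroups one-field-per-event data into one row per table index. The event format used here (for example IF-MIB::ifDescr."1" = "mgmt0") is an assumption made for illustration, so compare it with your own raw events before relying on the pattern. In Splunk itself you would do this in the search language rather than in Python.

```python
import re
from collections import defaultdict

# Assumed event shape: MIB::field."index" = "value" (quotes optional).
EVENT = re.compile(
    r'^[\w-]+::(?P<field>[\w-]+)\."?(?P<index>[^"\s]+)"?\s*=\s*"?(?P<value>.*?)"?$'
)


def regroup(events):
    """Collect single-field events into one dict of fields per table index."""
    rows = defaultdict(dict)
    for event in events:
        match = EVENT.match(event.strip())
        if match:
            rows[match.group("index")][match.group("field")] = match.group("value")
    return dict(rows)
```

Each key of the result is an interface index, and each value holds every field seen for that interface, which is the shape you want for reporting.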
I didn’t get very far in examining the Cisco MIB data, apart from the ifTable, and ifXTable to support reporting multi-gigabit speeds. There’s a lot more information available, including temperatures (which are a good way of detecting potential problems before they become catastrophes) but this is just an example, and the data I’m querying here should work with any switch or router that works with SNMP.
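The reason ifXTable matters for multi-gigabit speeds is that ifSpeed in ifTable is a 32-bit gauge in bits per second, so it tops out at 4,294,967,295 (roughly 4.29 Gbps), while ifHighSpeed in ifXTable reports the speed in units of 1,000,000 bits per second. A minimal sketch of choosing between the two follows; the function name is made up for illustration.

```python
GAUGE32_MAX = 4294967295  # largest value ifSpeed can report


def interface_speed_bps(if_speed, if_high_speed=None):
    """Return the interface speed in bits per second.

    Uses ifHighSpeed (Mbit/s) when ifSpeed is pinned at its maximum,
    otherwise trusts ifSpeed as reported.
    """
    if if_speed >= GAUGE32_MAX and if_high_speed:
        return if_high_speed * 1_000_000
    return if_speed
```

For a 10-gigabit port, for example, ifSpeed reports its maximum value and ifHighSpeed reports 10000, so the function returns 10,000,000,000.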
In Part 2 I explain a few of the things you can do with that data once it’s in Splunk…