
Salutations, drivers of the Information Super Highway,
I’ve got another post here in the occasional “Help Me Help You” series. This time I’m going to dig into case writing.
I was talking with some of the engineers around the bar the other day about an issue that one of our field guys opened. One of the engineers mentioned a piece of information that totally changed the way the rest of us were going to handle the issue. This got us talking about how some people write great cases and others don’t. The ones who write good cases usually get their issues resolved first (oftentimes the issue is closed with the first response from a member of my team). The ones who write “bad” cases generally end up in a back-and-forth exchange.
That got me thinking that maybe I should take a sec to talk about what makes a good case. I’m going to try mapping out a basic template for submitting an issue. This is by no means limited to Splunk and is most definitely not a de facto standard. Rather, it is a compilation of things that always make my life easier when my customers can provide them.
- Backstory: As I mentioned in my previous post, I don’t work in the cube next to you. I don’t see the same things you see or know the same things you know.
Oftentimes I get cases with a description like “I came into work this morning and discovered that this thingy that was working yesterday isn’t working today. What gives?” While digging into the issue, the customer remembers that last night was the weekly maintenance window, that one of the other guys was making some changes on the box, and that it is this change that caused things to go wonky.
I guess what I am getting at here is that it helps to know what led up to the issue. Fleshing out the supporting data points can be a big help in piecing the problem together. Even if you think something is unrelated, include it; it can’t hurt. The worst thing that can happen is that you spend a few more bits, and thankfully bits don’t cost what they used to. I’ve also found that when I take the time to think about _all_ of the things that led up to the event in question, the light bulb over my head starts to flicker and maybe I can figure it out before enlisting someone else.
- Impact: Do you have to commit seppuku if this issue is not resolved in the next hour? If you do, you may want to include that in the initial report; it will really help with prioritizing the issue. Are others unable to do their jobs because of this? We want to know. If you’re asking a question for your own edification, share that as well. It helps us prioritize other issues and formulate the best answer for you. Big fires often require an immediate fix, and you don’t really care about the inner workings of the fix, just that it works. If you are trying to learn something, you want the opposite.
- Priority: We all deal with fires (some bigger than others). Let the guy on the other end know how you need the issue treated. Support folk inherently want to help (why else do we do this job? It isn’t for the unlimited supplies of handi-snacks), and if you say “I need this now,” we will make every effort to deliver.
- Data Samples: One of my new favorite shows is The First 48, which follows real homicide cops as they investigate murders. Each episode starts off with the cops going to the crime scene and collecting every potential piece of evidence. They don’t know what is relevant and what is not, so they assume it all is. The same is true when troubleshooting an issue with software. The more data points I have to work with, the better position I am in to figure out what is going on.
If Splunk isn’t parsing a field in a given file, include a copy of said file along with your configs. If the UI is acting weird, take a screen shot. If performance is an issue, include the results of the tests that told you things are slow, along with the tool(s) used to produce the results.
- Repro steps: If you can trigger this issue on demand, please share. Knowing the exact path traveled will often make root cause analysis that much easier. Screen shots of each step are very helpful in describing an issue (a picture is worth more than 1,000 words).
- Your investigation: I find it really helpful to know what you have done to try to figure out a problem. It saves time because I won’t ask you to perform steps that you said you’ve done, and you won’t get frustrated at me for asking you to do work again. It also gives me insight into your investigative process: if you are thorough, I am more inclined to trust your results at first glance. If you are vague or unclear, I have to assume that the information you are providing is incomplete. This is not to say that what you are giving is bad/wrong/stupid; rather, it is not the full story.
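If you like to keep this sort of checklist in a script, here is a minimal sketch in Python of the six sections above. To be clear, this is not a Splunk tool or any kind of official format. The `SupportCase` class and its method names are made up for illustration, and all it does is tell you which sections you left blank and print the rest as plain text.

```python
from dataclasses import dataclass, fields


@dataclass
class SupportCase:
    """Hypothetical container for the six sections described in this post."""
    backstory: str = ""
    impact: str = ""
    priority: str = ""
    data_samples: str = ""
    repro_steps: str = ""
    investigation: str = ""

    def missing_sections(self):
        """Return the names of sections that were left empty."""
        return [f.name for f in fields(self)
                if not getattr(self, f.name).strip()]

    def render(self):
        """Format the case as plain text, one labeled line per section."""
        lines = []
        for f in fields(self):
            title = f.name.replace("_", " ").capitalize()
            body = getattr(self, f.name).strip() or "(not provided)"
            lines.append(f"{title}: {body}")
        return "\n".join(lines)


if __name__ == "__main__":
    case = SupportCase(
        backstory="Weekly maintenance window ran last night.",
        impact="Two analysts cannot run their morning reports.",
    )
    print(case.render())
    print("Still missing:", ", ".join(case.missing_sections()))
```

Running it before you hit submit is just a nudge to fill in the blanks; the content of each section is still up to you.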
OK, I’m sure there is more that I can say here, but this post is getting kind of long, my fingers are tired of typing, and I need to answer some cases.
----------------------------------------------------
Thanks!
Matt Green