Join us as we pursue our disruptive new vision to make machine data accessible, usable and valuable to everyone. We are a company filled with people who are passionate about our product and seek to deliver the best experience for our customers. At Splunk, we’re committed to our work, customers, having fun and most importantly to each other’s success. Learn more about Splunk careers and how you can become a part of our journey!
As a member of the Splunk Support Incident Management Team, you will be responsible for owning the response to high-profile, customer-impacting incidents. In this role, you'll be part of a global team of incident commanders responsible for managing high-severity incidents from initial triage through after-action review. This is a senior role at Splunk requiring an individual who can take charge in high-stress situations and give direction to both customer personnel and Splunk engineers to drive expeditious resolution of incidents. We are looking for a natural leader with proven knowledge of incident management frameworks, a demonstrable understanding of distributed systems environments and the ability to communicate clearly and effectively to technical and business audiences.
- Take command of incidents by setting up or taking over a cross-functional technical bridge call composed of internal and external partners
- Work with SMEs to interpret key metrics from monitoring tools and facilitate a discussion aimed at building an incident action plan (and a backup plan if appropriate)
- Ensure that the partners have a deep understanding of the issue, the action plan and the path to resolution
- Ensure that each participant understands the incident management process and their role in that process
- Set clear incident resolution objectives (exit criteria) and timings
- Provide direction and time management and keep the resolution effort on track and moving forward
- Drive the technical root cause analysis process by assembling the right technical teams and driving the technical remediation plan
- Operate as part of a 24x7 global team of Incident Commanders and ensure seamless handover of critical issues to other regions
- Actively participate in and drive incremental improvements to our Incident Runbooks through process creation, tool building and post-incident reviews
- Ensure internal readiness at all times by leading training sessions, simulations and drills
- 10+ years in incident management or technical support for an enterprise software company
- Strong leadership skills
- Proven knowledge of incident management frameworks (e.g., ITIL)
- Demonstrable understanding of distributed systems concepts
- Ability to work cross-functionally and to influence and execute across groups
- Strong financial and business sense, critical thinking, decision-making abilities
- Good interpersonal skills, both verbal and written.
- Executive presentation skills.
- Works well in a dynamic, changing environment and is comfortable with ambiguity.
- Negotiation, mediation and conflict management skills.
We value diversity at our company. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, or any other applicable legally protected characteristics in the location in which the candidate is applying. For job positions in San Francisco, CA, and other locations where required, we will consider for employment qualified applicants with arrest and conviction records.