Join us as we pursue our disruptive new vision to make machine data accessible, usable and valuable to everyone. We are a company filled with people who are passionate about our product and seek to deliver the best experience for our customers. At Splunk, we’re committed to our work, customers, having fun and most importantly to each other’s success. Learn more about Splunk careers and how you can become a part of our journey!
Our Splunk Family in Australia/New Zealand is a results-driven and collaborative bunch spanning sales, customer success, support, marketing and G&A functions. We love to work as a team, celebrate success and learn from our losses. We have an excellent team culture with weekly team breakfasts, cocktail hour Fridays, end of quarter celebrations, volunteering activities and a culture based on respect, transparency and always doing the right thing!
Splunk Cloud is looking for self-starting individuals to join the Splunk>Cloud Network Operations Center (CNOC).
Splunk CNOC manages incidents that affect the availability and performance of the Splunk>Cloud service for our customers globally. The Splunk CNOC is an always-on / always-active team making sure that each of our customers has an outstanding experience.
- Design and Deploy the Splunk Incident Management System (SIMS) to internal partners and teams to improve service recovery time
- Help define microservice level definitions and budgets, and facilitate a discussion aimed at building an incident action plan (and a backup if appropriate)
- Assemble the response team, which includes the incident owner, problem owner, and other professionals in the specified area of expertise and ensure they have an understanding of the issue, the action plan and the path to resolution.
- Define exit criteria from response procedures to ensure customer satisfaction throughout the process
- Guide peers throughout incidents to ensure accurate information is captured
- Assemble and lead conference calls for diagnosis and remediation of customer impacting outages
- Define what clear and concise Problem Statements, Status Reports, and Final Summaries look like, so that Engineers and Executives can easily understand them
- Perform Incident Commander responsibilities, run post-incident reviews, and assign and follow through on action plans
- Actively build relationships with 3rd-party vendors to more easily keep them on track in incidents where it is warranted
- Write Customer-facing communications in partnership with Customer Success
- Develop positive, strong, and collaborative relationships with multiple cross-functional partners across Splunk to improve the team's efficiency and ability to deliver on sophisticated tasks that have broad impact
- Lead process improvements and improved operational efficiencies
- Work closely with cross-functional teams to understand the Microservices that make up Splunk>Cloud
- Remain calm and positive under pressure and demonstrate the ability to provide clear, actionable feedback to peers, junior Incident Commanders and senior management
- Work with Customer Success to gather use cases to help define Incident Response policies for all teams
- Operate as part of a 24x7 global team of incident commanders and ensure perfect handover of critical issues to other regions
- Ensure internal readiness at all times by leading training sessions, simulations and drills for all levels of team members.
- Work closely with internal teams to define their Incident Response in alignment with SIMS
- Ensure internal teams stay aligned with SIMS and follow the guidelines put in place to make sure resources are available 24/7
- You have developed and/or operated incident management in highly controlled and secured environments (requires special clearance)
- You have 20+ years of major incident response and management experience
- Proven knowledge of incident management frameworks (e.g. ITIL)
- Demonstrable understanding of distributed systems concepts
- You understand multi-functional teams and are able to speak and execute across organizations to drive influence
- Strong financial and business sense, critical thinking, decision-making and communication abilities
- You enjoy problem solving and analyzing global-scale distributed systems
- You have outstanding interpersonal and communication skills
- You enjoy teaching others and guiding in something you’re passionate about
- You work well with, and support the building of, strong, supportive, global teams
- You don’t shy away from conflict, can influence at all levels and can work in stressful situations
- You can teach executive presentation skills
- You work well in dynamic, changing environments and are comfortable with ambiguity
- Participate in on-call rotations for some business use cases