Splunk Cloud is looking for a NOC Technician to join our team to support and monitor our ever-expanding Cloud platform.
It is the responsibility of the NOC to monitor, troubleshoot, and resolve issues that affect the availability and performance of Splunk for our cloud customers. The NOC is the front line of defense in making sure our customers have an unrivaled experience.
We are the authority on the Splunk customer experience, and we need you to help us drive awareness and resolution for all customer-impacting incidents. We're looking for someone to bring a fresh approach to problems of all sizes and shapes. Calling out problems is great. Crafting solutions to fix those problems is even better. We need you to help build a world-class NOC.
The NOC is responsible for reporting on issues and performance, performing maintenance, and working closely with the SRE and Cloud Monitoring teams to improve monitoring, troubleshooting, and response times.
Splunk Cloud SRE is implementing complex strategies and automation to support the backbone of our product and infrastructure, especially as we continue to scale.
The NOC is responsible for making this implementation seamless to our customers. We are building a world-class NOC and need the best talent possible to get us there.
As a NOC Technician, you will be expected to:
- Represent the NOC in meetings and process changes, and make recommendations on new procedures and processes
- Prioritize work for the NOC technicians and manage inter-shift relations
- Act as a liaison between SRE, monitoring teams, support, and leadership for new processes, tools, and knowledge transfers
- Work with NOC Lead technicians on complex tasks in creative and effective ways
- Identify, analyze, and initiate the escalation process based on the escalation criteria specified by Splunk
- Assemble the escalation management team, which includes the incident owner, problem owner, and other professionals in the specified area of expertise
- Set accurate expectations for the escalation procedures to ensure customer satisfaction throughout the escalation process
Who You Are
- You have 3+ years of experience in Systems Administration or Technical Operations
- You have hands-on experience maintaining and troubleshooting Linux/UNIX servers in a production environment.
- You have experience with Puppet, Ansible, and AWS.
- You are team-oriented with exceptional interpersonal and communication skills.
- You are analytical and organized, and you know how to make a plan and execute it.
- You are calm and collected in stressful situations, such as a major service outage.
- You enjoy and excel at learning new skills and technologies.
- Bonus points if you have experience with incident response, have worked with cloud technologies, and have an active GitHub profile.
- You have a take-charge personality, and the ability to drive a plan to completion.
- Must be comfortable working in a fast-paced environment with a highly technical team.
- Ability to adhere to standards in a dynamic environment.
- Ability to thrive within an environment that relies heavily on the principles of teamwork.
- Demonstrated attention to detail, follow-through, and ability to prioritize quickly are necessary.
- Strong written and verbal communication skills.
Here's the fine print
TBD The position operates on a shift model, 4 or 5 days a week, 10 hours a day, and is subject to change. TBD