Join us as we pursue our exciting vision to make machine data accessible, usable and valuable to everyone. We are a company filled with people who are passionate about our product and seek to deliver the best experience for our customers. At Splunk, we’re committed to our work, customers, having fun and celebrating each other’s success.
Splunk's Data Stream Processor (DSP) team is seeking an exceptional Senior Kubernetes Engineer to strengthen our on-premise product, which is built on top of Kubernetes, through fast deployment, debugging, and scaling of Kubernetes clusters. Our team works across four US time zones with a combination of in-office and remote employees. We offer full-time remote accommodations as well as relocation if you want to move to an office location.
Projects you may work on
- Adding new on-premise capabilities such as built-in load balancing, elastic block storage, or mutual TLS.
- Implementing different flavors of on-premise architecture to support better failover, multi-zone and multi-region deployments, or extremely high-throughput customers.
- Improving the DSP administrator experience through new tooling, better Kubernetes primitives, and more robust observability.
- Improving the build process for DSP for faster builds and simplified developer workflows.
About you
- You are excited about building, observing and operating distributed systems at scale in production.
- You are always willing to debug issues with our customers, both in features you built and in areas that are new to everyone.
- You maintain ownership of everything you build. You are responsible for code you’ve built from development through deployment into production and beyond.
- You have a history of collaboration, openness, honesty, timely decision making, and communicating clearly in both verbal and written forms.
Must-have experience
- Administering Kubernetes. Ability to create, maintain, scale, and debug production Kubernetes clusters as a Kubernetes administrator.
- Working on at least one Kubernetes cloud offering (EKS/GKE/AKS) or on-prem Kubernetes (native Kubernetes, Gravity, MetalK8s).
- Programming experience in Java, Go, or bash. Our services and tools are written in these languages.
- Ability to use observability tools to examine logs and metrics and diagnose issues within a system. We use Splunk Connect for Kubernetes, Splunk Enterprise, and Prometheus.
- Ability to work independently, to pair closely with your own teammates, and to form strong relationships with other teams quickly.
- Ability to both mentor other team members and be mentored by them. Everyone has something new to learn from their teammates.
- Ability to take large, ambiguous projects, drive to clarity on the project, break the work down, present your designs to multiple teams, and gain agreement from leadership.
- 5+ years of industry experience along with a proven track record of ownership and delivery.
Supplemental skills
- Experience hardening a production-level Kubernetes environment (PDBs, Network Policies, memory/CPU limits, node taints and tolerations).
- Experience with on-call along with a robust, methodical approach to diagnosing and mitigating production issues.
- Experience with Kubernetes cluster networking and Linux host networking.
- Experience with infrastructure-as-code and configuration management tools. We use Terraform and Ansible.
- Experience in scaling infrastructure to support high-throughput data-intensive applications.
We value diversity at our company. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, or any other applicable legally protected characteristics in the location in which the candidate is applying.
For job positions in San Francisco, CA, and other locations where required, we will consider for employment qualified applicants with arrest and conviction records.