I recently sat down with Mike Hamrah, vice president and chief architect at Namely, to talk about what it takes to build the foundation for an industry-leading HR platform. Mike spoke about the critical importance of observability and how SignalFx helps his team focus on what matters most—delivering the best possible experience for Namely’s clients.
Can you tell us about Namely and your role at the company?
Namely is an all-in-one HR solution with data-driven analytics that provide incredible insights into how well your company is doing and how well you’re managing your people. As chief architect, my primary role is to help our company and our engineers build and grow the Namely platform.
Why is observability so critical for Namely?
Reliability is a requirement for HR solutions, and a key focus of my team. We process billions of dollars of payroll, empower employees to select the best benefits, and help companies cultivate top talent, among the many other needs a modern HR solution has to meet. These systems can’t fail because people’s lives may be impacted, so creating a solid foundation is absolutely essential to the Namely platform and product.
To support our growth and enable us to continue delivering great experiences for our clients, we run a complex microservices environment. This requires best-in-class observability, with insight into the health of all of our systems, to ensure our platform is performant and running reliably.
What were you looking for in an observability solution before selecting SignalFx?
When we were evaluating the tools we were using to monitor our systems versus what else was out there, we wanted a solution that would both give us full visibility across our environment and easily integrate into our current suite of tools. We wanted to select a tool that would provide in-depth insights into our systems while empowering our engineering team to focus even more on creating a first-class product for our clients.
We selected SignalFx as our observability tool of choice, and it’s worked out very well for us.
What drew you to SignalFx?
There were a few key features that stood out to us about the SignalFx platform.
- One of the things that really differentiated SignalFx was support for open standards. With its out-of-the-box integrations, we could easily send the Prometheus, StatsD, OpenTracing and other metrics we were already emitting via open standards and open source tools to the SignalFx platform.
- From there, we’ve found tremendous value in SignalFx’s built-in dashboards. We don’t have to spend days or weeks creating bespoke dashboards. They’re built into the platform, which means we can start leveraging them as soon as we start sending metrics to SignalFx. The dashboards have allowed us to quickly get great insight into the technologies we use, like Kubernetes and Istio, with minimal effort and minimal overhead.
- And SignalFx’s support for high-cardinality metrics has given us extraordinary insight into our metric data. We can create very powerful visualizations that help us ensure the different layers of our tech stack and services are performing optimally.
- We are also big fans of tracing. We use OpenTracing with Istio, as well as our own Go and .NET libraries. Combined with SignalFx Microservices APM™, we get an immense amount of insight into the health of our services. The ability to aggregate trace information, quickly identify outliers, gain visibility into the distribution of our calls, and understand the uniqueness of tags gives us insight that I haven’t seen anywhere else in the industry.
Thanks to the real-time nature of SignalFx, we get up-to-the-second insight on critical metrics.
You mentioned Kubernetes—can you tell us more about how you’re leveraging this platform?
We were an early adopter of Kubernetes and have continued to grow and invest in our deployment, especially with the emergence of hosted Kubernetes solutions. One of the advancements we made over the past year was moving to the Amazon Elastic Kubernetes Service. It has allowed us to simplify a lot of our operational complexity, but it doesn’t solve all of our Kubernetes challenges. That’s where monitoring our own servers and the health of our Kubernetes infrastructure becomes essential, particularly DNS response times and the number of pods, jobs, and cron jobs that we’re running. This isn’t a problem that a managed Kubernetes solution solves, and it’s where we turned to SignalFx.
With SignalFx, we have a great overview of how well our Kubernetes clusters are running. And since we run in multiple AWS accounts, SignalFx is our single source of truth across all of our clusters. We can easily switch between different environments, drill down to various Kubernetes clusters, and see how well specific pods are running on top of the underlying nodes. SignalFx allows us to stay on top of how our systems are behaving, and reallocate resources as needed so that our clients continue to have an uninterrupted, easy-to-use experience.
What we’re looking at now is integrating SignalFx’s automation into our Kubernetes management, so we can have our systems autonomously do things like pod autoscaling. This will allow us not only to better leverage our Kubernetes platform, but also to increase the efficiency of our resource use and the overall reliability of our infrastructure.
What about service meshes like Istio? What role does observability play?
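As an illustration of the kind of decision pod autoscaling involves, here is a minimal sketch of the scaling rule documented for the Kubernetes Horizontal Pod Autoscaler: scale the replica count by the ratio of the observed metric to its target, and leave it alone when the ratio is close to 1. This is not SignalFx’s automation or Namely’s configuration; the function name, the replica bounds, and the example metric values are assumptions made for the sketch.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Return the replica count suggested by an HPA-style scaling rule.

    All parameter names and defaults here are illustrative assumptions.
    """
    ratio = current_metric / target_metric
    # Skip scaling when the metric is already close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Keep the result inside the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Hypothetical values: 4 pods, 200 requests/s per pod observed, 100 targeted.
    print(desired_replicas(4, 200, 100))
```

The tolerance band is what keeps a rule like this from adding and removing pods on every small fluctuation in the metric.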
While we’ve been on Kubernetes for about three years, we more recently integrated Istio into our environment. The service mesh has provided a tremendous amount of uniformity and insight into our microservices strategy, especially in the realm of tracing and standard metrics for all of our services. But as we’ve continued to adopt microservices and want to further leverage our Kubernetes platform and Istio, one of the key pieces of the puzzle is observability. We need to know how our services are performing, and more importantly, how well everything is working together.
We knew we could get visibility into how well our services were working with each other through tools like Jaeger and OpenTracing, but one thing that we didn’t have was the ability to aggregate all of that information and gain insight into things like anomalous behavior and outliers. We struggled to pinpoint where slight disruptions were occurring so that we could mitigate them and provide a more performant experience for our clients. SignalFx gives us this visibility.
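To make the idea of aggregating trace data and surfacing outliers concrete, here is a minimal sketch that groups span durations by service and operation and flags spans far above the group’s median. It is not how SignalFx or Jaeger does this; the span fields (`service`, `operation`, `trace_id`, `duration_ms`), the function name, and the threshold are all assumptions chosen for illustration.

```python
from collections import defaultdict
from statistics import median


def find_outliers(spans, threshold=3.0):
    """Flag spans that are unusually slow for their (service, operation) group.

    `spans` is a list of dicts with the hypothetical keys
    "service", "operation", "trace_id", and "duration_ms".
    """
    groups = defaultdict(list)
    for span in spans:
        groups[(span["service"], span["operation"])].append(span)

    outliers = []
    for group in groups.values():
        durations = [s["duration_ms"] for s in group]
        med = median(durations)
        # Median absolute deviation: a spread estimate that a few slow
        # calls cannot skew the way they would skew a mean.
        mad = median([abs(d - med) for d in durations])
        if mad == 0:
            continue  # no spread in this group, nothing to compare against
        for s in group:
            if (s["duration_ms"] - med) / mad > threshold:
                outliers.append(s)
    return outliers
```

Called on a list of spans, the function returns only the slow ones, whose `trace_id` values can then be looked up in a tracing UI. A median-based rule is used here so that the slow calls being hunted do not inflate the baseline they are measured against.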
What are some of the results you’ve seen from your team’s investments in observability?
Improving observability has been one of the highest-priority items for Namely over the past year, and we’ve been able to do this thanks to best-in-class observability enabled by SignalFx. It’s allowed us to accelerate our product development because we can trust the changes we’re continuously making to enhance and improve our systems. This real-time visibility into our full platform has allowed us to improve the performance of certain workflows, which improves the experience of our clients. We’ve also been able to develop more advanced features because we have insight into how well we are composing our services into larger features.
What’s the biggest challenge SignalFx has helped Namely solve?
One of the biggest challenges that any growing company has is where you’re spending your engineering resources. We want to focus on the Namely product and platform to ensure we’re building features that make our clients happy and fulfill our mission of building better workplaces.
SignalFx has helped us do just that by simply solving the problems that got in the way of helping our clients build better workplaces. SignalFx allows us to focus on our clients and create a solid platform to build new features and functionality that make them happy and successful.