Splunk at Genius.com

Ensuring Continuous Availability for Cloud-based Marketing Automation Provider

The Business

Genius.com delivered its first product, Sales Genius, in May 2006 and introduced the first marketing automation software-as-a-service (SaaS) offering. This visionary company, which has introduced eight products since its founding in 2004, focuses fanatically on enabling customer success. For Genius.com, that means delivering peak performance and continuous uptime for its cloud-based offering.

Challenges

With the multitude of devices, servers, and applications that it takes to provide a cloud-delivered solution, the Genius.com IT organization was hampered by the lack of a centralized view of its infrastructure. Dealing with multiple consoles for the many different technologies added complexity to troubleshooting and lengthened resolution times. Not only did the operations team lose productivity sifting through logs and building custom scripts, they also often missed items such as Apache 500 errors. Overlooking these "would-be" easy fixes had an impact on the customer experience. And as Genius.com grew and expanded its infrastructure, the probability of additional customer problems also increased.

Developers who were focused on fixing issues relied on the operations team to send them the debugging information needed to resolve problems. A request for visibility around a specific event, like a 404 error, required operations to create a custom script. It often took 24 hours to deliver the results to development. Manual processes like this hindered IT productivity and impacted service levels. Beyond this, any request for access to sensitive data raised a security and compliance dilemma. Should they risk a penalty or fine by providing the required access, or create a roadblock for a developer who was just trying to do their job?
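To make the manual workflow concrete, here is a minimal sketch of the kind of one-off script an operations team might write to pull every request with a given HTTP status out of an Apache access log. It is an illustration only, not Genius.com's actual script: the function name `filter_by_status`, the sample log lines, and the assumption that the logs use the Apache common/combined format are all hypothetical.

```python
import re
from typing import Iterable, Iterator

# Assumes Apache common/combined log format:
# host ident user [time] "request" status bytes ...
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)


def filter_by_status(lines: Iterable[str], status: int) -> Iterator[dict]:
    """Yield the parsed fields of every log line with the given HTTP status."""
    for line in lines:
        match = LOG_PATTERN.match(line)
        # Lines that do not match the expected format are skipped silently.
        if match and int(match.group("status")) == status:
            yield match.groupdict()


# Hypothetical sample data; real input would come from the access log file.
sample = [
    '10.0.0.1 - - [10/Oct/2009:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.2 - - [10/Oct/2009:13:55:40 -0700] "GET /missing.html HTTP/1.1" 404 209',
    '10.0.0.3 - - [10/Oct/2009:13:56:01 -0700] "POST /api/send HTTP/1.1" 500 532',
]

not_found = list(filter_by_status(sample, 404))
server_errors = list(filter_by_status(sample, 500))
```

Even a script this small has to be written, run against each server's logs, and its output handed to development, which is where a turnaround measured in hours comes from.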

Enter Splunk

Genius.com gave its entire customer support team access to Splunk, from the tier one staffers who take initial calls, up the escalation path, to the senior developers. With Splunk they gained a secure, real-time view across their infrastructure without compromising security or compliance. Developers now have a direct view into operational data, which is critical for quickly fixing problems on the production cloud. Further, Genius.com is staying ahead of critical issues by using Splunk for monitoring and alerting.

The 24 x 7 Network Operations Center (NOC) uses Splunk and Nagios consoles side-by-side to monitor the application environment. If, for example, the customized Nagios monitoring system sends out an alert that a daemon has died, the NOC team responds by restarting the daemon and then uses Splunk to search for the error that caused the failure. They send the logs surrounding this event to their in-house bug tracking system, quickly and thoroughly reporting the issue as a defect. Using Splunk alerting, they can now proactively monitor for and resolve potential product issues.

Splunk has also helped the Genius team establish a baseline understanding of their environment. The baseline, coupled with Splunk alerting, prevented a potential security and performance problem with their DNS servers. When the NOC identified 4000 events coming in from India, they knew this was abnormal.

Once they had the IP address and a timeframe, they analyzed and understood the pattern, then coordinated remediation to block the IP address. Although the team members addressing this problem worked remotely, the common view provided by Splunk simplified collaboration and accelerated resolution.
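The idea of comparing incoming traffic against a baseline can be sketched in a few lines. The source does not describe how the baseline was computed or how the Splunk alert was configured, so everything here is an assumption for illustration: the function `find_abnormal_sources`, the `factor` threshold, the default baseline for unseen sources, and the sample addresses (taken from the reserved documentation ranges).

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def find_abnormal_sources(
    events: Iterable[Tuple[str, str]],
    baseline_per_source: Dict[str, float],
    factor: float = 10.0,
    default_baseline: float = 1.0,
) -> Dict[str, int]:
    """Return source IPs whose event count exceeds factor times their baseline.

    events: (source_ip, timestamp) pairs observed in the current time window.
    baseline_per_source: typical event count per source for a window of the
        same length; sources not listed fall back to default_baseline.
    """
    counts = Counter(ip for ip, _ in events)
    abnormal = {}
    for ip, count in counts.items():
        expected = baseline_per_source.get(ip, default_baseline)
        if count > expected * factor:
            abnormal[ip] = count
    return abnormal


# Hypothetical window: one known resolver at its usual volume, plus a burst
# of 4000 events from a previously unseen address.
window = [("198.51.100.4", "t")] * 20 + [("203.0.113.7", "t")] * 4000
baseline = {"198.51.100.4": 25.0}

suspects = find_abnormal_sources(window, baseline)
```

With a baseline in hand, the unfamiliar source stands out immediately, while the known source at its normal volume is ignored. The flagged address and time window are then the starting point for the kind of pattern analysis and blocking described above.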

Breakthroughs

With Genius.com's accelerated growth and cloud business model, the company doesn't have time for inefficiency. Splunk has made a huge impact in this area, reducing 24-hour turnaround times to minutes and better equipping both development and operations to identify and resolve problems proactively. As a result, Splunk helps maintain the high availability of Genius.com's cloud-based service, supporting service levels as well as customer satisfaction and retention initiatives.

Not only has Splunk helped the Genius.com network operations team manage its infrastructure, it has also helped the company meet important SaaS criteria. As the centralized logging component of their cloud infrastructure, Splunk helped Genius.com meet a critical SAS 70 requirement, a common selection criterion when evaluating SaaS offerings. The addition of Splunk has made that an easy check mark.