We just completed four SplunkLive events in four striking locations across Asia Pacific – the bayside cities of Sydney and Melbourne and the island cities of Singapore and Hong Kong. Interest levels were high at each event with packed audiences throughout. Companies attending included communications service providers, financial institutions, governmental institutions and educational institutions, all seeking to find out more about the ways they can use Splunk. A key part of SplunkLive is hearing from customers and at the Sydney and Melbourne events we were lucky enough to have Corporate Express present their story.
Shaun Butler and Luke Harris – Corporate Express
Shaun is Senior Technology Specialist and Luke is Senior Systems Engineer at Corporate Express, a leading supplier of business essentials in Australia and New Zealand. Shaun explained that “e-commerce sales accounts for greater than 80% of revenue, so the company is heavily reliant on IT to facilitate the online ordering process and power our nationwide supply chain.” Shaun is the senior authority at Corporate Express responsible for the design, implementation, monitoring and support of their tier 1 infrastructure and application services, including Linux, AIX, Oracle and enterprise storage. He works with the infrastructure services team, responsible for core infrastructure services comprising network, data center storage, virtualization monitoring and infrastructure apps. Luke is the senior systems engineer and was the original user of Splunk at Corporate Express three years back.
Achieving new levels of visibility
One of the remarkable productivity ‘shifts’ Corporate Express realized with Splunk came from dashboards. Once they had Splunked their data, the time needed to build a dashboard – combining multiple charts, graphs, tables and views into a single pane – could literally be measured in minutes.
Shaun described this in relation to Corporate Express. They have been undertaking the ITIL journey and a big part of ITIL is capacity management. Shaun commented that, “Splunk’s ability to capture structured and unstructured data from disparate formats and present it visually helped us tremendously.” He went on to say that, “operational dashboards help IT respond to the business, who want more real-time data on IT management and trends.” Shaun went through a number of their more critical Splunk dashboards.
Splunk and SAP
Corporate Express is in the midst of a business transformation program called “next-gen”, rolling out SAP across New Zealand and Australia. To smooth the transformation, they first used Splunk to measure actual current concurrency and usage of the legacy ERP system, including active vs. inactive users, peak usage and WAN distribution. This visibility enables them to plan for the performance and capacity requirements of the new system by measuring dialog response time (performance) and dialog steps executed (capacity). Shaun showed an example of an SAP dialog response dashboard, measuring latency between SAP clients and servers over time, along with a breakdown of server activity.
Shaun commented that, “the senior SAP administrator has worked with SAP for 12 years and had not seen anything like this before.”
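To make the concurrency measurement concrete, here is a minimal sketch in Python of how peak concurrent usage can be derived from login and logout times. It is not how Corporate Express or Splunk computes it; the `peak_concurrency` function and the session data are hypothetical and only illustrate the idea of counting active users over time.

```python
from datetime import datetime


def peak_concurrency(sessions):
    """Return (peak_count, time_of_peak) for a list of (login, logout) pairs."""
    events = []
    for login, logout in sessions:
        events.append((login, 1))    # a user becomes active
        events.append((logout, -1))  # a user becomes inactive
    # Sort by time; at equal timestamps, logouts (-1) are processed before logins (+1)
    events.sort(key=lambda e: (e[0], e[1]))

    active = 0
    peak = 0
    peak_time = None
    for ts, delta in events:
        active += delta
        if active > peak:
            peak, peak_time = active, ts
    return peak, peak_time


# Hypothetical ERP sessions (login time, logout time), made up for illustration
sessions = [
    (datetime(2011, 1, 10, 9, 0), datetime(2011, 1, 10, 9, 30)),
    (datetime(2011, 1, 10, 9, 15), datetime(2011, 1, 10, 10, 0)),
    (datetime(2011, 1, 10, 9, 20), datetime(2011, 1, 10, 9, 25)),
    (datetime(2011, 1, 10, 9, 30), datetime(2011, 1, 10, 11, 0)),
]

peak, when = peak_concurrency(sessions)
print(peak, when)
```

With real data, the session pairs would come from the ERP system's own login records rather than a hard-coded list.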
Splunk and the Network
By tagging netflow metadata, Corporate Express can obtain richer analytics of their network performance and provide powerful insights into their WAN environment. Splunk gives them the ability to not only look at WAN utilization, but correlate that to the sources of utilization.
One dashboard Shaun showed compares internal and external Internet traffic and shows how utilization is distributed across all of their SAP applications.
Shaun commented how these dashboards were, “the tip of the iceberg. You can slice and dice information any way you want, tag data any way you want, and see it all in real time.”
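The idea of tagging netflow records and then attributing utilization to its sources can be sketched in a few lines of Python. This is only an illustration under assumptions: the tag names, subnets and flow records below are hypothetical, and Splunk's own tagging works on indexed events rather than on in-memory dictionaries.

```python
import ipaddress
from collections import defaultdict

# Hypothetical tag table mapping a tag name to a source subnet
TAGS = {
    "sap": ipaddress.ip_network("10.1.0.0/16"),
    "web-orders": ipaddress.ip_network("10.2.0.0/16"),
}


def tag_for(ip):
    """Return the first tag whose subnet contains the address, else 'untagged'."""
    addr = ipaddress.ip_address(ip)
    for tag, network in TAGS.items():
        if addr in network:
            return tag
    return "untagged"


def bytes_by_tag(flows):
    """Sum transferred bytes per tag across a list of flow records."""
    totals = defaultdict(int)
    for flow in flows:
        totals[tag_for(flow["src"])] += flow["bytes"]
    return dict(totals)


# Hypothetical flow records, made up for illustration
flows = [
    {"src": "10.1.4.20", "bytes": 1200},
    {"src": "10.2.9.7", "bytes": 800},
    {"src": "10.1.7.3", "bytes": 300},
    {"src": "192.168.5.5", "bytes": 50},
]

usage = bytes_by_tag(flows)
print(usage)
```

Once every flow carries a tag, total WAN utilization can be broken down by whatever grouping matters to the business, which is the "slice and dice" Shaun describes.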
Corporate Express is heavily committed to virtualization and has built an operational dashboard showing virtualization ratios. This enables management to see an immediate snapshot of the environment as well as trend the ratio of physical to virtual servers over time. They combine various data sources to visualize their datacenter OS split, not just by OS platform, but also by OS version. Shaun showed a SAN storage capacity dashboard used to ensure they avoid issues impacting production availability. He explained that, “we didn’t have to instrument anything – it took a few clicks to visualize the total number of emails and timechart email volume and message size.”
The more data in Splunk, the more valuable it becomes
Luke talked about their current implementation of Splunk. They Splunk all of their logs across their entire stack, from business systems down to the OS and networks, and everything in between!
“We Splunk all system logs (Linux, AIX, Windows), VMware environment logs (ESX servers and virtual guests), all Cisco device logs in Australia and New Zealand (firewalls, IPS, ACE, routers, switches), EMC Symmetrix storage array disk usage logs, IBM Systems Director logs, Nagios logs, logs from critical infrastructure applications (DNS, DHCP, postfix), critical business applications (SAP, webMethods, netXpress) and finally netflow. We use a combination of syslog-ng, rsync, SNMP and CIFS share to ingest data into Splunk.” In terms of new data sources, Luke said that, “we are also planning to use Splunk forwarders to get Active Directory data into Splunk.”
They do all of this on 1 production server – a 2 x dual-core CPU IBM blade server, running 64-bit Red Hat Enterprise Linux and with 8GB of RAM – handling data volumes of 3 Mbps.
When asked why Corporate Express chose Splunk, Shaun said, “the first driver for Splunk was log aggregation – providing that single pane of glass into our infrastructure.” He went on to say that, “before Splunk, it was really about manual processes and managing logs within silos – Windows logs you need to go to the Windows admin, or Unix logs the Unix admin. It was very manual and created too many bottlenecks!” Shaun explained how this resulted in a lot of copying of logs from production environments to enable developers to troubleshoot issues. Comparing logs across distributed applications was also entirely manual. “What Splunk has done has been to ‘virtualize’ the logs. Just go to the Splunk interface and type in your keyword or question. This has single-handedly removed the key bottleneck.”
Corporate Express have a range of initiatives where they plan to use Splunk in the future, including green IT, visualizing Oracle storage footprint and finally web order system business intelligence – intelligent analysis of end-user activity and information to optimize the site for customer usage behaviors.
The presentation generated a number of questions about the setup and administration of Splunk. One question was on their experience upgrading Splunk (which they had done the week prior). Luke explained the process: “it was all very simple, I basically stopped Splunk, installed the upgrade, accepting all the defaults and started Splunk. All this took me about 3 minutes.”
At the Singapore and Hong Kong events, we had the head of Technology for the Taiwanese Chapter of the Honeynet Project present how they use Splunk. The Honeynet Project is an international, non-profit research organization dedicated to improving the security of the Internet. They focus on providing awareness of the threats and vulnerabilities that exist on the Internet today, information on better ways to secure and defend resources, and tools to help guard against security threats. One of the highlights of his presentation was showing live Splunk dashboards, providing a real-time view of malware incidents occurring on their network. Because of the sensitive nature of the use case, I can’t elaborate further here, except to say that the presentation showcased how Splunk can be used for security.
We also had one of our partners in APAC present their experience deploying Splunk in over 100 sites across the region.
Johnny Lin – Systex Corporation
Johnny is the Director of the Splunk Lab at Systex – his team has deployed Splunk countless times across the APAC region and needless to say they’ve seen a range of use cases for Splunk. Johnny focused on how Splunk helps visualize IT data via dashboards and for specific customer use cases.
Taiwan Stock Exchange
Taiwan Stock Exchange had a plethora of network management tools, including new and legacy systems. Using Splunk, they were able to collect all their data in one place safely and securely and then build 3 levels of Splunk dashboards for the different roles – level 1 for executives and the NOC, level 2 for network monitoring engineers and level 3 for specific deep dive use cases.
#1 Bank in Thailand
The #1 bank in Thailand uses Splunk to meet their stringent audit and related requirements. They built dashboards to instantly visualize login success/failure over a period of time for Cisco Access Control Server, IBM Z-series mainframes and Tandem computers. Johnny brought up the raw log files the Tandem trading platforms generate, which are not very meaningful on their own, and showed how viewing the same data in Splunk lets them immediately derive more meaning.
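As a rough illustration of what a login success/failure view aggregates, the Python sketch below buckets login events by hour and outcome. The log format, host names and user names are hypothetical; real Cisco ACS, mainframe and Tandem logs look quite different, and this is not the bank's actual search.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log format: "<ISO timestamp> <device> login <success|failure> user=<name>"
SAMPLE_LOG = """\
2011-01-10T09:01:12 acs01 login success user=alice
2011-01-10T09:14:40 acs01 login failure user=bob
2011-01-10T09:47:03 mainframe1 login failure user=bob
2011-01-10T10:05:55 tandem2 login success user=carol
2011-01-10T10:20:31 acs01 login failure user=dave
"""


def logins_per_hour(log_text):
    """Count login events per (hour, outcome) bucket."""
    counts = Counter()
    for line in log_text.splitlines():
        parts = line.split()
        # Skip lines that do not match the assumed format
        if len(parts) < 4 or parts[2] != "login":
            continue
        ts = datetime.fromisoformat(parts[0])
        hour = ts.replace(minute=0, second=0)  # truncate to the start of the hour
        counts[(hour, parts[3])] += 1
    return counts


hourly = logins_per_hour(SAMPLE_LOG)
for (hour, outcome), n in sorted(hourly.items()):
    print(hour.strftime("%Y-%m-%d %H:00"), outcome, n)
```

Charting these per-hour counts as two series, one for successes and one for failures, gives the kind of over-time view an auditor can read at a glance.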
IAH Games
The IT infrastructure at IAH Games consists of different systems, such as membership and billing systems, the game systems themselves (each game has its own architecture) and infrastructure (network, server monitoring, security). They use Splunk to consolidate the machine-generated data from all of these sources and derive new business intelligence. The game data resides in the game silo, the membership data in the membership and billing system and the payment data in the billing system and game silo. Using Splunk they can generate reports by correlating across systems, such as the distribution of active users by age group; in this instance, user data resides in the billing system and age data resides in the membership system. Best of all, they can plot multiple search results on the same chart in just a few clicks.
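The cross-system correlation described above, active users from one system joined to age data from another, can be sketched in Python as follows. The record layouts, field names and age brackets are all assumptions made for illustration, not the actual IAH Games schema or search.

```python
from collections import Counter

# Hypothetical billing events: each one marks a user as active
billing_events = [
    {"user_id": "u1", "amount": 5.0},
    {"user_id": "u2", "amount": 12.0},
    {"user_id": "u1", "amount": 3.0},
    {"user_id": "u4", "amount": 8.0},
]

# Hypothetical membership records keyed by user id
membership = {
    "u1": {"age": 17},
    "u2": {"age": 24},
    "u3": {"age": 31},
    "u4": {"age": 29},
}


def age_group(age):
    """Map an age to an illustrative bracket."""
    if age < 18:
        return "under 18"
    if age < 25:
        return "18-24"
    if age < 35:
        return "25-34"
    return "35+"


def active_users_by_age_group(events, members):
    """Join distinct active users (billing) to their ages (membership)."""
    active_ids = {event["user_id"] for event in events}
    groups = Counter()
    for user_id in active_ids:
        member = members.get(user_id)
        if member is None:
            groups["unknown"] += 1  # active user with no membership record
        else:
            groups[age_group(member["age"])] += 1
    return groups


distribution = active_users_by_age_group(billing_events, membership)
print(dict(distribution))
```

The join key here is the user id shared by both systems; without a common field like that, correlating across silos is not possible however the data is stored.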
Mobile Operator in Taiwan
The final example was a mobile operator in Taiwan using Splunk for business intelligence. Before Splunk they relied heavily on a relational database architecture for reporting, which led to a large data warehouse to maintain, expensive ongoing DB license and storage costs and no real-time visibility into value-added service activities. Important BI reports were only available the next day (24 hours or more later), and reports spanning a longer time range took hours to run.
Once their data was in Splunk they were able to get clear and immediate visibility of WAP Portal access and hit rates by various metrics, including WAP Portal hit rate over time, WAP Portal ad hit rate analysis, portal hit rate analysis by Web page category, SMS message volume over time and WAP gateway user activity analysis, to name but a few. Johnny showed two of these dashboards as examples.
He closed with a slide on how a single system replacing many silo-based systems makes more sense for today’s IT environment.
Splunk is generating a lot of interest in the countries visited during this SplunkLive tour. We’re seeing more and more great examples and use cases coming through. Watch out for the next SplunkLive near you.