How to Introduce Yourself to Machine Learning

By Splunk

Most IT and business leaders know that despite the economic and human disruption of the COVID-19 pandemic, digital transformation will ultimately speed up, not slow down. The immediate challenges of the pandemic have led companies to find innovative ways to get things done, relying on data-driven decisions and technologies.

As the volume and variety of data from both existing and emerging use cases explode, we need to act on that data in real time. Humans can’t effectively pore over log data, for instance, looking for security or operational red flags. Drawing insights from data at that scale requires machine learning. Machine learning algorithms learn autonomously from the data they process, both to perform specific tasks and to improve how well they perform them. (Though “machine learning” and “artificial intelligence” are often used interchangeably, ML is a subfield of AI.)

To improve resilience against future crises and to excel in a period of accelerated digital transformation, we need machine learning. It is the only way we can understand and act on the volumes of data we’re taking in, at the speed required to serve our customers, outpace our competitors, or fulfill our mission.

Machine learning is complex, and there is a lot of buzz around it, not all of it positive and not all of it accurate. Further complicating uptake, vendors tend to talk about machine learning as a feature, like the seasoning in a really good dish, rather than the point of the meal. Yet of all the technologies that will drive us further into the Data Age, machine learning is the most foundational. Machine learning algorithms will allow us to work with data at the volume that current and future technologies will bring. Every organization needs to understand how to begin using machine learning. Fortunately, it’s not an all-or-nothing proposition. I often explain it in terms of learning to crawl, then walk, before you run.

Crawl: Trust But Verify. If your organization doesn’t have a background in AI, often the best route is to start with a product that has ML “baked in,” in which the machine learning is something of a black box. A SIEM solution, for instance, will employ machine learning, and while the solution is configurable, you won’t really be interacting with the algorithms. What’s essential here is explainability. When the algorithm is sorting malware activity vs. false positives vs. normal behavior, you need some understanding of how those calls are made. That way, you can go back and check the results and improve (or justifiably decrease) your confidence in the algorithm.
Walk: Tune Inputs, Improve Outputs. When you have people in your organization who can go to the next level, you can do more than rely on explainability to help you triage and take better action on the ML outputs. With a little more coding, you can start to control the features that go into the system. Now it’s more than “show me malware instances, and I can check your work.” You can control how much metadata you give to make the algorithm better, and you can refine the data so that the algorithm is working with the most relevant information.
Run: Take Control, Run Tests. The third level requires a higher level of expertise and an ability to work with the algorithms at a much deeper level. If being able to refine a given algorithm is helpful, even more useful is being able to test algorithms against one another. You’ve refined algorithm A as much as you can, so now you test it against newly created algorithm B and see which one works better. That’s a scientific approach to applying machine learning to get optimal results, and it gives you maximum transparency and confidence.
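
To make the Run step concrete, here is a minimal sketch of testing algorithm A against algorithm B on the same labeled data. It is an illustration only, not how Splunk or any SIEM product works internally: the synthetic events, the two simple outlier detectors, and every function name (make_events, detector_a, detector_b, precision_recall, compare) are assumptions made up for this example.

```python
import random
import statistics


def make_events(n=500, anomaly_rate=0.05, seed=7):
    """Build synthetic (value, is_anomaly) pairs standing in for a log-derived metric."""
    rng = random.Random(seed)
    events = []
    for _ in range(n):
        if rng.random() < anomaly_rate:
            events.append((rng.gauss(100.0, 10.0), True))   # hypothetical anomalous reading
        else:
            events.append((rng.gauss(50.0, 5.0), False))    # hypothetical normal reading
    return events


def detector_a(values, threshold=3.0):
    """Algorithm A: flag values more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    return [abs(v - mean) > threshold * stdev for v in values]


def detector_b(values, threshold=3.5):
    """Algorithm B: modified z-score built on the median and median absolute deviation."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    return [0.6745 * abs(v - median) > threshold * mad for v in values]


def precision_recall(predicted, actual):
    """Score a detector's flags against known labels."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def compare(events):
    """Run both detectors on the same events and return their scores side by side."""
    values = [v for v, _ in events]
    labels = [a for _, a in events]
    return {
        "A (mean/stdev)": precision_recall(detector_a(values), labels),
        "B (median/MAD)": precision_recall(detector_b(values), labels),
    }


if __name__ == "__main__":
    for name, (precision, recall) in compare(make_events()).items():
        print(f"{name}: precision={precision:.2f} recall={recall:.2f}")
```

The point of the sketch is the shape of the experiment rather than these particular detectors: both algorithms see identical inputs, both are scored against the same known labels, and the comparison is repeatable because the data is seeded. In practice the labels would come from verified incidents rather than a random generator.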

There is a lot of concern and confusion about machine learning, and about the larger field of artificial intelligence. Some concerns are justified: algorithms have been shown to have bias, for instance, and that bias must be identified and removed. AI will also affect people’s jobs, both in the skills we need to grow in our careers and in which functions will be handed entirely to automation. But these are questions of how AI is designed and deployed. The solution is not to hide from the technology, which isn’t possible. The solution is to carefully embrace the technology, assess how it works, and develop the talent to work more closely with it, to drive continual improvement.

As we move forward into an increasingly fast, increasingly data-rich world, machine learning is going to be an essential tool for navigating a successful path.

----------------------------------------------------
Thanks!
Ram Sriharsha


About Splunk

The Splunk platform removes the barriers between data and action, empowering observability, IT and security teams to ensure their organizations are secure, resilient and innovative.

Founded in 2003, Splunk is a global company with over 7,500 employees and availability in 21 regions around the world; Splunkers have received over 1,020 patents to date. It offers an open, extensible data platform that supports shared data across any environment so that all teams in an organization can get end-to-end visibility, with context, for every interaction and business process. Build a strong data foundation with Splunk.
