How to Introduce Yourself to Machine Learning

Most IT and business leaders know that despite the economic and human disruption of the COVID-19 pandemic, digital transformation will ultimately speed up, not slow down. The immediate challenges of the pandemic have led companies to find innovative ways to get things done, relying on data-driven decisions and technologies.

As the volume and variety of data from both existing and emerging use cases explodes, we need to act on that data in real time. Humans can’t effectively pore over log data, for instance, looking for security or operational red flags. Drawing insights from data at that scale requires machine learning. Machine learning algorithms can autonomously learn from the data they process to perform specific tasks, and to improve their performance of those tasks over time. (Though “machine learning” and “artificial intelligence” are often used interchangeably, ML is a subfield of AI.)

Both to improve resilience against future crises and to excel in a period of accelerated digital transformation, machine learning is the only way we can understand and act on the volumes of data we’re taking in, at the speed required to serve our customers, outpace our competitors, or fulfill our mission.

Machine learning is complex, and there is a lot of buzz around it, not all of it positive and not all of it accurate. Further complicating uptake, vendors tend to talk about machine learning as a feature, like the seasoning in a really good dish, rather than the point of the meal. Yet of all the technologies that will drive us further into the Data Age, machine learning is the most foundational. Machine learning algorithms will allow us to work with data at the volume that current and future technologies will bring. Every organization needs to understand how to begin using machine learning. Fortunately, it’s not an all-or-nothing proposition. I often explain it in terms of learning to crawl, then walk, before you run.

  1. Crawl: Trust But Verify. If your organization doesn’t have a background in AI, often the best route is to start with a product that has ML “baked in,” in which the machine learning is something of a black box. A SIEM solution, for instance, will employ machine learning, and while the solution is configurable, you won’t really be interacting with the algorithms. What’s essential here is explainability. When the algorithm is sorting malware activity vs. false positives vs. normal behavior, you want at least some understanding of how those calls are made. That way, you can go back and check the results and increase (or justifiably decrease) your confidence in the algorithm.

  2. Walk: Tune Inputs, Improve Outputs. When you have people in your organization who can go to the next level, you can do more than rely on explainability to help you triage and take better action on the ML outputs. With a little more coding, you can start to control the features that go into the system. Now it’s more than “show me malware instances, and I can check your work.” You can control how much metadata you give to make the algorithm better, and you can refine the data so that the algorithm is working with the most relevant information. 

  3. Run: Take Control, Run Tests. The third level requires more expertise: an ability to work with the algorithms at a much deeper level. If being able to refine a given algorithm is helpful, even more useful is being able to test algorithms against one another. You’ve refined algorithm A as much as you can, so now let’s test it against newly created algorithm B, and see which one works better. That’s a scientific approach to applying machine learning to get optimal results, and it gives you maximum transparency and confidence.
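
To make the “Run” step concrete, here is a minimal sketch of testing algorithm A against algorithm B on the same labeled data. Everything in it is an assumption made for illustration: the failed-login counts are invented, and the two detectors (a mean/standard-deviation rule and a median-based rule) are simple stand-ins, not the algorithms inside any particular product.

```python
from statistics import mean, median, pstdev

# Hypothetical labeled sample: (failed_logins_per_minute, is_attack).
# The numbers are invented purely for illustration.
EVENTS = [
    (2, False), (3, False), (1, False), (4, False), (2, False),
    (3, False), (5, False), (2, False), (40, True), (3, False),
    (1, False), (35, True), (4, False), (2, False), (12, True),
]


def detector_a(values, threshold=2.0):
    """Algorithm A: flag values more than `threshold` standard
    deviations above the mean."""
    mu, sigma = mean(values), pstdev(values)
    return [sigma > 0 and (v - mu) / sigma > threshold for v in values]


def detector_b(values, threshold=3.0):
    """Algorithm B: flag values more than `threshold` median absolute
    deviations above the median, which is less distorted by the
    outliers themselves."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return [mad > 0 and (v - med) / mad > threshold for v in values]


def score(predicted, actual):
    """Return precision and recall for a list of boolean predictions."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"precision": precision, "recall": recall}


values = [count for count, _ in EVENTS]
labels = [is_attack for _, is_attack in EVENTS]

# Run both algorithms on identical inputs so the comparison is fair.
results = {
    "A (mean/stdev)": score(detector_a(values), labels),
    "B (median/MAD)": score(detector_b(values), labels),
}

for name, metrics in results.items():
    print(f"{name}: precision={metrics['precision']:.2f} "
          f"recall={metrics['recall']:.2f}")
```

On this toy sample, algorithm A misses the smallest attack because the two large spikes inflate its own baseline, while algorithm B catches all three. The particular winner matters less than the method: both algorithms see the same inputs and are scored against the same labels, so the choice between them rests on evidence rather than on trust.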
     

There is a lot of concern and confusion about machine learning, and about the larger field of artificial intelligence. Some concerns are justified: algorithms have been shown to have bias, for instance, and that bias must be identified and removed. AI will also affect people’s jobs, in terms of the skills we need to grow in our careers, and in terms of which functions will be handed entirely to automation. But these are questions of how AI is designed and deployed. The solution is not to hide from the technology, which isn’t possible anyway. The solution is to carefully embrace the technology, assess how it works, and develop the talent to work more closely with it, to drive continual improvement.

As we move forward into an increasingly fast, increasingly data-rich world, machine learning is going to be an essential tool for navigating a successful path.

Posted by Ram Sriharsha

Ram is the head of Machine Learning at Splunk. His group applies and advances state-of-the-art machine learning in areas relevant to Splunk. They also develop machine-learning-based insights that power Splunk’s core products. Prior to Splunk, he worked at Databricks, where he led all engineering and product development for the Genomics Vertical and started the R&D center for Apache Spark in Amsterdam. Prior roles also include Principal Scientist at Yahoo Research, where he focused on large-scale machine learning and real-time machine learning in search and display advertising, as well as login risk detection. He holds a PhD in Theoretical Physics from the University of Maryland. He is also an Apache Spark PMC Member and Committer, and in his spare time he creates and maintains open source projects like Magellan (fast geospatial analytics on top of Spark).
