Salt Lake City-based Recursion Pharmaceuticals set a goal: discover new treatments for 100 genetic diseases by 2025 using an innovative combination of biology, automation and machine learning.
Today Recursion uses robotic microscopes to capture tens of thousands of cell culture images, generating terabytes of data daily. As a rapidly growing startup with just a few employees, it faced challenges in logging procedures and tracking operational data. Since deploying Splunk Enterprise, the company has seen benefits including:
Time to value in three days, with complex laboratory adoption complete in only three months
Visibility into all parts of the production process informs disease-targeting experiments
Rocket-fast ramp-up and massive data volume require extreme scalability
Aggressive target of 100 new treatments calls for real-time visibility into experiments
Pulling together critical information from different sources and test instruments
Diverse users and requirements, from Python-programming scientists to dashboard-browsing execs and techs
Initial time to value in only three days, with complex laboratory ramping up operations in only three months
Massive scale while managing data volume of 700,000 TIFF files weekly
Timely visibility into all parts of the production facility informs experiments
Robust platform for central log management and review
Machine Learning Toolkit provides otherwise hidden insights into experiments
Flexible data management platform for moving information in and out of data scientists’ code base
Substantial cost savings versus specialized laboratory information management software
In the past, Recursion found it difficult to manage large amounts of time-series data collected from computer-controlled instruments and video footage generated from cameras in the laboratory. The initial data management strategy hardly matched the firm’s aggressive high-volume ambitions. Its laboratory’s microscopes currently produce on the order of 700,000 TIFF files each week, representing an 800 percent increase in productivity over 10 months.
While the company considered open-source alternatives, Ben Miller, director of high-throughput science (HTS) operations, saw the pivotal role that Splunk Enterprise could fill as Recursion ramped up its capabilities. “I was getting value out of Splunk Enterprise within about three days,” Miller says.
Recursion has built a world-class proprietary machine learning system that analyzes terabytes of experimental image data daily to discover new treatments for critical diseases. This system is integrated with Splunk Enterprise via the Splunk SDK for Python, which passes operational data back into the experiment-design processes, and via Splunk DB Connect, which enriches log data with quality metrics. From there, the Splunk Machine Learning Toolkit makes it easy to comb these higher-level operational metrics for new insights into laboratory processes or, in Miller’s words, to “wrangle really large quantities of data and understand what correlations are happening as they are happening, not months later.”
“Splunk Enterprise gives us that visibility into all parts of the production facility in a very timely fashion to help inform our next set of experiments.”
John Pereira, Chief Operating Officer, Recursion Pharmaceuticals