Log analytics
Splunk automatically ingests, indexes and stores any human-readable file, regardless of source. Metrics and 100% of traces are automatically correlated with logs, enabling teams to find and resolve issues quickly. Our indexing and search have proven to scale across enterprise datasets, surfacing both knowns and unknowns so that engineering and ITOps teams can find what they need when they need it.
Datadog’s datastore lacks Splunk’s flexibility, focusing primarily on storing metric time series and application logs. Unlike Splunk users, Datadog users are required to choose between search cost and performance, which extends MTTR when engineers encounter unforeseen issues. The result: extended outages and unexpected overages to reindex logs and improve search queries after the fact.
Detection and alerting |
Splunk Observability Cloud collectors stream granular, one-second data every 2-3 seconds, powering near real-time visualizations, issue detection and alerting. This speed reduces MTTR, improves the consumer experience and lowers frustration for engineers and business leaders.
Datadog’s agents poll APM telemetry data only once every 60 seconds. Additional time is then needed to store, process and visualize the telemetry, resulting in increased MTTR, slower detection and alerting, and a suboptimal experience for engineers and business leaders.
Data retention and integration |
Splunk’s NoSample™ tracing stores all traces without risk of storing redundant spans. Metrics pipeline management makes it easy to transform, redact and drop data to strike the right balance between cost and performance. We also support federated search in AWS S3, which lowers costs while still retaining search capability. The result? You have all the data you need to isolate problems quickly and easily without compromising cost controls.
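To make the transform, redact and drop idea concrete, here is a minimal, vendor-neutral Python sketch of the kind of rule a metrics/log pipeline might apply before data is indexed. The field names, rules and the transform function are hypothetical illustrations, not Splunk's pipeline syntax or API.

```python
import re
from typing import Optional

# Hypothetical rules: drop noisy debug events, redact payment card numbers,
# and strip a high-cardinality dimension to control cost.
CARD_PATTERN = re.compile(r"\b\d{4}(?:[ -]?\d{4}){3}\b")

def transform(record: dict) -> Optional[dict]:
    """Return the record to index, or None to drop it entirely."""
    # Drop: debug-level events are not worth indexing in this example.
    if record.get("severity") == "DEBUG":
        return None

    # Redact: mask anything that looks like a payment card number.
    if "message" in record:
        record["message"] = CARD_PATTERN.sub("****-****-****-****", record["message"])

    # Transform: remove a high-cardinality field before indexing.
    record.pop("session_id", None)
    return record

# Purely illustrative usage.
event = {"severity": "INFO",
         "message": "charge ok card 4111 1111 1111 1111",
         "session_id": "abc123"}
print(transform(event))
```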
Datadog stores 100% of trace data only for the first 15 minutes, after which users are forced to sample traces.⁸ This can lead to delayed alerting and slowed troubleshooting while engineers wait for the platform to capture the offending traces. Datadog’s pipelines lack the robust routing capabilities Splunk has, their transformations are less flexible, and users cannot easily redact data without modifying the underlying source. Customers are encouraged to dehydrate (archive) data and rehydrate it later when they need to search it, which increases costs and makes it harder for engineers to search and isolate problems.
Troubleshooting experience |
Splunk identifies the business impact of performance problems spanning multiple services and teams. We correlate metrics, logs and traces into cohesive, easy-to-understand visualizations with dynamic, AI-powered alert thresholds. Constructing search queries from any data element is easy thanks to our rich suggestion libraries and fully indexed logs. Splunk IT Service Intelligence provides visibility into business health and its relationship to the health of IT assets and services. Users find and resolve issues faster with Splunk.
In complex scenarios, Datadog’s troubleshooting capabilities aren’t as robust. Limited full-trace collection and a long learning period prevent engineers from using dynamic alert thresholds to investigate unforeseen issues, forcing them to manually tune alerts to ultimately capture root cause. Log search is enabled through a combination of automated and user-defined tags; when tags don’t exist, users resort to potentially slow attribute queries. Business views don’t support third-party data or highly customized views the way Splunk does.
OpenTelemetry support |
Splunk Observability Cloud is 100% OpenTelemetry-native and a significant contributor to the project. Splunk users can confidently collect, process, transform, visualize and alert on OpenTelemetry data without worrying about exceptions or OpenTelemetry-specific constraints. They can contribute directly to the community and fully realize the benefits of OpenTelemetry.
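As a concrete illustration of collecting this kind of data, here is a minimal sketch using the standard OpenTelemetry Python SDK. It assumes a locally running OpenTelemetry Collector listening on the default OTLP/gRPC port that forwards data to the backend; the service name, span name, attribute and event are placeholders.

```python
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Describe the emitting service; the name is a placeholder.
resource = Resource.create({"service.name": "checkout-service"})

# Send spans over OTLP/gRPC to a local Collector (assumed at the default port),
# which can then export them to the observability backend of your choice.
provider = TracerProvider(resource=resource)
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:4317", insecure=True))
)
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)

# A span with an attribute and a span event, both standard OpenTelemetry
# concepts that a fully OTel-native backend should accept as-is.
with tracer.start_as_current_span("process-order") as span:
    span.set_attribute("order.id", "A-1001")
    span.add_event("payment.authorized")
```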
Datadog’s inaccurate OpenTelemetry documentation wastes users’ time, forcing them to fix broken example code before they can follow it. Datadog’s OpenTelemetry tracing doesn’t support span events, and users cannot link traces and logs even by manually patching their own logging module or library. Logging and trace data are stored separately, which means they can’t be correlated, and users are unable to query span data as metrics in their dashboards.