Written by: Nimesh Chakravarthi, Co-founder & CTO, Struct
Key Takeaways
- Datadog Bits AI automates SRE investigations via Slack @Bits commands, correlating metrics, logs, traces, and GitHub code context, with reported MTTR reductions of 70-90%.
- Setup follows seven steps: enable Bits, configure Slack and GitHub integrations, create bits.md runbooks, and test auto-investigations on monitors.
- Bits performs well for reactive workflows like triage, remediation suggestions, and post-mortems but still triggers only after alerts, so alert fatigue remains.
- Struct focuses on proactive monitoring with under-5-minute investigations, 5-10 minute setup, and about 80% triage reduction without manual runbooks.
- Automate your on-call runbook with Struct’s proactive AI SRE platform to move your team toward prevention-focused operations.
How Datadog Bits AI Supports SRE Teams
Datadog Bits AI SRE is an automated investigation agent for software engineering teams: when an alert fires, it runs root cause analysis across logs, metrics, and traces. The 2026 updates introduce a new agent harness and MCP-powered tool integration, completing investigations in 3-4 minutes with expanded data sources including Real User Monitoring, Database Monitoring, and Network Path Analysis.
Core capabilities include automatic investigations that reduce MTTR by 70-90%, a conversational chat interface, and development agent features for GitHub PR creation. Bits executes seven triage actions including Slack notifications, incident creation via Datadog Incident Response, and Jira ticket generation.
Bits still operates as a reactive system, since it triggers only after alerts fire. Software engineering teams continue to face alert fatigue and context-switching delays even with faster investigations. For teams that want to prevent incidents instead of only responding faster, Struct offers proactive monitoring that identifies issues before they escalate and delivers automated investigations without manual prompting.
Compare Struct’s proactive investigations with Datadog Bits
Seven Steps to Configure Datadog Bits AI SRE
These seven steps guide you through configuring Datadog Bits for automated SRE workflows in your software engineering environment.
1. Enable Bits AI SRE: Navigate to AI > Bits AI SRE in your Datadog console and activate the service. This activation creates the base layer for automated investigations.
2. Configure Integrations: With Bits active, connect Slack for alert notifications and GitHub for code context, and confirm that Datadog monitors are configured across logs, metrics, and traces. These integrations supply the data sources that Bits queries during investigations.
3. Create bits.md Runbooks: Configure the customizable bits.md file with team-specific troubleshooting knowledge. Include correlation IDs, service dependencies, and escalation procedures so Bits can tailor investigations to your environment.
4. Test Query Interface: Use @Bits commands in Slack channels to confirm that the AI responds correctly to investigation requests such as “analyze recent API latency spike.” These tests validate that integrations and runbooks work as expected.
5. Enable Auto-Investigations: Configure monitors to automatically trigger Bits investigations when thresholds are breached. This step shifts your team from manual queries to automated responses.
6. Set Up GitHub Dev Agent: Link your GitHub organization to enable automatic team syncing and code correlation for incident attribution. This connection lets Bits tie incidents to specific deployments and pull requests.
7. Refine Dashboards for Investigations: Create investigation-focused dashboards that Bits can reference during automated analysis. These dashboards give the AI curated views of critical metrics and logs.
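Steps 2 and 5 both depend on having a well-defined monitor for Bits to react to. The sketch below builds a metric-monitor definition as a plain Python dict in the general shape the Datadog Monitors API accepts (name, type, query, message, options). The metric name, service, and Slack channel are illustrative assumptions, and the source does not say how a monitor is flagged for auto-investigation, so that part is intentionally left out.

```python
def build_latency_monitor(service, slack_channel, critical_seconds=0.5):
    """Return a metric-monitor definition as a plain dict (no API call is made)."""
    # Assumed metric name; substitute whatever latency metric your services emit.
    query = (
        "avg(last_5m):avg:trace.http.request.duration"
        f"{{service:{service}}} > {critical_seconds}"
    )
    return {
        "name": f"High API latency on {service}",
        "type": "metric alert",
        "query": query,
        # The @slack-<channel> mention routes the notification to Slack.
        "message": (
            f"API latency is above {critical_seconds}s for {service}. "
            f"@slack-{slack_channel}"
        ),
        "tags": [f"service:{service}", "team:sre"],
        "options": {"thresholds": {"critical": critical_seconds}},
    }


monitor = build_latency_monitor("checkout-api", "sre-alerts")
print(monitor["query"])
```

Keeping the definition in code makes it easy to review thresholds alongside the services they cover before any alert reaches Bits.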
Common troubleshooting steps include verifying Slack permissions for bot access and checking bits.md YAML syntax for errors. Teams can complete setup by following this sequence and validating each stage before moving to the next.
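A quick pre-flight check can catch an incomplete runbook before Bits ever reads it. The following is a minimal sketch that scans a bits.md file for the topics named in step 3. The source does not describe the file's exact layout, so the heading names here are hypothetical and should be adapted to whatever structure your team uses.

```python
import re

# Hypothetical section names based on the topics listed in step 3.
REQUIRED_SECTIONS = ("Correlation IDs", "Service Dependencies", "Escalation")


def missing_sections(markdown_text, required=REQUIRED_SECTIONS):
    """Return the required section names that have no matching heading."""
    headings = {
        match.group(1).strip().lower()
        for match in re.finditer(
            r"^#{1,6}[ \t]+(.+?)[ \t]*$", markdown_text, flags=re.MULTILINE
        )
    }
    return [name for name in required if name.lower() not in headings]


sample = """# Checkout runbook
## Correlation IDs
Every request carries an X-Request-ID header.
## Escalation
Page the payments on-call after 15 minutes.
"""
print(missing_sections(sample))
```

Running a check like this in CI is one way to keep the runbook from drifting as services change.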
Teams that want similar automation without custom runbooks often prefer tools with pre-built integrations and default playbooks.
See how Struct removes manual runbook setup for Datadog users
How Bits Powers AI-Driven SRE Workflows
Datadog Bits supports three core SRE workflows: automated triage for blast radius assessment, remediation through suggested fixes, and post-mortem generation. The AI correlates signals across the full stack, tracing elevated API latency through user sessions, backend dependencies, database queries, and network paths.
Typical workflows include pulling logs from five minutes before alert triggers, correlating GitHub deployment changes with error spikes, and generating incident timelines with supporting telemetry data. These workflows give on-call engineers a structured view of each incident without manual querying.
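To make two of these workflows concrete, here is a rough sketch of the underlying logic: computing the five-minute log window before an alert and listing deployments that landed shortly before an error spike. This is an illustration of the idea, not Bits' actual implementation. The record fields and the 30-minute deploy lookback are assumptions.

```python
from datetime import datetime, timedelta


def log_window(alert_time, lookback_minutes=5):
    """Return (start, end) of the log window leading up to the alert."""
    return alert_time - timedelta(minutes=lookback_minutes), alert_time


def deploys_before_spike(deploys, spike_time, lookback_minutes=30):
    """Return deploys inside the lookback window, most recent first."""
    start = spike_time - timedelta(minutes=lookback_minutes)
    recent = [d for d in deploys if start <= d["deployed_at"] <= spike_time]
    return sorted(recent, key=lambda d: d["deployed_at"], reverse=True)


alert = datetime(2026, 1, 15, 12, 0)
deploys = [
    {"pr": 101, "deployed_at": datetime(2026, 1, 15, 10, 0)},
    {"pr": 102, "deployed_at": datetime(2026, 1, 15, 11, 50)},
    {"pr": 103, "deployed_at": datetime(2026, 1, 15, 12, 5)},
]
print(log_window(alert))
print([d["pr"] for d in deploys_before_spike(deploys, alert)])
```

Deploys after the spike are excluded because they cannot have caused it, which is why the window is closed at the spike time.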
| Metric | Bits AI | Manual Process |
|---|---|---|
| Investigation Time | 3-4 minutes | 45 minutes |
| MTTR Reduction | 70-90% | N/A |
| Parallel Investigations | Multiple simultaneous | Sequential only |
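The investigation-time row can be turned into a percentage with simple arithmetic. The sketch below uses only the figures from the table. Note that it measures investigation time alone, which is one component of MTTR, so the result is not the same quantity as the 70-90% MTTR figure.

```python
def percent_reduction(before, after):
    """Percentage drop from `before` to `after`."""
    return (before - after) / before * 100


manual_minutes = 45
slow_case = percent_reduction(manual_minutes, 4)  # 4-minute investigation
fast_case = percent_reduction(manual_minutes, 3)  # 3-minute investigation
print(f"{slow_case:.1f}% to {fast_case:.1f}% less investigation time")
```

MTTR also includes detection, remediation, and verification time, which helps explain why an investigation that is over 90% faster can still translate into a smaller overall MTTR improvement.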
Key limitations include reactive triggering that requires alerts to fire first, potential investigation delays during high-volume incidents, and dependency on comprehensive Datadog instrumentation. Alert fatigue often persists because Bits accelerates response but does not prevent issues from reaching alert thresholds.
Teams that want to combine fast reactive analysis with early detection often pair Bits with a proactive platform.
Compare proactive and reactive SRE workflows with Struct
Datadog Bits Pricing and Real-World Feedback
Datadog provides a 14-day free trial of the entire product suite, including Bits AI SRE. Datadog Infrastructure Pro starts at $15 per infra host per month (billed annually) and Enterprise at $23 per infra host per month, plus Bits AI paid add-on costs.
| Plan | Base Cost | Bits AI SRE |
|---|---|---|
| Infrastructure Pro | $15/host/month (billed annually) | Paid add-on |
| Infrastructure Enterprise | $23/host/month | Paid add-on |
| Trial | 14 days free | Included |
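For budgeting, the per-host prices above can be projected to an annual base cost. This is a rough sketch using only the list prices quoted here. The 50-host fleet is an assumed example, and the Bits AI add-on is excluded because its price is not stated.

```python
PRO_PER_HOST = 15         # USD per infra host per month, billed annually
ENTERPRISE_PER_HOST = 23  # USD per infra host per month


def annual_base_cost(hosts, per_host_monthly):
    """Annual infrastructure cost in USD, excluding the Bits AI add-on."""
    return hosts * per_host_monthly * 12


hosts = 50  # assumed fleet size for illustration
print("Pro:", annual_base_cost(hosts, PRO_PER_HOST))
print("Enterprise:", annual_base_cost(hosts, ENTERPRISE_PER_HOST))
```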
User feedback shows strong results for some teams and modest gains for others. Rafael Bento from iFood reports “From day one, Bits AI SRE started cutting our MTTR by 70%”, while other teams describe Bits as mainly a powerful query interface instead of full automation.
Compared to Struct’s Growth plan supporting 200 issues per month at similar pricing, Bits often requires more manual intervention and setup effort. This difference matters most for teams with lean SRE staffing or frequent incident volume.
Estimate your SRE ROI with a Struct demo
Why Struct Fits Proactive AI-Driven SRE
Struct focuses on proactive automation that identifies issues before alerts fire, which complements or replaces reactive tools like Bits. Struct deploys in 5-10 minutes, integrates with leading observability platforms including Datadog, and customers report an 80% reduction in triage time.
Struct’s main advantages include automatic investigation initiation without manual prompting, broad integration with Slack, GitHub, and observability tools, and junior-friendly runbooks that close knowledge gaps. One fintech customer reached SLA compliance by cutting investigation time from 45 minutes to under five minutes.
| Feature | Datadog Bits | Struct |
|---|---|---|
| Trigger Method | Reactive (post-alert) | Proactive monitoring |
| Investigation Speed | 3-4 minutes | Under 5 minutes |
| Setup Time | Varies | 5-10 minutes |
| Reported Improvement | 70-90% MTTR reduction | About 80% triage reduction |
Schedule a Struct pilot to see proactive SRE in action
Common Datadog Bits Pitfalls and How to Avoid Them
Teams often run into three recurring Datadog Bits implementation issues: incomplete bits.md configuration that produces generic investigations, over-reliance on AI without human validation, and weak Datadog instrumentation that creates investigation gaps. Teams must fully instrument their application footprint with metrics, logs, traces, and runtime data before relying on Bits.
Effective usage starts with iterating on runbook configurations based on investigation results. This feedback loop helps Bits deliver more accurate and relevant recommendations over time. Teams should also maintain disciplined alerting thresholds to reduce noise, since even automated investigations become overwhelming when low-value alerts trigger them. Finally, confirm that the handling of sensitive log data meets your SOC 2 and HIPAA obligations, because Bits processes and may store this information during investigations.
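One way to put alert discipline into practice is to review alert history and flag monitors that fire often but rarely lead to action. The sketch below assumes a hypothetical export of alert records with a `monitor` name and an `actionable` flag. The thresholds are arbitrary starting points, not recommendations from Datadog.

```python
from collections import Counter


def noisy_monitors(alerts, min_alerts=5, max_actionable_ratio=0.2):
    """Return names of monitors that fire often but are rarely actionable."""
    totals = Counter(a["monitor"] for a in alerts)
    actionable = Counter(a["monitor"] for a in alerts if a["actionable"])
    return sorted(
        name
        for name, count in totals.items()
        if count >= min_alerts and actionable[name] / count <= max_actionable_ratio
    )


history = (
    [{"monitor": "disk-warn", "actionable": i == 0} for i in range(10)]
    + [{"monitor": "api-5xx", "actionable": i < 5} for i in range(6)]
    + [{"monitor": "cron-late", "actionable": False} for _ in range(2)]
)
print(noisy_monitors(history))
```

Monitors flagged this way are candidates for a higher threshold or a longer evaluation window, which also reduces the number of investigations Bits has to run.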
Many teams pair Bits with Struct to cover both sides of incident management. Struct handles proactive monitoring and early detection, while Bits provides reactive deep-dive analysis once alerts occur.
FAQ
How does Datadog Bits compare to Struct for SRE automation?
Datadog Bits operates as a reactive system, triggering investigations only after alerts fire and relying on manual setup of bits.md runbooks. Struct provides proactive monitoring that surfaces issues before they escalate and automatically correlates logs and metrics without custom runbooks. Investigation times are similar, at 3-4 minutes for Bits and under five minutes for Struct. The main difference is that Struct aims to catch issues early enough that many alerts never fire.
What is the typical setup time for each platform?
Datadog Bits requires configuration across several layers, including integration setup, bits.md runbook creation, and testing across Slack, GitHub, and monitoring systems. Struct deploys in 5-10 minutes through simple authentication with existing observability tools, Slack channels, and code repositories. This difference comes from Struct’s pre-built integrations compared to Bits’ need for custom runbook configuration.
What if our logging and telemetry infrastructure needs improvement?
Both platforms depend on quality telemetry data to support software engineering observability. Datadog Bits needs comprehensive instrumentation across metrics, logs, traces, and runtime data in the Datadog platform. Struct connects to existing observability tools but also relies on proper logging, trace IDs, and alert configurations. Teams with weak telemetry should first improve data coverage and quality before rolling out either AI-driven solution.
How do pricing models compare between the platforms?
Datadog Bits requires Pro or Enterprise plans. Datadog Infrastructure Pro starts at $15 per infra host per month (billed annually) and Enterprise at $23 per infra host per month, plus Bits AI paid add-on costs, with a 14-day free trial available. Struct offers tiered pricing with a Startup plan for up to five users and 30 issues monthly, a Growth plan supporting unlimited users and 200 issues monthly, and Enterprise plans with custom pricing. Both platforms deliver comparable value for mid-scale engineering teams, with differences driven mainly by setup effort and workflow style.
Can these tools integrate with GitHub for code correlation during incidents?
Both platforms integrate with GitHub for code correlation during incidents. Datadog Bits connects with GitHub through team synchronization and code correlation features, which enables automatic attribution of incidents to recent deployments or code changes. Struct provides native GitHub integration for correlating alerts with recent commits, pull requests, and deployment events. Both tools can suggest fixes and create pull requests for remediation, while Struct’s integration usually requires less manual configuration.
Conclusion: Choosing Between Bits and Struct
Datadog Bits AI gives teams a strong foundation for reactive SRE automation, allowing engineers to investigate alerts through conversational interfaces and automated workflows. The 2026 updates improve investigation speed and expand data correlation across the stack. Teams that want proactive incident prevention and consistent sub-five-minute response times should also evaluate Struct’s automated approach. Struct transforms SRE operations with about 80% faster triage and removes many reactive limitations that come from relying solely on alert-driven tools.