Deductive AI Automates Software Debugging, Saving DoorDash Over 1,000 Engineering Hours

As software systems grow more complex, debugging them has become harder. Engineers now spend a significant portion of their time, up to 50%, hunting down the causes of software failures instead of building new features and products. This has led to the emergence of a new category of tools designed to ease that burden: AI agents that can diagnose production failures in minutes rather than hours.

One such tool comes from Deductive AI, a startup that recently emerged from stealth with the aim of changing how software incidents are diagnosed and resolved. With $7.5 million in seed funding led by CRV and backed by investors including Databricks Ventures, Thomvest Ventures, and PrimeSet, Deductive AI plans to commercialize its approach to incident management through what it calls “AI SRE agents.” These agents use reinforcement learning, a technique commonly associated with game-playing AI, to navigate production software incidents.

The frustration within engineering organizations is palpable. Traditional observability tools can show that something has gone wrong, but they often fail to explain why. When a production system goes down at an inconvenient hour, engineers face hours of manual detective work, cross-referencing logs, metrics, deployment histories, and code changes across numerous interconnected services to pinpoint the root cause.

Sameer Agarwal, co-founder and chief technology officer of Deductive AI, articulated this challenge succinctly: “The complexities and inter-dependencies of modern infrastructure mean that investigating the root cause of an outage or incident can feel like searching for a needle in a haystack, except the haystack is the size of a football field, it’s made of a million other needles, it’s constantly reshuffling itself, and is on fire—and every second you don’t find it equals lost revenue.”

To address this pressing issue, Deductive AI has developed a system that constructs what the company refers to as a “knowledge graph.” This graph maps relationships across various elements, including codebases, telemetry data, engineering discussions, and internal documentation. When an incident occurs, multiple AI agents collaborate to form hypotheses, test them against live system evidence, and converge on a root cause. This investigative workflow mimics that of experienced site reliability engineers (SREs) but operates at machine speed, significantly reducing the time required to diagnose issues.
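
As a rough illustration of the idea, a knowledge graph of this kind can be thought of as a set of labeled edges linking a service to its code, telemetry, discussions, and documentation. The sketch below is a toy under stated assumptions, not Deductive's implementation: the `KnowledgeGraph` class, the relation labels, and every node name are hypothetical.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Toy labeled graph: each edge is (source, relation, target)."""

    def __init__(self):
        self._edges = defaultdict(list)  # source -> [(relation, target)]

    def add(self, source, relation, target):
        self._edges[source].append((relation, target))

    def related(self, node):
        """Group everything directly linked to `node` by relation label."""
        grouped = defaultdict(list)
        for relation, target in self._edges.get(node, []):
            grouped[relation].append(target)
        return dict(grouped)


# Hypothetical nodes; real systems would ingest these from APIs.
graph = KnowledgeGraph()
graph.add("ads-api", "calls", "auction-service")
graph.add("ads-api", "defined_in", "repo:ads-api")
graph.add("ads-api", "emits", "metric:ads_api_latency_ms")
graph.add("ads-api", "discussed_in", "chat:#ads-oncall")
graph.add("ads-api", "documented_in", "runbook:ads-latency")

# Context an investigating agent could start from for one service.
context = graph.related("ads-api")
```

Keeping code, telemetry, chat, and documentation in one structure is what lets an investigation move from an alert on a metric to the repository and runbook that explain it.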

The impact of Deductive AI’s technology has already been demonstrated in some of the most demanding production environments. For instance, DoorDash, a leading food delivery service, has integrated Deductive into its incident response workflow for its advertising platform, which relies on real-time auctions that must be completed in under 100 milliseconds. The company has set an ambitious goal of resolving production incidents within 10 minutes by 2026.

Shahrooz Ansari, Senior Director of Engineering at DoorDash, emphasized the critical role Deductive plays in their operations: “Our Ads Platform operates at a pace where manual, slow-moving investigations are no longer viable. Every minute of downtime directly affects company revenue. Deductive has become a critical extension of our team, rapidly synthesizing signals across dozens of services and surfacing the insights that matter—within minutes.”

DoorDash estimates that Deductive has root-caused approximately 100 production incidents over the past few months, translating to more than 1,000 hours of engineering time saved annually and a revenue impact measured in millions of dollars. At Foursquare, another early adopter, Deductive has cut the time required to diagnose Apache Spark job failures by 90%. Diagnoses that once took hours or even days now take under 10 minutes, with annual savings exceeding $275,000.

The timing of Deductive’s launch reflects a shift in how software is written. AI coding assistants let engineers generate code at unprecedented speed, but the resulting software is often more complex and harder to maintain. The practice of generating code from natural-language prompts, which AI researcher Andrej Karpathy dubbed “vibe coding,” boosts productivity but can introduce redundancies, break architectural boundaries, and bake in assumptions that accumulate over time.

Agarwal noted, “Most AI-generated code still introduces redundancies, breaks architectural boundaries, makes assumptions, or ignores established design patterns. In many ways, we now need AI to help clean up the mess that AI itself is creating.” The claim that engineers spend a substantial share of their time debugging is supported by research from the Association for Computing Machinery, which indicates that developers spend between 35% and 50% of their time validating and debugging software. Harness’s State of Software Delivery 2025 report found that 67% of developers are spending more time debugging AI-generated code.

Deductive AI’s technical approach sets it apart from existing observability platforms such as Datadog and New Relic. While many of these systems utilize large language models to summarize data or identify correlations, they often lack what Agarwal describes as “code-aware reasoning.” This capability enables Deductive’s system to understand not only that something has broken but also why the code behaves in a particular manner.

Agarwal explained, “Most enterprises use multiple observability tools across different teams and services, so no vendor has a single holistic view of how their systems behave, fail, and recover—nor are they able to pair that with an understanding of the code that defines system behavior. These are key ingredients to resolving software incidents, and it is exactly the gap Deductive fills.”

The Deductive system connects to existing infrastructure using read-only API access to observability platforms, code repositories, incident management tools, and chat systems. It continuously builds and updates its knowledge graph, mapping dependencies between services and tracking deployment histories. When an alert is triggered, Deductive initiates what it describes as a multi-agent investigation. Different agents specialize in various aspects of the problem: one may analyze recent code changes, another examines trace data, while a third correlates the timing of the incident with recent deployments. The agents share findings and iteratively refine their hypotheses.
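
The division of labor described above can be sketched in a few lines. This is a minimal illustration, not Deductive's system: the three agent functions, their fixed confidence values, the incident fields, and the noisy-OR rule used to merge findings are all assumptions chosen to keep the example small.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Finding:
    agent: str
    suspect: str       # service the agent suspects
    confidence: float  # agent's own estimate, 0.0 to 1.0


# Hypothetical specialist agents; each inspects one kind of evidence.
def code_change_agent(incident):
    return [Finding("code", c, 0.4) for c in incident["recent_commits"]]


def trace_agent(incident):
    return [Finding("trace", s, 0.5) for s in incident["slow_spans"]]


def deploy_agent(incident):
    return [Finding("deploy", d, 0.6) for d in incident["recent_deploys"]]


def investigate(incident, agents):
    """Pool findings and rank suspects with a noisy-OR combination."""
    miss = defaultdict(lambda: 1.0)  # probability every agent is wrong
    for agent in agents:
        for finding in agent(incident):
            miss[finding.suspect] *= 1.0 - finding.confidence
    scores = {suspect: 1.0 - p for suspect, p in miss.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


incident = {
    "recent_commits": ["ml-ranking"],
    "slow_spans": ["ml-ranking", "auction-service"],
    "recent_deploys": ["ml-ranking"],
}
ranking = investigate(incident, [code_change_agent, trace_agent, deploy_agent])
top_suspect, top_score = ranking[0]
```

Here `ml-ranking` ranks first because three independent lines of evidence point at it, while `auction-service` is backed only by trace data. A production system would presumably iterate, feeding the ranking back to the agents for another round of evidence gathering.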

A distinguishing feature of Deductive’s approach is its use of reinforcement learning. The system learns from each incident, identifying which investigative steps led to correct diagnoses and which were dead ends. When engineers provide feedback, the system incorporates that signal into its learning model. Agarwal elaborated, “Each time it observes an investigation, it learns which steps, data sources, and decisions led to the right outcome. It learns how to think through problems, not just point them out.”
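
The article does not say which reinforcement learning method Deductive uses. As a stand-in, the sketch below uses an epsilon-greedy multi-armed bandit, one of the simplest forms of learning from reward, to show how engineer feedback could shift which investigative step is tried first. The `StepSelector` class, the step names, and the 0/1 reward scheme are all hypothetical.

```python
import random


class StepSelector:
    """Epsilon-greedy bandit over investigative steps."""

    def __init__(self, steps, epsilon=0.1, seed=0):
        self.values = {step: 0.0 for step in steps}  # running mean reward
        self.counts = {step: 0 for step in steps}
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def choose(self):
        # Explore occasionally; otherwise pick the best step so far.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.values))
        return max(self.values, key=self.values.get)

    def record(self, step, reward):
        # Incremental mean: new = old + (reward - old) / n
        self.counts[step] += 1
        self.values[step] += (reward - self.values[step]) / self.counts[step]


selector = StepSelector(
    ["check_deploys", "read_traces", "diff_commits"], epsilon=0.0
)
# Reward 1.0 when an engineer confirms the step led to the right diagnosis,
# 0.0 when it was a dead end.
selector.record("check_deploys", 1.0)
selector.record("check_deploys", 0.0)
selector.record("read_traces", 1.0)
next_step = selector.choose()
```

With exploration disabled (`epsilon=0.0`), the selector prefers `read_traces`, whose average reward is now the highest. A real system would learn over sequences of steps and richer context rather than a single choice.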

For example, a recent latency spike in an API at DoorDash initially appeared to be an isolated service issue. However, Deductive’s investigation revealed that the root cause was timeout errors from a downstream machine learning platform undergoing a deployment. By analyzing log volumes, traces, and deployment metadata across multiple services, Deductive connected the dots and provided a comprehensive explanation of the issue. Ansari remarked, “Without Deductive, our team would have had to manually correlate the latency spike across all logs, traces, and deployment histories. Deductive was able to explain not just what changed, but how and why it impacted production behavior.”
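
The kind of correlation Ansari describes can be approximated by walking a service's downstream dependencies and checking which of them were deployed shortly before the spike. The sketch below is illustrative only: the service names, timestamps, call graph, and 30-minute window are invented for the example and are not taken from the DoorDash incident.

```python
from datetime import datetime, timedelta

# Hypothetical call graph: service -> services it depends on.
CALLS = {
    "ads-api": ["auction-service"],
    "auction-service": ["ml-platform"],
    "ml-platform": [],
}

# Hypothetical deployment log: (service, deployed_at).
DEPLOYS = [
    ("ml-platform", datetime(2025, 1, 1, 2, 50)),
    ("billing", datetime(2025, 1, 1, 2, 55)),
    ("auction-service", datetime(2024, 12, 31, 9, 0)),
]


def downstream(service, calls):
    """Every service reachable from `service` through the call graph."""
    seen, stack = set(), [service]
    while stack:
        for dep in calls.get(stack.pop(), []):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def suspect_deploys(service, spike_at, calls, deploys,
                    window=timedelta(minutes=30)):
    """Deploys to the service or its dependencies just before the spike."""
    candidates = downstream(service, calls) | {service}
    hits = [
        (svc, ts) for svc, ts in deploys
        if svc in candidates and timedelta(0) <= spike_at - ts <= window
    ]
    # Most recent deploy first.
    return sorted(hits, key=lambda hit: spike_at - hit[1])


spike = datetime(2025, 1, 1, 3, 0)
suspects = suspect_deploys("ads-api", spike, CALLS, DEPLOYS)
```

The `billing` deploy is ignored because `ads-api` does not depend on it, and the older `auction-service` deploy falls outside the window, leaving only the downstream `ml-platform` deploy as a suspect.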

While Deductive’s technology has the potential to automate fixes directly in production systems, the company has intentionally chosen to keep humans in the loop—for now. Agarwal stated, “While our system is capable of deeper automation and could push fixes to production, currently, we recommend precise fixes and mitigations that engineers can review, validate, and apply. We believe maintaining a human in the loop is essential for trust, transparency, and operational safety.” However, he acknowledged that over time, deeper automation will likely evolve, changing how humans interact with the system.

The founding team behind Deductive AI has extensive experience building some of Silicon Valley’s most successful data infrastructure platforms. Agarwal earned his Ph.D. at UC Berkeley, where he created BlinkDB, an influential system for approximate query processing. He was among the first engineers at Databricks, where he contributed to the development of Apache Spark. Rakesh Kothari, Deductive’s co-founder and CEO, was an early engineer at ThoughtSpot, where he led teams focused on distributed query processing and large-scale system optimization.

The investor syndicate backing Deductive reflects both the technical credibility of the team and the market opportunity presented by their solution. Notable figures such as Ion Stoica, co-founder of Databricks and Anyscale; Ajeet Singh, co-founder of Nutanix and ThoughtSpot; and Ben Sigelman, co-founder of Lightstep, have recognized the potential of Deductive’s approach.

Rather than positioning itself as a competitor to platforms like Datadog or PagerDuty, Deductive aims to serve as a complementary layer that enhances existing tools. Its pricing is based on the number of incidents investigated plus a base platform fee, rather than on data volume, so costs track how often the system is used rather than how much telemetry an organization generates.

Deductive offers both cloud-hosted and self-hosted deployment options, emphasizing that it does not store customer data on its servers or use it to train models for other customers. This assurance is particularly important given the proprietary nature of both code and production system behavior.

With fresh capital and early traction among customers such as DoorDash, Foursquare, and Kumo AI, Deductive plans to expand its team and deepen the system’s reasoning capabilities. The company’s near-term vision includes transitioning from reactive incident analysis to proactive prevention, helping teams predict problems before they occur.

In an industry where every second of downtime translates to lost revenue, the shift from firefighting to building is becoming increasingly essential. As Shahrooz Ansari from DoorDash aptly put it, “Investigations that were previously manual and time-consuming are now automated, allowing engineers to shift their energy toward prevention, business