Salesforce Launches Agentforce Observability for Real-Time AI Agent Monitoring and Transparency

Salesforce has introduced a suite of observability tools designed to address one of the most pressing challenges in enterprise artificial intelligence: understanding how AI agents make decisions in real-world customer interactions. This initiative, part of Salesforce’s Agentforce 360 Platform, aims to give organizations unprecedented visibility into the actions, reasoning steps, and guardrails that govern their AI agents’ behavior. As businesses increasingly adopt AI technologies, the need for transparency and control over these autonomous systems has never been more critical.

The launch of these observability tools comes at a time when companies are grappling with the dual promise and peril of AI adoption. While the technology offers significant efficiency gains, executives often find themselves hesitant to fully embrace autonomous systems that operate beyond their comprehension. Adam Evans, executive vice president and general manager of Salesforce AI, succinctly captured this sentiment, stating, “You can’t scale what you can’t see.” With a reported 282% increase in AI implementation among businesses, the urgency for robust monitoring systems capable of tracking fleets of AI agents making real-time business decisions is palpable.

At the heart of Salesforce’s observability offering is the recognition that while AI agents can effectively perform tasks—such as resolving customer inquiries or scheduling appointments—the rationale behind their decisions often remains opaque. For instance, a customer service bot may successfully answer a complex tax question, but the organization deploying it may struggle to trace the reasoning path that led to that outcome. This lack of diagnostic tools becomes particularly problematic when an agent encounters an edge case or fails to deliver the expected results.

To combat this issue, Salesforce has developed what it describes as a “mission control system” through its Agentforce Observability tools. Gary Lerhaupt, vice president of Salesforce AI and leader of the company’s observability initiatives, emphasized that the system not only monitors agent performance but also analyzes and optimizes it. The observability tools provide business-specific metrics that traditional monitoring solutions often overlook. For example, in a customer service context, relevant metrics might include engagement rates or deflection rates, while in sales, they could include leads assigned, leads converted, or reply rates.
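Salesforce has not published how these metrics are computed internally, but the two customer-service examples the article mentions are straightforward to express. The sketch below is a hypothetical illustration: the `Interaction` record and its fields are invented for the example, not part of any Agentforce API.

```python
from dataclasses import dataclass

@dataclass
class Interaction:
    """One logged agent interaction (hypothetical, simplified schema)."""
    escalated_to_human: bool  # True if the agent handed the case to a person
    user_replied: bool        # True if the customer engaged with the agent

def deflection_rate(interactions: list[Interaction]) -> float:
    """Share of cases the agent resolved without human escalation."""
    if not interactions:
        return 0.0
    deflected = sum(1 for i in interactions if not i.escalated_to_human)
    return deflected / len(interactions)

def engagement_rate(interactions: list[Interaction]) -> float:
    """Share of sessions where the customer actively engaged."""
    if not interactions:
        return 0.0
    engaged = sum(1 for i in interactions if i.user_replied)
    return engaged / len(interactions)

logs = [
    Interaction(escalated_to_human=False, user_replied=True),
    Interaction(escalated_to_human=True, user_replied=True),
    Interaction(escalated_to_human=False, user_replied=False),
    Interaction(escalated_to_human=False, user_replied=True),
]
print(f"deflection: {deflection_rate(logs):.0%}")  # 3 of 4 → 75%
print(f"engagement: {engagement_rate(logs):.0%}")  # 3 of 4 → 75%
```

A figure such as Reddit’s reported 46% deflection of advertiser support cases would be the first of these ratios computed over all logged cases.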

The practical implications of these tools are already evident in early customer deployments. Ryan Teeples, chief technology officer at 1-800Accountant, shared insights from his company’s experience with Agentforce agents, which serve as a 24/7 digital workforce handling intricate tax inquiries and appointment scheduling. The AI agents leverage integrated data from various sources, including audit logs and customer support history, to deliver instant responses without human intervention. For a financial services firm dealing with sensitive tax information, the ability to monitor AI decision-making is crucial. Teeples noted, “With this level of sensitive information and the fast pace in which we move during tax season in particular, Observability allows us to have full trust and transparency with every agent interaction in one unified view.”

The observability tools have provided unexpected insights for 1-800Accountant. Teeples remarked that the optimization feature has been particularly enlightening, offering full visibility into agent reasoning, identifying performance gaps, and revealing how decisions are made. This capability has enabled the company to quickly diagnose issues that might have otherwise gone unnoticed and to configure appropriate guardrails in response. The business impact has been substantial; within the first 24 hours of deployment, Agentforce resolved over 1,000 client engagements. The company now anticipates a 40% growth in client volume this year without the need to recruit and train additional seasonal staff, thereby freeing up 50% more time for CPAs to focus on complex advisory work rather than administrative tasks.

Similarly, Reddit has experienced positive outcomes since implementing the observability technology. John Thompson, vice president of sales strategy and operations at Reddit, reported that the company has successfully deflected 46% of advertiser support cases since launching Agentforce for advertiser support. By observing every interaction facilitated by Agentforce, Reddit has gained valuable insights into how its AI navigates advertisers through complex tools. Thompson stated, “This insight helps us understand not just whether issues are resolved, but how decisions are made along the way.”

Salesforce’s observability system is built on two foundational components: the Session Tracing Data Model and MuleSoft Agent Fabric. The Session Tracing Data Model meticulously logs every interaction, encompassing user inputs, agent responses, reasoning steps, language model calls, and guardrail checks. This data is securely stored in Data 360, Salesforce’s comprehensive data platform, creating what the company refers to as “unified visibility” into agent behavior at the session level.
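Salesforce has not published the Session Tracing Data Model’s schema, but the article names the step types it captures. The sketch below is a minimal hypothetical rendering of a session-level trace built from just those fields; the class and method names are invented for illustration and do not correspond to any actual Agentforce or Data 360 API.

```python
from dataclasses import dataclass, field
from typing import Literal

# Step types named in the article: user inputs, agent responses,
# reasoning steps, language model calls, and guardrail checks.
StepType = Literal["user_input", "reasoning", "llm_call",
                   "guardrail_check", "agent_response"]

@dataclass
class TraceStep:
    step_type: StepType
    detail: str

@dataclass
class SessionTrace:
    """All steps of one agent session, logged in order."""
    session_id: str
    steps: list[TraceStep] = field(default_factory=list)

    def log(self, step_type: StepType, detail: str) -> None:
        self.steps.append(TraceStep(step_type, detail))

    def guardrail_checks(self) -> list[TraceStep]:
        """Filter the trace down to its guardrail evaluations."""
        return [s for s in self.steps if s.step_type == "guardrail_check"]

trace = SessionTrace("sess-001")
trace.log("user_input", "Can I deduct my home office?")
trace.log("reasoning", "Classify topic: tax deduction question")
trace.log("llm_call", "model call, 412 tokens")
trace.log("guardrail_check", "PII filter: passed")
trace.log("agent_response", "Home office deductions depend on ...")
print(len(trace.steps), len(trace.guardrail_checks()))  # → 5 1
```

Storing every session in this shape is what would make the “unified visibility” the article describes possible: any interaction can be replayed step by step, including which guardrails fired.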

The second component, MuleSoft Agent Fabric, addresses a growing concern as organizations expand their AI systems: agent sprawl. This tool provides what Lerhaupt describes as “a single pane of glass across every agent,” including those developed outside the Salesforce ecosystem. The Agent Visualizer feature within Agent Fabric creates a visual map of a company’s entire agent network, allowing for visibility across all agent interactions from a centralized dashboard.

The observability tools are categorized into three functional areas. First, Agent Analytics tracks performance metrics, surfaces key performance indicator (KPI) trends over time, and highlights ineffective topics or actions. Second, Agent Optimization delivers end-to-end visibility of every interaction, groups similar requests to uncover patterns, and identifies configuration issues. Finally, Agent Health Monitoring, which is set to become generally available in Spring 2026, will track key health metrics in near real-time and send alerts regarding critical errors and latency spikes.
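The third area, near-real-time alerting on errors and latency spikes, can be illustrated with a simple sliding-window monitor. This is a generic sketch of the technique, not Salesforce’s implementation; the thresholds and class name are invented for the example.

```python
from collections import deque
from statistics import mean

class HealthMonitor:
    """Sliding-window health check that alerts on error-rate or
    latency spikes. Thresholds here are illustrative defaults only."""

    def __init__(self, window: int = 100, max_error_rate: float = 0.05,
                 max_avg_latency_ms: float = 2000.0):
        self.samples: deque = deque(maxlen=window)  # (latency_ms, is_error)
        self.max_error_rate = max_error_rate
        self.max_avg_latency_ms = max_avg_latency_ms

    def record(self, latency_ms: float, is_error: bool) -> list[str]:
        """Log one interaction; return any alerts the window now triggers."""
        self.samples.append((latency_ms, is_error))
        alerts = []
        error_rate = sum(e for _, e in self.samples) / len(self.samples)
        avg_latency = mean(l for l, _ in self.samples)
        if error_rate > self.max_error_rate:
            alerts.append(f"error rate {error_rate:.1%} exceeds threshold")
        if avg_latency > self.max_avg_latency_ms:
            alerts.append(f"avg latency {avg_latency:.0f}ms exceeds threshold")
        return alerts

mon = HealthMonitor(window=10, max_error_rate=0.2)
for _ in range(4):
    mon.record(150.0, is_error=False)
mon.record(150.0, is_error=True)            # 1/5 = 20%, at threshold: quiet
alerts = mon.record(150.0, is_error=True)   # 2/6 ≈ 33%: error-rate alert
```

A bounded window like this is the standard trade-off for near-real-time monitoring: alerts reflect recent behavior rather than all-time averages, so a drifting agent surfaces quickly.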

Pierre Matuchet, senior vice president of IT and digital transformation at Adecco, shared how the visibility provided by these tools helped his team build confidence even before full deployment. He noted that during early notebook testing, the agent was able to handle unexpected scenarios—such as candidates not wanting to answer questions already covered in their CVs—appropriately and as designed. Matuchet stated, “Agentforce Observability helped us identify unanticipated user behavior and gave us confidence, even before the agent went live, that it could act responsibly and reliably.”

The introduction of Salesforce’s AI observability tools positions the company in direct competition with major players like Microsoft, Google, and Amazon Web Services, all of which offer monitoring capabilities integrated into their AI agent platforms. Lerhaupt contended that enterprises require more than the basic monitoring solutions provided by these competitors. He asserted, “Observability comes out-of-the-box standard with Agentforce at no extra cost,” framing the offering as comprehensive rather than supplementary. The tools are designed to provide “deeper insight than ever before” by capturing the full telemetry and reasoning behind every agent interaction through the Session Tracing Data Model. This data is then utilized to deliver key analysis and session quality scoring, enabling customers to optimize and enhance their agents.

The competitive landscape is significant because enterprises face a critical decision: should they build their AI infrastructure on a cloud provider’s platform and utilize its native monitoring tools, or should they adopt a specialized observability layer like Salesforce’s? Lerhaupt framed this choice as one of depth versus breadth, emphasizing that “enterprises need more than basic monitoring to measure the success of their AI deployments.” Full visibility into every agent interaction and decision is essential for effective management and optimization.

A broader question arises regarding whether Salesforce is addressing an imminent challenge faced by most enterprises or building for a future that remains years away. While the company’s reported 282% surge in AI implementation sounds impressive, it does not differentiate between production deployments and pilot projects. When pressed for clarity on this point, Lerhaupt referenced customer examples rather than providing specific breakdowns. He described a three-phase journey from experimentation to scale: Day 0 focuses on establishing trust, Day 1 involves transforming ideas into usable AI, and Day 2 centers on scaling early successes into enterprise-wide outcomes.

Salesforce claims to have over 12,000 customers across 39 countries utilizing Agentforce, collectively powering 1.2 billion agent workflows. These figures suggest that the transition from pilot projects to production is already underway at scale, although the company has not disclosed how many customers are running production workloads versus experimental deployments.

The economic pressures surrounding AI deployment may further accelerate adoption, regardless of organizational readiness. Companies are increasingly compelled to reduce headcount costs while maintaining or improving service levels. AI agents present a potential solution to this dilemma, but only if businesses can cultivate trust in their reliability. Observability tools like those offered by Salesforce represent the critical trust layer necessary for successful scaled deployment.

The narrative surrounding AI deployment is evolving. Traditionally, organizations viewed the agent development lifecycle as a linear process consisting of three foundational steps: build, test, and deploy. However, many organizations have moved past the initial hurdle of creating their first agents, and the real challenge begins immediately after deployment. This shift reflects a growing understanding of AI in production environments. Unlike traditional software, AI agents learn, adapt, and make decisions based on probabilistic models rather than deterministic code. Consequently, their behavior can drift over time, and unexpected failure modes may emerge under real-world conditions.

Lerhaupt articulated this perspective, stating, “Building an agent is just the beginning.” Once trust is established for agents to handle real work, companies may observe results but may not fully comprehend the underlying reasons or identify areas for optimization. Customers interact with products—including agents—in unpredictable ways, making transparency around agent behavior and outcomes critical for optimizing the customer experience.

Teeples from 1-800Accountant echoed this sentiment, emphasizing the importance of visibility. He remarked, “This level of visibility has given full trust in continuing to expand our agent deployment.” Without such visibility, deployment efforts would likely slow or cease altogether. 1-800Accountant plans to expand its use of Slack integrations for internal workflows, deploy Service Cloud Voice for case deflection, and leverage Tableau for conversational analytics—all contingent upon the confidence that observability provides.

A recurring theme in customer interviews is the issue of trust—or, more accurately, the lack thereof. While AI agents can perform tasks effectively, executives often hesitate to deploy them widely due to concerns about their reliability. Observability tools aim to transform black-box systems into transparent ones, replacing faith with evidence. This shift is crucial because trust is the primary bottleneck constraining AI adoption, rather than technological capability.