Artificial intelligence is often sold to governments as a straightforward efficiency upgrade: give civil servants better tools, automate routine work, and watch processing times shrink. Yet the real-world arithmetic of public-sector productivity may be more complicated. A growing body of reporting and early implementation experience suggests that any gains achieved inside agencies can be partially erased by the way citizens, businesses, and frontline users interact with AI-enabled services—especially when those services are new, imperfect, or require people to “work the system” in unfamiliar ways.
The catch is not that AI fails to help. It’s that efficiency is not only an internal metric. In public administration, the service chain includes the public at every step: submitting forms, providing evidence, answering questions, correcting mistakes, appealing decisions, and navigating portals when something goes wrong. If AI changes how those interactions happen—by shifting effort from staff to users, increasing the number of steps, or introducing new kinds of friction—then the net effect on overall throughput can disappoint.
This tension is emerging as governments move from pilots to scaled deployments, particularly in areas where AI is used to triage cases, draft responses, translate documents, guide applicants through eligibility checks, or summarize complex information for decision-makers. The promise is compelling: faster turnaround, more consistent communication, and reduced backlog pressure. But the public’s experience becomes the deciding factor. When AI is placed between a citizen and a government outcome, it can either reduce the burden of interaction—or quietly increase it.
To understand why, it helps to look at what “productivity” means in the public sector. Unlike many private-sector settings, public services are constrained by legal requirements, audit trails, and the need for fairness. Even when AI accelerates drafting or classification, agencies still must verify facts, ensure compliance, and handle exceptions. That verification work doesn’t disappear; it often moves. And when AI is used to streamline front-end processes, it can also shift responsibility for correct inputs onto the user.
In practice, the public’s use of AI-enabled services can offset internal gains through five mechanisms.
First, there is the “interaction tax.” Many AI tools are designed to be conversational or guided, which sounds user-friendly but can create additional steps when the tool misunderstands a request. A citizen who would previously have filled out a structured form might now be asked a series of AI-generated questions. If the system misinterprets key details—income type, dates, residency status, document categories—the user may need to backtrack, rephrase, or upload additional evidence. Each correction consumes time and attention, and it can be harder to diagnose why the system is confused than it is to see a missing field on a traditional form.
Second, there is the “re-submission loop.” Public-sector workflows already include error handling: incomplete applications, missing attachments, mismatched identifiers, and eligibility disputes. AI can reduce some errors by validating inputs or suggesting missing information. But if AI introduces new failure modes—such as generating incorrect summaries, misclassifying a case, or producing a response that triggers further clarification—then the number of re-submissions can rise. The result is a paradox: agencies may process cases faster once they reach the back office, but the overall pipeline slows because more cases cycle through corrections before they are considered complete.
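The paradox can be made concrete with a back-of-the-envelope model. The Python sketch below assumes that each submission is independently sent back for correction with a fixed probability p, so the expected number of submissions is 1 / (1 - p). The function names and every figure are hypothetical, chosen only to illustrate the arithmetic, and real rework patterns are unlikely to be this tidy.

```python
def expected_submissions(p_rework: float) -> float:
    """Expected submissions until one is accepted as complete.

    Assumes each submission is independently returned for correction
    with probability p_rework (a simple geometric model).
    """
    if not 0.0 <= p_rework < 1.0:
        raise ValueError("p_rework must be in [0, 1)")
    return 1.0 / (1.0 - p_rework)


def days_to_decision(p_rework: float, days_per_cycle: float,
                     back_office_days: float) -> float:
    """Expected days from first contact to decision.

    days_per_cycle is the time one submit-and-check round takes;
    back_office_days is the processing time once the case is complete.
    """
    return expected_submissions(p_rework) * days_per_cycle + back_office_days


# Hypothetical "before AI" pipeline: few returns, slower back office.
before = days_to_decision(p_rework=0.2, days_per_cycle=5, back_office_days=10)

# Hypothetical "after AI" pipeline: faster back office, more returns.
after = days_to_decision(p_rework=0.5, days_per_cycle=5, back_office_days=7)

print(f"before: {before:.2f} days, after: {after:.2f} days")
```

With these illustrative numbers, a 30 percent cut in back-office time is outweighed by the extra correction rounds, and the expected time to a decision rises from roughly 16 days to 17.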
Third, there is the “behavioral shift” problem. When AI changes the way people interact, it can change the distribution of work. For example, if an AI assistant encourages users to describe their situation in natural language rather than selecting predefined options, the agency may receive richer context—but also more variability. That variability can increase the burden on staff to interpret, verify, or reconcile information. In other words, the work doesn’t vanish; it changes shape. The same is true when AI tools allow users to ask follow-up questions. Those questions can be helpful, but they can also lead to misunderstandings about what the government will actually do, prompting additional inquiries or appeals.
Fourth, there is the “access and capability gap.” AI-enabled services can widen disparities if they assume a level of digital literacy, language proficiency, or comfort with interactive systems that not all users have. Governments often aim for accessibility improvements, but AI interfaces can be less predictable than static forms. A person with limited internet access, a disability affecting speech or reading, or limited familiarity with conversational tools may struggle more than they would with a well-designed, structured application. If the service becomes harder for some groups, the agency may need to provide alternative channels—call centers, in-person support, manual assistance—which can offset internal productivity gains.
Fifth, there is the “trust and transparency” dimension. Public services operate under scrutiny. When AI is involved, citizens may question outcomes more frequently, especially if explanations are unclear. If an AI system provides a recommendation or draft response without a sufficiently understandable rationale, users may seek human review. That human review is not necessarily slow, but it can add volume. And if the AI’s behavior is inconsistent—sometimes helpful, sometimes confusing—users may treat it as unreliable, increasing the likelihood they will abandon the tool and switch to other channels, including staff-intensive ones.
These dynamics are not theoretical. They show up in early deployments where AI is used for document intake, eligibility screening, and customer-service triage. In such systems, the public’s experience is shaped by the quality of the AI’s guidance and the clarity of the underlying process. If the AI is accurate but the workflow is poorly designed, users can still get stuck. If the workflow is well designed but the AI is brittle, users can still experience frustration. Efficiency depends on both.
One underappreciated point in the current debate is that the public’s use of AI is not simply a cost center; it can also be a source of data and leverage. When citizens interact effectively with AI tools (providing structured information, uploading relevant documents, and following guided steps), agencies can benefit from cleaner inputs and fewer downstream corrections. But that benefit requires careful design. The system must make it easy for users to provide what the government needs, not just what the AI can interpret. It must also handle uncertainty gracefully, offering clear next steps when it cannot confidently proceed.
This is where the “efficiency equation” becomes more nuanced than a simple comparison of staff hours saved. Consider two scenarios.
In the first scenario, AI drafts letters and summarizes case files for staff. The public interacts with the government through familiar channels—forms, portals, and standard communications. Staff productivity improves, and the public’s workload remains stable. Net efficiency rises.
In the second scenario, AI is embedded directly into the public-facing interface. Citizens use an AI assistant to determine eligibility, generate application narratives, and pre-check required documents. If the assistant works well, it can reduce user effort and improve completeness, leading to fewer back-and-forth cycles. Net efficiency rises even more.
But if the assistant is inaccurate or the workflow is confusing, the outcome of the second scenario can reverse. Users may spend more time correcting misunderstandings, re-uploading documents, or seeking human help. Even if staff later process cases quickly, the total time from initial contact to final resolution can increase. The agency’s internal productivity gains are then partially offset by the public’s increased interaction burden.
This is why policymakers and service designers increasingly talk about “end-to-end” efficiency rather than isolated automation metrics. It’s also why procurement and evaluation frameworks are shifting toward measuring user effort, not just administrative throughput. Governments are beginning to ask questions like: How many steps does a typical applicant take? How often do users abandon the process? What proportion of cases require manual correction? How long does it take to reach a decision from first contact? How many users need assistance, and at what points?
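Those questions map onto a handful of figures that can be computed from ordinary case records. The Python sketch below shows one possible way to do it. The `ApplicationRecord` fields and the `end_to_end_metrics` function are hypothetical names invented for this illustration, not part of any real government system, and the sample records are made up.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List, Optional


@dataclass
class ApplicationRecord:
    """One applicant's journey (hypothetical schema for illustration)."""
    steps_taken: int                 # screens, uploads, or prompts completed
    abandoned: bool                  # applicant gave up before submitting
    needed_manual_correction: bool   # staff had to fix or re-request data
    needed_assistance: bool          # applicant used a phone or in-person channel
    days_to_decision: Optional[float]  # None if abandoned or still open


def end_to_end_metrics(records: List[ApplicationRecord]) -> Dict[str, float]:
    """Summarize user effort alongside throughput."""
    if not records:
        raise ValueError("no records to summarize")
    n = len(records)
    decided = [r.days_to_decision for r in records
               if r.days_to_decision is not None]
    return {
        "median_steps": median([r.steps_taken for r in records]),
        "abandonment_rate": sum(r.abandoned for r in records) / n,
        "manual_correction_rate":
            sum(r.needed_manual_correction for r in records) / n,
        "assistance_rate": sum(r.needed_assistance for r in records) / n,
        # Median over decided cases only; NaN if none have been decided.
        "median_days_to_decision":
            median(decided) if decided else float("nan"),
    }


sample = [
    ApplicationRecord(5, False, False, False, 10.0),
    ApplicationRecord(9, False, True, True, 20.0),
    ApplicationRecord(3, True, False, True, None),
    ApplicationRecord(7, False, False, False, 12.0),
]
metrics = end_to_end_metrics(sample)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

Tracking these figures before and after a deployment, rather than staff hours alone, is one way to see whether effort has been reduced or merely moved onto applicants.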
Another factor complicating the picture is that AI can change the volume of interactions. If AI makes it easier to start a request—because the interface feels more approachable—more people may apply, ask questions, or submit preliminary inquiries. That can be good for access, but it can also increase demand faster than agencies can scale capacity. In that case, productivity gains may not translate into shorter queues because the queue grows. This is not an AI-specific issue, but AI can accelerate demand by lowering the barrier to entry.
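A toy calculation shows how this can play out. The Python sketch below assumes constant weekly arrivals and constant weekly processing capacity, which is a deliberate simplification, and all the numbers and the `simulate_backlog` name are hypothetical.

```python
from typing import List


def simulate_backlog(initial_backlog: int, weekly_arrivals: int,
                     weekly_capacity: int, weeks: int) -> List[int]:
    """Return the backlog at the end of each week.

    Each week the agency receives weekly_arrivals new cases and can
    close at most weekly_capacity cases. The backlog cannot go negative.
    """
    backlog = initial_backlog
    history = []
    for _ in range(weeks):
        backlog = max(0, backlog + weekly_arrivals - weekly_capacity)
        history.append(backlog)
    return history


# Capacity rises 20 percent (100 to 120 cases a week) and demand is unchanged.
gains_only = simulate_backlog(50, weekly_arrivals=100,
                              weekly_capacity=120, weeks=4)

# Same capacity gain, but the easier interface lifts demand by 30 percent.
gains_and_demand = simulate_backlog(50, weekly_arrivals=130,
                                    weekly_capacity=120, weeks=4)

print("demand unchanged:", gains_only)
print("demand up 30%:   ", gains_and_demand)
```

In the first run the backlog clears within three weeks. In the second, the same productivity gain is real but the queue still grows by ten cases every week.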
There is also the question of how AI affects the distribution of work across time. Some AI tools reduce processing time for straightforward cases, but complex cases still require human judgment. If AI triages effectively, complex cases may become more visible and better prepared. But if AI triage is wrong—misclassifying cases or failing to route them correctly—then complex cases can be delayed. Meanwhile, users may experience longer waits when their case is stuck in the wrong category. Again, internal speed can be undermined by routing errors that manifest as user delays.
The most important implication is that AI deployment strategy should treat the public as part of the system, not as an external variable. Efficiency improvements are likely to be sustainable only when AI is paired with strong service design, robust fallback mechanisms, and clear accountability.
Robust fallback mechanisms matter because AI systems will sometimes fail. The question is whether the failure is recoverable without excessive user effort. A well-designed system should detect when it is uncertain, explain what it needs, and offer alternative paths—such as switching to a human-assisted channel, providing a checklist of required documents, or allowing users to correct specific fields rather than restarting the entire process. If recovery requires users to start over, the interaction tax becomes severe.
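The recovery logic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference design: the `Assessment` structure, the `choose_next_step` function, and the 0.8 confidence threshold are all hypothetical, and a real service would need to calibrate and audit any such threshold.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class NextStep(Enum):
    PROCEED = "proceed"
    FIX_FIELDS = "fix_specific_fields"
    HUMAN_ASSIST = "human_assisted_channel"


@dataclass
class Assessment:
    """What the assistant believes about an application (hypothetical)."""
    confidence: float                       # 0.0 to 1.0, assumed calibrated
    unclear_fields: List[str] = field(default_factory=list)


def choose_next_step(a: Assessment,
                     min_confidence: float = 0.8) -> Tuple[NextStep, str]:
    """Pick a recovery path that never forces the user to start over."""
    if a.confidence >= min_confidence and not a.unclear_fields:
        return NextStep.PROCEED, "Your application is ready to submit."
    if a.unclear_fields:
        # Ask only for the specific items that are unclear.
        needed = ", ".join(a.unclear_fields)
        return NextStep.FIX_FIELDS, f"Please check or provide: {needed}."
    # Low confidence with nothing specific to fix: hand over to a person.
    return (NextStep.HUMAN_ASSIST,
            "We could not confirm your details automatically. "
            "A staff member can help you continue from here.")


step, message = choose_next_step(
    Assessment(confidence=0.55, unclear_fields=["income type", "residency dates"])
)
print(step.value, "-", message)
```

The design choice worth noting is the order of the checks: a specific, fixable gap is surfaced to the user before any handoff to staff, so human channels are reserved for cases the system cannot explain.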
Clear accountability matters because public trust is not optional. If users cannot understand why a decision was made or what information is missing, they will seek human review. That review is legitimate, but it can increase workload. Transparency doesn’t mean exposing internal model logic; it means providing understandable explanations, citing relevant criteria, and showing what evidence is required. When AI drafts responses or summarizes documents, the system should preserve traceability so that staff can verify claims quickly and users can see what was considered.
Service design also includes language and accessibility. AI can translate and simplify, but it can also distort meaning. Governments must ensure that translations are accurate and that simplified explanations do not omit critical conditions
