Graduation ceremonies are built on precision. The schedule is tight, the stage time is sacred, and every name carries weight—because for students, being called up correctly is part of what makes the moment feel official. That’s why the growing use of AI-powered systems to announce graduates as they walk across the stage has been framed as a solution to a very human problem: mispronunciations, missed names, and the inevitable stress that comes with live events.
But a recent commencement at Glendale Community College in Glendale, Arizona, offered a cautionary counterpoint. Instead of eliminating errors, the system introduced new ones, mispronouncing some names and skipping others entirely, reportedly because of timing issues as graduates moved through the stage sequence. The ceremony was paused at least twice while staff attempted to correct the announcements, and Glendale Community College President Tiffany Hernandez later apologized and offered many students a do-over.
The incident is notable not only because it involved an AI announcer, but because it exposed a deeper tension in how institutions adopt automation for high-stakes, real-time moments. When AI is used in a setting where timing, audio clarity, and human movement all interact, the failure modes aren’t limited to “the model got it wrong.” They can include cascading breakdowns—where the system’s output becomes unreliable precisely when the event needs it most.
What Glendale Community College tried to do—and why
Across the country, schools have increasingly experimented with AI-assisted name pronunciation tools for graduation. The motivation is easy to understand. Commencement programs often include names from many languages and cultural backgrounds, and even experienced announcers can struggle under pressure. Traditional approaches—like phonetic spellings on paper or last-minute corrections—can help, but they’re still vulnerable to human error and last-second changes.
AI-based systems promise something more scalable: consistent pronunciation support that can be generated quickly and updated as needed. In theory, these tools reduce the burden on staff and improve accuracy, especially for names that are difficult to pronounce based on spelling alone. Many schools also see these systems as accessibility improvements, since clearer announcements can help families, friends, and community members follow along.
In this case, Glendale Community College used an AI announcer to call students as they crossed the stage. The goal was straightforward: get pronunciations right and ensure no one is left out.
What went wrong during the ceremony
According to reports from the livestream of the Glendale Community College commencement ceremony, the AI announcer mispronounced some names and skipped others entirely. These were not minor glitches that the audience could overlook. They were significant enough that the ceremony had to be paused at least twice while staff worked to address the announcements.
Timing appears to have been a central factor. Graduation ceremonies are choreographed: students approach the stage in sequence, the announcer reads their names, and the next graduate steps forward. If the system depends on a workflow that assumes a certain pace—such as how quickly a student arrives at a microphone, how long the audio takes to generate, or how the stage operator triggers each announcement—then any deviation can cause the system to fall out of sync.
That’s where AI systems can become fragile. Even if the pronunciation engine itself is strong, the surrounding pipeline—inputs, triggers, buffering, and audio playback—must work flawlessly in real time. If the system is late, it may read the wrong name at the wrong moment. If it fails to receive the correct input in time, it may skip. And if staff intervene manually to correct the flow, the system may not gracefully recover, especially if it’s designed primarily for “happy path” operation.
In other words, the issue wasn’t just “AI pronounced a name incorrectly.” The announcer was operating as a live production system, not a simple text-to-speech tool, and it failed the way live production systems fail. Live events are unforgiving: there’s no pause button for a student who has already stepped into position, and there’s no way to rewind the moment for the audience.
The human cost of getting names wrong
It’s tempting to treat mispronunciations as a small embarrassment, something that can be corrected later. But for graduates, names are identity. Being called up incorrectly can feel like being overlooked, even when the institution’s intent is the opposite.
Skipping names is even more serious. A skipped graduate isn’t merely misheard; they’re effectively removed from the ceremony’s narrative. Even if the person is eventually recognized, the emotional impact of missing the moment can’t be undone. Families who traveled, students who planned family photos, and classmates watching from their seats all experience the ceremony as a shared event. When the system fails, it fractures that shared experience.
This is why President Tiffany Hernandez’s response mattered. She apologized for the mistakes and eventually offered many students a do-over. That kind of remediation is not just symbolic. It acknowledges that the ceremony’s purpose is not fulfilled if students don’t receive the recognition they came for.
Still, a do-over raises its own questions. How many students are affected? How much time does it take to re-run the sequence? Does the do-over happen immediately, or does it extend the ceremony into a longer, more stressful period? And perhaps most importantly: what does it mean for trust when the institution has to correct a system that was supposed to prevent exactly these kinds of errors?
Why timing is such a big deal for AI announcers
AI pronunciation tools are often evaluated in controlled settings: given a name, produce a pronunciation. But graduation ceremonies are not controlled. They involve movement, microphones, stage cues, and a chain of events that must align within seconds.
Timing issues can create several distinct failure patterns:
First, there’s the “wrong name at the wrong moment” problem. If the system’s internal queue or trigger advances before the correct student is ready, the announcer may speak the next name too early. The result is a mismatch between who is standing at the microphone and what the audience hears.
Second, there’s the “no output” problem. If the system doesn’t receive the input it expects—perhaps because a student arrives slightly faster or slower than anticipated—it may fail to generate an announcement in time. In that case, the system might skip the name entirely rather than waiting.
Third, there’s the “recovery failure” problem. Even if staff notice the issue quickly, the system may not be able to resume cleanly after a pause. Some automated pipelines are designed to run continuously; once interrupted, they may require manual resets or reconfiguration. If staff are trying to fix the flow while the ceremony continues, the system can compound errors.
These are not theoretical concerns. They’re common in any automated system that interacts with real-world timing: ticket scanners, live captioning, automated translation, and even robotics in warehouses. The lesson is consistent: accuracy in isolation doesn’t guarantee reliability in context.
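One way to picture guards against the three failure patterns above is a queue that only advances when a human confirms the graduate is in position, records a missed name instead of silently dropping it, and keeps its place across a pause. The sketch below is purely illustrative: the class, its method names, and the placeholder roster are assumptions, and nothing here reflects how Glendale's actual system was built.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    PENDING = "pending"
    ANNOUNCED = "announced"
    MISSED = "missed"


@dataclass
class AnnouncementQueue:
    """Hypothetical operator-driven queue (an assumption, not a vendor design)."""
    names: List[str]
    cursor: int = 0
    paused: bool = False
    statuses: List[Status] = field(init=False, default_factory=list)

    def __post_init__(self) -> None:
        self.statuses = [Status.PENDING] * len(self.names)

    def confirm_at_mic(self) -> Optional[str]:
        """Advance only on operator confirmation, so a name is never spoken early."""
        if self.paused or self.cursor >= len(self.names):
            return None
        name = self.names[self.cursor]
        self.statuses[self.cursor] = Status.ANNOUNCED
        self.cursor += 1
        return name

    def mark_missed(self) -> None:
        """Record a name that could not be announced, instead of skipping it silently."""
        if self.cursor < len(self.names):
            self.statuses[self.cursor] = Status.MISSED
            self.cursor += 1

    def pause(self) -> None:
        self.paused = True

    def resume(self, at_index: Optional[int] = None) -> None:
        """Resume where the queue stopped, or at a position the operator chooses."""
        if at_index is not None:
            if not 0 <= at_index <= len(self.names):
                raise IndexError("at_index is outside the roster")
            self.cursor = at_index
        self.paused = False

    def missed(self) -> List[str]:
        """Names that still need to be read, for example in a do-over."""
        return [n for n, s in zip(self.names, self.statuses) if s is Status.MISSED]


# Placeholder roster, invented for illustration only.
queue = AnnouncementQueue(["Ana Ruiz", "Bao Tran", "Chidi Okafor"])
first = queue.confirm_at_mic()    # operator confirms the first graduate
queue.mark_missed()               # audio for the second name was not ready
queue.pause()
blocked = queue.confirm_at_mic()  # nothing is spoken while paused
queue.resume()
third = queue.confirm_at_mic()
```

The point of the design is that every name ends in an explicit state, so staff can see after a pause exactly who was announced and who still needs to be called.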
The broader trend: AI in live spaces
Glendale Community College’s experience fits into a larger pattern. AI tools are increasingly used in education and public-facing events, often with the promise of reducing human error. But live environments are where automation meets unpredictability.
In classrooms, AI can assist with drafting, tutoring, and feedback. In those settings, there’s usually time to review and correct. In a graduation hall, there’s no time to iterate. The system must perform perfectly—or at least reliably enough that occasional errors don’t derail the ceremony.
That’s why incidents like this are likely to shape how schools evaluate AI tools going forward. It’s not enough to ask whether the AI can pronounce names correctly. Institutions will need to ask whether the entire system can handle:
1) Variations in student pacing
2) Microphone and audio latency
3) Stage operator workflows
4) Network or processing delays
5) How the system behaves when paused or restarted
6) Whether there’s a reliable fallback plan when the AI fails
If those questions aren’t answered, the AI may simply shift the risk from human error to system error.
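Items 4 and 6 in the list above, processing delays and a fallback plan, can be illustrated with a deadline on audio generation: if the AI audio is not ready in time, the name is handed to a human reader rather than skipped. This is a minimal sketch under stated assumptions. The function `announce_with_fallback`, the stand-in synthesizers, and the deadline values are all hypothetical, and no real text-to-speech API is being called.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Callable, Tuple


def announce_with_fallback(
    name: str,
    synthesize: Callable[[str], str],
    deadline_s: float,
    pool: ThreadPoolExecutor,
) -> Tuple[str, str]:
    """Return ("ai", audio) if synthesis meets the deadline, else ("human", name)."""
    future = pool.submit(synthesize, name)
    try:
        return ("ai", future.result(timeout=deadline_s))
    except FutureTimeout:
        future.cancel()          # best effort; a running task cannot be interrupted
        return ("human", name)   # hand the name to the backup announcer
    except Exception:
        return ("human", name)   # synthesis failed outright


# Stand-in synthesizers (assumptions): a real system would call its own TTS engine.
def fast_tts(name: str) -> str:
    return f"<audio:{name}>"


def slow_tts(name: str) -> str:
    time.sleep(0.6)              # simulates a network or processing delay
    return f"<audio:{name}>"


def broken_tts(name: str) -> str:
    raise RuntimeError("synthesis backend unavailable")


with ThreadPoolExecutor(max_workers=3) as pool:
    on_time = announce_with_fallback("Ana Ruiz", fast_tts, 0.2, pool)
    late = announce_with_fallback("Bao Tran", slow_tts, 0.2, pool)
    failed = announce_with_fallback("Chidi Okafor", broken_tts, 0.2, pool)
```

The late and failed cases both route to the human reader, which is the behavior the checklist asks about: a delay degrades the announcement's source, not whether the name is read at all.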
A unique take: the “announcement” is actually a choreography problem
One way to understand what happened is to stop thinking of the AI announcer as a standalone pronunciation engine. In practice, it functions as part of a choreography system. The announcer’s job isn’t only to speak; it’s to synchronize speech with bodies moving through space.
That means the success of the system depends on more than language processing. It depends on operational design. For example, if the system expects a specific sequence of inputs—like a stage manager confirming each student before the AI speaks—then the workflow must be robust to real-world deviations. If the system relies on a timing assumption (“the student will be at the microphone by the time the audio plays”), then it’s vulnerable to even small delays.
This is where many AI deployments can go wrong: they focus on the intelligence of the model and underinvest in the engineering of the pipeline. The model might be capable, but the system might not be resilient.
In Glendale’s case, the reported timing issues suggest that the pipeline didn’t maintain synchronization under live conditions. The pauses indicate that staff were actively trying to regain control of the sequence. That’s a sign of a system that can fail in ways that require human intervention—exactly when the ceremony is least able to absorb disruption.
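The difference between a timing assumption and a confirmation step can be made concrete with a small simulation. The sketch below compares a timer-driven announcer, which plays each name at a fixed interval, with a confirmation-driven one, which plays each name shortly after the graduate actually arrives. The arrival times, interval, and latency are invented numbers for illustration and say nothing about the real ceremony's setup.

```python
from typing import List


def timer_driven(arrivals: List[float], interval_s: float) -> List[float]:
    """Announcement k plays at k * interval_s, regardless of who is at the mic."""
    return [k * interval_s for k in range(len(arrivals))]


def confirmation_driven(arrivals: List[float], latency_s: float) -> List[float]:
    """Announcement k plays latency_s after graduate k is confirmed in position."""
    return [t + latency_s for t in arrivals]


def mismatches(arrivals: List[float], play_times: List[float]) -> int:
    """Count announcements that play while graduate k is not the one at the mic."""
    count = 0
    for k, played in enumerate(play_times):
        start = arrivals[k]
        end = arrivals[k + 1] if k + 1 < len(arrivals) else float("inf")
        if not (start <= played < end):
            count += 1
    return count


# Hypothetical arrival times in seconds: the third graduate is slightly slow,
# and everyone behind them inherits the delay.
arrivals = [0.0, 5.0, 11.5, 16.0, 23.0]

timer_errors = mismatches(arrivals, timer_driven(arrivals, interval_s=5.0))
confirm_errors = mismatches(arrivals, confirmation_driven(arrivals, latency_s=0.5))
```

With these numbers, one slow walker leaves every later timer-driven announcement out of step, while the confirmation-driven schedule stays aligned because each announcement is anchored to an observed arrival rather than to a clock.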
What schools can learn from this incident
While the details of Glendale’s setup aren’t fully described in the available reporting, the incident points to practical lessons that other institutions can apply.
First, schools should treat AI announcers as “assistive,” not “autonomous.” Even if the AI is doing the speaking, there should be a clear human backup method that can take over instantly. That backup could be a trained announcer reading names manually, or a preloaded audio/phonetic script system that can be triggered without delay.
Second, institutions should test the system under realistic conditions. A rehearsal with students moving at typical speeds, with typical microphone behavior, and with the actual stage workflow is essential. Testing only the pronunciation accuracy in a quiet room won’t reveal timing failures.
Third, schools should build
