
A December service disruption at Amazon Web Services has been traced to human configuration errors after an internal AI coding agent made system-level changes that triggered an outage in parts of mainland China.
While the automated tool carried out the actions that caused the disruption, Amazon says the root cause was a permissions mistake by employees that allowed the agent broader access than intended.
What triggered the AWS disruption
According to people familiar with the incident, the AI coding assistant deleted and recreated the environment it was operating in, leading to a service outage that lasted around 13 hours in December.
The tool normally requires approval from two human reviewers before changes are pushed into production systems. In this case, however, the agent inherited the access rights of its human operator, which removed that safeguard.
That permissions oversight enabled the automated system to carry out infrastructure-level changes without the usual checks in place.
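The failure mode described above can be illustrated with a short sketch. Everything in it is hypothetical: the names `Principal`, `ChangeRequest`, `spawn_agent` and `can_apply`, and the permission strings, are invented for illustration and do not describe Amazon's internal tooling. It assumes two things: an agent receives only the permissions explicitly granted to it rather than everything its operator holds, and a change needs two distinct human approvals before it can be applied.

```python
from dataclasses import dataclass, field

# Hypothetical model for illustration only; not a real AWS or Amazon API.
REQUIRED_APPROVALS = 2  # assumed to mirror the two-reviewer rule in the reporting


@dataclass(frozen=True)
class Principal:
    name: str
    is_agent: bool
    permissions: frozenset


@dataclass
class ChangeRequest:
    action: str
    requested_by: Principal
    approvals: set = field(default_factory=set)

    def approve(self, reviewer: Principal) -> None:
        # Only humans other than the requester count as reviewers.
        if reviewer.is_agent or reviewer.name == self.requested_by.name:
            raise PermissionError("reviewer must be a human other than the requester")
        self.approvals.add(reviewer.name)


def spawn_agent(operator: Principal, granted: set) -> Principal:
    # The agent gets only what was explicitly granted, and never more than
    # the operator holds. It does not inherit the operator's full access.
    return Principal(
        name=f"agent-for-{operator.name}",
        is_agent=True,
        permissions=frozenset(granted) & operator.permissions,
    )


def can_apply(change: ChangeRequest) -> bool:
    # A change goes through only if the requester holds the permission
    # and enough distinct human reviewers have signed off.
    has_permission = change.action in change.requested_by.permissions
    has_approvals = len(change.approvals) >= REQUIRED_APPROVALS
    return has_permission and has_approvals


operator = Principal("alice", False, frozenset({"deploy", "delete_environment"}))
agent = spawn_agent(operator, granted={"deploy"})

# The agent was never granted the destructive permission, so this is refused.
print(can_apply(ChangeRequest("delete_environment", agent)))  # False
```

Under these assumptions, the operator's ability to delete an environment does not carry over to the agent, and even a permitted action such as `deploy` stays blocked until two human reviewers have approved it.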
Amazon points to human oversight, not automation
Amazon described the outage as an “extremely limited event” and said the involvement of AI tools was coincidental.
The company stressed that similar incidents could occur through manual developer actions or traditional software tools if access controls were misconfigured.
Following the disruption, Amazon said it introduced additional internal training and safeguards aimed at tightening permission management and oversight of automated systems.
The incident was first reported by the Financial Times and later detailed by The Verge.
A senior AWS employee reportedly said the December outage was the second production issue tied to Amazon’s AI tools in recent months. The other incident involved an AI chatbot used by developers, though it did not affect customer-facing AWS services.
While both events were described internally as minor, employees said the problems were foreseeable given how access permissions were structured.
Amazon maintains that AI tools remain productivity aids rather than operational risks. The company is continuing to refine access policies and automation workflows as it expands the use of AI across engineering teams.
The outage highlights how human oversight, particularly around permissions and system access, remains a critical factor in cloud reliability even as automation plays a larger role in infrastructure management.