
Claude Code deletes developer’s production setup, including its database and snapshots — 2.5 years of records were nuked in an instant


Desperate developers. (Image source: Getty Images)

Everyone loves a good story about AI agents gone wrong, and these stories usually come with a bit of gloating at our virtual companions’ expense. Sometimes, though, the mistakes come down to poor supervision, as in the case of Alexey Grigorev, who gamely detailed how he let Claude Code erase years of records from his website, recovery snapshots included.

The story begins when Grigorev wanted to move his website, Artificial Intelligence Shipping Laboratory, to AWS and have it share the same infrastructure as DataTalks.Club. Claude itself advised against this option, but Grigorev decided it wasn’t worth the trouble or cost of maintaining two separate setups.

Grigorev uses Terraform, an infrastructure-management utility that can create (or destroy) entire setups, including networking, load balancing, databases, and of course the servers themselves. He asked Claude to run a Terraform plan to set up the new website but forgot to upload the crucial state file, which contains a complete description of the infrastructure as it exists at any given moment.
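To see why the missing state file mattered: Terraform decides what to create or destroy by diffing the configuration against its state file (`terraform.tfstate`), which records which real resources each configuration block already corresponds to. A minimal, hypothetical sketch; the resource names and sizes here are illustrative placeholders, not Grigorev’s actual configuration:

```hcl
# Hypothetical resource definition; identifier and sizing are placeholders.
resource "aws_db_instance" "site_db" {
  identifier        = "site-db"
  engine            = "postgres"
  instance_class    = "db.t3.micro"
  allocated_storage = 20
}

# terraform.tfstate normally maps "aws_db_instance.site_db" to a real
# instance ID in AWS. Run `terraform plan` without that state and
# Terraform sees an empty world, so it proposes creating everything
# from scratch, duplicating resources that already exist.
```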

Claude did as Grigorev asked and began creating a setup for the Shipping Lab site, but Grigorev stopped it midway: because the state file was missing, it was creating duplicate resources. Grigorev corrected course by asking Claude to identify the duplicate resources and then uploaded the state file, believing he had resolved the situation.


Unfortunately, Grigorev assumed at this point that the bot would finish cleaning up the duplicate resources before consulting the state file to see how things were originally set up. Terraform and similar tools can be ruthless, especially when followed blindly. Now that Claude had the state file, it logically followed it, issuing a Terraform destroy operation so it could set everything up correctly this time.

Given that the infrastructure description included the DataTalks.Club website, this wiped both sites’ setups completely, including databases with 2.5 years of records as well as the database snapshots Grigorev was counting on as backups. He had to contact AWS Business Support, which helped restore the data within about a day.

In the postmortem, Grigorev described some of the steps he is taking to avoid similar incidents in the future, including periodic testing of database recovery, applying deletion protection via Terraform and AWS permissions, and moving Terraform state files to S3 storage instead of a local machine. He also admitted he was “over-reliant on AI agents to run Terraform commands” and is now blocking agents from doing so; he will manually review every plan Claude comes up with and carry out any destructive actions himself.
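Two of those mitigations can be sketched in Terraform itself. This is a hedged illustration with placeholder names, not Grigorev’s actual files:

```hcl
# Remote state in S3 instead of a local file, so the state can't simply
# be forgotten on one machine. The bucket name here is hypothetical.
terraform {
  backend "s3" {
    bucket = "example-terraform-state"
    key    = "prod/terraform.tfstate"
    region = "us-east-1"
  }
}

# Deletion protection at both layers, using a placeholder resource:
resource "aws_db_instance" "site_db" {
  # ... engine, sizing, and other arguments omitted ...
  deletion_protection = true   # AWS-side: the API refuses to delete the instance

  lifecycle {
    prevent_destroy = true     # Terraform-side: any plan that would destroy
                               # this resource aborts with an error
  }
}
```

With `prevent_destroy` in place, even a blindly approved destroy operation fails at the planning stage rather than taking the database with it.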

It would be easy to file this story under “dumb robots gone wrong”, but it’s a reasonable guess that most sysadmins would find baseline problems with Grigorev’s approach, including granting broad permissions to what was effectively a junior subordinate and not scoping permissions in a production environment in the first place.

Perhaps the biggest lesson is not to assume Claude even has the context (pun not intended) to understand what the existence of a second website means, just as a junior sysadmin wouldn’t.
