AWS took 6 months to fix Security Token Service bug - IAM policy simulator inadequate, says Stedi

"No system is infallible. Sometimes, it is AWS..."

Updated April 12, 12:15 with detail from AWS.

A security flaw in AWS Security Token Service (STS), an IAM service, gave read-only users admin access to AWS resources under certain conditions. 

When reported to AWS, the cloud provider found that the extent of the bug was wider than first thought – but took six months to push a full fix.

The issue was discovered by a team at Stedi, a startup that’s built a new platform for Electronic Data Interchange (EDI) – a schema to cover all possible electronic business transactions across most industries.

In an April 9 blog post, the Stedi team said that “again and again, our tests gained access to roles above their designated authorization level”. It reported the bug to AWS in June 2023; the cloud provider initially thought the cause was a misconfiguration on Stedi’s side.

“We spent a lot of time second-guessing ourselves when discovering and diagnosing this bug. We were well aware of IAM’s provable security via automated reasoning, and the documentation is so comprehensive (and intimidating at times) that we were sure it had to be our fault,” Stedi said.

“Of course, you should do your due diligence before reporting issues, but no system is infallible. Sometimes, it is AWS,” the company noted.

The AWS STS vulnerability was an “edge case”, Stedi admitted, but the company said its investigation was frustrated by poor AWS tooling: specifically, an IAM policy simulator that “does not support role trust policy evaluation.”

An AWS engineer, Colm MacCárthaigh, later posted on X that just nine customers had set up STS with the configuration that exposed the vulnerability. He added: "It's not like 'only 9 customers, no need to rush'.

"Just more like our priority is to make sure that customers aren't exposed, and all actively managed by the service team and security operations with very intense tracking. There's also even more nuance in that having an 'affected configuration' means 'config will need to change because of the fix', but that doesn't always mean 'is exploitable', because of other conditions and factors..."

AWS STS bug: OK, explain?

In short, Stedi uses temporary credentials issued by AWS STS to grant access to customer resources. Its IAM role trust policies relied on tags to control access. These tags are custom attribute labels attached to resources or users and can be used as conditions in IAM policies.

The AWS STS vulnerability was discovered because of how Stedi’s policy referenced these tags – using variables with the same name for both the "request tag" (coming from the user's token) and the "resource tag" (on the IAM role). When these variable names matched, the policy wasn't evaluating the tags' actual values, granting unauthorised access.
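To make the pattern concrete, here is a minimal sketch of a role trust policy of the kind described, along with the comparison such a condition is meant to perform. The tag key `TenantId`, the placeholder account ID and the helper `intended_tag_match` are hypothetical and chosen purely for illustration; the function models the intended semantics only and is neither Stedi's actual policy nor AWS's evaluation logic.

```python
import re

TAG_KEY = "TenantId"  # hypothetical tag key, not taken from Stedi's policy

# Hypothetical trust policy: the session tag supplied on the request must
# equal the tag of the same name on the IAM role being assumed.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:root"},  # placeholder
            "Action": ["sts:AssumeRole", "sts:TagSession"],
            "Condition": {
                "StringEquals": {
                    f"aws:RequestTag/{TAG_KEY}": f"${{aws:ResourceTag/{TAG_KEY}}}"
                }
            },
        }
    ],
}

_VARIABLE = re.compile(r"\$\{aws:ResourceTag/([^}]+)\}")


def intended_tag_match(condition, request_tags, resource_tags):
    """Model the intended check: resolve each ${aws:ResourceTag/...} variable,
    then compare tag VALUES. A simplification, not AWS's evaluation engine."""
    for condition_key, expected in condition["StringEquals"].items():
        request_key = condition_key.split("/", 1)[1]
        match = _VARIABLE.fullmatch(expected)
        # A missing tag resolves to None, which never matches.
        resolved = resource_tags.get(match.group(1)) if match else expected
        actual = request_tags.get(request_key)
        if actual is None or resolved is None or actual != resolved:
            return False
    return True


condition = trust_policy["Statement"][0]["Condition"]
print(intended_tag_match(condition, {"TenantId": "a"}, {"TenantId": "a"}))  # True
print(intended_tag_match(condition, {"TenantId": "a"}, {"TenantId": "b"}))  # False
```

The second call is the negative case: the key names match but the values differ, so access should be denied. According to Stedi's account, this is the kind of request the buggy evaluation let through.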

Whilst the chances of numerous other organisations having similar policy conventions may be small, Stedi said that after it reported the bug, the hyperscaler “also discovered the issue was not limited to role trust policies, which are just resource policies for IAM roles (as a resource) – it also extended to statements within IAM boundary policies and SCP policies that contained the same pattern of STS role assumption with tag-based conditions.”

The timeline was as follows, Stedi said:

  • 2023-06-20 - Role access issue discovered, AWS alerted
  • 2023-06-21 - Minimal reproduction steps provided using STS assume role
  • 2023-07-06 - AWS acknowledges issue and determines root cause
  • 2023-10-30 - STS tag handling implementation updated for new IAM roles
  • 2024-01-09 - STS tag handling implementation updated for IAM roles for customers impacted in a 30-day window
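For a sense of scale, the gaps between those dates can be worked out with a few lines of Python. The variable names are ours; the dates are taken from the timeline above.

```python
from datetime import date

# Dates as listed in Stedi's timeline.
reported = date(2023, 6, 20)
acknowledged = date(2023, 7, 6)
new_roles_fixed = date(2023, 10, 30)
existing_roles_fixed = date(2024, 1, 9)

days_to_first_fix = (new_roles_fixed - reported).days
days_to_full_fix = (existing_roles_fixed - reported).days
days_ack_to_full_fix = (existing_roles_fixed - acknowledged).days

print(days_to_first_fix, days_to_full_fix, days_ack_to_full_fix)  # 132 203 187
```

That is 203 days from the initial report to the final rollout, or a little over six and a half months.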

We need better tools for testing IAM policies, Stedi concluded: “The [AWS] IAM policy simulator does not support role trust policy evaluation. 

“Proving the security of a system to grant federated identities access to IAM roles continues to rely on both positive and negative end-to-end tests with long test cycles. Developing more mature tooling would massively improve the developer experience, and we hope AWS will consider investing in this area moving forward.”

The Stack has asked AWS for comment on the length of time it took to push a fix and whether it is considering improving the IAM policy simulator in response.

We will update this story when we have a response. 

It’s not the first time that STS has had security problems. Amazon Elastic Kubernetes Service also suffered from authentication token-related bugs in 2020 (one allocated CVE-2022-2385, another detailed here).

Stedi’s technically detailed blog is here
