My Bucket, My Data! (or is it?)

August 24, 2021

AWS S3 has long become a standard for storing file object data. Despite the many efforts in making S3 secure, we continue to see data in private buckets exposed or exploited in novel ways over the years.

Just how many ways can I trip over my own buckets (and spill the data)? Short answer: too many.

To start, here's a checklist of a dozen key security configurations and best practices that should be considered for S3:

Enable server side encryption for data-at-rest.
Enforce "aws:SecureTransport" via bucket policy (deny non-TLS/HTTPS requests)
For buckets with critical data, enable MFA delete.
Configure "Block Public Access" bucket settings properly.
Tag buckets / objects with Classification and Owner.
Enable server access logging or logging with CloudTrail.
Enable event notifications to monitor for key changes.
Enable cross-account cross-region replication on buckets storing critical data for disaster recovery.
Leverage lifecycle configuration and versioning for resilience.
Consider access restriction via VPC endpoints or PrivateLink.
Identify and regularly review all IAM roles, users, and user groups with access to important buckets.
Identify all buckets with public access and monitor whenever a new bucket is made public.

You've probably already seen the first 10 items of the above checklist somewhere -- in AWS security best practices or CIS AWS Foundations Benchmark. If you think you've got the last two items covered, think again. It may not be as trivial as it sounds.

Consider the following:

First, do you have an up-to-date inventory of all buckets across all accounts? Could a developer have created a new bucket or even a whole new AWS account without any security visibility?

Next, understand that an S3 bucket (or objects within a bucket) can be made externally or publicly accessible in multiple ways beyond just bucket ACLs and IAM policies. Some of them can be tricky to identify due to the complex, multi-hop and/or cross-account relationships among connected resources.

Here are some questions we should ask:

Are there buckets granted access to someone outside of the owner account?
Are there buckets granted access to public facing EC2 instances via EC2 instance profile?
Are there buckets granted access to AWS services (such as CloudTrail, Config, Serverless Repo) without "aws:SourceAccount" condition to prevent potential cross-account attacks?
Are there buckets accessible via cross-account VPC peering?
Which Okta users have access to production S3 buckets via SAML SSO?

Last, how can we reduce false positives and further identify risks by knowing which buckets are supposed to be public and therefore: a) filter out those exceptions and b) ensure those public buckets do not contain sensitive data or secrets?

These questions can be difficult to answer and timing consuming to keep up. But there is a better way. If we take a relationship-focused approach in looking at the bucket configurations, access policies, connected users and resources in a graph, we could easily query and traverse the graph for answers. We can also set up continuous monitoring to detect drift and changes using these graph queries.

Let's look at some examples.

The following examples are done using JupiterOne (free, lifetime license), although you should be able to achieve the same results yourself using Neo4j or some other graph technology if you choose to.

Question 1:
Are there buckets granted access to someone outside of the owner account?

Query (J1QL)

Find aws_s3_bucket with _source!='system-mapper' as bucket
that ALLOWS as grant * as grantee
(that ASSIGNED * as principal)?
where
bucket.accountId != grantee.accountId or
(principal._type!=undefined and bucket.accountId != principal.accountId)
return tree

Graph (with sample data for Question 1):

Question 2:
Are there buckets granted access to public facing EC2 instances via EC2 instance profile?

Query (J1QL)

Find Internet
that allows aws_security_group
that protects aws_instance with active=true
that uses aws_iam_role that assigned AccessPolicy
that allows (aws_s3|aws_s3_bucket) with classification!='public'
return tree

Graph (with sample data for Question 2):

Question 3:
Are there buckets accessible via cross-account VPC peering?

Query (J1QL)

Find (aws_s3|aws_s3_bucket)
that allows aws_vpc_endpoint
that has aws_vpc as vpc1
that connects aws_vpc as vpc2
where
vpc1.accountId != vpc2.accountId
return tree

Graph (with sample data for Question 3):

Question 4:
Are there buckets granted access to AWS services without the "aws:SourceAccount" condition?

Query (J1QL)

Find aws_s3_bucket as bucket
that allows Service
with name = ('serverlessrepo' or 'cloudtrail' or 'config')
where
allows.conditions = undefined or (
allows.conditions !~= 'aws:SourceAccount' and
allows.conditions !~= 'aws:PrincipalOrgId' and
allows.conditions !~= bucket.accountId)
return tree

Graph (with sample data for Question 4):

Question 5:
Which public buckets may contain sensitive data or secrets?

Query (J1QL)

Find (Everyone|aws_cloudfront_distribution)
that (allows|connects) aws_s3_bucket
that has Finding
with hasSecrets=true or
hasSensitiveData=true
return tree

Graph (with sample data for Question 5):

Question 6:
Which Okta users have access to production S3 buckets via SAML SSO?

Query (J1QL)

Find okta_user that assigned AccessRole
that assigned AccessPolicy
that allows (aws_s3|aws_s3_bucket|aws_account) with tag.Production=true
(that has aws_s3)?
(that has aws_s3_bucket)?
return tree

Graph (with sample data for Question 6):

Automated Data Analysis

Most of these questions are not that hard to answer, once you know what you are looking for, in a relatively simple and small environment. However, once your operations expand to multiple AWS accounts -- sometimes hundreds or even thousands of accounts, with potentially millions of resources across the entire environment, this can become an impossibly challenging task.

The only way to identify the issues, and to continuously monitor them, is with automated data analysis.

With JupiterOne, you can easily turn this:

Into this:

Conclusion

I'll leave you with this: let's not forget the "shared responsibility model" between cloud providers (AWS, in this case) and cloud consumers (you). While AWS secures the infrastructure behind the scenes, they also make it very flexible for you to configure the resources and their access.

Understanding this flexibility and applying controls properly is your responsibility. Yet this amount of flexibility can sometimes get in the way and complicate things. That's why I have long been an advocate of using a graph data model and automated data analysis to assist.

Erkang Zheng

I founded JupiterOne because I envision a world where decisions are made on facts, not fear; teams are fulfilled, not frustrated; breaches are improbable, not inevitable. Security is a basic right.

We are building a cloud-native software platform at JupiterOne to deliver knowledge, transparency and confidence to every digital operation in every organization, large or small.

I am the Founder and CEO of JupiterOne, and also a cybersecurity practitioner with 20+ years experience across IAM, pen testing, IR, data, app, and cloud security. An engineer by trade, entrepreneur at heart, I am passionate about technology and solving real-world challenges. Former CISO, security leader at IBM and Fidelity Investments, I hold five patents and multiple industry certifications.

Keep Reading

What’s New in Kubernetes Security: More CIS Rules, Smarter Detection, and Expanded Coverage | JupiterOne

June 13, 2025

Blog

What’s New in Kubernetes Security: More CIS Rules, Smarter Detection, and Expanded Coverage

New Kubernetes Rule Pack covers more CIS controls for Namespace and Secrets Management

Level Up Kubernetes Security with Our New Rule Pack Built on the CIS Benchmark | JupiterOne

June 3, 2025

Blog

Level Up Kubernetes Security with Our New Rule Pack Built on the CIS Benchmark

New Kubernetes Rule Pack covers 26 CIS controls for RBAC & Pod Security

Stop Stitching User Data Together. Get a Unified Identity Instead | JupiterOne

May 15, 2025

Blog

Stop Stitching User Data Together. Get a Unified Identity Instead.

Legacy IAM falls short. Identity first security uses continuous, contextual access controls to protect a decentralized world—far beyond the old perimeter.

15 Mar 2022

Blog

One line headline, one line headline

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud eiut.

15 Mar 2022

Blog

One line headline, one line headline

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud eiut.

15 Mar 2022

Blog

One line headline, one line headline

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud eiut.

My Bucket, My Data! (or is it?)

Consider the following:

Let's look at some examples.

Question 1:
Are there buckets granted access to someone outside of the owner account?

Question 2:
Are there buckets granted access to public facing EC2 instances via EC2 instance profile?

Question 3:
Are there buckets accessible via cross-account VPC peering?

Question 4:
Are there buckets granted access to AWS services without the "aws:SourceAccount" condition?

Question 5:
Which public buckets may contain sensitive data or secrets?

Question 6:
Which Okta users have access to production S3 buckets via SAML SSO?

Automated Data Analysis

Conclusion

Erkang Zheng

SUBSCRIBE TO OUR NEWSLETTER

Stay up-to-date on emerging threats and security news

Keep Reading

My Bucket, My Data! (or is it?)

Consider the following:

Let's look at some examples.

Question 1:Are there buckets granted access to someone outside of the owner account?

Question 2:Are there buckets granted access to public facing EC2 instances via EC2 instance profile?

Question 3:Are there buckets accessible via cross-account VPC peering?

Question 4:Are there buckets granted access to AWS services without the "aws:SourceAccount" condition?

Question 5:Which public buckets may contain sensitive data or secrets?

Question 6:Which Okta users have access to production S3 buckets via SAML SSO?

Automated Data Analysis

Conclusion

Erkang Zheng

SUBSCRIBE TO OUR NEWSLETTER

Stay up-to-date on emerging threats and security news

SUBSCRIBE TO OUR NEWSLETTER

Keep Reading

Question 1:
Are there buckets granted access to someone outside of the owner account?

Question 2:
Are there buckets granted access to public facing EC2 instances via EC2 instance profile?

Question 3:
Are there buckets accessible via cross-account VPC peering?

Question 4:
Are there buckets granted access to AWS services without the "aws:SourceAccount" condition?

Question 5:
Which public buckets may contain sensitive data or secrets?

Question 6:
Which Okta users have access to production S3 buckets via SAML SSO?