My Bucket, My Data! (or is it?)

By Erkang Zheng

AWS S3 has long been a standard for storing object data. Despite the many efforts to make S3 secure, data in private buckets continues to be exposed or exploited in novel ways.

Just how many ways can I trip over my own buckets (and spill the data)? Short answer: too many.

To start, here's a checklist of a dozen key security configurations and best practices that should be considered for S3:

  1. Enable server-side encryption for data at rest.
  2. Enforce "aws:SecureTransport" via bucket policy (deny non-TLS/HTTPS requests).
  3. For buckets with critical data, enable MFA delete.
  4. Configure "Block Public Access" bucket settings properly.
  5. Tag buckets and objects with Classification and Owner.
  6. Enable server access logging or logging with CloudTrail.
  7. Enable event notifications to monitor for key changes.
  8. Enable cross-account, cross-region replication on buckets storing critical data for disaster recovery.
  9. Leverage lifecycle configuration and versioning for resilience.
  10. Consider access restriction via VPC endpoints or PrivateLink.
  11. Identify and regularly review all IAM roles, users, and user groups with access to important buckets.
  12. Identify all buckets with public access and monitor whenever a new bucket is made public.
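
As an illustration of item 2, here is a minimal sketch of a bucket policy that denies any request not sent over TLS. The function name and the "example-bucket" name are hypothetical, and the ARN assumes the standard "aws" partition; adapt both to your environment before using anything like this.

```python
import json


def deny_insecure_transport_policy(bucket_name: str) -> dict:
    """Build a bucket policy that denies requests made without TLS."""
    # Assumption: standard "aws" partition (not GovCloud or China regions).
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                # aws:SecureTransport is "false" for plain HTTP requests.
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


if __name__ == "__main__":
    # "example-bucket" is a placeholder name.
    print(json.dumps(deny_insecure_transport_policy("example-bucket"), indent=2))
```

The resulting JSON document is what you would attach to the bucket as its policy, merged with any statements the bucket already has.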

You've probably already seen the first 10 items of the above checklist somewhere -- in the AWS security best practices or the CIS AWS Foundations Benchmark. If you think you've got the last two items covered, think again. They may not be as trivial as they sound.

Consider the following:

First, do you have an up-to-date inventory of all buckets across all accounts? Could a developer have created a new bucket or even a whole new AWS account without any security visibility? 

Next, understand that an S3 bucket (or objects within a bucket) can be made externally or publicly accessible in multiple ways beyond just bucket ACLs and IAM policies. Some of them can be tricky to identify due to the complex, multi-hop and/or cross-account relationships among connected resources.

Here are some questions we should ask:

  • Are there buckets granted access to someone outside of the owner account?
  • Are there buckets granted access to public-facing EC2 instances via an EC2 instance profile?
  • Are there buckets granted access to AWS services (such as CloudTrail, Config, Serverless Repo) without an "aws:SourceAccount" condition to prevent potential cross-account attacks?
  • Are there buckets accessible via cross-account VPC peering?
  • Which Okta users have access to production S3 buckets via SAML SSO?

Last, how can we reduce false positives and surface the real risks? If we know which buckets are supposed to be public, we can a) filter out those exceptions and b) ensure those public buckets do not contain sensitive data or secrets.

These questions can be difficult to answer and time-consuming to keep up with. But there is a better way. If we take a relationship-focused approach and model the bucket configurations, access policies, connected users and resources as a graph, we can query and traverse that graph for answers. We can also set up continuous monitoring to detect drift and changes using the same graph queries.

Let's look at some examples. 

The following examples use JupiterOne (free, lifetime license), although you should be able to achieve the same results with Neo4j or another graph technology if you choose to.

Question 1:
Are there buckets granted access to someone outside of the owner account?

Query (J1QL)

Find aws_s3_bucket with _source!='system-mapper' as bucket
that ALLOWS as grant * as grantee
(that ASSIGNED * as principal)?
where
bucket.accountId != grantee.accountId or
(principal._type!=undefined and bucket.accountId != principal.accountId)
return tree


[Figure: graph with sample data for Question 1]
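
For readers without a graph tool at hand, a rough approximation of Question 1 can be sketched directly against a bucket policy document. This is not what the J1QL query does internally: it only inspects the "Principal" element of "Allow" statements, ignores conditions and bucket ACLs, and does not follow role assignments the way the graph traversal does. The function names and account IDs are hypothetical.

```python
from __future__ import annotations

import re

# Matches a bare 12-digit account ID or an IAM/STS ARN and captures the account ID.
ACCOUNT_ID_RE = re.compile(r"^(?:arn:aws[a-z-]*:(?:iam|sts)::)?(\d{12})(?::.*)?$")


def _as_list(value):
    """Policy elements may be a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def external_principals(policy: dict, owner_account_id: str) -> list[str]:
    """Return AWS principals in Allow statements that are outside the owner account."""
    found = []
    for stmt in _as_list(policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        if principal == "*":
            found.append("*")  # anyone, which includes every other account
            continue
        if not isinstance(principal, dict):
            continue
        for aws_principal in _as_list(principal.get("AWS")):
            if aws_principal == "*":
                found.append("*")
                continue
            match = ACCOUNT_ID_RE.match(aws_principal)
            if match and match.group(1) != owner_account_id:
                found.append(aws_principal)
    return found
```

Called with a policy that allows both "arn:aws:iam::111111111111:role/app" and "arn:aws:iam::222222222222:root" on a bucket owned by account 111111111111, the function returns only the second principal.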


Question 2:
Are there buckets granted access to public-facing EC2 instances via an EC2 instance profile?

Query (J1QL)

Find Internet
that allows aws_security_group
that protects aws_instance with active=true
that uses aws_iam_role that assigned AccessPolicy
that allows (aws_s3|aws_s3_bucket) with classification!='public'
return tree

 

[Figure: graph with sample data for Question 2]
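
To make the multi-hop idea concrete, here is a small in-memory sketch of the same traversal: start at the Internet node and follow typed relationships until a bucket is reached. It is only an illustration of graph path matching, not how JupiterOne or Neo4j implement it. The Graph class, the node names, and the sample data are all hypothetical, and the property filters from the query (active=true, classification!='public') are left out.

```python
from collections import defaultdict


class Graph:
    """A tiny directed graph with typed nodes and labeled edges."""

    def __init__(self):
        self.node_types = {}            # node name -> entity type
        self.edges = defaultdict(list)  # node name -> [(relationship, target)]

    def add_node(self, name, node_type):
        self.node_types[name] = node_type

    def add_edge(self, source, relationship, target):
        self.edges[source].append((relationship, target))

    def match_path(self, pattern):
        """Return every node path matching [type, rel, type, rel, type, ...]."""
        steps = list(zip(pattern[1::2], pattern[2::2]))
        paths = [[n] for n, t in self.node_types.items() if t == pattern[0]]
        for relationship, node_type in steps:
            # Extend each partial path by one hop that matches this step.
            paths = [
                path + [target]
                for path in paths
                for rel, target in self.edges.get(path[-1], [])
                if rel == relationship and self.node_types.get(target) == node_type
            ]
        return paths


def build_sample_graph():
    """Hypothetical data: one internet-exposed chain and one internal-only chain."""
    g = Graph()
    for name, node_type in [
        ("internet", "Internet"),
        ("sg-web", "aws_security_group"), ("sg-internal", "aws_security_group"),
        ("i-web", "aws_instance"), ("i-batch", "aws_instance"),
        ("role-web", "aws_iam_role"), ("role-batch", "aws_iam_role"),
        ("policy-web", "AccessPolicy"), ("policy-batch", "AccessPolicy"),
        ("customer-data", "aws_s3_bucket"), ("batch-logs", "aws_s3_bucket"),
    ]:
        g.add_node(name, node_type)
    for source, relationship, target in [
        ("internet", "allows", "sg-web"),  # only sg-web is open to the internet
        ("sg-web", "protects", "i-web"), ("sg-internal", "protects", "i-batch"),
        ("i-web", "uses", "role-web"), ("i-batch", "uses", "role-batch"),
        ("role-web", "assigned", "policy-web"), ("role-batch", "assigned", "policy-batch"),
        ("policy-web", "allows", "customer-data"), ("policy-batch", "allows", "batch-logs"),
    ]:
        g.add_edge(source, relationship, target)
    return g


# Same shape as the Question 2 query, expressed as alternating types and relationships.
EXPOSED_BUCKET_PATTERN = [
    "Internet", "allows", "aws_security_group", "protects", "aws_instance",
    "uses", "aws_iam_role", "assigned", "AccessPolicy", "allows", "aws_s3_bucket",
]

if __name__ == "__main__":
    for path in build_sample_graph().match_path(EXPOSED_BUCKET_PATTERN):
        print(" -> ".join(path))
```

With this sample data, only the chain that starts at the internet-facing security group is returned; the internal-only chain to "batch-logs" never matches because nothing connects it to the Internet node.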



Question 3:
Are there buckets accessible via cross-account VPC peering?

Query (J1QL)

Find (aws_s3|aws_s3_bucket)
that allows aws_vpc_endpoint
that has aws_vpc as vpc1
that connects aws_vpc as vpc2
where
vpc1.accountId != vpc2.accountId
return tree

 

[Figure: graph with sample data for Question 3]


Question 4:
Are there buckets granted access to AWS services without the "aws:SourceAccount" condition?

Query (J1QL)

Find aws_s3_bucket as bucket
that allows Service
with name = ('serverlessrepo' or 'cloudtrail' or 'config')
where
allows.conditions = undefined or (
allows.conditions !~= 'aws:SourceAccount' and
allows.conditions !~= 'aws:PrincipalOrgId' and
allows.conditions !~= bucket.accountId)
return tree


[Figure: graph with sample data for Question 4]
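
The same check can be approximated against a single bucket policy document. The sketch below flags "Allow" statements that grant access to a service principal without any condition key that ties the request to a known source. The function names are hypothetical, and treating "aws:SourceArn" as an acceptable guard is my assumption; the query above only looks for "aws:SourceAccount", "aws:PrincipalOrgId", and the bucket's own account ID.

```python
# Condition keys treated as guards, lowercased because condition key names
# are case-insensitive. aws:sourcearn is an addition not present in the query.
GUARD_KEYS = {"aws:sourceaccount", "aws:sourcearn", "aws:principalorgid"}


def _as_list(value):
    """Policy elements may be a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def _condition_keys(statement):
    """Collect the lowercased condition keys used anywhere in a statement."""
    keys = set()
    for operands in (statement.get("Condition") or {}).values():
        keys.update(key.lower() for key in operands)
    return keys


def unguarded_service_grants(policy):
    """Return (Sid, service) pairs for service grants lacking a guard condition."""
    findings = []
    for stmt in _as_list(policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        if not isinstance(principal, dict):
            continue
        services = _as_list(principal.get("Service"))
        if services and not (_condition_keys(stmt) & GUARD_KEYS):
            findings.extend((stmt.get("Sid", ""), service) for service in services)
    return findings
```

A statement that lets "cloudtrail.amazonaws.com" write to the bucket with no condition at all would be reported, while one restricted with "aws:SourceAccount" would not. The sketch does not check that the condition value is actually the right account, so treat a clean result as a starting point rather than proof.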


Question 5:
Which public buckets may contain sensitive data or secrets?

Query (J1QL)

Find (Everyone|aws_cloudfront_distribution)
that (allows|connects) aws_s3_bucket
that has Finding
with hasSecrets=true or
hasSensitiveData=true
return tree


[Figure: graph with sample data for Question 5]


Question 6:
Which Okta users have access to production S3 buckets via SAML SSO?

Query (J1QL)

Find okta_user that assigned AccessRole
that assigned AccessPolicy
that allows (aws_s3|aws_s3_bucket|aws_account) with tag.Production=true
(that has aws_s3)?
(that has aws_s3_bucket)?
return tree

 

[Figure: graph with sample data for Question 6]


Automated Data Analysis

In a relatively small and simple environment, most of these questions are not that hard to answer once you know what you are looking for. However, once your operations expand to multiple AWS accounts (sometimes hundreds or even thousands, with potentially millions of resources across the entire environment), answering them by hand becomes impractical.

The only way to identify the issues, and to continuously monitor them, is with automated data analysis. 
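
One simple form of continuous monitoring is to run the same query on a schedule and compare the results with the previous run. The sketch below assumes each run has already been reduced to a set of bucket names; the function name, the bucket names, and the idea of an approved-public allowlist are illustrative and not part of any JupiterOne API.

```python
def detect_drift(previous, current, approved_public=frozenset()):
    """Compare two snapshots of publicly accessible bucket names.

    previous / current: sets of bucket names returned by two runs of a query.
    approved_public: buckets that are meant to be public and should not alert.
    """
    return {
        # New since the last run and not on the allowlist: worth an alert.
        "newly_exposed": sorted(current - previous - approved_public),
        # Present last time but gone now: exposure was removed.
        "no_longer_exposed": sorted(previous - current),
    }


if __name__ == "__main__":
    yesterday = {"marketing-assets", "old-export"}
    today = {"marketing-assets", "customer-exports", "docs-site"}
    print(detect_drift(yesterday, today, approved_public={"docs-site"}))
```

Keeping the allowlist separate from the query is one way to filter out the buckets that are supposed to be public without hiding them from the inventory.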

With JupiterOne, you can easily turn this:

[Figure: standard list output]

Into this:

[Figure: graph with sample data]

Conclusion

I'll leave you with this: let's not forget the "shared responsibility model" between cloud providers (AWS, in this case) and cloud consumers (you). While AWS secures the infrastructure behind the scenes, it also gives you a great deal of flexibility in how you configure resources and their access.

Understanding this flexibility and applying controls properly is your responsibility. Yet this amount of flexibility can sometimes get in the way and complicate things. That's why I have long been an advocate of using a graph data model and automated data analysis to assist. 

Erkang Zheng

I envision a world where decisions are made on facts, not fear; teams are fulfilled, not frustrated; breaches are improbable, not inevitable. Security is a basic right.

I am a cybersecurity practitioner and founder with 20+ years across IAM, pen testing, IR, data, app, and cloud security. An engineer by trade, entrepreneur at heart, I am passionate about technology and solving real-world challenges. Former CISO, security leader at IBM and Fidelity Investments, I hold five patents and multiple industry certifications.

I am building a cloud-native software platform at JupiterOne to deliver knowledge, transparency and confidence to every digital operation in every organization, large or small.
