How to create an AWS S3 bucket for Postgres backups
Schedule a GitHub Action to move data from your Neon projects to S3 every night - Part 1
In this post, I’ll walk you through setting up an AWS S3 bucket to store Postgres backups. This is part 1 of a 2-part series on automating nightly backups for multiple Neon projects—a helpful approach if you’re managing hundreds or even thousands of Neon projects (e.g. in multi-tenant architectures with one project per customer).
Neon already provides S3-level durability and rollback options for changes made to databases; however, some customers still want to back up data to their own S3 bucket. This is fairly simple if you only have one project, but Neon users often adopt a database-per-tenant architecture, and manually backing up each project to S3 can feel overwhelming.
Setting up a scheduled GitHub Action for each database simplifies this process, making the workflow much easier to manage. This first post will focus on the AWS side of things; in the following post, I explain how the GitHub Actions work.
Set up AWS Providers and Roles
There are three parts to the AWS setup:
- Creating an OIDC Identity Provider
- Creating a Role
- Creating an S3 bucket and updating the S3 bucket policy
Add an Identity provider
An OIDC (OpenID Connect) Identity Provider (IdP) in AWS represents a third-party service that handles authentication. GitHub must be added as an identity provider so that the Action can request temporary AWS credentials instead of relying on long-lived access keys stored in the repository.
To create a new Identity Provider, navigate to IAM > Access Management > Identity Providers, and click Add provider.
On the next screen select OpenID Connect and add the following to the Provider URL and Audience fields.
- Provider URL: https://token.actions.githubusercontent.com
- Audience: sts.amazonaws.com
When you’re done, click Add Provider.
You should now see the provider in the list under IAM > Access Management > Identity Providers.
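If you would rather script this step than click through the console, the same two values can be passed to the IAM API. The sketch below is a minimal example that assumes boto3 is installed and that your credentials are allowed to create identity providers. The helper names are hypothetical. Recent versions of the IAM API treat the thumbprint list as optional, but that is worth verifying against the version you use.

```python
# Minimal sketch: register GitHub as an OIDC identity provider.
# The function names are hypothetical; the two values come from the console steps above.

GITHUB_PROVIDER_URL = "https://token.actions.githubusercontent.com"
GITHUB_AUDIENCE = "sts.amazonaws.com"


def oidc_provider_params() -> dict:
    """Return the keyword arguments for IAM's create_open_id_connect_provider call."""
    return {
        "Url": GITHUB_PROVIDER_URL,
        "ClientIDList": [GITHUB_AUDIENCE],  # the "Audience" field in the console
    }


def create_github_oidc_provider(iam_client) -> dict:
    """Create the provider using any object that exposes the boto3 IAM client API."""
    return iam_client.create_open_id_connect_provider(**oidc_provider_params())


# Usage (assumes boto3 and valid AWS credentials):
#   import boto3
#   response = create_github_oidc_provider(boto3.client("iam"))
#   print(response["OpenIDConnectProviderArn"])
```

Passing the client in as an argument keeps the helper easy to try out with a stub before pointing it at a real account.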
Create Role
A Role is an identity that you can assume to obtain temporary security credentials for specific tasks or actions within AWS. Roles are used to delegate permissions and grant access to AWS services without the need for credentials like passwords or access keys.
To create a new Role, navigate to IAM > Access Management > Roles, and click Create role.
On the next screen you can create a Trusted Identity for the Role.
Select Trusted Identity
On this screen select Web Identity, then select token.actions.githubusercontent.com from the Identity Provider dropdown menu.
Once you select the Identity Provider, you’ll be shown a number of fields to fill out. Select sts.amazonaws.com from the Audience dropdown menu, then fill out the GitHub repository details as per your requirements. When you’re ready, click Next.
For reference, the options shown in the image below are for the following repository:
Add Permissions — Skip
You can skip selecting anything from this screen and click Next to continue.
Name, review and create
On this screen, give the Role a name and description. You’ll use the Role name in the code for the GitHub Action, so consider choosing a specific, descriptive name to avoid confusion later.
When you’re ready, click Create role.
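For reference, the Web Identity choices made above end up as a trust policy attached to the Role. The sketch below builds a policy of that general shape so you can compare it with what appears on the Role's Trust relationships tab. The function name, the example account ID, and the organisation and repository names are placeholders, not values from this post.

```python
import json

GITHUB_OIDC_HOST = "token.actions.githubusercontent.com"


def github_trust_policy(account_id: str, org: str, repo: str, ref_filter: str = "*") -> dict:
    """Build a trust policy that lets workflows in org/repo assume the Role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The identity provider added in the previous section
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{GITHUB_OIDC_HOST}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    # Must match the Audience selected in the console
                    "StringEquals": {f"{GITHUB_OIDC_HOST}:aud": "sts.amazonaws.com"},
                    # Restricts which repository (and optionally branch) may assume the Role
                    "StringLike": {f"{GITHUB_OIDC_HOST}:sub": f"repo:{org}/{repo}:{ref_filter}"},
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values for illustration only
    print(json.dumps(github_trust_policy("123456789012", "my-org", "my-repo"), indent=2))
```

Narrowing `ref_filter` (for example to `ref:refs/heads/main`) limits the Role to workflows running from a single branch.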
Set up AWS S3 bucket
There are two parts to creating an S3 bucket:
- Creating an S3 bucket
- Updating the bucket policy
Create S3 bucket
AWS S3 (Amazon Simple Storage Service) buckets are storage containers used to store objects in Amazon’s cloud storage service. An S3 bucket can store any amount of data, from files and documents to images and videos, or in the case of a database backup, a .gz (GNU zip) file.
To create a new bucket, navigate to S3 > buckets, and click Create bucket.
On the next screen select General Purpose for the bucket Type and then give your bucket a name.
The most important thing to notice on this screen is the region where you’re creating the bucket.
It can be difficult to know ahead of time which region each of your databases is deployed to. Having an S3 bucket in us-east-1 and a database in ap-southeast-1 isn’t necessarily a big problem, but the greater the distance the data has to travel, the longer a backup job might take to complete.
S3 bucket Policy
To ensure the Role being used in the GitHub Action can perform actions on the S3 bucket, you’ll need to update the bucket policy.
Select your bucket then select the Permissions tab and click Edit.
You can now add the following policy which grants the Role you created earlier access to perform S3 List, Get, Put and Delete actions.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::627917386332:role/neon-multiple-db-s3-backups-github-action"
      },
      "Action": [
        "s3:ListBucket",
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::neon-multiple-db-s3-backups",
        "arn:aws:s3:::neon-multiple-db-s3-backups/*"
      ]
    }
  ]
}
In the snippet above, replace the account ID (627917386332) with your own AWS account ID, the Role name (neon-multiple-db-s3-backups-github-action) with your Role name, and the S3 bucket name (neon-multiple-db-s3-backups) with your S3 bucket name. When you’re ready, click Save changes.
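If you manage more than one bucket or Role, substituting those values by hand gets error-prone. The sketch below generates the same policy from three inputs. The function name is hypothetical, and the output is meant to be pasted into the bucket policy editor rather than applied automatically.

```python
import json


def backup_bucket_policy(account_id: str, role_name: str, bucket_name: str) -> dict:
    """Build the bucket policy shown above for a given account, Role and bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:role/{role_name}"},
                "Action": [
                    "s3:ListBucket",
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:DeleteObject",
                ],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",    # the bucket itself (ListBucket)
                    f"arn:aws:s3:::{bucket_name}/*",  # objects in the bucket (Get/Put/Delete)
                ],
            }
        ],
    }


if __name__ == "__main__":
    policy = backup_bucket_policy(
        "627917386332",
        "neon-multiple-db-s3-backups-github-action",
        "neon-multiple-db-s3-backups",
    )
    print(json.dumps(policy, indent=2))
```

Both Resource entries are needed: s3:ListBucket applies to the bucket ARN, while the object-level actions apply to the `/*` ARN.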
And that’s it.
Finished
There are just a couple of things to note before moving on to the second part of this blog post series. You’ll be creating several GitHub Secrets to hold various values that you likely won’t want to expose or repeat in code. These are:
- AWS_ACCOUNT_ID: This can be found by clicking on your user name in the AWS console.
- S3_BUCKET_NAME: In my case, this would be neon-multiple-db-s3-backups
- IAM_ROLE: In my case, this would be neon-multiple-db-s3-backups-github-action
Make a note of these so you have them ready for the next part!
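As a quick sanity check on the values you noted, the sketch below shows how they combine into the Role ARN and bucket location. It assumes the three values are available as environment variables with the same names as the secrets. How part 2 actually consumes them is not covered here, and the function name is hypothetical.

```python
import os
import re


def backup_targets(env=os.environ) -> dict:
    """Combine the three noted values into a Role ARN and an S3 bucket URI."""
    account_id = env["AWS_ACCOUNT_ID"]
    role_name = env["IAM_ROLE"]
    bucket_name = env["S3_BUCKET_NAME"]

    # AWS account IDs are 12-digit numbers; catch copy/paste slips early
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError("AWS_ACCOUNT_ID should be a 12-digit number")

    return {
        "role_arn": f"arn:aws:iam::{account_id}:role/{role_name}",
        "bucket_uri": f"s3://{bucket_name}",
    }


# Usage with explicit values instead of real environment variables:
#   backup_targets({
#       "AWS_ACCOUNT_ID": "123456789012",
#       "IAM_ROLE": "neon-multiple-db-s3-backups-github-action",
#       "S3_BUCKET_NAME": "neon-multiple-db-s3-backups",
#   })
```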