S3 App File Storage

This part should be relatively easy, since we've already done most of the work needed to get it working.

Why S3?

Currently, all the images for the QR app are stored on the EC2 instance. This is not a good idea for a few reasons:

  1. It's not scalable. If we want to horizontally scale the app, we would need to distribute the images across multiple EC2 instances.
  2. It's not durable. If the EC2 instance fails, the images will be lost.
  3. It's cheaper to store the images in S3 than to store them on an EC2 instance's EBS volume.
  4. It's often better to have users access the images directly from S3 to save on bandwidth and reduce the load on the EC2 instance. And if we connect the S3 bucket to a CloudFront distribution, we can even speed up delivery of the images.

Setup

We already set up the bucket and the IAM policy that allows the EC2 instance to access the bucket. Now we just need to tell the QR code app to store images there instead of on the file system.

Fortunately, the QR code app is already able to store images in S3. It just needs some environment variables to be set. The code in the app basically looks like this:

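import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

// `name` (the image file name) and `buffer` (the image bytes) are defined elsewhere in the app.
const bucketName = process.env.BUCKET_NAME;
const bucketRegion = process.env.BUCKET_REGION;
if (!bucketName || !bucketRegion) {
  // save the image to the local file system
} else {
  // both variables are set, so upload the image to S3
  const s3Client = new S3Client({
    region: bucketRegion,
  });
  const key = `qr-codes/${name}`;
  const command = new PutObjectCommand({
    Bucket: bucketName,
    Key: key,
    Body: buffer,
    ContentType: "image/webp",
  });
  await s3Client.send(command);
  // URL the browser can use to fetch the uploaded object
  const url = `https://${bucketName}.s3.amazonaws.com/${key}`;
}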

As long as the BUCKET_NAME and BUCKET_REGION environment variables are both set, the app stores the images in S3 instead of on the file system. If either one is missing, it falls back to the local file system.
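To make the branching rule easier to see in isolation, here is a minimal sketch of the same decision written as two pure functions. The names `StorageTarget`, `resolveStorageTarget`, and `objectUrl` are hypothetical and are not taken from the app's source; the URL format simply mirrors the snippet above.

```typescript
// Hypothetical types and helpers for illustration; the real app may be organized differently.
type StorageTarget =
  | { kind: "local" }
  | { kind: "s3"; bucket: string; region: string };

// Decide where images go, based only on the two environment variables.
function resolveStorageTarget(
  env: Record<string, string | undefined>
): StorageTarget {
  const bucket = env.BUCKET_NAME;
  const region = env.BUCKET_REGION;
  if (!bucket || !region) {
    // Either variable missing or empty: fall back to the local file system.
    return { kind: "local" };
  }
  return { kind: "s3", bucket, region };
}

// Build the object URL in the same form the snippet above uses.
// Returns null when images are stored locally.
function objectUrl(target: StorageTarget, key: string): string | null {
  if (target.kind !== "s3") return null;
  return `https://${target.bucket}.s3.amazonaws.com/${key}`;
}
```

In a Node.js app this would typically be called once at startup as `resolveStorageTarget(process.env)`. Passing the environment in as an argument keeps the rule easy to test without touching the real environment.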

Step 1:

Add these two environment variables to the app.env file:

sudo vim /etc/app.env

MYSQL_URL='mysql://my_app_user:StrongPassword123!@localhost:3306/my_app'
BUCKET_NAME='your-bucket-name'
BUCKET_REGION='your-bucket-region'

Step 2:

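Reload systemd and restart your service so the app picks up the new environment variables. (The daemon-reload is only strictly needed if the unit file itself changed, but it is harmless to run.)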

sudo systemctl daemon-reload
sudo systemctl restart myapp.service

Now try out the app. Add a new photo and click the SaveQRCode button. Then check the S3 bucket to see whether the images were stored there.

If the images are not stored in S3, check the following:

  1. Make sure the BUCKET_NAME and BUCKET_REGION environment variables are set correctly in the app.env file.
  2. Check the application logs to see if there are any errors.
  3. Make sure the app has the correct permissions to access the S3 bucket. This was set up previously in the IAM EC2 part.
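For the first check in the list above, a small sanity check on the two values can catch common mistakes, such as entering a display name like "US West 2" instead of the region code. This is a rough sketch, not part of the app: `checkBucketEnv` is a hypothetical helper, the bucket-name pattern covers only the basic S3 naming rules (3 to 63 characters, lowercase letters, digits, dots, and hyphens, starting and ending with a letter or digit), and the region pattern is an approximation of codes such as `us-west-2`.

```typescript
// Hypothetical helper: returns a list of human-readable problems (empty if none found).
function checkBucketEnv(env: Record<string, string | undefined>): string[] {
  const problems: string[] = [];
  const name = env.BUCKET_NAME ?? "";
  const region = env.BUCKET_REGION ?? "";

  // Basic S3 bucket naming rules only; this does not cover every restriction.
  if (!/^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$/.test(name)) {
    problems.push(`BUCKET_NAME "${name}" does not look like a valid bucket name`);
  }

  // Approximate shape of a region code, e.g. us-west-2 or ap-southeast-1.
  if (!/^[a-z]{2}(-[a-z]+)+-\d+$/.test(region)) {
    problems.push(`BUCKET_REGION "${region}" does not look like a region code such as us-west-2`);
  }

  return problems;
}

// Example with the placeholder values from app.env replaced by plausible ones.
const problems = checkBucketEnv({
  BUCKET_NAME: "your-bucket-name",
  BUCKET_REGION: "us-west-2",
});
console.log(problems.length === 0 ? "environment looks fine" : problems.join("\n"));
```

A passing check only means the values are well formed. It does not confirm that the bucket exists or that the instance role can write to it, so the logs and IAM checks still apply.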
00:00
Now we're gonna set up the QR code app
00:02
to store image files in an S3 bucket
00:05
instead of in the EC2 file system.
00:07
And this is a pretty common thing to do
00:10
with web applications where users can generate content
00:13
or upload images.
00:14
Instead of storing that on the EC2 file system,
00:17
we choose to store it in an S3 bucket
00:19
because that allows us to horizontally scale our application.
00:21
If we have multiple EC2 instances running our application,
00:25
then it would be really difficult to scale our image files,
00:28
in this case, across multiple EC2 instances
00:30
so they're all accessible.
00:32
So by storing them in a single location,
00:34
that makes it a lot easier
00:35
to horizontally scale our application.
00:37
It also is much cheaper to store it in S3
00:39
than in an EBS drive.
00:41
And we can also connect that to a CloudFront distribution
00:44
if we wanted.
00:44
So people could upload an image to the S3 bucket,
00:47
but if a bunch of people globally need to see that image,
00:50
then they can get it straight
00:51
from their closest edge location.
00:53
And I already have the application set up.
00:55
I have my EC2 instance right here.
00:56
So if we go
00:57
to the public IP address
01:00
and go to generate a new QR code,
01:02
I have the option to upload an image here
01:05
and that will generate a QR code
01:06
with the image as the background.
01:08
So then if I save this code,
01:10
this image is gonna be uploaded
01:12
to the EC2 instance's file system.
01:15
And we could actually go and see that.
01:16
So I'm logged in to the EC2 instance here.
01:19
And if I cd into the app directory,
01:22
I think they're in the server directory.
01:25
Yeah, so if I list the contents
01:26
of the uploads directory,
01:29
I have a bunch of image files here.
01:31
So essentially every time I've been creating a QR code,
01:35
it creates an image for each of the different styles
01:38
so that it's really quick and easy
01:39
to then get those images later on.
01:41
I don't have to process them over and over again.
01:42
They get processed once and saved.
01:44
So currently just for this web app,
01:46
they're being stored in the server directory
01:48
and that can be fine for a small application.
01:50
But like I said, we're gonna change this
01:51
so that these images actually get stored in an S3 bucket
01:54
instead of on the EC2 instance.
01:56
And the application code for this server
01:59
has already been set up that if we have a bucket name
02:02
and a bucket region in the environment variables
02:05
on the EC2 instance,
02:07
the application is gonna grab those details
02:09
and then using the S3 client JavaScript SDK from AWS,
02:14
we can make a put request from TypeScript
02:17
to send the images to S3
02:19
instead of in the local file system.
02:21
And if we're using a different language,
02:23
we can use a library for that language,
02:25
or make the request directly.
02:27
But essentially any piece of software that you're writing
02:30
will be able to just add a little bit of code like this,
02:32
a put object request and store files in an S3 bucket.
02:37
And I have a simplified version of this.
02:40
So we can see kind of the key components
02:42
as long as there is a bucket name
02:44
and bucket region environment variable on the EC2 instance,
02:48
then it will just try and make that put object request
02:50
to the S3 bucket.
02:51
So what we need to do is have a bucket
02:53
and update the environment variables on the EC2 instance
02:54
to store it in that bucket.
02:57
So I'm gonna head back over to the AWS console
03:00
and I'm gonna go to S3
03:01
'cause you could use the S3 bucket
03:03
that we set up at the beginning of this section,
03:05
but I actually deleted that one.
03:06
So I'm gonna create a brand new bucket.
03:07
So I'm gonna create a general purpose bucket.
03:09
This is my QR codes app bucket.
03:12
So I'll store all the user uploaded assets in this bucket.
03:16
And I don't need to modify any of the other settings.
03:18
It is gonna be private by default.
03:20
It's gonna be secure by default.
03:21
And that name is taken, so QR codes app demo instead.
03:24
So I'll create that bucket.
03:27
So here is the bucket that I just made.
03:29
This is the name of the bucket and the region. I'm in Oregon.
03:33
So the region is gonna be us-west-2.
03:35
So back in my EC2 instance,
03:38
I am going to sudo vim /etc/app.env.
03:43
'Cause that's where we're keeping the environment variables
03:47
for this application.
03:49
And then I need that bucket name environment variable.
03:53
And that's just the name of the bucket we made.
03:55
And then I'm also gonna need bucket region.
03:59
And this is gonna be us-west-2,
04:01
the location where the bucket is.
04:03
So these are the environment variables I need
04:05
for that application.
04:06
I'm gonna save this file and then sudo systemctl
04:09
restart myapp.service.
04:13
And now my application should try and save assets
04:17
to this S3 bucket every time I upload a new QR code.
04:20
So I'm gonna go back to the homepage and I am gonna upload
04:23
the image again and save this QR code.
04:27
And I'm getting an error on the front end,
04:28
error saving QR code, please try again.
04:30
That was a server error.
04:32
So to see that error,
04:33
we could go back into the EC2 instance
04:35
and check the logs there.
04:36
But since we already have CloudWatch set up,
04:39
let's see if we can check the logs in there.
04:42
So I'll go back to log groups and my QR app log group.
04:47
And then at the bottom here,
04:49
I'm expecting one of these to be the error logs.
04:51
That's level 50.
04:52
I know those are errors.
04:53
So I'm gonna go back to my S3 service exception.
04:55
And the message here is saying that the IAM role,
04:58
test IAM role,
04:59
which is the current role that I have attached
05:01
to the EC2 instance that we set up earlier,
05:03
is not authorized to perform put object
05:05
on my QR codes app demo again, S3 bucket.
05:09
And that is because if we go back over to IAM
05:13
and I go to roles and get my test IAM role,
05:17
'cause that's the one that's attached
05:18
to the current EC2 instance,
05:20
I can see two policies,
05:21
one's for CloudWatch and one's that old IAM
05:23
demo S3 policy that I had for the old bucket,
05:26
but I don't actually have a new policy
05:28
for this new bucket that I made.
05:29
So what I'm actually gonna do is I'm gonna modify
05:31
this policy.
05:32
I could delete this policy and create a new one,
05:33
or I could just add another one.
05:34
But since I deleted this bucket,
05:36
I'm just gonna modify this existing policy
05:38
because all I need to do in here,
05:40
all of these actions are correct.
05:42
This is what I want the application to be able to do,
05:44
but I just need to change the name of the bucket.
05:46
So this is instead for the QR codes app demo bucket.
05:50
So now if I save this policy,
05:52
we'll be allowed to do those actions
05:53
on that specific resource.
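As a rough sketch of what the edited role policy could look like once the bucket name is swapped in: the video does not list the actions in the policy, so `s3:PutObject` and `s3:GetObject` below are assumptions based on the app storing and getting assets. `buildAppBucketPolicy` is a hypothetical helper, and `your-bucket-name` is a placeholder.

```typescript
// Shape of a simple identity-based IAM policy document.
interface PolicyStatement {
  Effect: "Allow" | "Deny";
  Action: string[];
  Resource: string;
}

interface PolicyDocument {
  Version: "2012-10-17";
  Statement: PolicyStatement[];
}

// Hypothetical helper: scope the (assumed) object actions to one bucket's objects.
function buildAppBucketPolicy(
  bucketName: string,
  actions: string[] = ["s3:PutObject", "s3:GetObject"] // assumed, not shown in the video
): PolicyDocument {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Action: actions,
        // "/*" targets the objects inside the bucket, not the bucket itself.
        Resource: `arn:aws:s3:::${bucketName}/*`,
      },
    ],
  };
}

// Print JSON that could be pasted into the IAM policy editor.
console.log(JSON.stringify(buildAppBucketPolicy("your-bucket-name"), null, 2));
```

The only part that changes when the bucket is replaced is the `Resource` ARN, which is why editing the existing policy is enough here.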
05:55
So I'm going to go next and save the changes.
05:59
And this should be immediate.
06:00
I should now definitely have permission
06:03
to write to that S3 bucket.
06:05
So let's try saving the QR code again,
06:07
and it should be successful.
06:08
It's processing all those images.
06:09
It's trying to put it into the S3 bucket now.
06:11
And that seemed to be successful.
06:12
So if I head back over to the S3 bucket
06:15
and I go to objects, I'm gonna refresh this,
06:18
and I can see I have a QR codes folder.
06:19
That's where my application is putting those files.
06:23
And if we keep going down here,
06:24
here are all the files for the QR code.
06:27
And if I actually went to one of these and downloaded it,
06:29
I'll be able to see that this is one of the QR codes.
06:33
It's not a great one for this black and white image,
06:35
but the QR code assets are now being stored
06:38
in the S3 bucket.
06:39
So the application logic has to make sure
06:41
that it's putting objects in the bucket,
06:43
but we also need to make sure that the IAM role
06:46
has a policy that is allowing certain actions
06:50
on that specific bucket.
06:51
So the EC2 instance
06:53
right now is storing all of its assets in this S3 bucket.
06:58
So essentially, if a user makes a request,
07:02
a post request to upload an image to this application,
07:06
the EC2 instance is gonna store that file in S3.
07:08
Then if the user wants to get that image back and view it,
07:12
we can follow a similar path where the client
07:14
can make a get request to the EC2 instance.
07:16
The EC2 instance can get the object from the S3 bucket.
07:19
It can go back to the instance and back to the user.
07:21
However, when we post the file,
07:23
we might wanna do some server-side authentication.
07:26
In this case, I'm actually processing the image heavily
07:28
on the server before I save it into the S3 bucket.
07:31
So in this case, it's a requirement that it goes through
07:34
the EC2 instance on its way to the S3 bucket,
07:36
but there's actually no reason why the image needs
07:39
to pass through the EC2 instance, take up compute,
07:42
take up network and bandwidth on my instance,
07:45
when realistically, when we post the image,
07:47
we have to do it this way, but when we get the image,
07:49
the client could just make a request directly to the S3 bucket
07:53
to get that image out of it.
07:54
Or we could have this S3 bucket connected
07:57
to a CloudFront distribution,
07:59
which would increase performance
08:01
and potentially even lower costs.
08:03
And instead, the client could make a get request
08:06
to the CloudFront distribution,
08:07
which would be connected to that S3 bucket.
08:09
So we have a lot of different options here.
08:11
And this is definitely the preferred way
08:12
that you would end up doing things like this
08:14
in large-scale applications where you have a lot
08:17
of get requests to static assets.
08:19
But for now, what we're gonna do is we're gonna enable
08:21
the user to just get
08:23
the assets directly from S3.
08:24
So we're not taking up EC2 compute or bandwidth
08:27
every time we make a get request for an image.
08:29
And they're all gonna be public images,
08:30
so we're not gonna have to have any security features
08:32
on this.
08:33
If you request an asset in the bucket,
08:34
you're just gonna get that back.
08:35
And if we head back over to the S3 bucket here,
08:39
I'm on one of the objects.
08:40
So this is one of the images that is stored
08:42
in the S3 bucket.
08:43
And if we come down here, we can see there's an object URL.
08:45
So there is a direct URL to access the assets
08:48
in the S3 bucket.
08:49
But if I were to try and open this URL,
08:51
we'll see that we get
08:53
access denied.
08:53
Because again, everything's gonna be secure
08:56
and locked down and private by default.
08:57
So even though there's a URL to access this asset,
08:59
we won't be able to access it
09:01
unless we grant permission somehow.
09:03
I also will note that although all of these assets
09:06
will be public, like anyone will be able to see them,
09:09
the names, the way these names are generated
09:11
is in a cryptographically secure way.
09:13
These aren't super secure, they're kind of short,
09:15
but you can make them even longer.
09:16
And a lot of applications work this way,
09:19
where you create a very random name for the image,
09:22
store it in
09:23
an S3 bucket and then you just make them all public.
09:26
And that's secure enough because people won't be able
09:29
to guess the image name.
09:30
So if you send an image in a private direct message
09:33
on Discord, for example,
09:34
that image is technically public to the entire world.
09:37
It's just that the URL is not guessable.
09:39
But there's no server-side auth to check
09:41
if you're allowed to access these images.
09:43
They're just public because it's more efficient
09:45
to just allow users to directly grab the asset
09:48
from the S3 bucket or from the CloudFront distribution,
09:50
rather than having to go through an EC2 instance,
09:53
or some compute, which is slower and more expensive.
09:55
So this is a very common way of allowing users
09:59
to grab static assets.
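The random-name approach described above can be sketched in a few lines. This is an illustration under assumptions, not the app's actual naming code: `randomObjectKey` is a hypothetical helper, the `qr-codes/` prefix comes from the lesson's snippet, and the `.webp` extension is assumed from its `image/webp` content type.

```typescript
import { randomBytes } from "node:crypto";

// Hypothetical helper: build an object key whose name part is cryptographically random.
// 16 random bytes give 128 bits of entropy, encoded as 32 hex characters.
function randomObjectKey(byteLength: number = 16): string {
  const id = randomBytes(byteLength).toString("hex");
  return `qr-codes/${id}.webp`;
}

console.log(randomObjectKey());
```

Increasing `byteLength` makes the name longer and harder to guess, which matters because the unguessable URL is the only thing protecting a publicly readable object.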
10:01
But like I said, we actually need to enable this feature.
10:03
So in the S3 bucket, I'm going to head back to the bucket
10:07
and I'm going to go to permissions inside of the bucket.
10:09
And the first thing we're going to do is allow public access,
10:12
just like we did for the static site.
10:14
So instead of blocking all public access,
10:16
we're going to allow that.
10:17
And I'm going to type in confirm.
10:20
We are allowing things to be accessed over the public internet.
10:22
But that's not enough because we have
10:24
to write a bucket policy, which, again, is a JSON policy.
10:28
It's different than an IAM policy,
10:29
but the idea is similar, that we have to specify exactly what
10:33
we're allowing and who or what is able to access
10:36
the assets within this bucket.
10:37
So we're going to edit this policy.
10:39
And we'll add a new statement here for S3,
10:43
where we're just going to allow getting object publicly.
10:46
So just the get object there.
10:47
And then if we go down, we're going to add a resource.
10:50
So this is going to be--
10:52
object resource.
10:54
So for the bucket name, this needs to be the bucket name
10:57
that we already have.
10:58
And then for the object, this is going to be any object.
11:00
So anything inside this S3 bucket,
11:01
you're going to be allowed to make a get request for it.
11:04
So I'm going to add that as the resource.
11:06
And then for principal, we're going
11:08
to allow this permission to be on basically everything.
11:12
Anything or anyone that makes a get request
11:15
is going to be allowed to get an object, any object,
11:19
in this S3 bucket.
11:21
So this is what the policy is going to look like.
11:22
And I will save these changes.
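The bucket policy built in the editor above would look roughly like the JSON this sketch prints: one statement that allows anyone to get any object in the bucket. `buildPublicReadBucketPolicy` is a hypothetical helper, the `Sid` is just a label chosen for illustration, and `your-bucket-name` is a placeholder.

```typescript
// Hypothetical helper: a bucket policy that makes every object publicly readable.
function buildPublicReadBucketPolicy(bucketName: string) {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Sid: "PublicReadGetObject", // label only, assumed for illustration
        Effect: "Allow",
        Principal: "*", // anyone, including unauthenticated requests
        Action: "s3:GetObject", // read objects only; no listing or writing
        Resource: `arn:aws:s3:::${bucketName}/*`, // every object in the bucket
      },
    ],
  };
}

// Print JSON that could be pasted into the bucket policy editor.
console.log(JSON.stringify(buildPublicReadBucketPolicy("your-bucket-name"), null, 2));
```

Unlike the IAM role policy, a bucket policy names a `Principal`, because it is attached to the resource and has to say who is being granted access.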
11:24
And now, if I go back to the objects here,
11:26
I'm just going to select a random image.
11:29
If we try and visit this URL, we're
11:32
going to see that we can actually
11:33
access the different images in here.
11:35
And I'm going to select a different one, one
11:37
that's a little bit more interesting.
11:38
But if we open that URL, we're now
11:40
able to access any of the assets in the S3 bucket.
11:43
So putting an asset will go through the EC2 instance,
11:45
but getting an asset comes straight from the S3 bucket,
11:47
and it wouldn't be that difficult to then connect this
11:49
to a CloudFront distribution.
11:50
And the way that this app is already
11:52
set up is that if we have the S3 bucket stuff already enabled,
11:57
then this is going to make the request to the S3 bucket.
12:00
So I can open up DevTools here.
12:01
I'm going to go into the Network tab,
12:03
and I'm going to refresh this page.
12:05
And I'm just going to filter on images.
12:07
Now, the previous images that I had before I set up the S3
12:09
bucket, they're on the EC2 instance.
12:11
These aren't going to work.
12:12
I'd have to reset the app somehow.
12:14
But this one up here, that's this image.
12:16
And if I start clicking on these different ones,
12:18
we can see that it's downloading these images.
12:21
And if I zoom in a little bit--
12:22
I'm going to open this up--
12:24
I can see that it's making that request directly to the S3
12:27
bucket.
12:27
So this application, just by default, knows to do that.
12:30
I have some code in there that if we have the bucket name
12:33
and bucket region there, it just knows to use the S3 bucket
12:35
instead.
12:36
And this makes it really easy to just get the assets directly
12:38
from the bucket.
12:39
And that is it for this IAM section.
12:41
We have an EC2 instance that is now
12:44
able to write logs to CloudWatch.
12:46
It is able to store and get assets from an S3 bucket.
12:50
And we have those IAM policies.
12:52
It's defining those exact permissions and then
12:55
a role that is allowing the EC2 instance to assume an identity
12:58
to actually do those things.
13:00
So once you have checked that this is working using
13:03
the Cloud Course CLI, you should definitely come in here
13:06
and terminate your EC2 instance and clean up anything else
13:10
that you have created during this section.