Introduction to AWS S3

Storage is one of the foundations of the cloud, and AWS S3 (Amazon Simple Storage Service) is one of its key services. This post covers the storage options on AWS, objects, keys, and buckets, along with bucket limitations, access control, and policies.
[Image: a MacBook Air with a lot of programming stickers on the back.]
Photo by Mehmet Ali Peker / Unsplash

Introduction and Background

Amazon S3 is a reliable and scalable storage service that keeps data storage and retrieval simple. Before discussing Amazon S3, however, let us look at some basic concepts around storage.

Storage Options on AWS

There are three general types of storage services: block, file, and object. Let's see them one by one.

Block Storage

Block storage is typically accessed like a local storage device, such as a hard disk drive, even though it isn't necessarily physically attached to the server. On AWS it is available as Elastic Block Store (EBS) and can be configured as volumes attached to your Elastic Compute Cloud (EC2) instances.

A good example is an EBS volume, which isn't directly attached to the EC2 instance you deployed but is connected to it via high-speed network links. As an advantage, this allows us to detach an EBS volume from one EC2 instance and attach it to another.

Note: Block storage uses management software, which can be part of the operating system, to organize data blocks. For example, when a user tries to access a specific file, the management software identifies the blocks to retrieve, reassembles them, and then presents them as a whole file to the user.
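
As a rough illustration of the note above, here is a toy sketch in Python of how management software might split a file into fixed-size blocks and reassemble them on read. The BlockDevice class, its method names, and the tiny block size are made up for this example; they do not correspond to any AWS or operating system API.

```python
class BlockDevice:
    """Toy model of a block device plus the table that maps file names to blocks."""

    def __init__(self, block_size=4):
        self.block_size = block_size  # bytes per block (tiny, for readability)
        self.blocks = []              # the raw blocks, addressed by index
        self.file_table = {}          # file name -> list of block indexes

    def write_file(self, name, data):
        """Split data into blocks and remember which blocks belong to the file."""
        indexes = []
        for offset in range(0, len(data), self.block_size):
            self.blocks.append(data[offset:offset + self.block_size])
            indexes.append(len(self.blocks) - 1)
        self.file_table[name] = indexes

    def read_file(self, name):
        """Look up the file's blocks and reassemble them into the whole file."""
        return b"".join(self.blocks[i] for i in self.file_table[name])


device = BlockDevice()
device.write_file("hello.txt", b"hello block storage")
print(device.file_table["hello.txt"])  # the block indexes that make up the file
print(device.read_file("hello.txt"))   # the reassembled file contents
```

The user only ever asks for "hello.txt"; the mapping from file name to blocks stays hidden inside the management layer.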

File Storage

File storage, also known as file-based storage, is a highly available, centralized place for storing files and folders. It can be accessed using file-level protocols such as the following:

  • Server Message Block (SMB)
  • Network File System (NFS)
  • Common Internet File System (CIFS)

It is typically used for storing and retrieving files and folders, chiefly for file sharing within an organization. Because of its folder/sub-folder architecture, it maps easily onto the way an organization already structures its data.

Amazon offers three file storage services:

  • Elastic File System (EFS) is designed to help you or your organization share file data without managing the storage as you would with Elastic Block Store (EBS). Your filesystem grows and shrinks as you add and remove data. Mount points can only be created on Linux EC2 instances.
  • FSx for Lustre (designed for Linux-based EC2 instances) is a high-performance filesystem intended for applications that need very fast storage; it can scale to hundreds of gigabytes per second of throughput and millions of Input/Output Operations Per Second (IOPS).
  • FSx for Windows File Server is designed for Windows EC2 instances, offers a fully managed file-share solution, and natively supports the industry-standard Server Message Block (SMB) protocol.

Object Storage

Unlike the traditional file hierarchy, object storage stores files as individual objects. It uses a flat structure: you create a container and place the objects in it without any folder or file-level hierarchy, which makes it well suited to unstructured data. We'll see more in the S3 section.

Benefits of using object storage

Scalability

Object storage in AWS is highly scalable; organizations can store practically unlimited amounts of data. Of course, AWS takes care of the scaling automatically to accommodate storage needs.

Availability

Object storage is designed for high durability and availability. AWS replicates an organization's objects across multiple servers and data centers within a region, so the data remains accessible in the event of hardware failures or other disruptions.

Developer Friendly

It provides APIs and SDKs for easy integration into any platform. Developers can manage objects programmatically using the AWS SDKs, command-line tools, or RESTful APIs.

What is AWS S3 (Amazon Simple Storage Service)?

Amazon offers S3 (one of Amazon's flagship products) as their cloud-based object storage service, making it easy to access storage over the internet. In other words, consumers of Amazon's S3 can store and retrieve any data anytime from anywhere in the world using the internet.

Moreover, it provides access to a highly scalable, reliable, efficient, and low-cost storage infrastructure. Even Amazon itself uses it to run its global network of websites. Furthermore, it is designed for 99.999999999% (11 nines) durability and, in the S3 Standard storage class, 99.99% availability, which fulfills the storage requirements of most clients and their individual business needs.

Note: Durability at this level means that if you store 10,000,000 objects on Amazon S3, you can on average expect to lose a single object once every 10,000 years.
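
The durability figure can be checked with simple arithmetic. The sketch below assumes S3's published design target of 99.999999999% (11 nines) annual durability, treats each object's loss as independent, and works out how often one object out of 10,000,000 would be lost on average. The variable names are only for illustration.

```python
from fractions import Fraction

# Design target: 99.999999999% durability per year, so the chance that a
# given object is lost in a year is 1 in 10**11.
annual_loss_probability = Fraction(1, 10**11)
stored_objects = 10_000_000

# Expected number of objects lost per year across the whole data set.
expected_losses_per_year = stored_objects * annual_loss_probability

# Average number of years between single-object losses.
years_per_lost_object = 1 / expected_losses_per_year

print(years_per_lost_object)  # 10000
```

Using Fraction keeps the arithmetic exact, which avoids floating-point noise with numbers this small.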

Suggested Storage for AWS S3 (Amazon Simple Storage Service)

AWS S3 can store nearly any data type (in any file format). Moreover, it is ideal for storing static content like graphics, documents, log files, audio and video, compressed files, etc.

Note: Presently, the permitted object size is 0 bytes to 5 TB.

What are Buckets?

A bucket is a container. We can think of it like a folder (a logical unit) in which we store objects. Here are some things we need to be aware of:

  • Buckets are created at the root level in S3 with a globally unique name across the AWS ecosystem.
  • Buckets are accessible via an S3 Uniform Resource Locator (URL).
  • Bucket names must always be in lowercase. They can contain only lowercase letters, numbers, hyphens (-), and dots (.).
  • Bucket names must be between 3 and 63 characters long.

One takeaway about buckets: they act as a namespace for your objects. Think of them as analogous to top-level directories in a file system.
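
The naming rules above can be sketched as a small Python check. This is only a partial validator: it covers the character set and length rules listed here, plus AWS's rule that a name must begin and end with a letter or number. AWS documents further restrictions (for example, a name must not be formatted like an IP address) that this sketch ignores. The function name is_valid_bucket_name is made up for this example.

```python
import re

# Lowercase letters, digits, hyphens, and dots; 3 to 63 characters in total;
# the first and last character must be a letter or a digit.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name):
    """Return True if name passes the basic rules above (not every AWS rule)."""
    return _BUCKET_NAME.fullmatch(name) is not None


for candidate in ["s3bucketmanila", "MyBucket", "ab", "my_bucket", "my.bucket-01"]:
    print(candidate, is_valid_bucket_name(candidate))
```

Passing this check does not mean the name is available, because bucket names are globally unique and someone else may already own it.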

What are Objects and Keys?

Objects are the actual items that an organization stores in S3. An object can be any file or data.

Here are some things we need to be aware of:

  • The name of an object is called a key. Keys can be up to 1,024 bytes long, and they can consist of letters and numbers and the following characters: !, -, _, ., *, and ( ).
  • As discussed, objects are stored in an unstructured manner (a flat namespace), so we must know an object's unique identifier to access it. Thus, objects are accessible via their keys.
  • The maximum size of an object on Amazon S3 is 5TB.
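
To make the flat namespace more concrete, here is a minimal sketch that treats a bucket as a plain set of keys and emulates a folder-style listing by grouping keys on a prefix and a delimiter, similar in spirit to the prefix and delimiter parameters of S3's list operations. The functions list_keys and key_is_within_limit and the sample keys are hypothetical and exist only for this example.

```python
def list_keys(keys, prefix="", delimiter="/"):
    """Emulate a folder listing over a flat set of keys.

    Returns (objects, common_prefixes): the keys directly under prefix, and
    the folder-like prefixes one level below it.
    """
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter behaves like a sub-folder.
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


def key_is_within_limit(key, max_bytes=1024):
    """Key length is measured in bytes of the UTF-8 encoding, not in characters."""
    return len(key.encode("utf-8")) <= max_bytes


bucket_keys = {
    "readme.txt",
    "photos/2023/cat.jpg",
    "photos/2023/dog.jpg",
    "photos/logo.png",
}

print(list_keys(bucket_keys))                    # top level of the bucket
print(list_keys(bucket_keys, prefix="photos/"))  # inside the "photos/" prefix
```

Nothing here is a real folder: "photos/2023/" exists only because some keys happen to start with that string.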

Creating A Bucket

Before diving deeper into buckets, let's create one using the AWS Management Console.

Steps on Creating a Bucket

  • Open the Amazon S3 service.
  • Find and click the "Create bucket" button.
  • In the bucket name field, enter a unique name that follows the naming rules covered in the What are Buckets? section.
  • Choose the AWS region where you want to create the bucket.
  • At the bottom of the page you'll see the "Create bucket" button; review the details before submitting.
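
The same steps can also be done programmatically. The sketch below assumes a boto3-style S3 client is passed in (boto3 is the AWS SDK for Python, and create_bucket is one of its S3 client methods); the helper name create_bucket_in_region, the sample bucket name, and the region are chosen only for illustration.

```python
def create_bucket_in_region(s3_client, bucket_name, region):
    """Create a bucket in the given region using a boto3-style S3 client."""
    params = {"Bucket": bucket_name}
    if region != "us-east-1":
        # Outside us-east-1 the region is passed as a location constraint;
        # for us-east-1 the configuration block is left out.
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return s3_client.create_bucket(**params)


# Typical use (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   s3 = boto3.client("s3", region_name="ap-southeast-1")
#   create_bucket_in_region(s3, "s3bucketmanila", "ap-southeast-1")
```

Passing the client in as an argument keeps the helper easy to try out with a stand-in object before pointing it at a real AWS account.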

Optional Things

  • There are some optional bucket settings to configure, such as enabling versioning to store multiple versions of an object, configuring logging to record detailed access logs for the bucket, or setting up event notifications to trigger actions when specific events occur in the bucket.
  • You can also choose the appropriate options for setting permissions, although keeping the default settings is fine.
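
As one example of these optional settings, versioning can also be switched on from code. This is a minimal sketch that again assumes a boto3-style S3 client is passed in and uses its put_bucket_versioning method; the helper name enable_versioning is made up for this example.

```python
def enable_versioning(s3_client, bucket_name):
    """Turn on versioning so the bucket keeps multiple versions of each object."""
    return s3_client.put_bucket_versioning(
        Bucket=bucket_name,
        VersioningConfiguration={"Status": "Enabled"},
    )


# Typical use (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   enable_versioning(boto3.client("s3"), "s3bucketmanila")
```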

Bucket Limitations

There are some bucket limitations that we need to be aware of:

  • Amazon imposes a soft limit on the number of buckets in each AWS account. It was 100 when this post was written, and AWS has raised the default since then, so check the current S3 service quotas. If you need more, raising a support ticket with Amazon is recommended.
  • Once a bucket has been created, you can't change its name or region.
  • A bucket is associated with the AWS account and the user that created it. Bucket ownership cannot be transferred to another AWS account or to another user within the same AWS account.

Bucket Access Control

Each bucket in S3 is associated with an access control policy that controls how objects are created, deleted, and enumerated within the bucket.

All S3 resources (buckets, objects, lifecycle policies, or static website configurations) are private by default when they are created. Only the owner of a resource can access it. The owner can then optionally grant permissions to other users through an access control policy.

Bucket Policy

Let's look at a sample policy JSON document first, before we discuss what a bucket policy comprises.

{
  "Id": "Policy1685879241487",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1685879133102",
      "Action": [
        "s3:GetBucketLocation",
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::s3bucketmanila",
        "arn:aws:s3:::s3bucketmanila/*"
      ],
      "Principal": {
        "AWS": [
          "arn:aws:iam::ACCOUNT_ID:user/iam_user2"
        ]
      }
    }
  ]
}

AWS provides a tool for generating policies such as this one, called the AWS Policy Generator.

[Screenshot: AWS Policy Generator]

A bucket policy contains the following elements:

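Resource - identifies the Amazon S3 resources the statement applies to, such as buckets and objects, by their Amazon Resource Names (ARNs). Bucket-level actions such as s3:ListBucket apply to the bucket ARN, while object-level actions such as s3:GetObject apply to the objects under it (arn:aws:s3:::s3bucketmanila/*).

Action - lists the actions that are either allowed or denied. In our example, s3:GetBucketLocation specifies the permission to get the region the bucket resides in, s3:GetObject specifies the permission to read object data, and s3:ListBucket specifies the permission to list the objects in the bucket.

Effect - specifies whether the statement allows or denies access. By default, access is denied if permission isn't explicitly granted on a particular resource.

Principal - identifies the account or user that is allowed or denied access to the resources mentioned in the policy statement. In a real policy, a user is given as a full ARN, such as arn:aws:iam::ACCOUNT_ID:user/iam_user2, where ACCOUNT_ID is a placeholder for the 12-digit AWS account ID.

Sid - an optional identifier, known as the statement ID, specified for the policy statement. It must be unique within the policy.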
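
To see how these elements fit together, here is a minimal sketch that builds a similar policy as a Python dictionary and runs a heavily simplified access check over it. The is_allowed function is not AWS's real evaluation logic: it ignores Resource matching, wildcards, and conditions, and only illustrates the rule that access is denied unless a statement explicitly allows it. ACCOUNT_ID is a placeholder, not a real account ID.

```python
import json

USER_ARN = "arn:aws:iam::ACCOUNT_ID:user/iam_user2"  # placeholder account ID

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1685879133102",
            "Effect": "Allow",
            "Action": ["s3:GetBucketLocation", "s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::s3bucketmanila",
                "arn:aws:s3:::s3bucketmanila/*",
            ],
            "Principal": {"AWS": [USER_ARN]},
        }
    ],
}


def is_allowed(policy, principal, action):
    """Simplified check: explicit Deny wins, explicit Allow grants, otherwise deny."""
    allowed = False
    for statement in policy["Statement"]:
        if principal not in statement["Principal"]["AWS"]:
            continue
        if action not in statement["Action"]:
            continue
        if statement["Effect"] == "Deny":
            return False
        allowed = True
    return allowed


print(json.dumps(policy, indent=2))                  # the policy as a JSON document
print(is_allowed(policy, USER_ARN, "s3:GetObject"))  # explicitly allowed
print(is_allowed(policy, USER_ARN, "s3:PutObject"))  # never granted, so denied
```

Writing to the bucket is refused for iam_user2 even though no statement mentions s3:PutObject, which is the default-deny behavior described under Effect.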

Summary

In this post, we have discussed the storage options on AWS, such as block storage, file storage, and object storage.

From there, we focused on AWS S3 to understand buckets, objects, and keys. We also saw how to create a bucket and covered some of its limitations, access control, and the related policy.

I hope you have enjoyed this article as much as I enjoyed writing it.

Till next time, happy cloud computing and programming!
