Simplify Local Development with MinIO: An S3-Compatible Solution
Overview
MinIO is a high-performance, S3-compatible object storage solution. Its API is compatible with the major hyperscalers' object storage services, such as Amazon Simple Storage Service (S3), Google Cloud Storage, and Azure Blob Storage.
What is Object Storage?
Object storage is a data storage architecture for unstructured data. A file's data blocks are kept together as a single object, which is assigned a key that identifies it. The object can then be placed in, or written to, a storage pool, also known as a bucket. Objects can later be queried and retrieved using a consistent set of RESTful APIs.
Comparison with Block Storage
Capability | Block storage | Object storage |
---|---|---|
Capacity | Limited | Nearly unlimited |
Storage method | Data is stored in blocks of fixed size and reassembled on demand | Non-hierarchical, unstructured data |
Metadata | Limited | Unlimited and customizable |
Data retrieval method | Data lookup table, effectively a map | Varies, but predominantly through REST requests
Performance | Fast; storage is mounted directly | Depends primarily on network latency, although running an object storage solution (such as MinIO) locally offers results comparable to block storage
Cost | Depends on the vendor, but mounting a volume is generally more expensive | Usually less expensive, though costs can grow through data transfer (egress) fees
API type | iSCSI | RESTful |
Note on Egress Fees
Major hyperscalers typically do not charge, or charge only nominal amounts, for data ingress (uploading data to their services). Data egress fees, also known as bandwidth or data transfer fees, are charges for moving data out of the cloud storage where it was uploaded, and they are separate from the fees organizations pay to store data in the cloud. Egress fees can quickly grow untenable and effectively create vendor lock-in. It is therefore important to conduct due diligence at the beginning of a project and assess which object storage provider to work with.
Cloudflare's R2 waives egress fees and offers comparable storage pricing. This can quickly pay off if a platform is predicted to have many read operations per month. For teams interested in moving off of S3, Cloudflare offers Sippy, a service that copies data to R2 as it is served.
Pain Points in Local Development Using Cloud Object Storage
Developing against cloud object storage, however, can be cumbersome. A DevOps engineer may first need to write the Terraform to provision the bucket; engineers then need to wade through the nuances of Identity and Access Management (IAM) and determine the correct permission structure for the objects or buckets. Not only does this cause friction, but it is also susceptible to over-provisioning and overly permissive security postures.
Leveraging MinIO as a container (Docker or LXC) in local development creates a uniform and consistent development environment. Developers can easily swap volumes to facilitate tests and other workflows.
MinIO offers an SDK, but thanks to its commitment to S3 compatibility, developers can use the SDK of their cloud provider without needing to install a separate MinIO SDK. This allows them to leverage their cloud provider's services, such as Lambda or SQS, while seamlessly swapping in MinIO for their object storage needs within the context of the same SDK.
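As a quick illustration (a minimal sketch using the AWS SDK for JavaScript v2, the same library used in the tutorial below; the endpoint, credentials, and region are placeholder values for a local MinIO instance), pointing the client at MinIO is essentially a one-line change:

```typescript
import aws, { Endpoint } from "aws-sdk";

// The same aws.S3 client works against both AWS S3 and MinIO; only the
// endpoint (and path-style addressing) changes between environments.
const s3 = new aws.S3({
  endpoint: new Endpoint("http://localhost:9000"), // local MinIO; omit in production to use s3.<region>.amazonaws.com
  region: "ca-central-1",
  credentials: { accessKeyId: "admin", secretAccessKey: "password" },
  s3ForcePathStyle: true, // MinIO serves buckets under paths, not subdomains
});

// Any S3 call made through this client now hits MinIO instead of AWS.
s3.listBuckets()
  .promise()
  .then(({ Buckets }) => console.log(Buckets?.map((b) => b.Name)));
```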
Tutorial
This tutorial assumes the reader has a basic familiarity with Docker and demonstrates how an existing Docker Compose project can be extended with MinIO.
- Bring up an existing project with `docker compose up`. The project could look something like this:
name: "example_project"
services:
server:
build:
context: ./server
dockerfile: Dockerfile
target: ${NODE_ENV:-development}
depends_on:
db:
condition: service_healthy
environment:
...
volumes:
- "./server:/home/node/server"
ports:
- "${API_PORT:-3001}:${API_PORT:-3001}"
networks:
- example_project_network
db:
image: postgres:15.7-alpine
environment:
...
ports:
- "${DB_PORT:-5432}:5432"
healthcheck:
...
volumes:
- "db-data:/var/lib/postgresql/data"
networks:
- example_project_network
web:
build:
context: ./web
dockerfile: Dockerfile
target: base
environment:
...
volumes:
- "./web:/home/node/web"
ports: # host:container
- "${APP_PORT:-3000}:${APP_PORT:-3000}"
networks:
- example_project_network
volumes:
db-data:
networks:
example_project_network:
name: example_project_network
driver: bridge
- After adding the necessary configuration to the `docker-compose.yaml` file, the project can be created with `docker compose up` (the `-d` flag is optional).
  - Check that the containers are running correctly with `docker ps`, or with `docker compose ls` for a project-specific view.
- Create a `.env` file, or add to an existing one, the line `COMPOSE_PROFILES=dev`. Compose profiles help adjust an application for different environments: containers marked with a profile listed in `COMPOSE_PROFILES` will be provisioned accordingly.
  - This ensures that the containers specifically marked with `dev` are built and run; profiled containers not in the list are skipped.
- Create a new file called `docker-compose.minio.yaml` and add the following:
name: "example_project"
services:
minio:
image: minio/minio
profiles:
- "dev" <--- Docker compose profile
restart: on-failure
env_file:
- .env
environment:
MINIO_ROOT_USER: ${AWS_ACCESS_KEY_ID:-admin}
MINIO_ROOT_PASSWORD: ${AWS_SECRET_ACCESS_KEY:-password}
MINIO_REGION_NAME: ${AWS_REGION:-ca-central-1}
networks:
- example_project_network
volumes:
- buckets:/data
ports:
- 9000:9000
- 9001:9001
command: ["server", "/data", "--console-address", ":9001"]
createbuckets:
# Minio CLI to configure Minio Buckets
image: minio/mc
depends_on:
- minio
profiles:
- "dev"
env_file:
- .env
environment:
AWS_REGION: ${AWS_REGION:-ca-central-1}
AWS_ACCESS_KEY_ID: ${AWS_ACCESS_KEY_ID:-admin}
AWS_SECRET_ACCESS_KEY: ${AWS_SECRET_ACCESS_KEY:-password}
AWS_S3_BUCKET_NAME: ${AWS_S3_BUCKET_NAME:-cats}
networks:
- example_project_network
entrypoint: >
/bin/sh -c "
/usr/bin/mc config host add example_user http://minio:9000 ${AWS_ACCESS_KEY_ID} ${AWS_SECRET_ACCESS_KEY}; <-- create user
/usr/bin/mc mb example_project/${AWS_S3_BUCKET_NAME} --region ${AWS_REGION}; <--creates a new bucket
/usr/bin/mc anonymous set public
example_project/${AWS_S3_BUCKET_NAME}; <-- set access policy
exit 0;
"
volumes:
buckets:
- This will pull down the MinIO image and provision a new bucket with a MinIO Client "init" container. The bucket will be publicly available on the developer's local area network (LAN).
- The developer can log in to the MinIO web console on port 9001 with the credentials above (`admin` and `password` are the fallback defaults).
  - The S3-compatible API itself is available on port 9000 in this example.
- Important note:
  - MinIO does differ slightly from the hyperscalers in terms of customizability. For instance, any retention or lifecycle policy must be defined at the point of bucket creation.
Using MinIO in a Node.js project
- Adapt the following and add it to a constructor, or create a similar method:
```typescript
import aws, { Endpoint } from "aws-sdk";

/**
 * Instantiate S3 client
 * @return {Promise<aws.S3>} - S3 client
 */
async instantiateClient(): Promise<aws.S3> {
  const devEndpoint: Endpoint = new Endpoint(this.appConfig.s3Endpoint);
  const s3 = new aws.S3({
    // When NODE_ENV is set to production, leave the endpoint undefined so the
    // SDK falls back to the default s3.<region>.amazonaws.com endpoint;
    // otherwise, point the client at the local MinIO endpoint.
    endpoint: this.appConfig.production ? undefined : devEndpoint,
    region: this.awsConfig.region,
    credentials: {
      accessKeyId: this.awsConfig.credentials.accessKeyId,
      secretAccessKey: this.awsConfig.credentials.secretAccessKey,
    },
    // Path-style addressing (http://host:9000/bucket/key) is required for MinIO.
    s3ForcePathStyle: true,
  });
  return s3;
}
```
- In this case, the method returns an S3 client. This client can then be used to run operations against a bucket, such as the following:
```typescript
/**
 * Upload file
 * @param {string} path - The key of the file
 * @param {File} file - The file to upload
 * @returns {Promise<string>} - The key of the uploaded file
 */
async uploadFile(path: string, file: File): Promise<string> {
  const s3 = await this.instantiateClient();
  try {
    const { Key: key } = await s3
      .upload({
        Bucket: this.awsConfig.bucketName,
        Key: path,
        Body: file,
      })
      .promise();
    return key;
  } catch (error: unknown) {
    // InternalServerErrorException is the framework's HTTP 500 exception
    // (e.g. @nestjs/common); substitute the project's own error type as needed.
    throw new InternalServerErrorException(`Failed to upload file (${path})`);
  }
}
```
- Call the `uploadFile` method from a controller or other class. The `uploadFile` promise will resolve after the file has been written to MinIO (a minimal sketch of such a caller follows this list).
  - Confirm the file was written by visiting the MinIO interface or by checking the container's `data` directory:
    - Create a shell in the container by running `docker exec -it <id or name of MinIO container> /bin/bash`.
    - Change directories to the bucket and list its contents: `cd data/<name of bucket> && ls`.
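The shape of that caller will vary by project; the following is a minimal sketch in which `FileStorageService` and `CatPhotoController` are hypothetical names standing in for the class that holds `uploadFile` and the class that calls it:

```typescript
// Assumed shape of the class that holds uploadFile (name is hypothetical).
interface FileStorageService {
  uploadFile(path: string, file: File): Promise<string>;
}

// Hypothetical caller: a controller (or any other class) that delegates to uploadFile.
class CatPhotoController {
  constructor(private readonly storage: FileStorageService) {}

  /**
   * Write the uploaded file to the bucket under a deterministic key and
   * return that key so it can be persisted (for example, in the Postgres
   * database from the Compose project above).
   */
  async handleUpload(userId: string, file: File): Promise<{ key: string }> {
    const key = await this.storage.uploadFile(
      `uploads/${userId}/${file.name}`,
      file,
    );
    return { key };
  }
}
```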
- Get a link to the file, and display it within the platform, by generating a pre-signed URL with a 60-minute time-to-live (TTL):
```typescript
/**
 * Get a pre-signed URL for a file
 * @param {string} key - The key of the file to generate the pre-signed URL for
 * @returns {Promise<string>} - The pre-signed URL with the TTL set to an hour
 *   (consider making this a default and exposing it as an optional parameter)
 */
async getPresignedUrl(key: string): Promise<string> {
  const s3 = await this.instantiateClient();
  const url = s3.getSignedUrl("getObject", {
    Bucket: this.awsConfig.bucketName,
    Key: key,
    Expires: 3600, // URL expires in 1 hour
  });
  return url;
}
```
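To tie the read path together (a minimal sketch; `PresignedUrlProvider` and `buildCatPhotoResponse` are illustrative names, not part of the original project), the key saved at upload time can be exchanged for a temporary link that the web client renders directly:

```typescript
// Assumed shape of the class that holds getPresignedUrl (name is hypothetical).
interface PresignedUrlProvider {
  getPresignedUrl(key: string): Promise<string>;
}

// Hypothetical read path: exchange a stored object key for a temporary link.
async function buildCatPhotoResponse(
  storage: PresignedUrlProvider,
  key: string,
): Promise<{ key: string; url: string }> {
  // Locally the URL points at the MinIO container (port 9000); in production
  // it points at S3, since it is signed by whichever client was instantiated.
  const url = await storage.getPresignedUrl(key);
  return { key, url }; // the web client can use url (e.g. as an <img src>) until the 60-minute TTL lapses
}
```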
Conclusion
Object storage provides a scalable and cost-effective solution for managing unstructured data. Using MinIO as a container enables developers to maintain a uniform development environment, ensuring compatibility with major cloud providers' SDKs. This seamless integration allows developers to transition effortlessly between local development and cloud environments.