Simplify Local Development with MinIO: An S3-Compatible Solution

Overview

MinIO is a high-performance, S3-compatible object storage solution. Its API is compatible with Amazon's Simple Storage Service (S3), the de facto standard for object storage, and it fills the same role as the other major hyperscalers' offerings, such as Google Cloud Storage and Azure Blob Storage.

What is Object Storage?

Object storage is a data storage architecture for unstructured data. A file's data blocks are kept together as a single object, which is assigned a key used to identify it. The object can then be placed, or written, to a storage pool, also known as a bucket. Objects can then be queried and accessed through a consistent set of RESTful APIs.
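
As a minimal conceptual sketch (plain TypeScript rather than any real client library), an object can be pictured as an opaque body identified by a key, optionally carrying metadata, while a bucket behaves like a flat key-to-object map:

// Conceptual model only: not MinIO's or AWS's actual API.
interface StoredObject {
  key: string;                      // unique identifier within the bucket
  body: Uint8Array;                 // the unstructured data itself
  metadata: Record<string, string>; // customizable key/value metadata
}

// A bucket is a flat namespace: no directory hierarchy, just keys.
const bucket = new Map<string, StoredObject>();

// "Putting" an object writes it to the pool under its key,
// mirroring PUT /<bucket>/<key> in the RESTful API...
function putObject(obj: StoredObject): void {
  bucket.set(obj.key, obj);
}

// ...and "getting" retrieves it by the same key, mirroring GET /<bucket>/<key>.
function getObject(key: string): StoredObject | undefined {
  return bucket.get(key);
}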

Comparison with Block Storage

| Capability | Block storage | Object storage |
| --- | --- | --- |
| Capacity | Limited | Nearly unlimited |
| Storage method | Data is stored in fixed-size blocks and reassembled on demand | Non-hierarchical, unstructured data |
| Metadata | Limited | Unlimited and customizable |
| Data retrieval method | Data lookup table, effectively a map | Varies, but predominantly through a REST request |
| Performance | Fast; storage is mounted directly | Primarily depends on network latency, though running an object storage solution (such as MinIO) locally offers comparable results to block storage |
| Cost | Vendor-dependent, but generally more expensive to mount a volume | Usually less expensive, but can grow costly through data transfer (egress) fees |
| API type | iSCSI | RESTful |

Note on Egress Fees

Major hyperscalers typically do not charge, or charge only nominal amounts, for data ingress (uploading data to their services). Data egress fees, also known as bandwidth or data transfer fees, however, are charges from cloud providers for moving data out of the cloud storage where it was uploaded, and they are separate from the fees organizations pay to store that data. These fees can quickly grow untenable for companies and create vendor lock-in. It is therefore important to conduct due diligence at the beginning of a project when assessing which object storage provider to work with.
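
To make this concrete: at a hypothetical egress rate of $0.09 per GB, a platform serving 10 TB of downloads per month would pay roughly 10,000 GB × $0.09 = $900 per month in transfer fees alone, before any storage costs.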

Cloudflare's R2 waives egress fees and offers comparable storage pricing. This can quickly pay off if a platform is predicted to have many read operations monthly. For teams interested in moving off of S3, Cloudflare offers Sippy, a service that incrementally copies data from S3 to R2 as it's served.

Pain Points in Local Development Using Cloud Object Storage

Developing against cloud object storage, however, can be cumbersome for a platform engineer. A DevOps engineer may first need to write the necessary Terraform to provision the bucket; engineers then have to wade through the nuances of Identity and Access Management (IAM) and determine the correct permission structure for the objects or buckets. Not only does this cause friction, it is also susceptible to over-provisioning and overly permissive security postures.

Leveraging MinIO as a container (Docker or LXC) in a local development environment creates a uniform, consistent development environment. Developers can easily swap volumes to facilitate tests and other workflows.

MinIO offers its own SDKs, but thanks to its commitment to S3 compatibility, developers can use their cloud provider's SDK without needing to install a separate MinIO SDK. This allows them to keep leveraging their cloud provider's services, such as Lambda or SQS, while seamlessly swapping in MinIO for their object storage needs within the context of the same SDK.


Tutorial

This tutorial assumes the reader has a basic familiarity with Docker, and demonstrates how an existing Docker Compose project can be extended with MinIO.

  1. Start from an existing Docker Compose project. Such a project could look something like:
name: "example_project"

services:
  server:
    build:
      context: ./server
      dockerfile: Dockerfile
      target: ${NODE_ENV:-development}
    depends_on:
      db:
        condition: service_healthy
    environment:
      ...
    volumes:
      - "./server:/home/node/server"
    ports:
      - "${API_PORT:-3001}:${API_PORT:-3001}"
    networks:
      - example_project_network

  db:
    image: postgres:15.7-alpine
    environment:
      ...
    ports:
      - "${DB_PORT:-5432}:5432"
    healthcheck:
      ...
    volumes:
      - "db-data:/var/lib/postgresql/data"
    networks:
      - example_project_network

  web:
    build:
      context: ./web
      dockerfile: Dockerfile
      target: base
    environment:
      ...
    volumes:
      - "./web:/home/node/web"
    ports: # host:container
      - "${APP_PORT:-3000}:${APP_PORT:-3000}"
    networks:
      - example_project_network

volumes:
  db-data:

networks:
  example_project_network:
    name: example_project_network
    driver: bridge
  2. After adding the above configuration to the project's docker-compose.yaml, bring the project up with docker compose up (optionally with -d for detached mode).
    1. Check that the containers are running correctly with docker ps, or with docker compose ps to scope the list to this project.
  3. Create a .env file (or add to an existing one) with the line COMPOSE_PROFILES=dev. Compose profiles help adjust applications for different environments: services marked with a profile in the COMPOSE_PROFILES list are provisioned accordingly. This ensures that only containers specifically marked with dev are built and run, while services without a profiles key are always started. A minimal example follows.
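For reference, a minimal .env could look like the following (illustrative values that mirror the defaults used in the compose file in the next step):

# .env (illustrative values only)
COMPOSE_PROFILES=dev
COMPOSE_FILE=docker-compose.yaml:docker-compose.minio.yaml  # see the next step
AWS_REGION=ca-central-1
AWS_ACCESS_KEY_ID=admin
AWS_SECRET_ACCESS_KEY=password
AWS_S3_BUCKET_NAME=cats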
  4. Create a new file called docker-compose.minio.yaml and add the following. Note that Compose only reads docker-compose.yaml by default: include the new file either by passing -f docker-compose.yaml -f docker-compose.minio.yaml to docker compose, or by setting COMPOSE_FILE=docker-compose.yaml:docker-compose.minio.yaml in the .env file, then re-run docker compose up.
name: "example_project"

services:
  minio:
    image: minio/minio
    profiles:
      - "dev" <--- Docker compose profile
    restart: on-failure
    env_file:
      - .env
    environment:
      MINIO_ROOT_USER: ${AWS_ACCESS_KEY_ID:-admin}
      MINIO_ROOT_PASSWORD: ${AWS_SECRET_ACCESS_KEY:-password}
      MINIO_REGION_NAME: ${AWS_REGION:-ca-central-1}
    networks:
      - example_project_network
    volumes:
      - buckets:/data
    ports:
      - 9000:9000
      - 9001:9001
    command: ["server", "/data", "--console-address", ":9001"]

  createbuckets:
    # MinIO Client (mc) to configure MinIO buckets
    image: minio/mc
    depends_on:
      - minio
    profiles:
      - "dev"
    env_file:
      - .env
    environment:
      AWS_REGION: ${AWS_REGION:-ca-central-1}
      AWS_ACCESS_KEY_ID: ${AWS_ACCESS_KEY_ID:-admin}
      AWS_SECRET_ACCESS_KEY: ${AWS_SECRET_ACCESS_KEY:-password}
      AWS_S3_BUCKET_NAME: ${AWS_S3_BUCKET_NAME:-cats}
    networks:
      - example_project_network
    # The entrypoint below (1) registers an alias for the local MinIO server,
    # (2) creates a new bucket, and (3) sets an anonymous (public) access policy on it.
    entrypoint: >
      /bin/sh -c "
      /usr/bin/mc config host add example_project http://minio:9000 ${AWS_ACCESS_KEY_ID} ${AWS_SECRET_ACCESS_KEY};
      /usr/bin/mc mb example_project/${AWS_S3_BUCKET_NAME} --region ${AWS_REGION};
      /usr/bin/mc anonymous set public example_project/${AWS_S3_BUCKET_NAME};
      exit 0;
      "

volumes:
  buckets:
  • This will pull down the MinIO image and provision a new bucket with a MinIO Client "init" container. This bucket will be publicly available on the developer's local area network (LAN).
  • The developer can log in with the credentials above (admin and password as sensible defaults) to the MinIO interface via port 9001.
    • The S3 API will be available on port 9000 in this example.
  • Important Note:
    • MinIO does differ slightly from the hyperscalers in configurability. For instance, object locking and retention can only be enabled at the point of bucket creation (with the MinIO Client, via the mc mb --with-lock flag).
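  • To sanity-check the S3 API from the host, the AWS CLI can be pointed at the local endpoint, e.g. aws --endpoint-url http://localhost:9000 s3 ls. This assumes the same credentials configured above are exported in the shell.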

Using MinIO in a Node.js project
  1. Adapt the following and add it to a constructor, or create a similar method:
import aws, {Endpoint} from "aws-sdk";
  
  /**
   * Instantiate S3Client
   * @return {Promise<aws.S3>} - S3Client
   */
  async instantiateClient(): Promise<aws.S3> {
    const devEndpoint: Endpoint = new Endpoint(this.appConfig.s3Endpoint);

    const s3 = new aws.S3({
      endpoint: this.appConfig.production ? undefined : devEndpoint, // in production, leave undefined to use the default endpoint (s3.<region>.amazonaws.com)
      region: this.awsConfig.region,
      credentials: {
        accessKeyId: this.awsConfig.credentials.accessKeyId,
        secretAccessKey: this.awsConfig.credentials.secretAccessKey,
      },
      s3ForcePathStyle: true,
    });
    return s3;
  }
  2. This method returns an S3 client, which can then be used to run queries against a bucket, such as the following:
  /**
   * Upload file
   * @param {string} path - The key of the file
   * @param {File} file - The file to upload
   */
    async uploadFile(path: string, file: File): Promise<string> {
      const s3 = await this.instantiateClient();
      try {
        const {Key: key} = await s3
          .upload({
            Bucket: this.awsConfig.bucketName,
            Key: path,
            Body: file,
          })
          .promise();
        return key;
      } catch (error: unknown) {
        // InternalServerErrorException comes from @nestjs/common; substitute the error type for the framework in use
        throw new InternalServerErrorException(`Failed to upload file (${path})`);
      }
    }
  3. Call the uploadFile method from a controller or another class (a hedged controller sketch follows at the end of this section). The uploadFile promise resolves once the file has been written to MinIO.
    1. Confirm the file was written via the MinIO interface, or by checking the container's data directory:
      1. Create a shell in the container by running docker exec -it <id or name of MinIO container> /bin/bash
      2. Change directories to the bucket and list the contents - cd data/<name of bucket> && ls
  4. Get a link to the file and display it within the platform by generating a pre-signed URL with a 60-minute time-to-live (TTL):
  /**
   * Get a pre-signed URL for a file
   * @param {string} key - The key of the file to generate the pre-signed URL for
   * @returns {Promise<string>} - The pre-signed URL with the TTL set to an hour (consider setting this as a default, and providing an optional parameter)
   */
  async getPresignedUrl(key: string): Promise<string> {
    const s3 = await this.instantiateClient();

    const url = s3.getSignedUrl("getObject", {
      Bucket: this.awsConfig.bucketName,
      Key: key,
      Expires: 3600, // URL expires in 1 hour
    });

    return url;
  }
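
As a hedged illustration of step 3, a NestJS-style controller could tie the two methods together as follows. The names here (FileService, the files route, the file form field) are assumptions for this sketch, which also presumes the Express platform with @types/multer installed and Node 20+ for the global File constructor:

import {Controller, Post, UploadedFile, UseInterceptors} from "@nestjs/common";
import {FileInterceptor} from "@nestjs/platform-express";
import {FileService} from "./file.service"; // hypothetical service containing the methods above

@Controller("files")
export class FileController {
  constructor(private readonly fileService: FileService) {}

  @Post("upload")
  @UseInterceptors(FileInterceptor("file")) // "file" is the multipart form field name
  async upload(@UploadedFile() file: Express.Multer.File): Promise<{url: string}> {
    // Wrap the multer buffer in a File (global in Node 20+) to match uploadFile's signature
    const key = await this.fileService.uploadFile(
      `uploads/${file.originalname}`,
      new File([file.buffer], file.originalname),
    );
    // Resolve a pre-signed URL (1-hour TTL) for the front end to render directly
    const url = await this.fileService.getPresignedUrl(key);
    return {url};
  }
}

During local development, uploadFile writes to the MinIO bucket and the pre-signed URL points at the configured s3Endpoint; in production the same code targets S3.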

Conclusion

Object storage provides a scalable and cost-effective solution for managing unstructured data. Using MinIO as a container enables developers to maintain a uniform development environment, ensuring compatibility with major cloud providers' SDKs. This seamless integration allows developers to transition effortlessly between local development and cloud environments.