I recently built a serverless image upload and processing pipeline on AWS, and this post outlines the architecture, services used, key learnings, and tips that helped me along the way.

🚀 Project Overview

The goal was to build a system where users can:

  1. Upload an image via an API
  2. Automatically process the image using Amazon Rekognition
  3. Store extracted metadata in DynamoDB
  4. Retrieve metadata through an API

Everything runs serverlessly, using AWS-managed services.


🧱 AWS Services I Used

Service        Purpose
S3             Store uploaded images
Lambda         Process images and handle API logic
API Gateway    REST API endpoints for upload and metadata retrieval
DynamoDB       Store image metadata (labels, timestamp, image ID)
Rekognition    Extract labels from uploaded images
IAM            Fine-grained access control between services
CloudWatch     Logs for debugging and monitoring

🧩 Understanding the Architecture

The architecture follows an event-driven serverless pattern:

  1. A user uploads an image via a POST /upload endpoint.
  2. The image is stored in an S3 bucket.
  3. The S3 PUT event triggers a Lambda function.
  4. The Lambda reads the image and sends it to Amazon Rekognition to get labels.
  5. Metadata (labels, timestamp, image key) is written to a DynamoDB table.
  6. A GET /metadata API allows retrieval of all or specific image metadata.

            User
              ↓
          API Gateway
              ↓
            Lambda
              ↓
              S3
              ↓
         (S3 Trigger)
              ↓
            Lambda
              ↓
    Rekognition + DynamoDB
              ↓
      GET /metadata (API)
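
Steps 3 to 5 can be sketched roughly as below. This is a minimal illustration, not my exact code: the table name `ImageMetadata`, the `image_id` partition key, and the `MaxLabels` value are assumptions made for the example. It also hands Rekognition an S3 object reference instead of downloading the bytes first, which is one of two ways to call `detect_labels`.

```python
import datetime
import urllib.parse


def process_record(record, rekognition, table):
    """Label one uploaded image and store its metadata.

    `rekognition` is a boto3 Rekognition client and `table` a boto3
    DynamoDB Table resource. Both are passed in so they can be faked in tests.
    """
    bucket = record["s3"]["bucket"]["name"]
    # Object keys in S3 event notifications arrive URL-encoded.
    key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

    response = rekognition.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=10,  # assumed limit for this sketch
    )
    labels = [label["Name"] for label in response["Labels"]]

    item = {
        "image_id": key,  # assumed partition key
        "labels": labels,
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }
    table.put_item(Item=item)
    return item


def lambda_handler(event, context):
    import boto3  # imported here so process_record stays testable offline

    rekognition = boto3.client("rekognition")
    table = boto3.resource("dynamodb").Table("ImageMetadata")  # hypothetical name
    return [process_record(r, rekognition, table) for r in event["Records"]]
```

Keeping the per-record logic in its own function makes it easy to exercise with stub clients before deploying anything.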

🧠 What I Learned

Here are some practical lessons I picked up:

✅ Everything Must Be in the Same Region

  • Lambda can’t be triggered by S3 events across regions.
  • When you pass Rekognition an S3 object reference, the bucket must be in the same region as the Rekognition endpoint you call.
  • Keep all services (S3, Lambda, DynamoDB, API Gateway) in the same region.
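
One simple way to avoid region drift is to pin a single region constant and build every client from it. The sketch below assumes `us-east-1` purely as an example, and takes the client factory as a parameter so it can be tried without AWS credentials; in a real function you would pass `boto3.client`.

```python
REGION = "us-east-1"  # assumed region for this example

SERVICES = ("s3", "rekognition", "dynamodb")


def make_clients(client_factory):
    """Create one client per service, all pinned to the same region.

    `client_factory` is expected to behave like boto3.client, which
    accepts a service name and a `region_name` keyword argument.
    """
    return {name: client_factory(name, region_name=REGION) for name in SERVICES}
```

With boto3 installed, `make_clients(boto3.client)` returns a dict of real clients that all target `REGION`.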

✅ Binary Uploads with API Gateway (HTTP APIs)

  • When you send an image through an HTTP API, API Gateway base64-encodes the binary body before passing it to Lambda.
  • Your Lambda must check the event's isBase64Encoded flag and decode the body when it is set.
  • HTTP APIs (payload format 2.0) also lowercase header names, so use headers.get('content-type').
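
Here is a small sketch of that decoding step. The function name `extract_upload` and the `application/octet-stream` fallback are my own choices for the example; the `body`, `isBase64Encoded`, and `headers` fields come from the Lambda proxy event.

```python
import base64


def extract_upload(event):
    """Return (raw_bytes, content_type) from an HTTP API Lambda proxy event."""
    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        # Binary payloads arrive base64-encoded.
        data = base64.b64decode(body)
    else:
        # Text payloads arrive as a plain string.
        data = body.encode("utf-8")

    headers = event.get("headers") or {}
    # Header names are lowercase in payload format 2.0.
    content_type = headers.get("content-type", "application/octet-stream")
    return data, content_type
```

The returned bytes can then go straight to S3, for example as the `Body` argument of `put_object`.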

✅ IAM Best Practices

  • Avoid full-access policies. Use least-privilege IAM policies.
  • Restrict Lambda access to specific S3 buckets, DynamoDB tables, and actions like rekognition:DetectLabels.
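
As an illustration, a least-privilege policy for the processing Lambda could be generated like this. The bucket name and table ARN are placeholders you supply, and the exact action list is an assumption based on what the pipeline above needs (read the image, write one item, detect labels).

```python
import json


def build_pipeline_policy(bucket, table_arn):
    """Build an IAM policy document scoped to one bucket and one table."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
            {
                "Effect": "Allow",
                "Action": ["dynamodb:PutItem"],
                "Resource": table_arn,
            },
            {
                # DetectLabels has no resource to scope to, hence "*".
                "Effect": "Allow",
                "Action": ["rekognition:DetectLabels"],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    policy = build_pipeline_policy(
        "my-upload-bucket",  # hypothetical bucket
        "arn:aws:dynamodb:us-east-1:123456789012:table/ImageMetadata",  # hypothetical ARN
    )
    print(json.dumps(policy, indent=2))
```

The function still needs permission to write its own logs, which the AWS managed policy AWSLambdaBasicExecutionRole provides.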

✅ Structured Logging Helps

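  • Anything you print() in a Lambda function lands in CloudWatch Logs, so it works as a quick debugging tool; printing one JSON object per line keeps those logs searchable.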
  • CloudWatch helps you trace image processing and API errors in near real-time.
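
A tiny helper is enough to get JSON-shaped log lines. The name `log_event` and the field names are hypothetical, picked for the example.

```python
import json
import time


def log_event(message, **fields):
    """Print one JSON object per line so CloudWatch Logs stays searchable."""
    entry = {"message": message, "ts": time.time(), **fields}
    # default=str keeps the logger from failing on values json can't encode.
    line = json.dumps(entry, default=str)
    print(line)
    return line
```

For example, `log_event("labels_detected", image_key="cat.jpg", label_count=3)` prints a single line that can be filtered by field in CloudWatch Logs Insights.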

✅ Progressive API Building

  • Start with the core upload → process pipeline.
  • Add GET endpoints later to expose only what’s needed (e.g., excluding S3 URLs from metadata).
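
To show what "expose only what's needed" can look like, here is a sketch that filters stored items down to an allow-list before returning them. The attribute names (`image_id`, `labels`, `timestamp`, `s3_url`) are assumptions for the example, not necessarily the ones in my table.

```python
import json

# Fields the API is allowed to return (assumed attribute names).
PUBLIC_FIELDS = ("image_id", "labels", "timestamp")


def to_public(item):
    """Keep only allow-listed attributes; anything else (e.g. S3 URLs) is dropped."""
    return {key: item[key] for key in PUBLIC_FIELDS if key in item}


def metadata_response(items):
    """Build a Lambda proxy response for GET /metadata."""
    body = [to_public(item) for item in items]
    return {
        "statusCode": 200,
        "headers": {"content-type": "application/json"},
        "body": json.dumps(body),
    }
```

An allow-list is safer than a deny-list here: a new internal attribute added to the table later stays private by default.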

🎯 Skills Gained

  • Event-driven architecture on AWS
  • End-to-end integration using Lambda, S3, API Gateway, and Rekognition
  • DynamoDB usage patterns (partition keys, on-demand mode)
  • IAM permissions and CloudWatch debugging
  • Building APIs for image processing pipelines

📌 What’s Next?

I plan to build a simple frontend to visualize the metadata and uploaded images, and maybe even expand the system with:

  • User authentication using Amazon Cognito
  • Step Functions to orchestrate more complex processing

If you’re just getting started with AWS Lambda and serverless architecture, this kind of project is a perfect mix of learning and showcasing. Feel free to reach out if you’d like to try building it yourself or run into issues — happy to help!