Serverless AI Application Demo
01

Problem

Adding AI capabilities to an application is not just a model selection decision — it is an integration architecture decision. A foundation model that can generate high-quality content is only useful if it can receive requests from a frontend, process them reliably, and return structured responses in a form the application can consume. The traditional approach of provisioning dedicated servers for this integration layer adds operational overhead — servers to maintain, scale, and pay for whether requests are arriving or not. For teams building proof-of-concept AI applications or low-traffic tools, that overhead is disproportionate to the actual compute demand. What these use cases need is an event-driven architecture that activates only when a request arrives and costs nothing when idle.


02

Solution

This project built a fully serverless AI application that accepts study notes from a frontend, routes them through API Gateway and Lambda to Amazon Bedrock, and returns Claude-generated flashcards — with no servers to provision, no runtime to maintain, and zero cost when idle. API Gateway handled routing and CORS, Lambda handled request processing and Bedrock invocation, and IAM policies controlled which services the Lambda function was allowed to call. Prompt engineering shaped the flashcard output format, directing the model to return structured question-answer pairs rather than free-form text. Working through each layer's configuration — CORS headers, IAM permission scopes, JSON payload formatting, API Gateway integration settings — was where the real systems understanding was built. The project demonstrates that connecting a modern foundation model to a real application is primarily an infrastructure and integration problem, not a model problem.


03

Skills Acquired

Understanding API Gateway, Lambda, Bedrock, and IAM individually is useful; seeing how they connect in a working system is where the architecture becomes clear.


04

Deep Dive

AI applications don't need dedicated servers to run. With the right cloud architecture, you can wire together a frontend, a REST API, and a foundation model — and the whole thing scales automatically, costs nothing when idle, and requires zero infrastructure management. That's the idea this project puts into practice.

This guided project from the AWS Cloud Institute had me build a serverless AI application end-to-end: a frontend that accepts study notes, an API Gateway endpoint that receives them, a Lambda function that processes and forwards the request to Amazon Bedrock, and a Claude-powered response that returns structured flashcards. Every piece is managed by AWS — no server to provision, no runtime to maintain.

The hardest part wasn't the AI — it was getting all the pieces to talk to each other correctly. CORS, IAM permissions, JSON formatting, and API Gateway configuration each had their own failure modes. Working through those is where the real learning happened.

Why This Project?

Understanding how to connect managed AI services to real applications is one of the most in-demand skills in cloud engineering right now. Companies don't need engineers who can train models from scratch — they need people who can wire foundation models into production workflows using the right infrastructure primitives. This project gave me direct, hands-on experience with exactly that: how API Gateway routes requests, how Lambda handles serverless compute, and how Bedrock invokes foundation models through a standardized interface.

I chose the flashcard use case because it's useful, clear, and forces you to think about prompt engineering — how to get a structured, predictable output from an LLM rather than freeform text. Reliable structure in LLM responses is a real engineering problem in production systems.




Architecture Overview

The application follows a straightforward serverless request flow. The user submits study notes through a web frontend. Those notes travel over HTTPS to an API Gateway endpoint, which triggers a Lambda function. Lambda formats the notes into a structured prompt, calls Amazon Bedrock to invoke the Claude foundation model, and returns the generated flashcards as a JSON response — which the frontend renders for the user.

Browser / Frontend
   ↓ HTTPS POST (study notes as JSON)
Amazon API Gateway
   ↓ Triggers Lambda on every request
AWS Lambda (Python)
   ↓ Formats prompt → calls Bedrock SDK
Amazon Bedrock (Claude)
   ↑ Returns structured flashcard JSON
Lambda → API Gateway → Browser
Component | Role | Why This Choice
API Gateway | REST endpoint exposure | Managed routing, CORS handling, no server to run
AWS Lambda | Business logic | Serverless, scales to zero, cost per invocation
Amazon Bedrock | Foundation model invocation | Managed AI API — no model hosting or infrastructure
IAM Role | Security & permissions | Grants Lambda precisely the access it needs, nothing more

How It Was Built

Step 1

Configure API Gateway & Enable CORS

Created a REST API in API Gateway with a POST /flashcards endpoint. CORS was enabled to allow the browser-based frontend to make cross-origin requests — without this, the browser's Same-Origin Policy blocks the call entirely before it even reaches the server.

CORS works by adding specific response headers that tell the browser: "This API explicitly allows requests from your origin." API Gateway adds these headers automatically when CORS is configured — but only if it's set up correctly for both the OPTIONS preflight request and the actual POST response.
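To make the exchange concrete, here is a minimal sketch of the response a preflight OPTIONS request must receive. The header names are the real CORS headers; the `preflight_response` helper is hypothetical and only mirrors what API Gateway emits automatically once CORS is enabled, it is not the project's code.

```python
def preflight_response():
    """Build the response a browser's OPTIONS preflight must receive
    before it will send the actual POST."""
    return {
        'statusCode': 200,
        'headers': {
            # Which origins may call this API ('*' is fine for a demo)
            'Access-Control-Allow-Origin': '*',
            # Which methods the actual request may use
            'Access-Control-Allow-Methods': 'POST,OPTIONS',
            # Which request headers the browser may send
            'Access-Control-Allow-Headers': 'Content-Type',
        },
        'body': '',
    }
```

If any of these headers is missing from either the preflight or the POST response, the browser blocks the call client-side and the Lambda logs show nothing, which is what makes CORS misconfiguration confusing to debug.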

Step 2

Write the Lambda Function

The Lambda function receives the API Gateway event, extracts the study notes from the request body, formats them into a structured prompt, and calls the Bedrock client to invoke the Claude model. The response is parsed and returned as JSON to API Gateway, which passes it back to the frontend.

import json
import boto3

# Bedrock runtime client, created once and reused across warm invocations
bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')

def lambda_handler(event, context):
    # API Gateway's proxy integration delivers the body as a JSON string
    body = json.loads(event['body'])
    notes = body.get('notes', '')

    prompt = build_prompt(notes)

    # Claude v2 uses the text-completion API: the request body takes
    # 'prompt' and 'max_tokens_to_sample' (not 'max_tokens')
    response = bedrock.invoke_model(
        modelId='anthropic.claude-v2',
        body=json.dumps({
            'prompt': prompt,
            'max_tokens_to_sample': 1024,
            'temperature': 0.3,
        }),
        contentType='application/json',
        accept='application/json',
    )

    # The response body is a streaming object: read it, then parse the
    # model's completion text as JSON
    result = json.loads(response['body'].read())
    flashcards = json.loads(result['completion'].strip())

    return {
        'statusCode': 200,
        'headers': {
            'Access-Control-Allow-Origin': '*',
            'Content-Type': 'application/json',
        },
        'body': json.dumps({'flashcards': flashcards}),
    }
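For orientation, this is roughly the event shape the handler receives from API Gateway's Lambda proxy integration. The field names follow the proxy event format; the notes text is invented for the example, and the last two lines mirror the parsing step at the top of the handler.

```python
import json

# Illustrative API Gateway proxy event (abridged to the fields used here)
sample_event = {
    'httpMethod': 'POST',
    'path': '/flashcards',
    'headers': {'Content-Type': 'application/json'},
    # Note: the request body arrives as a JSON *string*, not a parsed dict
    'body': json.dumps({'notes': 'Mitochondria produce ATP for the cell.'}),
}

# The handler's first two lines reproduce this parsing step
body = json.loads(sample_event['body'])
notes = body.get('notes', '')
```

Forgetting that `event['body']` is a string (and indexing it like a dict) is one of the most common first bugs in a proxy-integrated Lambda.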

Step 3

Engineer the Prompt for Structured Output

Getting an LLM to return consistently structured JSON — rather than freeform prose — requires explicit prompt engineering. The prompt specifies the exact output format, provides an example, and instructs the model to return only the JSON with no additional commentary. Consistency here is a reliability concern: if the model occasionally adds explanation text around the JSON, parsing breaks.

def build_prompt(notes):
    # Claude v2's completion API requires the Human:/Assistant: turn markers
    prompt = ("Human: You are a study assistant. Convert the following notes\n"
              "into flashcards. Return ONLY a valid JSON array — no commentary.\n"
              '[{"front": "Q", "back": "A"}, ...]\n\n'
              "Notes to convert:\n" + notes + "\n\nAssistant:")
    return prompt

Temperature was set to 0.3 — low enough to produce consistent, structured output without making the model rigidly repetitive. Higher temperature values introduce more creative variation, which is the opposite of what you want when you need reliable JSON.
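Even with an explicit prompt, a model will occasionally wrap the array in stray commentary. A common defensive guard, sketched here as an assumption rather than something the project as described implements, is to slice from the first `[` to the last `]` before parsing:

```python
import json

def extract_json_array(completion):
    """Pull the first JSON array out of a model completion, tolerating
    stray commentary before or after it."""
    start = completion.find('[')
    end = completion.rfind(']')
    if start == -1 or end == -1 or end < start:
        raise ValueError('no JSON array found in model output')
    return json.loads(completion[start:end + 1])

# Survives a chatty completion that strict json.loads would reject:
cards = extract_json_array(
    'Sure! Here are your flashcards:\n[{"front": "Q1", "back": "A1"}]'
)
```

The trade-off is that this silently discards surrounding text, so logging the raw completion alongside the parsed result is worth doing during development.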

Step 4

IAM Role & Bedrock Permissions

Lambda functions run as IAM execution roles — they only have the permissions explicitly granted to them. I attached a custom policy to the Lambda execution role that allowed only the bedrock:InvokeModel action on the specific Claude model ARN. This is the principle of least privilege: the function can do exactly what it needs to do and nothing else.

# IAM policy attached to Lambda execution role
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect":   "Allow",
      "Action":   ["bedrock:InvokeModel"],
      "Resource": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2"
    }
  ]
}

Without this policy, Lambda's call to Bedrock returns an AccessDeniedException. IAM is the invisible foundation of every AWS application — permissions must be explicit, and debugging access errors is a standard part of building in AWS.
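That failure mode is easy to recognize in code. As a hedged sketch (the `classify_bedrock_error` helper is hypothetical, not part of the project): boto3 raises `botocore.exceptions.ClientError` carrying an error response whose code distinguishes an IAM denial from a malformed request.

```python
def classify_bedrock_error(error_response):
    """Map a boto3-style error response dict to a short diagnostic label.
    This is the dict shape botocore attaches to ClientError."""
    code = error_response['Error']['Code']
    if code == 'AccessDeniedException':
        return 'IAM: execution role lacks bedrock:InvokeModel on this model ARN'
    if code == 'ValidationException':
        return 'request body did not match the expected model schema'
    return code

# Simulated shape of the error response when IAM denies the call
denied = {'Error': {'Code': 'AccessDeniedException',
                    'Message': 'not authorized to perform bedrock:InvokeModel'}}
label = classify_bedrock_error(denied)
```

Checking the error code first, before rereading the request payload, saves time: an AccessDeniedException is always a policy problem, never a prompt problem.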



What I Learned & Why It Matters to Employers

This project gave me hands-on experience with the architecture pattern that underlies most production AI applications: a managed compute layer (Lambda), a managed API layer (API Gateway), and a managed model layer (Bedrock). I didn't just read about how these services connect — I debugged CORS errors, fixed IAM permission denials, and iterated on prompt formatting until the output was reliably parseable. Those failure modes are the real curriculum. Anyone building AI-powered products on AWS will hit every one of them.

Conclusion & Reflections

The most valuable lesson from this project wasn't about AI — it was about integration. Getting a foundation model to produce good output is the easy part. Getting the request to actually reach it, getting the response to actually return to the user, and doing it securely with the right IAM permissions — that's where most of the work lives in production systems.

Serverless architecture removes the operational burden of managing servers and scaling infrastructure, but it doesn't remove complexity — it shifts it into configuration: API Gateway settings, Lambda environment variables, IAM policies, and CORS headers. Understanding each layer deeply is what separates someone who can follow a tutorial from someone who can build and debug systems independently.

Component | Status
API Gateway REST endpoint configured | COMPLETED ✓
CORS enabled for cross-origin frontend | COMPLETED ✓
Lambda function written and deployed | COMPLETED ✓
Bedrock integration with structured prompt | COMPLETED ✓
IAM execution role with least-privilege permissions | COMPLETED ✓
End-to-end flashcard generation working | COMPLETED ✓

Want to See the Full Code?

The complete Lambda function, IAM policy, and project notes are on GitHub.