Simplifying Cloud Infrastructure: My Journey to AWS with Terraform

Deploying to the cloud can often feel like juggling dozens of moving parts—manually clicking through the AWS console, hoping you didn’t miss a security checkbox, and praying that your staging environment matches your production.

That is why I moved my entire infrastructure to Terraform. By treating my infrastructure as code, I’ve turned a manual, error-prone process into a repeatable, automated, and—most importantly—understandable workflow.

In this post, I want to share how I’ve architected my booking application on AWS using Terraform.

The Architecture: A “Reverse Proxy” Approach

My goal was simple: create a fast, secure, and modern booking application. To achieve this, I use a combination of serverless technologies and a powerful CDN.

1. The Frontend: Amazon S3 & CloudFront

My website assets—the HTML, CSS, and JavaScript—live in an Amazon S3 bucket. However, I don’t serve them directly from S3. Instead, I use Amazon CloudFront, a global Content Delivery Network (CDN).

  • The Connection: In my frontend.tf, I define the aws_cloudfront_distribution resource. This connects to my S3 bucket via Origin Access Control (OAC), ensuring that only CloudFront can read my bucket files. This effectively makes the site globally fast while keeping my raw files private.

2. The Backend: API Gateway & Lambda

For the heavy lifting—like handling form submissions and managing bookings—I use Amazon API Gateway and AWS Lambda.

  • The Connection: My api_gateway.tf file defines the endpoints (like /booking or /submit). It links these paths directly to my Lambda functions, which contain the business logic. Because it’s serverless, I don’t pay for idle servers—I only pay when a user actually interacts with my site.

3. The “Traffic Cop”: CloudFront Reverse Proxy

The secret sauce of this project is using CloudFront as a Reverse Proxy. Rather than forcing my frontend to make cross-domain API calls (which causes those annoying CORS headaches), I route both my frontend and my API through the same CloudFront domain.

  • The Connection: In my reverse_proxy.tf, I set up ordered_cache_behavior blocks. These blocks look at the URL path (e.g., /booking* or /submit*) and intelligently route that request to my API Gateway instead of S3.
  • Why it matters: Because the browser sees everything as coming from one domain, requests to the API are same-origin, so the browser never needs to run CORS checks and those errors disappear. Users also only ever see a single domain, although the API Gateway endpoint itself remains publicly reachable unless access to it is restricted separately.
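To make the routing idea concrete, here is a minimal sketch of how path-based behaviors are evaluated. It is a toy model in Java, not Terraform or AWS code: the class name PathRouter and the origin labels "api-gateway" and "s3-bucket" are hypothetical. It only assumes what CloudFront documents about path patterns: behaviors are checked in order, * matches any run of characters, ? matches exactly one, and anything unmatched falls through to the default behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Toy model of CloudFront's path-based routing; not real AWS code. */
class PathRouter {
    // Hypothetical origin labels, checked in insertion order like ordered_cache_behavior blocks.
    private static final Map<String, String> BEHAVIORS = new LinkedHashMap<>();
    static {
        BEHAVIORS.put("/booking*", "api-gateway");
        BEHAVIORS.put("/submit*", "api-gateway");
    }

    // Stand-in for the default cache behavior, which serves the static site.
    private static final String DEFAULT_ORIGIN = "s3-bucket";

    /** Converts a path pattern (* = any run of characters, ? = one character) to a regex. */
    static Pattern toRegex(String pathPattern) {
        StringBuilder regex = new StringBuilder();
        for (char c : pathPattern.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    /** Returns the origin of the first matching behavior, else the default origin. */
    static String route(String requestPath) {
        for (Map.Entry<String, String> behavior : BEHAVIORS.entrySet()) {
            if (toRegex(behavior.getKey()).matcher(requestPath).matches()) {
                return behavior.getValue();
            }
        }
        return DEFAULT_ORIGIN;
    }

    public static void main(String[] args) {
        for (String path : new String[] {"/booking/42", "/submit", "/index.html"}) {
            System.out.println(path + " -> " + route(path));
        }
    }
}
```

In this sketch, /booking/42 and /submit go to the API origin while /index.html falls through to the bucket. Since the first match wins, more specific patterns belong before broader ones.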

Why Terraform is a Game-Changer

Writing this in Terraform means I have a “Single Source of Truth.” If I need to update my API Gateway stage or change how my site routes traffic, I don’t go hunting through the AWS console. I simply update my .tf files and run terraform apply.

Some of the key wins for me have been:

  • Consistency: My sandbox environment is built from the same .tf files I’ll eventually use for production, so the two stay in sync.
  • Transparency: I can share my configuration with others, and they can see exactly how the api-gateway-policy in reverse_proxy.tf is constructed to handle headers.
  • Automation: I’ve even configured Terraform to dynamically inject my API URL into a config.js file at deployment time, so my frontend always knows exactly where to find the backend.

Final Thoughts

Moving to Infrastructure as Code hasn’t just made my deployments faster; it’s made them smarter. By leveraging CloudFront as a reverse proxy, I’ve cleaned up my frontend code, eliminated CORS issues, and built a foundation that can scale.

If you’re still clicking buttons in the AWS console, I highly recommend giving Terraform a try. Your future self—and your deployment logs—will thank you!

Happy coding, and see you in the cloud!

Fine-tuning AI to get the response you want and prevent the response you don’t want!

Tuning Your AI App: The “Sweet Spot” Strategy

Tools Used: Ollama + llama3, Spring AI

Building an AI-driven booking system sounds straightforward until your Kenyan safari guide starts trying to sell trips to Hawaii. After some deep-dive troubleshooting with Spring AI and Vector Stores, I’ve found that the secret to a perfect RAG (Retrieval-Augmented Generation) system isn’t just better data—it’s finding the right Similarity Threshold.


The Problem: The “Over-Helpful” AI

When using a vector store (like SimpleVectorStore), the system looks for the “closest” match to a user’s question. If a user asks for “Hawaii” and your database only contains “Kenya,” the math will still find the closest thing it can—likely a Kenyan beach tour. Without a gatekeeper, your AI sees that “closest” match and assumes it’s relevant, leading to confusing responses.
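Here is a small sketch of why a "closest" match always exists. It assumes the store ranks documents by cosine similarity, which is a common choice, and it uses made-up 3-dimensional vectors in place of real embeddings. The class name ClosestMatchDemo, the tour names, and every number are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy nearest-neighbor lookup with made-up 3-dimensional "embeddings". */
class ClosestMatchDemo {
    // Hypothetical vectors; real embeddings have hundreds of dimensions.
    static final Map<String, double[]> TOURS = new LinkedHashMap<>();
    static {
        TOURS.put("Kenyan safari", new double[] {1.0, 0.1, 0.0});
        TOURS.put("Kenyan beach tour", new double[] {0.3, 0.9, 0.1});
        TOURS.put("Nairobi city tour", new double[] {0.6, 0.0, 0.0});
    }

    /** Cosine similarity: dot product divided by the product of the vector lengths. */
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the best-scoring tour, however weak that best score is. */
    static String closestTour(double[] query) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, double[]> tour : TOURS.entrySet()) {
            double score = cosine(query, tour.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = tour.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] hawaiiQuery = {0.0, 0.6, 0.8}; // hypothetical embedding of "Hawaii"
        System.out.println("Closest match: " + closestTour(hawaiiQuery));
    }
}
```

Nothing in closestTour asks whether the winner is actually relevant. It only asks which candidate scores highest, so the Hawaii query still comes back with the Kenyan beach tour.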

The Solution: The Similarity Threshold

The Similarity Threshold acts as a filter. It tells the system: “Unless the match is at least this relevant, ignore it.”

  • Setting it too high (e.g., 0.8): The “Nothing works” zone. Even valid searches for “Safari” or “Elephants” get blocked because the mathematical match isn’t “perfect” enough.
  • Setting it too low (e.g., 0.0): The “Hawaii” zone. Everything gets through, including irrelevant data, making your AI hallucinate or provide wrong locations.
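The two failure zones above can be sketched with a plain filter over scored results. This is an illustration only: the class name ThresholdFilterDemo, the document names, and all the scores are made up, chosen to sit in the same low range I saw in my own setup.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy illustration of a similarity threshold; all scores are made up. */
class ThresholdFilterDemo {
    /** Keeps only the documents whose score is at least the threshold. */
    static List<String> filter(Map<String, Double> scoredDocs, double threshold) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Double> doc : scoredDocs.entrySet()) {
            if (doc.getValue() >= threshold) {
                kept.add(doc.getKey());
            }
        }
        return kept;
    }

    /** Hypothetical scores for a valid query such as "Safari". */
    static Map<String, Double> safariScores() {
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("Maasai Mara safari", 0.42);
        scores.put("Amboseli elephant tour", 0.31);
        return scores;
    }

    /** Hypothetical scores for an off-topic query such as "Hawaii". */
    static Map<String, Double> hawaiiScores() {
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("Kenyan beach tour", 0.06);
        return scores;
    }

    public static void main(String[] args) {
        for (double threshold : new double[] {0.8, 0.0, 0.1}) {
            System.out.println("threshold " + threshold
                + " | Safari: " + filter(safariScores(), threshold)
                + " | Hawaii: " + filter(hawaiiScores(), threshold));
        }
    }
}
```

With these made-up numbers, 0.8 returns nothing even for the safari query, 0.0 lets the beach tour through for Hawaii, and 0.1 keeps both safari results while dropping the Hawaii match.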

Finding the “Sweet Spot”

Through testing with Llama 3, I discovered that commonly suggested thresholds (like 0.7) don’t work for every local setup, because similarity scores depend on the embedding model and the data behind them. In my case, 0.1 was the magic number.

How to Implement It in Spring AI

In the ConciergeChatService, you can enforce this by using a constant and applying it to your SearchRequest. This ensures that if the similarity doesn’t hit your target, the list of documents comes back empty, and you can handle it with a polite “Jambo! I only specialize in Kenya” message.

Java

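import java.util.List;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;

// Minimum similarity score a document needs to count as relevant
private static final double MIN_SIMILARITY = 0.1; // The magic number

public ConciergeChatResponse getConversationalResponse(String userQuestion) {
    // Search for the user's question, keeping only matches at or above the threshold
    List<Document> docs = vectorStore.similaritySearch(
        SearchRequest.builder()
            .query(userQuestion)
            .similarityThreshold(MIN_SIMILARITY)
            .topK(5)
            .build()
    );
    // No relevant documents: reply with a fixed message instead of letting the model guess
    if (docs.isEmpty()) {
        return new ConciergeChatResponse("I only specialize in Kenyan tours.", docs);
    }
    // Proceed to generate AI response...
}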

Key Takeaway

If your AI is leaking irrelevant results or staying silent when it should speak, don’t assume your code is broken. Instead:

  1. Set your threshold to 0.0 to see what scores your valid terms are getting.
  2. Slowly nudge the number up until the “noise” (like Hawaii) disappears.
  3. Let your Java logic handle the empty results rather than letting the AI guess.
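The first two steps can be sketched as a small sweep. This is a rough illustration rather than part of my actual service: the class name ThresholdSweep, the step size, and all the scores are hypothetical, and in practice you would collect the scores by logging real search results with the threshold at 0.0.

```java
import java.util.OptionalDouble;

/** Sketch of the "start at 0.0 and nudge upward" tuning loop; scores are made up. */
class ThresholdSweep {
    /** Counts how many scores would survive the given threshold. */
    static int countAtOrAbove(double[] scores, double threshold) {
        int count = 0;
        for (double score : scores) {
            if (score >= threshold) {
                count++;
            }
        }
        return count;
    }

    /**
     * Raises the threshold from 0.0 in fixed steps until every noise score is rejected.
     * Returns empty if that threshold would also reject a valid score,
     * or if no such threshold is found by 1.0.
     */
    static OptionalDouble findThreshold(double[] validScores, double[] noiseScores, double step) {
        if (step <= 0) {
            throw new IllegalArgumentException("step must be positive");
        }
        for (int i = 0; i * step <= 1.0; i++) {
            double threshold = i * step;
            if (countAtOrAbove(noiseScores, threshold) == 0) {
                // Noise is gone; accept only if no valid result was lost along the way.
                boolean validKept = countAtOrAbove(validScores, threshold) == validScores.length;
                return validKept ? OptionalDouble.of(threshold) : OptionalDouble.empty();
            }
        }
        return OptionalDouble.empty();
    }

    public static void main(String[] args) {
        double[] validScores = {0.42, 0.31}; // hypothetical scores for "Safari", "Elephants"
        double[] noiseScores = {0.06, 0.08}; // hypothetical scores for "Hawaii"
        System.out.println(findThreshold(validScores, noiseScores, 0.05));
    }
}
```

With these made-up numbers the sweep settles on 0.1. An empty result means the valid and noise scores overlap, which suggests the data or the embedding model needs attention rather than the threshold.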

Tuning these numbers is the difference between an AI that guesses and an AI that knows.
