Resume — Viet Nguyen

Summary

Machine Learning Engineer and Data Scientist with hands-on experience building end-to-end ML pipelines, RAG systems, LangChain conversational AI, and serverless generative AI on AWS. Proficient in Python, scikit-learn, LangChain, Amazon Bedrock, FastAPI, and SQL. Pursuing an M.S. in Computer Science (Machine Learning) at Georgia Tech. Seeking ML Engineering and Data Science roles in generative AI and model deployment.

Skills

Languages & Tools: Python, SQL, Git, GitHub, Jupyter Notebook
Machine Learning: scikit-learn, Random Forest, Decision Tree, Gradient Boosting, Feature Engineering, SMOTE, Supervised Learning, Unsupervised Learning, ML Pipelines, Hyperparameter Tuning, GridSearchCV, Model Evaluation, Classification, Regression
Generative AI & LLMs: LangChain, ChatBedrock, Amazon Bedrock, Retrieval-Augmented Generation (RAG), Prompt Engineering, LLM APIs, Conversational AI, Vector Databases, Titan Embeddings, OpenSearch Serverless
Data Science: Pandas, NumPy, Matplotlib, Seaborn, SciPy, Exploratory Data Analysis (EDA), Statistical Analysis, Hypothesis Testing, A/B Testing
Cloud & AWS: AWS Lambda, Amazon API Gateway, Amazon S3, Amazon EMR, AWS EC2, CloudWatch, IAM, boto3, AWS Certified Cloud Practitioner
Model Deployment & APIs: FastAPI, Pydantic, REST API, Model Serving, joblib
Familiar With: PyTorch, TensorFlow, Docker, HuggingFace Transformers, LightGBM

Experience

Co-Founder & Operations Lead

2024 — Present

TechX Robotics · Tustin, CA

Managed full P&L for a $43,000+ revenue robotics education business across payroll, insurance, equipment, and tournament operations
Compared outputs from Claude, ChatGPT, and Gemini to evaluate owner compensation scenarios, applying LLM evaluation to a real financial decision
Designed a competitive curriculum that led VEX Robotics teams to 1 World Championship and multiple State Championships

Marketing Data Analyst · Volunteer

2025 — Present

Association for Talent Development – Orange County (ATD-OC) · Anaheim, CA

Delivered a data-driven digital marketing analysis using an AI-powered agentic workspace, surfacing actionable growth opportunities across member engagement channels
Developed a 3-phase strategic roadmap targeting 15% membership growth and 40% engagement improvement
Deployed an interactive marketing analytics dashboard via Vercel to accompany the strategic report

Engineering & Robotics Teacher · VEX AI Robotics Coach

2016 — Present

Jeffrey Trail Middle School · Irvine, CA

Manage a $15,000–$20,000 annual engineering and robotics program budget
Teach AI, machine learning concepts, robotics, and computer science to middle school students
Coached teams to 4 consecutive World Championships and 5 consecutive State Championships

I.T. & Cybersecurity Support Intern

2023

Manhattan Beachwear · Cypress, CA

Supported a post-ransomware IT overhaul following a $10M breach; deployed MFA, AI-based endpoint protection, and secure remote access across all employee systems

Projects

LangChain Conversational AI — AWS Bedrock

March 2026

Built a stateful, context-aware staff scheduling chatbot using LangChain's ChatBedrock — implementing prompt templates, output parsers, conversation memory, and CSV document injection to power multi-turn AI conversations on Amazon Bedrock.

LangChain, ChatBedrock, Amazon Bedrock, Conversational AI, Prompt Engineering, Python

RAG System — Amazon Bedrock Knowledge Base

March 2026

Engineered a retrieval-augmented generation (RAG) pipeline on Amazon Bedrock using Titan Embeddings for semantic search, OpenSearch Serverless as the vector store, and the RetrieveAndGenerate API to ground LLM responses in private enterprise documents.

RAG, Amazon Bedrock, Vector Databases, OpenSearch Serverless, Titan Embeddings, boto3

House Price Prediction — Production REST API

March 2026

Deployed a Random Forest regression model as a production REST API using FastAPI and Pydantic schema validation, with startup model loading via joblib and a /health endpoint for monitoring.

FastAPI, Random Forest, Model Deployment, REST API, scikit-learn, Pydantic

Customer Churn Prediction — Beta Bank

January 2026

Built a binary classification model using feature engineering, SMOTE oversampling, and GridSearchCV across 108 hyperparameter combinations. Achieved F1 = 0.6197 on an imbalanced dataset with a 20% minority class.

scikit-learn, Classification, SMOTE, Feature Engineering, GridSearchCV, Random Forest

Big Data Benchmarking — Apache Spark on AWS EMR

April 2024

Benchmarked Apache Spark and Hadoop across EC2 instance types on Amazon EMR to identify the optimal distributed computing configuration for a real-world platform. Delivered cost/performance analysis via CloudWatch metrics.

Apache Spark, PySpark, Amazon EMR, Big Data, AWS EC2, CloudWatch

Serverless AI Application — AWS Lambda + Bedrock

February 2026

Connected Amazon Bedrock to a frontend via Lambda and API Gateway to generate AI-powered flashcards from study notes — fully serverless, CORS-enabled, and prompt-engineered.

AWS Lambda, Amazon Bedrock, API Gateway, Python, IAM

Megaline Prepaid Plan Statistical Analysis

March 2026

Analyzed 500 clients across 5 merged tables to compute per-user monthly revenue; confirmed via two-sample t-test that Surf generates significantly more revenue than Ultimate (avg $50.33 vs $47.31, p ≈ 0).

Python, Pandas, SciPy, Hypothesis Testing, Statistical Analysis

Mobile Plan Recommendation Engine

December 2025

Built a Random Forest classifier that recommends a Megaline mobile plan; reached 81.8% test accuracy, 6.8 points above the target.

scikit-learn, Random Forest, Decision Tree, Classification

Chicago Taxi Market Analysis

November 2025

Used SQL and hypothesis testing to show that bad weather lengthens Loop-to-O'Hare rides by 20.6% (t = 6.84, p ≈ 0).

Python, SQL, SciPy, Hypothesis Testing, EDA

Video Game Sales Analysis

July 2025

Mined 16,715 game records to build a data-backed 2017 ad strategy, with hypothesis tests of platform and genre preferences across NA, EU, and Japan.

Python, Pandas, Matplotlib, Hypothesis Testing, EDA

IMDb "Golden Age" TV Analysis

April 2025

Tested whether highly rated "Golden Age" TV shows also receive the most IMDb votes; cleaned messy real-world data, then confirmed the hypothesis.

Python, Pandas, EDA, Data Preprocessing

Instacart Customer Behavior EDA

2024

Cleaned and analyzed 4.5M order records to uncover peak shopping windows, reorder rhythms, and top items across 206K customers.

Python, Pandas, Matplotlib, EDA

VEX AI Robotics Competition Enhancement

February 2024

Proposed SpotFi localization, RGB-D 3D mapping, and P2P decentralized coordination to solve the limited field-of-view problem in autonomous VEX robots. (Georgia Tech CS6675)

VEX AI, RGB-D Mapping, Sensor Fusion, Georgia Tech

AI Fairness in Housing Lending

Georgia Tech CS6603

Applied Disparate Impact and Statistical Parity Difference to 2.5M Fannie Mae mortgage records across Race and Gender; measured whether Reweighting bias mitigation survives classifier training.

Python, Fairness Metrics, Bias Mitigation, Georgia Tech

Certifications

AWS Certified Cloud Practitioner

2026

Amazon Web Services

AWS Cloud Institute — Cloud Application Developer

Expected 2027

Amazon Web Services

Education

M.S. Computer Science — Machine Learning Specialization

Expected 2028

Georgia Institute of Technology

AI and Machine Learning Bootcamp

Expected 2026

TripleTen

M.A. Teaching — Science Education

2011

University of Southern California