Hi, I'm

Sagorika Ghosh

Applied ML Engineer & Data Scientist at the University of Washington, crafting intelligent systems with RAG, LLMs, Deep Learning, and Production ML.

ML/AI
Data Science
LLMs & RAG
Agentic AI
A/B Testing
Statistical Analysis

01. About Me

I'm an Applied ML Engineer and Data Scientist with a passion for building intelligent systems that solve real-world problems. I recently graduated with my Master's from the University of Washington, specializing in Retrieval-Augmented Generation (RAG), LLMs, Agentic AI Systems, and Scalable ML Pipelines.

With experience at Uber, American Express, Contextual AI, AWS, Hitachi Vantara, and Microsoft, I bridge the gap between cutting-edge research and production-grade software. I don't just train models; I deploy them to create tangible impact.

Most recently at Contextual AI, I designed a multilingual multimodal RAG system with hybrid retrieval, cross-encoder reranking, and LLM-as-a-Judge evaluation, achieving 86.5% Recall@3 across 150+ technical manuals. I've also built deep learning models for real-time predictions, forecasting systems across global markets, and scalable ETL pipelines.

I'm a University Gold Medalist from GGSIPU with a perfect 4.0 GPA, and I believe in building ML systems that don't just work in notebooks—they work in production.

3.9 MS GPA
4.0 UG GPA
4+ Years of Experience
6+ Projects

Education

University of Washington

M.S. in Data Science

GPA: 3.9/4.0 • Seattle, WA

Sep 2024 – Mar 2026

NLP • LLM Serving Systems • Deep Learning • Applied Statistics • Agentic AI • RAG • AWS

Guru Gobind Singh Indraprastha University

B.Tech in Computer Science & Engineering (University Gold Medalist)

GPA: 4.0/4.0 • Delhi, India

Aug 2018 – Jun 2022

Applied Mathematics • Programming Fundamentals • Probability and Statistics • Machine Learning • DBMS

02. Skills & Technologies

Languages

Python
C++
Java
SQL
JavaScript

ML & Modeling

PyTorch
TensorFlow
Regression & Classification
Time-Series Forecasting
NLP & Transformers

LLMs & GenAI

RAG Systems
Agentic AI & CrewAI
Prompt Engineering
LLM Evaluation

Systems & Data

Spark & Hive
AWS (Bedrock, SageMaker)
Snowflake & Databricks
ETL Pipelines

Experimentation & Tools

A/B Testing & CUPED
Causal Inference
Git & CI/CD
Docker
Tableau & Streamlit

03. Experience

AI Practitioner

Amazon Web Services (AWS)
Feb 2026 – Apr 2026

Built enterprise-grade AI workflows using Amazon Bedrock and SageMaker as part of a selective technical cohort, focusing on RAG patterns, foundation model selection, and inference tuning with cost-latency optimization.

Amazon Bedrock • SageMaker • RAG • LLMs

Applied Science Capstone Apprenticeship

Contextual AI
Sep 2025 – Mar 2026

Built a multilingual, multimodal RAG system for HVAC refrigerant recovery that supports WhatsApp voice and image queries, OCR-based equipment parsing, and grounded QA over 150+ technical manuals. Designed the retrieval stack with hybrid search, metadata-aware filtering, query expansion, and reranking, and benchmarked both open-source and Contextual AI pipelines on retrieval quality, latency, and hallucination detection.

RAG • OCR • Cross-Encoder • LLM-as-Judge

Applied Scientist Intern

Uber Technologies
Jun 2025 – Sep 2025

Worked on delivery ETA and marketplace optimization. Improved ATD prediction with ML models and production SQL pipelines, used statistical analysis to evaluate dispatch decisions under supply and food-readiness constraints, and supported A/B tests that improved ETA accuracy and reduced courier time per trip in undersupplied markets.

PyTorch • Deep Sets • A/B Testing • CUPED • SQL

Data Scientist 2

American Express
Aug 2023 – Aug 2024

Improved fraud detection recall by 1.2% through XGBoost optimization, driving $20K in annual loss reduction. Developed LSTM-based cloud autoscaling models with 98.9% accuracy and built BiLSTM forecasting systems for FX rates and inbound calls across 28 global markets.

XGBoost • LSTM • CNN-LSTM • Python

Data Engineer

American Express
Aug 2022 – Aug 2023

Migrated 75% of global asset management data to a big data lake, optimizing ETL pipelines for 12% faster processing. Built high-throughput Java microservices routing 200K+ daily requests, reducing infrastructure costs by $10.2K annually.

Java • ETL • Big Data • Microservices

ML Engineer

Hitachi Vantara
Feb 2022 – Jul 2022

Optimized IQ data pipelines by integrating U-Net autoencoder-based anomaly detection, improving data processing efficiency by 12.3% and enhancing real-time monitoring capabilities.

U-Net • Anomaly Detection • Python

Engage Intern

Microsoft
Oct 2021 – Dec 2021

Developed a neural collaborative filtering recommender with constraint optimization for scheduling, achieving a 29% improvement in overall scheduling efficiency and system performance.

Collaborative Filtering • Optimization • Python

04. Featured Projects

End-to-end ML systems built for production, not just prototypes.

RAG • Investment Analysis

Quant Copilot

A RAG-based investment analysis system on Databricks that serves Llama inference over Delta tables (5M+ rows), with a demonstrated 60% reduction in end-to-end time in benchmarks comparing manual and RAG-assisted workflows.

1.8s Median Latency
5M+ Rows
60% Time Reduction
Databricks • Llama • Delta Tables • RAG

Deep Learning • Healthcare

Breast Cancer Dashboard

Interactive dashboard for breast cancer analysis, featuring ResNet-based classification, Kaplan-Meier survival analysis, and exploratory data analysis, built with a Streamlit interface.

ResNet • Kaplan-Meier • Streamlit • EDA

05. Get In Touch

I'm currently looking for opportunities in Applied AI/ML, Data Science, and ML Modelling. Whether you have a question, want to discuss a project, or just want to say hi, my inbox is always open!

Let's Connect