Hanna Gupta

@user.2537928

Data Scientist

Mumbai

Blenheim Chalcot • Lovely Professional University

Hanna Gupta is a Data Scientist at Blenheim Chalcot with expertise in ML pipelines, LLMs, RAG systems, and Azure-based document processing for the UK public sector. She developed an ML pipeline processing 500K+ NHS patient records, engineered a document pipeline that cut processing time from 5 days to 15 minutes, and led 6+ generative AI solutions, including one that achieved an 80% efficiency gain. Hanna holds a B.Tech in Computer Science with a specialization in Data Science (AI & ML) from Lovely Professional University, with a grade of 8.4/10.

Experience

Data Scientist

Blenheim Chalcot

Mar 2024 - Present • Mumbai | London

Developed an ML pipeline processing 500K+ NHS patient records with temporal/weather data, improving resource allocation by 35%. Enhanced the model with SMOTE sampling and time-based validation splits, achieving a 95% improvement in no-show predictions. Implemented ensemble models (XGBoost/Random Forest), reducing features by 40% while maintaining 95% accuracy and cutting costs by 30%. Built a FastAPI server for FOI/SAR document intake, leveraging LLMs for 85% faster contextual multi-class classification with 99.9% uptime. Developed a transformer and graph-based entity detection system with Azure PII and NER, improving redaction accuracy from 65% to 90%. Engineered an Azure/PostgreSQL document pipeline with fine-tuned LLMs, reducing processing time from 5 days to 15 minutes (a reduction of over 97%) while maintaining 100% compliance. Implemented a RAG pipeline with a vector database for automatic answer retrieval.

Data Science Intern

Blenheim Chalcot

Internship • Jun 2023 - Feb 2024 • Mumbai | London

Led 6+ generative AI solutions for the UK public sector, including social services automation that reduced documentation time from 4 hours to 45 minutes for 200+ workers (an efficiency gain of about 80%). Built an OpenAI/Flask-based request triage system with CI/CD deployment, handling 5K+ weekly requests with 95% accuracy and reducing response time by 60%. Implemented a RAG system with Pinecone/LangChain, achieving 99.9% data security while reducing retrieval time by 70%. Created a multimodal (LLM and diffusion model) comic generator producing 1,000+ educational materials with 85% satisfaction.

Education

Lovely Professional University

B.Tech

Computer Science, specialization in Data Science (AI & ML)

Aug 2020 - Jul 2024 • Grade: 8.4/10

Licenses & Certifications

Neural Networks and Deep Learning

Coursera, DeepLearning.AI

Issued: Sep 2024

Supervised Machine Learning: Regression and Classification

Coursera, DeepLearning.AI, Stanford

Issued: Oct 2024

Skills

Python
Java
SQL
NoSQL
FastAPI
Flask
Streamlit
SQLAlchemy
Machine Learning
NLP
Deep Learning
Generative AI
LLMs
LangChain
LlamaIndex
Prompt Engineering
RAG
Diffusion Models
Knowledge Graph
DeepEval
TensorFlow
Scikit-Learn
PyTorch
Pinecone
ChromaDB
FAISS
Neo4j
Azure
Databricks
Power Automate
Power BI
XGBoost
Random Forest
SMOTE