Skilled Data Scientist with 5 years of experience in data acquisition and engineering, statistical analysis, model building (machine learning, deep learning, time series, NLP), and deployment, following the CRISP-DM methodology.
Experience
Machine Learning Engineer
Asentech LLC
Prepared a dataset for a healthcare-domain problem through web scraping, annotation, and pre-processing. Used the dataset to build a conversational AI chatbot (Python, NLP, and deep learning) that answers user queries. The bot understands context and performs multi-class intent classification, entity extraction, sentiment analysis, and question answering. Fine-tuned state-of-the-art Transformer models (BERT). Built a Docker image and deployed the model on AWS using ECR and ECS (with Fargate).
Designed a CNN-based model in TensorFlow that detects the level of blindness caused by diabetes. Implemented data augmentation, oversampling, and transfer learning (using Inception V3) to raise accuracy from 66% to 82%. Used MLflow for model tracking.
Developed a multi-label classification model using AWS SageMaker that detects several complications of diabetes. XGBoost gave the best performance, with an F1-score of 92%. Deployed the model using AWS Lambda and API Gateway.
Founding Member
Dukandar
Dukandar is an inventory management app for shopkeepers that helps retailers with stock audit and planning, with stock planning based on previous purchases using an ARIMA model. Built an FAQ chatbot using machine learning models. Monitored app and user metrics in Google Play. Handled product marketing, user acquisition, and customer retention. Managed the tech and sales teams building Dukandar.
Programmer Analyst
Cognizant India Pvt Ltd
Built an ML-based Fixed Deposit (FD) propensity model for a banking client using Python and Pandas to identify the customers most likely to open an FD. The process involved building an ETL pipeline to extract and transform data from multiple sources and aggregate it in MySQL. Performed exploratory analysis to identify trends, generated reports and visualizations, and built dashboards using Tableau. The AdaBoost model helped improve the customer conversion rate by 35% and enhanced customer engagement for the client.
Built and deployed a web app using Flask, PySpark, and AWS EC2 that identifies customers eligible for a loan, so that the client can target those customers specifically. Random Forest gave the best performance, with an F1-score of 73%. The app helped reduce total process time by 20%.
Structured and re-modelled business process, conceptual, and data models for clinical trials in ER Studio. Managed master data and developed mappings, workflows, and profiling using Informatica Data Quality.
Data migration project: cleansed, validated, and migrated historical and current data for 750 organizations from a Payroll system to a modern Payroll + HCM system using Oracle and Datalake (the client's in-house ETL tool). This was a key Cognizant project (120+ team members).
Education
Praxis Business School
PGP in Data Science
Karunya University
BTech in Computer Science
Licenses & Certifications
TensorFlow Developer Certificate
TensorFlow
AWS Certified Cloud Practitioner
AWS
ANSI SQL Skill
Cognizant
Big Data and Data Analytics
Cognizant