Atharva Mulay

@atharvamulay

Data Scientist (AI/ML) at Capgemini

Pune, India

https://github.com/atharva369/Duplicate-Question-Pairs-classification

Capgemini | K.J. Somaiya Institute of Management Studies and Research

Atharva has nearly two years of experience building Data Science and Machine Learning solutions in Python, focusing on Computer Vision, NLP, and Deep Learning. He is skilled in deploying ML/DL models with Flask on AWS and has hands-on knowledge of Amazon SageMaker, MySQL, and MongoDB. He recently earned the Microsoft Azure Data Scientist Associate certification.

Experience

Data Scientist (AI/ML)

Capgemini

May 2021 - Present

Pune, India

Document verification and car damage severity detection to build a Buyer Recommendation System (Computer Vision): Automated end-to-end document verification and classification using deep learning. Implemented classification with an Inception model and tagged the markings using Azure Form Recognizer. Assessed the vehicle's external condition and damage: YOLOv3 handled object detection, Google OCR read the number on the number plate to check whether the vehicle is insured by the company, and Detectron instance segmentation determined which part of the car is damaged and how severely. This in turn helps determine the final sale price of the vehicle scrap.

Insurance Fraud Detection (Clustering and Classification with Deployment on AWS): Automated the manual process of separating fraud and defaults in insurance applications by grouping them into clusters with K-means and then classifying them with XGBoost and SVC. Hyperparameter tuning improved model accuracy from 69% to 95% and subsequently improved recall and AUC. Sourced the client files from an AWS S3 bucket (boto). Containerized the ML web app with Docker and deployed the model on an AWS EC2 instance, with requests routed through a Flask API.

Automated question answering from FAQs using BERT (NLP POC): Trained an NLP Bi-LSTM model on existing ticket information from the company's customer service platform. Customer requests are routed through the Flask API to the respective teams for resolution. Set up a POC for automatic question answering using the state-of-the-art DistilBERT model with cdQA (closed-domain question answering), trained on PDFs related to the customer query services collected over multiple years.

Data Analyst

ABC Steps Technologies

Aug 2018 - Aug 2019

Sentiment/review analysis of the reviews customers left for the company's services over the years. Implemented the probabilistic Naïve Bayes algorithm to separate positive from negative/neutral sentiments, routing positive reviews to the company's testimonial database and negative ones to the PR department. Analyzed customer buying behavior through a market basket analysis using Apriori with a minimum support of 0.07 and association rules evaluated by lift. Recommended products where lift was greater than 3 and confidence greater than 0.03. The recommendations from this study led to a substantial rise in sales for the client.
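The market basket thresholds above (support of at least 0.07, lift above 3, confidence above 0.03) can be illustrated with a minimal sketch. It is not the code used in the project: the `pair_rules` helper and the toy `transactions` are hypothetical, and it scores only item pairs in plain Python, whereas a full Apriori run also grows and prunes larger itemsets.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.07, min_lift=3.0, min_confidence=0.03):
    """Return (antecedent, consequent, support, confidence, lift) for item pairs.

    Hypothetical helper: only pairs are scored, unlike full Apriori.
    """
    n = len(transactions)
    baskets = [frozenset(t) for t in transactions]
    # How many baskets contain each item, and each unordered pair of items.
    item_count = Counter(item for b in baskets for item in b)
    pair_count = Counter(
        frozenset(p) for b in baskets for p in combinations(sorted(b), 2)
    )
    rules = []
    for pair, count in pair_count.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        for a in pair:
            (c,) = pair - {a}  # the other item of the pair
            confidence = count / item_count[a]  # P(c | a)
            lift = confidence / (item_count[c] / n)  # P(c | a) / P(c)
            if lift > min_lift and confidence > min_confidence:
                rules.append((a, c, support, confidence, lift))
    return sorted(rules, key=lambda r: -r[4])


# Toy baskets, invented for illustration only.
transactions = [
    {"bread", "milk"}, {"bread", "milk"}, {"bread", "milk"},
    {"bread", "milk"}, {"bread", "milk"}, {"bread", "milk"},
    {"bread"}, {"milk"},
    {"tent", "stove", "bread"}, {"tent", "stove", "milk"},
]

rules = pair_rules(transactions)
for a, c, support, confidence, lift in rules:
    print(f"{a} -> {c}: support={support:.2f}, confidence={confidence:.2f}, lift={lift:.2f}")
```

In this toy data, bread and milk appear together often, but each is so common on its own that the lift stays below 1, so the pair is not recommended. Tent and stove co-occur in only 2 of 10 baskets, yet always together, which gives a lift of 5 and passes all three thresholds.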

Education

K.J. Somaiya Institute of Management Studies and Research

MBA in Data Science

Data Science

Jan 2019 - Jan 2021

Pune University

BE ENTC

Electronics and Telecommunication

Jan 2013 - Jan 2017

Licenses & Certifications

Microsoft Azure certified Data Scientist Associate

Microsoft Azure

• No expiration

Skills

Decision Tree
SVM
Random Forest
KNN
XGBoost
K-Means
DBSCAN
Hierarchical Clustering
PCA
t-SNE
ANN
CNN
RNN
LSTM
Transfer Learning
Seq2Seq
Attention
Transformers
BERT
Image classification
Object Detection using SSD, YOLO, GANs
NumPy
Pandas
Seaborn
scikit-learn
Flask
TensorFlow
Keras
Azure
AWS
Heroku
MySQL
MongoDB
Git
Docker