SUBHASH BUNKAR

@subhashbunkar

Data Scientist at UNRAVEL

Jaipur, Rajasthan

https://www.linkedin.com/in/subhash-bayal-018b2a12

UNRAVEL | IIT Kharagpur

Experienced Data Scientist with a BTech degree from IIT Kharagpur, 2 internships, and 2 projects. Has over 3 years of hands-on experience, specializing in machine learning, deep learning, computer vision, NLP, and generative AI.

Experience

DATA SCIENTIST, MTS-1 | Software Engineer

UNRAVEL

Aug 2022 - Present | Pune

Unraveling Customer Chatbot Utilizing T5: Extracted and cleaned Unravel documentation text from the official website using BeautifulSoup to improve its usability. Developed a chatbot, built with LangChain and the Llama 2 LLM, that helps customers identify configuration features and basic information about Unravel's recommendations.

Unravel Real-time Query Calculator: Extracted features from SQL queries using SQLGlot to build a regression-based model that predicts SQL query cost prior to execution. Trained ML models such as linear regression, Random Forest, and XGBoost regression, achieving an adjusted R^2 of 0.85. Trained an XGBoost classifier with 84% recall and 78% precision to sort queries into 3 categories (Low, Medium, and High).

Healthcheck: Performed EDA on query and warehouse insights data and generated a backend health check report in JSON to support customers' decisions regarding the Unravel product.

Others: Developed and deployed 2 dashboards using Plotly and Dash for cost anomalies in EMR users' data. Developed an ML model using LR, Decision Tree, SVM, Random Forest, XGBoost, AdaBoost, and Neural Networks to classify Spark applications as high-cost or low-cost. Optimized BigQuery cloud cost by generating custom quota suggestions and moving queries to a different pricing plan.

Associate Software Engineer | Data Analyst

Innoplexus

Feb 2021 - Sep 2022 | Pune

Developed utilities (Python, OOP) and maintained the IEP pipeline, which is used to prioritize indications for a given target (protein). Developed a market analysis component to prioritize indications based on current marketed and clinical trial status. Developed a CNN model (Recall: 87%, Precision: 78%) for predicting gene-disease association using protein sequence and disease symptom embeddings.

Marketing Analyst

Adret Retail Pvt. Ltd

Internship | May 2020 - Aug 2020 | Bangalore

Scraped reviews of 10 different 'KAPIVA' products from Amazon using Web Scraper and analyzed them for model creation. Developed a model, with hyperparameter tuning via grid search CV, to classify reviews as positive or negative using LR, RNN, LSTM, and BERT. Identified 40-50 relevant keywords for the KAPIVA products to optimize their Amazon listings.

Data Scientist

Angel Broking Pvt Ltd

Internship | May 2019 - Jul 2019 | Mumbai

Selected important features and combined them with affinity data to predict offer types for each lead and maximize the closure rate. Developed an ML model to predict offers for daily leads using KNN, DT, Random Forest, AdaBoost, and XGBoost, achieving 87% recall and 83% precision. Predicted the probability of lead conversion using logistic regression and raised the monthly lead closure rate from 1% to 15% in 2 months.

Education

IIT Kharagpur

B.TECH

Jan 2021 | Grade: 7.94 / 10

Vidya Bharati School

12th

Jan 2016 | Grade: 86.80%

Vidya Bharati School

10th

Jan 2014 | Grade: 92.33%

Licenses & Certifications

Introduction to Generative AI and LLMs

Self

Issued: Sep 2023 | Expires: Dec 2023

Foundation of Generative AI

Self

Issued: Dec 2023

Mathematics - Statistics & Probability for Data Science, Machine Learning & Deep Learning

Issued: Nov 2023

Skills

Python
Machine learning
Deep Learning (CNN, RNN, ANN)
LSTM
BERT (Language Model)
LangChain
OpenAI API
spaCy
NLTK
Large Language Models (LLM)
NLP
C
C++
R
SQL
Transformers
Recommendation Systems
Auto-Encoder
LLM fine-tuning (LoRA, QLoRA, RAG)
Vector Database (Pinecone)
Dashboard Development
Statistics
BigQuery
Snowflake
Flask API
Web scraping
Elasticsearch
Pandas
NumPy
scikit-learn
Keras
TensorFlow
Dash
Streamlit
Plotly
SQLGlot