Sumit Mundkar

@sumitmundkar

Data Scientist

Pune, Maharashtra

Tech Mahindra

Sumit Mundkar is an experienced Data Scientist specializing in predictive modeling, machine learning, and generative AI solutions. He has a proven track record of developing and deploying advanced models, including those for customer churn prediction, sales forecasting, and airline booking analysis, utilizing Python, SQL, and cloud platforms like AWS. His expertise spans MLOps practices, including CI/CD pipelines, Docker, and MLflow, enabling the deployment of robust, scalable data solutions.

Experience

Sr Software Engineer

Tech Mahindra

•Jan 2024 - Mar 2024•Pune

Worked on Splunk SOAR using Python playbooks. Developed automation workflows to streamline security operations. Collaborated with the security team to improve incident response processes.

Data Scientist

Comsense Technologies

•Jul 2022 - Dec 2023•Pune

Analyzed booking data of an international airline using Python and ML to predict passenger booking times. Used SQL ETL and a Python CI/CD pipeline for data extraction and deployment. Built an advanced segmentation model for airline passengers that incorporates booking patterns and behaviors. Implemented the best-performing model in a Flask web application, deployed on AWS using Docker. Utilized MLflow for model deployment and tracking. The model was used by the marketing team for campaign planning.

Developed a landing page generator that uses OpenAI large language models (LLMs) to create engaging content tailored to customer preferences, dynamically generating website copy, headlines, and call-to-action buttons. Integrated the generator into Comsense Technologies’ platform to streamline the creation of landing pages for clients. Collaborated with the development team, using a CI/CD pipeline, to optimize the performance and accuracy of the generated content.

Designed and implemented a web scraping tool using OpenAI’s API to extract reviews, images, and product details from retail sites. Utilized LLMs to analyze and summarize customer reviews, extracting insights for automatic report generation. Engineered a system to collect and organize data from various sources, enabling comprehensive analysis of seller performance. Played a key role in developing algorithms to assess seller reputation based on the scraped data, supporting informed decision-making for clients.

Built a recommender model for the airline industry using large datasets and Python. Implemented the recommendation algorithm based on the SAR model, tailoring it for the airline industry and delivering it through a CI/CD pipeline.

Predicted the probability of breakdown and the expected time until the next failure in a combined data engineering and data science project, scoping ETL from different data sources and utilizing MLflow for model d

Data Scientist

KINFOTECH

•Jan 2021 - Jul 2022•Bangalore

Analyzed sales data of a retail company using Python to forecast future sales. Used an SQL ETL pipeline for data extraction. Built time-series models using ARIMA and SARIMA, achieving 95 percent accuracy. Implemented the best-performing model in a Flask web application, deployed on AWS using Docker. The model was used by the business team for production and inventory planning. Detected hotspots in real-time data based on customer volume using Python. Implemented the hotspot detection system on CP4D (IBM Cloud).

Licenses & Certifications

AWS SageMaker Masterclass

AWS

Issued: Jan 2024• No expiration

NLP Certification - BERT, GPTS, HMTL Multimodal Large Model

Issued: Jan 2024• No expiration

IBM Data Science

IBM

Issued: Jan 2022

Udemy Mathematics and Statistics of ML & Data Science

Udemy

Issued: Jan 2022

AIML workshop on building AI-enabled Face Mask Detector

Issued: Jan 2023

Python Programming - From Basics to Advanced level

Issued: Jan 2024• No expiration

Udemy SQL for Data Analysis & Data Science

Udemy

Issued: Jan 2022

Udemy Tableau 2020: A-Z Hands on Tableau For Data Science

Udemy

Issued: Jan 2021

Udemy Microsoft Power BI Certification: A-Z Level (Ver 7.3)

Udemy

Issued: Jan 2021

Skills

Python
SQL
PySpark
HTML
LLM
RAG
Deep Learning
NLP
PyTorch
GPT
Gen AI
Regression
Classification
Clustering
Predictive Modeling
MLflow
CI/CD
Docker
Kubernetes
DevOps
Terraform
AWS
Azure
TensorFlow
Jupyter Notebook
Flask
ARIMA
SARIMA