Purushottam Gurav

@purushottamgurav

Data Scientist

Pune, India

LTIMindtree Pvt. Ltd.

University of Mumbai

Over 2 years of experience applying data science and engineering skills to develop data-driven solutions. Strong foundation in statistics, machine learning, and Python, including libraries such as Pandas and NumPy. Proficient with cloud platforms such as Microsoft Azure, particularly Azure Databricks and Azure Machine Learning, for streamlining data pipelines and building machine learning models. Keeps up with the latest developments by reading blogs and continually builds skills through various learning platforms.

Experience

Data Scientist

LTIMindtree Pvt. Ltd.

Jan 2022 - Jan 2024

Led project strategy, timelines, and workflow decisions, and initiated data preparation from Vertica tables. Conducted a comprehensive research and experimentation phase, exploring techniques and algorithms for categorizing transactions, including NER-based transformer models, Facebook's fastText, and Azure Cognitive Services. Deployed end-to-end training and inference pipelines in Databricks Workflows using MLflow on Databricks, reaching an overall accuracy of 82%, working with PySpark, Python, and Pandas. Highlighted the reusability of the approach and pipelines in another project, "OBOF Domestic: Intent Classification".

Led project strategy and workflows, and initiated data preparation from scratch using Vertica tables. Led data preparation and optimized SQL queries for merging tables in Azure Databricks, storing the results in a Feature Store for reuse in the EDM use case, which reduced data retrieval time and improved overall data accuracy. Researched and experimented with classification algorithms such as XGBoost, LightGBM, and Random Forest to determine the most effective approach. Deployed end-to-end training and inference pipelines using Scikit-Learn, MLflow, and Pandas, generating predictions with an overall average accuracy of 80% across all products and enabling the company to make data-driven business decisions.

Initiated data preparation from scratch using a Vertica table and performed EDA with Pandas, Matplotlib, and Seaborn, surfacing key insights that supported data-driven decision making. Pre-processed data with Scikit-Learn, including data cleaning, null imputation, and categorical encoding with One-Hot Encoder and Ordinal Encoder, while researching algorithms such as Decision Trees and XGBoost to extract rules and calculate metrics. Developed an

Education

University of Mumbai

Master of Science in Statistics

Statistics

Jan 2020 - Jan 2022

University of Mumbai

Bachelor of Science

Statistics

Jan 2017 - Jan 2020

Licenses & Certifications

Azure Data Scientist Associate - DP-100

• No expiration

Azure Fundamentals - AZ-900

• No expiration

Azure AI Fundamentals - AI-900

• No expiration

Azure Data Fundamentals - DP-900

• No expiration

Skills

Statistics
Python
PySpark
Pandas
Azure Databricks
Machine Learning
Scikit-Learn Framework
MLflow Framework
SQL
Data Science
Microsoft Azure Cloud
Azure Machine Learning
Deep Learning
PyTorch
MS Excel
MS PowerPoint
MS Word