Purushottam Gurav
@purushottamgurav
Data Scientist
Pune, India
Data scientist with over 2 years of experience applying data science and engineering skills to build data-driven solutions. Strong foundation in statistics, machine learning, and Python programming (including libraries such as Pandas and NumPy). Proficient with Microsoft Azure, particularly Azure Databricks and Azure Machine Learning, for streamlining data pipelines and building machine learning models. Stays current with the latest developments by reading blogs and continually builds skills through various platforms.
Experience
Data Scientist
LTIMindtree Pvt. Ltd.
Led project strategy, timelines, and workflow decisions, and initiated data preparation from Vertica tables. Conducted a comprehensive research and experimentation phase, exploring techniques and algorithms for classifying transactions into categories, including NER-based transformer models, Facebook's fastText, and Azure Cognitive Services. Deployed end-to-end training and inference pipelines in Databricks Workflows with MLflow on Databricks, achieving an overall accuracy of 82%, using PySpark, Python, Pandas, and related tools. Demonstrated the reusability of the approach and pipelines in another project, "OBOF Domestic: Intent Classification".
Led project strategy and workflows, and initiated data preparation from scratch using Vertica tables. Led data preparation and optimized SQL queries for merging tables in Azure Databricks, storing the results in a Feature Store for reuse in the EDM use case, which reduced data retrieval time and improved overall data accuracy. Performed comprehensive research and experimentation on classification algorithms such as XGBoost, LightGBM, and Random Forest to determine the most effective approach. Deployed end-to-end training and inference pipelines using scikit-learn, MLflow, Pandas, and related tools, generating predictions with an overall average accuracy of 80% across all products and enabling the company to make data-driven business decisions.
Initiated data preparation from scratch using a Vertica table and performed EDA with Pandas, Matplotlib, and Seaborn, surfacing key insights from the datasets to support data-driven decision making. Pre-processed data with scikit-learn, including data cleaning, null imputation, and categorical encoding with One-Hot Encoder and Ordinal Encoder, while researching algorithms such as Decision Trees and XGBoost to extract rules from them and calculate metrics. Developed an
Education
University of Mumbai
Master of Science in Statistics
Statistics
University of Mumbai
Bachelor of Science
Statistics
Licenses & Certifications
Azure Data Scientist Associate - DP-100
Azure Fundamentals - AZ-900
Azure AI Fundamentals - AI-900
Azure Data Fundamentals - DP-900