Prasad Pawar

@prasadpawar

Lead Data Scientist • Data Science Consultant

Pune, India

linkedin.com/in/prasad-pawar-27224117

Tata Consultancy Services • Walchand College of Engineering

Prasad Pawar is a data scientist with over 13 years of research and development experience across Data Science and High-Performance Computing. He has deep knowledge of the entire data lifecycle, including data preparation, model building, and production deployment. His expertise spans solving complex problems with advanced techniques in Machine Learning, Deep Learning, and NLP, and presenting actionable insights to stakeholders.

Experience

Lead Data Scientist / Data Science Consultant

Tata Consultancy Services

Consultancy/Project • May 2014 - Present

Developed a Surveillance for Capital Market and Anomaly Detection solution to identify abuse scenarios in the capital market. Identified and predicted suspicious entities using classification algorithms, covering model training, testing, evaluation, and optimization. Segmented the dataset according to business logic using Louvain community partition and K-means clustering, and represented the segments as network graphs for easier analysis.

Built a big data application from scratch, implementing the business logic in Python, PySpark, and Pandas and deploying it on Kubernetes clusters; fine-tuning the Spark configuration reduced run times.

Applied machine learning techniques, including topic modeling, sentiment analysis and summarization using BERT, and Named Entity Recognition, to communication data to better understand, and gauge the relevance of, fraudulent activities by capital market participants.

Delivered a solution for the life sciences domain, "Early detection of cross binders in the drug discovery process", applying Machine Learning and Deep Learning approaches with neural networks, Random Forest, and XGBoost for training and prediction.

Developed a memory recommender system for High-Performance Computing applications that estimates an application's memory requirement before it runs on an HPC cluster, increasing resource availability and enabling optimum utilization of HPC resources.

Optimized OpenFOAM on Intel Xeon Phi (KNL) by identifying hotspots with VTune and applying AVX-512 intrinsics, and applied vectorization using SIMD pragmas to improve performance. Parallelized and optimized the IRS computation using CUDA-C on an Nvidia K20, and using OpenMP and ICC on Intel Xeon Haswell-EP.

Developer/Engineer

KPIT Technologies

Project • Jan 2011 - May 2014

Improved the performance of various image processing (ADAS) applications by optimizing and parallelizing the source code using C, OpenMP, OpenCL, and GPGPU on Linux. Designed and implemented auto-parallelization of loops using YUCCA and, as Tech Lead, redesigned the automatic parallelization module. Worked on a performance enhancement project and ran various experiments with OpenMP constructs.

Developer/Engineer

CDAC

Project • Sep 2008 - Jan 2011

Designed and implemented a disaster recovery module to achieve zero Recovery Point Objective (RPO) and negligible Recovery Time Objective (RTO), including automatic block-level replication of data from the DC site to the DR site using the iSCSI protocol with PostgreSQL point-in-time recovery (PITR) techniques. Patent granted for Method and System for Business Continuity and Disaster Recovery. Published research on Automatic Sequential to Parallel code conversion (S2P tool) and Enterprise Storage Architecture for Optimal Business Continuity.

Education

Walchand College of Engineering

Master's in Computer Engineering

Computer Engineering

COE Osmanabad

Bachelor's in Computer Engineering

Computer Engineering

Skills

Python
PySpark
Scala
R
Machine Learning
Deep Learning
TensorFlow
Keras
scikit-learn
Azure Essentials
Databricks
Google Cloud Essentials
NLP
Tableau
HPC
Parallelization and Optimization
OpenMP
MPI
GPGPU
Shell Scripts
Linear Regression
Logistic Regression
Decision Tree
Random Forest
XGBoost
SVM
KNN
K-means
Louvain Community Partition
Time Series Forecasting
Text Classification
Topic Modeling using LDA
Sentiment & Summarization using BERT
Named Entity Recognition
CNN
RNN
LSTM
Apache Spark
C
C++
CUDA
OpenCL
POSIX Threads
Socket Programming