Abdul Jameel Shaik
@abduljameelshaik
Senior Associate - Data Scientist at PwC India
Hyderabad, India
Data Scientist with 5 years of professional experience specializing in Machine Learning, Deep Learning, and advanced statistical analytics. He has a proven ability to solve complex business problems across various industries, including beverages, banking, and manufacturing.
Experience
Senior Associate - Data Scientist
PwC India
Project 1: Go To Market Tool - Beverages Manufacturing Industry: Designed an Alteryx ETL workflow for processing large volumes of raw data from an external data source. Integrated a clustering algorithm, written in R, into Alteryx to group customers into segments based on sales volume and region. Used Alteryx functions such as Filters, Joins, Unions, and Connectors to clean and format raw data into well-structured data ready for further analysis.
Project 2: Predictive Prospect Scoring Model - Beverages Manufacturing Industry: Applied Exploratory Data Analysis in Python to understand the feature variables. Applied feature selection techniques to address the curse of dimensionality. Built an XGBoost classification model in AWS SageMaker. Achieved an accuracy of 98 percent, a 4 percent increase over the previous model. Supported the BI team in building the Prospective Leads Tool and displayed the results on a custom Power BI dashboard.
Project 3: Large Scale Food Products Manufacturing Industry Pricing Analytics: Worked on data preparation and analyzed the data using EDA and feature engineering. Applied a Log-Log Regression model to predict the number of units sold based on Actual Price, Base Price, and Promoted Price. Evaluated the model using MAPE and R-squared. Used the Alteryx ETL tool to integrate the data, and MLOps for automation.
Project 4: Demand Forecasting - Banking Industry: Collaborated on preparing 5-15 years of data covering multiple commodities and countries. Built a SARIMA forecasting model to forecast production and consumption for the next 10 years. The results helped the bank roll out loans for commodities in multiple countries. The roads and transport team integrated the forecast results into the app.
Project 5: Large Scale Business Re-Insurance Analytics: Overhauled the old MS Excel insurance rater into a fully functional application based on an R package. Wrote R scripts for indiv
Associate Consultant - Data Scientist
Innodatatics
Project 1: Electric Vehicle Battery Breakdown Predictive Analytics: Used Python libraries for Exploratory Data Analysis to understand the distribution of the feature variables. Applied a Random Forest classification model to predict EV battery health from those variables. Evaluated the model using a confusion matrix and ROC curve. Tuned the model using K-fold cross-validation and improved its accuracy. The model predicts the health of an EV battery with 97 percent accuracy.
Project 2: Medical Insurance Claim Analytics: Built a Multiple Linear Regression model to predict the claim amount. Optimized the model with L1 and L2 regularization to improve R-squared and drop less important variables, reducing the train and test RMSE. The model made it possible to set a block amount for different surgeries and reduce fraudulent claims.
Project 3: Predict Machine Component Failure in Beverage Industry: Extracted data from InfluxDB and converted it into a CSV file using Python. Built a SARIMA forecasting model to predict component failure and integrated the results into a Power BI dashboard that shows forecasts for the blower and labeller components over a selectable shift-wise (8 hours) or day-wise (24 hours) horizon. The overall goal was to reduce production breakdowns and suggest a preventive maintenance schedule.
Project 4: Wind Turbine Failure Prediction Model: Extracted IoT sensor data for wind turbine components from a database using Python. Built classification models such as Bagging, Boosting (XGBoost), and SVM, and computed their accuracy using a confusion matrix. The XGBoost model achieved an accuracy of 96 percent and helped predict the lifetime value of the turbine components and reduce downtime in energy production.
Education
JNTUH
Bachelor of Technology
Electronics and Communication Engineering