Sumit Jha
@sumitjha
Data Scientist at Digital India Corporation
Greater Noida
Sumit Jha is a Data Scientist holding a PG Diploma in Data Science from IIIT-B. He possesses hands-on experience developing predictive models using supervised and unsupervised learning algorithms. Proficient in Python, SQL, Tableau, and AWS, he specializes in end-to-end MLOps pipelines, data visualization, and advanced analytics to drive data-informed decision-making.
Experience
Data Scientist
Digital India Corporation
Developing predictive models on ministry datasets using supervised learning algorithms such as Linear Regression, Decision Trees, Random Forests, and Support Vector Machines. Trained models on labeled datasets to make accurate predictions on new data. Collaborated with cross-functional teams to gather requirements and transformed raw data into meaningful visualizations that effectively conveyed key insights. Managed and maintained large databases, ensuring data accuracy, integrity, and security through regular backups and data validation checks. Conducted rigorous validation and evaluation of models using techniques like cross-validation and performance metrics such as accuracy, precision, recall, and F1-score. Gathering data from all ministries and standardizing the data and metadata for interoperability, making the data AI-ready. Putting all data on a common portal, accessible to all ministries, for further investigation and analysis.
Senior Executive
Performed EDA and predictive modelling (Linear Regression, Logistic Regression, Decision Tree, and Random Forest) on a media dataset using Python and Tableau to provide insights. Worked on a structured media dataset and created models such as Linear and Logistic Regression. Created dashboards using the visualization tool Tableau. Also used SQL, AWS, and MongoDB for data analytics. Conducted installation and licensing of almost 400 NUC Meters to manage a large volume of channel data. Planned and executed 10+ end-to-end maintenance drives of antennas within the stipulated time frame. Basic operations of TELNET, PING, and TRACERT. Panel training for BAR-O-Meter. Worked on watermark monitoring and playout infrastructure.
Associate Manager
Broadcast Audience Research Council India
Developed predictive models using supervised learning algorithms such as Linear Regression, Decision Trees, Random Forests, and Support Vector Machines. Trained models on labeled datasets to make accurate predictions on new data. Applied unsupervised learning methods, including clustering (K-Means) and dimensionality reduction (PCA), to uncover patterns, trends, and insights within large datasets lacking labeled outcomes. Manipulated structured data from various sources, including relational databases and CSV files. Performed data preprocessing, cleaning, and transformation to ensure data quality and consistency for downstream analysis. Engineered relevant features from raw data to improve model performance. Leveraged domain knowledge to create informative and discriminative features for both supervised and unsupervised tasks. Collaborated with a team of data scientists and software engineers on end-to-end MLOps pipelines for deploying machine learning models, ensuring a smooth transition from development to production. Created interactive and insightful Tableau dashboards to visualize complex datasets, enabling data-driven decision-making for stakeholders.
Associate
Prime Focus Technologies
DATA SCIENCE PROJECTS:
Credit EDA Case Study (2 Members): Applying EDA in a real business scenario to understand risk analytics in banking and financial services.
Bike Sharing (2 Members): Creating a model of shared bike demand based on independent variables to help inform business strategy.
NGO Clustering (2 Members): Categorizing countries using socio-economic and health factors to suggest NGO focus areas.
X Education Lead Scoring Case Study (2 Members): Building a model to assign a lead score to select the most promising leads for conversion.
Capstone Project – Credit Card Fraud Detection (2 Members): Developing a machine learning model to detect fraudulent transactions and analyzing the business impact to recommend mitigation strategies.
Education
IIIT Bangalore
PG Diploma in Data Science
Data Science
UPTU
B.Tech
Electronics & Communications
CBSE
Class 12th
CBSE
Class 10th