Data scientist with 3+ years of experience executing data-driven solutions to increase the efficiency and utility of data processing. Experienced in building regression models and using predictive modelling to deliver insights and implement solutions to complex business problems.
Experience
Absenteeism Prediction Model
ATCS Inc.
Developed data ingestion, data engineering, and modelling pipelines for an absenteeism prediction model in the Kedro Python framework. Ingested data through REST APIs using the Python Requests library. Containerized the Kedro pipelines in Docker and deployed them to an Azure VM for scheduled model training. Tools: Azure Data Lake, VM, Python, Kedro, Requests, Docker.
Real-Time Part Quality Prediction
ATCS Inc.
Developed a real-time part quality prediction model for a manufacturing client by analysing historical data. Deployed the model in ThingWorx Analytics to show real-time alerts on a ThingWorx dashboard. Developed model training pipelines in the Kedro Python framework and deployed them to an Azure VM for automated training using TSI data from Azure Data Lake. Tools: Kedro, Python, ThingWorx Analytics, Azure Data Lake, VM.
Python Django Back-end API Development
ATCS Inc.
Developed back-end modules for an application using the Python Django web framework, with PostgreSQL as the back-end database. Created and deployed applications on a range of platforms, including Linux and Docker. Deployed scalable applications using Kubernetes on AWS EKS and ECR. Tools: Python (Django), AWS EKS, ECR, Docker, Kubernetes, PostgreSQL.
Social Listening
ATCS Inc.
Developed AWS Glue jobs in Python to extract social media posts from Twitter and YouTube. Performed NLP-based tokenization, lemmatization, and vectorization to convert the text into a machine-readable form. Designed and built a Naive Bayes model to determine the sentiment polarity of the data set. Visualized the results in Apache Superset, deployed on Amazon ECS, to get actionable insights from real-time datasets. Tools: AWS Lambda, Glue, S3, API Gateway, Apache Superset, ECS.
Repair Packages Generation for Auto OEM
ATCS Inc.
Developed repair packages from raw data to guide dealers in making repairs, addressing the global problem of inconsistent repair quality. Used Python as the primary language to extract data from sources such as JSON, CSV, XML, and TXT files. Used Decision Tree and Random Forest classification models to group automobile parts and operations into complete repair packages. Tools: Azure Data Lake, HDInsight, Python.
ATCS Internship
ATCS Inc.
Gained hands-on experience analyzing sentiment data from Twitter. Applied NLP algorithms such as LDA for a healthcare client. Developed a solution using NLP and text clustering techniques such as K-means to cluster similar parts based on part names and descriptions for a German automotive client.
Education
Indian Institute of Technology, Kanpur
B.Tech
Chemical Engineering