
Shubham Nagar

@user.2481337

Data Engineer at Crimson AI

Vadodara, Gujarat, India

Crimson AI • Lovely Professional University, Jalandhar

Data Engineer with over 4 years of experience, specializing in scalable ETL pipelines and real-time data streaming. Cut cloud costs by $5,000 per month by optimizing data pipelines, and made NLP services 7x faster using Kafka-based streaming.

Experience

Data Engineer - 2

Crimson AI

Jan 2023 - Sep 2024 • Mumbai

Spearheaded the optimization of data pipelines with PySpark, Qdrant, and Elasticsearch, reducing cloud operational costs by $5,000 per month. Designed and implemented AI-driven NLP tools using Mistral models served with vLLM, together with spaCy, improving application efficiency and user engagement. Engineered robust ETL pipelines with dbt and PySpark that process terabytes of data and expose results through Flask and FastAPI RESTful APIs. Developed real-time data streaming with Apache Kafka and optimized ML model performance using TensorFlow and PyTorch, resulting in a 7x increase in service speed for NLP applications. Built data visualization platforms using Streamlit and Metabase, with Matplotlib and Seaborn charts that turn large datasets into insights for decision-making. Implemented scalable monitoring with Prometheus and Kibana, improving CI/CD pipeline efficiency through real-time alerting for production applications.

Data Analyst

Wakefit

Jan 2022 - May 2022 • Bengaluru

Developed a competitor price analytics platform using Tableau, Redshift, MongoDB, and Airflow, automating the detection of market shifts and enabling proactive pricing strategies. Analyzed and streamlined data processes to enhance inventory management efficiency, resulting in a 15% reduction in overstock across multiple distribution centers.

Software Specialist

Shiv Shakti Hospital

Nov 2020 - Jan 2022 • Kota

Automated the government paperwork submission process, cutting manual labor by 70% and addressing site reliability issues. Implemented an on-premises payroll and billing system using Django, streamlining operations and improving data protection. Engineered on-premises real-time analytics dashboards with D3.js, giving better visibility into patient statistics and hospital workflows. Deployed a hospital landing page on AWS behind a CDN, resulting in a 30% increase in site traffic and a better user experience. Spearheaded multimedia marketing strategies for hospital services, achieving a 50% increase in patient inquiries and appointments through targeted online platforms.

Data Scientist & Data Analyst

upGrad

Oct 2019 - Nov 2020 • Mumbai

Utilized cloud-based technologies including Redshift and EC2 to develop a job aggregator engine that processes 1 million job entries for 10 million users, leading to streamlined search functionalities and enhanced user engagement. Architected and implemented a data-driven recommendation framework, boosting the efficiency of job opportunity marketing campaigns and increasing click-through rates by 25%. Developed robust REST APIs with Django to seamlessly integrate and streamline multiple external data sources, enhancing data accessibility for cross-functional teams by 40%. Streamlined data validation and cleaning workflows, resulting in a 30% improvement in dataset accuracy and reliability across multi-disciplinary teams. Designed and deployed real-time dashboards utilizing Redshift, Python, and Tableau, facilitating data-driven decision-making across departments and improving reporting accuracy by 30%. Optimized project timelines by automating Google Sheets workflows through Python scripting, enhancing productivity across departments.

Education

Lovely Professional University, Jalandhar

B. Tech.

Computer Science and Engineering

Aug 2016 - Aug 2020 • Grade: 8.78

Specialized in Data Science with additional coursework in Big Data, providing a solid academic foundation for designing and optimizing data pipelines.

Skills

Python
SQL
Shell
Airflow
ETL
PySpark
Kafka
dbt
Data Modeling
FastAPI
Django
Flask
Streamlit
PostgreSQL
Redshift
Elasticsearch
MongoDB
Redis
Qdrant
Mistral
vLLM
LangChain
Semantic Search
spaCy
TensorFlow
PyTorch
scikit-learn
Tableau
D3.js
Matplotlib
Metabase
Prometheus
Kibana
Grafana