I don’t just build pipelines; I architect data systems that scale with business growth while keeping cloud costs under control. Throughout my career at organizations like Wayfair and RedDoorz, I have focused on the intersection of high-performance engineering and business impact.

High-Impact Achievements:
● Cost Optimization: Reduced Snowflake monthly compute costs by 30% at Diaceutics through warehouse refactoring and query optimization.
● Performance Engineering: Achieved a 10x reduction in Parquet file ingestion runtime using optimized Spark logic.
● Infrastructure Ownership: Architected and deployed production-grade Airflow environments on AWS EKS using Secrets Manager and EFS drivers.
● Strategic Automation: Engineered a specialized scraping framework to provide competitive pricing insights, directly influencing business strategy.

My Technical Focus:
● Languages: Python, SQL, PySpark.
● Modern Data Stack: Snowflake, BigQuery, Airflow, DBT, and Meltano.
● Cloud & DevOps: AWS, GCP, Docker, Kubernetes (EKS), and CI/CD (GitLab).

I thrive in fast-paced environments where I can own the data lifecycle, from initial schema design to automated, fault-tolerant deployment. I am passionate about mentoring junior talent and building data cultures rooted in code integrity and reliability.
Experience
Senior Data Engineer
Databerry Technologies Pvt Ltd
● Architected end-to-end ELT workflows using Airflow, AWS, Snowflake, and DBT, enabling modular data modeling for complex datasets.
● Reduced monthly cloud compute costs by 30% by optimizing Snowflake warehouse utilization and query performance.
● Improved data reliability for stakeholders by engineering a comprehensive data quality framework with schema enforcement and reconciliation logic.
● Designed and implemented tag-based masking policies for business-facing views and tables.
Senior Data Engineer - Contractor
Wayfair
● Optimized GCP data pipelines, reducing total processing time and significantly increasing system reliability for high-volume customer data.
● Developed complex analytical models to measure email/SMS campaign effectiveness, directly enabling the marketing team to reduce customer fatigue and optimize communication frequency.
● Designed and maintained fault-tolerant Airflow DAGs, ensuring seamless automation of business-critical data workflows.
Senior Associate Technical Lead - Data Engineer
RedDoorz
● Extracted data from sources including RDS (MySQL, PostgreSQL), BigQuery, and third-party APIs.
● Parsed CSV and JSON files and automated extraction using Python scripts.
● Transformed raw data and loaded it into target warehouses such as Redshift and Snowflake.
● Performed data quality checks to ensure data was clean and usable.
● Orchestrated data engineering jobs to automate extraction and loading of data.
● Managed data analysis and processing activities, analyzing, studying, and summarizing data to extract insights that support strategic decision-making and planning.
Associate
Axtria
● Achieved a 10x reduction in Parquet file ingestion runtime through performance engineering and optimized Spark logic.
● Served as a core team member for the flagship product "DataMax," building complex business logic in PySpark, Redshift, and Matillion.
● Developed a custom data quality check pipeline using Dagster to ensure high accuracy across all data deliverables.
Programmer Analyst
Cognizant
● Developed PySpark and Python scripts to automate manual data tasks, significantly increasing team productivity.
● Converted legacy Hive queries into PySpark scripts, reducing runtime and infrastructure resource consumption.
● Built the logic for processing and populating tables for pricing and sales data sourced from SAP systems.
● Contributed to a seamless data loading framework that improved data accessibility for all downstream users.
Senior Data Engineer
Hinge Health
● Led the migration of legacy PostgreSQL pipelines from Airflow to Databricks, resulting in enhanced performance and streamlined resource utilization.
● Enhanced data pipelines to boost efficiency and reduce processing times by leveraging Databricks and PySpark.
● Refined complex SQL queries and workflows to minimize operational overhead and further reduce infrastructure costs.
Education
NIIT University
B.Tech
Computer Science and Engineering