Vishal Dixit

@vishal_dxt11

Data Engineer at DeHaat

Gurugram, Haryana, India

DeHaat

Vellore Institute of Technology

Data Engineer experienced in designing and operating batch and near-real-time pipelines on AWS using Spark, SQL, and CDC ingestion. Delivered measurable impact, cutting data movement costs by 30% and reducing manual workflow effort by 60–70%. Skilled in ETL/ELT development, lakehouse modeling, and event-driven architectures, with a focus on reliability, data quality, and cost efficiency.

Experience

Data Engineer

DeHaat

Aug 2025 - Present

Gurugram, India

Managed and optimized AWS DMS replication tasks; migrated selected workloads to Zero-ETL pipelines, reducing data movement costs by 30% while maintaining near-real-time availability in Redshift.
Built and maintained scalable ETL pipelines processing 5–20 GB/day using AWS Glue (Spark) and Python to load batch and incremental data into S3/Redshift, with monitoring, retries, and freshness validation.
Designed data models and ingestion workflows integrating CRM, transactional, and agronomy API data into a unified operational dataset consumed by internal dashboards and business workflows.
Built RabbitMQ-based event ingestion services to capture operational transactions with idempotent consumers and retry handling, ensuring reliable, effectively exactly-once processing.

Data Analytics Intern

STMicroelectronics

Jul 2024 - Jun 2025

Greater Noida, India

Built Python-based data processing pipelines to analyze large-scale SoC design metrics and logs across multiple engineering teams.
Automated SoC design workflows (lint, simulation, synthesis) using Python scripting, reducing manual engineering effort by 60–70% and accelerating design iteration cycles.
Developed internal analytics dashboards and reporting workflows in Power BI to support engineering decision-making across design and verification teams.

Education

Vellore Institute of Technology

M.Tech

Computer Science (Big Data Analytics)

Aug 2023 - Jun 2025

Grade: 8.9/10

Dr. APJ Abdul Kalam Technical University

B.Tech

Aug 2018 - Jun 2022

Grade: 7.4/10

Licenses & Certifications

IBM Data Engineering Essentials

IBM

• No expiration

Data Pipelines with Airflow & Kafka

Coursera

• No expiration

Big Data with Spark and Hadoop

• No expiration

Skills

Python
SQL
PySpark
Bash
Apache Spark
Databricks
Hadoop
Batch & Incremental Processing
ETL/ELT Pipelines
Kafka
RabbitMQ
CDC Pipelines
Event-Driven Architecture
AWS
S3
EMR
Redshift
RDS
Glue
DMS
Lambda
Athena
IAM
Warehouse Modeling
Lakehouse Architecture
Kappa Architecture
Lambda Architecture
Data Validation
Data Quality
Parquet
Delta Lake
Airflow
Docker
Git
CI/CD
CloudWatch
Power BI