
Deep Jethwa

@DeepJethwa

Data Engineer at V4C.ai

Mumbai

https://linkedin.com/in/deep-jethwa-2a00a7240


Deep Jethwa is a Data Engineer with hands-on experience building scalable data pipelines on Databricks using PySpark and Delta Lake in Azure environments. He is skilled in developing automated ETL/ELT workflows and implementing Medallion Architecture to create reliable, analytics-ready datasets. He has experience optimizing Spark workloads and transforming complex raw data into high-quality datasets using Python and SQL for analytics and decision-making.

Experience

Data Engineer

V4C.ai

Remote

• Built scalable ETL/ELT pipelines using Databricks and PySpark, enabling efficient ingestion and processing of large-scale datasets.
• Implemented a Lakehouse with Medallion Architecture (Bronze–Silver–Gold) using Delta Lake to deliver reliable, analytics-ready data layers.
• Developed batch and incremental ingestion pipelines on Azure Data Lake Storage (ADLS Gen2) using Auto Loader to process structured and semi-structured data (CSV, JSON, XML).
• Implemented data quality checks, validation rules, and data cleansing processes to ensure reliable, consistent datasets.
• Applied Delta Lake features such as MERGE, schema evolution, and time travel to support data versioning and governance.
• Optimized Apache Spark workloads using partitioning, caching, and query tuning to improve pipeline performance and execution efficiency.
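The upsert behaviour that Delta Lake's MERGE provides (update matched keys, insert unmatched ones) can be sketched in pure Python. This is a minimal illustration of the semantics only; the key column and record shape are illustrative assumptions, not the production schema.

```python
def merge_upsert(target: dict, updates: list, key: str = "id") -> dict:
    """MERGE-style upsert: matched keys are updated, unmatched rows inserted.

    `target` maps key -> row; `updates` is a list of incoming rows.
    Mirrors WHEN MATCHED THEN UPDATE / WHEN NOT MATCHED THEN INSERT.
    """
    merged = dict(target)  # leave the original "table" untouched
    for row in updates:
        merged[row[key]] = row  # update if key exists, insert otherwise
    return merged

# Existing Silver-layer rows (keyed by id) and an incoming batch:
silver = {1: {"id": 1, "qty": 5}, 2: {"id": 2, "qty": 3}}
incoming = [{"id": 2, "qty": 7}, {"id": 3, "qty": 1}]

result = merge_upsert(silver, incoming)
# row 2 is updated, row 3 is inserted, row 1 is untouched
```

In Delta Lake itself the same outcome is expressed declaratively (`MERGE INTO target USING updates ON target.id = updates.id ...`), with the engine handling file rewrites and versioning.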

Data Engineer Intern

eMeasurematics

Mumbai

• Built a production-grade ETL pipeline in Linux-based environments using Python, MongoDB, Airflow, and Docker to ingest and process large yard-management datasets.
• Performed data validation, cleansing, and normalization to ensure high-quality, consistent datasets for downstream analytics.
• Automated multi-interval data processing (hourly to yearly aggregations), improving workflow scalability and operational visibility.
• Developed automated data workflows that reduced manual reporting effort by ~80% and improved data delivery efficiency.
• Designed analytics-ready data models supporting business reporting and forecasting.
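The multi-interval aggregation idea above (rolling the same timestamped records up at hourly, daily, monthly, or yearly granularity) can be sketched as a single bucketing function. The field names and in-memory input are illustrative assumptions; the original pipeline ran against MongoDB under Airflow.

```python
from collections import defaultdict
from datetime import datetime

# strftime format per granularity: records sharing a formatted
# timestamp fall into the same aggregation bucket.
BUCKET_FORMATS = {
    "hourly": "%Y-%m-%d %H:00",
    "daily": "%Y-%m-%d",
    "monthly": "%Y-%m",
    "yearly": "%Y",
}

def aggregate(records: list, interval: str) -> dict:
    """Sum the 'value' field per time bucket at the requested granularity."""
    fmt = BUCKET_FORMATS[interval]
    totals = defaultdict(float)
    for rec in records:
        totals[rec["ts"].strftime(fmt)] += rec["value"]
    return dict(totals)

rows = [
    {"ts": datetime(2024, 3, 1, 9, 15), "value": 2.0},
    {"ts": datetime(2024, 3, 1, 9, 45), "value": 3.0},
    {"ts": datetime(2024, 3, 2, 10, 0), "value": 4.0},
]
daily = aggregate(rows, "daily")
# -> {'2024-03-01': 5.0, '2024-03-02': 4.0}
```

Scheduling one such run per interval (e.g. an Airflow task per granularity) gives the hourly-to-yearly rollups without duplicating the transformation logic.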

Education

Universal College Of Engineering

B.E. in Computer Science with Honours in Data Science (Minor Specialization)


Jan 2021 – Jan 2025
Grade: CGPI 8.7 (GPA 3.5)

Licenses & Certifications

Certified in English Proficiency (C1 Level)

TOEFL by ETS

• No expiration

Build Data Pipelines with Lakeflow Spark Declarative Pipelines

Databricks Academy

• No expiration

Deploy Workloads with Lakeflow Jobs

Databricks Academy

• No expiration

DevOps Essentials for Data Engineering

Databricks Academy

• No expiration

SQL Analytics on Databricks

Databricks Academy

• No expiration

Data Ingestion with Lakeflow Connect

Databricks Academy

• No expiration

The Complete Python Bootcamp: From Zero to Hero in Python

Udemy

• No expiration

Skills

Databricks
PySpark
Apache Spark
Delta Lake
ETL/ELT Pipelines
Medallion Architecture
Python
Pandas
NumPy
SQL
Microsoft Azure
Azure Databricks
Azure Data Factory
ADLS Gen2
Data Modeling
Dimensional Modeling
Data Warehousing Concepts
Apache Airflow
Kafka
Fivetran
MySQL
MongoDB
Git
Docker
Power BI
Excel
API Integration
REST APIs
Postman
SDLC
Hadoop
Linux Shell Scripting