Darshan Pandey
@Darshan_Pandey
Associate Software Engineer at EngagelyAI Pvt. Ltd.
Mumbai, Maharashtra, India
Data Engineer specializing in modernizing legacy batch workflows into scalable, push-based ingestion architectures using PySpark, Databricks, Azure Data Factory, Kafka, and Snowflake. Experienced in designing ETL/ELT pipelines, real-time streaming systems, and partitioned lakehouse datasets with transformation logic during ingestion. Strong focus on reliability, data quality, monitoring, and performance optimization across distributed big data platforms.
Experience
Associate Software Engineer
EngagelyAI Pvt. Ltd.
- Designed Kafka-based event ingestion pipelines capturing high-volume application and user-interaction events, enabling near real-time analytics for 100K+ daily records.
- Built PySpark ETL pipelines performing schema mapping, normalization, and deduplication to generate partitioned, analytics-ready datasets.
- Implemented event-driven Spark Structured Streaming pipelines, reducing end-to-end data latency by ~35% compared to legacy batch jobs.
- Added data lineage tracking, schema validation, and monitoring to improve the reliability and observability of data pipelines.
Associate Software Engineer
Neosoft Technologies Pvt. Ltd.
- Modernized legacy ingestion workflows by replacing scheduled SQL jobs with Azure Data Factory and Spark pipelines across 10+ enterprise systems.
- Developed distributed ETL/ELT pipelines using PySpark and Databricks with schema enforcement, checkpointing, and automated validation.
- Architected an event-driven clickstream data platform using Snowplow and Kafka, storing partitioned datasets in ADLS Gen2.
- Optimized Spark joins, partitioning strategies, and execution plans, improving job performance by ~30% while reducing compute costs.
Software Developer
Microbiome Research Pvt. Ltd.
- Built PySpark ETL pipelines processing large volumes of monthly behavioral records using incremental ingestion strategies.
- Designed data transformation workflows generating analytics-ready datasets for reporting and analysis.
- Automated reporting dataset generation, reducing manual data processing effort by ~90%.
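Incremental ingestion of the kind mentioned above is typically driven by a high-watermark: each run processes only records newer than the last watermark, then persists the new watermark for the next run. A minimal sketch, with hypothetical records and an `updated_at` change-tracking column assumed:

```python
from datetime import datetime

# Hypothetical source records carrying an updated_at change-tracking field.
SOURCE = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 15)},
    {"id": 3, "updated_at": datetime(2024, 2, 1)},
]

def incremental_load(watermark):
    """Return records newer than the last processed watermark,
    plus the advanced watermark to persist for the next run."""
    new_rows = [r for r in SOURCE if r["updated_at"] > watermark]
    new_watermark = max((r["updated_at"] for r in new_rows), default=watermark)
    return new_rows, new_watermark

# First run picks up only records after Jan 10; a second run with the
# advanced watermark would find nothing new.
rows, wm = incremental_load(datetime(2024, 1, 10))
```

In a PySpark pipeline the same pattern appears as a filter on the change-tracking column, with the watermark stored in a metadata table or checkpoint location.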
Associate Software Developer
Nimap Infotech
- Designed and automated data extraction and transformation workflows for operational datasets, enabling faster downstream data processing.
- Developed backend services and APIs responsible for data ingestion, transformation, and retrieval from relational databases.
- Automated recurring ETL workflows, reducing manual data processing effort and improving operational efficiency by ~30%.
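An automated extract-transform-load workflow over a relational database, as described above, reduces in essence to a scheduled job that rebuilds a reporting table from operational rows. A minimal self-contained sketch using SQLite; the table names (`orders`, `report`) and sample data are assumptions:

```python
import sqlite3

def run_etl(conn):
    """Rebuild the reporting table from operational rows:
    extract from orders, aggregate by region, load into report."""
    conn.execute("CREATE TABLE IF NOT EXISTS report (region TEXT, total REAL)")
    conn.execute("DELETE FROM report")  # idempotent full refresh
    conn.execute("""
        INSERT INTO report
        SELECT region, SUM(amount) FROM orders GROUP BY region
    """)
    conn.commit()

# Hypothetical operational data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("west", 10.0), ("west", 5.0), ("east", 7.5)])
run_etl(conn)
```

Scheduling such a job (via cron, Airflow, or similar) is what replaces the manual processing effort the bullet refers to.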
Education
Terna Engineering College
Bachelor of Engineering (B.E.)
Computer Engineering
Completed Bachelor of Engineering in Computer Engineering with a focus on software development, data engineering, and distributed systems. Studied core subjects including Data Structures, Algorithms, Database Management Systems, Operating Systems, and Computer Networks. Built multiple projects involving Python, SQL, backend development, and data processing pipelines. Developed strong foundations in software engineering principles, scalable system design, and data-driven applications.