Darshan Pandey

@Darshan_Pandey

Associate Software Engineer at EngagelyAI Pvt. Ltd.

Mumbai, Maharashtra, India

EngagelyAI Pvt. Ltd. | Terna Engineering College

Data Engineer specializing in modernizing legacy batch workflows into scalable, push-based ingestion architectures using PySpark, Databricks, Azure Data Factory, Kafka, and Snowflake. Experienced in designing ETL/ELT pipelines, real-time streaming systems, and partitioned lakehouse datasets with transformation logic applied during ingestion. Strong focus on reliability, data quality, monitoring, and performance optimization across distributed big data platforms.

Experience

Associate Software Engineer

EngagelyAI Pvt. Ltd.

Full-time | Oct 2025 - Present | Mumbai, Maharashtra, India

Designed Kafka-based event ingestion pipelines that capture high-volume application and user interaction events, enabling near real-time analytics on 100K+ daily records. Built PySpark ETL pipelines performing schema mapping, normalization, and deduplication to generate partitioned, analytics-ready datasets. Implemented event-driven Spark Structured Streaming pipelines, reducing end-to-end data latency by ~35% compared to legacy batch jobs. Added data lineage tracking, schema validation, and monitoring to improve the reliability and observability of data pipelines.

Associate Software Engineer

Neosoft Technologies Pvt. Ltd.

Full-time | Mar 2025 - Oct 2025 | Mumbai, Maharashtra, India

Modernized legacy ingestion workflows by replacing scheduled SQL jobs with Azure Data Factory and Spark pipelines across 10+ enterprise systems. Developed distributed ETL/ELT pipelines using PySpark and Databricks with schema enforcement, checkpointing, and automated validation. Architected an event-driven clickstream data platform using Snowplow and Kafka, storing partitioned datasets in ADLS Gen2. Optimized Spark joins, partitioning strategies, and execution plans, improving job performance by ~30% while reducing compute costs.

Software Developer

Microbiome Research Pvt. Ltd.

Full-time | Jun 2024 - Dec 2024 | Mumbai, Maharashtra, India

Built PySpark ETL pipelines processing a large volume of monthly behavioral records using incremental ingestion strategies. Designed data transformation workflows generating analytics-ready datasets for reporting and analysis. Automated reporting dataset generation, reducing manual data processing effort by ~90%.

Associate Software Developer

Nimap Infotech

Full-time | Feb 2024 - Apr 2024 | Mumbai, Maharashtra, India

Designed and automated data extraction and transformation workflows for operational datasets, enabling faster downstream data processing. Developed backend services and APIs responsible for data ingestion, transformation, and retrieval from relational databases. Automated recurring ETL workflows, reducing manual data processing effort and improving operational efficiency by ~30%.

Education

Terna Engineering College

Bachelor of Engineering (B.E.)

Computer Engineering

Aug 2019 - Jun 2023 | Grade: CGPA 8.26

Completed a Bachelor of Engineering in Computer Engineering with a focus on software development, data engineering, and distributed systems. Studied core subjects including Data Structures, Algorithms, Database Management Systems, Operating Systems, and Computer Networks. Built multiple projects involving Python, SQL, backend development, and data processing pipelines. Developed strong foundations in software engineering principles, scalable system design, and data-driven applications.

Skills

Apache Spark
PySpark
Spark Structured Streaming
Apache Kafka
Real-time Data Processing
Distributed Data Processing
Azure Data Factory (ADF)
Azure Data Lake Storage (ADLS Gen2)
Azure Databricks
Snowflake
Python
SQL
PostgreSQL
MySQL
MongoDB
FastAPI
Django
Django REST Framework
REST APIs
JWT Authentication
RBAC
Docker
Git
Linux