Darshan Pandey
@Darshan_Pandey
Associate Software Engineer at EngagelyAI Pvt. Ltd.
Mumbai, Maharashtra, India
Data Engineer specializing in modernizing legacy batch workflows into scalable, push-based ingestion architectures using PySpark, Databricks, Azure Data Factory, Kafka, and Snowflake. Experienced in designing ETL/ELT pipelines, real-time streaming systems, and partitioned lakehouse datasets with transformation logic during ingestion. Strong focus on reliability, data quality, monitoring, and performance optimization across distributed big data platforms.
Experience
Associate Software Engineer
EngagelyAI Pvt. Ltd.
- Designed Kafka-based event ingestion pipelines capturing high-volume application and user-interaction events, enabling near real-time analytics for 100K+ daily records.
- Built PySpark ETL pipelines performing schema mapping, normalization, and deduplication to generate partitioned, analytics-ready datasets.
- Implemented event-driven Spark Structured Streaming pipelines, reducing end-to-end data latency by ~35% compared to legacy batch jobs.
- Added data lineage tracking, schema validation, and monitoring to improve the reliability and observability of data pipelines.
Associate Software Engineer
Neosoft Technologies Pvt. Ltd.
- Modernized legacy ingestion workflows by replacing scheduled SQL jobs with Azure Data Factory and Spark pipelines across 10+ enterprise systems.
- Developed distributed ETL/ELT pipelines using PySpark and Databricks with schema enforcement, checkpointing, and automated validation.
- Architected an event-driven clickstream data platform using Snowplow and Kafka, storing partitioned datasets in ADLS Gen2.
- Optimized Spark joins, partitioning strategies, and execution plans, improving job performance by ~30% while reducing compute costs.
Software Developer
Microbiome Research Pvt. Ltd.
- Built PySpark ETL pipelines processing large volumes of monthly behavioral records using incremental ingestion strategies.
- Designed data transformation workflows generating analytics-ready datasets for reporting and analysis.
- Automated reporting dataset generation, reducing manual data processing effort by ~90%.
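Incremental ingestion of the kind mentioned above is typically driven by a high-watermark: each run processes only records newer than the last watermark, then persists the new watermark for the next run. A minimal sketch, with hypothetical records and an `updated_at` change-tracking column assumed:

```python
from datetime import datetime

# Hypothetical source records carrying an updated_at change-tracking field.
SOURCE = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 15)},
    {"id": 3, "updated_at": datetime(2024, 2, 1)},
]

def incremental_load(watermark):
    """Return records newer than the last processed watermark,
    plus the advanced watermark to persist for the next run."""
    new_rows = [r for r in SOURCE if r["updated_at"] > watermark]
    new_watermark = max((r["updated_at"] for r in new_rows), default=watermark)
    return new_rows, new_watermark

# First run picks up only records after Jan 10; a second run with the
# advanced watermark would find nothing new.
rows, wm = incremental_load(datetime(2024, 1, 10))
```

In a PySpark pipeline the same pattern appears as a filter on the change-tracking column, with the watermark stored in a metadata table or checkpoint location.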
Associate Software Developer
Nimap Infotech
- Designed and automated data extraction and transformation workflows for operational datasets, enabling faster downstream data processing.
- Developed backend services and APIs responsible for data ingestion, transformation, and retrieval from relational databases.
- Automated recurring ETL workflows, reducing manual data processing effort and improving operational efficiency by ~30%.
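An automated extract-transform-load workflow over a relational database, as described above, reduces in essence to a scheduled job that rebuilds a reporting table from operational rows. A minimal self-contained sketch using SQLite; the table names (`orders`, `report`) and sample data are assumptions:

```python
import sqlite3

def run_etl(conn):
    """Rebuild the reporting table from operational rows:
    extract from orders, aggregate by region, load into report."""
    conn.execute("CREATE TABLE IF NOT EXISTS report (region TEXT, total REAL)")
    conn.execute("DELETE FROM report")  # idempotent full refresh
    conn.execute("""
        INSERT INTO report
        SELECT region, SUM(amount) FROM orders GROUP BY region
    """)
    conn.commit()

# Hypothetical operational data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("west", 10.0), ("west", 5.0), ("east", 7.5)])
run_etl(conn)
```

Scheduling such a job (via cron, Airflow, or similar) is what replaces the manual processing effort the bullet refers to.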
Education
Terna Engineering College
Bachelor of Engineering (B.E.)
Computer Engineering
Completed Bachelor of Engineering in Computer Engineering with a focus on software development, data engineering, and distributed systems. Studied core subjects including Data Structures, Algorithms, Database Management Systems, Operating Systems, and Computer Networks. Built multiple projects involving Python, SQL, backend development, and data processing pipelines. Developed strong foundations in software engineering principles, scalable system design, and data-driven applications.