Sukanya Banerjee

@SukanyaBanerjee

Azure Data Engineer

Kolkata, West Bengal, India

Capgemini

Maulana Abul Kalam Azad University Of Technology

Data Engineer with 3.5 years of experience in designing and building scalable data pipelines on the Microsoft Azure platform. Experienced in Azure Data Factory, Azure Databricks, ADLS Gen2, and Delta Lake for building reliable ETL/ELT workflows. Skilled in Python, SQL, and PySpark for large-scale data transformation and processing. Proven ability to implement Medallion Architecture, incremental data pipelines, and Spark optimizations to support analytics and reporting workloads.

Experience

Senior Software Engineer

Capgemini

• Jun 2022 - Present

Designed and developed ADF pipelines to ingest data from multiple sources (CSV, JSON, SQL DB, REST APIs) into Azure Data Lake Storage Gen2, enabling scalable data ingestion.

Implemented Medallion Architecture (Bronze, Silver, Gold) using Azure Databricks and ADLS Gen2, improving data pipeline efficiency and enabling 50% faster data access for downstream analytics.

Built PySpark-based transformation pipelines in Azure Databricks to cleanse, transform, and load large datasets into Delta tables.

Developed incremental data ingestion pipelines using Databricks Auto Loader and Delta Lake MERGE operations, ensuring efficient processing of new and updated records.

Optimized slow-running Databricks jobs using repartitioning, caching, and efficient Spark transformations, reducing pipeline runtime from 1 hour to 30 minutes.

Improved storage and query performance by implementing partitioning and compression strategies in ADLS Gen2 and Delta tables, reducing storage usage by 30%.

Monitored and resolved pipeline issues, achieving 99.9% uptime and improving the team's SLA compliance by 15%.

Collaborated with cross-functional teams, aligning data solutions with business objectives and enhancing

Education

Maulana Abul Kalam Azad University Of Technology

MSc

Computer Science

Aug 2019 - Jul 2021

Licenses & Certifications

Microsoft Certified: Azure Fundamentals (AZ-900)

Microsoft

• No expiration

Skills

SQL

PySpark

Azure Data Factory

Azure Databricks