Experience

Data Engineer

Accenture

Full-time•Dec 2024 - Present•Gurugram, Haryana, India

Led full-lifecycle onboarding of new consumption-domain pipelines: partnered with stakeholders to define requirements, evolved data models, built scalable ETL pipelines, and delivered analytics-ready datasets. Refactored a legacy PySpark ETL framework from functional-style Python to an object-oriented design, reducing entity onboarding time by 30%. Architected and deployed end-to-end ETL pipelines (Denodo → ADLS → PySpark → Azure Synapse), implementing SCD Type 2 for dimension/fact tables and enabling Power BI reporting via Denodo. Drove bug analysis and production support: identified root causes, deployed fixes, and restored ETL workflows, minimizing data latency and SLA breaches.

Big Data Engineer

Wipro Limited

Full-time•Nov 2023 - Dec 2024•Noida, Uttar Pradesh, India

Led a cloud migration initiative using Azure Data Factory and Databricks to move SAP ISU and SAP HANA workloads from on-premises to the Azure ecosystem. Owned end-to-end defect resolution during UAT and client testing cycles for 125+ delta tables handling 380M+ records, improving data quality and reducing post-deployment issues. Architected scalable dimensional data models (star and snowflake schemas) in Azure Databricks and embedded SCD Type 2 processing within the ETL framework to support historical data preservation and high-performance analytical queries. Resolved more than 27 defects, including data refresh issues and SQL code discrepancies. Conducted gap analysis and root cause analysis (RCA), implementing fixes for issues raised by BAU teams. Built 11+ production-grade ingestion pipelines in Apache NiFi to import and export data from remote servers and Kafka into HDFS, ensuring reliable and scalable data flow. Extracted data from streaming sources and remote SFTP servers, then generated aggregates and reports with Spark to support downstream teams' analysis. Developed parameterized Spark/PySpark applications launched via spark-submit, using the DataFrame and Spark SQL APIs, and orchestrated them in Airflow for streamlined workflow management. Enhanced workflows and data cleansing processes with DevOps and Data Engineering teams, resulting in a 30% increase in data accuracy and a 20% boost in operational efficiency, and feeding into ongoing enhancement efforts. Handled structured and semi-structured data with daily partitions ranging from 20 GB to 1.6 TB.

Education

Vishveshwarya Group of Institutions

B.Tech

Electronics & Communication Engineering

Jan 2017 - Jan 2021•Grade: 8.11/10 CGPA

Licenses & Certifications

Databricks Certified Data Engineer Associate

DATABRICKS

Issued: Mar 2024•No expiration

AWS Certified Cloud Practitioner

AMAZON WEB SERVICES (AWS)

Issued: Mar 2024•No expiration

Skills

PySpark

Spark

Azure

Azure Data Factory

Synapse

Unix

Databricks

Denodo

ETL

ELT

CI/CD

Git

Azure DevOps

Simran Bhamra
