Simran Bhamra

@Simranbhamra10

Data Engineer at Accenture

Ghaziabad, Uttar Pradesh, India

Accenture • Vishveshwarya Group of Institutions

Experience

Data Engineer

Accenture

Full-time • Dec 2024 - Present • Gurugram, Haryana, India

Led full-lifecycle onboarding of new consumption-domain pipelines: partnered with stakeholders to define requirements, evolved data models, built scalable ETL pipelines, and delivered analytics-ready datasets.
Refactored a legacy functional-Python PySpark ETL framework using OOP principles, reducing entity onboarding time by 30%.
Architected and deployed end-to-end ETL pipelines (Denodo → ADLS → PySpark → Azure Synapse), implementing SCD-2 for dimension/fact tables and enabling Power BI reporting via Denodo.
Drove bug analysis and production support: identified root causes, deployed fixes, and restored ETL workflows, minimizing data latency and SLA breaches.

Big Data Engineer

Wipro Limited

Full-time • Nov 2023 - Dec 2024 • Noida, Uttar Pradesh, India

Led a cloud migration initiative using Azure Data Factory and Databricks to migrate SAP ISU and SAP HANA workloads from on-premises to the Azure ecosystem.
Owned end-to-end defect resolution during UAT and client testing cycles for 125+ delta tables handling 380M+ records, enhancing data quality and reducing post-deployment issues.
Architected scalable dimensional data models (Star and Snowflake Schema) in Azure Databricks and embedded SCD-2 processing within the ETL framework to support historical data preservation and high-performance analytical queries.
Resolved 27+ defects, including issues related to data refreshing and SQL code disparities.
Conducted gap analysis and Root Cause Analysis (RCA), implementing fixes for issues raised by BAU teams.
Built 11+ production-grade ingestion pipelines using Apache NiFi to import/export data from remote servers and Kafka into HDFS, ensuring reliable and scalable data flow.
Extracted data from streaming sources and remote SFTP servers, then generated aggregates and reports using Spark best practices to support downstream teams' data analysis and insight generation.
Developed parameterized Spark/PySpark applications run via the spark-submit utility, using DataFrames and the Spark SQL API, and orchestrated the scripts in Airflow for streamlined workflow management.
Enhanced workflows and data cleansing processes with DevOps and Data Engineering teams, resulting in a 30% increase in data accuracy and a 20% boost in operational efficiency.
Handled highly structured and semi-structured data with daily partition sizes ranging from 20 GB to 1.6 TB.

Education

Vishveshwarya Group of Institutions

B.Tech

Electronics & Communication Engineering

Jan 2017 - Jan 2021 • Grade: 8.11/10 CGPA

Licenses & Certifications

Databricks Certified Data Engineer Associate

DATABRICKS

Issued: Mar 2024 • No expiration

AWS Certified Cloud Practitioner

AMAZON WEB SERVICES (AWS)

Issued: Mar 2024 • No expiration

Skills

PySpark
Spark
Azure
Azure Data Factory
Synapse
Unix
Databricks
Denodo
ETL
ELT
CI/CD
Git
Azure DevOps