Simran Bhamra
@Simranbhamra10
Data Engineer at Accenture
Ghaziabad, Uttar Pradesh, India
Experience
Data Engineer
Accenture
- Led full-lifecycle onboarding of new consumption-domain pipelines: partnered with stakeholders to define requirements, evolved data models, built scalable ETL pipelines, and delivered analytics-ready datasets.
- Refactored a legacy functional-style PySpark ETL framework using OOP principles, reducing entity onboarding time by 30%.
- Architected and deployed end-to-end ETL pipelines (Denodo → ADLS → PySpark → Azure Synapse), implementing SCD-2 for dimension and fact tables and enabling Power BI reporting via Denodo.
- Drove bug analysis and production support: identified root causes, deployed fixes, and restored ETL workflows, minimizing data latency and SLA breaches.
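The SCD-2 handling described above can be sketched in plain Python; the PySpark version would apply the same merge logic via DataFrames. All field and key names here (cust_id, city, effective_from, etc.) are illustrative assumptions, not taken from the actual framework:

```python
from datetime import date

HIGH_DATE = date(9999, 12, 31)  # open-ended effective_to for current rows


def scd2_merge(dimension, incoming, key, tracked, as_of):
    """Apply an SCD Type-2 merge: expire changed rows, insert new versions.

    dimension: list of dicts with effective_from/effective_to/is_current
    incoming:  list of dicts carrying the latest source values
    key:       natural-key field name
    tracked:   fields whose changes trigger a new version
    as_of:     load date of this batch
    """
    current = {r[key]: r for r in dimension if r["is_current"]}
    out = list(dimension)
    for row in incoming:
        existing = current.get(row[key])
        if existing and all(existing[c] == row[c] for c in tracked):
            continue  # no change: keep the current version as-is
        if existing:  # change detected: close out the old version
            existing["effective_to"] = as_of
            existing["is_current"] = False
        out.append({**row, "effective_from": as_of,
                    "effective_to": HIGH_DATE, "is_current": True})
    return out
```

A changed attribute thus yields two rows for the same natural key: the expired version with a bounded validity window, and a new current version, which is what lets fact tables join to the dimension as it looked on any given date.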
Big Data Engineer
Wipro Limited
- Led a cloud migration initiative using Azure Data Factory and Databricks to move SAP ISU and SAP HANA workloads from on-premises systems to the Azure ecosystem.
- Owned end-to-end defect resolution during UAT and client testing cycles for 125+ delta tables holding 380M+ records, improving data quality and reducing post-deployment issues.
- Architected scalable dimensional data models (star and snowflake schemas) in Azure Databricks and embedded SCD-2 processing within the ETL framework to support historical data preservation and high-performance analytical queries.
- Resolved 27+ defects, including data-refresh issues and SQL code disparities; conducted gap analysis and root cause analysis (RCA), implementing fixes for issues raised by BAU teams.
- Built 11+ production-grade ingestion pipelines using Apache NiFi to move data between remote servers, Kafka, and HDFS, ensuring reliable and scalable data flow.
- Extracted data from streaming sources and remote SFTP servers, then generated aggregates and reports following Spark best practices to support downstream analysis and insight generation.
- Developed parameterized Spark/PySpark applications launched via spark-submit, using the DataFrame and Spark SQL APIs, and orchestrated the scripts in Airflow for streamlined workflow management.
- Partnered with DevOps and Data Engineering teams to enhance workflows and data-cleansing processes, resulting in a 30% increase in data accuracy and a 20% boost in operational efficiency.
- Handled highly structured and semi-structured data with daily partition sizes ranging from 20 GB to 1.6 TB.
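A parameterized spark-submit driver of the kind described above typically separates CLI parsing from the Spark job body. This is a minimal sketch; the flag names (--source-path, --target-table, --load-date) and the load_date column are illustrative assumptions, not the actual framework's parameters:

```python
import argparse


def parse_job_args(argv):
    """Parse job parameters passed after the script path in spark-submit.

    Parameter names here are hypothetical; the real framework's flags
    are not documented in the profile above.
    """
    parser = argparse.ArgumentParser(description="Parameterized PySpark job")
    parser.add_argument("--source-path", required=True, help="input location")
    parser.add_argument("--target-table", required=True, help="output table")
    parser.add_argument("--load-date", required=True, help="partition date")
    return parser.parse_args(argv)


def main():
    # Invoked on the cluster via, e.g.:
    #   spark-submit job.py --source-path /data/in \
    #       --target-table db.t --load-date 2024-01-01
    import sys
    from pyspark.sql import SparkSession  # resolved at runtime on the cluster

    args = parse_job_args(sys.argv[1:])
    spark = SparkSession.builder.appName("parameterized-etl").getOrCreate()
    df = spark.read.parquet(args.source_path)      # DataFrame API
    df.createOrReplaceTempView("src")              # Spark SQL API
    out = spark.sql(
        f"SELECT * FROM src WHERE load_date = '{args.load_date}'"
    )
    out.write.mode("overwrite").saveAsTable(args.target_table)


if __name__ == "__main__":
    main()
```

Keeping argument parsing in its own function makes the driver testable without a Spark cluster, and the same script can then be templated as a single Airflow task whose operator fills in the dated parameters per run.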
Education
Vishveshwarya Group of Institutions
B.Tech
Electronics & Communication Engineering
Licenses & Certifications
Databricks Certified Data Engineer Associate
Databricks
AWS Certified Cloud Practitioner
Amazon Web Services (AWS)