Ipsita Sarkar
@Ipsita17
Data Engineer (Associate Software Engineer) at Accenture
Gurgaon, Haryana, India
Innovative Data Engineer specializing in building scalable data pipelines and lakehouse architectures leveraging PySpark, Databricks, Azure Cloud, and GCP. Proven expertise in designing enterprise-grade ETL/ELT workflows, optimizing big data processing, and implementing robust data quality frameworks. Databricks Certified with hands-on experience in Apache Spark, Delta Lake, and real-time streaming.
Experience
Data Engineer (Associate Software Engineer)
Accenture
Architected production-grade ETL/ELT pipelines using PySpark and Spark SQL, processing 100M+ records daily with 30-40% performance improvement through partition optimization, broadcast joins, and adaptive query execution
Designed multi-layer lakehouse architecture (Landing → Bronze → Silver → Gold) ensuring data governance, lineage, and quality across enterprise platform serving 500+ business users
Developed reusable PySpark transformation modules with complex joins, window functions, and aggregations, reducing code duplication by 40% and accelerating feature delivery by 2-3 weeks
Orchestrated end-to-end workflows using Azure Data Factory and GCP Cloud Composer, enabling reliable cross-platform data movement between ADLS, BigQuery, and Cloud Storage with 99.8% pipeline success rate
Built optimized SQL models with materialized views, partitioned tables, and clustered indexes, reducing dashboard query latency from 45s to 18s (60% improvement) and supporting real-time analytics
Implemented comprehensive automated data quality framework including schema validation, duplicate detection, null checks, and threshold-based alerts, reducing production defects by 30% and improving data
Education
Bharati Vidyapeeth’s Institute of Computer Applications and Management (GGSIPU), Delhi
Master of Computer Applications
Computer Applications
Asansol Engineering College, West Bengal
Bachelor of Computer Applications
Computer Applications
Licenses & Certifications
Databricks Certified Data Engineer Associate
Databricks
Credential ID: 163966870
Databricks Certified Generative AI Engineer
Databricks
Credential ID: 163966870