Vani Singh
@Vanisingh
Data Engineer | Databricks | SQL | Python | PySpark | Snowflake | AWS | ETL
Noida, Uttar Pradesh, India
Data Engineer with 2.8 years of experience designing, building, and optimizing data pipelines. Experienced in cloud migration, transforming legacy data systems into scalable cloud architectures. Proficient in SQL, Python, PySpark, Databricks, Snowflake, AWS, data warehousing, and ETL workflows. Skilled in handling relational, geospatial, semi-structured, and streaming data.
Experience
Data Engineer
Accenture
• Migrated 4 legacy SAS-based data applications to a cloud-native Databricks–Snowflake–AWS architecture, enhancing scalability and reducing processing time by 60%.
• Designed and implemented end-to-end ETL pipelines for the Advertising Analytics team, integrating cross-source datasets to enable targeted ad campaigns, contributing to a 15% reduction in claim count.
• Developed a PySpark-based lightweight application to process coverage-level and state-level data, applying Spark performance tuning that improved runtime by 25%.
• Orchestrated multiple Databricks jobs using Databricks API, AWS Lambda, Step Functions, EventBridge, and SNS, automating workflow execution and achieving a 70% gain in operational efficiency.
• Initiated compute optimization efforts for Databricks and Snowflake, analyzing Spark UI and logs to identify performance bottlenecks, resulting in 11,500 DBUs saved annually.
• Delivered scalable, automated, and fault-tolerant solutions with scheduled data refreshes and monitoring.
• Developed Lambda-based automation to trigger and validate Snowflake SQL queries, streamlining data refresh cycles.
• Built a data validation and quality check framework using Python and PySpark, automating data profiling, anomaly detection, and visualization, reducing manual effort by 80% and increasing data reliability.
• Processed and transformed geospatial datasets for property underwriting, managing 65 million records to support risk scoring and underwriting decision-making.
• Performed parallel runs between on-premises SAS systems and Databricks pipelines to validate output accuracy during migration, ensuring 99% consistency across platforms.
• Collaborated with cross-functional teams to document data workflows, define data lineage, and improve governance practices in the migration process.
• Supported production operations by troubleshooting ETL job failures, optimizing cluster configurations, and ensuring stable daily and monthly refreshes.
• Contributed to data pipeline enhancements, including partition pruning, caching strategies, and incremental load optimization to improve overall performance and cost efficiency.
Education
JSS Academy of Technical Education, Noida
Bachelor of Technology
Computer Science
- Member of the Google Developer Student Club (Developer Student Clubs is a Google Developers program that helps university students learn and build together. DSC JSS Noida is a community of programmers, developers, and designers who grow their knowledge in a peer-to-peer learning environment and build solutions for local businesses and their community.)
- Member of the Impetus Student Society (a student body at the JSS Academy of Technical Education, Noida, that aims to turn good students into better professionals, providing the corporate world with quality professionals and easing students’ induction into the industry. Impetus takes the initiative to introduce today’s academician to tomorrow’s executive.)
Licenses & Certifications
Databricks Certified Data Engineer Associate
Databricks
SnowPro Core Certification
Snowflake
GitHub Copilot (GH-300)
Microsoft