Yogesh Singh
@yogeshsingh0720
Data Engineer
Gurgaon, Haryana, India
AWS and Microsoft Fabric certified Data Engineer with 5 years of experience building and optimizing scalable pipelines and cloud-based solutions. Proficient in Python, SQL, PySpark, AWS Glue, CI/CD, and multi-cloud migrations, with a strong track record of swift P1-P3 RCA and zero-delay delivery. Recognized for ownership, cross-functional collaboration, and delivering high-quality, business-driven solutions that ensure data integrity and reduce processing time.
Experience
Data Engineer
Capgemini
• Spearheaded AWS Glue pipeline upgrades across 200+ Glue Jobs; owned 50% of the Glue, Python, and library version upgrades, Lambda-to-Glue migrations, and Glue optimizations, ensuring zero-delay delivery to clients.
• Resolved P1-P3 production incidents with swift RCA, including complex SQL debugging, missing data resolution, count mismatches, and data discrepancy investigations during Hypercare support following the S/4HANA migration.
• Maintained data integrity via priority-driven reload activities. Decommissioned legacy Glue Jobs with Git-based code archival for clean version governance.
• Collaborated with Business Analysts and cross-functional teams on critical data quality issues; maintained technical documentation on Confluence for all changes and process improvements.
• Upskilling in Databricks and GenAI/Agentic AI through internal corporate training and specialized workshops.
• Awarded Shining Star within 3 months for exceptional ownership, collaborative problem-solving, and consistent on-time delivery.
Data Engineer
Tata Consultancy Services
• Designed and implemented multi-cloud ETL pipelines, migrating workflows from AWS to Microsoft Fabric using Azure services, reducing processing time by 25% and optimizing infrastructure costs.
• Migrated on-premises Windows server applications to AWS Cloud. Automated data retrieval, validation, manipulation, and report generation using AWS services (Lambda, Batch, S3, CloudFormation, Redshift).
• Developed scalable ETL processes using PySpark and AWS Glue for large-scale data transformation, achieving a 20% improvement in data accuracy across enterprise datasets.
• Built automated monitoring and self-healing ETL systems using AWS Lambda, Step Functions, CloudWatch, EventBridge, and SNS, reducing manual operational effort by 30%.
• Automated ETL file and record validation using SQL Stored Procedures and AWS (Redshift, Lambda, Glue, SNS), logging pipeline stats in Redshift and cutting manual effort by 60%.
• Streamlined CI/CD pipelines using GitHub Actions and AWS CloudFormation for seamless automated deployments across Dev, QA, and Prod environments.
• Integrated SNS with ServiceNow for real-time ETL alerts; managed Incidents & Change Requests within SLAs using JIRA and Confluence in an Agile/Scrum delivery model.
Education
Echelon Institute of Technology
B.Tech
Electronics and Communication Engineering
Licenses & Certifications
AWS Data Engineer Associate
Amazon Web Services
Credential ID: 1f3ef992104043a2b24728ce6ba3d832
Microsoft Fabric Data Engineer Associate
Microsoft
Credential ID: C9095D6DCD2C8375