Yogesh Singh

@yogeshsingh0720

Data Engineer

Gurgaon, Haryana, India

Capgemini
Echelon Institute of Technology

AWS and Microsoft Fabric certified Data Engineer with 5 years of experience building and optimizing scalable pipelines and cloud-based solutions. Proficient in Python, SQL, PySpark, AWS Glue, CI/CD, and multi-cloud migrations, with a strong track record of swift root cause analysis (RCA) on P1-P3 incidents and zero-delay delivery. Recognized for ownership, cross-functional collaboration, and delivering high-quality, business-driven solutions that ensure data integrity and reduce processing time.

Experience

Data Engineer

Capgemini

Full-time
Mar 2025 - Present
Gurgaon, Haryana, India

• Spearheaded AWS Glue pipeline upgrades across 200+ Glue Jobs, owned 50% of Glue, Python & library version upgrades, Lambda-to-Glue migrations, and Glue optimizations, ensuring zero-delay delivery to clients.
• Resolved P1-P3 production incidents with swift RCA, including complex SQL debugging, missing data resolution, count mismatches, and data discrepancy investigations post S/4HANA migration Hypercare support.
• Maintained data integrity via priority-driven reload activities. Decommissioned legacy Glue Jobs with Git-based code archival for clean version governance.
• Collaborated with Business Analysts and cross-functional teams on critical data quality issues; maintained technical documentation on Confluence for all changes and process improvements.
• Upskilling in Databricks and GenAI/Agentic AI through internal corporate training and specialized workshops.
• Awarded Shining Star within 3 months for exceptional ownership, collaborative problem-solving, and consistent on-time delivery.

Data Engineer

Tata Consultancy Services

Full-time
Apr 2021 - Nov 2024
Gurgaon, Haryana, India

• Designed and implemented multi-cloud ETL pipelines, migrating workflows from AWS to Microsoft Fabric using Azure services, reducing processing time by 25% and optimizing infrastructure costs.
• Migrated on-premises Windows server applications to AWS Cloud. Automated data retrieval, validation, manipulation, and report generation using AWS services (Lambda, Batch, S3, CloudFormation, Redshift).
• Developed scalable ETL processes using PySpark and AWS Glue for large-scale data transformation, achieving a 20% improvement in data accuracy across enterprise datasets.
• Built automated monitoring and self-healing ETL systems using AWS Lambda, Step Functions, CloudWatch, EventBridge, and SNS, reducing manual operational effort by 30%.
• Automated ETL file and record validation using SQL Stored Procedures and AWS (Redshift, Lambda, Glue, SNS), logging pipeline stats in Redshift and cutting manual effort by 60%.
• Streamlined CI/CD pipelines using GitHub Actions and AWS CloudFormation for seamless automated deployments across Dev, QA, and Prod environments.
• Integrated SNS with ServiceNow for real-time ETL alerts, managed Incidents & Change Requests within SLAs using JIRA and Confluence in an Agile/Scrum delivery model.

Education

Echelon Institute of Technology

B.Tech

Electronics and Communication Engineering

Jul 2016 - Aug 2020
Grade: First Division

Licenses & Certifications

AWS Data Engineer Associate

Amazon Web Services

Issued: Sep 2025
Expires: Sep 2028

Credential ID: 1f3ef992104043a2b24728ce6ba3d832

Microsoft Fabric Data Engineer Associate

Microsoft

Issued: Oct 2026
Expires: Oct 2026

Credential ID: C9095D6DCD2C8375

Skills

Python
SQL
AWS
PySpark
Microsoft Fabric
SnapLogic
Jenkins
ETL
Git
JIRA
Agile
Generative AI