Syed Atif Ali

@syed.atif-ali

Data Engineer at Capgemini

Navi Mumbai, India

https://www.linkedin.com/in/syed-atif-ali-b2659a244

Capgemini • Sant Gadge Baba Amravati University

Results-driven Data Engineer with about three years of experience in Spark, PySpark, Python, SQL, and Databricks. Proven expertise in developing and managing ETL pipelines, implementing business logic, and performing rigorous data quality checks. Skilled in automating deployment processes using Jenkins and managing code with Git/GitHub. Adept at understanding complex data models, creating comprehensive mapping documents, and ensuring seamless data processing workflows.

Experience

Data Engineer (GAP)

Capgemini

Full-time • Jan 2023 - Present • Navi Mumbai, India

Designed and developed end-to-end data pipelines that transform and integrate data into Delta tables using PySpark. Designed and implemented a real-time data pipeline that processes semi-structured data, integrating raw records from data sources using Kafka and PySpark. Deployed data pipelines using Jenkins. Optimized Spark jobs for performance, achieving up to a 40% reduction in runtime and 30% lower resource consumption.

Data Engineer (Yahoo)

Capgemini

Full-time • Mar 2022 - Dec 2023 • Navi Mumbai, India

Led the conversion of legacy Hive ETL pipelines to PySpark. Conducted comprehensive performance optimization for PySpark pipelines, reducing data processing times by up to 35%. Managed data extraction and cleaning processes. Deployed and managed ETL pipelines on AWS EMR.

Education

Sant Gadge Baba Amravati University

B.E.

Jan 2015 - Jan 2019

Licenses & Certifications

AWS Certified Developer – Associate

Amazon Web Services

No expiration

Microsoft Certified: Azure Data Fundamentals

Microsoft

No expiration

Academy Accreditation - Databricks Lakehouse Fundamentals

Databricks

No expiration

Skills

Python
SQL
Spark
PySpark
Spark Structured Streaming
Hadoop
Hive
Azure Databricks
Azure Data Factory
ADLS Gen2
AWS S3
HDFS
Git
GitHub
Jenkins
Kafka
AWS EMR