
Samyak Jain

@SamyakJain

ETL Developer at ZS Associates

Pune, Maharashtra, India

ZS Associates, DIT University

Data Engineer with 3+ years of experience building scalable ETL/ELT pipelines using PySpark, SparkSQL, and SQL on AWS and Azure. Experienced in developing cloud-based data warehouses, processing structured and semi-structured data (Parquet, XML, JSON, CSV), and optimizing distributed data workflows. Hands-on with AWS services such as EMR and S3 and with the Delta Lake architecture. Strong collaborator focused on delivering reliable data solutions.

Experience

ETL Developer

ZS Associates

Present

Pune, Maharashtra

Architected, modeled, and owned large-scale ETL/ELT pipelines using SQL and Spark (PySpark) applications to process large volumes of healthcare and pharma data. Built and supported executive and market intelligence dashboards focused on pharmacy marketing, brand performance, and regional sales trends. Orchestrated and monitored Spark workflows in Hive-compatible environments on AWS EMR using Azkaban. Optimized and automated ETL and Spark workloads by tuning shuffle partitions and executor configurations. Implemented Master Data Management (MDM) and data governance frameworks for product and customer hierarchies.

Data Engineer

Celebal Technologies

Noida

Engineered scalable data and ETL pipelines using SQL, Python, and Spark (PySpark/Spark SQL) on Azure Databricks. Designed and implemented data enrichments and transformations using Medallion Architecture (Bronze/Silver/Gold) with Delta Lake. Developed a data quality audit engine and integrated datasets with Power BI. Created Delta Live Tables pipelines for incremental and real-time data processing. Developed advanced SQL procedures and reusable transformation scripts.

Education

DIT University

Bachelor's degree

Computer Science & Engineering with specialization in Artificial Intelligence and Data Science

Grade: 8.23

Licenses & Certifications

AZ-900 Microsoft Certified: Azure Fundamentals

Microsoft

• No expiration

Academy Accreditation - Databricks Lakehouse Fundamentals

Databricks

• No expiration

DP-900 Microsoft Certified: Azure Data Fundamentals

Microsoft

• No expiration

Databricks Certified Data Engineer Professional

Databricks

• No expiration

Skills

Advanced SQL
PL/SQL
MySQL
PostgreSQL
Oracle
PySpark
Delta Live Tables
ETL/ELT
Business Analysis
Salesforce
Hadoop
Hive
Kafka
Apache
Data Warehousing
Azure Databricks
Airflow
Big Data Analysis
Data Modeling
Data Engineering
Python
Flask
Docker
Django
FastAPI
Kubernetes
AI
Machine Learning
Snowflake
Cassandra
MongoDB
Agile
AWS
Google Cloud Platform
dbt
OLAP
APIs
SQL Server
SSRS
SSIS