Smriti Jain

@sjain6700

Data Engineer at Tata Consultancy Services (TCS)

Noida, Uttar Pradesh, India

Tata Consultancy Services (TCS)

Pranveer Singh Institute of Technology

Data Engineer experienced in designing and orchestrating scalable ETL pipelines using PySpark, SQL, Airflow, Databricks, and AWS Cloud Platform. Skilled in developing end-to-end data workflows, ensuring performance, reliability, and data integrity across systems.

Experience

Data Engineer

Tata Consultancy Services (TCS)

Aug 2023 - Present

Noida, India

Modernized enterprise ETL and Data Quality (DQ) workflows after Databricks cluster migration, enhancing scalability, governance, and reliability by over 40%.
Migrated 15+ Informatica workflows to Databricks SaaS, redesigning pipelines into modular Source → Canonical → Cleanse → DQ layers aligned with the Medallion Architecture.
Implemented Delta Lake with schema enforcement, ACID versioning, and audit logging to ensure compliance, traceability, and simplified rollback management.
Built a scalable PySpark-based JSON ingestion framework to dynamically flatten nested data into parent–child tables, supporting flexible relational modeling and downstream analytics.
Optimized ingestion performance by 35% through partition pruning, salting, and balanced cluster resource utilization, minimizing shuffle overhead in shared-mode environments.

Summer Analyst Intern

Goldman Sachs

Jul 2022 - Aug 2022

Bengaluru, India

Automated 15+ manual processes, reducing operational effort by 40%.
Developed AutoSys jobs to streamline Java application scheduling, improving efficiency and timeliness.
Expanded dashboard functionality by adding 4 new monitoring flags, increasing usability for business teams.

Education

Pranveer Singh Institute of Technology

B.Tech.

CSE

Sep 2019 - Jun 2023

Grade: 9.27 CGPA

Licenses & Certifications

AWS Certified Data Engineer – Associate

AWS

• No expiration

AWS Certified Cloud Practitioner

AWS

• No expiration

Web Development

Internshala

• No expiration

Skills

Python
SQL
C
Java
Apache Spark
PySpark
Databricks
Delta Lake
Airflow
Informatica PowerCenter
Informatica Developer
AWS
S3
Redshift
Hive
MySQL
Oracle
SQL Server
Git
GitHub
GitLab
ChatGPT
Copilot
Data Structures & Algorithms
Object-Oriented Programming