Tejeswar Sai

@tejeswarsai

Data Engineer

Bangalore, Karnataka

linkedin.com/tejeswar-sai

IQVIA

GITAM University

Data Engineer with 4 years of experience in ADF, Databricks, PySpark, and SQL, building and optimizing cloud-based data pipelines. Skilled in ETL, data modeling, and performance tuning for efficient data processing.

Experience

Data Engineer

IQVIA

• Feb 2023 - Present

Designed and developed scalable ETL pipelines using Azure Data Factory (ADF) and Azure Databricks for data ingestion and transformation.
Created ADF pipeline jobs, scheduled triggers, and implemented Mapping Data Flows, using Azure Key Vault for secure credential management.
Implemented PySpark-based transformations to process large datasets efficiently in Delta Lake.
Reduced ETL execution time by 80% through Spark optimizations and SQL tuning.
Managed SQL databases and Azure Data Lake Storage (ADLS), ensuring efficient data storage and retrieval.
Developed data models and schema designs to support business reporting and analytics.
Used Databricks Auto Loader for incremental and real-time data ingestion from ADLS, improving data pipeline efficiency.
Integrated Unity Catalog for centralized data governance, access control, and lineage tracking across Azure Databricks environments.
Automated data validation and monitoring processes using Python and Jira, reducing manual debugging effort by 50%.
Troubleshot data pipeline failures and performance bottlenecks, ensuring 99.9% uptime for critical workflows.
Contributed to documentation and technical standards for data pipeline development.

Data Engineer

ATTRA InfoTech

• Jan 2021 - Sep 2022

Built ETL pipelines for structured and semi-structured data using Spark and SQL.
Developed and optimized batch processing jobs in Azure Databricks for data transformation.
Worked with SQL databases for querying, analysis, and performance tuning.
Automated data pipelines using Airflow.
Migrated on-premises SQL Server data to Azure SQL DB and Azure Synapse, ensuring data consistency and performance optimization.
Used Spark SQL to process data on the Spark engine.
Explored Spark to improve the performance of existing Hadoop algorithms, using SparkContext, Spark SQL, and DataFrames.
Worked with SonarQube to ensure code quality and maintainability across data engineering projects.
Collaborated with stakeholders to resolve data quality issues and improve reporting accuracy.

Education

GITAM University

Bachelor of Engineering

Computer Science

Jan 2016 - Jan 2020

Licenses & Certifications

EMC Academic Associate, Data Science and Big Data Analytics

EMC

• No expiration

Masters 6-Month Intensive Industry Big Data Program

TrendyTech

Skills

Python
Scala
Shell Scripting
Azure Data Lake Gen2
Azure Databricks
Azure Data Factory
Azure SQL DB
Blob Storage
AWS S3
Apache Spark
Airflow
Hive
Kafka
Sqoop
MySQL
GitHub
GitLab
Bitbucket
Agile
Azure DevOps
Jira
IntelliJ
PyCharm
Maven
SBT
Power BI
PuTTY
WinSCP
ETL
Data Modeling
Performance Tuning