Experience

Data Engineer

IQVIA

• Feb 2023 - Present

Designed and developed scalable ETL pipelines using Azure Data Factory (ADF) and Azure Databricks for data ingestion and transformation. Created ADF pipeline jobs, scheduled triggers, and implemented Mapping Data Flows, using Azure Key Vault for secure credential management. Implemented PySpark-based transformations to process large datasets efficiently in Delta Lake. Reduced ETL execution time by 80% through Spark optimizations and SQL tuning. Managed SQL Database and Azure Data Lake Storage (ADLS), ensuring efficient data storage and retrieval. Developed data models and schema designs to support business reporting and analytics. Used Databricks Auto Loader for incremental and real-time data ingestion from ADLS, improving data pipeline efficiency. Integrated Unity Catalog for centralized data governance, access control, and lineage tracking across Azure Databricks environments. Automated data validation and monitoring processes using Python and Jira, reducing manual debugging effort by 50%. Troubleshot data pipeline failures and performance bottlenecks, ensuring 99.9% uptime for critical workflows. Contributed to documentation and technical standards for data pipeline development.

Data Engineer

ATTRA InfoTech

• Jan 2021 - Sep 2022

Built ETL pipelines for structured and semi-structured data using Spark and SQL. Developed and optimized batch processing jobs in Azure Databricks for data transformation. Worked with SQL databases for querying, analysis, and performance tuning. Automated data pipelines using Airflow. Migrated on-premises SQL Server data to Azure SQL DB and Azure Synapse, ensuring data consistency and performance optimization. Used Spark SQL to process data on the Spark engine. Explored Spark to improve the performance of existing Hadoop algorithms using SparkContext, Spark SQL, and DataFrames. Worked with SonarQube to ensure code quality and maintainability across data engineering projects. Collaborated with stakeholders to resolve data quality issues and improve reporting accuracy.

Education

GITAM University

Bachelor of Engineering

Computer Science

Jan 2016 - Jan 2020

Licenses & Certifications

EMC Academic Associate, Data Science and Big Data Analytics

EMC

• No expiration

Master's 6-Month Intensive Industry Big Data Program

TrendyTech

Skills

Python

Scala

Shell Scripting

Azure Data Lake Gen2

Azure Databricks

Azure Data Factory

Azure SQL DB

Blob Storage

AWS S3

Apache Spark

Airflow

Hive

Kafka

Sqoop

MySQL

GitHub

GitLab

Bitbucket

Agile

Azure DevOps

Jira

IntelliJ

PyCharm

Maven

SBT

Power BI

PuTTY

WinSCP

ETL

Data Modeling

Performance Tuning

Tejeswar Sai
