N D S HARSHA VARDHAN

@ndsharshavardhan

Data Engineer at ADA Digital Analytics Private Limited

Bengaluru

Data engineering professional with 3.1 years of experience, proficient in both Microsoft Azure and AWS cloud services. Expertise includes managing large datasets with Azure Synapse Analytics, AWS Redshift, and Databricks. Skilled in building real-time data pipelines with Kafka and in implementing complex ETL/ELT processes with Python and PySpark.

Experience

Data Engineer

ADA Digital Analytics Private Limited

Apr 2021 - Present • Bengaluru

Engaged in business discussions to determine KPIs and collaborated with cross-functional teams to deliver high-quality products. Gathered and processed data from various sources using PySpark in Azure Databricks, and scheduled data flows with Azure Data Factory. Created Power BI dashboards for data-driven decision-making and maintained CI/CD pipelines for seamless deployment and updates. Tuned the performance of Databricks jobs and Data Factory pipelines, and established best practices for data ingestion, transformation, and storage. Developed Azure Functions in Python for efficient application management and identified infrastructure bottlenecks for improvement. Used PySpark testing utilities, pytest, and unit tests to reduce failures by 60%, and optimized Spark code to reduce runtime by 30%.

Implemented cost optimization strategies that reduced Azure cloud infrastructure expenses by 50%, and saved up to 25% on platform costs by migrating from custom-built data collection microservices to Kafka. Developed Kafka Streams (Java) and Kafka Connect applications for real-time data processing and built a monitoring dashboard using Datadog. Ensured data security and compliance with data privacy regulations, and provided ongoing support and troubleshooting for deployed solutions.

Developed a PySpark script, following the business logic, that fetches source data hosted on AWS S3, processes it according to the requirements, and loads it into Redshift tables. Designed a data warehouse on AWS Redshift with separate databases for production and development. Scheduled ETL jobs to push the transformed data to the staging environment in the Redshift warehouse. Scheduled the scripts using AWS Glue for timely monitoring. Used AWS CloudWatch to collect and track metrics, collect and monitor log files, and set alarms. Developed SQL scripts to migrate the data from the staging to the prod environment with some transformations and scheduled using La

Education

Aditya Degree College Affiliated to Adikavi Nannaya University

BSc Computer Science

Jan 2017 - Jan 2020 • Grade: CGPA 7.6/10

Sri Chaitanya Junior College

Mathematics, Physics, and Chemistry

Jan 2015 - Jan 2017 • Grade: 83.6%

Sri Bharathi Public School

Jan 2014 - Jan 2015 • Grade: GPA 9.2/10

Licenses & Certifications

Learning path in Data Science

Board Infinity

No expiration

Skills

ETL

Apache Spark and PySpark

Streaming

Apache Kafka

Confluent Kafka and Azure Event Hubs

Stream Processing

Spark Streaming

Kafka Streams

Kafka Connect

Databases

Oracle

MSSQL

PostgreSQL

Cosmos DB

DynamoDB

Big Data (Hadoop)

HDFS

Hive

Sqoop

MapReduce and YARN

Scripting/Programming

Shell Scripting

Python

SQL

PL/SQL

Pandas

NumPy

Java

Visualization Tools

Excel

Power BI

Tableau

Metabase

QuickSight

Cloud

Azure - SQL Database

Data Lake Storage

Blob

Data Factory

Databricks

Synapse Analytics

Event Hubs

Functions

Web Apps

Power BI and Power Apps

AWS - S3

EC2

EMR

Lambda

Athena

Glue

Step Functions

Redshift

IAM and Secrets Manager

DevOps

GitHub

Azure DevOps