Data Engineer with over two years of professional experience, specializing in building and maintaining robust data pipelines. Proficient in Python, Spark, and SQL, with experience automating ETL processes and ingesting data from multiple sources. Experienced in using cloud platforms such as Azure and AWS for data transformation and analysis.
Experience
Data Engineer
LTIMindtree Ltd
Maintained data pipeline uptime of 90% while investigating streaming and transactional data across 8 primary data sources using Spark, Redshift, and Python.
Automated ETL processes across millions of rows of data, reducing manual workload.
Ingested data from source systems using a combination of SQL and the Google Analytics API via Python, creating data views for use in BI tools such as Tableau.
Designed and developed features to meet customer needs.
Used the coalesce function to repartition the data into a single file.
Implemented NFRs such as exception handling, a logging framework, and unit testing with pytest and assert.
Developed Spark applications using Spark SQL in Databricks for data extraction, transformation, and aggregation from multiple file formats.
Extracted, transformed, and loaded data from source systems to Azure Data Storage services using a combination of Azure Data Factory and Spark SQL.
Responsible for estimating cluster size and for monitoring and troubleshooting the Spark Databricks cluster.
Used the Spark DataFrame API to perform data manipulation within a Spark session.
Education
Sai Vidya Institute of Technology
B.E.
Sri Devraj URS Pre-university
PUC
Nalanda High School
SSLC