Data Engineer with over 5 years of experience designing, developing, and optimizing scalable data platforms and ETL pipelines. Proficient in cloud-native services, streaming and distributed processing frameworks (Apache Flink, Kafka, Spark), and orchestration tools such as Airflow. Adept at data warehousing, data modeling, and building robust, compliant data solutions for regulated industries such as healthcare and pharmaceuticals.
Experience
Data Engineer
Thoughtworks
Designed and implemented scalable end-to-end data pipelines using Apache Flink, Kafka, and dbt, enabling real-time ingestion and transformation of data from IoT devices such as Apple and Fitbit smartwatches for analytics and reporting. Led the development of a high-performance data warehouse on AWS and Snowflake, optimizing fact/dimension models, Type 1 and Type 2 slowly changing dimensions (SCD), and conformed dimensions to support data-driven decision-making. Worked closely with data analysts, data scientists, and business stakeholders to understand data requirements and ensure timely, accurate data availability. Established data governance frameworks, access control policies, and monitoring systems on Kubernetes, ensuring compliance with HIPAA and other regulatory standards. Spearheaded pharma and healthcare data integrations, streamlining data pipelines for clinical trials, patient records, and research datasets.
Data Engineer
Avaya Inc.
Developed and maintained secure, scalable ETL pipelines using AWS, PySpark, Snowflake, and dbt, optimizing data transformations and reducing processing time for 50 TB of structured and semi-structured financial and healthcare data. Designed and implemented a Python-based data parsing library that reduced pipeline errors by ~12%, and achieved 99.9% uptime for critical data services through enhanced failover mechanisms. Automated data orchestration workflows using Apache Airflow, ensuring reliable execution of daily revenue, sales, and financial compliance reports for global clients. Partnered with healthcare and pharmaceutical stakeholders to develop tailored data solutions that improved drug research analytics, optimized clinical trial management, and enhanced patient insights. Tuned the performance of Snowflake-based data transformations, reducing query runtimes by ~40% and improving cost efficiency for large-scale analytical queries.
Data Engineer
vPhrase Analytics Solutions
Designed and built custom data connectors to extract and integrate data from 10+ disparate sources, supporting BI tools such as Tableau and Power BI for dynamic business intelligence reporting. Implemented test-driven development (TDD) and CI/CD pipelines, achieving ~80% test coverage across data pipelines. Led the migration of a monolithic data processing architecture to microservices using Docker and Kubernetes, improving fault tolerance by ~30% and data processing throughput by ~15%. Built and optimized ETL workflows for healthcare providers, covering ingestion, cleaning, and processing of patient records, medical claims, and clinical data.
Education
Government College of Engineering, Amravati
Bachelor of Technology
Computer Engineering
Relevant coursework: Data Architecture, Advanced Python, Network Security, Data Structures and Algorithms (DSA), and Operating Systems.