Pooja Mallick
@poojamallick
Data Engineer at Sdora Consulting
Hyderabad, India
Pooja is a skilled Data Engineer with 2 years of experience in data processing and management. She has successfully migrated and validated big data using tools like Hadoop, Spark, and Google Cloud Platform. She is proficient in designing ETL processes, developing data models, and ensuring high data accuracy.
Experience
Data Engineer
Sdora Consulting (third party under Deloitte)
Migrated and validated big data for a well-known US retail giant using Hadoop, Dataproc, Spark, Google Cloud Platform, Apache Airflow, and related tools.
Designed and implemented database solutions and models to store and retrieve data.
Led the development and implementation of automation scripts in Python and Bash to validate over 50,000 data points; increased efficiency by 80% and saved over 1,000 hours per year.
Audited the migrated data; identified and resolved 250+ SQL errors within a week, ensuring 99.9% data accuracy.
Developed, implemented, and maintained data analytics protocols, standards, and documentation.
Collaborated with cross-functional teams to identify and address data pipeline issues, improving data quality and operations.
Used GitHub for version control and team collaboration.
Used data warehouses such as BigQuery and Hive for data extraction, transformation, and loading.
Implemented and managed ETL processes to extract, transform, and load data from various sources, including clinical trials, manufacturing, supply chain, sales, and regulatory compliance systems.
Designed and maintained data models, ensuring data integrity and accuracy throughout the ETL pipeline.
Performed data cleansing and transformation to improve data quality and consistency.
Developed and executed validation procedures to verify the accuracy of transformed data, and reconciled data between source and target systems.
Worked with cross-functional teams to understand data requirements and design efficient ETL workflows supporting business analytics and reporting.
Education
Regional College Of Management
BBA
Licenses & Certifications
Crash Course on Python
SQL for Data Science
University of California, Davis