Mounisha Aripaka
@Mounisha_A
Data Engineer at CATALYST
Hyderabad, Telangana, India
Results-oriented Data Engineer with 4 years of experience designing, implementing, and optimizing data pipelines and workflows. Proficient in working with large datasets, ETL processes, data transformation, and data ingestion from a variety of sources. Expertise in Scala-Spark, PySpark, SQL, and the Cloudera platform. Adept at collaborating with cross-functional teams to deliver data solutions that improve data accessibility and support model training for data scientists. Strong understanding of platform management and service upgrades, with hands-on experience in Linux environments. Good knowledge of data warehousing solutions.
Experience
Data Engineer
CATALYST
Built data engineering pipelines on top of the Cloudera Machine Learning (CML) and Cloudera Data Engineering (CDE) products. Created ETL pipelines using PySpark to automate data movement from multiple sources, such as Oracle and APIs, to AWS S3 storage. Optimized usage of Cloudera Machine Learning. Optimized and migrated existing ETL pipelines from Cloudera CML to Cloudera CDE, reducing costs by 70%. Worked on the development of key features for the Catalyst tool. Provided engineered data to the Data Science team for model training.
Data Engineer
Modak Analytics LLP
Designed and implemented ETL pipelines using Scala-Spark to extract data from various sources (Oracle, SFTP, APIs), clean it, and load it into AWS S3 and Hive. Collaborated with data scientists to process and prepare datasets for model training. Used PySpark to fetch data from AWS S3, develop the required features, and load the cleaned and transformed data into Oracle. Actively contributed to platform management tasks in Cloudera CML. Automated the delivery of timely data quality reports by email using SMTP. Reduced billing costs by 70% by optimizing the Cloudera CML pipelines and migrating them to CDE.
Data Engineer
DATALABS
Worked on ETL processes, handling data ingestion from Oracle, SFTP, and APIs into Hive and AWS S3 using Scala-Spark. Ensured smooth and efficient data transfer for better business insights. Documented the ETL processes to provide a handbook for clients.
Education
Gayatri Vidya Parishad College of Engineering (Autonomous)
B.Tech
Narayana Junior College
Intermediate
Little Angels High School
SSC