Kruthi B
@kruthib
Data Engineer II at Ganit Business Solutions
Bangalore
Kruthi is an experienced Data Engineer specializing in building robust data platforms with a focus on data quality, governance, and lineage. She has extensive experience in the retail and FMCG sectors, developing scalable ETL pipelines and automating manual processes. Proficient in AWS services (S3, Glue, Lambda) and big data technologies such as PySpark and Snowflake, she also builds data-driven dashboards and reports.
Experience
Data Engineer II
Ganit Business Solutions
Data Platform focused on Data Quality & Governance, Data Observability, and Data Lineage – Retail and FMCG Sector: Built a real-time data quality monitoring system, reducing data inconsistencies and improving reliability. Implemented data lineage tracking to ensure seamless integration across multiple disconnected systems. Developed proactive anomaly detection with automated alerts for sales, inventory, and transaction data. Designed dashboards for sales trends, customer behavior, and product performance, enabling data-driven decisions. Technologies used include AWS Glue, PySpark, S3, CloudWatch, Lookout for Metrics, Lambda, SNS, EventBridge, Snowflake, Airflow, SnapLogic.
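The anomaly detection described above was built on managed AWS services (Lookout for Metrics, CloudWatch, SNS), but the core idea can be sketched in a few lines. The following is a minimal illustrative example, not the production system: the function name, the z-score rule, and the threshold are all assumptions for the sake of the sketch.

```python
from statistics import mean, stdev

def detect_anomalies(values, threshold=2.0):
    """Flag indices of values that deviate more than `threshold`
    standard deviations from the mean (a simple z-score rule).
    Illustrative stand-in for a managed anomaly-detection service."""
    if len(values) < 2:
        return []  # not enough data points to estimate spread
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []  # constant series: nothing can be anomalous
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]

# A sudden spike in daily sales stands out against the baseline:
daily_sales = [100, 102, 98, 101, 99, 500]
print(detect_anomalies(daily_sales))  # the spike at index 5 is flagged
```

In a real pipeline, flagged indices would feed an alerting path (e.g. an SNS topic) rather than a print statement.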
Data Engineer II
Ganit Business Solutions
Data Platform Development – FMCG Sector: Designed and implemented a data platform to improve data system efficiency, automate processes, and reduce manual hours. Eliminated data silos and manual data entry by integrating disparate departmental data, built a scalable ETL pipeline that cut a two-hour task to 15 minutes, and developed reports for multiple domains. Currently leading a team of three interns, covering development activities, daily pipeline monitoring, and maintenance. Facilitated cross-functional collaboration through automated processes, delivering key reports for Finance (working capital, fixed overhead analysis, variance analysis), HR (attrition, talent acquisition, headcount), and Sales (variance analysis). Technologies used include PySpark, SQL, SnapLogic, AWS S3, Glue, Airflow, AWS SNS, Snowflake.
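The silo-elimination step above boils down to normalizing and unioning extracts from different departments. A minimal sketch of that pattern, assuming hypothetical CSV extracts and column names (the real pipeline used PySpark and SnapLogic, not this function):

```python
import csv
import io

def merge_department_rows(sources):
    """Combine rows from multiple departmental CSV extracts into one
    normalized list of dicts, tagging each row with its source.
    `sources` maps a department name to its raw CSV text."""
    merged = []
    for dept, csv_text in sources.items():
        for row in csv.DictReader(io.StringIO(csv_text)):
            # Normalize inconsistent headers and whitespace across departments
            clean = {k.strip().lower(): v.strip() for k, v in row.items()}
            clean["source_department"] = dept  # preserve lineage of each row
            merged.append(clean)
    return merged

# Two departments with differently formatted headers merge cleanly:
extracts = {
    "finance": "ID, Amount\n1, 50\n",
    "hr": "id,amount\n2,70\n",
}
print(merge_department_rows(extracts))
```

Tagging each row with its source department is what later makes lineage tracking and per-domain reporting straightforward.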
Data Engineer I
Ganit Business Solutions
Data Lake & Warehouse Migration Accelerator for Infrastructure Modernization – Manufacturing Sector: Migrated 10 TB of data across 7,000 tables to AWS S3 using DMS, with robust backup and archival setups. Developed and deployed the “Euclidean Framework” as a build package on AWS EMR, using PySpark for data transformation and business logic, and moved the data to Redshift for visualization and analysis. Implemented a CI/CD pipeline using AWS CodeCommit, CodeBuild, and CodeDeploy, reducing processing time by 40% and manual errors by 20%, while enabling automated scaling and improved logging and troubleshooting. Technologies used include PySpark, AWS S3, DMS, EMR, Redshift, Step Functions, CloudWatch, AWS Budgets, CodeCommit, CodeBuild, CodeDeploy.
Data Analyst
Ganit Business Solutions
Mutual Fund Analysis – Capstone Project: Established a data lake on AWS S3 for profiling 15 years of mutual fund data, ensuring robust management. Implemented quality control measures with AWS Glue DataBrew to monitor errors and ensure data integrity. Orchestrated an ETL pipeline using AWS Glue and Redshift, automated with Lambda triggers and SNS notifications. Created Power BI dashboards to track trends, compare funds, identify top performers, and forecast performance. Technologies used include Python, AWS S3, Glue, DataBrew, Lambda, Step Functions, Power BI.
Education
JSS Science and Technology University
B.E.
Industrial and Production Engineering
Relevant coursework included Engineering Mathematics I and II, Computational Mathematics, Problem Solving Using C, Operations Research, Supply Chain Management, Design of Machine Elements, Quality Engineering, and Lean and Agile Manufacturing, among others.
Licenses & Certifications
AWS Certified Cloud Practitioner
AWS
Spark and Python for Big Data with PySpark
Udemy
Python
University of Michigan