DHARMESH SONI
@dharmeshsoni
Data Engineer
Indore
Experienced Data Engineer with expertise in building and optimizing end-to-end data pipelines using Python, SparkSQL, and Presto. Proven ability to manage large-scale data sets (up to 100B rows) and implement robust data governance frameworks. Skilled in cloud environments (AWS, GCP) and various tools including Airflow, Tableau, and KNIME.
Experience
Data Engineer
NK Technolabs
Led end-to-end data engineering projects, from scoping and framework building through execution and detailed documentation.
Collaborated with stakeholders and cross-functional (XFN) partners, including data scientists and product managers, applying project management skills.
Orchestrated data processing pipelines handling ~100B rows using Python, Presto, and SparkSQL in a distributed environment, and wrote Python programs to automate the framework's steps.
Led a project with stakeholders to mitigate user data from root sources across Ads warehouses to address privacy concerns.
Delivered product and analytics data models with data quality and governance frameworks, and iterated on them for improvements.
Designed and deployed NoSQL data models to production: horizontally scalable, performant, documented, and tested.
Defined and managed SLAs for all data sets in allocated areas of ownership and optimized the pipelines/ETL.
Worked with structured and unstructured data ingested from logs, file systems, Hive tables, and APIs using tools such as Airflow.
Collaborated to design and leverage an Identity and Access Management framework on the Ads Measurement data warehouse.
Built a data governance framework (monitoring, alerting, and guidance) to meet privacy commitments for the Ads data warehouse.
Closely managed data fidelity: data profiling, accuracy, correctness, completeness, and data transformation needs.
Built dashboards to measure and monitor business-critical metrics and performed root-cause analysis.
Set up a data pipeline monitoring and alerting system to proactively optimize and troubleshoot data pipelines and models.
Business Data Analyst
Collabera Inc
Led a data-driven project for the Operational Excellence team: provided a technical solution to onboard a third-party tool, improved data consistency, and built data marts as the source of truth along with reporting solutions.
Engineered an end-to-end analytics data pipeline and dashboard to track performance metrics for a Sales team of 450+ reps.
Translated business requirements into data models and used a dimensional modeling framework to build data sets.
Consulted senior staff to define KPI metrics and build data marts that establish a source of truth for the business use cases.
Increased efficiency through Salesforce data process automation, reducing a daily task to a weekly one and saving ~125 hours.
Added data quality checks to data pipelines, documented rules, and recommended process changes for data consistency.
Worked with Product, Sales, and Marketing datasets gathered from Hive (AWS S3), Salesforce, Zendesk, JIRA, and Highspot.
Associate Analyst, Data
Axelon Services Corporation
Automated ETL data pipelines using SQL and Python in KNIME Analytics to build analytics models for real-time dashboards.
Partnered with XFN teams to model business processes and automate reporting workflows for measuring metrics.
Enabled capacity monitoring and planning through dashboards that measure metrics for fulfilment centers and carriers.
Communicated analytical findings on lane performance to XFN teams and carriers, negotiated line-haul needs, and provided strategic solutions to minimize line-haul requirements.
Built real-time Tableau dashboards and reporting workflows to measure operational metrics for multiple fulfilment centers and carriers, enabling capacity monitoring and decision making.
Built fact data models for business intelligence and analytics to compare actual versus forecast measures.
Data R&L Specialist
Spectraforce Technologies Inc
Served as subject matter expert, defining and improving modeling accuracy for product data quality that drives a positive user experience.
Managed external vendors by handling escalation cases, analyzing trends, and resolving issues.
Data Analyst Intern
ShadyFoods
Analyzed and onboarded third-party data to conduct A/B experiments on email and social media platforms.
Education
California State University, Los Angeles
Master of Science
Information Systems
Licenses & Certifications
Data Engineering with GCP by Google
Coursera