Ved Prakash Shukla

@Vedprshukla

Data Engineer at EXL

Lucknow, Uttar Pradesh, India

EXL | Kanpur Institute of Technology, Kanpur

Seasoned Data Engineer with over 4 years of hands-on experience architecting and leading scalable data platforms for enterprise environments. Expertise in designing robust ETL architectures using Apache Spark, Databricks, Azure Data Factory, and Delta Lake to drive strategic data initiatives. Proven track record of mentoring cross-functional teams, optimizing data governance, and delivering high-ROI solutions that enhance analytics maturity, forecasting accuracy, and operational resilience for global clients in the healthcare and fintech sectors.

Experience

Data Engineer

EXL

Jan 2024 - Present | Noida

Led end-to-end data engineering initiatives, including migrating legacy SQL procedures to Databricks by developing Python, PySpark, and Spark SQL notebooks, enabling a seamless transition to modern, scalable processing for healthcare analytics. Designed and implemented a Medallion architecture (Bronze, Silver, and Gold layers) for data lakes, establishing layered data refinement pipelines that improved data quality and accessibility for downstream consumers. Orchestrated file ingestion workflows using Azure Data Lake Storage (ADLS), Azure Data Factory (ADF) pipelines, and Databricks workflows with automated scheduling, processing 10M+ records daily while ensuring fault-tolerant and efficient data movement. Directed workflow orchestration via Azure Data Factory and implemented CI/CD pipelines that accelerated deployments by 40% and supported multi-environment scalability for revenue cycle management and value-based care systems. Championed Delta Lake adoption for ACID-compliant data lakes, establishing governance frameworks that mitigated risks and enabled audit-ready compliance for regulated industries, including physician key metrics and operational reporting.

Software Engineer - Data

Paytm

May 2023 - Jan 2024 | Noida

Designed and automated enterprise-grade payroll reconciliation workflows, streamlining processes for 40K+ employees and reducing manual intervention by 50% through advanced scripting and monitoring. Engineered sophisticated reconciliation algorithms for high-volume credit datasets, boosting accuracy by 40% and minimizing financial discrepancies in a fast-paced fintech environment.

DataOps Developer

De Soto Technologies

Jan 2023 - Apr 2023 | Ahmedabad

Orchestrated data ingestion pipelines from Google Analytics into Redshift clusters, optimizing query performance to cut reporting latency by 60% for marketing intelligence. Integrated and harmonized 3+ disparate external data sources into unified dashboards, providing actionable insights that drove a 25% uplift in campaign ROI for cross-functional stakeholders.

L2 Data Engineer

SalescodeAI

Jul 2021 - Jan 2023 | Gurgaon

Optimized complex SQL queries and streaming pipelines for real-time dashboards, enhancing query throughput by 30% and supporting high-availability analytics for sales operations. Led proactive monitoring and tuning of AWS infrastructure, Kafka streams, and log analytics, reducing system downtime by 35% through predictive alerting and root-cause analysis.

Education

Kanpur Institute of Technology, Kanpur

B.Tech

Computer Science and Engineering

Jan 2017 - Jan 2021 | Grade: 7.3/10

Skills

Python
Pandas
NumPy
Scikit-learn
SQL
PySpark
Scala
Apache Spark
Delta Lake
Databricks
Kafka
Airflow
Azure
ADF
Synapse
Azure Storage
AWS
EC2
S3
EMR
Redshift
PostgreSQL
SQL Server
MongoDB
Elasticsearch
Snowflake
Data Lineage Tools
CI/CD
Azure DevOps
Query Optimization
Power BI
Tableau
Advanced Excel
Git
Docker
Postman