Anukool Tiwari

@Anukool

Data Engineer at Modak Analytics

India

Modak Analytics • GL Bajaj Institute of Technology and Management

Data Engineer with nearly 2 years of experience specializing in the Azure Data Stack. Expert in designing and optimizing scalable data pipelines using PySpark, SQL, Azure Databricks, and Azure Data Factory. Strong experience building end-to-end ETL/ELT pipelines, implementing Delta Lake architectures, and applying CDC (Change Data Capture) for incremental data processing.

Experience

Data Engineer

Modak Analytics

Mar 2025 - Present • Hyderabad, India

Modernized Architecture (ADF + Databricks) | Client: Humana. Developed and orchestrated ADF pipelines to ingest data into Azure Data Lake Storage (ADLS Gen2) following bronze-silver-gold layering. Architected and engineered end-to-end curation pipelines using PySpark, Delta Lake, and Azure Databricks, achieving a 40% data quality gain. Utilized Azure Databricks notebooks to perform complex data transformations, validations, and aggregations. Processed and managed 10M+ records daily using incremental Change Data Capture (CDC) techniques. Led performance optimization initiatives in Databricks by tuning Spark configurations, joins, and transformations, achieving a 50% improvement in data processing speed. Reduced overall pipeline execution time by 30% and enhanced data accuracy and consistency by 40%. Leveraged ADLS Gen2 for staging and intermediate storage, resulting in a 25% reduction in storage costs. Optimized PySpark jobs in Azure Databricks using partitioning, caching, and efficient Spark execution strategies.

Data Engineer

Modak Analytics

Apr 2024 - Mar 2025 • Hyderabad, India

StreamSets-based Ingestion | Client: Humana. Designed and implemented ETL pipelines using StreamSets to ingest data from Genesys REST APIs, ensuring reliable batch ingestion. Stored raw ingested data in Google Cloud Storage (GCS) and maintained MongoDB audit collections for ingestion tracking and reconciliation.

Education

GL Bajaj Institute of Technology and Management

Master of Computer Applications (MCA)

Computer Applications

Jan 2022 - Jan 2024 • Grade: 7.9/10 CGPA

Licenses & Certifications

Microsoft Azure Fundamentals (AZ-900)

Microsoft

No expiration

Skills

Azure Databricks

Azure Data Factory (ADF)

Azure Synapse Analytics

ADLS Gen2

Google Cloud Storage

SQL

Python

PySpark

Apache Spark

Delta Lake

Hadoop

HDFS

StreamSets

ETL/ELT

REST APIs

CDC

MongoDB

Delta Tables

Parquet

JSON

Data Warehousing

Incremental Processing

Performance Tuning

Partitioning

Azure DevOps

CI/CD