Ritwik Raj

@ritwik_raj

Senior Data Engineer - II at MakeMyTrip

Bangalore, Karnataka

MakeMyTrip

B.M.S. Institute of Technology

Ritwik is a Data Engineer and LLM Specialist with over 5 years of experience in designing, optimizing, and deploying large-scale data systems and AI-driven solutions. He specializes in building scalable pipelines using Spark, Kafka, and cloud technologies (AWS). His expertise includes Generative AI, RAG, and optimizing petabyte-scale data environments to drive business impact.

Experience

Senior Data Engineer - II

MakeMyTrip

Present

Bangalore, Karnataka

Created a real-time stream processing system using Kafka and Spark Streaming, optimizing payment gateways by reducing latency and minimizing failures, increasing the payment success rate from 80% to 85%.

Developed an end-to-end synthetic data generation pipeline leveraging LLMs, where the model dynamically generates Python code, reducing manual dataset creation time and improving data quality.

Built a scalable data pipeline from scratch using PySpark, SQL, Python, and FastAPI, powering MakeMyTrip’s dynamic flight discount system, leading to a 5% increase in click share through real-time pricing optimization.

Created a high-performance Python library for efficient interaction with Aerospike, optimizing large-scale data storage and retrieval, reducing query latency, improving system throughput, and cutting developer effort by 50%.

Leveraged AWS services (DMS, S3, EMR, Glue, Athena, Redshift) to streamline big data processing, improving cost efficiency and performance of large-scale data environments.

Software Engineer - Data Platform

Ola Cabs

Bangalore, Karnataka

Built microservices to transfer real-time data from Kafka and MySQL to a data lake (S3), deployed on Kubernetes for scalable and efficient data storage.

Led a proof of concept (POC) to deploy Apache Pinot and Trino on a Kubernetes cluster, enabling sub-second query performance on high-throughput Kafka topic data and reducing analytics latency by 50%.

Developed an abstract library/API for seamless data ingestion into central Kafka topics, standardizing data flows and improving pipeline efficiency, reducing developer integration efforts by 50% and accelerating deployment timelines.

Cloud Data Engineer

Amazon Web Services

Bangalore, Karnataka

Developed an EMR debugging tool to efficiently analyze and troubleshoot jobs running on EMR clusters, making debugging 3x faster and improving system reliability.

Designed and implemented an end-to-end data pipeline to transfer data from relational databases to a data lake with upsert support using Apache Iceberg, enhancing data ingestion efficiency and ensuring scalable data management.

Experienced with AWS Big Data and Analytics services such as S3, Glue, Redshift, AWS DMS, EMR, Athena, and QuickSight, leveraging them for scalable data processing and visualization.

Education

B.M.S. Institute of Technology

B.E. (CSE)

Computer Science

Completed courses in computer science with a specialization in Big Data and Engineering.

Licenses & Certifications

AWS Certified Solutions Architect - Associate

AWS

• No expiration

Skills

Spark
Python
SQL
Hadoop
Kafka
Java
Hive
Spark Streaming
Kubernetes
Docker
Machine Learning
Large Language Models
Amazon Web Services
Software Development Process
OpenAI