Anshul Malhotra
@anshul_malhotra
Data Engineer
Gurugram, Haryana, India
Anshul Malhotra is a Data Engineer with over 5 years of experience working in the telecom and travel domains. He has a strong technical foundation, utilizing tools and technologies including Java, SQL, Spark, Hadoop, and various cloud services. He is skilled in designing and implementing scalable data pipelines, microservice architecture, and working with platforms like Kubernetes and Docker.
Experience
Data Engineer
TEKSystems
Implemented data pipelines to process customer feedback from 8+ ingestion sources. Implemented Spark batch applications to process Parquet, Avro, and JSON files from S3 and persisted the enriched results to Hive. Implemented and scheduled data pipelines with Airflow. Added a metrics-capturing mechanism to the codebases to push metrics to Datadog. Configured Datadog monitors on the captured metrics to alert the team to any issues in the data pipelines. Worked on the project's North Star vision to design a system that can receive feedback in real time. Documented component-level LLDs for building scalable, fault-tolerant systems. Implemented APIs using Spring Boot to receive events in real time and publish them to Kafka, and deployed the application on Kubernetes.
Module Lead
Rakuten Symphony
Developed complex Big Data projects by gathering, parsing, managing, analysing, interpreting, and visualizing large datasets to extract insights and convert them into actionable business decisions. Used Spark SQL over Hortonworks Hadoop YARN to perform analytics on data in HBase. Implemented efficient Big Data solutions by integrating multiple Big Data tools. Migrated Spark RDD jobs to Spark SQL jobs. Migrated cron schedules to NiFi-based scheduling. Implemented REST APIs using Spring Boot. Designed and implemented scalable applications for automating the bring-up process of virtual machines in private data centres. Directed software design and development across a multifaceted team to meet the product's needs for functionality, timeline, and performance. Reviewed requirements, specifications, and technical design documents to provide timely and meaningful feedback for sprint discussions. Adjusted design parameters to boost performance and incorporate new features.
Education
Asia Pacific Institute of Information Technology
B.E. – Computing
D.A.V. Public School Thermal, Panipat (Haryana)
Intermediate
D.A.V. Public School Thermal, Panipat (Haryana)
Matriculation
Licenses & Certifications
Apache Airflow Fundamentals
Astronomer
Apache Cassandra 3 Developer Associate Certification
DataStax
Introduction to Big Data
University of California, San Diego
Distributed Computing with Spark SQL
University of California, Davis
Python for Data Science
IBM
AWS Cloud Practitioner Essentials
AWS