Shivam Gupta

@shivam_gupta

Lead Data Engineer

Pune, Maharashtra

linkedin.com/in/shivamgupta911

Persistent Systems

IIT (BHU), Varanasi

Shivam Gupta is an experienced Software Engineer skilled in Java, Apache Spark, SQL, and data structures. He has a proven ability to solve complex problems and contribute to high-performance distributed applications. He is proficient in developing data processing modules and integrating various components into robust pipelines, with experience in cloud platforms like Azure Databricks.

Experience

Lead Data Engineer

Persistent Systems

Sep 2021 - Present

Worked with the architecture group to understand the design and identify bottlenecks and flaws, in order to build a high-quality, high-performance distributed application. Developed data processing modules in Java using the Spark RDD and Dataset APIs. Wrote wrapper shell scripts that integrate multiple Spark-Java and Hive components into a pipeline of sequential operations. Took part in migrating the on-premise application to Databricks. Loaded data from source systems into Azure Data Storage and processed it in Azure Databricks. Contributed to all stages of the release cycle: requirement gathering, analysis, prototyping, development, functional and non-functional testing, and production rollouts. Built, deployed, configured, and supported CI/CD data processing pipelines on a Spark cluster.

Persistent Systems

Jan 2020 - Sep 2021

Developed the application in Scala and Spark. Worked closely with the client's business and technical teams to understand the required functionality and propose innovative ways to implement it.

Deep Learning Intern

Tamkang University

Internship

May 2018 - Jul 2018

Taiwan

Applied deep learning to a construction management system.

Education

IIT (BHU), Varanasi

Bachelor of Technology – B.Tech

Chemical Engineering

Jan 2015 - Jan 2019

Skills

Java
Python
Shell Scripting
Spark
Hadoop
Hive
SQL
Elasticsearch
Azure Databricks
Azure Data Factory
Azure DevOps
Airflow
Git