
Bhupinder Dangi

@bhupinderdangi

Senior Data Engineer at OLX Group

Gurgaon, India


Bhupinder is a highly motivated Data Engineer with over 6 years of experience designing, developing, and delivering complex data solutions. He has in-depth knowledge of data manipulation techniques and computer programming, and experience integrating new software packages. He is skilled in managing ETL pipelines, cloud technologies (AWS, Azure), and big data tools such as Spark, Airflow, and Redshift.

Experience

Senior Data Engineer

OLX Group

Full-time•Oct 2021 - Present•Gurgaon, India

- Designed and implemented an ETL pipeline framework.
- Built third-party API data ingestion pipelines into the Redshift cluster, covering marketing ads data (Facebook, Google, Taboola), Salesforce, K2, Gupshup, etc.
- Replicated production database tables and clickstream data from the OLX website into the Redshift cluster.
- Performed cluster maintenance, e.g. upgrades, identifying problematic queries, EC2 storage maintenance, and access management for services (Jenkins, JupyterLab, Redshift).
- Managed performance monitoring and tuning, identifying and repairing issues within ETL processes.
- Technologies: EC2, S3, Kinesis, AWS Glue, Python, PySpark, Redshift, Jenkins, JupyterLab, GitLab, Docker, Airflow.

Data Engineer

Renew Power Limited

Full-time•Apr 2020 - Oct 2021•Gurgaon, India

- Gathered business requirements by collaborating with domain experts.
- Developed and deployed pipeline code on a Databricks Spark cluster.
- Built Power BI dashboards for predictive maintenance of assets such as solar panels and wind turbines; generated reports on ETL process failures.
- Created data pipelines for various solar and wind use cases, fetching data from sources such as the Synaptiq API, Bazefield API, and shared folders.
- Built a data lake on Azure Blob Storage for multiple domains (e.g. finance data, sensor data from solar inverters and wind turbines), with the data organized into three layers: Raw, Intermediate, and Logical.
- Technologies: Azure Blob Storage, Azure Synapse, Azure Analysis Services, Azure Data Factory, Databricks PySpark, Python, Logic Apps.

Senior Technology Consultant

Virtusa

Full-time•Jun 2019 - Mar 2020•Gurgaon, India

- Client: RBS. Gathered client requirements and implemented code solutions.
- Created a data lake from multiple sources: data landed in an S3 staging zone, where fresh data was compared with history, updated, and stored in data lake tables.
- Converted SAS code into PySpark; tested and debugged the converted code and deployed it to the production cluster.
- Technologies: PySpark, Python, Unix, AWS EMR, Airflow, AWS Athena.
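The staging-zone merge described above (fresh data compared with history, then updated and stored in data lake tables) is a standard upsert pattern. In production this was done with PySpark on AWS EMR; the plain-Python sketch below is only an illustration of the logic, with hypothetical field names:

```python
def upsert(history, fresh, key="id"):
    """Merge fresh staging records into history:
    update rows whose key already exists, insert new ones,
    and keep untouched history rows as-is."""
    merged = {row[key]: row for row in history}  # index history by key
    for row in fresh:
        # new values win over old ones for matching keys
        merged[row[key]] = {**merged.get(row[key], {}), **row}
    return list(merged.values())

# hypothetical records for illustration
history = [{"id": 1, "status": "active"}, {"id": 2, "status": "churned"}]
fresh = [{"id": 2, "status": "active"}, {"id": 3, "status": "new"}]
print(upsert(history, fresh))
```

In PySpark the same effect is typically achieved by joining the staging DataFrame against the history table on the key and coalescing columns, or with a Delta Lake `MERGE`.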

Big Data Developer

TCS

Full-time•Aug 2016 - Jun 2019•Noida, India

Client: Apple Inc.

- Modified existing software systems to enhance performance and add new features, e.g. Hive-to-Spark migration.
- Demonstrated leadership by improving work processes and helping to train others (Spark trainings).
- Gathered business requirements and implemented them end to end; deployed the framework to testing and production servers.
- Performed data analysis using Hive, MapReduce, and Spark on multiple projects, such as retargeting and blacklist customer feeds based on App Store purchase history, in-app purchase recency, user vector generation to improve App Store search, and auto phrase completion using cosine similarity.
- Technologies: Spark, Spark Streaming, Machine Learning, Kafka, MapReduce, Hive, Oozie, Autosys, Java, Python, Unix.
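The auto-phrase-completion project mentioned above is not described in detail; a minimal sketch of the cosine-similarity idea, assuming character-bigram count vectors as features (the actual system presumably ran on Spark at scale), might look like this:

```python
from collections import Counter
from math import sqrt

def ngrams(text, n=2):
    """Character bigram counts for a phrase (simplified feature vector)."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def complete(prefix, candidates, top_k=3):
    """Rank candidate phrases by cosine similarity to the typed prefix."""
    pv = ngrams(prefix)
    ranked = sorted(candidates, key=lambda c: cosine(pv, ngrams(c)), reverse=True)
    return ranked[:top_k]

# hypothetical candidate phrases for illustration
phrases = ["photo editor", "photo collage maker", "video editor", "podcast player"]
print(complete("photo ed", phrases))  # "photo editor" ranks first
```

In a distributed setting the candidate vectors would be precomputed once and the ranking expressed as a Spark job rather than an in-memory sort.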

Education

IIT-KANPUR

M.Tech

Materials Science And Engineering

Jun 2016

Punjab Engineering College

B.E

Metallurgical Engineering

Jun 2013

Skills

Python
Spark
SQL
Unix
Cloud
AWS
Azure
Git
Docker
AWS-EMR
Hive
Java (Core)
Spark Streaming
Airflow
Redshift
Jenkins
AWS Glue
PySpark
Azure Synapse
Azure Data Factory
Kinesis
JupyterLab
Power BI
S3
Azure Blob Storage