Bhupinder Dangi
@bhupinderdangi
Senior Data Engineer at OLX Group
Gurgaon, India
Bhupinder is a highly motivated Data Engineer with over 6 years of experience in designing, developing, and delivering complex data solutions. He has in-depth knowledge of data manipulation techniques, computer programming, and integrating new software packages. He is skilled in managing ETL pipelines, cloud technologies (AWS, Azure), and big data tools such as Spark, Airflow, and Redshift.
Experience
Senior Data Engineer
OLX Group
Data Engineer
Renew Power Limited
Senior Technology Consultant
Virtusa
Designed and implemented the ETL pipeline framework. Built third-party API data ingestion pipelines into the Redshift cluster for sources such as marketing ads data [Facebook, Google, Taboola], Salesforce, K2, Gupshup, etc. Replicated production database tables and clickstream data from the OLX website into the Redshift cluster. Handled cluster maintenance, e.g. upgrades, identifying problematic queries, EC2 storage maintenance, and access management of services [Jenkins, Jupyter Lab & Redshift]. Managed performance monitoring and tuning while identifying and repairing issues within ETL processes. Technologies: EC2, S3, Kinesis, AWS Glue, Python, PySpark, Redshift, Jenkins, Jupyter Lab, GitLab, Docker, Airflow.
Gathered business requirements by collaborating with different domain experts. Developed and deployed pipeline code on a Databricks Spark cluster. Built Power BI dashboards for predictive maintenance of assets such as solar panels and wind turbines. Generated reports on ETL process failures. Created data pipelines for various solar and wind use cases to fetch data from sources such as the Synaptiq API, Bazefield API, shared folders, etc. Created a data lake on Azure Blob Storage for various domains, e.g. finance data and sensor data from solar inverters and wind turbines. Data was divided into three layers: Raw, Intermediate, and Logical. Technologies: Azure Blob Storage, Azure Synapse, Azure Analysis Services, Azure Data Factory, Databricks PySpark, Python, Logic Apps.
Client: RBS. Gathered client requirements and implemented the code solutions. Created a data lake by loading data from multiple sources into an S3 staging zone, where fresh data is compared with history, updated, and stored in data lake tables. Converted SAS code into PySpark. Tested and debugged the converted code and deployed it to the production cluster. Technologies: PySpark, Python, Unix, AWS EMR, Airflow, AWS Athena.
Big Data Developer
TCS
Client: Apple Inc. Modified existing software systems to enhance performance and add new features, e.g. Hive to Spark migration. Demonstrated leadership by improving work processes and helping to train others [Spark trainings]. Gathered business requirements and implemented solutions end to end. Deployed the framework on testing and production servers. Performed data analysis using Hive, MapReduce, and Spark on multiple projects, such as retargeting and blacklist customer feeds based on App Store purchase history, in-app purchase recency, user vector generation for improving App Store search, and auto phrase completion using cosine similarity. Technologies: Spark, Spark Streaming, Machine Learning, Kafka, MapReduce, Hive, Oozie, Autosys, Java, Python, Unix.
Education
IIT Kanpur
M.Tech
Materials Science And Engineering
Punjab Engineering College
B.E
Metallurgical Engineering