Arvind Kale

@arvindkale

Data Engineer at Edelweiss (Zuno) General Insurance Limited

Mumbai, India

Edelweiss (Zuno) General Insurance Limited, Alma Better

Data Engineer with 2+ years of experience building large-scale data pipelines, ETL processes, and data warehouse solutions. Uses Python, SQL, and Spark to develop scalable, multi-terabyte big data solutions.

Experience

Data Engineer

Edelweiss (Zuno) General Insurance Limited

Nov 2022 - Present

Mumbai, India

• Led the design, development, and optimization of data pipelines in AWS Glue.
• Developed an automated pipeline with PL/pgSQL stored procedures for finance reports, reducing turnaround time (TAT) by 80%.
• Integrated and processed insurance data from multiple sources, including AWS, Azure, GCP, SAP HANA, and Oracle, handling file formats such as JSON, CSV, and Parquet.
• Implemented Spark optimization techniques such as caching, multithreading, and broadcast joins, resulting in a 20% decrease in processing time for a daily load of around 1 million records.
• Created an API service in Python to generate dynamic DAGs in Apache Airflow.
• Designed and implemented advanced scheduling in Airflow for data pipeline orchestration, reducing manual intervention time by 80% and streamlining workflows.
• Delivered a project to migrate legacy on-premise processes to the cloud using Spark, reducing processing time by 20%.
• Conducted in-depth data analysis using Python, Spark, and Spark SQL, providing SIT/UAT fixes and ensuring smooth operations in the production environment.
• Improved job run times by 20% through Spark performance tuning while managing a 15 TB dataset containing approximately 12 billion records.
• Used AWS Lambda for event-driven processing and CloudWatch for monitoring and logging job details.
• Built a data ingestion pipeline to load flat files into the data lake and queried infrequently accessed data using Athena.

Education

Alma Better

Bachelor of Technology

Bengaluru

PGP - Data Science and Engineering

Data Science and Engineering

Skills

Python
SQL
Spark
PySpark
Spark SQL
YARN
Hadoop
Hive
ADF
Databricks
ADLS
Amazon S3
AWS Glue
AWS EMR
Data Modelling
ETL/ELT Data Pipelines
Airflow
Cron jobs
Metabase
AWS
Azure
GCP
SAP HANA
Oracle
JSON
CSV
Parquet
AWS Lambda
CloudWatch
Athena