Saurabh Dhawale

@saurabhdhawale

Data Engineer at NeoSoft Systems and Cloud Services

Pune, India

linkedin.com/in/saurabh-dhawale-195639313

NeoSoft Systems and Cloud Services • University of Pune

Saurabh is a Data Engineer with over 3 years of experience designing, implementing, and supporting big data applications using Apache Spark, Hadoop, and AWS. He has strong expertise in Spark Core, Spark SQL, and streaming, and is proficient in Python and complex SQL queries. His experience spans the full project lifecycle, including data modeling, ETL processes, and performance optimization across AWS services such as S3 and Redshift.

Experience

Data Engineer

NeoSoft Systems and Cloud Services

Oct 2021 - Present

Developed solutions based on customer requirements. Key projects:

1) Data Warehouse Analysis (Banking Domain): PySpark data extraction, Hive table creation, Spark and Hive optimization, Airflow automation, and handling of change data capture (CDC) and slowly changing dimensions (SCD).
2) Enterprise Data Hub (Telecommunication Domain): Medallion architecture (Bronze, Silver, Gold) built with PySpark and AWS Glue ETL. Managed job scheduling with AWS Glue and delivered reports using AWS S3.

Tools used: Hadoop, Spark, Hive, AWS RDS, EMR, S3, Python, AWS Glue, and Athena.

Education

University of Pune

Master of Business Administration

MBA

Jan 2020 • Grade: First Class

Skills

Hadoop
MapReduce
HDFS
HBase
Hive
PySpark
Sqoop
Spark
Spark API
Data Modelling
Data Ingestion
Data Cleaning
Data Migration
Data Mining
Data Lake
Python
Oracle
PostgreSQL
MySQL
AWS
EMR
S3
EC2
IAM
RDS
Athena
Redshift
Glue
Amazon CloudWatch
Shell Scripting
SQL
HQL
Visual Studio Code
PyCharm
Azure Databricks
Git
Git Bash
Airflow
Crontab