MAHESH PADEKAR

@maheshpadekar

Big Data Engineer

Thane, India

https://www.linkedin.com/in/mahesh-padekar-2890b5127/

SpiderWeb InfoTech PVT. LTD, Mumbai University

Mahesh Padekar is a Big Data Engineer with over 2 years of experience in the Hadoop ecosystem, including HDFS, Spark, Hive, and MapReduce. He specializes in loading large datasets from RDBMS sources into HDFS with Sqoop and transforming them with Spark Core and Spark SQL. His expertise also covers AWS services such as EMR, Glue, and S3, as well as performance tuning of data warehousing solutions.

Experience

Big Data Developer

SpiderWeb InfoTech PVT. LTD

Mar 2020 - Present

Maintained day-to-day data in a Hive data warehouse fed by multiple source systems. Wrote Sqoop jobs to transfer data from SQL Server and Oracle databases to HDFS, and scheduled them using Airflow. Loaded processed data into Hive external tables, applying partitioning and bucketing with the ORC and Parquet formats to improve query performance. Monitored the SQL database to confirm that daily ingestion jobs succeeded, and debugged failures. Ran an analytics pipeline that aggregates and joins multiple daily ingestion tables, then scheduled jobs to transfer the final output into Redshift for reporting. Created cluster groups in EMR for running the daily ingestion and analytics pipeline jobs.

Education

Mumbai University

B.E.

Mechanical

Jan 2015

Licenses & Certifications

Big Data

No expiration

Skills

Hadoop
MapReduce
Linux
Hive
Sqoop
HBase
Spark
Scala
HDFS
SparkSQL
YARN
Oozie
AWS Glue
AWS Athena
AWS S3
Python
Boto3
Airflow
Redshift
ORC
Parquet