MANTESH CHOUGULE
@manteshchougule
Big Data Engineer and Data Scientist
Pune, Maharashtra, India
Mantesh Chougule is a skilled Big Data Engineer and Data Scientist with experience in monitoring and maintaining Hadoop and Kafka environments. He possesses in-depth knowledge of the Hadoop ecosystem, including Spark, Hive, and MapReduce. He has also applied data science techniques, building predictive models with machine learning algorithms and developing dashboards in Tableau.
Experience
HADOOP AND KAFKA ADMIN (Big Data Engineer)
Net Connect Global Pvt Ltd (Client IBM)
- Monitoring and maintaining jobs on IBM Workload Scheduler (Tivoli); rerunning, cancelling, and holding jobs as per requirements.
- Monitoring jobs through logs in the Hadoop environment.
- In-depth knowledge of Hadoop and all components of the Hadoop ecosystem, e.g. Kafka, ZooKeeper, HBase, YARN, MapReduce, Sqoop, Flume, Spark, Hive, Impala, Cloudera, Cognos, and Oracle Database.
- Monitoring and maintaining Kafka stream jobs and connectors; killing and rerunning them on failure and checking file movements in the landing, archiving, and source directories.
- Monitoring Kafka batches for staging and fact table conversion; checking logs for a resolution if a batch disappears.
- Informing developers about reject counts and reloading the data into staging.
- Finding Hive and Spark errors in the shell when jobs show errors in Tivoli, and taking the appropriate action for each error.
- Killing jobs manually in the shell if required and rescheduling jobs in crontab when needed.
- Complete ELT: moving files from source to target directories.
- SFTP of files to the client location after ELT using the scp command.
- Removing and creating directories in HDFS using Hadoop and Linux commands.
- Fact count and cross-validation in Excel of total files received against total files loaded on target.
- Filling the failure tracker report and recording actions taken.
- Monitoring the reconciliation report and reporting any file count differences to managers.
- Preparing hourly status for Tivoli jobs as well as Kafka streams and batches.
- Running Hive queries in the Hive shell to check fact and staging tables and to track running and completed stream jobs.
- In-depth knowledge of Spark and its ecosystem and components, e.g. Kafka/Spark/Hadoop integration, ETL vs. Kafka/Spark streaming, ingestion, Spark SQL, vCore and executor requirements, and complete YARN operations.
- Running jobs in the background through the PuTTY client shell and reading their updated logs.
- Killing particular batches and jobs and rerunning them if they fail or run long.
Education
IEI India
B.Tech
Mechanical Engineering
Simplilearn
Master's Program
Data Science and Business Analytics
Licenses & Certifications
Master's Program - Data Science & Business Analytics
Simplilearn, certified in collaboration with IBM