
MANTESH CHOUGULE

@manteshchougule

Big Data Engineer and Data Scientist

Pune, Maharashtra, India

Net Connect Global Pvt Ltd (Client IBM) · Simplilearn

Mantesh Chougule is a skilled Big Data Engineer and Data Scientist with experience in monitoring and maintaining Hadoop and Kafka environments. He possesses in-depth knowledge of the Hadoop ecosystem, including Spark, Hive, and MapReduce. He has also applied data science techniques, building predictive models with machine learning algorithms and developing dashboards in Tableau.

Experience

Hadoop and Kafka Admin (Big Data Engineer)

Net Connect Global Pvt Ltd (Client IBM)

Present

• Monitoring and maintaining jobs on IBM Workload Scheduler (Tivoli); rerunning, cancelling, and holding jobs as per requirements.
• Monitoring jobs through logs in the Hadoop environment.
• In-depth knowledge of Hadoop and all components of its ecosystem, e.g. Kafka, ZooKeeper, HBase, YARN, MapReduce, Sqoop, Flume, Spark, Hive, Impala, Cloudera, Cognos, and Oracle Database.
• Monitoring and maintaining Kafka stream jobs and connectors: killing and rerunning them on failure, and checking file movements in the landing, archiving, and source directories.
• Monitoring Kafka batch loads into staging and fact tables; if data goes missing, checking logs for the cause and solution.
• Informing developers about reject counts and reloading the data into staging.
• Finding Hive and Spark errors in the shell when jobs show errors in Tivoli, and taking the appropriate action for each error type.
• Killing jobs manually in the shell and rescheduling jobs in crontab when required.
• Complete ELT: moving files from source to target directories.
• SFTP of files after ELT to the client location using the scp command.
• Removing and creating directories in HDFS using Hadoop and Linux commands.
• Fact counts and cross-validation in Excel of total files received versus total files loaded on target.
• Filling in the failure-tracker report and recording actions taken.
• Monitoring the reconciliation report and reporting file-count differences, if any, to managers.
• Preparing hourly status reports for Tivoli jobs as well as Kafka streams and batches.
• Running Hive queries in the Hive shell to inspect fact and staging tables and to track running and completed stream jobs.
• In-depth knowledge of Spark and its ecosystem and components, e.g. Kafka/Spark/Hadoop integration, ETL vs. Kafka/Spark streaming, ingestion, Spark SQL, vCore and executor requirements, and complete YARN operations.
• Running jobs in the background through a PuTTY client shell and reading their updated logs.
• Killing failed or long-running batches and individual jobs, and rerunning them.
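The connector-recovery duty above (restarting failed Kafka connectors and their tasks) can be sketched in Python against the standard Kafka Connect REST API (`GET /connectors/{name}/status`, `POST /connectors/{name}/tasks/{id}/restart`). This is a minimal illustration, not the actual tooling used on the project; the host/port and connector name are placeholders.

```python
import json
import urllib.request

# Placeholder Kafka Connect worker address (assumption, not from the profile).
CONNECT_URL = "http://localhost:8083"

def failed_task_ids(status: dict) -> list:
    """Return the ids of tasks reported as FAILED in a connector status payload.

    The payload shape follows the Kafka Connect REST API:
    {"name": ..., "connector": {...}, "tasks": [{"id": 0, "state": "RUNNING"}, ...]}
    """
    return [t["id"] for t in status.get("tasks", []) if t.get("state") == "FAILED"]

def restart_failed_tasks(connector: str) -> list:
    """Fetch a connector's status and restart any FAILED tasks via the REST API."""
    with urllib.request.urlopen(f"{CONNECT_URL}/connectors/{connector}/status") as resp:
        status = json.load(resp)
    failed = failed_task_ids(status)
    for task_id in failed:
        req = urllib.request.Request(
            f"{CONNECT_URL}/connectors/{connector}/tasks/{task_id}/restart",
            method="POST",
        )
        urllib.request.urlopen(req)
    return failed
```

In practice this kind of check would be run on a schedule (e.g. from cron, as mentioned above) so that failed connector tasks are restarted without manual intervention.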

Education

IEI India

B.Tech

Mechanical Engineering

Jan 2020

Simplilearn

Master Program

Data Science and Business Analytics

Jan 2021

Licenses & Certifications

Master's Program - Data Science & Business Analytics

Simplilearn, certified in collaboration with IBM

Issued: Jan 2021 · No expiration

Skills

Python
MySQL
Scala
Linux
Tableau
Power BI
Excel
IBM Cognos
ANN
CNN
Boltzmann Machines
Self-Organizing Maps (SOM)
Autoencoders
Spark MLlib
Seaborn
Matplotlib
Imputers
NLP Dsk Lib
Pandas
SciPy
Scikit-Learn
ML Learn
TensorFlow
Keras
PyTorch
Sqoop
Flume
Kafka
Hadoop
Spark
PySpark
Databricks
AWS
Quick Learner
Innovator
Problem Solver