Mukesh Sahu

@mukeshsahu

Data Engineer

Bangalore, Karnataka

https://in.linkedin.com/in/mukesh-kumar-sahu

Tata Consultancy Services

Silicon Institute of Technology

Mukesh has 2.4 years of experience in data engineering with Apache Spark, Apache Airflow, and Python libraries (NumPy, Pandas, SciPy). He has hands-on experience with tools such as Jupyter, Databricks, Jira, and Alation. His expertise includes optimizing data pipelines, managing data lakes, and applying in-depth knowledge of ETL processes, advanced analytics, and predictive modeling.

Experience

Data Engineer (Project)

Tata Consultancy Services

Project•Aug 2022 - Jun 2023•Bengaluru

Prepared annual universe estimates for the United States market. Designed schemas and datasets with a partitioning strategy that reduced ingestion and execution time by 60%. Used a Jira board and GitLab for CI/CD.

Data Engineer / Developer

Tata Consultancy Services

Full-time•Mar 2021 - Jun 2023•Bhubaneswar

Developed and executed end-to-end ETL workflows (DAGs) using Apache Airflow, Apache Spark, Pandas, NumPy, and SQL. Used GitLab for version control and CI/CD. Devised the workflow and schema for an intermediate data cache and table using dynamic and OOP concepts.

Data Engineer (Project)

Tata Consultancy Services

Project•Mar 2021 - Aug 2022•Bengaluru

Designed and built a product to automate Nielsen's weighting technique and prepare universe estimates. Built data pipelines (DAGs) in Apache Airflow with Apache Spark, Python, Pandas, NumPy, and SQL. Worked with an AWS data lake on S3.

Education

Silicon Institute of Technology

B.Tech.

Electrical and Electronics Engineering

Jan 2016 - Jan 2020•Grade: 7.82 CGPA

S.C.S Junior College

12th Grade

Science

Jan 2014 - Jan 2016•Grade: 71.5%

Licenses & Certifications

Python for Data Science

IBM

• No expiration

Data Analysis Using Python

IBM

• No expiration

Data Visualization Using Python

IBM

• No expiration

Python Project for Data Science

Coursera

• No expiration

Applied Data Science with Python - Level 2

IBM

• No expiration

Skills

Python

SQL

Apache Spark

Apache Airflow

Pandas

NumPy

Big Data

ETL

Data Pipeline Development

Data Modeling

Data Visualization

AWS S3

Jira

GitLab