Yash Jha

@yashjha

Big Data Engineer

Korba, Chhattisgarh, India

https://www.linkedin.com/in/yash-kumar-jha

Cognizant Technology Solutions

ITM University Gwalior

Yash Kumar Jha is a data engineer with over a year of experience building data solutions. He designs metadata pipelines for data reconciliation and builds ETL processes using PySpark, Scala, and Hive. He is proficient with Hadoop, HDFS, and MySQL, and focuses on turning data into actionable insights.

Experience

Programmer Analyst Trainee

Cognizant Technology Solutions

Project: Reconcile Any Two Data Sources [RADS]. Designed the metadata pipeline that automates data reconciliation. Ran PySpark jobs to parse JSON file requests received through SQL Server. Defined a function that uses the column mapping to compare the two data sources and report the records that match.

Programmer Analyst Trainee

Cognizant Technology Solutions

Project: Airline Revenue Analytics & Insights. Created a data pipeline to extract, transform, and load data from three source systems (passenger data, trips data, and aircraft data). Created the OI layer, transformed the data in the transform layer (including decrypting the birthday field with a Hive function), and stored the final results in the aggregate layer.

Data Analytics Intern

Suven Consultants & Technology Pvt. Ltd.

Created two projects on machine learning and data analysis (recognizing handwritten digits in Python using scikit-learn, and data analysis on meteorological data) and published them on Medium and Analytics Vidhya.

Education

ITM University Gwalior

B.Tech

Computer Science & Engineering

Grade: Gold Medalist

CGPA: 8.65

Kendriya Vidyalaya Kusmunda

Higher Secondary (Class 12th)

CBSE Board, Marks: 79%

Licenses & Certifications

Azure Fundamentals

Microsoft

• No expiration

Python for Data Science, AI & Development

IBM

• No expiration

Skills

SQL
Python
PySpark
Hadoop
Apache Spark
HDFS
Hive
Big Data
Microsoft Azure
Linux
Shell Scripting
MySQL
Scala
SparkSQL
Jupyter Notebook