Akash Prajapati
@akashprajapati
Data Engineer
Bangalore, India
Akash is an experienced Data Engineer proficient in Big Data components and ecosystems, including Spark, Python, Scala, Hive, and HDFS. He has extensive experience processing terabytes of data from various sources, optimizing performance in Hive scripts, and developing robust ETL pipelines. His skills include working with cloud technologies, requirement analysis, and utilizing tools like Airflow.
Experience
Data Engineer II, KPI PARTNERS
KPI PARTNERS PVT LTD.
Developed complete end-to-end data pipelines using Python. Worked on the CI/CD pipeline for ETL. Optimized a Spark job, improving the efficiency and speed of the task by 50%. Imported data from multiple APIs into the source bucket, transformed it using PySpark and Spark SQL, and pushed the processed data. Wrote Hive DDLs to create Hive tables optimized for query performance. Implemented Spark jobs in Python using the DataFrame and Spark SQL APIs for faster data processing. Defined the test cases along with the test data. Worked on Spark performance tuning. Designed external Hive tables and defined static and dynamic partitions as required for optimized performance on production datasets. Used Airflow DAGs and operators to run and schedule jobs based on requirements, and monitored, scheduled, and ran Airflow DAGs and tasks.
Data Engineer, BOSCH
BOSCH (Robert Bosch Engineering and Business Solutions)
As part of the Thyssenkrupp Elevator project team, worked on Spark Scala ETL services that let users view and keep track of manufacturing units, a key factor in assessing the performance of a plant. Ingested daily data in batches from the Oracle database into HDFS using Sqoop, developing a fault-tolerant system for data tracking. Performed data cleaning to verify the data, applied transformations, and tested the functions that verify sensor accuracy in Spark. Stored the processed data and queried it with Spark SQL for further analysis. Provided business insights on the processed data using Hive.
Education
Lovely Professional University
B.TECH in Computer Science and Engineering
Computer Science and Engineering (Minor - Big Data and Analytics)
Licenses & Certifications
RBEI-NIPUN DATA
RBEI