Keshav Mishra

@keshavmishra

Data Engineer

Faridabad, India

ZS Associates Pvt. Ltd. • BPIT, GGSIP University

Keshav is a detail-oriented Data Engineer with approximately 2 years of experience. He possesses strong competencies in AWS, SQL, PySpark, ETL, and Data Warehousing. He is skilled in leveraging technical and management abilities to solve complex business problems and contribute to organizational growth.

Experience

Business Technology Solution Associate

ZS Associates Pvt. Ltd.

Jun 2021 - Present • Gurugram, Haryana

Involved in the end-to-end project lifecycle, from gathering business requirements and designing the schema and architecture to developing and deploying the final project. Key achievements include developing ETL pipelines using AWS, PySpark, and SQL for a large cloud data warehouse solution, and leading the migration from AWS RDS to Redshift, which resulted in $678,000 in annual cost savings. Designed and implemented a real-time data pipeline using PySpark to process semi-structured data from 30+ sources, and designed the data pipeline architecture for a new product scaling up to 125,000 daily active users.

Project

Leading Pharma Company (Client)

Feb 2021 - Present

Architected solutions for a Fortune 500 pharma client, integrating new data sources into AWS cloud infrastructure using Amazon S3, Redshift, and RDS. Created data pipelines and automated ETL processes using AWS Glue to feed the data warehouse. Designed the data model and deployed the entire project in AWS, creating a data warehouse (Amazon Redshift) and a data lake (queried through Amazon Athena) for approximately 30 TB of data.

Education

BPIT, GGSIP University

Bachelor of Technology

CSE

Aug 2017 - Jun 2021 • Grade: 8.5 GPA

D.A.V Public School, Sec 49 - C.B.S.E Board

12th Grade

Non-Medical

Apr 2015 - Mar 2016 • Grade: 86.8%

B.N. Public School, Faridabad - C.B.S.E Board

10th Grade

Apr 2013 - Mar 2014 • Grade: 8.8 CGPA

Skills

SQL Development
ETL
RDBMS
PostgreSQL
MySQL
AWS RDS
AWS Redshift
AWS S3
Python
Spark
Databricks
Data Warehousing
PySpark
SnapLogic
Data Architecture
AWS EC2
Problem Management
Incident Management
Unix
JavaScript
C++
OOP
JIRA
MS Excel
MS PowerPoint
ServiceNow
Exploratory Data Analysis
AWS Glue
MicroStrategy
Tableau