Shivam Mistry
@shivammistry
Data Engineer
Bengaluru
Data Engineer with over two years of experience building and optimizing big data pipelines. Proficient with cloud platforms (AWS and Azure) and big data technologies such as Spark, Hive, and Kafka. Experienced in implementing ETL processes with AWS Glue and Amazon Redshift, and in analyzing data with PySpark and Databricks.
Experience
Data Engineer
Kingston Info Solutions
AWS ETL pipeline for Holcim sales data integration and analysis: Ingested sales data from multiple sources (ERP, CRM, spreadsheets) into an S3 bucket. Transformed the data with AWS Glue (cleaning, enrichment, aggregation) for analytics and reporting. Loaded the transformed data into an Amazon Redshift data warehouse for efficient querying and reporting. Validated the data against quality standards and business rules. Automated the pipeline with AWS Lambda and monitored it with Amazon CloudWatch. Also analyzed S3 data using PySpark and Databricks, built visualizations and dashboards in Databricks notebooks, and worked with business stakeholders to understand their data analysis needs.
Big Data Trainee
Trendytech
Customer data analysis using Sqoop, Hive, Spark, and S3: extracted data from MySQL into HDFS with Sqoop, transformed it with Hive, and analyzed it with Spark SQL. Also set up a Hadoop-based ETL pipeline for TFS: used Sqoop to transfer data into the HDFS raw layer, defined schemas, created Hive tables, and performed data processing such as aggregations, joins, and filtering.
Education
Govt. College of Engineering, Amravati
B.Tech
Information Technology (I.T.)
Janata Mahavidyalya, Chandrapur
XII – CS
Computer Science
St. Michael’s English School, Chandrapur
X