Sulbha Goyal is an experienced Data Engineer specializing in building robust data pipelines using big data technologies. She possesses deep expertise in Apache Spark, Kafka, and Hadoop, complemented by proficiency in AWS services such as EMR, Glue, and S3. Her background includes developing complex real-time and batch ETL jobs, optimizing data flow, and migrating large-scale data systems across multiple industries.
Experience
Senior Data Engineer
MAKE MY TRIP
Developed batch and real-time pipelines to aggregate data from multiple sources. Built data migration utilities to replace Sqoop and created Airflow DAGs for scheduling Spark ETL jobs. Built real-time pipelines aggregating 15+ streams using Spark Streaming and Kafka, integrating Aerospike for state management. Developed ETL jobs to compute business metrics (ATV, GMV, GR, SP) and built dashboards using AWS Glue and Athena.
Data Engineer
HPE InfoSight
Developed data pipelines to aggregate and transform incoming sensor data using big data technologies. Owned maintenance of multiple pipelines and provided technical solutions for the Big Data layer. Developed REST APIs for application integration.
Senior System Engineer
INFOSYS LTD.
Implemented frameworks for importing existing Hive tables and applying custom UDFs. Developed frameworks for complex data operations (joins, aggregations) using Spark or Hive, and for saving query results to HDFS. Implemented a framework to import data from HDFS into user databases.
System Engineer
INFOSYS LTD.
Developed a data ingestion framework to import data from JDBC and non-JDBC sources (MySQL, Oracle, SFTP, FTP) into HDFS and Hive. Implemented data transformations in Scala (e.g., CSV to Parquet, JSON to Parquet) using Apache Spark. Worked on Hive SQL transformations and handled data type compatibility issues.
Intern
INFOSYS LTD.
Trained in object-oriented programming (Java), DBMS, and software engineering, covering various phases of the SDLC, including requirement elicitation and deployment.
Education
Punjabi University
BTech in CS
Licenses & Certifications
Functional Programming Principles in Scala
Coursera
Apache Spark 2.0 with Scala - Hands On with Big Data!
Udemy