Gunjan Surolia is an experienced Data Engineer skilled in building robust data pipelines and data lake architectures. Proficient in Python and various AWS cloud services (S3, Glue, Lambda, Athena), they have expertise in ETL processes, NLP, and handling unstructured data. They possess strong skills in Big Data technologies, including Spark, Hadoop, and implementing complex analytical solutions.
Experience
Data Engineer
Anchanto
Contributing to the design and development of the product "digitalshelf", a tool to forecast price trends versus the competition and to track the impact on sales drivers and buyer engagement. Built automated data pipelines on AWS to drive the ETL and provide data to the backend team for building APIs. OMS incremental load: created Python scripts to incrementally sync production data into the read replicas of an AWS RDS instance.
Software Engineer - Data Engineering
Infoobjects Inc.
Designed and developed a data lake architecture from scratch, making the data readily available to a broad set of analytics engines for predictive analytics, artificial intelligence, and machine learning. Implemented the ETL processes using AWS cloud services such as S3, Lambda, Glue, Athena, and SQS. Worked with legal data from US courts and extracted information from highly unstructured data using NLP and data preprocessing techniques in Python.
Software Development Engineer - Backend
Infoobjects Inc.
Designed and developed a Scheduler, a tool to author, schedule, and monitor workflows and data pipelines, for the ethos platform at FIS. Built the backend layer and data layer from scratch. Created GraphQL APIs using the Graphene (Python + GraphQL) framework with SQLAlchemy and a Python backend. Gained experience in fast-paced Agile software development.
Education
Birla Institute of Technology, Mesra
BE
Computer Science
Licenses & Certifications
Big Data Engineering Masters Program
Trendytech