Abhishek Guleria

@abhishekguleria

Senior Data Engineer at The Smart Cube

Gurugram, Haryana

The Smart Cube
Lovely Professional University

Abhishek Guleria is an experienced engineer with a demonstrated history in the information technology and services industry. He possesses expertise in Big Data ecosystems, Data Science, and business enhancements. His technical skills include proficiency in Python, SQL, and various big data tools like Hadoop, Spark, and Databricks.

Experience

Senior Data Engineer

The Smart Cube

Nov 2021 - Present

Built an invoice dataset so that the client could tailor offers to its customers. Analyzed data stored in Google BigQuery and third-party sources, covering customer information, purchase history, subscription plans, and sales data. The client also wanted an MRR (monthly recurring revenue) report based on the invoice data. Migrated data from SQLite to PostgreSQL and automated the migration to run monthly.

Data Engineer

Home Credit

Jul 2021 - Oct 2021

Created SQL scripts for the CRM team.

Data Engineer

Wipro Ltd

Aug 2019 - Jul 2021

Extracted data from various sources, then loaded, cleansed, and transformed it. Built data lake datasets by ingesting, cleansing, and processing data into canonical models. Developed Spark code in PySpark using Spark SQL and DataFrames. Used Python and SQL to design, test, document, deploy, and maintain programs that met individual client requirements in accordance with SDLC standards and guidelines. Tuned the performance of Spark queries and implemented data quality checks. Cleaned, manipulated, modified, and combined data using a variety of steps and functions. Worked closely with business and data science teams to encourage statistical best practices in experimental design, data capture, and data analysis.

Education

Lovely Professional University

Bachelor of Technology

Computer Science and Engineering

Aug 2015 - Jul 2019

Skills

Python
SQL
PySpark
Hadoop
Sqoop
Hive
Apache Spark
Databricks
Airflow
Data Cleaning
Exploratory Data Analysis (EDA)
Data Visualization
Linear Regression
L1/L2 Regression
Logistic Regression
Naïve Bayes Classifier
Decision Trees
Random Forests
AdaBoost
XGBoost
PostgreSQL
Oracle
MySQL
Cassandra