Rijul Singh Malik

@rijulsinghmalik

Data Scientist at Mpect

New Delhi

https://linkedin.com/in/rijulsinghmalik/

Mpect

University of California, Irvine

Rijul Singh Malik is a Data Scientist with a Master of Data Science from the University of California, Irvine. He has professional experience at Mpect, Ernst & Young, and Quest Diagnostics, specializing in machine learning, LLMs, and ETL optimization. He is skilled in Python, Spark, and building end-to-end data solutions to drive business revenue and efficiency.

Experience

Data Scientist

Mpect

Feb 2023 - May 2024

Suwanee, GA

Spearheaded a transformation initiative, training a team of 20 on Agile methodologies and tools, resulting in a 15% increase in project delivery speed and a 20% reduction in defects. Enhanced operational efficiency by integrating Spark scripts for real-time analytics; optimized ETL processes using Hadoop and PySpark, resulting in a 30% increase in revenue. Built an end-to-end user chatbot using LLMs such as GPT, integrating RAG to retrieve knowledge-base content relevant to the user's intent, and evaluated the model using precision, recall, and F1 score.

Project Ambassador

UC Irvine

Oct 2021 - Jan 2023

Irvine, CA

Implemented CI/CD pipelines to streamline model training and deployment processes, reducing deployment time by 40% and enabling seamless scaling of machine learning applications across platforms. Spearheaded the development of a personalized NLP system using NLTK and SpaCy; improved customer engagement and boosted repeat purchases by 25%.

Data Science Intern

Quest Diagnostics

Internship

Jun 2022 - Aug 2022

Remote – Irvine, California

Developed and implemented a robust Time Series pricing strategy based on SVM and Naïve Bayes; optimized pricing structures, which led to a 20% boost in profit margins within six months. Developed a comprehensive product pricing analysis using Tableau, increasing profitability by 30% through strategic pricing adjustments; refined the pricing model based on customer feedback and market trends. Engineered an end-to-end e-commerce solution integrating Decision Trees and Random Forest for predictive analytics, resulting in 40% faster data processing and a 25% reduction in storage costs.

Data Scientist

Ernst & Young

Feb 2021 - Sep 2021

New Delhi, India

Led a team in conducting comprehensive correlation testing across various customer touchpoints, optimizing pricing strategies and enhancing user experience using confidence intervals; the initiative led to a 40% reduction in customer churn rate. Led a cross-functional team in the migration of data from DBT to Postgres, ensuring data integrity and availability for BI visualization; project completed 2 months ahead of schedule and reduced data retrieval time by 50%.

Big Data Intern

Ericsson

Internship

Sep 2020 - Nov 2020

New Delhi, India

Developed and executed innovative learning programs for team members on media analytics and modeling techniques; improved team proficiency in Causal and Bayesian Inference, leading to enhanced campaign performance. Optimized data warehouse performance by fine-tuning queries and indexing strategies using TensorFlow, resulting in a 40% reduction in query execution time and improved overall system efficiency.

Research Intern

Genpact

Internship

Jul 2020 - Sep 2020

New Delhi, India

Implemented pricing optimization algorithms using NumPy, Pandas, and Scikit Learn to analyze market trends and competitor data, resulting in a 15% increase in profit margins and a more competitive pricing strategy aligned with consumer preferences. Implemented an intermediate stack leveraging JIRA for project management and Git for version control to streamline supply chain processes, resulting in a 25% reduction in deployment errors.

Education

University of California, Irvine

Master of Data Science

Data Science

Sep 2021 - Dec 2022

Guru Gobind Singh Indraprastha University

B.Tech.

Information Technology

Aug 2016 - Nov 2020

University of California, Berkeley

Summer School

Jun 2019 - Aug 2019

Skills

Agile
Spark
Hadoop
PySpark
LLMs
GPT
RAG
CI/CD
NLP
NLTK
SpaCy
Time Series
SVM
Naïve Bayes
Tableau
Decision Trees
Random Forest
DBT
Postgres
BI
Causal Inference
Bayesian Inference
TensorFlow
NumPy
Pandas
Scikit Learn
JIRA
Git
Airflow
Kafka
Python
SQL
PyTorch