Pritesh Kumar

@priteshkumar

Data Scientist at Collegedunia Web Pvt. Ltd.

Collegedunia Web Pvt. Ltd.

Dr. B.C. Roy Engg. College

Data Scientist with about 4 years of broad-based experience across domains such as NLP, computer vision, predictive modeling, and recommendation systems, with the ability to solve complex algorithmic problems and scalability issues. Proficient in building algorithms from scratch in Python and in solving complex problems creatively to meet business requirements. Always looking for new challenges, complex problems to solve in original ways, and new skills to learn.

Experience

Data Scientist

Collegedunia Web Pvt. Ltd.

Jan 2022 - Present

Built new algorithms from scratch for new products and apps such as news aggregators:
• Fine-tuned sentence transformers such as BERT and MPNet for downstream tasks including text classification, Semantic Textual Similarity (STS), and named entity recognition (NER), attaining an accuracy of 95%.
• Created unique logic for news app sections such as ‘For You’ and ‘Trending’ using ML models like BERT.
• Developed a novel ML-based logic for the batching process.
• Optimized the memory use and speed of the algorithms, reducing memory usage and processing time by 80% using multi-threading and other techniques.
• Integrated text summarization models such as ‘t5-base’, Facebook’s BART, and Pegasus into various in-house applications through APIs.
• Set up Git integration for the pipelines built.
• Published a new cleaner package on pypi.org for easier integration into the pipelines.
• Generated visualizations with Seaborn and Matplotlib to assess the performance of various ML models across multiple tasks.
• Integrated the database with the pipeline and the API so that the pipelines run efficiently.
• Performed data engineering: set up MongoDB and BigQuery and optimized both databases.
• Deployed ML models and their pipelines on the server.

User Recommendation System:
• Scraped data with Beautiful Soup and Selenium for use in user tagging.
• Extracted important entities from text using spaCy NER.
• Tagged users with NER entities and news articles.
• Visualized user app interaction with Seaborn, Matplotlib, and similar tools to better understand the recommendation system and plan user targeting, leading to a 20% increase in user retention rate.

Computer Vision:
• Designed and created pipelines for the user recommendation system using various Hugging Face models for video-to-text conversion (Salesforce’s BLIP-2 model), audio-to-text conversion, and image-to-text conversion.
• For all of the above tasks, created a unique strategy to translate video, image, and audio elements into text for the user recommendation system.

Data Scientist

P.R. Associates

Jan 2020 - Jan 2022

Education

Dr. B.C. Roy Engg. College

B.Tech

Licenses & Certifications

Complete Data Science Bootcamp

Udemy

• No expiration

Advanced Machine Learning and Data Science Master Class

Udemy

• No expiration

IBM Machine Learning Professional Certification

IBM

• No expiration

Skills

Python
NumPy
PyTorch
TensorFlow
Seaborn
scikit-learn
Data Visualization
Machine Learning
Deep Learning
Neural Network
Transformers
SQL
BigQuery
MongoDB
Power BI
Redshift
Large Language Models (LLMs)
Prompt Engineering
Excellent problem solver
Critical thinking
Quick and enthusiastic learner
Good presentation and communication skills
Excellent R&D skills
Good team player