
Sanket Sonu

@sanketsonu

Data Scientist (NLP/NLU) at Orcawise

India

LinkedIn: Sanket Sonu

Orcawise • National College of Ireland

Sanket Sonu is a Data Scientist with 4.8 years of industry experience, specializing in AI/ML and NLP/NLU. He possesses extensive knowledge of Python and key libraries including TensorFlow, PyTorch, Sklearn, and SpaCy. His experience includes working with cloud tools like AWS SageMaker and GCP Vertex AI, and developing custom classification models and NER systems.

Experience

Data Scientist (NLP/NLU) - Intern

Orcawise

Intern • Jul 2022 - Present • Dublin, Ireland

Optimised and processed data before building a custom classification model for data quality. Applied data engineering concepts to process unstructured raw text data and performed annotation. Designed and improved complex custom models on top of pre-trained models using BERT and LSTM, and performed A/B testing. Applied data mining, problem-solving, and attention to detail. Developed NER (Named Entity Recognition), coreference resolution, and relation extraction for articles using SpaCy and BERT models on real-world data such as tourism content, articles, and user feedback. Delivered data-driven, actionable insights using knowledge graphs and researched NLP techniques. Designed, delivered, documented, and presented user-friendly dashboards and reports.

Senior Game Data Test Engineer (AI/ML)

Pole to Win International

Jan 2021 - Apr 2021 • Hyderabad, India

Applied machine learning methods to game data for the player ranking system (matchmaking of online players using rank and in-game behaviour), functionality, and rewards system. Performed EDA and segmented data into clusters using the K-Means algorithm. Classified player feedback using machine learning and deep learning classification models such as Random Forest, XGBoost, Naïve Bayes, LSTM, CNN, and Transformers to improve the gaming experience. Communicated with business stakeholders and explained advanced, complex analytical findings clearly and concisely to non-analytical peers and business leaders.

Game Data Tester (AI/ML)

Ubisoft Entertainment India Pvt. Ltd

Jun 2018 - Dec 2020 • Pune, India

Developed advanced machine learning techniques that use game data for the player ranking system (matchmaking of online players using rank and in-game behaviour). Classified player feedback and chats using machine learning and deep learning classification models, applied NLP, and worked closely with the Software Engineering team to create chat filters. Processed raw data using data engineering concepts. Performed annotation, model selection, and A/B testing. Applied analytical skills and knowledge transfer to data management. Continually researched new methods and technologies in the insights and analytics space, including AI and machine learning tools and techniques. Diagnosed and restructured existing and new game designs by recommending unique, creative, and innovative ideas while collaborating as a co-developer (CO-DEV).

QA Engineer

Sun Technology Integrators Pvt. Ltd

Mar 2017 - Jun 2018 • Bangalore, India

Performed automated functional testing of games on platforms such as PlayStation and Xbox using Python scripts, following compliance requirements to enhance performance, and reported bugs in Jira. Redesigned many test cases to target critical bugs and improved the end-user experience.

Education

National College of Ireland

MSc in Data Analytics

Data Analytics

Jan 2021 - Jan 2022 • Grade: First Class Honours (1.1)

Sapthagiri College of Engineering

Bachelor of Engineering

Information Science & Technology

Jul 2012 - Jul 2016

Licenses & Certifications

Google Data Analytics Professional Certificate

Google

Issued: Oct 2021 • No expiration

The Data Science Course: Complete Data Science Bootcamp

Issued: Oct 2020 • No expiration

Skills

Python
Statistics
Machine Learning
Deep Learning
TensorFlow
Keras
Scikit-learn (Sklearn)
Natural Language Processing (NLP)
Social Media Analytics
Data Visualisation
Agile
Excel
Google Cloud Platform (GCP) – AutoML
GCP Vertex AI
Tableau
Power BI
Jira
Confluence
SQL
Coreference Resolution
Relation Extraction
NLTK
SpaCy
NLU
OpenCV
IBM SPSS
AWS SageMaker
AWS S3 & RDS
NumPy
Pandas
Business Intelligence (BI)
A/B Testing