Aayush Sharma

@aayushsharma

Associate Data Scientist at NoBroker

Bengaluru, IN

linkedin.com/in/aayushsharma9753

NoBroker • Indian Institute of Technology, Guwahati

Aayush Sharma is an Associate Data Scientist at NoBroker with a strong background in NLP and machine learning. He has extensive experience in building ASR systems, developing LLMs, and implementing federated learning frameworks. A graduate of IIT Guwahati, he has published research on tabular reasoning and earned multiple awards in technical competitions, including medals at Inter IIT Tech Meets.

Experience

Associate Data Scientist

NoBroker

Jun 2023 - Present•Bengaluru, IN

Works on ConvoZen.AI, an AI-driven contact center intelligence platform. Built an ASR system for Indic-language calls, achieving a WER below 20% using wav2vec and Conformer models. Contributed to in-house development of the ConvoZen LLM, including dataset creation, pretraining, and fine-tuning. Built an Indic NER model for business entities in call centre conversations, yielding an 87% F1 score. Improved the key-moments identification framework for calls and engineered LLM prompts for dataset creation. Worked on model optimization and deployment, delivering on-demand APIs and real-time processing pipelines.

Data Science Intern

Envestnet | Yodlee

Internship•May 2022 - Jul 2022•Remote, IN

Prior Framework for Wealth Recommendations. Built a prior framework using BiLSTMs, Bayesian networks, ARIMA, and statistical methods to introduce an Advisor Recommendations Affinity module, yielding a 15% increase in recommendations consumed. Built an end-to-end, data-agnostic system with self-evaluation criteria that improved problem-specific model selection. Applied the framework to use cases such as loan repayment, data anomaly detection, and money in motion.

NLP Intern

Elucidata Data Consulting Pvt. Ltd.

Internship•Feb 2022 - Jul 2022•Remote, IN

Federated Weak Learning Framework. Built a Federated Weak Learning framework to train biocuration models on enterprise domains with unlabeled data. Created a weak supervision pipeline for NER with human-level accuracy (88%) and built large multi-NER models. Submitted an industry-ready, Dockerized system demonstration to AMIA 2022, to be held in Washington, DC.

NLP Research Intern

University of Utah

Internship•May 2021 - Mar 2022•Remote, US

Tabular NLI Knowledge Addition and Inference. Developed custom RoBERTa and BiLSTM models to integrate external knowledge graphs such as ConceptNet and WordNet. Improved benchmark performance by 7.5% on InfoTabS, a tabular NLI dataset, and published the TKBLSTM paper. Worked on contextual embeddings for better sentence similarity and premise re-ranking.

Education

Indian Institute of Technology, Guwahati

B.Tech.

Biotechnology

Jan 2019 - Jan 2023•Grade: 8.51

MPBSE

Senior Secondary

Jan 2019•Grade: 85.2%

CBSE

Secondary

Jan 2017•Grade: 10

Skills

Python
C/C++
MATLAB
SQL
PyTorch
Keras
HuggingFace
NLTK
spaCy
PySpark
Pandas
Scikit-learn
XGBoost
Docker
Elastic
MongoDB
DAGs
AWS
GCS
Apache Kafka