Vikas Singh

@vikas_singh

NLP Engineer | Data Scientist

Shivpuri, Vijay Nagar, Ghaziabad, UP

https://www.linkedin.com/in/vikas-kumar-singh-58050a120/

Fractal Analytics | IIIT Bangalore

Vikas is an experienced Data Scientist with over 5 years of expertise in GenAI, open source LLMs, and NLP. His technical proficiency includes frameworks such as LangChain and LlamaIndex, alongside platforms such as Azure Databricks and PySpark. He has a proven track record of building and deploying end-to-end ML models for complex projects involving topic modelling, sentiment analysis, and advanced forecasting.

Experience

GenAI Engineer

Fractal Analytics

Jun 2022 - Present

Developed document-based closed QnA chatbots using OpenAI models as well as open source LLMs, using the LangChain framework together with various OpenAI and open source embeddings and search algorithms to improve answer quality. Developed a natural-language SQL bot using OpenAI and the LangChain framework to run complex queries and draw in-depth insights from the data, with security features such as restricted DML commands on a table or database to protect client data privacy. Worked on various open source LLMs such as Dolly 2.0, Falcon, Llama 2, and others, and fine-tuned them using techniques such as LoRA. Worked on model quantization using Hugging Face tooling to fit and host the LLMs on local machines with storage and GPU constraints, serving them with Flask, FastAPI, and Django. Developed market analysis tools that cut the time spent reading market articles and news and generating insights, reducing the team's manual effort by 95% and time by 99%. Built a topic modelling tool using BERT models to help the client analyze the trends and topics customers discuss about their products on online platforms.

Data Scientist

Deloitte USI

Feb 2021 - May 2022

Worked as a Data Scientist on multiple data science projects involving NLP and created pipelines for text loading, cleaning, and modelling in Azure Databricks. Performed exploratory data analysis (EDA) and built an Azure pipeline for data loading and cleaning. Built word clouds using the WordCloud library to examine word frequencies, and used n-grams for a deeper understanding of the text data. Developed text classification models using algorithms such as Random Forest, SVM, Naïve Bayes, and XGBoost, and evaluated them using confusion matrices, accuracy, precision, recall, RMSE, and ROC curves. Developed and executed detailed ETL-related functions covering performance and integration. Performed data extraction using CData Connectors and Python scripts from various social media and advertising platforms such as Facebook Ads, Google Ad Manager, Criteo, TradeDesk, and Pinterest. Analyzed and transformed the data through filtering and aggregation, and loaded the transformed data into Google Cloud BigQuery.

Senior System Engineer

Infosys Ltd

May 2018 - Jan 2021

Performed data analysis on patient data at various levels using the Python libraries NumPy and pandas and Python data structures. Built KPIs using the Django framework. As an added feature, developed classification models to predict the most suitable treatment for a disease, using algorithms such as Random Forest and XGBoost.

Education

IIIT Bangalore

Executive PG Diploma in Data Science

NLP Specialization

Jan 2021 - Jan 2022

Ajay Kumar Garg Engineering College

B.Tech

Information Technology

Jan 2013 - Jan 2017

Licenses & Certifications

Executive PG Diploma in Data Science (NLP Specialization)

IIIT Bangalore

Issued: Jan 2021

Skills

NLP
GenAI
LLM
Transformers
LangChain
Topic Modelling
Python
PySpark
PyTorch
Flask
Django
EDA
SQL
Azure Databricks
PDF Digitization
BERT
RandomForest
SVM
Naïve Bayes
XGBoost
Sentiment Analysis