Vikas Singh
@vikas_singh
NLP Engineer | Data Scientist
Shivpuri, Vijay Nagar, Ghaziabad, UP
Vikas is an experienced Data Scientist with over 5 years of experience in GenAI, open source LLMs, and NLP. He works with frameworks such as LangChain and LlamaIndex, alongside platforms such as Azure Databricks and PySpark. He has a proven track record of building and deploying end-to-end ML models for complex projects involving topic modelling, sentiment analysis, and advanced forecasting.
Experience
GenAI Engineer
Fractal Analytics
Developed document-based closed QnA chatbots using OpenAI as well as open source LLMs. Used the LangChain framework together with various OpenAI and open source embeddings and search algorithms to improve answer quality. Developed a natural-language-to-SQL bot using OpenAI and the LangChain framework to run complex queries and extract insights from the data, with security features such as restricting DML commands at the table or database level to protect client data privacy. Worked on various open source LLMs such as Dolly 2.0, Falcon, and Llama 2, and fine-tuned them using techniques such as LoRA. Quantized models using Hugging Face tooling so the LLMs could be hosted on local machines with limited storage and GPU capacity, and served them using Flask, FastAPI, and Django. Developed various market analysis tools that cut the time spent reading market articles and news and generating insights, reducing the team's manual effort by 95% and time by 99%. Built a topic modelling tool using BERT models to help the client analyze the trends and topics customers discuss on online platforms about their products.
Data Scientist
Deloitte USI
Worked as a Data Scientist on multiple data science projects involving NLP and created pipelines for text loading, cleaning, and modelling in Azure Databricks. Performed Exploratory Data Analysis (EDA) and built Azure pipelines for data loading and cleaning. Built word clouds using the WordCloud library for word frequencies, and used n-grams for a deeper understanding of the text data. Developed text classification models using algorithms such as Random Forest, SVM, Naïve Bayes, and XGBoost, and evaluated them using the confusion matrix, accuracy, precision, recall, RMSE, and ROC curve. Developed and executed detailed ETL-related functions covering performance and integration. Extracted data using CData Connectors and Python scripts from various platforms such as Facebook Ads, Google Ad Manager, Criteo, The Trade Desk, and Pinterest. Analyzed and transformed the data through filtering and aggregation, and loaded the transformed data into Google Cloud BigQuery.
Senior System Engineer
Infosys Ltd
Performed data analysis on patient data at various levels using the Python libraries NumPy and pandas and Python data structures. Built KPIs using the Django framework. As an added feature, developed classification models to predict the most suitable treatment for a disease using algorithms such as Random Forest and XGBoost.
Education
IIIT Bangalore
Executive PG Diploma in Data Science
NLP Specialization
Ajay Kumar Garg Engineering College
B.Tech
Information Technology
Licenses & Certifications
Executive PG Diploma in Data Science (NLP Specialization)
IIIT Bangalore