Saurabh Singh

@saurabhsingh4414

Senior Associate Consultant - Data Scientist at Infosys

Hyderabad, India

linkedin.com/in/saurabhsingh321

Infosys, Galgotias University

Saurabh is an experienced professional skilled in designing and implementing solutions using ML algorithms, NLP, Deep Learning, and Cloud Technologies. He has experience working with BFSI, Hospitality, and various web/mobile products. He is adept at leading data science teams to build efficient, end-to-end solutions.

Experience

Senior Associate Consultant - Data Scientist

Infosys

Dec 2001 - Present

Hyderabad, India

Led a team of 4 to develop a computer vision model for an in-house web-based application that extracts data from legal documents. Developed an OCR engine using OpenCV (cv2) and Pytesseract to process scanned PDFs of legal documents related to eviction, tuition, and medical loans. Deployed the OCR engine as a web application on an EC2 instance, secured with Apigee and PingFederate so that the application is accessible only to authorized users and the extracted data stays secure. The system processes over 300 documents daily with 97% accuracy.

Data Scientist

Incedo

Feb 2001 - Dec 2001

Gurugram, India

Trained a classification model to predict mortgage propensity for a fintech firm. Gathered a dataset from US and UK government websites and the data engineering team, and preprocessed it using scikit-learn. Compared Logistic Regression, XGBoost, and Random Forest models, and chose Random Forest as the final model because it gave higher precision and recall after hyperparameter tuning. The final model achieved 99% recall.

Designed Tableau dashboards from web-scraped healthcare data. Scraped state-level Oncology, Neurology, and Ophthalmology data from websites provided by the client using the Python libraries pandas, requests, Scrapy, and BeautifulSoup. Wrote the data to CSV files and preprocessed it by removing special characters and imputing missing values. Developed Tableau dashboards for features such as population, number of patients, CAGR, and medicines.

Machine Learning Engineer

Peoplestrong

Jan 2001 - Feb 2001

Gurugram, India

Productionised a resume and job description similarity NLP model. The model was built using PyPDF2, spaCy, NLTK, and gensim (doc2vec). It extracts skills from both documents, converts them into tokens, and converts the tokens into vectors. It then calculates the cosine similarity between the two documents' vectors and screens out candidates with a cosine similarity below 60%. Deployed the model on Peoplestrong's ALT Recruit website through a Flask API.
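
The screening step described above can be illustrated with a minimal sketch. This is not the production model: it swaps doc2vec embeddings for plain skill-count vectors so that it runs with the standard library alone, and the `SKILLS` vocabulary, the function names, and the sample texts are all hypothetical. Only the 60% cutoff comes from the description.

```python
import math
import re
from collections import Counter

# Hypothetical skill vocabulary; the real skill-extraction step is not specified.
SKILLS = {"python", "sql", "flask", "nlp", "tableau", "spark"}

# Cutoff from the description: candidates below 60% similarity are screened out.
THRESHOLD = 0.60


def skill_vector(text):
    """Tokenize the text and count only tokens found in the skill vocabulary."""
    tokens = re.findall(r"[a-z0-9+#]+", text.lower())
    return Counter(t for t in tokens if t in SKILLS)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def passes_screen(resume_text, jd_text, threshold=THRESHOLD):
    """Return True when the resume is similar enough to the job description."""
    score = cosine_similarity(skill_vector(resume_text), skill_vector(jd_text))
    return score >= threshold


if __name__ == "__main__":
    resume = "Python and SQL developer with Flask experience"
    jd = "Looking for Python, SQL, Flask and Tableau skills"
    print(passes_screen(resume, jd))
```

In a doc2vec-based version, `skill_vector` would be replaced by the embedding model's inference step, while the cosine comparison and threshold check would stay the same in spirit.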

Education

Galgotias University

B.Tech

Computer Science and Engineering

Jul 2001 - May 2001

Skills

Python 3.x
SQL
Flask
TensorFlow
PyTorch
PySpark
OpenCV
Data analytics
Data visualization (Tableau)
EDA
A/B testing
Regression
Classification
Clustering
Random Forest
Decision Tree
PCA & Dimensionality reduction
Support Vector Machine
K-Nearest Neighbour
K-Means Clustering
CNN
NumPy
Pandas
scikit-learn
Matplotlib
Seaborn
statsmodels
Scrapy
Lexical processing
Syntactic processing
Semantic processing
NER
Similarity Analysis
Text Classification
Text generation
Sentiment Analysis
Transformers
Attention based networks
RASA
OCR Engine
Image analysis
Video analysis
Image recognition
Object Detection
Segmentation
S3
SageMaker
Lambda
EKS
EC2