Rehan Ahmad

@rehanahmad

Software Engineer 2 (Data Science) at HighRadius Technologies

Hyderabad, India

HighRadius Technologies

Kalinga Institute of Industrial Technology

Results-driven data scientist with 6 years of experience in the fintech domain. Proven track record of leading and completing complex machine learning projects, demonstrating strong analytical and problem-solving skills. Adept at transforming complex data into actionable insights to build solutions that add business value and enhance overall performance.

Experience

Software Engineer 2 (Data Science)

HighRadius Technologies

Full-time

Jan 2022 - Present

Hyderabad, India

Leading a team of 6 data scientists to deliver impactful projects.

Developed a data-parsing solution for financial documents using generative AI (GPT-4), applying prompt-engineering techniques such as zero-shot, one-shot, and few-shot prompting along with a good understanding of generative configuration parameters. Researched open-source LLMs such as LLaMA 2, FLAN-T5, BLOOM, and GPT-J. Developed a good understanding of the Transformer architecture, its constituent components, and its implementations as auto-encoding, auto-regressive, and Seq2Seq models. Worked on tuning the hyperparameters of the FLAN-T5 model for NLP tasks. Explored instruction fine-tuning and parameter-efficient fine-tuning (PEFT) using LoRA, QLoRA, and soft prompts. Fine-tuned LLMs using RLHF to produce human-friendly outputs, with techniques such as a KL-divergence penalty to avoid reward hacking. Basic understanding of LLM evaluation metrics such as ROUGE and BLEU and of benchmarks such as GLUE, SuperGLUE, HELM, and MMLU. Good understanding of the RAG architecture and how it works.

Created a classification model with more than 90% accuracy to detect salt-and-pepper noise in scanned images of financial documents. Building on it, developed a solution using OpenCV that removes noise from such documents while retaining important text. Added around 8% to the automation of the cash-application product by integrating the noise detection and removal method into the Python web-service framework.

Developed and productionized a deep learning classification solution using LayoutLM that classifies the different types of pages in financial documents with above 90% recall for each class along with high precision, thus optimizing the downstream processes.

Software Engineer 1 (Data Science)

HighRadius Technologies

Full-time

Jul 2020 - Dec 2021

Hyderabad, India

Developed and productionized entity extraction solutions to extract important business fields from structured and unstructured financial documents using LayoutLM and BERT for NER.

Associate Software Engineer 2 (Data Science)

HighRadius Technologies

Full-time

Jul 2019 - Jun 2020

Hyderabad, India

Built and productionized a machine learning model that predicts whether the data captured by OCR from financial documents needs manual correction, halving the time clients spend on manual exception handling. Developed and productionized a machine learning solution that identifies the lines containing the relevant business fields in a financial document, which helps optimize downstream processing tasks. Created monthly monitoring reports and visualizations to show clients the business value added by the solution.

Associate Software Engineer 1 (Data Science)

HighRadius Technologies

Full-time

Apr 2018 - Jun 2019

Hyderabad, India

Conducted exploratory data analysis to identify patterns and trends in large datasets. Cleaned and manipulated raw data. Performed feature engineering by extracting new features from data to improve model performance. Assisted in developing machine learning algorithms for predictive modeling tasks.

Education

Kalinga Institute of Industrial Technology

Bachelor of Technology

Computer Science And Engineering

Apr 2018 - Jun 2019

Grade: CGPA 8.57

Created an AI-based approach to identify and correct OCR capture errors by comparing the similarity of two character images using a unique scoring function. Integrated this solution into the Python web-service framework, contributing around 12% to the automation of the cash-application product.

Developed a novel algorithm to identify the prevailing pattern(s) in a data set (e.g., a set of invoice numbers or document numbers), assisting in the extraction of pertinent entities from financial documents and leading to a notable decrease in false positives and a major increase in data-capture accuracy.

PATENT - US11758071B1: Identification and removal of noise from documents (Granted: Sept 2023). In this innovation, a machine learning model was used to detect the presence of salt-and-pepper noise in scanned images of financial documents, and a solution was implemented to remove the noise while retaining essential information, including characters like periods and commas, which could

Skills

Python
Java
R
GenAI
LLMs
Transformers
Deep Learning
NLP
Classification
Regression
Clustering
Association
Pandas
NumPy
Scikit-learn
Matplotlib
Seaborn
TensorFlow
Keras
PyTorch
Docker
Git
Prompt Engineering
RAG Architecture
OpenCV
LayoutLM
BERT
NER