Arkajyoti Chakraborty

@arkajyoti.chakraborty

Machine Learning Intern at John Snow Labs

New Delhi, India

Vigilance AI•Delhi Technological University

Arkajyoti Chakraborty is a skilled Machine Learning Engineer with experience in Natural Language Understanding (NLU), Computer Vision, and Deep Learning. Expertise includes implementing zero-shot classifiers, developing retrieval-augmented prompting frameworks for LLMs, and building OCR pipelines. Proven ability to apply domain adaptation techniques and contribute to research in areas like fake news detection and activity recognition.

Experience

Computer Vision Intern

Vigilance AI

Internship•Sep 2001 - Dec 2001•Remote, Illinois US

Worked on activity recognition for video data, building a model that classifies activities using a CNN feature extractor and a fine-tuned transformer encoder (84% accuracy). Developed an abnormal breathing detection pipeline that uses YOLO for masking together with the activity recognition pipeline (79% accuracy).

Deep Learning Research Intern

Bio-metric Research Lab (DTU)

Internship•Jun 2001 - Aug 2001•New Delhi, India

Worked on domain adaptation techniques for fake news data and hypothesized a relationship between fake news and emotion features. Studied the gradient reversal method and applied it for domain adaptation across datasets to improve accuracy. Contributed to two short papers accepted at the AAAI’23 student abstract track and the ICON’22 short paper track.

Machine Learning Intern

John Snow Labs

Internship•Jun 2001 - Jul 2001•Remote, Delaware, US

Implemented three zero-shot annotators (BERT, DistilBERT, RoBERTa) for Spark NLP in the NLU pipeline. Designed test scripts for sequence classifier models (Longformer, XLNet, ALBERT, and DeBERTa) and implemented demo notebooks for the zero-shot classifiers. Merged PRs covering the BERT, DistilBERT, and RoBERTa annotators and the sequence classifiers for the upcoming release of the NLU library.

Applied Research Intern

Tata Consultancy Services (TCS) Research

Internship•Feb 2001 - Jul 2001•Remote, New Delhi India

Researched clarification question generation for legal clauses and contracts via retrieval-based prompting of large language models (LLMs). Proposed a retrieval-augmented prompting framework designed explicitly for clarification question generation for contracts. Designed and ran experiments on open-source LLMs (Vicuna, Alpaca-LoRA, and Dolly-V2) in zero-shot and few-shot setups. Paper titled “Generating Clarification Questions for Disambiguating Contracts” under review at EMNLP’23.
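The general idea of retrieval-augmented prompting for clarification questions can be illustrated with a minimal sketch. This is not the framework from the paper: the bag-of-words cosine retriever, the prompt wording, the function names, and the three example clauses are all hypothetical, chosen only to show how retrieved clause/question pairs could be placed in a few-shot prompt ahead of the target clause.

```python
import math
import re
from collections import Counter

# Hypothetical example pool: (clause, clarification question) pairs.
# These are illustrative only and do not come from any real dataset.
EXAMPLE_POOL = [
    {"clause": "The supplier shall deliver the goods within a reasonable time.",
     "question": "How many days count as a reasonable time for delivery?"},
    {"clause": "Payment is due promptly after the invoice is received.",
     "question": "Within how many days of receiving the invoice is payment due?"},
    {"clause": "Either party may terminate this agreement with notice.",
     "question": "How much notice is required to terminate the agreement?"},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(clause, pool, k=2):
    """Return the k pool entries whose clauses are most similar to `clause`."""
    query = Counter(tokenize(clause))
    ranked = sorted(
        pool,
        key=lambda ex: cosine(query, Counter(tokenize(ex["clause"]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(clause, pool, k=2):
    """Assemble a few-shot prompt from retrieved examples plus the target clause."""
    lines = ["Ask a clarification question that resolves the ambiguity "
             "in the contract clause.", ""]
    for ex in retrieve(clause, pool, k):
        lines += [f"Clause: {ex['clause']}", f"Question: {ex['question']}", ""]
    lines += [f"Clause: {clause}", "Question:"]
    return "\n".join(lines)


if __name__ == "__main__":
    target = "The tenant shall make payment promptly after the invoice date."
    print(build_prompt(target, EXAMPLE_POOL, k=2))
```

The resulting string would then be sent to an LLM; setting k=0 gives the zero-shot variant of the same prompt. A real system would likely replace the bag-of-words retriever with dense embeddings.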

Data Science Intern

Eka.Care

Internship•May 2001 - Jul 2001•Bengaluru, India

Built a pipeline of custom-trained OCR models from scratch for lab reports, focusing on accuracy for units, ranges, and numeric values. Tested different models for inference on edge cases and trained a TrOCR model on 50k images, achieving an average CER of 0.04 and outperforming AWS Tesseract.
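For context on the reported metric, character error rate (CER) is commonly defined as the Levenshtein edit distance between the predicted and reference strings divided by the reference length. The sketch below shows one way to compute it. The function names are illustrative, and averaging per sample (rather than pooling edits over the whole corpus) is an assumption, since the profile does not say which convention was used.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters to match an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate of one prediction against its reference."""
    if not ref:
        raise ValueError("reference string must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def average_cer(pairs):
    """Mean of per-sample CER over (reference, hypothesis) pairs (assumed convention)."""
    scores = [cer(ref, hyp) for ref, hyp in pairs]
    return sum(scores) / len(scores)
```

Under this definition, a CER of 0.04 corresponds to roughly four character edits per hundred reference characters.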

Education

Delhi Technological University

Bachelor of Technology

Engineering Physics

Aug 2019 - Jun 2023•Grade: GPA: 8.06

Skills

Python

Java

Scikit-learn

NLTK

SpaCy

PyTorch

Keras

Spark-NLP

BERT

DistilBERT

RoBERTa

Longformer

XLNet

LLMs

Computer Vision

OCR

YOLO

TrOCR

Domain Adaptation

Gradient Reversal Method