Anuj Kumar
@anuj
Sr. ML Engineer at INTELLIF AI
Canada
Anuj Kumar is an experienced Machine Learning Engineer specializing in computer vision and deep learning model deployment. He has developed advanced AI systems, including talking head avatars and object removal models, and managed HIPAA-compliant MLOps pipelines. His expertise spans training complex models on large datasets, enhancing explainable AI (XAI) methods, and conducting research in 3D point cloud analysis and surgical phase recognition.
Experience
Sr. ML Engineer
INTELLIF AI
Developed an AI-driven talking head avatar system for seamless lip-syncing, reducing video production time by 50% and enhancing virtual communication quality for over 10,000 users. Developed an Inpaint Anything model (34 dB PSNR) that removes an object from an image based on a text prompt, using diffusion and Grounding DINO models. Collaborated with the data team to test the model's robustness on 200+ videos.
Cofounding Team Member
DENTAL AI
Managed the data team in curating 6,000 data samples using MLOps practices, contributing to a robust, HIPAA-compliant end-to-end model pipeline for real-world deployment. Managed the website team in developing a site that lets users visualize 3D models and run end-to-end predictions. Trained a Transformer model on a 15 GB point cloud dataset, achieving a 93.1% F1 score on a 3D point cloud completion task. Used DeepSpeed to cut training time by a factor of 4 and deployed the model on Google Cloud Platform.
Computer Vision Engineer
SPYNE AI
Developed a personalized image-to-image translation model for a car catalogue using the DreamBooth Stable Diffusion model, resulting in studio-like images. Applied prompt engineering techniques such as prompt matrix with InstructPix2Pix and ControlNet Stable Diffusion models to generate 25,000 studio-like car images. Created a real-time, on-edge object detection model for inspecting car damage using YOLOv7, trained on 80,000 images and achieving 72.3% mAP.
Computer Vision Intern
SONY
Contributed to nnabla, an open-source deep learning framework. Enhanced the efficiency of using Explainable AI (XAI) by 90% by developing an XAI API. Reduced data visualization for XAI applications in CNNs by 80% by modifying the Saliency, SHAP, and Integrated Gradients explanation methods. Improved the accuracy of ResNet, ResNeXt, WideResNet, and DenseNet by 20% by integrating them with Attention Branch Networks.
Machine Learning Researcher
PURDUE UNIVERSITY
Built a computer vision model that predicts vital signs with an 89.22% F1 score using face detection and signal processing methods.
Deep Learning Researcher
UNIVERSITY OF CALIFORNIA
Trained 3D CNN and LSTM models on 400+ colorectal surgery videos, improving surgical phase recognition accuracy to 84.66%.
Education
IIT DELHI
BTech
CHINMAYA VIDYALAYA
Licenses & Certifications
AWS Cloud Solutions Architect
AWS
Machine Learning Specialization
Deep Learning Specialization
Machine Learning Engineering for Production (MLOps) Specialization
Google Data Analytics
Natural Language Processing Specialization