Nisha Kumar
@nishakumar
NLP Data Scientist at Infosys Technologies
Pune, Maharashtra
Nisha is a data science professional with over 5 years of experience developing AI/ML solutions across sectors including Telecom, Insurance, and Finance. She specializes in NLP use cases and has strong exposure to MLOps, implementing ML pipelines on cloud platforms such as AWS and Azure. With an MSc in Data Science and 12+ years in the IT industry, she uses statistics and visualization tools to solve complex data challenges.
Experience
NLP Data Scientist
Infosys Technologies
Developed and integrated an NLP-based AI solution into the client’s existing DevOps pipeline. Performed text analytics using BERT embeddings and K-medoids clustering to identify the features to be prioritized and tested based on code changes committed in GitLab. To facilitate cancer research trials, developed and integrated an NLP-based AI text summarizer for lengthy medical research journals and articles specific to the cancer domain, speeding up research work and clinical trials. Analyzed several NLP architectures and baselined BigBird, a recent sparse-attention transformer, because of its capacity to process lengthy documents; it was fine-tuned on volumes of cancer research documents to build an abstractive summarization model. Combined extractive and abstractive summarization techniques to produce an efficient summarization model with a high ROUGE score. Extractive summarization was implemented using BERT embeddings and K-medoids clustering. The extractive summary was used as input to the final abstractive summarization layer, which was based on BART, a full-attention transformer.
Automation Engineer
Worked as a developer and automation engineer, primarily in the Banking and Finance domain.
Sr. Software Engineer
Mindtree Limited
Worked as a developer in the Taxation domain.
Data Scientist
Infosys Technologies
Developed a Risk Analytics solution for the insurance products of Westpac. Used multi-class classification models to segment customers as Low Risk, Moderate Risk, or High Risk based on the data collected. Developed a model to predict premium payment defaulters, followed by explainable AI, i.e., mining the major features influencing “defaulter” behavior among customers. Model interpretability was implemented using LIME and SHAP for global and local interpretation of the predictions. Simulated service virtualization using machine learning for an in-house Infosys tool. Studied data patterns and built a model that could act as a virtual service for stateless and stateful scenarios, which helped early testing of projects by eliminating the cost of actual service virtualization. As part of Hartford’s infrastructure upscaling and cloud migration strategy, developed an automatic performance prediction model based on a naïve Bayes classifier to predict the performance metrics of cloud nodes for different node resource configurations (workload attributes). Workload parameters such as average number of jobs per minute, memory capacity, and CPU cores were collected and statistically analyzed. Memory utilization, CPU utilization, and response time were the performance metrics predicted by the model.
Education
Liverpool John Moores University
MSc - Data Science
IIIT-Bangalore
PG Diploma - Data Science
Cummins College of Engineering
B.E. - Information Technology
Licenses & Certifications
AWS Certified Machine Learning - Specialty
AWS
Microsoft Certified: Azure Data Scientist Associate
Microsoft
Inferential and Predictive Statistics for Business
University of Illinois (Coursera)
IBM Data Science Professional Certificate
Coursera
Python (Basic) Certificate
HackerRank