Nisha Kumar

@nishakumar

NLP Data Scientist at Infosys Technologies

Pune, Maharashtra

Infosys Technologies • Liverpool John Moores University

Nisha is a data science professional with over 5 years of experience developing AI/ML solutions across sectors including Telecom, Insurance, and Finance. She specializes in NLP use cases and has strong exposure to MLOps, implementing ML pipelines on cloud platforms such as AWS and Azure. With an MSc in Data Science and 12+ years in the IT industry, she uses statistics and visualization tools to solve complex data challenges.

Experience

NLP Data Scientist

Infosys Technologies

Aug 2020 - Present • Pune, India

Developed an NLP-based AI solution and integrated it into the client’s existing DevOps pipeline. Performed text analytics using BERT embeddings and K-medoids clustering to identify the features to prioritize and test, based on code changes committed in GitLab. To support cancer research trials, developed and integrated an NLP-based AI text summarizer for long medical research journals and articles in the cancer domain, to speed up research work and clinical trials. Several NLP architectures were analyzed, and the sparse-attention transformer BigBird was chosen as the baseline because of its capacity to process lengthy documents; it was fine-tuned on a large volume of cancer research documents to build an abstractive summarization model. Combined NLP-based extractive and abstractive summarization techniques to produce an efficient summarization model with a high ROUGE score. Extractive summarization was implemented using BERT embeddings and K-medoids clustering. The extractive summary was then used as input to the final abstractive summarization layer, which was based on BART, a full-attention transformer.
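
The extractive step described above (embed each sentence, cluster the embeddings with K-medoids, keep the medoid sentences) can be illustrated with a minimal sketch. This is not the project's actual code: the bag-of-words `embed` function is a stand-in for BERT sentence embeddings so that the example runs without any model download, and the names `embed`, `k_medoids`, and `extractive_summary` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def embed(sentence, vocab):
    """Assumption: bag-of-words counts stand in for a BERT sentence embedding."""
    counts = Counter(tokenize(sentence))
    return [counts[word] for word in vocab]


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def k_medoids(dist, k, max_iter=100):
    """Alternating k-medoids on a precomputed distance matrix (needs k <= n).

    Returns the sorted indices of the k medoids.
    """
    medoids = list(range(k))  # deterministic start: the first k points
    for _ in range(max_iter):
        # Assignment step: each medoid stays in its own cluster,
        # every other point joins its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i in range(len(dist)):
            nearest = i if i in clusters else min(medoids, key=lambda m: dist[i][m])
            clusters[nearest].append(i)
        # Update step: the new medoid of a cluster is the member with the
        # smallest total distance to the other members.
        new_medoids = sorted(
            min(members, key=lambda c: sum(dist[c][j] for j in members))
            for members in clusters.values()
        )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids


def extractive_summary(sentences, k):
    """Pick k representative sentences (cluster medoids) in document order."""
    vocab = sorted({word for s in sentences for word in tokenize(s)})
    vectors = [embed(s, vocab) for s in sentences]
    n = len(vectors)
    dist = [
        [0.0 if i == j else cosine_distance(vectors[i], vectors[j]) for j in range(n)]
        for i in range(n)
    ]
    return [sentences[i] for i in k_medoids(dist, k)]


sentences = [
    "The tumour shrank after the first treatment cycle.",
    "Treatment reduced tumour size in most patients.",
    "Side effects included nausea and fatigue.",
    "Fatigue and nausea were the most common side effects.",
]
summary = extractive_summary(sentences, k=2)
print(summary)
```

With these four toy sentences and k=2, the sketch should keep one sentence about treatment response and one about side effects. In a pipeline like the one described, the selected sentences would then be passed to the abstractive model.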

Automation Engineer

Sep 2010 - Apr 2018

Worked as a developer and automation engineer, primarily in the Banking and Finance domain.

Sr. Software Engineer

Mindtree Limited

Sep 2007 - Aug 2010 • Pune, India

Worked as a developer in the Taxation domain.

Data Scientist

Infosys Technologies

Jan 2001 - Jul 2001 • Pune, India

Developed a Risk Analytics solution for the insurance products of WestPac. Used multi-class classification models to segment customers as Low Risk, Moderate Risk, or High Risk based on the data collected. Developed a model to predict premium payment defaulters, followed by explainable AI, i.e., mining the major features influencing “defaulter” behavior in customers. Model interpretability was implemented using LIME and SHAP for global and local interpretation of the predictions. Simulated service virtualization using machine learning for an in-house Infosys tool: studied data patterns and built a model that could act as a virtual service for stateless and stateful scenarios, which helped early testing of projects by eliminating the cost of actual service virtualization. As part of Hartford’s infrastructure upscaling and cloud migration strategy, developed an automatic performance prediction model based on a naïve Bayes classifier to predict the performance metrics of cloud nodes for different node resource configurations (workload attributes). Workload parameters such as average number of jobs per minute, memory capacity, and CPU cores were collected and statistically analyzed. Memory utilization, CPU utilization, and response time were the performance metrics predicted by the model.

Education

Liverpool John Moores University

MSc - Data Science

Oct 2020 - Sep 2021

IIIT-Bangalore

PG Diploma - Data Science

Oct 2019 - Sep 2020

Cummins College of Engineering

BE - Information Technology

Jul 2002 - May 2006

Licenses & Certifications

AWS Certified Machine Learning - Specialty

AWS

Issued: Sep 2001 • No expiration

Microsoft Certified: Azure Data Scientist Associate

Microsoft

Issued: Nov 2001 • No expiration

Inferential and Predictive Statistics for Business

University of Illinois (Coursera)

Issued: Jul 2001 • No expiration

IBM Data Science Professional Certificate

Coursera

No expiration

Python (Basic) Certificate

Hackerrank

No expiration

Skills

Python
Java
Statistics
Probability
Data Modeling
Data wrangling
Data visualization
Machine learning
Deep learning
Azure cloud
Spark
Flask
MySQL
NumPy
Pandas
Matplotlib
Scikit-learn
TensorFlow
Keras
NLP
Transformers - BERT
BART
GPT-2
RoBERTa
PEGASUS
BigBird
Text Analytics
Keyword extraction
Text Summarization
AutoML
H2O
XAI (Machine Learning Interpretability)