Predicting Sentiment Based on Drug Product User Reviews (Using Real World Data) for Informed Decision Making – Part 2

Sentiment Analysis Using a Supervised Binary Text Classifier (Utilizing Large Language Models – BERT and XLNet)

Key Points

Key Techniques to Deliver Our Project:

  1. Defining a function that applies several pre-processing steps to a given text (the drug user reviews)
  2. Creating a sentiment feature from the rating (1 for positive, 0 for negative)
  3. N-gram tokenisation to capture word collocations
  4. Word clouds for positive and negative reviews
  5. Baseline model – for our use case, we used Neural Bag of Words (BoW) and a logistic regression classifier with n-gram features as our baseline
  6. Classification approach – supervised binary classification using Large Language Models (BERT and XLNet), with training and fine-tuning. The results are then compared against our baseline model
  7. Evaluation using the F1 score, because the data was imbalanced
  8. Having implemented several models on the same dataset with high-dimensional features, we conclude that the XLNet LLM performs best, as both feature extractor and classifier, with a score of 0.93
  9. Overall, we met our objective, which was to predict sentiment from the text of these drug user reviews using a supervised binary text classifier that labels each review as positive or negative. By analysing the sentiment expressed in online drug reviews, healthcare providers and manufacturers can gain a more comprehensive understanding of the strengths and weaknesses of their products. This information can inform product development and improvement efforts, and help to ensure that products meet the needs and expectations of patients and consumers.
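
To make a few of the steps above concrete (pre-processing, turning a rating into a sentiment label, n-gram tokenisation and the F1 score), here is a minimal sketch in plain Python. It is not the project's actual code: the function names, the tiny stop-word list, the example review and the rating threshold of 7 are all assumptions made for illustration only.

```python
import re
from typing import List, Sequence

# Assumption: a tiny illustrative stop-word list, not the one used in the project.
STOP_WORDS = {"a", "an", "the", "and", "or", "is", "it", "this", "to", "of", "i", "my", "for"}


def preprocess(text: str) -> List[str]:
    """Lowercase, replace non-letter characters with spaces, split, drop stop words."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def rating_to_sentiment(rating: float, threshold: float = 7.0) -> int:
    """Map a rating to 1 (positive) or 0 (negative).

    The threshold is hypothetical; the cut-off used in the project is not stated here.
    """
    return 1 if rating >= threshold else 0


def ngrams(tokens: Sequence[str], n: int) -> List[str]:
    """Return contiguous n-grams, each joined into a single space-separated string."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 is the harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    review = "This medication worked well, no side effects!"  # made-up example review
    tokens = preprocess(review)
    print(tokens)
    print(ngrams(tokens, 2))
    print(rating_to_sentiment(9), rating_to_sentiment(3))
    print(f1_score([1, 1, 0, 1, 0], [1, 0, 0, 1, 1]))
```

Bigrams such as "side effects" or "no side" carry meaning that the individual words lose, which is the motivation for the n-gram step. F1 is preferred over plain accuracy on imbalanced data because a classifier that always predicts the majority class can still reach high accuracy while missing the minority class entirely.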

 

Discover the code behind the insights – check out our GitHub repository for this Natural Language Processing project


Let's collaborate to develop better healthcare solutions for tomorrow
