C.18 TextBlob Sentiment Analysis Features

Xinlan Emily Hu edited this page Dec 18, 2023 · 1 revision

1. Feature Name

C.18 Polarity and Subjectivity

2. Literature Source (Serial Number, link)

C.18, My Team Will Go On: Differentiating High and Low Viability Teams through Team Interaction

3. Description of how the feature is computed (In Layman’s terms)

Polarity and subjectivity are calculated using the TextBlob library in Python. We compute the following statistics for both polarity and subjectivity:

  1. Average
  2. Highest (the value for one particular individual)
  3. Lowest (the value for one particular individual)
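The three statistics above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the function names `score_message` and `summarize_by_speaker` are hypothetical, and it assumes that each individual's score is the mean over that individual's messages in a conversation. The only TextBlob call used is the `sentiment` property, which returns a `polarity` and a `subjectivity` value.

```python
from collections import defaultdict
from statistics import mean


def score_message(text):
    """Return (polarity, subjectivity) for one message using TextBlob."""
    # Imported lazily so the aggregation below also works on
    # pre-computed scores when TextBlob is not installed.
    from textblob import TextBlob

    sentiment = TextBlob(text).sentiment
    return sentiment.polarity, sentiment.subjectivity


def summarize_by_speaker(scored_messages):
    """Aggregate one score type (polarity OR subjectivity) for a conversation.

    scored_messages: iterable of (speaker, score) pairs.
    Assumption (not confirmed by the source): an individual's score is
    the mean of the scores of their messages.
    """
    per_speaker = defaultdict(list)
    for speaker, score in scored_messages:
        per_speaker[speaker].append(score)

    speaker_means = [mean(scores) for scores in per_speaker.values()]
    return {
        "average": mean(speaker_means),  # mean across individuals
        "highest": max(speaker_means),   # the highest-scoring individual
        "lowest": min(speaker_means),    # the lowest-scoring individual
    }


# Example with pre-computed polarity scores for two speakers.
example = [("A", 0.5), ("A", 0.1), ("B", -0.2)]
summary = summarize_by_speaker(example)
```

The same function would be called once with polarity scores and once with subjectivity scores.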

We chose the TextBlob library to calculate polarity and subjectivity. TextBlob ships two sentiment analyzers (refer to the TextBlob implementation): a default lexicon-based analyzer (PatternAnalyzer), which is the one that returns polarity and subjectivity scores, and a Naive Bayes classifier (NaiveBayesAnalyzer). Both are largely "Bag of Words"-based, so they can misread sentences whose meaning depends on how the words are combined. For example, the sentence "Everything in this restaurant was anything but lovely, amazing, wonderful, great!" actually has a negative meaning, since it says that nothing in the restaurant was good. However, a bag-of-words approach is likely to classify it as positive because it mostly weighs the individual positive and negative words (the 4 positive words in this case make the sentence look positive).
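The failure mode in the restaurant example can be reproduced with a toy word counter. This sketch is not TextBlob's actual implementation; the word lists `POSITIVE` and `NEGATIVE` and the function `word_count_label` are hypothetical and exist only to show why counting words ignores the phrase "anything but".

```python
import re

# Hypothetical mini-lexicons, for illustration only.
POSITIVE = {"lovely", "amazing", "wonderful", "great", "good"}
NEGATIVE = {"bad", "awful", "terrible", "horrible"}


def word_count_label(sentence):
    """Label a sentence by counting positive vs. negative words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    n_pos = sum(word in POSITIVE for word in words)
    n_neg = sum(word in NEGATIVE for word in words)
    if n_pos > n_neg:
        label = "positive"
    elif n_neg > n_pos:
        label = "negative"
    else:
        label = "neutral"
    return label, n_pos, n_neg


sentence = ("Everything in this restaurant was anything but "
            "lovely, amazing, wonderful, great!")
label, n_pos, n_neg = word_count_label(sentence)
# label is "positive" (4 positive words, 0 negative words),
# even though a human reader would call the sentence negative.
```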

The following research paper suggests that neural network models are better suited to natural language processing than Naive Bayes.

We initially wanted to add human labels to our data and compare them against the labels predicted by TextBlob. The issue was put before the larger team here. They felt that adding labels might not help, as human labelling may create more noise instead of eliminating it.

It was concluded that we will use different methods to calculate the same feature and corroborate the results. If the results of computing the same feature with multiple methods show high correlation, we will then decide how to take a suitable average as an input to our final model.
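One way the corroboration step might look is sketched below. It is an assumption-laden illustration, not the team's agreed procedure: the Pearson correlation is one possible choice of correlation measure, the `threshold` of 0.7 is a hypothetical cut-off, and a plain element-wise mean is only one candidate for "a suitable average".

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x, mean_y = mean(xs), mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def combine_if_correlated(xs, ys, threshold=0.7):
    """Average two methods' scores only if they agree strongly.

    threshold is a hypothetical cut-off chosen for this example.
    Returns (correlation, averaged scores or None).
    """
    r = pearson(xs, ys)
    if r >= threshold:
        return r, [(x + y) / 2 for x, y in zip(xs, ys)]
    return r, None


# Polarity scores for the same four messages from two hypothetical methods.
method_a = [0.1, 0.4, -0.3, 0.8]
method_b = [0.2, 0.5, -0.2, 0.9]
r, combined = combine_if_correlated(method_a, method_b)
```

If the correlation falls below the threshold, the function returns `None` for the combined scores, leaving the choice of method to a later decision.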

4. Algorithms used (KNN, Logistic Regression etc.)

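TextBlob's sentiment analyzers: the lexicon-based PatternAnalyzer (the default, which produces the polarity and subjectivity scores) and Naive Bayes (NaiveBayesAnalyzer)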

5. ML Inputs/Features

None

6. Statistical concepts used

Standard Deviation, Mean

7. Pages of the literature to be referred to for details

Section 3.2 of the paper

8. Any tweaks/changes/adaptions made from the original source

None
