# NLPrimaries

GOAL: ANALYZE TWEETS MENTIONING THE CURRENT 2020 DEMOCRATIC PRESIDENTIAL CANDIDATES

Data:

Gathered data with Twint, an advanced Twitter scraping tool (twint_scraping.ipynb)
Scraped only verified accounts, which limited the number of tweets while still covering a wider range of dates
Excluded tweets that mentioned multiple candidates, so as not to interfere with the per-candidate analysis
Removed links, images, stop words, and non-English tweets
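
The filtering and cleaning steps above can be sketched in plain Python. This is a minimal illustration, not the code in the notebook: the `CANDIDATES` keyword map, the short `STOP_WORDS` set, and the function names are all assumptions made for the example, and language filtering is left out.

```python
import re

# Hypothetical keyword map; the search terms actually used for scraping are not stated.
CANDIDATES = {
    "bernie": ("bernie", "sanders"),
    "biden": ("biden",),
    "warren": ("warren",),
}

# Tiny illustrative stop-word list; a real run would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "to", "and", "of", "in", "for", "on"}

URL_RE = re.compile(r"https?://\S+|www\.\S+|pic\.twitter\.com/\S+")
TOKEN_RE = re.compile(r"[a-z]+")


def mentioned_candidates(text):
    """Return the set of candidate keys whose keywords appear in the text."""
    words = set(TOKEN_RE.findall(text.lower()))
    return {name for name, keys in CANDIDATES.items() if words & set(keys)}


def clean_tweet(text):
    """Strip links and image URLs, lowercase, and drop stop words."""
    no_links = URL_RE.sub(" ", text)
    tokens = TOKEN_RE.findall(no_links.lower())
    return " ".join(t for t in tokens if t not in STOP_WORDS)


def prepare(tweets):
    """Keep tweets mentioning exactly one candidate as (candidate, cleaned_text)."""
    rows = []
    for text in tweets:
        found = mentioned_candidates(URL_RE.sub(" ", text))
        if len(found) == 1:
            rows.append((next(iter(found)), clean_tweet(text)))
    return rows
```

Calling `prepare` on a list of raw tweet strings drops any tweet that names zero or several of the configured candidates and returns the rest in cleaned form.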

Process:

Performed sentiment and subjectivity analysis (political_tweets.py) and computed TF-IDF on each tweet (political_tweets_atom.py)
Created multiple visualizations, available in NLPrimaries.pdf in this repository
Concluded the analysis with topic modeling using Latent Dirichlet Allocation (LDA) to see whether tweets would separate by candidate (results are also in NLPrimaries.pdf, with code in political_tweets_LDA.ipynb)
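
For readers unfamiliar with the TF-IDF step mentioned above, here is a from-scratch sketch of the idea. It is an assumption-laden illustration rather than the repository's implementation: it uses the plain formula tf = count / document length and idf = ln(N / df), whereas library implementations typically add smoothing and normalization and so produce different numbers. The function names `tf_idf` and `top_terms` are made up for this example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute TF-IDF weights for a list of tokenized documents.

    tf = term count / document length, idf = ln(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def top_terms(weight_map, k=3):
    """Return the k highest-weighted terms of one document."""
    return sorted(weight_map, key=weight_map.get, reverse=True)[:k]
```

A term that appears in every tweet gets a weight of zero under this formula, which is one reason near-identical vocabulary across candidates makes the tweets hard to tell apart.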

Conclusions:

Overall, the tweets mentioning the candidates were all very similar, which made it difficult to differentiate between candidates
Twitter is often a polarized platform, with many negative and many positive opinions

Future Work:

It would be interesting to compare these tweets with tweets about Trump from around the same time in 2016
Build a classification model and/or neural network to differentiate tweets by candidate
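
As a starting point for the classification idea, a multinomial Naive Bayes baseline can be written in a few lines. This is only a sketch under stated assumptions: the `NaiveBayes` class is hypothetical, it expects already-cleaned, space-separated text, and nothing here has been run on the repository's CSV files.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Start from the log prior, then add smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for word in text.split():
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Usage with invented placeholder data (not taken from the scraped tweets):

```python
model = NaiveBayes().fit(
    ["alpha beta gamma", "delta epsilon", "alpha beta zeta", "epsilon eta"],
    ["candidate_a", "candidate_b", "candidate_a", "candidate_b"],
)
model.predict("alpha beta")
```

Given the conclusion that tweets about different candidates look very similar, a simple baseline like this would help show how much signal there is before investing in a neural network.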

Repository files:

.gitignore
NLPrimaries.pdf
README.md
bernie.csv
bernie_cleaned.csv
biden.csv
biden_cleaned.csv
bloomberg.csv
bloomberg_cleaned.csv
buttigieg.csv
buttigieg_cleaned.csv
klobuchar.csv
klobuchar_cleaned.csv
political_tweets.py
political_tweets_LDA.ipynb
political_tweets_atom.py
trump.csv
trump_cleaned.csv
twint_scraping.ipynb
warren.csv
warren_cleaned.csv
