Tweet analysis with TextBlob and Tweepy

Sentiment Analysis and Automatic Translation

January 4, 2018 nschaetti

Introduction

Today, social networks are among the biggest data sources available on the net, providing a wide range of content such as images, videos and text. However, this large variety of multimedia content is difficult to handle and process. An enormous amount of text data is available on Twitter, along with various kinds of information about its source (pseudonym, location, description, etc.), making it a valuable resource for testing and training NLP methods and machine learning algorithms.

This article is the first of a series about the Twitter API and NLP frameworks for tackling problems of the social network era such as topic detection, text classification and sentiment analysis. We will first introduce the basic methods to access Twitter data with Python and show how to analyse a tweet’s text with TextBlob. For this tutorial we need two packages, TextBlob and Tweepy, which you can install with pip:

pip install textblob
pip install tweepy
python -m textblob.download_corpora

The last command downloads the NLTK corpora that TextBlob’s tokenizer relies on.

Twitter Apps

We first have to create a Twitter application using the Twitter application management page. Create a new application by clicking on the “Create new app” button.

On the next page, fill in your application’s information, such as its name and description; you can ignore the “Callback URL” field.

On the next page, go to the “Keys and Access Tokens” tab and click on the button “Create my access token”.

You now have the four tokens needed to access your application through Tweepy.

Twitter API credentials

Let’s now see how to use these in Python. We first create a Python script with a basic skeleton where we import the two required packages and insert the Twitter API credentials.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
#

# Imports
import tweepy
import textblob.exceptions
from textblob import TextBlob

# Twitter API credentials
consumer_key = "..."
consumer_secret = "..."
access_key = "..."
access_secret = "..."

We need the username of the Twitter user whose timeline we want to access. We pass it to the script by adding a --user argument with the argparse package.

import argparse

And we create a parser and add the argument.

# Command line
parser = argparse.ArgumentParser(prog="tweets-acquisition", description="Get user's tweets")

# Argument
parser.add_argument("--user", type=str, help="Twitter's username", required=True)

# Parse command line
args = parser.parse_args()

You will then be able to specify the target username through the command line.

python tweets_acquisition.py --user realDonaldTrump
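If you want to check the parsing logic without launching the whole script from a terminal, parse_args also accepts an explicit argument list; a minimal sketch:

```python
import argparse

# Build the same parser as in the script
parser = argparse.ArgumentParser(prog="tweets-acquisition", description="Get user's tweets")
parser.add_argument("--user", type=str, help="Twitter's username", required=True)

# parse_args accepts an explicit list, handy for quick tests
args = parser.parse_args(["--user", "realDonaldTrump"])
print(args.user)  # → realDonaldTrump
```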

The first thing to do is to log in to the Twitter API with Tweepy using the OAuth protocol and the application credentials.

# Authentication
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
# Get API
api = tweepy.API(auth)

You can iterate through the user’s timeline with the user_timeline method. The screen_name and count arguments specify the target user’s screen name and the number of tweets to retrieve; the maximum per call is 200.

# Get statuses
for status in api.user_timeline(screen_name=args.user, count=200):
    print(status.text)
# end for
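A single user_timeline call returns at most 200 tweets; to go further back, the API is paged with the max_id parameter (which Tweepy’s Cursor automates for you). The loop below sketches that mechanism with a generic fetch_page callable standing in for api.user_timeline, so the logic can be followed without credentials:

```python
def fetch_all(fetch_page, per_page=200, max_pages=5):
    """Page backwards through a timeline, max_id-style, like tweepy.Cursor does."""
    tweets = []
    max_id = None
    for _ in range(max_pages):
        # Ask for tweets older than the oldest one we already have
        kwargs = {"count": per_page}
        if max_id is not None:
            kwargs["max_id"] = max_id
        page = fetch_page(**kwargs)
        if not page:
            break  # no more tweets available
        tweets.extend(page)
        max_id = min(t["id"] for t in page) - 1
    return tweets
```

With the real API you would pass a thin wrapper around api.user_timeline, or simply use tweepy.Cursor(api.user_timeline, screen_name=args.user).items().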

Analyse tweets

It is time to use TextBlob. We create a TextBlob object with the tweet’s text as its argument.

# Analyse tweet
tweet = TextBlob(status.text)

Through this object we can access several pieces of information, such as the polarity and the subjectivity.

print(u"Polarity {}, Subjectivity {}".format(tweet.sentiment.polarity, tweet.sentiment.subjectivity))
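Polarity is a float between -1.0 (negative) and 1.0 (positive), and subjectivity a float between 0.0 (objective) and 1.0 (subjective). A simple way to turn the polarity score into a coarse label (the threshold value is our own choice, not part of TextBlob):

```python
def label_sentiment(polarity, threshold=0.1):
    """Map a TextBlob polarity score to a coarse sentiment label."""
    if polarity > threshold:
        return "positive"
    elif polarity < -threshold:
        return "negative"
    return "neutral"

print(label_sentiment(-0.2))  # → negative
```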

TextBlob can also do language detection with the detect_language method.

print(u"Language : {}".format(tweet.detect_language()))

Even better, we can translate the tweet to French with the translate method. The source language is specified by the from_lang argument and the destination by the to argument.

try:
    print(u"French : {}".format(tweet.translate(from_lang="en-US", to='fr')))
except textblob.exceptions.NotTranslated:
    pass
# end try

TextBlob can be used as a tokenizer to retrieve the words composing the tweet.

print(u"Tokens : {}".format(tweet.words))

Results

You can now execute the script in a terminal and see the result.

$ python tweets_acquisition.py --user realDonaldTrump
Tweet "Dow just crashes through 25,000. Congrats! Big cuts in unnecessary regulations continuing."
Polarity -0.2, Subjectivity 0.5
Language : en
French : Dow vient de s'écraser à 25 000. Félicitations! De grosses coupures de règlements inutiles se poursuivent.
Tokens : [u'Dow', u'just', u'crashes', u'through', u'25,000', u'Congrats', u'Big', u'cuts', u'in', u'unnecessary', u'regulations', u'continuing']

Tweet "So beautiful....Show this picture to the NFL players who still kneel! https://t.co/tJLM1tvbvb"
Polarity 0.0, Subjectivity 0.0
Language : en
French : Si belle ... Montrez cette image aux joueurs de la NFL qui s'agenouillent encore! https://t.co/tJLM1tvbvb
Tokens : [u'So', u'beautiful', u'Show', u'this', u'picture', u'to', u'the', u'NFL', u'players', u'who', u'still', u'kneel', u'https', u't.co/tJLM1tvbvb']

Tweet "With all of the failed “experts” weighing in, does anybody really believe that talks and dialogue would be going on… https://t.co/EfxwZcRmLZ"
Polarity -0.15, Subjectivity 0.25
Language : en
French : Avec tous les "experts" ratés, personne ne croit vraiment que les pourparlers et le dialogue se poursuivra ... https://t.co/EfxwZcRmLZ
Tokens : [u'With', u'all', u'of', u'the', u'failed', u'\u201c', u'experts', u'\u201d', u'weighing', u'in', u'does', u'anybody', u'really', u'believe', u'that', u'talks', u'and', u'dialogue', u'would', u'be', u'going', u'on\u2026', u'https', u't.co/EfxwZcRmLZ']
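As the token lists above show, URLs and their fragments survive tokenization. If you want to strip them before analysis, a minimal sketch using Python’s re module (the clean_tweet helper is our own, not part of TextBlob):

```python
import re

def clean_tweet(text):
    """Remove URLs from a tweet before handing it to TextBlob."""
    return re.sub(r"https?://\S+", "", text).strip()

print(clean_tweet("So beautiful....Show this picture! https://t.co/tJLM1tvbvb"))
# → So beautiful....Show this picture!
```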

Conclusions

The final source code is available on my GitHub account in the NS.ai repository. The next article on machine learning and Twitter will teach you how to classify Twitter profiles.

Nils Schaetti is a doctoral researcher in Switzerland specialised in machine learning and artificial intelligence.
