Assessing Sentiment and Other Insights with Twitter Data

How can you use the Twitter API to keep a pulse on your customer base or market trends? From tracking followers to analyzing brand affinity, we’ll look at several techniques you can apply via the Twitter API, along with logistical considerations and the restrictions imposed by Twitter’s terms of service.

Getting Started with Tweepy

As a Python programmer, one easy way to get started with the Twitter API is the Tweepy package. While it does not cover every endpoint of the Twitter API, it streamlines the ones it does wrap and makes it incredibly easy to start making requests.

Whether you use Tweepy or access the Twitter API directly, you first need an access token to make requests. To get one, go to https://apps.twitter.com/ and create a new app. Afterwards, you can connect to the Twitter API using Tweepy like this:

import tweepy

# Credentials generated for your app at apps.twitter.com
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

Once you’ve established a connection, you can make requests using any of Tweepy’s methods. For example, you can retrieve some information about Barack Obama (who ranks number one in terms of Twitter followers).

obama = api.get_user("BarackObama")

This returns a Tweepy User object with various attributes and methods.

You can look at the number of followers:

obama.followers_count

Or when the user joined:

obama.created_at

Or where they’re from:

obama.location

You can also get the user’s 20 most recent tweets, although the syntax is slightly different than above:

tweets = api.user_timeline("BarackObama")
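
Each item in the returned list is a Tweepy Status object, so you can, for instance, loop over the results and print each tweet’s text along with when it was posted:

for tweet in tweets:
    print(tweet.created_at, tweet.text)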

Pagination

Many methods, such as the `api.user_timeline()` call above, return a subset of a larger collection. Obama has made over 15,000 tweets (for the exact number you could use `obama.statuses_count`), but the call above only returns the most recent 20. To get more results, you can use Tweepy’s Cursor object to paginate through the results with a for loop:

tweets = []
for page in tweepy.Cursor(api.user_timeline, id="BarackObama").pages():
    for tweet in page:
        tweets.append(tweet)
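
If you would rather not handle pages yourself, Tweepy’s Cursor also exposes an items() method that yields individual tweets and lets you cap the total number retrieved:

# Fetch up to 200 of the most recent tweets, one at a time
for tweet in tweepy.Cursor(api.user_timeline, id="BarackObama").items(200):
    tweets.append(tweet)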

Rate Limits 

As with any API, you need to keep in mind the rate at which you make requests. For example, the above code snippet is valid but could hit an error because you are only allowed to make so many requests within a given time period.

The limit at which you are allowed to make requests varies based on what information you are retrieving. For the `user_timeline()` method, you are allowed a generous 900 requests every 15 minutes, while for other methods, such as `obama.followers()` (which returns the user’s followers, 20 at a time), you can make only a much smaller number of requests: 15 every 15 minutes.

As such, always be sure to add delays to moderate the rate of your requests. For example, you could pause for 15 minutes after using up your allocated requests like this:

import time
time.sleep(15*60)  # number of seconds to pause

You can also specifically check your remaining requests for various endpoints and when they will renew using the `api.rate_limit_status()` method.
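
As a rough sketch, you can drill into the dictionary it returns to see how many user_timeline requests remain in the current window (the exact key layout may vary slightly between API and Tweepy versions):

limits = api.rate_limit_status()
# Remaining user_timeline calls in the current 15-minute window
timeline_limits = limits["resources"]["statuses"]["/statuses/user_timeline"]
print(timeline_limits["remaining"], "requests left; window resets at", timeline_limits["reset"])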

Sentiment Analysis

Once you start getting the hang of exploring data with Tweepy, the question becomes how to put that data to use in practice. One easy way to get started is simply measuring the sentiment of tweets, that is, whether they are positive or negative.

While there are many ways to perform sentiment analysis, an easy way to get started is with the TextBlob package. TextBlob’s `.sentiment` property provides both the polarity and subjectivity of a given text.

The polarity quantifies the sentiment of the text from negative to positive on a scale of -1 to 1, while the subjectivity measures how subjective it is on a scale of 0 (very objective) to 1 (very subjective).

Here is how to measure the sentiment of a given tweet:

from textblob import TextBlob
testimonial = TextBlob(tweet.text)  # pass the tweet's text (use .text on a Tweepy Status object)
testimonial.sentiment
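
From there you could, for example, average the polarity across the tweets collected earlier to get a rough read on overall tone (a simple sketch, assuming `tweets` holds the Tweepy Status objects gathered above):

# Average polarity across the collected tweets
polarities = [TextBlob(t.text).sentiment.polarity for t in tweets]
print(sum(polarities) / len(polarities))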

Measuring Brand Affinity

Another interesting application of Twitter data is outlined by Aaron Culotta and Jennifer Cutler in their article Mining Brand Perceptions from Twitter Social Networks. Their algorithm calculates a proxy metric on a scale of 0 to 1 for estimating the public’s perception of brand traits such as environmental friendliness or luxury.

To do this, they leverage Twitter lists: user-curated collections of Twitter handles. For example, I might create a Twitter list ‘Politics’ to keep up to date with political entities of interest. Culotta and Cutler’s algorithm measures the overlap between a brand’s Twitter followers and the followers of handles drawn from lists pertaining to the given category of interest.

To demonstrate, imagine trying to compare perceptions of eco-friendliness between Exxon and BP. The user would first specify keywords such as ‘green energy’ and ‘eco-friendly’ and search for all matching lists. These lists are then mined for Twitter handles (the authors only include handles that occur on at least 2 matching lists), and in turn the followers of these handles are recorded.

Finally, these followers are compared to the followers of the brand itself, and the final metric is calculated using the Jaccard Index. For implementation details, I’ve also put together a complete Python implementation for calculating Twitter brand affinity on GitHub.
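
The Jaccard Index itself is simple to compute once you have the two follower sets: it is the size of their intersection divided by the size of their union. A minimal sketch, assuming you have already collected the follower IDs for the brand and for the category handles:

def jaccard_index(brand_followers, category_followers):
    # Overlap between two collections of follower IDs, from 0 (disjoint) to 1 (identical)
    brand_followers = set(brand_followers)
    category_followers = set(category_followers)
    union = brand_followers | category_followers
    return len(brand_followers & category_followers) / len(union) if union else 0.0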

Creating a Stream Listener

Aside from querying historical data, you can also ingest live data streams that match specified criteria using what is known as a stream listener. This allows you to keep a pulse on ongoing Twitter activity. A simple example might be to log all tweets mentioning ‘@BarackObama’.

See Tweepy’s Stream Listener documentation for more details on how to set up a basic implementation.
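
As a rough sketch using the Tweepy 3.x-style streaming classes (the interface changed in later versions), a listener that prints every tweet mentioning ‘@BarackObama’ might look like this:

class MentionListener(tweepy.StreamListener):
    # Called for each tweet matching the filter
    def on_status(self, status):
        print(status.user.screen_name, status.text)

    # Disconnect if Twitter signals we are being rate limited
    def on_error(self, status_code):
        if status_code == 420:
            return False

stream = tweepy.Stream(auth=api.auth, listener=MentionListener())
stream.filter(track=["@BarackObama"])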

Legal Considerations

Always be sure to check Twitter’s rules and policies. For example, even the research methodology described above may be a restricted use case of Twitter data, as Twitter now generally states that the API is not to be used to calculate aggregate Twitter user or tweet metrics.

What falls under that definition is subject to interpretation, and this article does not condone or endorse improper use.

Conclusion

There are many other possibilities for leveraging the Twitter API not presented here. For example, you can also create chat bots which can post tweets, retweet, follow users, and more. While the possibilities are vast, hopefully you’ve walked away with some insightful ideas of where to look and gotten a peek into how you can start leveraging Twitter data.
