Are You Ready For Some #football?

While I (Matt) was sitting here watching Monday Night Football, I decided to see who else was doing the same – especially because it’s halftime! You may have heard about Twitter – they have an awesome API which allows us to pull all sorts of data from it. If you use Python, the client library is (literally) easy to install:

easy_install twitter

There’s all kinds of cool stuff we could do, but I won’t subject you to everything I tried. What I ended up doing is searching for tweets which contained the text ‘MNF’ (for Monday Night Football!), and then finding out who was retweeting those tweets. This gives us a directed graph (tweeter -> retweeter) from which we can start to visualize and understand who the “most important people” talking about the game are (besides us, of course). I should say that I learned how to do some of this from the excellent O’Reilly book, “Mining the Social Web” by Matthew Russell.

The first step is to query the API to find tweets containing this tag:

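import twitter
tw = twitter.Twitter(domain = "")

# Fetch pages 1 through 9 of search results, up to 100 tweets per page
results = []
for page in range(1,10):
    results.append(tw.search(q = 'MNF', rpp = 100, page = page))

# Flatten the pages into a single list of tweet texts
tweets = [ r['text']
           for result in results
           for r in result['results'] ]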

The next step is to search each tweet to decide whether it is a retweet. This means looking for the text ‘RT’ or ‘via’, which you are no doubt familiar with if you use Twitter, and recording the name of the original tweeter. The right tool for this is Python’s regular expression library (re), and the relevant command is:

import re
rt_patterns = re.compile(r"(RT|via)((?:\b\W*@\w+)+)", re.IGNORECASE)
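To see what this pattern actually captures, here is a small sketch run against made-up tweet texts. The helper name `retweet_sources` and the sample strings are my own illustration and not part of the original script.

```python
import re

rt_patterns = re.compile(r"(RT|via)((?:\b\W*@\w+)+)", re.IGNORECASE)

def retweet_sources(text):
    """Return the user names credited after 'RT' or 'via' (hypothetical helper)."""
    sources = []
    for _marker, mentions in rt_patterns.findall(text):
        # mentions looks like ' @ESPN' or ' @a @b'; pull out the bare names
        sources.extend(re.findall(r"@(\w+)", mentions))
    return sources

print(retweet_sources("RT @ESPN: Cowboys lead at the half #MNF"))
print(retweet_sources("Watching #MNF at halftime"))
```

One thing to watch: the pattern has no word boundary in front of ‘RT’, so it can also fire on a word that merely ends in “rt” and is followed by a mention.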

After stripping the user names from the retweeted tweets, we add them to a directed graph, which can be done using the Python package networkx. Just loop over all retweeted tweets from the step above, and add an edge to the graph for each one:

import networkx
g = networkx.DiGraph()
# s is the original tweeter's name pulled out by the regular expression
g.add_edge(s, tweet["from_user"], tweet_id=tweet['id'])
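The loop itself is not shown above, so here is a minimal sketch of how it might look. It assumes each tweet is a dict with ‘id’, ‘from_user’ and ‘text’ keys; the sample tweets and the function name `retweet_edges` are invented for illustration. It only collects the edge tuples, so it runs without networkx installed.

```python
import re

rt_patterns = re.compile(r"(RT|via)((?:\b\W*@\w+)+)", re.IGNORECASE)

# Made-up tweets, shaped like the search results (an assumption)
sample_tweets = [
    {"id": 1, "from_user": "fan_one", "text": "RT @ESPN: Cowboys lead at the half #MNF"},
    {"id": 2, "from_user": "fan_two", "text": "Great catch! via @Sportscenter #MNF"},
    {"id": 3, "from_user": "fan_three", "text": "Watching #MNF at halftime"},
]

def retweet_edges(tweets):
    """Yield (original_tweeter, retweeter, tweet_id) for every retweet found."""
    for tweet in tweets:
        for _marker, mentions in rt_patterns.findall(tweet["text"]):
            for s in re.findall(r"@(\w+)", mentions):
                yield s, tweet["from_user"], tweet["id"]

edges = list(retweet_edges(sample_tweets))
print(edges)
```

Each tuple then maps onto one call of the form g.add_edge(s, retweeter, tweet_id=tweet_id).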

There’s all kinds of cool stuff you can do with this graph object, but I’m just going to skip most of it and show you the picture (since I have to get back to the game, of course). I manipulated it so that we only see the largest connected components of our graph:
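For the curious, one way to keep only the biggest connected piece of a directed graph is sketched below. This is not necessarily what I ran for the picture: it assumes networkx is installed, treats edges as undirected when grouping nodes (“weakly” connected), and uses made-up edges.

```python
import networkx as nx

# Hypothetical retweet edges: original tweeter -> retweeter
g = nx.DiGraph()
g.add_edges_from([
    ("ESPN", "fan_one"),
    ("ESPN", "fan_two"),
    ("fan_two", "fan_three"),
    ("Sportscenter", "fan_four"),
])

# Group nodes into weakly connected components, largest first
components = sorted(nx.weakly_connected_components(g), key=len, reverse=True)

# Copy out the subgraph induced by the largest component
largest = g.subgraph(components[0]).copy()
print(sorted(largest.nodes()))
```

To keep the few largest components rather than just one, slice the sorted list (for example components[:3]) and take the union of those node sets before calling subgraph.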

There you go: the most important MNF watchers (i.e. the ones whose tweets were retweeted the most) are ‘ESPN’, ‘Sportscenter’, ‘JasonWitten’, ‘PeytonsHead’, ‘JordinSparks’, ‘TristinKennedy’, and ‘OmyBoyBaby’. It seems like we’re in good company!
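If you would rather have the ranking as numbers than read it off a picture, counting how often each account shows up as the original tweeter is enough. The sketch below uses invented (original tweeter, retweeter) pairs, not the data from tonight’s game.

```python
from collections import Counter

# Hypothetical (original_tweeter, retweeter) pairs
edges = [
    ("ESPN", "fan_one"),
    ("ESPN", "fan_two"),
    ("Sportscenter", "fan_three"),
    ("ESPN", "fan_four"),
    ("JasonWitten", "fan_one"),
    ("Sportscenter", "fan_four"),
]

# Count retweets per original tweeter and sort from most to least retweeted
retweet_counts = Counter(source for source, _retweeter in edges)
ranking = retweet_counts.most_common()
print(ranking)
```

Since edges point from tweeter to retweeter, this is roughly the out-degree of each node in the graph.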

