Twitter API and Python

Twitter API v2, tweepy and Pandas in Python
Data Science
Python
Author

Jan Kirenz

Published

May 19, 2022

Usage of Twitter API v2 with tweepy and pandas in Python

In this tutorial, we’ll cover the setup to get started with the Twitter API v2 using Python and tweepy.

  1. Sign up with Twitter
  2. Create an App in the developer account: Follow steps 1 and 2 in this Twitter article
  3. Obtain the API key and secret, the bearer token, and the access token and access token secret. These can be generated in your developer portal, under the “Keys and tokens” tab for your developer App.
  4. Next, we need to install tweepy. Installation with Anaconda: conda install -c conda-forge tweepy

Note that we use the free “Essential” access level, which comes with the following limitations:

  • 500,000 Tweets per month
  • 1 Project per account
  • 1 App environment per Project
  • No access to standard v1.1, premium v1.1, or enterprise

Now we are ready to import tweepy:

import tweepy

Create keys

  • We need to provide the Twitter keys and tokens in order to use the API v2.

  • Therefore, we first create a simple Python script called keys.py in which we store all keys and tokens.

We create the file in three steps:

  1. Create a variable called file_name that holds the file name keys.py.
  2. Create the file with %%writefile in a separate notebook cell: this saves the script in the same folder as this notebook.
  3. Open keys.py and insert your keys.

# Create variable
file_name = 'keys.py'

%%writefile {file_name}

consumer_key="insert your API key"
consumer_secret="insert your API secret"
access_token="insert your access token"
access_token_secret="insert your access token secret"
bearer_token="insert your bearer token"
Writing keys.py
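Because keys.py is written with placeholder text, it is easy to forget to replace one of the values. The following is a minimal sketch of a sanity check; the helper unfilled_keys and the example dictionary are hypothetical and not part of tweepy or of the original notebook.

```python
# Hypothetical helper: find settings that are empty or still placeholders.
PLACEHOLDER_PREFIX = "insert your"


def unfilled_keys(settings):
    """Return the sorted names whose values are empty or placeholder text."""
    return sorted(
        name
        for name, value in settings.items()
        if not value or value.startswith(PLACEHOLDER_PREFIX)
    )


# Illustrative values only: one placeholder, one made-up token
example = {
    "consumer_key": "insert your API key",
    "bearer_token": "abc123",
}
missing = unfilled_keys(example)
print(missing)
```

With the real file, the same function could be applied to a dictionary built from the five names defined in keys.py before a connection is attempted.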

Make a connection with API v2

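We import the keys and pass them to tweepy.Client. Setting return_type to requests.Response makes the client return the raw HTTP response, whose JSON body we convert to a dictionary later; wait_on_rate_limit=True makes the client wait when a rate limit is reached instead of raising an error.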

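import requests

# Import the credentials stored in keys.py
from keys import (bearer_token, consumer_key, consumer_secret,
                  access_token, access_token_secret)

client = tweepy.Client(
    bearer_token=bearer_token,
    consumer_key=consumer_key,
    consumer_secret=consumer_secret,
    access_token=access_token,
    access_token_secret=access_token_secret,
    return_type=requests.Response,  # return raw HTTP responses
    wait_on_rate_limit=True,        # wait when the rate limit is hit
)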

Make a query

  • Let’s search Tweets from Barack Obama’s Twitter account (@BarackObama) from the last 7 days (search_recent_tweets).
  • We exclude Retweets and limit the result to a maximum of 100 Tweets.
  • We also include some additional information with tweet_fields (author id and when the Tweet was created).

# Define query
query = 'from:BarackObama -is:retweet'

# Get max. 100 tweets
tweets = client.search_recent_tweets(query=query,
                                     tweet_fields=['author_id', 'created_at'],
                                     max_results=100)
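The query string above combines two operators: from: restricts the search to one account and -is:retweet excludes Retweets. To reuse the pattern for other accounts, the string can be assembled by a small function. This is only a sketch; build_query is a hypothetical helper and not part of tweepy.

```python
def build_query(username, exclude_retweets=True):
    """Hypothetical helper: build a search query for a single account."""
    # Accept the handle with or without a leading "@"
    parts = [f"from:{username.lstrip('@')}"]
    if exclude_retweets:
        parts.append("-is:retweet")
    return " ".join(parts)


query = build_query("@BarackObama")
print(query)
```

The resulting string can be passed as the query argument of client.search_recent_tweets in the same way as the hand-written one.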

Convert to pandas DataFrame

Finally, we convert the data to a pandas DataFrame.

import pandas as pd

# Save data as dictionary
tweets_dict = tweets.json() 

# Extract "data" value from dictionary
tweets_data = tweets_dict['data'] 

# Transform to pandas DataFrame
df = pd.json_normalize(tweets_data) 
df
   created_at                id                   author_id  text
0  2022-05-16T21:24:35.000Z  1526312680226799618  813286     It’s despicable, it’s dangerous — and it needs…
1  2022-05-16T21:24:34.000Z  1526312678951641088  813286     We need to repudiate in the strongest terms th…
2  2022-05-16T21:24:34.000Z  1526312677521428480  813286     This weekend’s shootings in Buffalo offer a tr…
3  2022-05-16T13:16:16.000Z  1526189794665107457  813286     I’m proud to announce the Voyager Scholarship …
4  2022-05-14T15:03:07.000Z  1525491905139773442  813286     Across the country, Americans are standing up …

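The lookup tweets_dict['data'] assumes that the search returned at least one Tweet. As an assumption about the API's behavior, a search without matches returns a body that has no "data" key, in which case the lookup would raise a KeyError. The sketch below guards against that; extract_rows and both sample payloads are hypothetical and only mimic the shape of the response.

```python
def extract_rows(payload):
    """Return the list stored under "data", or an empty list if it is absent."""
    return payload.get("data", [])


# Made-up payloads that mimic the shape of a response body
sample_payload = {
    "data": [
        {
            "created_at": "2022-05-16T21:24:35.000Z",
            "id": "1526312680226799618",
            "author_id": "813286",
            "text": "example text",
        }
    ],
    "meta": {"result_count": 1},
}
empty_payload = {"meta": {"result_count": 0}}  # assumed shape for "no matches"

rows = extract_rows(sample_payload)
no_rows = extract_rows(empty_payload)
print(len(rows), len(no_rows))
```

The returned list can be handed to pd.json_normalize just like tweets_data; an empty list simply produces an empty DataFrame.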
# save df
df.to_csv("tweets-obama.csv")
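By default, to_csv also writes the row index as an extra unnamed column, and reading the file back turns the id columns into integers and leaves created_at as text. The following sketch shows one way to round-trip the data, using an in-memory buffer instead of a file and two made-up rows shaped like the table above.

```python
import io

import pandas as pd

# Illustrative rows with the same columns as the table above
df = pd.DataFrame(
    {
        "created_at": ["2022-05-16T21:24:35.000Z", "2022-05-16T21:24:34.000Z"],
        "id": ["1526312680226799618", "1526312678951641088"],
        "author_id": ["813286", "813286"],
        "text": ["first example text", "second example text"],
    }
)

# Write without the index column (a buffer stands in for the CSV file)
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)

# Read back: keep ids as strings and parse the timestamps
restored = pd.read_csv(
    buffer,
    dtype={"id": str, "author_id": str},
    parse_dates=["created_at"],
)
print(restored.dtypes)
```

Keeping the ids as strings is a design choice: they are identifiers rather than numbers, and string columns avoid any loss of precision if the file is later opened in tools that treat large integers as floats.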