Usage of Twitter API v2 with tweepy and pandas in Python
In this tutorial, we’ll cover the setup to get started with the Twitter API v2 using Python and tweepy.
- Sign up with Twitter
- Create an App in the developer account: Follow steps 1 and 2 in this Twitter article
- Obtain the API key and secret, the bearer token, the access token, and the access token secret. These can be generated in your developer portal, under the “Keys and tokens” tab for your developer App.
- Next, we need to install tweepy. Installation with Anaconda:
conda install -c conda-forge tweepy
Note that we use the free “Essential access” level, which has the following limitations:
- 500,000 Tweets per month
- 1 Project per account
- 1 App environment per Project
- No access to standard v1.1, premium v1.1, or enterprise
Now we are ready to import tweepy:
import tweepy
Create keys
We need to provide the Twitter keys and tokens in order to use the API v2.
Therefore, we first create a simple Python script called keys.py in which we store all keys and tokens.
We create the file in three steps:
1. Create a variable called file_name that holds the file name keys.py.
2. Create the file with %%writefile: this saves the script in the same folder as this notebook.
3. Open keys.py and insert your keys.
# Create variable
file_name = 'keys.py'
%%writefile {file_name}
consumer_key="insert your API key"
consumer_secret="insert your API secret"
access_token="insert your access token"
access_token_secret="insert your access token secret"
bearer_token="insert your bearer token"
Writing keys.py
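Before importing the keys, it can help to confirm that no placeholder text was left in keys.py. The following is a minimal sketch, not part of tweepy: the helper name find_placeholder_keys is hypothetical, and it assumes the placeholders start with "insert your" as in the template above.

```python
# Hypothetical helper (not part of tweepy): report credentials that are
# empty or still hold the template text from keys.py.
PLACEHOLDER_PREFIX = "insert your"  # assumption: matches the template above


def find_placeholder_keys(settings):
    """Return the sorted names of entries that are empty or unedited."""
    return sorted(
        name
        for name, value in settings.items()
        if not value or value.startswith(PLACEHOLDER_PREFIX)
    )


# Example with made-up values; real values would come from keys.py
example = {
    "consumer_key": "insert your API key",
    "bearer_token": "abc123",  # made-up token for illustration only
}
missing = find_placeholder_keys(example)
print(missing)  # ['consumer_key']
```

An empty list means every entry was filled in; it does not verify that the values are actually valid credentials.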
Make a connection with API v2
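We import the keys and pass them to tweepy.Client. Setting return_type to requests.Response makes the client return the raw HTTP response, whose JSON body we convert to a DataFrame later: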
from keys import *
import requests
client = tweepy.Client(bearer_token=bearer_token,
                       consumer_key=consumer_key,
                       consumer_secret=consumer_secret,
                       access_token=access_token,
                       access_token_secret=access_token_secret,
                       return_type=requests.Response,
                       wait_on_rate_limit=True)
Make a query
- Let’s search Tweets from Barack Obama’s Twitter account (@BarackObama) from the last 7 days (search_recent_tweets).
- We exclude Retweets and limit the result to a maximum of 100 Tweets.
- We also include some additional information with tweet_fields (author id and when the Tweet was created).
# Define query
query = 'from:BarackObama -is:retweet'
# get max. 100 tweets
tweets = client.search_recent_tweets(query=query,
                                     tweet_fields=['author_id', 'created_at'],
                                     max_results=100)
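The query string and the result limit can also be built programmatically. This is a small sketch under stated assumptions: build_user_query and clamp_max_results are hypothetical helper names, not tweepy functions, and the 10 to 100 range reflects the per-request limit documented for the recent search endpoint.

```python
# Hypothetical helpers for illustration; not part of tweepy.
def build_user_query(username, exclude_retweets=True):
    """Compose a search query for Tweets posted by one account."""
    # Accept the handle with or without a leading "@"
    parts = [f"from:{username.lstrip('@')}"]
    if exclude_retweets:
        # A leading "-" negates the operator, dropping Retweets
        parts.append("-is:retweet")
    return " ".join(parts)


def clamp_max_results(requested):
    """Keep max_results inside the assumed 10..100 per-request range."""
    return max(10, min(100, requested))


query = build_user_query("@BarackObama")
print(query)                   # from:BarackObama -is:retweet
print(clamp_max_results(500))  # 100
```

The resulting values could then be passed as query and max_results to client.search_recent_tweets in the same way as the hard-coded ones above.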
Convert to pandas DataFrame
Finally, we convert the data to a pandas DataFrame.
import pandas as pd
# Save data as dictionary
tweets_dict = tweets.json()
# Extract "data" value from dictionary
tweets_data = tweets_dict['data']
# Transform to pandas DataFrame
df = pd.json_normalize(tweets_data)
df
 | created_at | id | author_id | text |
---|---|---|---|---|
0 | 2022-05-16T21:24:35.000Z | 1526312680226799618 | 813286 | It’s despicable, it’s dangerous — and it needs... |
1 | 2022-05-16T21:24:34.000Z | 1526312678951641088 | 813286 | We need to repudiate in the strongest terms th... |
2 | 2022-05-16T21:24:34.000Z | 1526312677521428480 | 813286 | This weekend’s shootings in Buffalo offer a tr... |
3 | 2022-05-16T13:16:16.000Z | 1526189794665107457 | 813286 | I’m proud to announce the Voyager Scholarship ... |
4 | 2022-05-14T15:03:07.000Z | 1525491905139773442 | 813286 | Across the country, Americans are standing up ... |
# save df
df.to_csv("tweets-obama.csv")
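The conversion above assumes the response contains a "data" key, which is typically missing when the search matches no Tweets. A slightly more defensive variant might look like the sketch below. The function name tweets_to_frame and the sample payload are made up for illustration; the payload only mimics the shape of the JSON shown in this tutorial.

```python
import pandas as pd


def tweets_to_frame(payload):
    """Build a DataFrame from a recent-search JSON payload (a dict)."""
    # Fall back to an empty list if the search returned no "data" key
    records = payload.get("data", [])
    frame = pd.json_normalize(records)
    if "created_at" in frame.columns:
        # Parse the ISO 8601 strings into timezone-aware UTC timestamps
        frame["created_at"] = pd.to_datetime(frame["created_at"], utc=True)
    return frame


# Made-up payload in the same shape as the API response used above
sample = {
    "data": [
        {
            "id": "2",
            "author_id": "1",
            "created_at": "2022-05-16T21:24:35.000Z",
            "text": "hypothetical tweet text",
        }
    ],
    "meta": {"result_count": 1},
}
frame = tweets_to_frame(sample)
empty = tweets_to_frame({"meta": {"result_count": 0}})
print(len(frame), len(empty))
```

With the client configured as in this tutorial, the payload would come from tweets.json(). Parsing created_at into timestamps makes later filtering or grouping by date easier than working with the raw strings.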