Photo by Alexander Shatov on Unsplash

Twitter API v2, tweepy and pandas in Python

Usage of Twitter API v2 with tweepy and pandas in Python

In this tutorial, we’ll cover the setup to get started with the Twitter API v2 using Python and tweepy.

  1. Sign up with Twitter
  2. Create an App in the developer account: Follow steps 1 and 2 in this Twitter article
  3. Obtain your credentials: the API key and secret, the bearer token, and the access token and access token secret. These can be generated in your developer portal, under the “Keys and tokens” tab for your developer App.
  4. Next, we need to install tweepy. Installation with Anaconda: conda install -c conda-forge tweepy

Note that we use the free “Essential access” level, which comes with the following limitations:

  • 500,000 Tweets per month
  • 1 Project per account
  • 1 App environment per Project
  • No access to standard v1.1, premium v1.1, or enterprise
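
To get a feel for the monthly cap, the following rough sketch computes how many full requests of 100 Tweets fit into 500,000 Tweets. It assumes that every returned Tweet counts once against the cap; the variable names are illustrative only.

```python
# Rough budget estimate under the Essential access cap (illustrative names).
MONTHLY_TWEET_CAP = 500_000      # Tweets per month, from the list above
MAX_RESULTS_PER_REQUEST = 100    # page size used for the search in this tutorial

# Assumption: every Tweet returned by a request counts once against the cap.
full_requests_per_month = MONTHLY_TWEET_CAP // MAX_RESULTS_PER_REQUEST
print(full_requests_per_month)   # 5000
```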

Now we are ready to import tweepy:

import tweepy

Create keys

We need to provide the Twitter keys and tokens in order to use the API v2. To do so, we first create a simple Python script called keys.py in which we store all credentials. Save this script in the same folder as this notebook:

consumer_key = "insert your API key"
consumer_secret = "insert your API secret"
access_token = "insert your access token"
access_token_secret = "insert your access token secret"
bearer_token = "insert your bearer token"
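
A forgotten placeholder in keys.py usually only shows up later as an authentication error. As a minimal sketch, a small check like the one below could flag values that were never filled in. The helper name find_unset_credentials and the example values are hypothetical and not part of tweepy.

```python
# Hypothetical helper: report which credentials still hold placeholder text.
PLACEHOLDER_PREFIX = "insert your"

def find_unset_credentials(credentials):
    """Return the names whose values are empty or still start with the placeholder."""
    return [
        name
        for name, value in credentials.items()
        if not value or value.startswith(PLACEHOLDER_PREFIX)
    ]

# Example with made-up values; in practice the values would come from keys.py.
example = {
    "consumer_key": "insert your API key",
    "consumer_secret": "abc123",
    "bearer_token": "",
}
print(find_unset_credentials(example))  # ['consumer_key', 'bearer_token']
```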

Make a connection with API v2

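We import the keys and use them to create a tweepy.Client. Setting return_type to requests.Response makes the client return the raw HTTP response, which we later parse as JSON, and wait_on_rate_limit=True makes the client wait when a rate limit is reached: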

from keys import *
import requests

client = tweepy.Client(bearer_token=bearer_token,
                       consumer_key=consumer_key,
                       consumer_secret=consumer_secret,
                       access_token=access_token,
                       access_token_secret=access_token_secret,
                       return_type=requests.Response,
                       wait_on_rate_limit=True)

Make a query

Let’s search for Tweets from Elon Musk’s Twitter account “@elonmusk” posted in the last 7 days (search_recent_tweets). We exclude Retweets and limit the result to a maximum of 100 Tweets. With tweet_fields we also request some additional information (the author id and the time the Tweet was created).

# Define query
query = 'from:elonmusk -is:retweet'

# Get a maximum of 100 Tweets
tweets = client.search_recent_tweets(query=query,
                                     tweet_fields=['author_id', 'created_at'],
                                     max_results=100)
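
The query string is built from search operators joined by spaces, so it can also be assembled programmatically. The following is a minimal sketch. The helpers build_query and clamp_max_results are hypothetical names, and the 10 to 100 range for max_results reflects the limits of the recent search endpoint as far as we know.

```python
def build_query(username, exclude_retweets=True):
    """Build a search query for Tweets posted by one account (hypothetical helper)."""
    parts = [f"from:{username.lstrip('@')}"]
    if exclude_retweets:
        parts.append("-is:retweet")
    return " ".join(parts)

def clamp_max_results(n, lowest=10, highest=100):
    """Keep max_results inside the assumed 10 to 100 range of the recent search endpoint."""
    return max(lowest, min(highest, n))

print(build_query("@elonmusk"))   # from:elonmusk -is:retweet
print(clamp_max_results(500))     # 100
```

The resulting string can be passed as the query argument of client.search_recent_tweets in the same way as the hand-written query.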

Convert to pandas DataFrame

Finally, we convert the data to a pandas DataFrame.

import pandas as pd

# Parse the JSON body of the response into a dictionary
tweets_dict = tweets.json()

# Extract the list of Tweets stored under the "data" key
tweets_data = tweets_dict['data']

# Transform to a pandas DataFrame
df = pd.json_normalize(tweets_data)
df
     created_at                text                                               id                   author_id
0    2021-12-12T09:51:36.000Z  @HardcoreHistory I think I mentioned “octane” ...  1469968168189931521  44196397
1    2021-12-12T09:11:01.000Z  @RationalEtienne @balajis Sonic, the Hedgehog,...  1469957953142808579  44196397
2    2021-12-12T03:44:29.000Z  Just did a @HardcoreHistory episode with Dan C...  1469875780901609472  44196397
3    2021-12-12T00:23:16.000Z  @teslaownersSV It takes 20 years (time from co...  1469825142758989824  44196397
4    2021-12-11T20:33:36.000Z  @ItsGime @BillyM2k Maybe a little 😉                1469767343802884102  44196397
...  ...                       ...                                                ...                  ...
80   2021-12-06T20:44:50.000Z  @DrSallyL @Tesla Coming soon. Lot of cool stuff.   1467958233482604546  44196397
81   2021-12-06T20:25:43.000Z  @kimpaquette Tesla publishes accident statisti...  1467953422116831242  44196397
82   2021-12-06T05:42:48.000Z  As always, Tesla is looking for hardcore AI en...  1467731226609999872  44196397
83   2021-12-06T01:44:54.000Z  @AEIecon @SciGuySpace @JimPethokoukis @PE_Podc...  1467671360285589509  44196397
84   2021-12-05T19:14:07.000Z  @EPavlic He is quite a bossy dog :)                1467573012765593602  44196397

85 rows × 4 columns
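
The lookup of the "data" key assumes that the search returned at least one Tweet. When nothing matches, the response may contain only a "meta" entry, and the lookup would then raise a KeyError. The sketch below shows a more defensive extraction. The helper extract_tweets is a hypothetical name, and the payloads are hand-written imitations of the response structure, not real API output.

```python
def extract_tweets(payload):
    """Return the list of Tweet dictionaries, or an empty list if there is no "data" key."""
    return payload.get("data", [])

# Hand-written payloads that imitate the structure of tweets.json(); not real API output.
with_results = {
    "data": [
        {"id": "1", "author_id": "42",
         "created_at": "2021-12-12T09:51:36.000Z", "text": "first example"},
        {"id": "2", "author_id": "42",
         "created_at": "2021-12-11T20:33:36.000Z", "text": "second example"},
    ],
    "meta": {"result_count": 2},
}
without_results = {"meta": {"result_count": 0}}

print(len(extract_tweets(with_results)))   # 2
print(extract_tweets(without_results))     # []
```

The returned list can be passed to pd.json_normalize in the same way as tweets_data.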

# Save the DataFrame as a CSV file
df.to_csv("tweets.csv")
Jan Kirenz
Professor

I’m a data scientist educator and consultant.
