Last spring, Twitter announced some forthcoming changes to the structure of a tweet. These include (1) removing @replies (at the beginning of a tweet) from the body of the tweet and (2) removing media attachments (URLs at the end of a tweet) from the body of the tweet. Together, these enable “extended tweets”: tweets that, once @replies and media attachments are counted, can contain more than 140 characters. Those of us who are trying to squeeze in those last few words without sacrificing grammar will appreciate the additional characters.
Twitter followed this up a few weeks ago with some additional details on changes this would entail in the Twitter API. Yesterday (March 30), the first of these changes went live on the Twitter website. These changes impact applications like Social Feed Manager that collect social media data. The goal of this blog post is to explore the salient changes.
Here’s a sample tweet that I’ll be referring to:
In extended tweets from the REST API, the full_text field replaces the text field. The full_text field may contain more than 140 characters:
In addition, the display_text_range field contains the offsets in full_text of the body of the tweet:
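As a rough illustration of how these offsets might be applied, here is a minimal Python sketch. The tweet is a made-up dictionary rather than real API output, and the helper name split_display_text is hypothetical. It assumes the offsets index Unicode code points, which is how Python 3 slices strings.

```python
def split_display_text(tweet):
    """Split full_text into (prefix, body, suffix) using display_text_range."""
    text = tweet["full_text"]
    start, end = tweet["display_text_range"]
    return text[:start], text[start:end], text[end:]


# Hypothetical extended tweet, trimmed to the two fields discussed here.
sample = {
    "full_text": "@alice @bob Extended tweets can carry more text https://t.co/abc123",
    "display_text_range": [12, 47],
}

prefix, body, suffix = split_display_text(sample)
print(repr(prefix))  # the leading @replies
print(repr(body))    # the body of the tweet
print(repr(suffix))  # the trailing attachment URL
```

Under these assumptions, the prefix holds the leading @replies and the suffix holds the trailing media URL, neither of which is part of the body.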
The Streaming API does not accept a tweet_mode parameter, so there is no way to choose between extended tweets and classic tweets. Rather, if a tweet uses any of the extended features, then an additional field called extended_tweet is added. It contains:
- full_text (which may be more than 140 characters),
- display_text_range (the offsets in full_text of the body of the tweet), and
- extended_entities (parsed out hashtags, mentions, URLs, media, etc.).
Here is the extended_tweet section from a sample tweet:
A few notes:
- All of the existing fields remain; in particular, the text field still exists and is limited to 140 characters.
- Based on a look at the sample stream, the extended_tweet field is not included in all tweets, only those using the extended features.
The latest release of DocNow’s twarc already supports extended tweets. Extended tweets can be selected by adding --tweet_mode extended to the command line or by setting the tweet_mode argument in Twarc’s constructor.
SFM will be adding support for extended tweets in the forthcoming 1.7 release. (Here’s the ticket.) In addition to incorporating the new version of twarc, we’ll make sure that all parts of the application that extract data from the tweet handle all three flavors (classic, extended REST, and extended streaming).
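To make the three flavors concrete, here is a minimal Python sketch of one way to pull the complete text out of any of them. It is not SFM's or twarc's actual code; the helper name get_full_text and the sample dictionaries are invented for illustration, and it assumes only the field layout described above.

```python
def get_full_text(tweet):
    """Return the complete text of a tweet in any of the three flavors."""
    # Extended streaming: the complete text lives under extended_tweet.
    if "extended_tweet" in tweet:
        return tweet["extended_tweet"]["full_text"]
    # Extended REST: full_text replaces text at the top level.
    if "full_text" in tweet:
        return tweet["full_text"]
    # Classic: only the text field is present.
    return tweet["text"]


# Made-up minimal tweets, one per flavor.
classic = {"text": "a classic tweet"}
extended_rest = {"full_text": "an extended tweet from the REST API"}
extended_stream = {
    "text": "a truncated tweet from the stream...",
    "extended_tweet": {"full_text": "a tweet from the streaming API, in full"},
}

for tweet in (classic, extended_rest, extended_stream):
    print(get_full_text(tweet))
```

Checking for extended_tweet first matters because a streaming tweet still carries the 140-character text field alongside it.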
Try it yourself
To get some extended tweets from the REST API using twarc:
python twarc.py --tweet_mode extended timeline SocialFeedMgr
To get some extended tweets from the Streaming API using twarc:
python twarc.py --tweet_mode extended sample | jq 'select(has("extended_tweet"))'
You may need to wait a bit for an extended tweet.
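If jq is not available, the same filter could be approximated in Python. This is a sketch that assumes the stream arrives as one JSON object per line; the function name extended_only and the sample lines are made up for illustration.

```python
import json


def extended_only(lines):
    """Yield parsed tweets that carry an extended_tweet field."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        tweet = json.loads(line)
        if "extended_tweet" in tweet:
            yield tweet


# Made-up input standing in for line-oriented JSON output.
sample_lines = [
    '{"id": 1, "text": "classic"}',
    "",
    '{"id": 2, "text": "short", "extended_tweet": {"full_text": "longer"}}',
]

matches = list(extended_only(sample_lines))
print([t["id"] for t in matches])
```

Passing sys.stdin in place of sample_lines would let the function read piped output directly.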