Disney Plus Data and Chill ;)

November 22, 2019 Josephine Lukito

On November 12, 2019, Disney unveiled its streaming service, Disney+, to the world. It received significant attention, both good and bad, from the press, which makes sense: over 10 million people signed up on the first day.

Twitter was also abuzz with conversations about Disney+ (see this string-of-tweet “news story” about Twitter activity on the first day). Several pointed out that shows, including new ones like The Mandalorian and oldies like Darkwing Duck, were trending soon after Disney+ was launched.

But what would activity look like after the first day?

To answer this question, I used Mike Kearney’s rtweet package to look at tweets posted from 11/14/19 to 11/18/19 that had one of the following keywords: disneyplus, disney plus, disney+, and disney +.
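To make the keyword filter concrete, here is a minimal Python sketch of matching those four keywords against tweet text. It is an illustration only: the collection itself was done with rtweet in R, and the function name and sample tweets below are made up.

```python
import re

# The four keywords from the post, matched case-insensitively.
# "disney\s?plus" covers "disneyplus" and "disney plus";
# "disney\s?\+" covers "disney+" and "disney +".
KEYWORD_PATTERN = re.compile(r"disney\s?(?:plus|\+)", re.IGNORECASE)


def mentions_disney_plus(text: str) -> bool:
    """Return True if the text contains any of the four keywords."""
    return KEYWORD_PATTERN.search(text) is not None


# Made-up example tweets, for illustration only.
sample_tweets = [
    "Finally watching The Mandalorian on Disney+",
    "disneyplus keeps crashing",
    "Going to Disney World next week",
]
matching = [t for t in sample_tweets if mentions_disney_plus(t)]
```

Here `matching` keeps the first two sample tweets and drops the one that only mentions Disney World.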

Timeline

As with any long-term (> 1 day) popular topic (like elections), tweets about Disney+ had a natural seasonality. People tweet less after midnight and pick back up at 6 or 7 a.m. the next day. While activity was still pretty high on the 14th, people tweeted less and less about it over time (as would be expected). There were a little over a million tweets in the corpus (n = 1,107,413).
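A timeline like this comes down to counting tweets per hour. The Python sketch below shows one way to do that bucketing; it is not the code behind the chart, and the timestamps are invented for illustration.

```python
from collections import Counter
from datetime import datetime


def tweets_per_hour(timestamps):
    """Count tweets in each hour bucket, in chronological order."""
    buckets = Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )
    return sorted(buckets.items())


# Invented timestamps, for illustration only.
sample_timestamps = [
    datetime(2019, 11, 14, 23, 5),
    datetime(2019, 11, 14, 23, 40),
    datetime(2019, 11, 15, 3, 15),
    datetime(2019, 11, 15, 7, 2),
    datetime(2019, 11, 15, 7, 30),
    datetime(2019, 11, 15, 7, 59),
]
hourly = tweets_per_hour(sample_timestamps)
```

Plotting the resulting (hour, count) pairs gives the daily dip-and-recovery pattern described above.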

Topic Modeling

I also ran an LDA topic model, which highlights the variety of conversations on Twitter about Disney+.

Notably, The Mandalorian, Hannah Montana, The Simpsons (which is not on Disney+ in its original 4:3 format), and Bad Girls Club were talked about frequently enough to be (mostly) stand-alone topics. The Mandalorian hashtag (#themandalorian) was also a popular keyword in the corpus.

But we also see a variety of other topics, including one about the Nickelodeon and Netflix deal (which many people viewed as a response to Disney+’s explosive popularity) and another comparing Disney+ to other streaming services (like Netflix, Hulu, and HBO). In fact, Netflix was the third most frequent term in the dataset (behind Disney and Disneyplus).

(Some of the topics were obviously noisier than others. Topics with the little red “n” are “noisier” than the others, meaning that a large number of tweets with a high gamma (per-document topic probability) in that topic were not related to the topic labels. Many tweets in the “Bad Girls Club” topic, for example, don’t actually have to do with that show.)

Sentiment-Laden Words

I did a quick sentiment analysis as well, using the tidytext package (specifically, the bing sentiment lexicon). This allowed me to look at frequently used, sentiment-laden terms.

As with any sentiment analysis that is based on a lexicon, there are obvious limitations. The bing dictionary, for example, includes “trump” as a positive word, so it would also count any mention of Donald Trump as positive.

We can see a similar phenomenon here with the word “chill”, which bing treats as a negative word. If you recall from the topic modeling results, “Disney+ and Chill” was a topic in and of itself. In addition to using the specific phrase “Disney+ & Chill” (which is a snowclone from “Netflix & Chill”), we see people trying to come up with their own variants, including “Disney+ and Thrust” and “Disney+ and Bust”.
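The limitation is easy to reproduce with a toy example. The Python sketch below does a plain dictionary lookup against a tiny hand-made lexicon. Only the “trump” and “chill” labels come from the discussion above; the other entries and the sample tweets are assumptions for illustration, and this is not the tidytext code used for the analysis.

```python
import re
from collections import Counter

# Toy lexicon. "trump" (positive) and "chill" (negative) follow the post;
# the other two entries are assumed examples, not checked against bing.
TOY_LEXICON = {
    "trump": "positive",
    "chill": "negative",
    "love": "positive",
    "crash": "negative",
}


def sentiment_word_counts(tweets, lexicon):
    """Count how often each lexicon word appears, keyed by (word, label)."""
    counts = Counter()
    for tweet in tweets:
        # Lowercase and split into simple word tokens before the lookup.
        for token in re.findall(r"[a-z']+", tweet.lower()):
            if token in lexicon:
                counts[(token, lexicon[token])] += 1
    return counts


# Made-up tweets, for illustration only.
sample_tweets = [
    "Disney+ and chill tonight",
    "I love Disney+ and chill",
    "Trump tweeted about streaming",
]
counts = sentiment_word_counts(sample_tweets, TOY_LEXICON)
```

Because the lookup ignores context, both “chill” tweets count as negative and the surname counts as positive, which is exactly the problem described above.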

For a quick and dirty analysis, this was a pretty fun corpus of tweets to go through! You can check out my code at my GitHub.

Tweets about WI Gubernatorial Race Part I: October 28 to Nov 6

November 12, 2018 Josephine Lukito

Politically, Wisconsin is quite different from my home state of New York. It’s long been considered a purple, or swing, state. For that reason, Wisconsin has often received extra national attention when it comes to local or state-wide politics.

The 2018 Midterm Elections were another example of this, with many citizens around the country tracking Governor Scott Walker’s race against Superintendent Tony Evers. Today, I explore how Twitter talked about this race in the week leading up to election night (October 28 to Nov 6). Part II will focus on the last few hours of the election (12:30 to 2:30 a.m. on November 7, 2018).

(Note: Tweets were collected using the R package rtweet. All datetimes have been converted to CST. For more information about this collection and analysis, please scroll to the bottom.)

A broad temporal view: Oct 28 to Nov 6

In the week leading up to the election, there were several noteworthy spikes. We focus on two in particular: November 1 (8-9pm) and November 4 (7pm).

November 1, 2018 from 8:00-9:59 pm

This was the largest spike for Walker in this week (1568 tweets in two hours). Far and away, the most common verbs were variants of “call” (e.g., “called”/“calls”/“calling”). This is because, that day, Governor Walker said that President Obama was “the biggest liar of the world.” This language (used by non-journalists and journalists alike) also appeared in the leads of news stories from Fox News and The Hill.

November 4, 2018 from 7:00-7:59 PM

Although this peak was not as prominent as the others explored here, it is one of the few times that Evers exceeded Walker in references on Twitter.

Many of these tweets appeared to be campaign-oriented tweets about Evers’ support for Wisconsin residents. Unlike the previous spike, there did not seem to be an event aligned with this moment in time. This suggests that this spike was campaign-induced, rather than naturally generated.

A closer look at Election Day

As can be seen in the above image, attention to the Walker/Evers election peaked after 12:00 AM CST, late in the night relative to other well-watched races that day. Votes rolled in minute by minute, with many outlets (including NYT, one of my main trackers) showing a less than 1% margin for several hours.

Methodology

Tweets were collected using Mike Kearney’s rtweet. I began my search at 2:40 AM CST on November 7, 2018, using the search terms “Scott Walker” OR “Tony Evers” OR “#wipolitics” OR “#wielection”. Twitter’s REST API provides roughly a 1% random sample of tweets. This yielded about 111,000 tweets.

Tweets were annotated for their part-of-speech and dependency using coreNLP. Within the corpus, there were over three million dependencies.

Understanding a little more about recent coverage of Korean-U.S. relations through adjective use

May 25, 2018 Josephine Lukito

Yesterday, U.S. President Trump pulled out of a "highly-anticipated" summit meeting with North Korea's Kim Jong Un. Given the freshness of this story, it'll take some time to collect enough articles to do an analysis of this specific incident. But, in the meantime, here are some interesting results from my analysis of Korean-U.S. relations in American news.

(Data cleaned and analyzed using R tidytext, quanteda, and OpenNLP. Graphs produced by ggplot2 or MediaCloud.)

Count of articles using the words "Trump" and "North Korea" in top American news media (digital + traditional). Results gathered using MediaCloud archive.

As we can see above, the majority of the coverage appeared to be between May 7 (when North Korea claimed to have demolished a nuclear test site) and May 21. Using those two weeks as my window, I pulled all articles referencing "Trump" and "North Korea" from four news outlets: CNN (n = 96), Fox (n = 114), the New York Times (n = 89), and the Washington Post (n = 208), a total of 507 news stories.

I tagged all the words in the news stories for their part of speech using OpenNLP. I then pulled out all the adjectives, removed duplicates, and screened them for accuracy (OpenNLP's accuracy is above 90%, but the human eye is critical to ensuring quality results). Finally, I looked at the use of these adjectives in relation to specific actors/parties (mainly North Korea, South Korea, and the United States). Given the effect of political personalization, I consider both the country name and the name of the leader (e.g., "South Korea" OR "Moon Jae-In" OR "President Moon" OR "Moon Jae In") as keywords. I retained the adjective if it appeared within three words of the NK, SK, or US keywords.
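The "within three words" rule can be sketched in a few lines of Python. This is a hypothetical illustration, not the R code used for the analysis: it assumes the text is already tokenized and tagged with Penn Treebank-style tags (where adjective tags start with "JJ"), and the function name and example sentence are made up.

```python
def adjectives_near_keyword(tagged_tokens, keyword_tokens, window=3):
    """Return adjectives within `window` tokens of each keyword occurrence.

    tagged_tokens: list of (word, pos_tag) pairs, assumed Penn Treebank tags.
    keyword_tokens: the keyword as a list of words, e.g. ["North", "Korea"].
    """
    words = [word.lower() for word, _ in tagged_tokens]
    key = [k.lower() for k in keyword_tokens]
    n = len(key)
    found = []
    for start in range(len(words) - n + 1):
        if words[start:start + n] != key:
            continue
        lo = max(0, start - window)
        hi = min(len(words), start + n + window)
        for i in range(lo, hi):
            if start <= i < start + n:
                continue  # skip the keyword's own tokens
            word, tag = tagged_tokens[i]
            if tag.startswith("JJ"):  # JJ, JJR, JJS
                found.append(word.lower())
    return found


# Made-up tagged sentence, for illustration only.
tagged = [
    ("The", "DT"), ("isolated", "JJ"), ("North", "NNP"), ("Korea", "NNP"),
    ("made", "VBD"), ("a", "DT"), ("surprising", "JJ"), ("offer", "NN"),
    ("to", "TO"), ("its", "PRP$"), ("southern", "JJ"), ("neighbor", "NN"),
]
nearby = adjectives_near_keyword(tagged, ["North", "Korea"])
```

In this example "isolated" and "surprising" fall inside the three-word window and "southern" does not. An adjective sitting near two keyword mentions would be returned twice, so counts from a sketch like this are per mention.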

Raw counts are presented below (keep in mind the corpus is not perfectly balanced... also, sorry I was too lazy to reorder the charts XD Just so tired and wanted to practice some code):

Copyright © Josephine Lukito, 2024