Of Working Faraday Cages and 5G

About a month ago, I ran across this funny tweet about people buying Faraday cages or metal router covers to block 5G:

I got really curious about what the Amazon reviews cumulatively looked like, so I did a small data collection of reviews from 33 different “Faraday cages” (and bags).

For folks who are unfamiliar with Faraday cages, these are encasings (typically of conductive mesh) which are used to block whatever is within the cage from electromagnetic fields. If put around a router, a Faraday cage would naturally block out all internet signal (and if it doesn’t, it wouldn’t actually be a Faraday cage). In other words, buying a Faraday cage to enclose your wireless router would defeat the purpose of having a wireless router.

Amazon Faraday Cages & Router Guards

Though attention to Faraday cages and router guards on Amazon appears to be pretty recent, some of these products have been sold on Amazon for several years. Pre-2020 reviews show that people initially bought them to cover smart meters, which are often installed by electricity suppliers.

More recently, however, people have been purchasing these covers to block signals from 5G routers. In fact, there has been a notable increase in the number of verified reviews of these products.

Throughout the time span, verified reviews of the products range widely, from folks who are convinced that using a router guard has decreased their headaches or improved their sleep to people complaining that the product has made using the internet impossible. One common feature of the positive verified reviews was an emphasis on how the guards would block elites (electric companies and governments) from “getting inside my brain.”

review_3.png

However, there were also reviews of folks complaining that their internet was no longer accessible.

Another big reason why some of the Faraday cages/bags were poorly reviewed was that they were too small for routers. This was an especially common critique when people tried to use cages designed for smart meters to cover their routers.

Unverified reviews typically came in two flavors: (1) mocking those who had genuinely bought the product or (2) corrective information that tried to explain why these products are basically pointless. Notably, since the December 2, 2020 tweet, the number of unverified reviews has grown considerably.

review_4.png
review_6.png

Unsurprisingly, the most common positive sentiment words (when using the Bing sentiment dictionary) focused on its ease of use and how it “worked perfectly” (this was said both sarcastically and genuinely). Negative words either focused on the “harmful” effects of electromagnetic fields (headaches, cancer, etc.) or criticized the cages for being a scam or joke.

Though the results were pretty unsurprising, this was a good exercise in playing around with Amazon reviews! Plus, with my first semester of teaching over, I’m hoping I can be a more active blogger.

The data and code for this analysis can be found on my github, here.

Disney Plus Data and Chill ;)

On November 12, 2019, Disney unveiled its streaming service, Disney+, to the world. It received significant attention, both good and bad, from the press, which makes sense, because over 10 million people signed up on the first day.

Twitter was also abuzz with conversations about Disney+ (see this string-of-tweet “news story” about Twitter activity on the first day). Several pointed out that shows, including new ones like The Mandalorian and oldies like Darkwing Duck, were trending soon after Disney+ was launched.

But what would activity look like after the first day?

To answer this question, I used Mike Kearney’s rtweet package to look at tweets posted from 11/14/19 to 11/18/19 that had one of the following keywords: disneyplus, disney plus, disney+, and disney +.
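
As a rough illustration of how the four keywords relate to one another, here is a minimal base R sketch that matches all of them with a single regular expression. The example tweets are invented, and this is a local filter over strings rather than the rtweet search itself, so treat it as a sketch under those assumptions.

```r
# Toy tweets (invented for illustration, not from the corpus)
tweets <- c("Just signed up for Disney+!",
            "disneyplus is down again",
            "Watching Disney Plus all weekend",
            "disney + and chill",
            "Going to Disneyland tomorrow")

# One pattern covering: disneyplus, disney plus, disney+, disney +
# " ?" allows an optional space; "\\+" matches a literal plus sign
pattern <- "disney ?(plus|\\+)"

is_match <- grepl(pattern, tweets, ignore.case = TRUE)
matched_tweets <- tweets[is_match]
print(matched_tweets)
```

The last toy tweet is left out because “Disneyland” contains neither “plus” nor a plus sign after “disney”.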

Timeline

As with any long-term (> 1 day) popular topic (like elections), tweets about Disney+ had a natural seasonality. People tweet less after midnight and pick back up at 6 or 7 a.m. the next day. While activity was still pretty high on the 14th, people tweeted less and less about it over time (as would be expected). There were a little over a million tweets in the corpus (n = 1,107,413).
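
The hourly pattern described above comes from binning tweets by time. Here is a minimal base R sketch of that idea; the timestamps are invented and the column name created_at is an assumption, so this is not the code behind the actual timeline.

```r
# Toy timestamps (invented); a real corpus would use each tweet's creation time
created_at <- as.POSIXct(c("2019-11-14 23:15:00", "2019-11-14 23:40:00",
                           "2019-11-15 03:05:00", "2019-11-15 07:10:00",
                           "2019-11-15 07:45:00", "2019-11-15 07:59:00"),
                         tz = "UTC")

# Truncate each timestamp to the start of its hour, then count tweets per hour
hour_bin <- format(created_at, "%Y-%m-%d %H:00", tz = "UTC")
hourly_counts <- table(hour_bin)
print(hourly_counts)
```

Plotting hourly_counts over several days is what makes the overnight dips and morning pickups visible.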

Topic Modeling

I also ran an LDA topic model, which highlights the variety of conversations on Twitter about Disney+.

Noticeably, The Mandalorian, Hannah Montana, the Simpsons (which is on Disney+ in its original 4:3 format), and Bad Girls Club were talked about frequently enough to be (mostly) stand-alone topics. The Mandalorian hashtag (#themandalorian) was also a popular keyword in the corpus.

But we also see a variety of other topics, including one about the Nickelodeon and Netflix deal (which many people viewed as a response to Disney+’s explosive popularity) and another comparing Disney+ to other streaming services (like Netflix, Hulu, and HBO). In fact, Netflix was the third most frequent term in the dataset (behind Disney and Disneyplus).

(Some of the topics were obviously noisier than others. Topics with the little red “n” are “noisier” than the others, meaning that a large number of tweets with a high beta in that topic were not related to the topic labels. Many tweets in the “Bad Girls Club” topic, for example, don’t actually have to do with that show.)

Sentiment-Laden Words

I did a quick sentiment analysis as well, using the tidytext package (specifically, the bing sentiment lexicon). This allowed me to look at frequently used, sentiment-laden terms.

tweet_sentiment.png

As with any sentiment analysis that is based on a lexicon, there are obvious limitations. The bing dictionary, for example, includes “trump” as a positive word, but it would count any mention of Donald “Trump” as well.

We can see a similar phenomenon here with the word “chill”, which Bing treats as a negative word. If you recall from the topic modeling results, “Disney+ and Chill” was a topic in and of itself. In addition to using the specific phrase “Disney+ & Chill” (which is a snowclone from “Netflix & Chill”), we see people trying to come up with their own variants, including “Disney+ and Thrust” and “Disney+ and Bust”.
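
To make the limitation concrete, here is a small base R sketch of how lexicon matching works. The mini-lexicon is hypothetical: only the “trump” and “chill” polarities mirror what is described above, and the other entries are made up for illustration. It is not the real Bing lexicon or the tidytext workflow.

```r
# Hypothetical mini-lexicon for illustration (NOT the real Bing lexicon);
# the "trump" and "chill" entries mirror the polarities described in the text
toy_lexicon <- data.frame(word = c("trump", "chill", "love", "scam"),
                          sentiment = c("positive", "negative", "positive", "negative"),
                          stringsAsFactors = FALSE)

tweet <- "Disney+ and chill tonight, love it"

# Lowercase, split on anything that is not a letter, drop empty strings
tokens <- strsplit(tolower(tweet), "[^a-z]+")[[1]]
tokens <- tokens[tokens != ""]

# Look up each token; words missing from the lexicon come back as NA
token_sentiment <- toy_lexicon$sentiment[match(tokens, toy_lexicon$word)]
scored <- data.frame(token = tokens, sentiment = token_sentiment,
                     stringsAsFactors = FALSE)
print(scored[!is.na(scored$sentiment), ])
```

Because the lookup works one word at a time, “chill” is scored as negative even though the phrase it sits in is playful.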

For a quick and dirty analysis, this was a pretty fun corpus of tweets to go through! You can check out my code at my Github.

Armchair Linguistic-ing: ~sparkle~ or sPoNgEbOb sarcasm?

 Today, I’m going to play the role of “armchair linguist” (which is fun and something everyone can do. Everyone can be an armchair linguist.) As much as I would love to pull some data and analyze some fun text, I’m deep in analyzing my dissertation data and should really focus my computing energy towards that. 

However, I was thinking about sarcasm recently when writing about the phrase, “the internet is serious business.” This is a sarcastic remark (and an early meme) that pokes fun at people taking online discourse too seriously. In my little memo, I went back and forth between two constructions of sarcasm:

(1)    Even if the internet is not ~serious business~,…
(2)    Even if the internet is not sErIoUs bUsInEsS,…

Both are typographic markers of sarcasm that people frequently use in online communication. The first is an example of sparkle sarcasm, sometimes described as the “sarcasm tilde.” In Because Internet, McCulloch describes the tilde as having an exaggerated rise and fall that mimics the tonal features of sarcastic language. Even single-syllable words like “thaaaaaaaanks” and “soooooo” can be elongated for sarcastic effect. Many moments in South Park’s Sarcastaball episode show off this elongation.

Source: Top definition for “~” on Urban Dictionary

The second is (now) an outgrowth of the popular “mocking spongebob” meme, which produced the now well-known sPoNgEbOb cAsE (fun fact: R has a sPoNgEbOb cAsE package, which you can check out here). I’ll call this spongebob sarcasm. The primary purpose of this case variation is to mock the tone of an idea or opinion, which draws from the mocking intent of the original meme. “Spongebob case” is obviously not the first use of alternating caps. Like sparkle sarcasm, it was grouped with the use of tildes and asterisks constituting sparkly unicorn punctuation (~*~*iSn’T tHiS gReAt?!*~*~). But under the Mocking Spongebob meme, it’s taken on a life of its own, in the way that sparkle sarcasm is now distinct from ~*~*more ornate*~*~ uses of tildes and asterisks.
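
For the curious, the case alternation itself is easy to sketch in base R. The function names below (alternate_case and to_spongebob) are made up for this example and are not taken from the package mentioned above. The sketch assumes each word restarts in lower case, which is the convention used in the examples in this post.

```r
# Alternate lower/upper case within a single word, starting in lower case
alternate_case <- function(word) {
  chars <- strsplit(tolower(word), "")[[1]]
  upper_positions <- seq_along(chars) %% 2 == 0
  chars[upper_positions] <- toupper(chars[upper_positions])
  paste(chars, collapse = "")
}

# Apply the alternation word by word, so every word begins in lower case
to_spongebob <- function(text) {
  words <- strsplit(text, " ", fixed = TRUE)[[1]]
  paste(vapply(words, alternate_case, character(1)), collapse = " ")
}

print(to_spongebob("serious business"))
```

Other conventions exist (for example, alternating across the whole string, or randomizing the case), so this is one reasonable reading rather than the definitive rule.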

An early case of spongebob sarcasm. Source: (Know Your Meme)

So how is sparkle sarcasm different from spongebob sarcasm? In Because Internet, McCulloch notes a Buzzfeed reporter’s description of sparkle sarcasm: “somewhere between sarcasm and a sort of mild self-deprecatory embarrassment.” The use of sparkles suggests a type of “anti-serious” sarcasm that is “sing-songy.”

 In contrast, spongebob sarcasm is direct and biting—a type of “insincere” sarcasm. If sparkle sarcasm is self-deprecatory, spongebob sarcasm is mockery. A core aspect of its early use included mockingly repeating what someone else has said (that norm carries to its current usage, even if mocking oneself):

(Above: my favorite example of spongebob sarcasm this morning)

Having both types of sarcasm gives online communicators a greater variety of “sarcasm” to choose from. And, because they are denoted with obvious markers (tildes and alternating lower and upper case), both sparkle and spongebob sarcasm are less likely to be taken at face value, whereas tonally conveyed sarcasm could produce a misunderstanding.

If we think of sarcasm as a language microcosm of satire, we could also think of sparkle sarcasm as Horatian (playful and light-heartedly humorous) and spongebob sarcasm as Juvenalian (i.e., ridicule). I bring this up to highlight that these variations of sarcasm and language are not inherently new. But, we have found new ways to communicate those ideas in daily computer-mediated language, which I think is super cool.

(PS: I went with spongebob sarcasm: tHe iNtErNeT iS nOt sErIoUs bUsInEsS!1!1!!one!!1!1!!!)

The Grammar of "A Thing": Using R to Study Digital Corpora

One of the things I love most about my field (Communication) is its unique passion for building corpora. While there is an obvious value to studying a large, well-studied, pre-structured corpus like LOB or COCA (e.g., multiple scholars working on one dataset increases knowledge about that dataset, reproducibility, etc.), some research questions require more specialized text data.

This is often the situation that I find myself in. If I want to study a linguistic phenomenon in a specific register—like the use of “a thing” in English tweets—I usually have to build my own corpus. So how does one do that?

I’ll break my process down into four broad steps: (1) armchair linguistic-ing, (2) creating the corpus, (3) finding your linguistic phenomenon, (4) corpus analysis.

01. Armchair Linguistic-ing

I became primarily interested in this construction because of its frequency in language use. In spoken English, sentences like, “oh yeah, that’s a thing” are commonplace, even in formal-ish settings, like classrooms (I’m in a J-School, so it’s not unusual to hear someone say, “Yeah, AP style is a thing”).

I’ve always liked this construction, because “a” and “thing” are particularly vague English words. The determiner “a” (as in “I gave her a book”) is indefinite, meaning it refers to something non-specific (contrast this to “I gave her the book”). And “thing” is so broad, it could refer to any tangible, inanimate object. A watch, a book, a stroller, a ticket to Disney—all of these things are things. But when we put “a thing” together, it can suddenly take on a whole new meaning. When someone says, “AP style is a thing”, they mean “people know about AP style” or “AP style is popular”. In this context, the “a thing” is more than an indefinite determiner and a vague noun. Rather, it signifies some degree of importance.

But is this always the case? I wasn’t sure. So, I turned to my corpus building skills to find out.

I figured there could be four general places that “a thing” could be situated in. The first is the subject, like in the sentences below:

  1. A thing needs to be done.

  2. A thing just arrived.

The second possibility is in the object position, such as in the examples below:

  1. I know a thing or two about school.

  2. I made a thing.

It is also possible that “a thing” is used as a predicate noun/nominative. It is also a subject complement, because it completes a linking verb (in English, “to be”). This is the structure I was most interested in.

  1. This is a thing.

  2. That has been a thing for a long time.

And finally, we’ll look into object complements, such as in the examples below:

  1. He considered the party a thing.

  2. He cooked his friends a thing.

Now that I knew what I was looking for, it was time to build and parse my corpus.

02. Building Your Corpus

The first thing you’ll need to think about is where you want to get the data from. Do you want to look at journal articles? Fiction novels? Text messages between friends?

I settled on Twitter for a few reasons, the most important of which was “it is an informal register that is easy to get.” I figured the feature I was looking for would likely not be in a formal register, like news stories or presidential speeches. However, Twitter (and social media language as a whole) is simultaneously beautiful and frustrating in its kind-of-formal, kind-of-informal language norms (beautiful in that language evolves so quickly, frustrating in that there are way too many people who use prescriptivism to put down other people’s tweets).

If you are trying to access the Twitter REST API through R, I strongly advocate using rtweet, by Mike Kearney. It’s a really cool package, and a great way to build interesting Twitter corpora at your leisure.

library(rtweet)
#?search_tweets
rstats_tweets <- search_tweets(q = '"a thing"',
                               n = 1000000, 
                               retryonratelimit = TRUE) #max 18,000 every 15 minutes

head(rstats_tweets, n = 5) #looks at the first 5 tweets

This search yielded about 500,000 tweets (510,574, to be exact). To identify whether the bigram “a thing” would be used as a subject, object, predicate, or object complement, I would need to annotate this bad boy.

Right now, I’m using the R cleanNLP package, with back ends to spaCy and CoreNLP. I tend to use the latter more (coreNLP) because I’ve gotten better results. But spaCy is much faster and has additional support. I strongly encourage it for those who are both R and Python-proficient (it can also support word vectors and has a great displaCy visualizer).

library(rJava)
library(tokenizers)
library(cleanNLP)
library(dplyr) #provides %>% and mutate(), used below

In order to use cleanNLP, you’ll need to interface with the back end (either coreNLP or spaCy).

#cnlp_init_tokenizers() #initializes tokenizer backend
cnlp_download_corenlp()
cnlp_init_corenlp("en", anno_level = 2)
# cnlp_init_spacy

Once you have done this, you are ready to parse your corpus! For the purposes of this exercise, I’m going to use some toy data (parsing the full corpus took about 2 days—I had about 20 million dependencies total).

Toy Data

If you notice, most of the 9 sentences are the ones (or close variants of the ones) in my previous examples.

toy_data <- data.frame(id = c("s1", "s2", "o1", "o2", "sp1", "sp2", "sp3", "dc1", "dc2"),
                       sentence = c("A thing needs to be done.", "A thing just arrived.", 
                                    "I know a thing or two about school.",
                                    "I made a thing.", 
                                    "This is a thing.", 
                                    "Is summer camp a thing?",
                                    "That has been a thing for so long.", 
                                    "He considered the party a thing.", 
                                    "He made his friends a thing."))
starttime <- Sys.time()
full_corpus_dep <- toy_data$sentence %>% as.character() %>%
  cnlp_annotate(as_strings = TRUE, doc_ids = toy_data$id) %>%
  cnlp_get_dependency(get_token = TRUE)
endtime <- Sys.time()

You want to make sure that you indicate the doc_ids of the data, as that is what you will use to re-align the dependency information to the original tweet or sentence.

Once you do this, you should get a data frame that looks something like this:

Let’s break down what cnlp_get_dependency produces. Each row represents one dependency relationship. Each column represents some information about that dependency (e.g., what document or sentence the dependency is in, what words the dependency relationship is linking, etc.).

A brief interlude to help us understand dependency grammar… Dependency grammar interprets two words as having a dependency (relationship) between them. This differs from constituency grammar, which breaks down word relationships into phrases, not dependencies. An important skillset in this work is being able to read the results of one and interpret it as the other (e.g., see dependency relations and conceptualize them as phrases, or see phrases and construct the dependencies).

Because dependencies focus on the relationship between two words, we can conceive of a dependency relationship as having a “word”, a “word target”, and a “relation”. Consider the very simple example of “I run.” In this sentence, we have a subject and a verb. In dependency grammar, the verb is the “root” or the center of the sentence. Therefore, each of your sentences will usually have a root. Arrows lead out from the root to other words (these are the “word targets”). Thus, if “run” is the root verb, then the word target “I” is the subject to that verb.

Let’s now look at each column in more detail. The <id> is pretty obvious: it’s the document id, or <doc_ids>, you indicated previously. The <sid> is the sentence number. For most tweets and sentences, the <sid> number will be a 1. However, blog posts, news articles, product reviews, and other longer documents are all likely to have multiple sentences. The <tid> refers to the token number of the word. There is also a <tid_target>, which is the token number of the word target.

The six other columns are: <relation>, <relation_full>, <word>, <lemma>, <word_target>, and <lemma_target>. The <lemma> and <lemma_target> are the lemmatized forms of the word and word_target (for example, the words “thinking”, “thought”, and “thinks” can be represented by the lemma /think/). Using the lemmatized form meant I largely did not have to worry about tense issues.

The <relation>, <word>, and <word_target> are the meat of the dependency analysis. The first “dependency” of a sentence is usually the ROOT verb. Let’s return to our “I run.” example below.

As you can see, the “run” verb is identified as the root. This is not really a dependency, but more an identification of what the root verb is (hence why there is no actual <word>, and why the <tid> is 0). The second row identifies a <nsubj> dependency “relation”, with the root verb “run” as the <word>, and the noun subject “I” as the <word_target>.
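
To make the two rows concrete without running a parser, here is a hand-built base R data frame for “I run.” It only follows the description above: the values are typed in manually, the “ROOT” placeholder in the word column and the document id “ex1” are assumptions, and real parser output may differ in details.

```r
# Hand-built illustration of the two rows described above for "I run."
# (typed in manually; not actual parser output, and values such as the
# "ROOT" placeholder are assumptions)
i_run_dep <- data.frame(
  id          = c("ex1", "ex1"),
  sid         = c(1L, 1L),
  tid         = c(0L, 2L),          # 0 marks the artificial root; "run" is token 2
  tid_target  = c(2L, 1L),          # "run" is token 2; "I" is token 1
  relation    = c("root", "nsubj"),
  word        = c("ROOT", "run"),
  word_target = c("run", "I"),
  stringsAsFactors = FALSE
)

# Which word is the subject of the root verb?
root_verb <- i_run_dep$word_target[i_run_dep$relation == "root"]
subject   <- i_run_dep$word_target[i_run_dep$relation == "nsubj" &
                                   i_run_dep$word == root_verb]
print(subject)
```

Reading the table this way (find the root, then follow relations out from it) is the same habit needed for the larger corpus.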

There are many (many) possible dependency relations. You can find a list of them here.

There is some older documentation that can also be potentially useful here (this version of the dependencies is no longer maintained).

Let’s now apply this knowledge to our toy data.

03. Finding your Linguistic Phenomenon

Recall that our goal is to identify whether the bigram “a thing” appears as a subject, object, predicate, or object complement.

Let’s do so by identifying all the dependencies for which “thing” is a <word> or <word_target> (the “a” in “a thing” will be identified as a determiner <word_target> to the “thing”).

thing_word <- subset(full_corpus_dep, word == "thing")
thing_target <- subset(full_corpus_dep, word_target == "thing")

Notice that the “a thing” dependency shows up in the <thing_word> subsetted data. But the more useful dataset for us is the <thing_target> data.

Notice that, if in the subject position, the “thing” <word_target> has a <nsubj> (noun subject) <relation> to a verb. In the object position, the “thing” <word_target> has a <dobj> (direct object) <relation>. In the predicate position, the “thing” <word_target> is the ROOT (if you check the <word> data, you will also note a <cop>, or copula, <relation> from the verb “to be” to the “thing” <word>). In the object complement position, the “thing” <word_target> has an <xcomp>, or an “open clausal complement” <relation>.
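
The mapping from relation to position described above can be written as a small lookup. This is a base R sketch on hand-built rows that stand in for the thing_target subset. The name toy_thing_target and the position labels are assumptions made for the example.

```r
# Hand-built rows standing in for the thing_target subset (not parser output)
toy_thing_target <- data.frame(
  id       = c("s1", "o2", "sp1", "dc1"),
  relation = c("nsubj", "dobj", "root", "xcomp"),
  stringsAsFactors = FALSE
)

# Lookup from dependency relation to the position label used in this post
position_lookup <- c(nsubj = "subject",
                     dobj  = "object",
                     root  = "predicate",
                     xcomp = "object complement")

toy_thing_target$position <- unname(position_lookup[toy_thing_target$relation])
print(toy_thing_target)
```

Any relation missing from the lookup would come back as NA, which is a handy way to spot parses that fall outside the four positions.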

Below is an image of all the dependency relationships I was interested in, as related to the “a thing” bigram.

Side note: While the toy data plays nicely, real data isn’t always perfectly parsed. For example, I had about 2,000 tweets where a copula-predicate relationship was identified as a subject-verb(“to be”)-object relationship (these had a “nsubj” + “det” + “dobj” relationship, but the root lemma was “be”—this meant they were initially coded as “objects” but, upon further examination, I subset them to the predicate list).
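
The re-coding in the side note might look something like the base R sketch below. The data frame and the root_lemma column are hypothetical helpers invented for the example, so this shows the general idea and not the exact steps used on the full corpus.

```r
# Hypothetical per-tweet summary; root_lemma is an assumed helper column
# holding the lemma of each sentence's root verb
coded <- data.frame(
  id         = c("t1", "t2", "t3"),
  position   = c("object", "object", "predicate"),
  root_lemma = c("make", "be", "thing"),
  stringsAsFactors = FALSE
)

# An "object" whose root verb is "to be" is really a copula construction
misparsed <- coded$position == "object" & coded$root_lemma == "be"
coded$position[misparsed] <- "predicate"
print(coded)
```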

04. Corpus Analysis

Now that we know what the relationships are, we can re-aggregate to the tweet level. My corpus had a few instances (<10) where “a thing” was used twice. In all these instances, however, the “a thing” bigrams were in the same position.

subject <- subset(thing_target, relation == "nsubj", select=id) %>% mutate(subject = 1)
object <- subset(thing_target, relation == "dobj", select=id) %>% mutate(object = 1)
predicate <- subset(thing_target, relation == "root", select=id) %>% mutate(predicate = 1)
complement <- subset(thing_target, relation == "xcomp", select=id)  %>% mutate(complement = 1)

toy_data2 <- merge(toy_data, subject, by = "id", all.x = TRUE) %>% 
  merge(object, by = "id", all.x = TRUE) %>%
  merge(predicate, by = "id", all.x = TRUE) %>%
  merge(complement, by = "id", all.x = TRUE)
toy_data2[is.na(toy_data2)] <- 0
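
One thing to watch with merges like the ones above: a tweet that uses “a thing” twice in the same position contributes two rows with the same id, and a left join would then duplicate that tweet. Here is a base R sketch of de-duplicating first, using invented ids and data. I’m not claiming this step was part of the original pipeline.

```r
# Toy inputs (invented): tweet "t2" contains "a thing" twice, both as objects
tweets <- data.frame(id = c("t1", "t2", "t3"), stringsAsFactors = FALSE)
object_hits <- data.frame(id = c("t2", "t2", "t3"), object = 1)

# Without de-duplication the left join would return two rows for "t2"
object_unique <- unique(object_hits)

merged <- merge(tweets, object_unique, by = "id", all.x = TRUE)
merged[is.na(merged)] <- 0

# One row per tweet, and a tally of tweets with "a thing" as an object
print(merged)
print(sum(merged$object))
```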

Let us now turn to the results of the full data.

Results

position_of_word.png

As we can see, the bigram “a thing” is most likely to appear in the object position (“I made a thing”) or predicate position (“This is a thing”). Rarely is “a thing” used in the subject position (e.g., “Love the phrase ‘a meteoric rise’, a thing a meteor has never done”). Fewer than 250 tweets had “a thing” in the object complement position.

“A thing” in the Predicate position

As I expected, tweets that used “a thing” in the predicate noun position discussed a subject as popular, socially important, or at least well-known. These tweets usually followed a similar structure: the word “thing” is the root. The word_target “a” is a determiner to “thing”, and the lemma “to be” (representing “is”, “are”, “was”, and “were”) is a copula to “thing”. Finally, the “nsubj” relationship would link the noun word_target to the “thing” word (this is why we need the thing_word subsetted data).

So what nouns are described as “a thing”?

Rplot3.png

The figure above shows that, when “a thing” is a predicate, it often links to demonstrative determiners (this, that) or the pronoun “it”. We also see some more specific nouns, such as “church”, “abortion”, and “harassment”.

Many of these tweets were exclamations (e.g., “I didn’t know this was a thing!” or “OMG this is a thing?!” or “Had no idea this was a thing!!”). Some were questions, about whether something was “still a thing”: “I grew up being told about thick and thin. Is that still a thing[?]”

“Church” appeared often in tweets like, “Is church still a thing?” and “How is church and religion still a thing?” In at least one instance, a tweet was incorrectly parsed: “separation of church and state is a thing, you know that right little @mike_pense?” (the parser coded this as ([NP] separation ([PP] of ([N] church)) ([CONJ] and) ([N] state)), rather than treating “church and state” as a conjunction within the prepositional phrase).

Many people also described abortions as a thing (or not a thing). A few tweets noted “Abortions are a thing” to note the frequency with which they occur. One tweet said “Late term abortions should not be a thing”, focusing on a specific type of abortion. Another said, “Post-term abortion is not a thing”, referring specifically to President Trump’s coinage.

“A thing” in the object position

Let us now turn to the use of “a thing” in the object position. In these cases, “a thing” is still relatively vague. If you “make a thing”, you’re not necessarily saying the thing you made is popular or well-known; you may simply be happy that you did it (e.g., “I did a thing!”). We can explore this structure more by looking at the verbs associated with the “a thing” bigram in the object position. The dependency relationship would be a dobj from the word_target “a thing” to a verb.

The figure below displays the verbs that appeared in at least 5,000 tweets for which “a thing” was a direct object.

Rplot2.png

Keep in mind that these verbs are lemmatized (therefore the lemma “know” represents “know” and “knew”). By far, the most common verbs used were “to do” and “to make” (as in “I did a thing” and “I made a thing”). This is followed by “to have”, “to know”, “to miss”, and “to learn”.
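
Counting verb lemmas like this is a short tabulation. Below is a base R sketch with invented lemmas and a tiny cutoff. The variable names are assumptions, and the numbers have nothing to do with the real corpus counts.

```r
# Toy verb lemmas for tweets where "a thing" is a direct object (invented)
verb_lemmas <- c("do", "make", "do", "know", "make", "do", "have")

# Count each lemma and sort from most to least frequent
verb_counts <- sort(table(verb_lemmas), decreasing = TRUE)

# Keep lemmas at or above a cutoff (2 here; the post used 5,000 on the full corpus)
min_count <- 2
frequent_verbs <- verb_counts[verb_counts >= min_count]
print(frequent_verbs)
```

Because the counts are on lemmas, “did” and “do” (or “made” and “make”) land in the same bucket.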

For example, one tweet said “I made a thing for Pokemon fans and Kingdom Hearts fans [Image]”. Note that the author provided additional info (a prepositional phrase and an image) to describe the “thing”. Another user tweeted, “It looks bad but i did a thing [image].” Many of these tweets expressed some pride over doing, making, creating, having, buying, or owning “a thing”. This was almost always accompanied by pictures of what the “thing” was.

Some tweets also referenced love, as in “love don’t mean a thing” or “love don’t cost a thing”. About 6000 tweets used the verb “change”, and were often about commenting on other people’s worth (e.g., “don’t change a thing!”)

Overall, I really enjoyed doing this analysis. It’s been more difficult to do this analysis with my Ph.D work picking up, but I’m glad I can still find the time every now and again.

Marrying "to get" and "to have"

Typically, “have got” is considered the informal counterpart of “have” (see here for an example). This presumes that the meaning is the same, and that the two constructions differ only in formal/informal tone (in other words, the deep structure of the sentence is sustained).

This certainly works when the object is a noun. Let’s look at four sentences where this works:

1A           “I’ve got a present”
1B           “I have a present”
2A           “I’ve got a boyfriend”
2B           “I have a boyfriend”

 But this dynamic changes when the object of the sentence is a pronoun. In fact, the two (“have got” and “have”) are no longer semantically similar when we apply them to pronouns. Consider the following three sentences:

3A           “I have you.”
3B           “I got you.”
3C           “I’ve got you.”

 In the first of these three sentences, I have you is an indication of possession, and can be somewhat creepy in the wrong context (after all, will you have me for lunch?). However, it can also be used in a relational context. For example, see the sentences below, from COCA:

4              “You have me at a disadvantage” (Fiction)
5              “Will you have me back on the show and apologize in person?” (TV News)
6              “Once you have him chitchatting, he might inadvertently let something slip.” (Magazine)
7              “I can do anything if I have you with me.” (News)

Here, there is still the essence of possession, as typically the subject is in a position of authority. However, in the last example, [7], we can see the use of “have” in the relational context, which is where [3C] comes in.

The sentence “I got you” shows the complexity of the verb ‘to get’, since it holds multiple meanings. With nouns, often the “to get” is also an indication of possession (“I got coffee [for you]”). However, with pronouns, the verb “to get” is a signal of understanding (“she gets me” or “he gets her”).

The syntax of the last sentence, “I’ve got you” is a mix of both—possession and understanding. This combination is so much more meaningful than the individual terms, making the syntactic construction “have got” more complex than the simple “formal/informal” narrative we traditionally learn.

3C           “I’ve got you.”
9              “You’ve got me.”

In the four “words” above (technically three words, one of which is a contraction), we are expressing both possession and understanding, or in other words, trust. When I say “he’s got her”, I am indicating that the object (“she”) can trust the subject (“he”) because the subject is capable of taking on (possessing) some burden, and because the subject understands the weight of that burden.

The expression of trust and wanting to be trusted is especially clear in first- and second-person pronouns, such as in [3C] and [9]. It’s the kind of language (perhaps my literal love language) that exists in relationships, when we rely on one another immensely as day-to-day safety nets. But it’s also something I find myself saying to friends and close compadres. When I use [3C], I am expressing appreciation for you by indicating how much I trust you. When I use [9], I am expressing a desire to be trusted.

So much can be said in so few words.

Tweets about WI Gubernatorial Race Part II: Election Night

Sorry it’s been so long since I’ve posted! The last few months have been absolutely crazy, with visiting family members, paper deadlines, and end-of-semester tasks. I did hit a major milestone: I have officially completed all my coursework for my Ph.D!

Today, I want to focus on my part-2 analysis of the WI election, which I am informally calling, “If you want to know who will win gubernatorial elections first, follow local journalists.”

The figure below is a plot of tweets about Scott Walker and Tony Evers on the night of the Gubernatorial Election (12 a.m. to 2 a.m.). [For more about the data collection, please see the post below].

The Big Picture

The first vertical line represents the first tweet in my dataset that called a win for Evers. This came from Milwaukee Journal Sentinel reporter Mary Spicuzza, who tweeted out at 12:53 a.m. CST (apologies; my timestamp is not adjusted for daylight saving).

mspicuzzamjs.png

While there was an increase in the number of tweets after Spicuzza’s, the story didn’t reach full attention for another half hour, when it was officially reported by the Associated Press at 1:25 a.m. Attention, measured by counts of tweets with the word Walker or Evers, spikes not long after this tweet.
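
The gap between the two timestamps mentioned above can be checked with a little date arithmetic. In the base R sketch below, the calendar date and the UTC time zone are placeholders (only the clock times come from this post), and the tweet times are invented to show how tweets inside the window could be counted.

```r
# Clock times as given in the post; the date and time zone are placeholders
first_call <- as.POSIXct("2018-11-07 00:53:00", tz = "UTC")
ap_call    <- as.POSIXct("2018-11-07 01:25:00", tz = "UTC")

gap_minutes <- as.numeric(difftime(ap_call, first_call, units = "mins"))
print(gap_minutes)

# Toy tweet times (invented) to show counting tweets inside the window
tweet_times <- as.POSIXct(c("2018-11-07 00:40:00", "2018-11-07 00:55:00",
                            "2018-11-07 01:10:00", "2018-11-07 01:30:00"),
                          tz = "UTC")
in_window <- tweet_times >= first_call & tweet_times < ap_call
print(sum(in_window))
```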

8 ap.png

What happened in that window of time, the half hour between when it was officially reported by the Journal Sentinel, and when it was reported by the Associated Press?

A Twitter Conversation among Wisconsinites

Unsurprisingly, most of the tweets from this time appeared to come from Wisconsin residents, or people with ties to Wisconsin (as indicated by their geographic information, or by information in their profile, such as being an alum of a UW-System school). One tweet from a self-reporting Wisconsinite said, “Tony Evers (D) now up over Scott Walker (R) by just over 1,000 votes out of 2.5M votes cast. #WiGov”

There were also many references to local media outlets, as seen in the examples below (which were also retweeted by mostly Wisconsinites):

“Looks like @tmj4 just reported live from the courthouse that 38,000 votes just went to #tonyevers when @cityofmilwaukee votes we're tallied. I'm calling it. Tony Evers defeats Scott Walker as the next govenor.of #Wisconsin. Boom. https://t.co/0Y1cObTwy!” - RyanThompson

1.png
4.png

There were also many references to local Wisconsin issues, such as Walker’s rampant Union-busting, or his gutting of education funding (this was mentioned by both people in Wisconsin and those outside of Wisconsin, though the former had an obviously greater attachment):

“Fingers crossed that my great home state of Wisconsin has finally rid themselves of the Union-busting, education-destroying, Foxconn swindling corporate shill that is Scott Walker. But I won’t believe he’s gone until every vote is counted” - @sjtruog (1:17 AM)

“Wait did Scott Walker actually lose? Bc I hate him with such a particular acid for what he did to public education in that state that I want to know if I can dance a mad tarantella on that smug prick’s career grave” - @meganskittles

Cultural References

One of the things I really enjoyed about these tweets was their continual cultural references to Wisconsin. Because tweets during this time were predominantly written by Wisconsinites or those with ties to Wisconsin, there were many tweets referencing things like Menards (as noted above), Culver’s, and the Packers.

“I've been this proud to be a Wisconsinite three times: when Favre won a Super Bowl, when Aaron won a Super Bowl, and when we voted melty-faced suffering-horny human khaki Scott Walker out of office.” - @meg_luvs_pandas (1:24 AM)

“If Tony Evers beats Scott Walker that would be the most Wisconsin shit that ever happened since Culver’s showed up.” - @Joe_Bowes (12:53 AM, Milwaukee, WI)

“Can someone tell me if Scott Walker is going to have to get a job at Menard's so I can go to bed.“ - @JustinLaughs (1:07 AM, from Greendale)

Another noticeable feature of this language was the use of the pronoun “we” to refer collectively to Wisconsinites.

“are… are we finally getting rid of scott walker [?] is it happening [?]” - @AlexZiebart (12:55 AM, Milwaukee, WI)

“I’m so nervous to see who won governor in wisconsin […] we need Scott walker out of office!!!” - @taypyt (12:55 AM)

“My final political tweet for the evening: if we have really finally done it, nothing has given me more pleasure than to vote against Scott Walker in five different elections. Bye Felicia” - @alephtwo (12:58 AM, Madison, WI)

The use of this “royal” we (“state-wide” we?) instills the idea of a collective identity directed towards voting Walker out of office. It evokes a sense of solidarity, or of “survival” after Walker’s terms in office.

Outsiders looking in

A few outside of the WI gate were able to tap into this information, as indicated by this tweet from a VA resident: “This is one of the most unbelievable finishes I have ever seen. Came down to a bunch of uncounted absentee ballots. Looks like Scott Walker is done. https://t.co/D8VwrxGZOk.” - @junkiechurch (12:57 AM)

Those in close proximity to Wisconsin appeared to be more attentive as well:

“No more Scott Walker. Wisconsin, I tease you all the time, but you did a good job today.” - @KyleWarner3000 (1:16 AM, DeKalb, IL)

However, many (presumably) outside of Wisconsin expressed frustration about wanting to learn more:

“DAMNIT CNN SHOW ME THE SCOTT WALKER RACE” - @Seattle_9 (1:12 AM, Seattle, WA)

“Wait did Scott Walker actually lose? Bc I hate him with such a particular acid for what he did to public education in that state that I want to know if I can dance a mad tarantella on that smug prick’s career grave “ - @godhatesyeast (1:13 AM, USA [no state indicated])

It was about Walker losing, not Evers winning

As seen in the figure above, attention was squarely focused on Walker losing, rather than Evers winning. This suggests that Twitter communities perceived the election as a victory because Walker lost, not necessarily because Evers won. Many of the tweets were focused on insulting Walker.


“EAT MY ENTIRE SHIT COTT WALKER” - @ChiYoungMoon (12:59 AM)

“Good bye Scott Walker you trifling ham and cheese eatin' bitch https://t.co/ODW3kMWPhC“ - @jae_dubb (1:09, Chicago)

In situations where Evers was referenced, it was often because he (having been a teacher) made an ironic foil to Walker.

5 popular.png

“Scott Walker loses thanks to a Milwaukee wave. What a night. Couldn't happen to a more deserving guy. https://t.co/2KEDdpeNAe3” - @Save_the_Daves (1:15 AM, Milwaukee, WI)

In the above instances, Evers is celebrated but not explicitly mentioned. Walker, by contrast, is referenced in full name. In the first tweet, by @Bro_Pair, Evers is framed as a “kindly teacher”, the kind of person who was directly impacted by Walker’s economic policy. The expression of joy (“glad I stayed up late enough…”) reflects a sense of schadenfreude—taking pleasure in watching Scott Walker lose. This sense was expressed by many others…

“It looks like Scott Walker might lose to Tony Evers. Don’t go to bed or you might miss the best schadenfreude of the midterm elections.” - @RiskyLiberal (1:23 AM)

“Seeing Kris Kobach and Scott Walker lose is pretty sweet, but my schadenfreude dream team was Ted Cruz and Steve King “ - @antitractionist (1:23 AM)

… and sometimes in bizarrely sexual references.

The Recount Topic

Tweets about a recount appeared as early as announcements about the Milwaukee absentee ballots. Many tweets were written by conservative-identifying or MAGA-identifying accounts.

“(R) Scott Walker, WI gov, is requesting a recount.” - @AKLLL49 (1:20, Profile: Love our @POTUS […] PHUCK #Grammernazis #Haters of #Guns and #Freespeech. #MAGA)

“Both sides expect protracted recount in Wisconsin governor's race between Scott Walker and Tony Evers https://t.co/ilCQHR8KUT” - @jackiebullivant (1:22, Profile: Conservative, business owner & political enthusiast. We need honest, authentic gov’t FOR and BY the people b/c people matter! Free speech. #MAGA #PPC2019)

“Governor Scott Walker's campaign has announced plans to call for a recount, should Evers come out on top. Either candidate can call for a recount if the results come in within 1%.” - @KFIZ1450 (1:03 AM, Fond du Lac, WI)

One user was pretty sure Walker was going to win:

“Hey look on the bright side at least we still have Scott Walker and Ted Cruz.” - @johnforchione (12:57 AM)

Liberal Rebuttal

By 1:23, there were already (mostly liberal) rebuttals to a call for a recount, with many pointing out (ironically) that Walker had pushed for the bill that now prevented him from calling a recount.

“Tony Evers defeats Scott Walker by 1.1%! Outside of the margin for a recount that Scott Walker passed  into law for a recount. Karma is a bitch!” - @Fetzer2 (1:23 AM)

Conclusion

What can we learn from this analysis?

1) If you want to know who wins a state-wide election, follow local reporters. They have the greatest level of access to updated voting information, and are much more knowledgeable about their geographic region than national news outlets.

2) In a media environment that focuses on one event, or what researchers would call a media storm, “liberals” and “conservatives” respond to each other very quickly, within the span of a few minutes. Given the hybrid nature of the U.S. media system, it is likely that media storm dynamics will impact social media, particularly Twitter (as a platform for professional journalism). Capturing this dynamic in media storms, therefore, requires very granular levels of data.

3) To understand the politics, one needs to understand the culture of that society. Regional cultural references were an important feature of this discourse, which set it apart from the post-AP tweet time span. In this latter time, tweets were still focused on Walker’s loss (rather than Evers’ win). However, following the Associated Press’s reporting, the tweets were predominantly by those outside of Wisconsin. The story was reaching national attention, and the discourse had lost this specific local component.

Overall, this was an interesting project for me to examine how a state-wide political event goes national on Twitter in an hour and a half.

Tweets about WI Gubernatorial Race Part I: October 28 to Nov 6

Politically, Wisconsin is quite different from my home state of New York. It’s long been considered a purple, or swing, state. For that reason, Wisconsin has often received extra national attention when it comes to local or state-wide politics.

The 2018 Midterm Elections were another example of this, with many citizens around the country tracking Governor Scott Walker’s race against Superintendent Tony Evers. Today, I explore how Twitter talked about this race in the week leading up to Election night (October 28 to Nov 7). This post will focus on the lead-up to the election. Part II will focus on the last few hours of the election (12:30 to 2:30 on November 7, 2018).

(Note: Tweets were collected using the R package rtweet. All datetimes have been converted to CST. For more information about this collection and analysis, please scroll to the bottom.)
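
As an illustration of the timestamp conversion, here is a minimal Python sketch that shifts a UTC timestamp to CST using a fixed UTC-6 offset. It assumes the raw timestamps are in UTC and in a `YYYY-MM-DD HH:MM:SS` layout; the function name and the sample value are hypothetical, and this is not the R code used for the post.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-6 offset for Central Standard Time. This deliberately ignores
# daylight saving time, so it is only right for dates when CST is in effect.
CST = timezone(timedelta(hours=-6), name="CST")


def utc_string_to_cst(stamp):
    """Parse a 'YYYY-MM-DD HH:MM:SS' UTC timestamp and return it in CST."""
    utc_time = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return utc_time.astimezone(CST)


# Hypothetical UTC timestamp, chosen for illustration only.
local = utc_string_to_cst("2018-11-07 06:53:00")
print(local.strftime("%Y-%m-%d %H:%M %Z"))  # 2018-11-07 00:53 CST
```

A fixed offset keeps the sketch dependency-free. For a collection that spans a daylight saving change, a named zone such as `America/Chicago` would be the safer choice.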

A broad temporal view: Oct 28 to Nov 6

In the week leading up to the election, there were several noteworthy spikes. We focus on two in particular: November 1 (8-9pm) and November 4 (7pm).

November 1, 2018 from 8:00-9:59 pm

This was the largest spike for Walker in this week (1568 tweets in two hours). Far and away, the most common verbs used were variants of “call” (e.g., “called”/“calls”/“calling”). This is because, that day, Governor Walker said that President Obama was “the biggest liar of the world.” This language (employed by non-journalists and journalists alike) also appeared in the leads of news stories from Fox News and The Hill.
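
One way to locate a spike like this is to slide a two-hour window over hourly tweet counts and keep the window with the largest total. The Python sketch below shows that idea; the hourly counts and the function name are made up for illustration and are not the counts from this dataset.

```python
def busiest_window(hourly_counts, width=2):
    """Return (start_index, total) for the `width`-hour window with the most tweets."""
    if width < 1 or len(hourly_counts) < width:
        raise ValueError("need at least `width` hourly counts and width >= 1")
    total = sum(hourly_counts[:width])
    best_start, best_total = 0, total
    for start in range(1, len(hourly_counts) - width + 1):
        # Slide the window by one hour: add the new hour, drop the oldest one.
        total += hourly_counts[start + width - 1] - hourly_counts[start - 1]
        if total > best_total:
            best_start, best_total = start, total
    return best_start, best_total


# Hypothetical hourly counts for five consecutive hours. Not real data.
hourly = [120, 95, 300, 410, 150]
start, total = busiest_window(hourly)
print(start, total)  # 2 710
```

Ties go to the earliest window, since a later window only replaces the current best when its total is strictly larger.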

November 4, 2018 from 7:00-7:59 PM

Although this peak was not as prominent as the others explored here, it is one of the few times that Evers exceeded Walker in references on Twitter.

Many of these tweets appeared to be campaign-oriented tweets about Evers’ support for Wisconsin residents. Unlike the previous spike, there did not seem to be an event aligned with this moment in time. This suggests that this spike was campaign-induced, rather than naturally generated.

A closer look at Election Day

As can be seen in the above image, attention to the Walker/Evers election peaked after 12:00 AM CST, late in the night relative to other well-watched races that day. Votes rolled in minute by minute, with many outlets (including NYT, one of my main trackers) showing a less than 1% margin for several hours.

Methodology

Tweets were collected using Mike Kearney’s rtweet. I began my search at 2:40 AM CST on November 7, 2018, using the search terms “Scott Walker” OR “Tony Evers” OR “#wipolitics” OR “#wielection”. Twitter’s REST API provides roughly a 1% random sample of tweets. This yielded about 111,000 tweets.

Tweets were annotated for their part-of-speech and dependency using coreNLP. Within the corpus, there were over three million dependencies.
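
To show how part-of-speech annotations can feed a count of the most common verbs, here is a minimal Python sketch. It assumes each token has already been reduced to a `(word, lemma, tag)` tuple with Penn Treebank tags, where verb tags start with `VB`. That tuple layout, the sample tokens, and the function name are assumptions for illustration, not coreNLP’s actual output format.

```python
from collections import Counter

# Hypothetical annotated tokens: (word, lemma, Penn Treebank POS tag).
ANNOTATED = [
    ("Walker", "Walker", "NNP"),
    ("called", "call", "VBD"),
    ("Obama", "Obama", "NNP"),
    ("calls", "call", "VBZ"),
    ("calling", "call", "VBG"),
    ("said", "say", "VBD"),
    ("liar", "liar", "NN"),
]


def verb_lemma_counts(tokens):
    """Count verb lemmas so that 'called', 'calls', and 'calling' all count as 'call'."""
    return Counter(
        lemma.lower() for _word, lemma, tag in tokens if tag.startswith("VB")
    )


top = verb_lemma_counts(ANNOTATED).most_common(1)
print(top)  # [('call', 3)]
```

Counting lemmas rather than surface forms is what lets the variants of a verb be grouped together.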

Time Series of IRA Activity on U.S. Social Media Platforms

So I've been toying around with some of the data on other social media platforms, now that much of it has been made publicly available. I'm looking forward to doing a more systematic analysis of the content. In the meantime, however, here are some counts of IRA activities on different social media platforms from 2015 to 2017. 

I was somewhat surprised to see that the time series did not line up as neatly as I thought they would. Perhaps these strategies are meant to complement each other? This is where a deeper dive into the content or the account would be more useful. For example, perhaps conservative-imitating IRA accounts (e.g., Twitter's @TEN_GOP) responded to different things compared to liberal-imitating IRA accounts (e.g., Facebook/Twitter's @Blacktivist group).
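
A rough way to put a number on how well two activity series line up is the Pearson correlation between their counts over the same periods. The Python sketch below computes it by hand; the two monthly series and the function name are invented for illustration and are not the IRA counts plotted here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of counts."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length, at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical monthly counts on two platforms. Not real data.
platform_a = [10, 20, 30, 40]
platform_b = [40, 30, 20, 10]
r = pearson(platform_a, platform_b)
print(round(r, 3))
```

A value near 1 means the two series rise and fall together, a value near -1 means one rises as the other falls, and a value near 0 means little linear relationship. Correlation alone would not say why the series differ, which is where the content analysis comes in.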

Given the pending lockdown of information regarding this case, it is more important than ever to share and verify this information. It's a shame researchers do not get much access to this kind of data, as scientific rigor should be the minimum standard for analyzing potential foreign influences into American elections.

Reddit Data Source: [Link]
Facebook Data Source: [Link]

Advertisements purchased by IRA on Facebook

Submissions to Reddit by IRA-controlled accounts

Tweets written by the IRA