If you're one of the thousands of New Yorkers who have, on occasion, sent out a tweet after a couple of cocktails, a new algorithm may be able to pick up on your antics.
Researchers from the University of Rochester recently developed a computer program that can single out tweets sent while the user is tipsy.
Some of the strongest indicators of a tweet sent while drinking include "shot," "here," "haha," a URL, and, worrisomely, "drive."
To develop the algorithm, researchers first collected millions of geo-tagged tweets and then filtered them for references to drinking and alcohol.
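That filtering step can be sketched in a few lines of Python. This is only an illustration: the keyword list and the `geo` field are assumptions made for the example, not the study's actual vocabulary or data format.

```python
import re

# Hypothetical keyword list; the study's real list is not given here.
ALCOHOL_KEYWORDS = {"drunk", "beer", "wine", "alcohol", "vodka", "tipsy"}

def mentions_alcohol(tweet_text):
    """Return True if the tweet contains at least one keyword as a whole word."""
    words = re.findall(r"[a-z']+", tweet_text.lower())
    return any(word in ALCOHOL_KEYWORDS for word in words)

def filter_tweets(tweets):
    """Keep only tweets that carry a location tag and mention alcohol."""
    return [
        t for t in tweets
        if t.get("geo") is not None and mentions_alcohol(t["text"])
    ]

# Made-up sample tweets, for illustration only.
sample = [
    {"text": "So drunk right now haha", "geo": (40.7, -74.0)},
    {"text": "Great sunset tonight", "geo": (40.7, -74.0)},
    {"text": "Beer with friends", "geo": None},
]
kept = filter_tweets(sample)
print([t["text"] for t in kept])
```

A keyword match like this only finds candidates. It cannot tell who was drinking or when, which is why the later classification steps are needed.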
Then they had to sort out whether the tweet was actually about the user's own alcohol use (and not, say, a friend's) and whether it was sent while the tweeter was under the influence.
Below is a flowchart from the study that shows how the team's artificial intelligence software, called a support vector machine (SVM), sniffed out the drunk tweets. An SVM is an algorithm that can be taught to recognize features in a piece of data and classify it into one of two categories: in this case, a simple "yes" or "no" for each question about a given tweet.
The first round of words is mostly what you'd expect: "Drunk" strongly indicates that the tweet is about drinking, followed by "wine," "beer," "alcohol," and so on.
Shocking, we know.
On the other side, "club," "shot," and "party" often correlate with tweets that aren't about drinking, as do various versions of "turn up" ("turnup," "#turnup," "turnt up").
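To see how a classifier ends up with words that pull toward "yes" and words that pull toward "no," here is a rough sketch of a linear SVM trained on word presence. Everything in it is an assumption for illustration: the handful of training tweets is invented, the training loop is a bare-bones hinge-loss update with no bias term, and the study's actual features and settings are not described here.

```python
def tokenize(text):
    """Lowercase the text and return its set of distinct words."""
    return set(text.lower().split())

def train_linear_svm(examples, epochs=50, lr=0.1, reg=0.01):
    """Learn one weight per word from (text, label) pairs, label +1 or -1.

    Uses sub-gradient descent on the hinge loss with L2 shrinkage.
    The bias term is omitted to keep the sketch short.
    """
    weights = {}
    for _ in range(epochs):
        for text, label in examples:
            tokens = tokenize(text)
            score = sum(weights.get(t, 0.0) for t in tokens)
            # L2 regularization: shrink every weight slightly each step.
            for t in weights:
                weights[t] *= 1.0 - lr * reg
            # Hinge loss: update only if the example is inside the margin.
            if label * score < 1:
                for t in tokens:
                    weights[t] = weights.get(t, 0.0) + lr * label
    return weights

def predict(weights, text):
    """Return +1 (about drinking) or -1 (not about drinking)."""
    score = sum(weights.get(t, 0.0) for t in tokenize(text))
    return 1 if score > 0 else -1

# Invented toy data: +1 = about drinking, -1 = not about drinking.
training = [
    ("so drunk tonight", 1),
    ("wine and beer tonight", 1),
    ("drunk on wine", 1),
    ("too much beer", 1),
    ("party at the club tonight", -1),
    ("great shot at the party", -1),
    ("club music is loud", -1),
    ("took a shot at goal", -1),
]
weights = train_linear_svm(training)

# Print words from most "about drinking" to least.
for word, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{word:10s} {weight:+.2f}")
```

In this toy setup, words that appear only in the "about drinking" examples end up with positive weights and words that appear only in the other examples end up with negative ones, which is the kind of ranking the word lists reflect.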
But once you get past the first round, the program has to determine whether the tweet is about the user's own drinking or someone else's.
If you're talking about your own drinking, you might also tweet out something about "Friday," "free," or, hilariously, "pong."
Oh yeah, and "already" — as in, "It's Friday, I'm playing free beer pong and I'm already drunk."
Once that's established, the algorithm has to figure out whether you were drinking while you sent out the tweet.
Here's the leaderboard. The words on the right are what you might use when you're actually drinking; those on the left are words you might use when tweeting about your drinking:
Apparently, everyone throws links in their drunk tweets, because the winner is "#url," which is how the algorithm represents links. They're also giving shout outs, because "#mention when" is how Twitter mentions are processed.
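Swapping links and mentions for placeholder tokens might look something like the sketch below. The "#url" and "#mention" placeholders come from the description above; the regular expressions and the `normalize` helper are assumptions for illustration, not the study's code.

```python
import re

# Assumed patterns: any http(s) link, and any @username.
URL_PATTERN = re.compile(r"https?://\S+")
MENTION_PATTERN = re.compile(r"@\w+")

def normalize(tweet_text):
    """Replace links and mentions with placeholders, then split into tokens."""
    text = URL_PATTERN.sub("#url", tweet_text)
    text = MENTION_PATTERN.sub("#mention", text)
    return text.lower().split()

print(normalize("@sam Cheers! http://example.com/pic"))
```

Collapsing every link into one token lets the classifier learn from the fact that a link is present, rather than treating each distinct address as a separate, rarely seen word.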
An example of a tweet that would pass all of these tests might be:
The algorithm would probably label that one as "tweeting while drunk." As for this one:
"I got so drunk this weekend I'm not even sure if last night was real"
The machines would catch that this is all past tense, and you're not actually drunk at the moment (unless you woke up drunk).
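The three questions run in sequence, and a tweet only moves on if the answer is "yes." A minimal sketch of that cascade follows, with simple word checks standing in for the three trained classifiers. Those stand-in rules are hypothetical and far cruder than an SVM; only the order of the questions is taken from the description above.

```python
def classify_tweet(text, about_drinking, about_self, drinking_now):
    """Ask three yes/no questions in order, stopping at the first 'no'."""
    if not about_drinking(text):
        return "not about drinking"
    if not about_self(text):
        return "about someone else's drinking"
    if not drinking_now(text):
        return "about the user's drinking, but not sent while drinking"
    return "tweeting while drunk"

# Hypothetical stand-ins for the three trained classifiers.
def about_drinking(text):
    return any(word in text.lower() for word in ("drunk", "beer", "wine"))

def about_self(text):
    return any(w in ("i", "i'm", "my", "me") for w in text.lower().split())

def drinking_now(text):
    return any(w in ("now", "already") for w in text.lower().split())

examples = [
    "It's Friday, I'm playing free beer pong and I'm already drunk.",
    "I got so drunk this weekend I'm not even sure if last night was real",
    "He is drunk again",
]
for tweet in examples:
    label = classify_tweet(tweet, about_drinking, about_self, drinking_now)
    print(f"{label}: {tweet}")
```

Passing the three checks in as functions keeps the cascade itself separate from how each question is answered, so a trained model could replace any stand-in without changing the flow.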
Obviously, there are some worrying words in the table: It seems that people questioning their life choices might be wondering if they're an alcoholic (or at least darkly joking about it).
On the other side, "drive" is positively correlated with tweets sent while drinking. Considering that almost 10,000 people are killed in alcohol-related crashes per year in the US, we seriously hope those tweets read like this:
"I drank so I'm not going to drive, me and my friend are getting a taxi haha"
We do, however, support the liberal use of the martini emoji.