In a laid-back and free-flowing session, researchers Oleg Zhilin and Angus Bridgman talked us through the work that goes into conducting research through social media, and some of the work they did trying to track the Canadian elections.
They first asked the audience what was of particular interest to them, and three themes emerged:
Coordinated activity of vested interests (eg Brexit)
Tracking bot activity
How controversies surrounding a particular candidate impact public perception of them
The pair explained that things that happen on social media have minimal short-term impact on national trends. They might change the way that politicians or the news speak about topics, but ultimately the number of people engaging in controversies online is still quite small.
They also explained the challenges in defining the online spaces when assessing their impact, predominantly because of the vast number of spaces available. Researchers will usually default to Twitter, but it is an enormously unrepresentative sample – it’s great for candidates, and for journalists, but not for the public at large.
Also detailed was the varying expectations of privacy on the platforms, and how to accommodate those. With Twitter and YouTube comments, for example, there is a general expectation that what you’re putting up online is for public consumption. Meanwhile, things like WeChat, or Whatsapp are frequently considered private. Do people have a different expectation of privacy when in a small Facebook group compared to a large one? Probably, but how can you really know. In reality though, they are equally non-private.
“You should have no expectation of privacy in a group that anyone can join.”
Researchers are able to scrape content from platforms using APIs. Each platform exposes a different set of APIs, and the collection method differs accordingly.
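As a concrete illustration of this kind of collection, here is a minimal sketch of querying Twitter's documented v2 recent-search endpoint. The endpoint URL and parameter names follow Twitter's public API documentation, but the bearer token and query are placeholders, and real use requires approved developer credentials:

```python
import json
import urllib.parse
import urllib.request

# Twitter API v2 recent-search endpoint (per Twitter's public docs).
SEARCH_URL = "https://api.twitter.com/2/tweets/search/recent"

def build_search_url(query, max_results=100):
    """Assemble the recent-search URL with its query parameters."""
    params = {
        "query": query,
        "max_results": str(max_results),       # the API accepts 10-100 per page
        "tweet.fields": "author_id,created_at",
    }
    return SEARCH_URL + "?" + urllib.parse.urlencode(params)

def fetch_tweets(bearer_token, query):
    """Perform one authenticated request and return the list of tweets."""
    req = urllib.request.Request(
        build_search_url(query),
        headers={"Authorization": f"Bearer {bearer_token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("data", [])
```

An equivalent collector for YouTube, Reddit, or Facebook would look quite different — different endpoints, authentication schemes, and rate limits — which is exactly the per-platform variation the researchers describe.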
Using information scraped from Twitter, the researchers were able to infer users' partisanship, and then try to match those inferences to national trends. Doing so, however, is a breach of Twitter's terms of service.
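The session did not detail how partisanship was inferred, but a common starting point is keyword matching on profile and tweet hashtags. The sketch below is purely illustrative — the marker lists and labels are hypothetical, not the researchers' actual method:

```python
from collections import Counter

# Hypothetical marker lists; a real study would derive these empirically.
PARTISAN_MARKERS = {
    "left":  {"#votendp", "#teamtrudeau"},
    "right": {"#cpc", "#maga", "#trudeaumustgo"},
}

def infer_partisanship(hashtags):
    """Return the lean whose markers appear most often, or None on a tie or no hits."""
    counts = Counter()
    for tag in hashtags:
        tag = tag.lower()
        for lean, markers in PARTISAN_MARKERS.items():
            if tag in markers:
                counts[lean] += 1
    if not counts:
        return None
    top = counts.most_common(2)
    if len(top) == 2 and top[0][1] == top[1][1]:
        return None  # ambiguous signal
    return top[0][0]
```

Even this toy version shows why such inference is fragile: a #MAGA hashtag, as the speakers note later, does not by itself reveal who the user is or what they intend.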
The key takeaway from the session was the importance of not oversimplifying the data, and of not immediately assuming that correlation equals causation. People are not thinking about falsifiability. They simply find evidence of something and then they share the results. For example, during the Canadian elections there were people with #MAGA (Make America Great Again) in their Twitter profiles commenting on Canadian politics. Does that mean they are trying to influence the election? Are they just Canadians who are fans of Trump? Simply seeing #MAGA supporters commenting on Canadian politics does not immediately mean there is US interference in Canadian elections.
Expanding on that, journalists have to be careful not to overblow the importance of bots and trolls on the internet. It's easy to point to interference, but such claims often lean on data that has not been fully processed.
Oleg and Angus propose that rather than the onus being on data researchers to disprove a claim, it should be on the person making it. In short, journalists should do more work to make sure their stories have credible foundations.