Ian Ardouin-Fumat

Overview / Process

I was commissioned by The Markup, an independent newsroom focused on tech accountability, to develop Citizen Browser, a research application that enabled journalists to investigate the societal impact of Facebook’s recommendation algorithms and ad targeting.

Citizen Browser monitored the Facebook feeds of a thousand study participants as a way to gain insight into the platform’s algorithms. This data was used as source material in a dozen investigations by The Markup.

Frontend Development, for The Markup.

Press: The Markup

11/02/2021

Shedding light on algorithmic injustice

Like traditional broadcasters, social media platforms choose—through their algorithms—which stories to amplify and which to suppress. But unlike traditional broadcasters, the social media companies are not held accountable by any standards other than their own decisions on the types of speech they will allow on their platforms. And their algorithms are not transparent. Unlike on the evening news broadcast, no one can see what they have decided will be the top story of the day. No two people see exactly the same content in their personalized feeds.

This raises important questions, as one in five Americans say they get their political news primarily through social media. Online content is known to sway elections, influence public health, and sometimes even be the source of political violence.

In this context, nonprofit publication The Markup set out to shed light on the news and political content that Facebook algorithms push to the American public. This study, called Citizen Browser, brought together a panel of 1,000 participants, who were paid to record the content appearing in their Facebook feeds. The resulting data, combined with the participants' demographic information, enabled the first investigation of its kind into the political biases of social media algorithms.

The Markup published full technical documentation of Citizen Browser.

Under the supervision of data journalist Surya Mattu, and in collaboration with designer Sam Morris, I was commissioned to build the frontend application for Citizen Browser.

11/03/2021

Citizen Browser study protocol

The study began by recruiting a panel of 1,000 people representative of the US adult population across gender, age, race, education, and location. The panel was assembled by a survey research provider and was continually updated as the study ran its course. As they signed up, participants were required to take a short demographic survey, which was later used to provide additional context for the content captured by Citizen Browser.

Upon downloading the desktop application I developed, participants would go through an onboarding flow. This step required them to sign in to Facebook in a browser that was controlled by Citizen Browser. Once successfully signed in, participants did not need to interact with the app anymore, as it would automatically load a Facebook page behind the scenes and collect the content served by the platform. The data capture was performed by NGFetch, a proprietary browser automation tool developed by Netograph.

The app collected HTML source code and screenshots from Facebook, covering the Facebook homepage, suggested groups, and recommended pages. We focused only on items promoted through shared links on these pages: advertisements, public posts, publicly shared video links (not the videos themselves), shared links and reaction counts (no text or usernames were captured), suggested groups, and suggested pages. Captured data points were sent to a database maintained by The Markup, and annotated with participant demographic information.
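
To make the annotation step concrete, here is a minimal TypeScript sketch of how a captured item might be joined with survey answers. It is an illustration only: the type names, field names, and the `annotate` function are all assumptions of mine, not The Markup's actual schema or code.

```typescript
// All names below are hypothetical; the real schema is not described here.
type ItemKind =
  | "advertisement"
  | "public_post"
  | "video_link"
  | "shared_link"
  | "suggested_group"
  | "suggested_page";

interface CapturedItem {
  participantId: string;
  kind: ItemKind;
  url: string;
  capturedAt: string; // ISO 8601 timestamp
}

// The survey dimensions mirror those listed for the panel.
interface Demographics {
  gender: string;
  ageBracket: string;
  race: string;
  education: string;
  location: string;
}

interface AnnotatedItem extends CapturedItem {
  demographics: Demographics;
}

// Attach survey answers to each captured item. Items whose participant has
// no survey record are skipped rather than stored without context.
function annotate(
  items: CapturedItem[],
  survey: Map<string, Demographics>
): AnnotatedItem[] {
  const annotated: AnnotatedItem[] = [];
  for (const item of items) {
    const demographics = survey.get(item.participantId);
    if (demographics !== undefined) {
      annotated.push({ ...item, demographics });
    }
  }
  return annotated;
}
```

Keeping the demographic record separate from the capture, and joining the two by participant ID, means the capture itself never has to carry survey answers.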

Example of the content captured by Citizen Browser

The privacy of the study's participants was of course a major consideration, and Citizen Browser was designed to redact any identifiable information from the data capture. Furthermore, any data analysis was ultimately performed over aggregates, so as not to reveal the identity of participants.
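
As a rough illustration of what analysis over aggregates can look like, the TypeScript sketch below counts how many distinct participants in each demographic group saw a given link, and drops groups that are too small to report safely. The record shape, the function name, and the idea of a minimum group size are assumptions made for this example; they are not taken from The Markup's methodology.

```typescript
// Hypothetical record shape; field names are assumptions for illustration.
interface Sighting {
  participantId: string;
  group: string; // e.g. an age bracket from the demographic survey
  url: string; // the shared link that appeared in the feed
}

// Count, per demographic group, how many distinct participants saw `url`.
// Groups with fewer than `minGroupSize` participants are omitted so that a
// reported number never describes just a handful of people.
function participantsPerGroup(
  sightings: Sighting[],
  url: string,
  minGroupSize: number
): Map<string, number> {
  const seen = new Map<string, Set<string>>();
  for (const s of sightings) {
    if (s.url !== url) continue;
    const ids = seen.get(s.group) ?? new Set<string>();
    ids.add(s.participantId);
    seen.set(s.group, ids);
  }

  const counts = new Map<string, number>();
  seen.forEach((ids, group) => {
    if (ids.size >= minGroupSize) counts.set(group, ids.size);
  });
  return counts;
}
```

Counting distinct participant IDs rather than raw sightings keeps one very active participant from inflating a group's total.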

The study ran its course over a two-year period. It produced a trove of data that enabled journalists at The Markup to analyze what content was prioritized by Facebook, and how different categories of the US population were impacted by algorithmic disparities.

11/04/2021

Another crowdsourced data collection app

My contribution to Citizen Browser was modest—many people were involved in the project's success—yet pivotal, as I was in charge of building the desktop application that held all the pieces together. This project felt right up my alley, as I had previously worked on other crowdsourced data collection tools, such as Floodwatch or Cloudy with a Chance of Pain.

I built Citizen Browser as an Electron/React application for both Windows and macOS. It presented interesting design and development challenges, since the app's user flow was so unusual.

In particular, a crucial stage of the onboarding flow required sending new participants to the Facebook website and having them sign in to their account. While this seemed trivial at first glance, it turned out that asking users to sign in to their personal accounts from an unfamiliar app can throw many of them off, especially those who are less tech-savvy. Designer Sam Morris and I addressed this issue by repeatedly testing and updating the user flow, until we were confident participants would not drop out of the study.

I built the application to remain resilient to connection and authentication issues once it was set up, and built in user feedback where necessary. I wanted the experience to be as unobtrusive as possible, so that Citizen Browser could operate 24/7 and quietly enough not to introduce any biases in the data collection process. I created a pipeline that sanitized the resulting data archives and guaranteed user privacy. The data was immediately processed in order to remove information that could identify panelists and their contacts, including account names, usernames, profile pictures, and friend requests.
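
The TypeScript sketch below shows one simple way a text-level redaction pass could work: every occurrence of a known identifier is swapped for a placeholder. It is a simplified stand-in, not the pipeline I shipped. The function names, the placeholder string, and the assumption that identifiers are known ahead of time are all hypothetical.

```typescript
// Hypothetical sketch: not the actual Citizen Browser redaction pipeline.
const PLACEHOLDER = "[REDACTED]";

// Escape characters that have special meaning in a regular expression,
// so that identifiers are matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace every occurrence of each known identifier (for example an account
// name) with a placeholder, ignoring case. Longer identifiers are handled
// first so that "Jane Doe" is replaced as a whole before "Jane".
function redactIdentifiers(source: string, identifiers: string[]): string {
  return [...identifiers]
    .filter((id) => id.length > 0)
    .sort((a, b) => b.length - a.length)
    .reduce(
      (text, id) =>
        text.replace(new RegExp(escapeRegExp(id), "gi"), PLACEHOLDER),
      source
    );
}
```

A string-level pass like this only covers text. Profile pictures and other visual elements would need separate handling, which this sketch does not attempt.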

Screenshot of a redacted Facebook homepage. The redacted information, of which there was a lot, is replaced by pink blocks in the screenshot and deleted from the HTML source code without any human intervention.

11/05/2021

Citizen Browser at work and in the wild

The data collected by Citizen Browser was used in a dozen investigative reports that revealed privacy violations, political bias, and digital redlining caused by Facebook algorithms.

In January 2021, a Citizen Browser investigation revealed that Facebook had failed to keep its promise to stop pushing political groups to its users. This led Senator Ed Markey (D-MA) to demand answers from Facebook CEO Mark Zuckerberg. The company later fixed the problem, which it attributed to "technical issues."

In March 2021, The Markup used Citizen Browser data to demonstrate that COVID-19 information was less likely to reach Black people than other racial groups on Facebook. A month later, additional data showed that Facebook kept pushing anti-vaccine content to the public.

In April 2021, Citizen Browser data revealed that credit card ads were targeted by age, which violated Facebook’s anti-discrimination policy. In response, the company pledged to remove those ads.

In 2022, Citizen Browser won the national Edward R. Murrow Award for excellence in innovation. The project was also translated for German media, in collaboration with Süddeutsche Zeitung.