Facebook says it is using AI to prioritize potentially problematic posts for human moderators to review, so that content violating its community guidelines can be removed more quickly. The social media giant previously used machine learning models to proactively remove low-priority content, leaving high-priority content that was reported by users to human reviewers. However, Facebook claims it is now aggregating content identified by users and models into a single collection before filtering, ranking, deduplicating, and distributing it to thousands of moderators, many of whom are contract employees.
Facebook’s continued investment in moderation comes as reports suggest the company is failing to contain the spread of misinformation, disinformation, and hate speech on its platform. Reuters recently found over three dozen sites and groups that contained discriminatory language about Rohingya refugees and undocumented migrants. In January, Caitlin Carlson, an associate professor at Seattle University, published the results of an experiment in which she and a colleague collected more than 300 posts that appeared to violate Facebook’s hate speech rules and reported them through the service’s tools. According to the report, only about half of the posts were ultimately removed. More recently, civil rights groups including the Anti-Defamation League, the National Association for the Advancement of Colored People, and Color of Change have claimed that Facebook fails to enforce its hate speech guidelines. The groups organized an advertising boycott in which over 1,000 companies cut spending on social media advertising for a month.
According to Facebook, its AI systems now give more weight to potentially objectionable content that is being shared quickly on Facebook, Instagram, Facebook Messenger, and other Facebook properties than to content with few shares or views. Messages, photos, and videos related to real-world harm such as suicide, self-harm, terrorism, and child exploitation take precedence over other categories (such as spam) when reported or detected. Additionally, posts with signals similar to content that previously violated Facebook’s guidelines are more likely to reach the top of the moderation queue.
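The three signals described above (virality, severity, and similarity to past violations) can be combined into a single queue ordering. The following is only an illustrative sketch: the `Report` fields, the `SEVERITY` weights, the 0.5/0.3/0.2 mix, and the 1,000-shares-per-hour cap are all assumptions made for the example, not values Facebook has disclosed.

```python
from dataclasses import dataclass

# Hypothetical severity weights. The article only says that real-world harm
# categories take precedence over categories such as spam.
SEVERITY = {
    "child_exploitation": 1.0,
    "terrorism": 1.0,
    "suicide_self_harm": 1.0,
    "hate_speech": 0.6,
    "spam": 0.1,
}


@dataclass(frozen=True)
class Report:
    post_id: str
    category: str
    shares_per_hour: float       # proxy for how quickly the post is spreading
    violation_likelihood: float  # model score in [0, 1]
    source: str                  # "user" or "model"


def priority(report: Report) -> float:
    """Weighted sum of severity, virality, and likelihood (weights are assumed)."""
    virality = min(report.shares_per_hour / 1000.0, 1.0)
    severity = SEVERITY.get(report.category, 0.3)
    return 0.5 * severity + 0.3 * virality + 0.2 * report.violation_likelihood


def build_queue(reports):
    """Merge user and model reports, keep one entry per post, rank by priority."""
    best = {}
    for r in reports:
        if r.post_id not in best or priority(r) > priority(best[r.post_id]):
            best[r.post_id] = r
    return sorted(best.values(), key=priority, reverse=True)


reports = [
    Report("p1", "spam", 5, 0.9, "model"),
    Report("p2", "terrorism", 800, 0.7, "model"),
    Report("p2", "terrorism", 800, 0.4, "user"),  # duplicate report of the same post
]
queue = build_queue(reports)
print([r.post_id for r in queue])
```

A weighted sum is used here purely for readability; a production system would more plausibly learn the ranking from reviewer outcomes.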
Using a technology called whole post integrity embeddings (WPIE), Facebook’s systems ingest a deluge of information, including images, videos, text titles and bodies, comments, text in images extracted by optical character recognition, transcribed text from audio recordings, user profiles, interactions between users, external context from the web, and knowledge base information. A representation learning stage enables the systems to automatically discover, from the data, the representations needed to detect commonalities in harmful content. Fusion models then combine the representations to create millions of content representations, or embeddings, which are used to train supervised multitask learning and self-supervised learning models that flag content for each category of violation.
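To make the fusion step described above more concrete, here is a minimal sketch of combining per-modality vectors into one post-level embedding and scoring it with one head per violation category. It is not Facebook’s architecture: the concatenation-based `fuse`, the randomly initialized `make_heads` (standing in for trained weights), and the toy vectors are all assumptions for illustration.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length so no single modality dominates."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def fuse(modalities):
    """Concatenate normalized per-modality vectors in a fixed (sorted) key order."""
    fused = []
    for name in sorted(modalities):
        fused.extend(l2_normalize(modalities[name]))
    return fused


def make_heads(dim, categories, seed=0):
    """One linear head per category; random weights stand in for trained ones."""
    rng = random.Random(seed)
    return {c: [rng.uniform(-1, 1) for _ in range(dim)] for c in categories}


def score(fused, heads):
    """Multitask scoring: every head reads the same shared embedding."""
    scores = {}
    for category, weights in heads.items():
        z = sum(w * x for w, x in zip(weights, fused))
        scores[category] = 1.0 / (1.0 + math.exp(-z))
    return scores


# Toy per-modality vectors; real ones would come from learned encoders.
post = {
    "text": [0.2, 0.9, 0.1],
    "image": [0.7, 0.1],
    "comments": [0.0, 0.4, 0.4],
}
embedding = fuse(post)
heads = make_heads(len(embedding), ["hate_speech", "spam"])
scores = score(embedding, heads)
print(scores)
```

The point of the shared embedding is that a single representation of the whole post feeds every category-specific classifier, rather than each classifier looking at text or images in isolation.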
One of these models is XLM-R, a natural language understanding algorithm that Facebook also uses in its Community Hub to help match people in need. According to Facebook, XLM-R, trained on 2.5 terabytes of web pages and capable of translating between approximately 100 different human languages, enables its content moderation systems to learn across dialects, so that “every new human review of a violation makes our system[s] better globally instead of just in the reviewer’s language.” (Facebook currently has around 15,000 content reviewers who collectively speak over 50 languages.)
“It’s important to note that any content violations … still receive a full human evaluation – we are using our system[s] to better prioritize content,” Facebook product manager Ryan Barnes told members of the press on Thursday. “We expect more automation when the content breach is less severe, especially when the content is not viral, or … being shared quickly by a large number of people [on Facebook platforms].”
Across many of its divisions, Facebook has for years been moving broadly toward self-supervised learning, which uses unlabeled data in conjunction with small amounts of labeled data to improve learning accuracy. Facebook claims that its Deep Entity Classification (DEC) machine learning framework has been responsible for a 20% reduction in abusive accounts on the platform in the two years since it was deployed, and that its SybilEdge system can detect fake accounts that are less than a week old and have fewer than 20 friend requests. In a separate experiment, Facebook researchers said they could train a language understanding model that made more accurate predictions with just 80 hours of data than models trained with 12,000 hours of manually labeled data.
To predict virality, Facebook relies on a supervised machine learning model that looks at previous examples of posts and the number of views they’ve accumulated over time. Rather than analyzing view history in isolation, the model also takes into account signals such as trends and the post’s privacy settings (i.e., whether it can be viewed only by friends).
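As a rough illustration of predicting how widely a post will spread from these kinds of signals, the sketch below trains a tiny logistic regression on early view growth, a public-versus-friends-only flag, and a trending-topic flag. The feature set, the synthetic training rows, and the choice of logistic regression are all assumptions for the example; the article does not say which model or features Facebook actually uses.

```python
import math


def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, labels, lr=0.5, epochs=2000):
    """Batch gradient descent on the logistic loss; returns (weights, bias)."""
    n, dim = len(rows), len(rows[0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            grad_b += err
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(model, x):
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Synthetic data, invented for this sketch.
# Features: (early view growth scaled to [0, 1], is_public, on_trending_topic)
rows = [
    (0.9, 1, 1), (0.8, 1, 0), (0.7, 1, 1), (0.6, 1, 1),
    (0.2, 0, 0), (0.1, 1, 0), (0.3, 0, 1), (0.15, 0, 0),
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = the post later went viral

model = train(rows, labels)
p_public_trending = predict(model, (0.85, 1, 1))
p_friends_only = predict(model, (0.1, 0, 0))
print(p_public_trending, p_friends_only)
```

Including the privacy flag matters because a friends-only post has a hard ceiling on its audience no matter how quickly its early views accumulate.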
Aside from predicting virality, Facebook claims that this adoption of self-supervised techniques, along with automatic content prioritization, has made it possible to deal with harmful content faster while allowing human review teams to spend more time on complex decisions, such as those involving bullying and harassment. Among other things, the company cites its Community Standards Enforcement Report, which covered April 2020 through June 2020 and showed that the company’s AI detected 95% of the hate speech taken down in Q2 2020. However, it is unclear to what extent this is true.
Facebook admitted that much of the content flagged in a Wall Street Journal report would have been given a low priority for review because it had less potential to go viral. According to a lawsuit, Facebook failed to remove pages and accounts belonging to those who coordinated what resulted in deadly shootings in Kenosha, Wisconsin, in late August. Nonprofit activism group Avaaz found that misleading content generated an estimated 3.8 billion views on Facebook over the past year, with the spread of medical disinformation (particularly about COVID-19) exceeding that of information from trusted sources. And Facebook users in Papua New Guinea say the company was slow to remove child abuse content, or never removed it. ABC Science identified a nude picture of a young girl on a page with over 6,000 followers.
There is a limit to what AI can do, especially when it comes to content like memes and sophisticated deepfakes. The top-performing model of over 35,000 submitted by more than 2,000 participants in Facebook’s Deepfake Detection Challenge achieved an accuracy of only 82.56% against a public dataset of 100,000 videos created for the task. When Facebook launched the Hateful Memes dataset, a benchmark to evaluate the performance of models for removing hate speech, the most accurate algorithm – Visual BERT COCO – achieved an accuracy of 64.7%, while humans achieved an accuracy of 85% on the dataset. A New York University study published in July estimated that Facebook’s AI systems make around 300,000 content moderation errors every day.
Potential biases and other flaws in Facebook’s AI models and datasets can further complicate matters. A recent NBC investigation found that on Instagram in the US last year, Black users were about 50% more likely to have their accounts disabled by automated moderation systems than users whose activity indicated they were white. And when Facebook sent content moderators home and had to rely more on AI during quarantine, CEO Mark Zuckerberg said mistakes were inevitable because the system often doesn’t understand context.
Technology challenges aside, groups have blamed Facebook’s inconsistent, unclear, and in some cases controversial content moderation guidelines for its stumbles in removing abusive posts. According to the Wall Street Journal, Facebook often fails to handle user reports quickly and to enforce its own rules, allowing material – including depictions and praise of “cruel violence” – to remain, possibly because many of its moderators are physically distant and do not recognize the severity of the content they are reviewing. In one case, 100 Facebook groups associated with QAnon, a conspiracy theory identified by the FBI as a domestic terrorism threat, grew by a combined total of over 13,600 new followers per week, according to a New York Times database.
In response to the pressure, Facebook introduced rules this summer and fall that aim to contain viral content that violates standards. Members and administrators of groups that have been removed for violating Facebook’s policies are temporarily prevented from creating new groups. Facebook is no longer recommending health-related groups, and QAnon is banned on all of the company’s platforms. Facebook applies labels to posts by politicians but removes those that violate its rules. And the Facebook Oversight Board, an outside group that will make decisions and influence precedents about what type of content should and shouldn’t be allowed on the Facebook platform, began reviewing content moderation cases in October.
Facebook has also taken an ad hoc approach to moderating hate speech to reflect political realities in specific regions of the world. The company’s rules for hate speech are stricter in Germany than in the US. In Singapore, Facebook agreed to attach a “correction notice” to messages that the government deemed incorrect. And in Vietnam, Facebook said it would restrict access to “dissident” content considered illegal in exchange for the government ending its practice of disrupting the company’s local servers.
In the meantime, problematic posts continue to slip through Facebook’s filters. In a Facebook group that was formed last week and quickly grew to nearly 400,000 people, members calling for a nationwide recount of the 2020 U.S. presidential election exchanged unsubstantiated allegations of electoral fraud and state vote counts every few seconds.
“The system is about marrying AI and human reviewers to make fewer mistakes,” said Facebook’s Chris Palow, part of the company’s moderation tech team, during the briefing. “The AI will never be perfect.”