Facebook will begin publishing more data about how many posts it takes down.
Facebook disabled nearly 1.3 billion “fake” accounts over the past two quarters, many of them bots “with the intent of spreading spam or conducting illicit activities such as scams,” the company said on Tuesday.
Facebook disabled 583 million accounts in Q1 2018, down from 694 million accounts in Q4 of last year, a decrease the company attributes to the “variability of our detection technology’s ability to find and flag them.”
Most of the accounts “were disabled within minutes of registration,” Facebook claimed in a blog post, but Facebook doesn’t catch all fake accounts. The company estimates that 3 percent to 4 percent of its monthly active users are “fake,” up from 2 percent to 3 percent in Q3 of 2017, according to company filings.
Those numbers are big, a reminder of what Facebook is up against just 18 months after it emerged that a Russian troll farm used Facebook to try to influence the 2016 U.S. presidential election.
Facebook says it finds most of the accounts on its own using software algorithms, but a small percentage — about 1.5 percent of the disabled accounts — were discovered after they were flagged by Facebook users.
Facebook published the numbers for the first time on Tuesday, along with another set of numbers outlining the other kinds of content the company takes down on a regular basis.
Publishing the data is a way for Facebook to hold itself accountable, but it’s also a chance for Facebook to show users that it’s actually working on these problems in the background, something that’s not always obvious to the average user scrolling through her News Feed.
“This is the start,” said Guy Rosen, a Facebook product VP working on safety and security. “People can report a lot more types of bad things [than we are updating here]. So we want to have more numbers to share [next time].”
The numbers Facebook is sharing this time focus on major content categories. The company removed 21 million “pieces of adult nudity or porn,” for example, the vast majority of which was discovered using software programs. It also removed 2.5 million pieces of “hate speech,” 56 percent more content than the 1.6 million pieces it removed in Q4.
Unlike nudity or terrorism-related content, though, hate speech is still primarily discovered by humans, not software programs. Only 38 percent of the hate speech Facebook removed in Q1 was first identified by algorithms. That’s an improvement over 23.6 percent in Q4, but still much smaller than some of the other content categories Facebook looks for.
That makes sense, as “hate speech” is much more subjective than nudity. What one person might describe as hate speech, another might describe as free speech. The fact that Facebook still has trouble detecting it without human help shows that the problem won’t go away anytime soon.
“Hate speech is really hard,” said Alex Schultz, Facebook’s VP of analytics, in a briefing with reporters. “There’s nuance, there’s context. The technology just isn’t there to really understand all of that, let alone in a long long list of languages.”
Facebook has been working to win back the trust of its users ever since the 2016 election — and the more recent Cambridge Analytica privacy scandal in which user data was collected by an outside research firm without users’ consent.
Over the past few months, Facebook has rewritten its data policies and published the rulebook it uses for content policy decisions. It plans to publish data on what types of posts it removes every six months or so going forward.
“We hope we get better, but there is the interesting balance around what happens in the real world versus what happens on our site,” Schultz said. “It would be good for the world if wars ended, and I’m sure that would be good for the graphic violence number on Facebook. Also there could be another war breakout, and that would be terrible, and that would be bad for those numbers.”
“I think we should measure them well, and we should be good at explaining to you why they have moved,” he added.