An Austin-based tech company aspires to restore trust in journalism by using machine learning models to categorize news stories. A technology and democracy expert said that while the platform shows promise, users need to try it out for themselves and better understand how it works before allowing it to guide their news consumption habits.
Otherweb was founded by Alex Fink, a technology entrepreneur, with the goal of removing waste from the digital information landscape.
“Otherweb is a new news platform that combines news, podcasts, and many other news sources in one place,” Fink said. “And this place has no paywalls or clickbait or any other form of digital bric-a-brac.”
The platform, which launched on August 1, uses machine learning models to rank news stories based on multiple metrics, including informativeness, subjectivity, hate, use of propaganda and clickbait headlines.
Fink said machine learning models require human intervention to fine-tune their accuracy; they are not simply lines of code left to make ranking decisions on their own.
“So we’re generating a dataset of, say, 10,000 articles and titles,” Fink said. “We have a team of annotators who go through it and flag each one as clickbait or not. If something is inconclusive, we remove it from the dataset so we don’t confuse the model. And then we train the model to mimic what humans do.”
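The pipeline Fink describes — human annotators label titles, inconclusive items are dropped, and a model learns to mimic the annotators — can be sketched in miniature. Everything below (the toy dataset, tokenizer, and simple bag-of-words logistic-regression classifier) is an illustrative stand-in, not Otherweb's actual code or model.

```python
import math
import re

# Hypothetical annotated titles: (title, label), where "inconclusive" items
# are removed before training, per Fink's description.
annotated = [
    ("You Won't Believe What Happened Next", "clickbait"),
    ("This One Weird Trick Doctors Hate", "clickbait"),
    ("10 Shocking Secrets Celebrities Won't Tell You", "clickbait"),
    ("Senate Passes Annual Budget Bill", "not"),
    ("Local Council Approves New Transit Plan", "not"),
    ("Quarterly Inflation Report Released", "not"),
    ("Mixed Reactions To New Policy", "inconclusive"),
]

def tokenize(title):
    return re.findall(r"[a-z']+", title.lower())

# Drop inconclusive items so they don't "confuse the model".
data = [(t, 1 if y == "clickbait" else 0)
        for t, y in annotated if y != "inconclusive"]

# Bag-of-words vocabulary built from the training titles.
vocab = sorted({w for t, _ in data for w in tokenize(t)})
index = {w: i for i, w in enumerate(vocab)}

def featurize(title):
    x = [0.0] * len(vocab)
    for w in tokenize(title):
        if w in index:
            x[index[w]] += 1.0
    return x

# Train a tiny logistic-regression classifier by stochastic gradient descent,
# so the model "mimics what humans do" on the annotated examples.
weights = [0.0] * len(vocab)
bias = 0.0
for _ in range(200):
    for title, y in data:
        x = featurize(title)
        z = bias + sum(w * v for w, v in zip(weights, x))
        p = 1.0 / (1.0 + math.exp(-z))   # predicted clickbait probability
        err = p - y
        bias -= 0.5 * err
        weights = [w - 0.5 * err * v for w, v in zip(weights, x)]

def predict(title):
    x = featurize(title)
    z = bias + sum(w * v for w, v in zip(weights, x))
    return "clickbait" if z > 0 else "not clickbait"
```

As Llansó notes later in the article, the output of such a model is an assessment of probability, shaped entirely by what the annotators labeled.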
Taken together, the models’ results generate a “ValuRank score,” a rating out of 100, which is presented to the user along with indicators showing whether the article contained hateful or offensive language. This information is presented in the form of a nutrition label, with a bulleted summary of the article’s content and a link to the article.
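Conceptually, the score is an aggregation of the per-metric model outputs into a single 0–100 rating plus warning flags. The metric names below come from the article; the weights, formula, and function names are invented for illustration and are not Otherweb's actual method.

```python
def valurank_score(metrics):
    """metrics: dict of per-metric model outputs in [0, 1], where higher
    means the article exhibits more of that trait."""
    # Illustrative assumption: informativeness raises the score and the
    # other measured traits lower it, with made-up weights.
    penalties = ["subjectivity", "hate", "propaganda", "clickbait"]
    score = 100.0 * metrics["informativeness"]
    for name in penalties:
        score -= 25.0 * metrics[name]  # invented weight, for illustration
    return max(0, min(100, round(score)))

def nutrition_label(metrics, threshold=0.5):
    """Render a nutrition-label-style summary: the score out of 100, plus a
    flag when the hate metric crosses a (hypothetical) threshold."""
    lines = [f"ValuRank score: {valurank_score(metrics)}/100"]
    if metrics["hate"] >= threshold:
        lines.append("Warning: contains hateful or offensive language")
    return "\n".join(lines)
```

For example, a highly informative article with mild subjectivity and no other flagged traits would score near the top of the range, while a high hate score both lowers the rating and adds the warning line.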
Emma Llansó, director of the Center for Democracy & Technology’s Free Expression Project, said the nutrition-label style of presenting ratings was an interesting choice by the company.
“It’s a format that a lot of people might look at and think, ‘Yeah, that has some authority,’” Llansó said, comparing it to how people might trust a nutrition label on a bag of chips.
Despite the label’s familiar feel and its seemingly positive goal of improving media consumption, Llansó urged caution.
“I think what really interests me about a tool like this is that it’s trying to help people engage in critical thinking about the news sources they read, which is a very laudable goal,” Llansó said. “But we should also engage in critical thinking about the tool itself and try to understand how it makes these different assessments.”
Llansó said another concern was whether the platform would collect user information.
“What is it looking at?” she said. “What is it trying to understand about my own behavior and web activity?”
Fink said the models used for his platform are publicly available, and he even encouraged potential competitors to copy them. He added that the site does not currently track any user data. Rather than selling user data, Fink said he plans to sell advertisers data about the articles themselves.
Otherweb already collects news articles, so the company can help advertisers better understand the media on which their advertisements are placed.
“If advertisers place something on CNN.com, they might want to know that what it’s placed on will pass filters and show up on high-quality platforms like Otherweb,” Fink said.
Llansó pointed out that while machine learning models can identify things like hateful or toxic speech reasonably well, they are less adept at understanding that speech in context.
For example, if a journalist quoted toxic or hateful speech from a politician or businessperson, the machine learning filter could flag the quote as simply toxic speech and hurt the article’s rating.
“So I think the real risk of relying too much on machine learning tools is that they can kind of over-promise; they can kind of declare something in black and white, in something that looks like a nutrition label,” Llansó said. “And really what these tools do is give an assessment of probability. And that assessment itself can be biased or constrained in different ways depending on how the tool is developed.”
Fink recognized the difficulty of context for his machine learning models.
“It happens in our case where an article quotes someone saying something hateful, and (Otherweb) would actually decide that makes the article somewhat hateful,” Fink said. “It’s a problem we’d like to solve at some point, but natural language processing at this point isn’t as foolproof as we’d like.”
Fink said there are plans for major expansions of the new platform, including the launch of an app for Apple and Android and expanding the platform to cover podcasts, books and Wikipedia pages.
At this time, Otherweb can only pull from websites without paywalls, so articles from The New York Times or The Wall Street Journal are not available, but news from Reuters or ABC News is.
Fink said he hopes platforms like Otherweb will change the incentives for sharing and consuming news.
“The reason we see the news ecosystem as broken as it is is that most content is monetized by advertising, and most advertising is pay-per-click or pay-per-view,” Fink said. “There’s no pay-per-quality or pay-per-truth or anything like that. And so, over time, the whole ecosystem essentially drifts toward maximizing clicks and views.”