The generative AI revolution shows no sign of slowing: OpenAI recently rolled out its GPT-4.5 model to paying ChatGPT users, while competitors continue to announce models of their own, including Anthropic, which unveiled its latest language model, Claude 3.7 Sonnet, late last month. But the ease of use of these AI models is having a material impact on the information we encounter daily, according to a new study posted to arXiv, Cornell University’s preprint server.
An analysis of more than 300 million documents, including consumer complaints, corporate press releases, job postings, and messages for the media published by the United Nations suggests that the web is being swamped with AI-generated slop.
The study tracks the purported involvement of generative AI tools in creating content across those sectors between January 2022 and September 2024. “We wanted to quantify how many people are using these tools,” says Yaohui Zhang, one of the study’s coauthors and a researcher at Stanford University.
The answer was: a lot. Following the November 30, 2022, release of ChatGPT, the estimated proportion of content in each domain showing signs of AI generation or involvement skyrocketed. The share of customer complaints that exhibited some sort of AI help increased roughly tenfold from a baseline of around 1.5% in the 11 months before ChatGPT’s release. Similarly, the share of press releases with hints of AI involvement rose rapidly in the months after ChatGPT became widely available.
Identifying which areas of the United States were more likely to adopt AI to help write complaints was possible thanks to the data accompanying the text of each complaint made to the Consumer Financial Protection Bureau (CFPB), the government agency that Donald Trump has now dissolved. In the 2024 data analyzed by the academics, complainants in Arkansas, Missouri, and North Dakota were the most likely to use AI, with signs of it in around one in four complaints, while residents of West Virginia, Idaho, and Vermont were the least likely, with evidence of AI in between one in 20 and one in 40 complaints.
Rather than relying on off-the-shelf AI detection tools, Zhang and his colleagues developed their own statistical framework for determining whether a text was likely AI-generated. It compares linguistic patterns, including word-frequency distributions, in texts written before the release of ChatGPT against those known to have been generated or modified by large language models. The framework was then tested against texts known to be human- or AI-written, with prediction errors lower than 3.3%, suggesting it can accurately discern one from the other. Like many, the team behind the work is worried about the impact of AI-generated content flooding the web, particularly across so many areas, from consumer complaints to corporate and nongovernmental organization press releases. “I think [generative AI] is somehow constraining the creativity of humans,” says Zhang.
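To make the idea concrete, here is a minimal sketch of how a word-frequency comparison like this could work in Python. It is not the authors’ actual framework: the corpus names, the unigram model, and the log-likelihood scoring rule are illustrative assumptions standing in for the study’s more sophisticated distributional estimates.

```python
# Toy sketch of a frequency-based detector, loosely inspired by the study's
# approach of comparing word-frequency distributions. NOT the authors' method;
# corpus names and the scoring rule are illustrative assumptions only.
from collections import Counter
import math


def word_frequencies(corpus):
    """Estimate a smoothed unigram distribution from a list of documents."""
    counts = Counter()
    for doc in corpus:
        counts.update(doc.lower().split())
    total = sum(counts.values())
    vocab = len(counts)
    # Laplace smoothing so unseen words do not zero out the likelihood.
    freqs = {w: (c + 1) / (total + vocab) for w, c in counts.items()}
    unseen_prob = 1 / (total + vocab)
    return freqs, unseen_prob


def log_likelihood(text, freqs, unseen_prob):
    """Log-probability of a text under a unigram word-frequency model."""
    return sum(math.log(freqs.get(w, unseen_prob)) for w in text.lower().split())


def likely_ai_generated(text, human_model, ai_model):
    """Return True if the text fits the AI-derived distribution better."""
    h_freqs, h_unseen = human_model
    a_freqs, a_unseen = ai_model
    return log_likelihood(text, a_freqs, a_unseen) > log_likelihood(text, h_freqs, h_unseen)


# Hypothetical usage: pre_chatgpt_docs and llm_docs are corpora the caller supplies.
# human_model = word_frequencies(pre_chatgpt_docs)
# ai_model = word_frequencies(llm_docs)
# print(likely_ai_generated("We are writing to express our concern...", human_model, ai_model))
```

In this toy version, a text is flagged when it fits the distribution estimated from LLM-generated samples better than the one estimated from pre-ChatGPT writing; the actual study estimates the proportion of AI-touched content across whole collections of documents rather than labeling each one individually.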