Google accused of using novices to fact-check Gemini's AI answers

There's no arguing that AI still has quite a few unreliable moments, but one would hope that at least its evaluations would be accurate. However, last week Google allegedly instructed contract workers evaluating Gemini not to skip any prompts, regardless of their expertise, TechCrunch reports based on internal guidance it viewed. Google shared a preview of Gemini 2.0 earlier this month.  

Google reportedly instructed GlobalLogic, an outsourcing firm whose contractors evaluate AI-generated output, not to have reviewers skip prompts outside their expertise. Previously, contractors could choose to skip any prompt that fell far outside their expertise, such as a doctor being asked about law. The guidelines had stated, "If you do not have critical expertise (e.g. coding, math) to rate this prompt, please skip this task."

Now, contractors have allegedly been instructed, "You should not skip prompts that require specialized domain knowledge" and told that they should "rate the parts of the prompt you understand" while adding a note that it's not an area they have knowledge in. Apparently, the only times contractors can skip now are if a big chunk of the information is missing or if the prompt contains harmful content that requires specific consent forms for evaluation.

One contractor aptly responded to the changes, stating, "I thought the point of skipping was to increase accuracy by giving it to someone better?"

Google has not responded to a request for comment. 

This article originally appeared on Engadget at https://www.engadget.com/ai/google-accused-of-using-novices-to-fact-check-geminis-ai-answers-143044552.html?src=rss
Posted Dec. 19, 2024, 15:10:13

