Google accused of using novices to fact-check Gemini's AI answers

There's no arguing that AI still has quite a few unreliable moments, but one would hope that at least its evaluations would be accurate. However, last week Google allegedly instructed contract workers evaluating Gemini not to skip any prompts, regardless of their expertise, TechCrunch reports based on internal guidance it viewed. Google shared a preview of Gemini 2.0 earlier this month.  

Google reportedly instructed GlobalLogic, an outsourcing firm whose contractors evaluate AI-generated output, not to have reviewers skip prompts outside of their expertise. Previously, contractors could choose to skip any prompt that fell far outside their expertise, such as a doctor being asked about law. The guidelines had stated, "If you do not have critical expertise (e.g. coding, math) to rate this prompt, please skip this task."

Now, contractors have allegedly been instructed, "You should not skip prompts that require specialized domain knowledge" and that they should "rate the parts of the prompt you understand" while adding a note that it's not an area they have knowledge in. Apparently, the only times contractors can skip now are if a big chunk of the information is missing or if the prompt contains harmful content that requires specific consent forms for evaluation.

One contractor aptly responded to the changes, stating, "I thought the point of skipping was to increase accuracy by giving it to someone better?"

Google has not responded to a request for comment. 

This article originally appeared on Engadget at https://www.engadget.com/ai/google-accused-of-using-novices-to-fact-check-geminis-ai-answers-143044552.html?src=rss
Posted 3mo ago | Dec 19, 2024, 15:10:13

