Why LLMs still have problems with OCR

The launch of Gemini 2.0 for document ingestion caused a lot of buzz this week. As a team building in this space, this is something we researched thoroughly. Here's our take: ingestion is a multistep pipeline, and maintaining confidence in nondeterministic LLM outputs across millions of pages is a real problem.
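To make the confidence problem concrete, here is a minimal sketch of a multistep pipeline that gates each extracted page on a confidence score and routes low-confidence pages to review. All names (extract_with_llm, CONFIDENCE_THRESHOLD) and the scoring logic are illustrative assumptions, not the approach described in the linked post.

```python
# Sketch: ingestion pipeline with a confidence gate for LLM-based OCR.
# Hypothetical names; the real extraction call and scoring would differ.
from dataclasses import dataclass
import random

CONFIDENCE_THRESHOLD = 0.9  # assumed cutoff for accepting a page automatically

@dataclass
class PageResult:
    page_id: int
    text: str
    confidence: float  # score in [0, 1], model- or heuristic-derived

def extract_with_llm(page_id: int) -> PageResult:
    """Stand-in for an LLM OCR/extraction call; real outputs are nondeterministic."""
    return PageResult(page_id, f"extracted text for page {page_id}",
                      random.uniform(0.7, 1.0))

def ingest(page_ids: list[int]) -> tuple[list[PageResult], list[PageResult]]:
    """Accept high-confidence pages; flag the rest for human review."""
    accepted, needs_review = [], []
    for pid in page_ids:
        result = extract_with_llm(pid)
        if result.confidence >= CONFIDENCE_THRESHOLD:
            accepted.append(result)
        else:
            needs_review.append(result)
    return accepted, needs_review

if __name__ == "__main__":
    ok, review = ingest(list(range(1, 11)))
    print(f"{len(ok)} pages accepted, {len(review)} flagged for review")
```

At millions of pages, even a small fraction of low-confidence outputs means a large absolute review queue, which is the scaling problem the post points at.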


Comments URL: https://news.ycombinator.com/item?id=42966958

Points: 103

# Comments: 75

https://www.runpulse.com/blog/why-llms-suck-at-ocr

Created 3h ago | Feb 8, 2025, 08:20:04

