Show HN: OCR pipeline for ML training (tables, diagrams, math, multilingual)

Hi HN,

I’ve been working on an OCR pipeline built for preparing machine learning datasets. It processes complex academic materials, including math formulas, tables, figures, and multilingual text, and outputs clean, structured formats such as JSON and Markdown.

Some features:
• Multi-stage OCR combining DocLayout-YOLO, Google Vision, MathPix, and Gemini Pro Vision
• Extracts and understands diagrams, tables, LaTeX-style math, and multilingual text (Japanese/Korean/English)
• Highly tuned for ML training pipelines, including dataset generation and preprocessing for RAG or fine-tuning tasks
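To make "multi-stage" and "structured output" a bit more concrete, here is a minimal Python sketch of the general idea: a layout step yields typed regions, each region type is routed to its own recognizer, and the results are serialized to JSON and Markdown. This is not code from the repository. The `Region` schema, the handler functions, and the input format are all hypothetical stand-ins, and the recognizers are plain string functions rather than calls to DocLayout-YOLO, Google Vision, MathPix, or Gemini.

```python
import json
from dataclasses import dataclass, asdict
from typing import Callable, Dict, List


@dataclass
class Region:
    """One detected layout region on a page (hypothetical schema)."""
    kind: str     # "text", "math", "table" or "figure"
    order: int    # reading order on the page
    content: str  # recognized text, LaTeX, Markdown table, or description


# Stand-in recognizers (assumptions for illustration only). In a real
# pipeline each one would call a different OCR or vision backend.
def recognize_text(raw: str) -> str:
    return raw.strip()


def recognize_math(raw: str) -> str:
    # Wrap the recognized formula as display-style LaTeX.
    return f"$${raw.strip()}$$"


def recognize_table(raw: str) -> str:
    # Turn comma-separated rows into a Markdown table; first row is the header.
    rows = [[cell.strip() for cell in line.split(",")]
            for line in raw.strip().splitlines()]
    header = "| " + " | ".join(rows[0]) + " |"
    divider = "| " + " | ".join("---" for _ in rows[0]) + " |"
    body = ["| " + " | ".join(row) + " |" for row in rows[1:]]
    return "\n".join([header, divider, *body])


def describe_figure(raw: str) -> str:
    return f"[Figure: {raw.strip()}]"


HANDLERS: Dict[str, Callable[[str], str]] = {
    "text": recognize_text,
    "math": recognize_math,
    "table": recognize_table,
    "figure": describe_figure,
}


def process_page(detections: List[dict]) -> List[Region]:
    """Route each detected region to the recognizer for its type."""
    regions = []
    for order, det in enumerate(detections):
        handler = HANDLERS.get(det["kind"], recognize_text)  # fall back to text
        regions.append(Region(det["kind"], order, handler(det["raw"])))
    return regions


def to_markdown(regions: List[Region]) -> str:
    ordered = sorted(regions, key=lambda r: r.order)
    return "\n\n".join(r.content for r in ordered)


def to_json(regions: List[Region]) -> str:
    # ensure_ascii=False keeps Japanese/Korean text readable in the output.
    return json.dumps([asdict(r) for r in regions], ensure_ascii=False, indent=2)
```

Calling `process_page` on a list of `{"kind": ..., "raw": ...}` dictionaries and passing the result to `to_markdown` or `to_json` gives one record per region, which is a convenient shape for building RAG or fine-tuning datasets.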

Sample outputs and real exam-based examples are included (EJU Biology, UTokyo Math, etc.). I'd love to hear any feedback or ideas for improvement.

GitHub: https://github.com/ses4255/Versatile-OCR-Program

Comments URL: https://news.ycombinator.com/item?id=43590998

Points: 16

# Comments: 1


Created 16h ago | 05.04.2025, 06:50:06
