Article URL: https://github.com/NVIDIA/nv-ingest
Comments URL: https://news.ycombinator.com/item?id=42654019
Points: 4
# Comments: 0
Created: 2d | 10 Jan 2025, 11:10:08
Other posts in this group
We’ve just open-sourced SemHash, a lightweight package for semantic text deduplication. It lets you effortlessly clean up your datasets and avoid pitfalls caused by duplicate samples in semantic s