Created 1mo ago | Dec 6, 2024, 4:30:12 PM
Other posts in this group
We’ve just open-sourced SemHash, a lightweight package for semantic text deduplication. It lets you effortlessly clean up your datasets and avoid pitfalls caused by duplicate samples in semantic s