Finding near-duplicates with Jaccard similarity and MinHash