As a grad student (and an ADHDer), I had trouble doing literature reviews systematically. To combat this, I made a website that finds similar papers based on the meaning of what I am looking for.
I used MixedBread's embedding model [1] to generate vectors from the abstracts. I store the vectors and search for similar ones using Milvus [2], and serve the frontend with Gradio [3]. I update the vector database weekly by pulling the metadata dataset from Kaggle [4].
To speed up search on my free Oracle instance, I binarise the embeddings and use Hamming distance as the metric.
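For anyone curious what binarising and Hamming search look like in practice, here is a minimal pure-Python sketch. It is not the site's actual code: the zero threshold, the 64-dimensional random vectors, and the helper names `binarise`, `hamming`, and `search` are all assumptions made for illustration. In the real setup the vector database does the search.

```python
import random

def binarise(vector):
    """Pack the sign pattern of a float vector into one int (1 bit per dimension).

    Thresholding at zero is an assumption; other cut-offs are possible.
    """
    bits = 0
    for value in vector:
        bits = (bits << 1) | (1 if value > 0 else 0)
    return bits

def hamming(a, b):
    """Count the bit positions where two packed codes differ."""
    return bin(a ^ b).count("1")

def search(query_code, codes, top_k=3):
    """Brute-force search: return (index, distance) pairs for the top_k closest codes."""
    scored = sorted((hamming(query_code, code), idx) for idx, code in enumerate(codes))
    return [(idx, dist) for dist, idx in scored[:top_k]]

# Toy stand-ins for abstract embeddings (random, 64 dimensions).
rng = random.Random(0)
corpus = [[rng.uniform(-1, 1) for _ in range(64)] for _ in range(100)]
codes = [binarise(v) for v in corpus]

# A query that is a slightly perturbed copy of document 42.
query = [x + rng.uniform(-0.05, 0.05) for x in corpus[42]]
results = search(binarise(query), codes)
print(results)
```

Each dimension shrinks from a 32-bit float to a single bit, and the distance becomes an XOR plus a bit count, which is why this is cheap on a small instance. The trade-off is some loss in ranking quality compared with full-precision similarity.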
I would love your feedback on the site :) Happy Holidays!
[1]: https://www.mixedbread.ai/docs/embeddings/mxbai-embed-large-...
[2]: https://milvus.io/
[3]: https://www.gradio.app/
[4]: https://www.kaggle.com/datasets/Cornell-University/arxiv
Comments URL: https://news.ycombinator.com/item?id=42507116
Points: 14
# Comments: 0