Show HN: I made a website to semantically search ArXiv papers

As a grad student (and an ADHDer), I had trouble reviewing literature systematically. To combat this, I made a website that finds similar papers based on the meaning of what I am searching for.

I used MixedBread's [1] embedding model to generate vectors from the abstracts. I store and search similar vectors using Milvus [2] and finally use Gradio [3] to serve the frontend. I update the vector database weekly by pulling the metadata dataset from Kaggle [4].
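
For readers curious about the weekly update step, here is a minimal sketch of how new abstracts could be pulled out of the Kaggle metadata file before embedding. It is not the site's actual code. It assumes the dataset is a JSON Lines file whose records carry `id`, `title`, `abstract` and `update_date` (YYYY-MM-DD) fields; the function name `iter_new_abstracts` and the cutoff date are hypothetical.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_new_abstracts(lines: Iterable[str], since: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (id, title, abstract) for records updated on or after `since`.

    `since` is an ISO date string (YYYY-MM-DD), so plain string comparison
    orders dates correctly. Field names are assumptions about the dataset.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("update_date", "") >= since:
            # Collapse the hard line breaks and padding found in raw metadata.
            title = " ".join(record["title"].split())
            abstract = " ".join(record["abstract"].split())
            yield record["id"], title, abstract


# Two made-up records standing in for lines of the real file.
sample = [
    '{"id": "0704.0001", "title": "An old\\n  paper", "abstract": "  Old text.\\n", "update_date": "2008-11-26"}',
    '{"id": "2412.00001", "title": "A new paper", "abstract": "  New\\n  text.\\n", "update_date": "2024-12-20"}',
]
new_records = list(iter_new_abstracts(sample, since="2024-12-18"))
```

Only the records that pass the date filter would then need to be embedded and inserted into the vector database, which keeps a weekly refresh cheap.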

To speed up the search process on my free Oracle instance, I binarise the embeddings and use Hamming distance as the metric.
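
To illustrate the idea, here is a minimal pure-Python sketch of binarised search. It is not the site's implementation (there the vector database does this work). It assumes a threshold of zero for binarisation, and the function names `binarise`, `hamming` and `search` are made up for the example.

```python
from typing import List, Sequence, Tuple


def binarise(embedding: Sequence[float]) -> int:
    """Pack a float vector into an int: one bit per component, 1 where it is > 0."""
    code = 0
    for value in embedding:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def search(query_code: int, corpus_codes: Sequence[int], k: int = 5) -> List[Tuple[int, int]]:
    """Return the k nearest corpus entries as (index, distance), closest first."""
    scored = sorted((hamming(query_code, code), idx) for idx, code in enumerate(corpus_codes))
    return [(idx, dist) for dist, idx in scored[:k]]


# Toy 8-dimensional "embeddings"; real ones have far more dimensions.
corpus = [
    [1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0],
    [1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0, -1.0],
    [-1.0, -1.0, -1.0, -1.0, 1.0, 1.0, 1.0, 1.0],
    [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
]
codes = [binarise(vector) for vector in corpus]
query = [0.3, 0.9, 0.1, 0.5, -0.2, -0.7, -0.4, -0.1]
results = search(binarise(query), codes, k=3)
# results == [(0, 0), (1, 1), (3, 4)]
```

Each component shrinks from a 32-bit float to a single bit, so the index is 32 times smaller, and comparing two vectors becomes an XOR followed by a bit count. The trade-off is some loss of ranking precision compared with cosine similarity on the full vectors.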

I would love your feedback on the site :) Happy Holidays!

[1]: https://www.mixedbread.ai/docs/embeddings/mxbai-embed-large-...
[2]: https://milvus.io/
[3]: https://www.gradio.app/
[4]: https://www.kaggle.com/datasets/Cornell-University/arxiv

Comments URL: https://news.ycombinator.com/item?id=42507116

Points: 14

# Comments: 0

https://papermatch.mitanshu.tech/

Created 4mo ago | 25 Dec 2024, 10:10:08
