DeepSeek's multi-head latent attention and other KV cache tricks

Article URL: https://www.pyspur.dev/blog/multi-head-latent-attention-kv-cache-paper-list

Comments URL: https://news.ycombinator.com/item?id=42858741

Points: 109

# Comments: 10

Created 1mo ago | 29 Jan 2025, 00:20:21
