DeepSeek's multi-head latent attention and other KV cache tricks

Article URL: https://www.pyspur.dev/blog/multi-head-latent-attention-kv-cache-paper-list

Comments URL: https://news.ycombinator.com/item?id=42858741

Points: 109

# Comments: 10

Created 1 month ago | 29. 1. 2025 0:20:21

