Show HN: Beating Pokemon Red with RL and <10M Parameters

Hi everyone!

After hundreds of hours of work, we're excited to finally share our progress on a reinforcement learning system that beats Pokémon Red. The system completes the game using a policy with fewer than 10M parameters, trained with PPO and a few novel techniques. With the release of Claude Plays Pokémon, now feels like the perfect time to showcase our work.

We'd love to get feedback!

Comments URL: https://news.ycombinator.com/item?id=43269330

Points: 41

# Comments: 26

Article URL: https://drubinstein.github.io/pokerl/

Created 1mo ago | 5 March 2025, 20:20:12
