Alignment faking in large language models

Article URL: https://www.lesswrong.com/posts/njAZwT8nkHnjipJku/alignment-faking-in-large-language-models

Comments URL: https://news.ycombinator.com/item?id=42733593

Points: 22

# Comments: 2


Created 1mo ago | 19. 1. 2025, 15:30:09


Other posts in this group

Thailand to Cut Power to Myanmar Scam Hubs


Article URL: https://bangkoklocal.info/2025/02/05/thailand-to-cut-power-to-myanmar-scam-hubs/

23. 2. 2025, 14:40:12 | Hacker news

NASA Downgrades the Risk of 2024 YR4 to Below 1%


Article URL: https://www.universetoday.com/171047/nasa-downgrades-the-risk-of-2024-yr4-to-below-1/

23. 2. 2025, 14:40:10 | Hacker news

Tell HN: Five random IndieWeb blog links on your terminal


Hello HN! I believe some of you might have come across this pretty interesting post about discovering IndieWeb blogs, one blog at a time:

23. 2. 2025, 14:40:08 | Hacker news

But good sir, what is electricity?


Article URL: https://lcamtuf.substack.com/p/but-good-sir-what-is-electricity

Comments URL:

23. 2. 2025, 14:40:07 | Hacker news

OpenJKDF2 – A cross-platform reimplementation of JKDF2 in C


Article URL: https://github.com/shinyquagsire23/OpenJKDF2

Comments URL: ht

23. 2. 2025, 14:40:06 | Hacker news

War Rooms vs. Deep Investigations


Article URL: https://rachelbythebay.com/w/2025/02/22/war/

Comments URL: ht

23. 2. 2025, 14:40:05 | Hacker news

BYD has already produced its first solid-state cells


Article URL: https://www.electrive.com/2025/02/17/byd-has-already-produced-its-first-solid-stat

23. 2. 2025, 14:40:04 | Hacker news
