Alignment faking in large language models

Article URL: https://www.lesswrong.com/posts/njAZwT8nkHnjipJku/alignment-faking-in-large-language-models

Comments URL: https://news.ycombinator.com/item?id=42733593

Points: 22

# Comments: 2

Created 1mo ago | 19.01.2025, 15:30:09

Other posts in this group

See the submissions you have flagged (maybe accidentally)

Article URL: https://news.ycombinator.com/flagged

Comments URL: https://news.ycomb

23.02.2025, 16:50:18 | Hacker news

Vietnamese Graphic Design

Article URL: https://vietgd.com/

Comments URL: https://news.ycombinator.com/item?id=43149266

23.02.2025, 16:50:17 | Hacker news

Navigating a Broken Dev Culture

My Current Job is a Mess—But I Won’t Let It Define Me

(Disclaimer: Not here to trash my company, just sharing my experience.)

I work on the AI team at my current job, meaning I get to play wit

23.02.2025, 16:50:15 | Hacker news

Immune markers of post COVID vaccination syndrome indicate future research

Article URL: https://news.yale.edu/2025/02/19/immune-markers-post-vaccinatio

23.02.2025, 16:50:15 | Hacker news

Pee If You Want to Go Deeper (2021)

Article URL: https://peeifyouwanttogofaster.com/2021/05/24/pee-if-you-want-to-go-deeper/

Comments U

23.02.2025, 16:50:14 | Hacker news

'Everybody is looking at their phones,' says man freed after 30 years in prison

Article URL: https://news.sky.com/story/everybody-is-looking-at-their-pho

23.02.2025, 16:50:13 | Hacker news

Chipzilla Devours the Desktop

Article URL: https://www.abortretry.fail/p/chipzilla-devours-the-desktop

Comments URL:

23.02.2025, 16:50:12 | Hacker news

Techie