ASTRA: HackerRank's coding benchmark for LLMs

We help companies hire and upskill developers. A customer recently asked: what percentage of HackerRank problems can LLMs solve? That got us thinking: how should hiring evolve when AI can translate natural language into code?

Our belief: AI will handle much of code generation, so developers will be assessed more on how they apply software development lifecycle (SDLC) skills while working with AI assistants.

To explore this, we’re benchmarking LLMs on real-world software development scenarios, starting with 65 unseen problems across 10 domains. Beyond correctness, we evaluated consistency, an often overlooked aspect of AI reliability. We’re open-sourcing the dataset on Hugging Face and expanding it to cover more domains, ambiguous specs, and harder challenges.

Would love the HN community’s take on this!

Comments URL: https://news.ycombinator.com/item?id=43015631

Points: 4

# Comments: 0

https://www.hackerrank.com/ai/astra-reports

Created 1mo | Feb 11, 2025, 10:10:17 PM
