Steiner is a series of reasoning models trained on synthetic data using reinforcement learning. These models can explore multiple reasoning paths in an autoregressive manner during inference and autonomously verify or backtrack when necessary, enabling a linear traversal of the implicit search tree.
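To make "linear traversal of the implicit search tree" concrete, here is a toy sketch. It is not Steiner's training or inference code. The tree, the goal node, and the `linearize` function are all hypothetical, and a plain depth-first search stands in for whatever the model actually learns. It only shows how exploring, verifying, and backtracking can be written out as one flat, left-to-right sequence, which is the shape an autoregressive model would have to emit.

```python
from typing import Dict, List

# Hypothetical toy search tree: each node maps to its child nodes.
TREE: Dict[str, List[str]] = {
    "start": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1"],
    "a1": [],
    "a2": [],
    "b1": [],
}


def linearize(tree: Dict[str, List[str]], root: str, goal: str) -> List[str]:
    """Depth-first search that records every step, backtracks included,
    as a single flat sequence (an illustrative stand-in, not the model)."""
    trace: List[str] = []

    def visit(node: str) -> bool:
        trace.append(f"explore {node}")
        if node == goal:
            # Stand-in for the model checking its own intermediate result.
            trace.append(f"verify {node}: ok")
            return True
        for child in tree[node]:
            if visit(child):
                return True
            # The failed branch stays in the trace; we only step back up.
            trace.append(f"backtrack to {node}")
        return False

    visit(root)
    return trace


if __name__ == "__main__":
    for step in linearize(TREE, "start", "b1"):
        print(step)
```

The dead-end branches under `a` remain in the output rather than being discarded, so the whole search reads as one sequence with no separate search procedure running outside it.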
Blog: https://medium.com/@peakji/a-small-step-towards-reproducing-...
Hugging Face: https://huggingface.co/collections/peakji/steiner-preview-67...
Comments URL: https://news.ycombinator.com/item?id=41915735
Points: 20
# Comments: 7
https://medium.com/@peakji/a-small-step-towards-reproducing-openai-o1-b9a756a00855