Steiner is a series of reasoning models trained on synthetic data using reinforcement learning. These models can explore multiple reasoning paths in an autoregressive manner during inference and autonomously verify or backtrack when necessary, enabling a linear traversal of the implicit search tree.
Blog: https://medium.com/@peakji/a-small-step-towards-reproducing-...
Hugging Face: https://huggingface.co/collections/peakji/steiner-preview-67...
Comments URL: https://news.ycombinator.com/item?id=41915735
Points: 20
# Comments: 7
https://medium.com/@peakji/a-small-step-towards-reproducing-openai-o1-b9a756a00855
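As a rough illustration of what "a linear traversal of the implicit search tree" could look like, the sketch below walks a toy tree depth-first and flattens every visited step, backtracks included, into a single sequence. It is not Steiner's implementation: the `Node` class, the `is_valid` flag standing in for self-verification, and the `<backtrack to ...>` markers are hypothetical names chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One reasoning step in a toy search tree (hypothetical structure)."""
    label: str
    is_valid: bool = True    # assumed stand-in for the model's self-verification
    is_goal: bool = False    # marks a step that completes the answer
    children: List["Node"] = field(default_factory=list)


def linearize(node: Node, trace: List[str]) -> bool:
    """Depth-first walk that appends every visited step to `trace`.

    Returns True as soon as a goal step is reached. Failed branches stay
    in the trace, followed by an explicit backtrack marker, so the whole
    exploration reads as one linear sequence.
    """
    trace.append(node.label)
    if not node.is_valid:
        return False          # verification failed: abandon this branch
    if node.is_goal:
        return True
    for child in node.children:
        if linearize(child, trace):
            return True
        trace.append(f"<backtrack to {node.label}>")
    return False


# Toy tree: path A dead-ends on an invalid step, path B reaches the goal.
root = Node("problem", children=[
    Node("try A", children=[Node("A1", is_valid=False)]),
    Node("try B", children=[Node("B1", is_goal=True)]),
])

trace: List[str] = []
found = linearize(root, trace)
print(found)
print(" -> ".join(trace))
```

In this toy version the failed branch remains visible in the output, which mirrors the idea that exploration, verification, and backtracking all appear in one autoregressive stream rather than in a separate search procedure.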