We've open-sourced Klarity - a tool for analyzing uncertainty and decision-making in LLM token generation. It provides structured insights into how models choose tokens and where they show uncertainty.
What Klarity does:
- Real-time analysis of model uncertainty during generation
- Dual analysis combining log probabilities and semantic understanding
- Structured JSON output with actionable insights
- Fully self-hostable with customizable analysis models
The tool analyzes each step of text generation and returns structured JSON with these fields:
- uncertainty_points: array of {step, entropy, options[], type}
- high_confidence: array of {step, probability, token, context}
- risk_areas: array of {type, steps[], motivation}
- suggestions: array of {issue, improvement}
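To make the log-probability half of that output concrete, here is a minimal sketch of how per-step entropy could populate `uncertainty_points` and `high_confidence`. This is not Klarity's implementation: the function names, the two thresholds, and the `"high_entropy"` type label are assumptions made up for illustration, and `risk_areas` and `suggestions` are left out because they would come from the semantic analysis model.

```python
import json
import math

# Hypothetical thresholds, chosen only for this illustration.
ENTROPY_THRESHOLD = 1.0      # nats; steps above this count as uncertain
CONFIDENCE_THRESHOLD = 0.9   # top-token probability above this counts as confident


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def analyze_steps(steps):
    """Build a partial report from per-step token distributions.

    `steps` is a list of dicts mapping candidate token -> probability.
    """
    report = {"uncertainty_points": [], "high_confidence": []}
    context = ""
    for i, dist in enumerate(steps):
        # The most likely candidate is treated as the generated token.
        token, prob = max(dist.items(), key=lambda kv: kv[1])
        h = entropy(dist.values())
        if h > ENTROPY_THRESHOLD:
            report["uncertainty_points"].append({
                "step": i,
                "entropy": round(h, 3),
                "options": sorted(dist, key=dist.get, reverse=True),
                "type": "high_entropy",  # hypothetical label
            })
        elif prob > CONFIDENCE_THRESHOLD:
            report["high_confidence"].append({
                "step": i,
                "probability": prob,
                "token": token,
                "context": context,
            })
        context += token
    return report


# Toy distributions standing in for real model output.
steps = [
    {"Paris": 0.97, "Lyon": 0.02, "Nice": 0.01},
    {" is": 0.3, " was": 0.25, " has": 0.25, " lies": 0.2},
]
report = analyze_steps(steps)
print(json.dumps(report, indent=2))
```

With these toy inputs, the first step lands in `high_confidence` and the second, where four candidates have similar probability, lands in `uncertainty_points`.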
Klarity currently supports Hugging Face Transformers, with more frameworks coming. We tested extensively with Qwen2.5 models (0.5B-7B), but it should work with most Hugging Face LLMs.
Installation is simple: `pip install git+https://github.com/klara-research/klarity.git`
We are building open-source interpretability/explainability tools to visualize and analyze attention maps, saliency maps, etc., and we want to understand your pain points with LLM behavior. What insights would actually help you debug these black-box systems?
Links:
- Repo: https://github.com/klara-research/klarity
- Our website: https://klaralabs.com
Comments URL: https://news.ycombinator.com/item?id=42918237