Would you board a plane safety-tested by GenAI?

Ben and Ryan are joined by Robin Gupta for a conversation about benchmarking and testing AI systems. They talk through the lack of trust and confidence in AI, the inherent challenges of nondeterministic systems, the role of human verification, and whether we can (or should) expect an AI to be reliable. https://stackoverflow.blog/2024/05/24/would-you-board-a-plane-safety-tested-by-genai/

Created May 24, 2024, 5:50:06 AM

Other posts in this group

How do you fact-check an AI?

Ryan chats with Amr Awadallah, founder and CEO of GenAI platform Vectara. They cover how retrieval-augmented generation (RAG) has advanced, why fact-checking and accurate data are essential in buildin

Apr 11, 2025, 5:40:06 AM | StackOverflow blog

“There is a real cost to moving fast”: Using AI to accelerate drug discovery

On this episode of Leaders of Code, Ben Popper hosts a conversation with Maureen Makes, VP of Engineering at Recursion, and Ellen Brandenberger, Senior Director of Product Strategy for Overflow API. T

Apr 10, 2025, 6:30:07 AM | StackOverflow blog

Bottom of the first: A veteran VC’s take on the AI landscape

Ryan welcomes Tomasz Tunguz of Theory Ventures back to the podcast to talk about the intersection of AI and venture capital, the implications of AI on the labor market, and the future of AI applicatio

Apr 8, 2025, 5:40:09 AM | StackOverflow blog

Open-source AI: Are younger developers leading the way?

In March, over 1,000 developers and technologists gave us insights into what they think about open source and the role it plays with AI. https://stackoverflow.blog/2025/04/07/open-source-ai-are-younge

Apr 7, 2025, 3:50:08 PM | StackOverflow blog

Using GenAI as a learning tool, not a crutch

AI is changing how we think about coding. While tools evolve, critical thinking, problem-solving, and creativity remain the essential skills for top developers. https://stackoverflow.blog/2025/04/04/u

Apr 4, 2025, 4:10:07 PM | StackOverflow blog

Is AI a bubble or a revolution? The answer is yes.

At HumanX 2025, Ryan sat down with HumanX CEO Stefan Weitz and Crunchbase CEO Jager McConnell to talk about where the money is in the AI space, where most enterprise AI strategies fall short, how comp

Apr 4, 2025, 6:50:02 AM | StackOverflow blog

From training to inference: The new role of web data in LLMs

Data has always been key to LLM success, but it's becoming key to inference-time performance as well. https://stackoverflow.blog/2025/04/03/from-training-to-inference-the-new-role-of-web-data-in-llms

Apr 3, 2025, 4:50:09 PM | StackOverflow blog
