Integrating Image-To-Text And Text-To-Speech Models (Part 1)

Joas Pambou built an app that integrates vision language models (VLMs) and text-to-speech (TTS) AI technologies to describe images aloud. This audio description tool can help people with sight challenges understand what’s in an image. But how does it even work? Joas explains how these AI systems work and their potential uses, including how he built the app and ways to further improve it. https://smashingmagazine.com/2024/07/integrating-image-to-text-and-text-to-speech-models-part1/

Created Jul 24, 2024, 4:20:19 PM
