BOULDER, Colo. - ColoradoDesk -- Built on Vision Agents with Anam and Inworld to demonstrate emotionally aware, video-first AI
Stream released an open-source AI agent that responds to a user's facial expressions, gaze, and engagement in real time. The agent, called Crashout Buddy, is live at visionagents.ai.
The era of the floating orb is over. Most voice agents today are blind. They convert speech to text, run it through an LLM, and read the response back in a flat tone regardless of whether the user is laughing, frustrated, or close to tears. Built on Stream's Vision Agents framework in collaboration with Anam and Inworld, Crashout Buddy watches the user's face and shapes both what the agent says and how it says it. When the user goes quiet, it notices. When they look like they're about to lose it, it softens.
How It Works
The agent runs a multimodal perception stack on Stream's global edge network. MediaPipe tracks 52 facial blendshapes at 8 fps to classify emotion, gaze, and engagement. That signal is injected into the LLM (Gemini) on every turn, and the LLM steers Inworld's TTS-2 voice model with natural-language directions such as [say warmly with light, easy energy]. Anam renders a photorealistic, lip-synced avatar. Deepgram handles speech-to-text.
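To make the per-turn flow concrete, here is a minimal Python sketch of how a facial-state summary might be folded into the text the LLM sees, and how a bracketed voice direction might be attached to a reply. It is an illustration only, not the Vision Agents API: FacialState, build_turn_context, tag_reply, the field names, and the 0.0 to 1.0 engagement range are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FacialState:
    """Hypothetical per-turn summary of what the perception stack saw."""
    emotion: str       # e.g. "happy", "frustrated", "neutral" (assumed labels)
    gaze: str          # e.g. "on_camera", "away" (assumed labels)
    engagement: float  # assumed range: 0.0 (disengaged) to 1.0 (engaged)


def build_turn_context(state: FacialState, user_text: str) -> str:
    """Prepend the current facial state to the user's words for one LLM turn."""
    return (
        f"[facial state: emotion={state.emotion}, gaze={state.gaze}, "
        f"engagement={state.engagement:.2f}]\n"
        f"User said: {user_text}"
    )


def tag_reply(direction: str, reply: str) -> str:
    """Prefix a reply with a bracketed natural-language voice direction."""
    return f"[{direction}] {reply}"


state = FacialState(emotion="frustrated", gaze="away", engagement=0.3)
prompt = build_turn_context(state, "This is not working.")
spoken = tag_reply(
    "say warmly with light, easy energy",
    "Let's take it one step at a time.",
)
```

In this sketch, prompt is what would be sent to the LLM for the turn, and spoken shows the shape of text a voice model could receive. In the real agent the LLM chooses the direction itself.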
The same pattern (facial state, rich agent context, expressive voice, lip-synced avatar) suits apps in dating, coaching, recruitment, tutoring, and customer support.
Key capabilities include:
- Emotion, gaze, and engagement classification with hysteresis to prevent flicker
- Natural-language voice steering in 100+ languages via Inworld TTS-2
- Photorealistic lip-synced avatar via Anam's CARA model
- Proactive re-engagement when the user drifts off-camera or goes quiet
- Composable processors running at independent frame rates
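The hysteresis mentioned in the first capability can be sketched in a few lines of Python. This is a generic two-threshold illustration, not the project's actual classifier: the EngagementHysteresis class, the 0.6 and 0.4 thresholds, and the sample scores are assumptions chosen for the example.

```python
class EngagementHysteresis:
    """Two-threshold switch: harder to enter a state than to stay in it."""

    def __init__(self, enter: float = 0.6, leave: float = 0.4) -> None:
        if leave >= enter:
            raise ValueError("leave threshold must be below enter threshold")
        self.enter = enter
        self.leave = leave
        self.engaged = False

    def update(self, score: float) -> bool:
        """Feed one per-frame engagement score and return the stable state."""
        if self.engaged and score <= self.leave:
            self.engaged = False
        elif not self.engaged and score >= self.enter:
            self.engaged = True
        return self.engaged


# Hypothetical noisy per-frame scores hovering around the decision boundary.
scores = [0.45, 0.55, 0.62, 0.58, 0.52, 0.61, 0.38, 0.50]

# A single threshold flips back and forth as the score wobbles.
naive = [s >= 0.55 for s in scores]

# With hysteresis the state changes only on a clear crossing.
smoother = EngagementHysteresis()
stable = [smoother.update(s) for s in scores]
```

With these sample scores, naive changes state four times while stable changes only twice. That gap is the flicker hysteresis is meant to suppress.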
Availability
The full project is open source. Try the demo at visionagents.ai, read the guide on the Stream blog, or explore the code at: https://github.com/GetStream/Vision-Agents
Source: Getstream.io