First Open-Platform, Video-First SDK for Real-Time Vision AI
BOULDER, Colo. - ColoradoDesk -- Stream, the leading provider of scalable chat, video, and feeds APIs, today announced Vision Agents, the first open-source, open-platform SDK bringing real-time video and audio intelligence into developer applications.
Unlike existing frameworks that bolt video onto voice-first systems, Vision Agents was designed video-first from day one.
"Most frameworks started with voice and later added video," said Thierry Schellenbach, CEO and Co-Founder of Stream. "We built the opposite: a video-first foundation that's open, extensible, and developer-friendly."
Developers can now create AI-powered agents that see, hear, and remember in real time, enabling a new generation of interactive, multimodal applications.
Open Platform for AI Innovation
Vision Agents works with Stream Video by default but also integrates with other video SDKs and supports AI providers, including OpenAI Realtime, Google Gemini, and custom models. This flexibility lets companies adopt Vision Agents without disrupting existing infrastructure, while Stream Video and Chat users gain deep integrations for memory, messaging, and performance.
Real-Time, Video-First Intelligence
Vision Agents processes live video with low latency, enabling real-time perception, scene detection, and natural audio or text responses. Core features include:
- Video-first intelligence for scene understanding.
- Real-time audio with transcription, speech, and voice activity detection.
- Memory and context to recall details naturally.
- Action-ready design to connect with external APIs and services.
Wide-Ranging Applications
Use cases span manufacturing (defect detection), collaboration (AI note-taking, transcription), gaming (coaching, avatars), accessibility (captions, descriptions), and customer support (multimodal assistants).
Open Source and Availability
Fully open-source, Vision Agents invites community contributions to extend providers and tools.
"Vision AI today feels like ChatGPT in 2022; it's just beginning to show what's possible," said Schellenbach.
Developers and partners can contribute new processors, adapters, and integrations directly on GitHub: https://github.com/GetStream/Vision-Agents
Source: GetStream.io