First Open-Platform, Video-First SDK for Real-Time Vision AI
BOULDER, Colo. - ColoradoDesk -- Stream, the leading provider of scalable chat, video, and feeds APIs, today announced Vision Agents, the first open-source, open-platform SDK bringing real-time video and audio intelligence into developer applications.
Unlike existing frameworks that bolt video onto voice-first systems, Vision Agents was designed video-first from day one.
"Most frameworks started with voice and later added video," said Thierry Schellenbach, CEO and Co-Founder of Stream. "We built the opposite: a video-first foundation that's open, extensible, and developer-friendly."
Developers can now create AI-powered agents that see, hear, and remember in real time, enabling a new generation of interactive, multimodal applications.
Open Platform for AI Innovation
Vision Agents works with Stream Video by default but also integrates with other video SDKs and supports AI providers, including OpenAI Realtime, Google Gemini, and custom models. This flexibility lets companies adopt Vision Agents without disrupting existing infrastructure, while Stream Video and Chat users gain deep integrations for memory, messaging, and performance.
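The pluggable-provider idea described above can be illustrated with a minimal Python sketch. This is not the Vision Agents API: the names `Frame`, `VisionProvider`, `EchoProvider`, and `Agent` are hypothetical and exist only to show how an agent might accept any backend that satisfies a small interface, assuming frames arrive as raw bytes with a timestamp.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Frame:
    """One decoded video frame (hypothetical type for this sketch)."""
    timestamp_ms: int
    pixels: bytes


class VisionProvider(Protocol):
    """Hypothetical interface that any AI backend would implement."""

    def describe(self, frame: Frame) -> str: ...


class EchoProvider:
    """Stand-in 'custom model' that reports frame size; makes no network calls."""

    def describe(self, frame: Frame) -> str:
        return f"frame at {frame.timestamp_ms} ms, {len(frame.pixels)} bytes"


class Agent:
    """Passes frames to the plugged-in provider and keeps a short memory."""

    def __init__(self, provider: VisionProvider, memory_size: int = 3) -> None:
        self.provider = provider
        self.memory_size = memory_size
        self.memory: list[str] = []

    def on_frame(self, frame: Frame) -> str:
        note = self.provider.describe(frame)
        self.memory.append(note)
        # Keep only the most recent notes.
        self.memory = self.memory[-self.memory_size:]
        return note


agent = Agent(EchoProvider())
reply = agent.on_frame(Frame(timestamp_ms=40, pixels=b"\x00" * 12))
```

Because `Agent` depends only on the `describe` method, swapping in a different backend would mean writing another class with that method; nothing else in the agent would need to change.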
Real-Time, Video-First Intelligence
Vision Agents processes live video with low latency, enabling real-time perception, scene detection, and natural audio or text responses. Core features include:
- Video-first intelligence for scene understanding.
- Real-time audio with transcription, speech, and voice activity detection.
- Memory and context to recall details naturally.
- Action-ready design to connect with external APIs and services.
Wide-Ranging Applications
Use cases span manufacturing (defect detection), collaboration (AI note-taking, transcription), gaming (coaching, avatars), accessibility (captions, descriptions), and customer support (multimodal assistants).
Open Source and Availability
Vision Agents is fully open source and invites community contributions to extend its providers and tools.
"Vision AI today feels like ChatGPT in 2022: it's just beginning to show what's possible," said Schellenbach.
Developers and partners can contribute new processors, adapters, and integrations directly on GitHub: https://github.com/GetStream/Vision-Agents
Source: GetStream.io