Welcome to this week’s NexaQuanta newsletter, where we bring you the most relevant developments shaping the future of enterprise AI.
As organisations continue to scale their AI investments, the focus is rapidly shifting from experimentation to performance, usability, and measurable business value.
In this edition, we highlight key updates that signal where enterprise AI is heading next.
- IBM and ElevenLabs bring voice-first, multilingual AI agents to the enterprise
- AWS and Cerebras target ultra-fast AI inference for real-time applications
- OpenAI exits video AI, signalling a shift toward high-ROI, practical use cases
- Google DeepMind advances real-time conversational AI with Gemini 3.1 Flash Live
Voice-Enabled AI Becomes Enterprise-Ready with IBM and ElevenLabs Integration
IBM and ElevenLabs have announced a strategic integration that brings advanced voice capabilities to IBM watsonx Orchestrate. This move enables enterprises to build AI agents that communicate through natural, human-like voice interactions at scale.
Expanding Agentic AI Beyond Text
With the addition of premium Text-to-Speech (TTS) and Speech-to-Text (STT) capabilities, businesses can now move from text-based automation to voice-first AI experiences. These agents can interact in over 70 languages, with support for regional accents and tones.
This is especially relevant for sectors such as:
- Banking and insurance
- Healthcare providers
- Government services
- Utilities and customer support
Organisations can now serve diverse user bases more effectively through multilingual and conversational AI systems.
Addressing Key Enterprise Challenges
Traditional voice systems often struggle with long wait times, rigid workflows, and robotic responses. This integration directly addresses these limitations by enabling:
- More natural and expressive voice interactions
- Scalable, high-volume communication handling
- Improved user experience across customer and employee touchpoints
Built for Security, Compliance, and Scale
The integration is designed with enterprise requirements in mind. Key capabilities include:
- PCI compliance for secure transactions
- Zero Retention Mode supporting HIPAA-aligned data handling
- Data residency controls for regional compliance
- Access to a library of over 10,000 voices
These features ensure that organisations can deploy voice-enabled AI while maintaining strict governance and security standards.
Strategic Implications for Businesses
This collaboration signals a clear shift toward voice-first AI in enterprise environments. As AI agents become more embedded in daily operations, the quality of interaction will play a critical role in user trust and adoption.
Faster AI Inference Set to Reshape Enterprise AI Performance with AWS–Cerebras Collaboration
AWS and Cerebras Systems have announced a new collaboration to deliver what the companies describe as the fastest AI inference available in the cloud. The solution will be available through Amazon Bedrock and deployed across AWS data centres in the coming months.
A New Benchmark for Speed and Performance
The joint solution combines AWS Trainium chips with Cerebras CS-3 systems, connected through high-speed Elastic Fabric Adapter (EFA) networking.
This architecture is designed to significantly improve inference speed for generative AI and large language model (LLM) workloads.
Key outcomes for businesses include:
- Faster response times for AI-driven applications
- Improved performance for real-time use cases such as coding assistants and chat systems
- Higher efficiency in handling large-scale AI workloads
Introducing Inference Disaggregation
At the core of this innovation is a technique called inference disaggregation. It separates AI inference into two stages:
- Prefill: Processing the input prompt (handled by AWS Trainium)
- Decode: Generating output tokens (handled by Cerebras CS-3)
By assigning each stage to specialised hardware, the system optimises performance and reduces bottlenecks—particularly in the decode phase, which typically consumes the most time.
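The prefill/decode split described above can be illustrated with a small conceptual sketch. This is purely illustrative: the function names, the list-based "state", and the placeholder tokens are invented for explanation and do not reflect the actual AWS or Cerebras software stack, where the two stages run on separate specialised hardware.

```python
# Conceptual sketch of inference disaggregation: the prompt is processed
# once by a "prefill" stage, then output tokens are produced one at a
# time by a separate "decode" stage. In the AWS-Cerebras design each
# stage runs on different hardware; here they are just plain functions.

def prefill(prompt_tokens):
    """Process the full prompt in one pass and hand off shared state.

    In a real system this step builds the attention key/value cache that
    the decode stage reuses; here a running context list stands in for it.
    """
    return list(prompt_tokens)  # the "state" handed off to decode

def decode(state, max_new_tokens):
    """Generate output tokens one at a time from the prefill state.

    Decode is inherently sequential (each token depends on the ones
    before it), which is why it usually dominates total latency and
    benefits most from being assigned to specialised hardware.
    """
    output = []
    for step in range(max_new_tokens):
        next_token = f"tok{step}"   # placeholder for real sampling
        state.append(next_token)    # decode extends the shared state
        output.append(next_token)
    return output

state = prefill(["Explain", "inference", "disaggregation"])
print(decode(state, 3))  # ['tok0', 'tok1', 'tok2']
```

The design point the sketch captures is the handoff: prefill runs once over the whole prompt (highly parallel, throughput-bound), while decode loops token by token (latency-bound), so splitting them lets each run on hardware suited to its workload.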
Built for Enterprise-Scale Workloads
The solution is designed to meet enterprise demands for speed, scale, and reliability. It is built on AWS’s secure cloud infrastructure, ensuring:
- High-performance and low-latency operations
- Consistent security and isolation standards
- Seamless integration with existing AWS environments
Additionally, AWS plans to support leading open-source models and its own Amazon Nova models on Cerebras hardware later this year.
OpenAI Shuts Down Sora and Ends Disney Deal, Signals Strategic Shift Away from Video AI
OpenAI has discontinued its AI video-generation platform Sora and cancelled its $1 billion partnership with Disney, marking a significant shift in its product and investment priorities.
Pivot Toward High-Impact AI Areas
The company confirmed it is exiting the video-generation space to focus on other areas, particularly robotics and agentic AI.
These technologies are expected to deliver more practical, real-world applications, including autonomous task execution with minimal human input.
This move indicates a broader strategy to prioritise:
- AI systems with direct operational value
- Automation of physical and enterprise workflows
- Scalable agent-based technologies
Monetisation and Risk Challenges
Despite strong initial interest, Sora struggled to generate meaningful revenue. Reports indicate:
- Sora generated approximately $1.4 million in in-app revenue
- ChatGPT generated significantly higher returns during the same period
In addition to weak monetisation, the platform faced several challenges:
- Difficulty controlling misuse, including non-consensual content
- Increased risk of misinformation and deepfakes
- Ongoing concerns around copyright and intellectual property violations
These factors made the platform costly to maintain with limited return on investment.
End of Disney Partnership
As part of the shutdown, OpenAI has also ended its high-profile agreement with Disney. The deal had allowed Sora to use popular intellectual property, including well-known characters, in AI-generated videos.
The cancellation reflects:
- Continued sensitivity around IP usage in AI
- Growing caution among media companies
- A shift toward more controlled and compliant AI partnerships
Google Launches Gemini 3.1 Flash Live to Power Real-Time Conversational AI at Scale
Google DeepMind has introduced Gemini 3.1 Flash Live, a new model designed to enable real-time, low-latency conversational AI.
Available via the Gemini Live API, the model allows developers to build voice and vision agents that respond at the speed of human interaction.
Advancing Real-Time AI Capabilities
The model delivers significant improvements in latency, reliability, and conversational quality—key requirements for enterprise-grade, voice-first applications.
Key enhancements include:
- Faster response times for seamless, real-time interactions
- Improved understanding of tone, intent, and speech patterns
- More natural and fluid conversational outputs
This positions businesses to deploy AI agents that feel more intuitive and human-like in live environments.
Improved Performance in Real-World Conditions
Gemini 3.1 Flash Live is optimised for complex, real-world scenarios where traditional systems often fail.
Notable capabilities:
- Strong performance in noisy environments by filtering background sounds
- Higher task completion rates during live interactions
- Better adherence to complex instructions and operational constraints
These improvements make the model more reliable for customer-facing and operational use cases.
Multilingual and Multimodal by Design
The model supports over 90 languages and enables multimodal interactions, including voice and visual inputs. This allows businesses to:
- Build globally scalable conversational systems
- Deliver consistent experiences across diverse user bases
- Integrate voice, video, and contextual inputs into a single AI workflow
Stay Ahead with NexaQuanta
As AI continues to evolve, staying informed is critical for making the right strategic decisions. Subscribe to the NexaQuanta weekly newsletter to get concise, business-focused insights delivered directly to you—helping you navigate the fast-changing AI landscape with clarity and confidence.

