NVIDIA Pipecat 0.1.0 (23 April 2025)
The NVIDIA Pipecat library augments the Pipecat framework with additional frame processors and services, as well as new multimodal frames to enhance avatar interactions. This is the first release of the NVIDIA Pipecat library.
New Features
- Added Pipecat services for Riva ASR (Automatic Speech Recognition), Riva TTS (Text to Speech), and Riva NMT (Neural Machine Translation) models.
- Added Pipecat frames, processors, and services to support multimodal avatar interactions and use cases. This includes `Audio2Face3DService`, `AnimationGraphService`, `FacialGestureProviderProcessor`, and `PostureProviderProcessor`.
- Added `ACETransport`, which is specifically designed to support integration with existing ACE microservices. This includes a FastAPI-based HTTP and WebSocket server implementation compatible with ACE.
- Added `NvidiaLLMService` for NIM LLM models and `NvidiaRAGService` for the NVIDIA RAG Blueprint.
- Added `UserTranscriptSynchronization` processor for user speech transcripts and `BotTranscriptSynchronization` processor for synchronizing bot transcripts with bot audio playback.
- Added custom context aggregators and processors to enable Speculative Speech Processing to reduce latency.
- Added `UserPresence`, `Proactivity`, and `AcknowledgementProcessor` frame processors to improve human-bot interactions.
- Released source code for the voice assistant example using `nvidia-pipecat`, along with the `pipecat-ai` library service, to showcase NVIDIA services with `ACETransport`.
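
The notes above do not describe how Speculative Speech Processing is implemented. As a rough illustration of the general idea only, the sketch below starts generating a reply from an interim transcript and reuses it if the final transcript turns out to match. It does not use the `nvidia-pipecat` API: `SpeculativeResponder`, `normalize`, and `fake_llm` are hypothetical names invented for this example, and a real pipeline would run the generation concurrently instead of blocking.

```python
from dataclasses import dataclass
from typing import Callable, Optional


def normalize(text: str) -> str:
    """Lowercase and drop punctuation so near-identical transcripts compare equal."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


@dataclass
class SpeculativeResponder:
    """Hypothetical helper: caches a reply generated from an interim transcript."""

    generate: Callable[[str], str]          # stand-in for an LLM call
    _spec_key: Optional[str] = None         # normalized interim transcript
    _spec_response: Optional[str] = None    # reply generated speculatively
    reused: int = 0
    discarded: int = 0

    def on_interim(self, transcript: str) -> None:
        # Start work early, before the speech recognizer commits to a final result.
        key = normalize(transcript)
        if key and key != self._spec_key:
            self._spec_key = key
            self._spec_response = self.generate(transcript)

    def on_final(self, transcript: str) -> str:
        key = normalize(transcript)
        if key == self._spec_key and self._spec_response is not None:
            # The guess was right: the reply is already available.
            self.reused += 1
            response = self._spec_response
        else:
            # The guess was wrong (or absent): discard it and generate again.
            if self._spec_response is not None:
                self.discarded += 1
            response = self.generate(transcript)
        self._spec_key = None
        self._spec_response = None
        return response


calls = []


def fake_llm(text: str) -> str:
    calls.append(text)
    return f"echo: {normalize(text)}"


responder = SpeculativeResponder(generate=fake_llm)

# Turn 1: the interim transcript matches the final one, so the reply is reused.
responder.on_interim("what is the weather")
reply_1 = responder.on_final("What is the weather?")

# Turn 2: the user kept talking, so the speculative reply is discarded.
responder.on_interim("play some")
reply_2 = responder.on_final("play some music")
```

In this toy version the latency saving comes from doing the generation work during `on_interim`; when the guess is wrong, the cost is one wasted call to the generator.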
Improvements
- Added `ElevenLabsTTSServiceWithEndOfSpeech`, an extended version of the ElevenLabs TTS service with end-of-speech events for usage in avatar interactions.