# SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: BSD 2-Clause License

"""Riva frames for Interim Transcription.

This module provides frame definitions for NVIDIA Riva's speech-to-text functionality,
specifically focused on interim transcription handling.

Classes:
    RivaInterimTranscriptionFrame: Frame for interim transcription results with stability metrics
"""

from dataclasses import dataclass

from pipecat.frames.frames import InterimTranscriptionFrame


@dataclass
class RivaInterimTranscriptionFrame(InterimTranscriptionFrame):
    """An interim transcription frame with stability metrics from Riva.

    Extends the base InterimTranscriptionFrame to include Riva-specific stability
    scoring for speculative speech processing. These frames are generated during
    active speech and help determine when to trigger early response generation.

    See also:
        - InterimTranscriptionFrame: Base class for interim transcriptions

    Args:
        stability (float): Stability score for the interim transcript, ranging
            from 0.0 to 1.0. It estimates how likely the text is to change as
            more audio arrives.
            - 0.0: Highly unstable, likely to change
            - 1.0: Maximum stability, no expected changes
            Only transcripts with stability=1.0 are processed for speculative
            speech handling. Defaults to 0.1.
        user_id (str): Identifier of the speaking participant.
        text (str): The interim transcription text.
        language (str): Language code of the transcription.
        timestamp (float): Timestamp of when the transcription was generated.

    Typical usage example:
        >>> frame = RivaInterimTranscriptionFrame(
        ...     text="Hello world",
        ...     stability=0.95,
        ...     user_id="user_1",
        ...     language="en-US",
        ...     timestamp=1234567890.0
        ... )
        >>> print(frame)  # One line; the name prefix is assigned by the base frame, e.g.:
        RivaInterimTranscriptionFrame#0(user: user_1, text: [Hello world], stability: 0.95, language: en-US, timestamp: 1234567890.0)
    """

    stability: float = 0.1

    def __str__(self):
        """Return a string representation of the frame.

        Returns:
            str: A formatted string containing all frame attributes.
        """
        return (
            f"{self.name}(user: {self.user_id}, text: [{self.text}], "
            f"stability: {self.stability}, language: {self.language}, timestamp: {self.timestamp})"
        )
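

# Usage sketch (not part of the original module): one way a consumer might apply
# the stability gate described in the class docstring, where only frames with
# stability=1.0 feed speculative speech handling. DemoInterimFrame and
# select_stable are hypothetical names, not pipecat or Riva APIs. The stand-in
# dataclass only mirrors the documented fields so the example runs without
# pipecat installed.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class DemoInterimFrame:
    """Hypothetical stand-in for RivaInterimTranscriptionFrame (no pipecat needed)."""

    text: str
    user_id: str
    timestamp: float
    language: str = "en-US"
    stability: float = 0.1  # same default as the real frame


def select_stable(
    frames: Iterable[DemoInterimFrame], threshold: float = 1.0
) -> List[DemoInterimFrame]:
    """Keep only frames whose stability reaches the threshold (assumed gate)."""
    return [frame for frame in frames if frame.stability >= threshold]


demo_frames = [
    DemoInterimFrame(text="Hello", user_id="user_1", timestamp=0.0),  # default 0.1
    DemoInterimFrame(text="Hello wor", user_id="user_1", timestamp=0.2, stability=0.95),
    DemoInterimFrame(text="Hello world", user_id="user_1", timestamp=0.4, stability=1.0),
]

# With the default threshold of 1.0, only the fully stable transcript passes.
stable_texts = [frame.text for frame in select_stable(demo_frames)]
print(stable_texts)
```

# A lower threshold, e.g. select_stable(demo_frames, threshold=0.9), would also
# admit the 0.95 frame. Whether that trade-off suits a pipeline is a design
# choice this module does not prescribe.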