voicebox package
Subpackages
Submodules
voicebox.audio module
- class voicebox.audio.Audio(signal: ndarray, sample_rate: int)[source]
Bases: object
Represents an audio signal.
- Parameters:
signal – Audio signal represented as a 1D array of samples, each in the range [-1, 1].
sample_rate – Number of samples per second.
- check() → None[source]
Raises ValueError if the audio is invalid.
For an audio to be valid, it must satisfy the following conditions:
Must have at least one sample.
All samples must be in the range [-1, 1].
The sample rate must be greater than 0.
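The three validity conditions above can be sketched as a standalone function. This is only an illustration of the documented rules, not voicebox's own implementation: `validate_audio` is a hypothetical name, and it takes a plain sequence of floats rather than an ndarray so that it runs without NumPy.

```python
from typing import Sequence


def validate_audio(signal: Sequence[float], sample_rate: int) -> None:
    """Hypothetical stand-in mirroring the rules documented for Audio.check()."""
    # Condition 1: there must be at least one sample.
    if len(signal) < 1:
        raise ValueError('Audio must have at least one sample')

    # Condition 2: every sample must lie in [-1, 1].
    # NaN fails the comparison, so it is rejected as well.
    if any(not (-1.0 <= sample <= 1.0) for sample in signal):
        raise ValueError('All samples must be in the range [-1, 1]')

    # Condition 3: the sample rate must be strictly positive.
    if sample_rate <= 0:
        raise ValueError('Sample rate must be greater than 0')
```

A valid signal passes silently, while any violated condition raises ValueError, matching the behavior described for check().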
- copy(signal: ndarray = None, sample_rate: int = None) → Audio[source]
Returns a deep copy of self, with optional new property values.
- property len_bytes: int
Length of audio signal in bytes.
- property len_seconds: float
Length of audio signal in seconds.
- property sample_period: float
Sample period in seconds.
- sample_rate: int
- signal: ndarray
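The time-related properties above follow from the sample count and the sample rate. The sketch below assumes the usual definitions (period is the reciprocal of the rate, duration is the sample count divided by the rate); the source does not show how voicebox computes them, and the function names here are hypothetical helpers, not part of the library.

```python
def sample_period(sample_rate: int) -> float:
    """Seconds between consecutive samples (assumed to be 1 / sample_rate)."""
    return 1.0 / sample_rate


def len_seconds(num_samples: int, sample_rate: int) -> float:
    """Duration in seconds (assumed to be num_samples / sample_rate)."""
    return num_samples / sample_rate
```

Under these assumptions, 22050 samples at 44100 samples per second last half a second.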
voicebox.ssml module
- class voicebox.ssml.SSML[source]
Bases: str
A Speech Synthesis Markup Language (SSML) string.
By wrapping a string in this class, the string is treated as SSML by TTS engines that support it.
Example
>>> from voicebox.tts import ESpeakNG
>>> from voicebox import SSML
>>> tts = ESpeakNG()
>>> text = SSML('<speak>Hello world</speak>')
>>> audio = tts.get_speech(text)
voicebox.types module
voicebox.utils module
- voicebox.utils.reliable_tts(ttss: TTS | Iterable[TTS] = None, retry_max_attempts: int = 3, cache_max_size: int | float = 60, cache_size_func: Literal['bytes', 'count', 'seconds'] | Callable[[Any], int | float] = 'seconds') → TTS[source]
Takes zero or more TTS instances and returns a single TTS that will attempt to use each TTS in the order given, up to retry_max_attempts times each, until one succeeds. Outputs will also be cached to speed up retrieval of repeated phrases.
This is useful if, e.g., you have an online TTS that is subject to network failures, which the retries may alleviate, and you want to fall back to an offline TTS in the event that the online TTS fails all attempts.
If no TTS instance is provided, then a default TTS instance will be used.
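The retry, fallback, and caching behavior described above can be illustrated with a small generic sketch. It is not voicebox's implementation: `make_reliable` is a hypothetical helper, synthesizers are modeled as plain callables from text to bytes rather than TTS instances, and the cache is an unbounded dict, so it omits the size limit that cache_max_size and cache_size_func provide.

```python
from typing import Callable, Dict, Optional, Sequence


def make_reliable(
        synthesizers: Sequence[Callable[[str], bytes]],
        retry_max_attempts: int = 3,
) -> Callable[[str], bytes]:
    """Hypothetical sketch: try each synthesizer in order, with retries."""
    cache: Dict[str, bytes] = {}  # unbounded here; a real cache would evict

    def speak(text: str) -> bytes:
        # Repeated phrases are served from the cache without synthesis.
        if text in cache:
            return cache[text]

        last_error: Optional[Exception] = None
        for synth in synthesizers:
            # Give each synthesizer up to retry_max_attempts tries
            # before falling back to the next one in the list.
            for _ in range(retry_max_attempts):
                try:
                    result = synth(text)
                except Exception as error:
                    last_error = error
                else:
                    cache[text] = result
                    return result

        raise RuntimeError('All synthesizers failed') from last_error

    return speak
```

In this sketch, placing an online synthesizer first and an offline one second gives the fallback order the documentation describes: the offline one is only called after the online one has failed all of its attempts.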