Configure TTS Engines

Text-to-speech (TTS) engines convert typed text into audio for system announcements, dynamic IVR responses, and notifications. UnifiedBX ships with TTS support: once you configure an engine (Google, Polly, Flite, etc.), modules that support TTS can call it.

Before You Start

  • You've chosen an engine: Flite (free, low quality, on-host), Google Cloud TTS (paid, high quality, online), AWS Polly (paid, high quality, online), Sangoma TTS (paid, hosted), or others.
  • For paid engines: API credentials.
  • For Flite: the flite package installed on the host (yum install flite, or the equivalent for your distribution).
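
To confirm the Flite prerequisite, you can check for the binary on the PATH. The sketch below assumes the executable is named flite, as installed by the standard package; the helper name engine_binary_available is hypothetical and not part of UnifiedBX.

```python
import shutil


def engine_binary_available(binary: str = "flite") -> bool:
    """Return True if `binary` can be found on the current PATH."""
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if engine_binary_available():
        print("flite found on PATH")
    else:
        print("flite not found; install it before adding the engine")
```

Run it as the same user the PBX services run as, since PATH can differ between accounts.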

Steps

  1. Go to System Attributes → Text To Speech Engines (sometimes labeled TTS or TTS Engines).
  2. Click + Add Engine.
  3. Pick the engine type:
    • Flite — free, no credentials required. Voice options are limited (kal_diphone, slt, etc.).
    • Google Cloud Text-to-Speech — provide a service-account JSON key.
    • AWS Polly — provide an AWS access key ID, secret access key, and region.
    • Sangoma TTS — uses your Sangoma account.
  4. Configure engine-specific options:
    • For Google: paste the service-account JSON and pick a voice (e.g. en-US-Wavenet-D).
    • For Polly: pick a voice (Joanna, Matthew, etc.) and a region.
  5. Click Submit.
  6. Click Apply Config.
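
A malformed service-account JSON is an easy mistake to make in step 4, so it can help to sanity-check the key before pasting it. The sketch below only checks that the text parses and that a few fields found in standard Google Cloud service-account key files are present. It does not contact Google or prove the key is valid. The function name check_service_account_json is hypothetical.

```python
import json

# Fields present in a standard Google Cloud service-account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_json(raw: str) -> list:
    """Return a list of problems; an empty list means the key looks plausible."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    # Report every required field that is absent or empty.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not data.get(name)]
    # A key of another credential type (e.g. a user credential) will not work here.
    if data.get("type") not in (None, "", "service_account"):
        problems.append(f"unexpected type: {data['type']!r}")
    return problems


if __name__ == "__main__":
    import sys

    # Usage: python check_key.py path/to/key.json
    with open(sys.argv[1], encoding="utf-8") as handle:
        issues = check_service_account_json(handle.read())
    print("looks plausible" if not issues else "\n".join(issues))
```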

Use the TTS Engine

Modules that support TTS will offer the engine as a source. Examples:

  • Announcements — instead of a recorded file, type text.
  • Voicemail email — TTS rendering of voicemail content (separate feature).
  • Outroutemsg — outbound route messages.

For ad-hoc TTS in a System Recording, some versions of UnifiedBX let you type text on the recording-add page; otherwise, generate the audio externally and upload it as a WAV file.
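
One way to generate the audio externally is the flite command-line tool, which can write synthesized speech straight to a WAV file. The sketch below is a minimal wrapper, assuming flite is installed on the machine where you run it. The function names are hypothetical, and the sample rate or format that your UnifiedBX version accepts on upload is not covered here.

```python
import subprocess
from typing import List, Optional


def build_flite_command(text: str, out_path: str, voice: Optional[str] = None) -> List[str]:
    """Build the argument list for a flite run that writes `text` to a WAV file."""
    cmd = ["flite"]
    if voice:
        cmd += ["-voice", voice]  # e.g. "slt"; omit to use flite's default voice
    cmd += ["-t", text, "-o", out_path]
    return cmd


def synthesize_to_wav(text: str, out_path: str, voice: Optional[str] = None) -> None:
    """Run flite; raises CalledProcessError if flite exits with an error."""
    subprocess.run(build_flite_command(text, out_path, voice), check=True)


if __name__ == "__main__":
    synthesize_to_wav("Thank you for calling. Please hold.", "greeting.wav", voice="slt")
```

Passing the arguments as a list rather than a shell string avoids quoting problems when the prompt text contains apostrophes or other punctuation.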

Verify

Add a test Announcement using the TTS engine, point an inbound route at it, and call. The synthesized voice should play.

Common Issues

  • No audio / fallback to default voice. The engine credentials are likely wrong, or the account is being rate-limited. Check the engine logs.
  • Flite voice sounds robotic. That's Flite — switch to Google or Polly for natural voices.
  • Long text cuts off. Some engines have per-call character limits. Split into smaller chunks.
  • Slow first-play. TTS engines synthesize on-demand and cache. First play hits the API; subsequent plays are cached.
  • API quota exceeded. Online engines bill per character. Watch usage; cache aggressively.
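
The chunking and caching advice above can be sketched outside the PBX, for example when pre-generating prompts with a script. The code below splits text at sentence boundaries under a character limit and derives a stable cache key from the engine, voice, and text. The limit passed as max_chars is an assumption to be replaced with your engine's documented limit, and both function names are hypothetical. This does not describe how UnifiedBX caches audio internally.

```python
import hashlib
import re
from typing import List


def split_text(text: str, max_chars: int) -> List[str]:
    """Split text into chunks of at most max_chars, preferring sentence boundaries."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # Hard-wrap a single sentence that is longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def cache_key(engine: str, voice: str, text: str) -> str:
    """Return a stable key so identical requests can reuse one synthesized file."""
    payload = "\x1f".join((engine, voice, text)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    for chunk in split_text("Our office is closed. Please call back tomorrow.", 30):
        print(cache_key("polly", "Joanna", chunk)[:12], chunk)
```

Keying on engine and voice as well as text means that changing the voice produces new audio instead of replaying a stale file.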