How Conversent Learns Your Voice | Conversent Help Center

How Conversent learns your voice

Conversent can improve automatic captioning accuracy over time by learning from your past episodes. The more quality-checked content you add, the better your captions can get — for podcasts, videos, educational talks, and more.

1. Build your caption library
  1. Upload your audio or video content to Conversent. This can include existing podcast episodes, recorded lectures, YouTube videos, or any media file you want to caption or make searchable.
  2. Attach matching captions to each episode. Conversent will suggest matches automatically to save you time.
  3. If you don't have existing captions, use Conversent's Auto Caption tools to generate them automatically.
  4. Open the Caption Editor to review your captions for accuracy. To fix incorrect text, double-click any caption line and type your correction.
  5. To adjust caption timing, click the text bubble in the timeline viewer at the bottom of the Caption Editor. Drag the left or right edge to shift the start or end timestamp. Click Save when the timing looks right.
The more accurately your captions match what was actually said, the better Conversent can get at understanding who spoke and what they said — for educational talks, podcast transcripts, and video captions alike.
2. Name your known speakers
  1. Conversent's automatic captions label each speaker by number (e.g. Speaker_1, Speaker_2). Each label is a unique voice that Conversent detected in the audio.
  2. In the Caption Editor, click the Speakers button in the top right. Listen to short clips of each detected voice, then assign each one a real name. If Conversent has split one person into several detected voices, tag them all with the same name and Conversent will merge them for you.
  3. Only assign names to people who have consented to having their voice modeled. Conversent uses these assignments to build a voiceprint for each named person.
  4. Once you've confirmed the caption text, timing, and speaker names, click Approve to mark the episode as complete. Conversent uses approved episodes to learn your team's voices.
For more information on how Conversent uses voiceprints to identify unique speakers, please refer to Conversent's Privacy Policy and Terms of Use.
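
To picture what tagging two detected voices with the same name does, here is a minimal Python sketch. It is only an illustration: the segment format, the `merge_speakers` function, and the sample names are hypothetical and are not part of Conversent or any Conversent API.

```python
from collections import defaultdict

def merge_speakers(segments, names):
    """Relabel caption segments and total the speaking time per person.

    segments: list of (label, start_sec, end_sec) tuples (hypothetical format).
    names: dict mapping detected labels to real names. Labels without an
           assigned name keep their Speaker_N label.
    """
    totals = defaultdict(float)
    relabeled = []
    for label, start, end in segments:
        person = names.get(label, label)  # fall back to the detected label
        relabeled.append((person, start, end))
        totals[person] += end - start
    return relabeled, dict(totals)

# Speaker_1 and Speaker_3 turned out to be the same (consenting) person.
segments = [
    ("Speaker_1", 0.0, 12.5),
    ("Speaker_2", 12.5, 20.0),
    ("Speaker_3", 20.0, 31.0),
]
names = {"Speaker_1": "Ana", "Speaker_3": "Ana"}

relabeled, totals = merge_speakers(segments, names)
print(totals)
```

In this sketch, both of Ana's detected voices count toward a single total, while the unnamed Speaker_2 stays separate.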
3. Improve Speaker ID and speech recognition

Speaker ID unlocks at 2 min
Once you tag 2+ minutes of a speaker's voice, Conversent can start recognizing that voice automatically in new episodes.

Better accuracy unlocks at 1 hour
After confirming accuracy for 1+ hour of a speaker's captions, Conversent can begin learning that speaker's accent, vocabulary, and speech patterns for better caption accuracy.
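
If you track your own totals per speaker, the two thresholds above can be checked with a few lines of Python. This is a rough sketch under stated assumptions: the constant names and the `unlocked` function are hypothetical, and it only reflects the 2-minute and 1-hour figures on this page, not how Conversent measures progress internally. Speaker ID also still has to be enabled by Conversent.

```python
SPEAKER_ID_SECONDS = 2 * 60   # 2 minutes of tagged voice
ACCURACY_SECONDS = 60 * 60    # 1 hour of captions confirmed as accurate

def unlocked(tagged_seconds, confirmed_seconds):
    """Report which thresholds a speaker has reached (hypothetical helper)."""
    return {
        "speaker_id": tagged_seconds >= SPEAKER_ID_SECONDS,
        "better_accuracy": confirmed_seconds >= ACCURACY_SECONDS,
    }

# A speaker with 2.5 minutes tagged and 2.5 minutes confirmed.
status = unlocked(tagged_seconds=150, confirmed_seconds=150)
print(status)
```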
  1. Once enabled, Speaker ID and auto-captioning accuracy can keep improving. Keep correcting any errors you find in Conversent's captions to give it more examples of mistakes to avoid.
  2. To activate Speaker ID, contact Conversent to enable it for you. This will become automatic in a future update.

Contact us to activate Speaker ID →

Start building your library today