Conversent help center
How Conversent learns your voice
Conversent can improve automatic captioning accuracy over time by learning from your past episodes. The more quality-checked content you add, the better your captions can get — for podcasts, videos, educational talks, and more.
1. Build your caption library
   1. Upload your audio or video content to Conversent. This can include existing podcast episodes, recorded lectures, YouTube videos, or any media file you want to caption or make searchable.
   2. Attach matching captions to each episode. Conversent will suggest matches automatically to save you time.
   3. If you don't have existing captions, use Conversent's Auto Caption tools to generate them automatically.
   4. Open the Caption Editor to review your captions for accuracy. To fix incorrect text, double-click any caption line and type your correction.
   5. To adjust caption timing, click the text bubble in the timeline viewer at the bottom of the Caption Editor. Drag the left or right edge to shift the start or end timestamp. Click Save when the timing looks right.
The more accurately your captions match what was actually said, the better Conversent can learn who spoke and what they said, whether in educational talks, podcast transcripts, or video captions.
2. Name your known speakers
   1. Conversent's automatic captions label each speaker by number (e.g. Speaker_1, Speaker_2). Each of these is a unique voice that Conversent detected in the audio.
   2. In the Caption Editor, click the Speakers button in the top right. Listen to short clips of each detected voice, then assign each one a real name. If multiple voices that Conversent thinks are different turn out to be the same person, tag them with the same name, and Conversent will merge them together for you.
   3. Only assign names to people who have given their consent to have their voice modeled. Conversent uses these assignments to build a voiceprint for each named person.
   4. Once you've confirmed the caption text, timing, and speaker names, click Approve to mark the episode as complete. Approved episodes are used by Conversent to learn your team's voices.
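If it helps to picture how tagging merges voices, here is a minimal Python sketch. It is only an illustration of the idea, not Conversent's actual implementation or API: the function name `merge_by_name` and the example names are hypothetical. It groups detected speaker labels by the real name you assign, so two labels tagged with the same name end up under one person.

```python
from collections import defaultdict


def merge_by_name(assignments):
    """Group detected speaker labels under the real name assigned to each.

    `assignments` maps a detected label (e.g. "Speaker_1") to a real name.
    This is a hypothetical helper for illustration only.
    """
    merged = defaultdict(list)
    for label, name in assignments.items():
        merged[name].append(label)
    return dict(merged)


# Speaker_1 and Speaker_3 turned out to be the same person, so both
# receive the same name and are grouped together.
assignments = {
    "Speaker_1": "Dana",
    "Speaker_2": "Lee",
    "Speaker_3": "Dana",
}
print(merge_by_name(assignments))
# {'Dana': ['Speaker_1', 'Speaker_3'], 'Lee': ['Speaker_2']}
```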
For more information on how Conversent uses voiceprints to identify unique speakers, please refer to Conversent's Privacy Policy and Terms of Use.
3. Improve speaker ID and speech recognition
Speaker ID unlocks at 2 min
Once you tag 2+ minutes of a speaker's voice, Conversent can start recognizing that voice automatically in new episodes.
Better accuracy unlocks at 1 hour
After confirming accuracy for 1+ hour of a speaker's captions, Conversent can begin learning that speaker's accent, vocabulary, and speech patterns for better caption accuracy.
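If you keep your own notes on how much of each speaker you have tagged and approved, the two thresholds above can be checked with a short Python sketch like the one below. This is an illustration under stated assumptions, not part of Conversent: the `SpeakerProgress` record, its field names, and the `unlocked_features` function are all hypothetical, and only the 2-minute and 1-hour figures come from this article.

```python
from dataclasses import dataclass

# Thresholds taken from this article.
SPEAKER_ID_MIN_SECONDS = 2 * 60      # 2+ minutes of tagged voice
ACCURACY_MIN_SECONDS = 60 * 60       # 1+ hour of confirmed captions


@dataclass
class SpeakerProgress:
    """Hypothetical record you might keep for one named speaker."""
    name: str
    tagged_seconds: float      # audio tagged with this speaker's name
    confirmed_seconds: float   # captions reviewed and approved for this speaker


def unlocked_features(progress):
    """Return which thresholds this speaker has reached."""
    features = []
    if progress.tagged_seconds >= SPEAKER_ID_MIN_SECONDS:
        features.append("speaker_id")
    if progress.confirmed_seconds >= ACCURACY_MIN_SECONDS:
        features.append("better_accuracy")
    return features


# 2.5 minutes tagged, 10 minutes confirmed: only Speaker ID is reached.
print(unlocked_features(SpeakerProgress("Dana", 150, 600)))
# ['speaker_id']
```

Reaching a threshold only means the speaker qualifies. Speaker ID still has to be enabled by Conversent, as described in the steps that follow.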
   1. Once enabled, Speaker ID and auto-captioning accuracy can keep improving. Keep correcting any errors you find in Conversent's captions to give it more examples of mistakes to avoid.
   2. To activate Speaker ID, contact Conversent to enable it for you. This will become automatic in a future update.
Contact us to activate Speaker ID →