Dovetail’s built-in transcription and video highlights are a powerful way to share stories about your research and develop a repository of searchable audio and video clips.
Upload a video or audio recording of an interview, usability test, sales call, or product demo. Dovetail processes the recording into a fast, streamable format and transcribes it using an advanced AI-powered speech engine. Then create highlights (see Highlight and tag project content) to turn your raw recording into tagged, searchable audio and video clips.
To upload a video or audio recording:
Click Data in the sidebar.
Click + Add data.
Click the Video or Audio icon.
Choose a file from your computer.
Next to Transcribe this file, click Begin.
Your recording will be uploaded into a note and processed to ensure fast playback. How long this takes depends on the length of the recording. In general, processing takes about 30% of the length of the file; for example, a 60-minute recording takes roughly 18 to 20 minutes to process and transcribe.
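As a rough planning aid, the 30% rule of thumb above can be written as a one-line calculation. The function name and the ratio parameter below are illustrative only and are not part of Dovetail; actual processing times will vary.

```python
def estimate_processing_minutes(recording_minutes: float, ratio: float = 0.30) -> float:
    """Estimate processing time using the ~30% rule of thumb (an approximation)."""
    if recording_minutes < 0:
        raise ValueError("recording length cannot be negative")
    return recording_minutes * ratio


# A one-hour recording comes out at about 18 minutes under this rule.
print(estimate_processing_minutes(60))
```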
You can close the note and continue using other parts of Dovetail while your file is uploading. You can safely leave Dovetail entirely (e.g. close the browser window or turn off your computer) while it’s processing and being transcribed, and come back later.
While our AI speech engine attempts to detect different speakers automatically, it sometimes gets it wrong. You can change or rename the speaker for a monologue (a period of speech by one person) by clicking their name, which defaults to a label such as Speaker 1 or Speaker 2. Changing the speaker for a monologue only changes it there, but renaming a speaker renames them everywhere they appear.
You can merge two monologues into one by placing your cursor at the start of the second monologue and pressing backspace. Similarly, you can split one monologue into two by placing your cursor inside and pressing enter.
When you upload a video to a note, Dovetail shows an early frame of the video as a thumbnail on your data board. To change this thumbnail:
Play the video
Click the Actions (···) menu in the top right of the video
Click Save as cover to save the current frame as the note's thumbnail
Note that only the first video in a note can be used as its thumbnail, and this feature isn't available in Safari.
Dovetail provides a number of keyboard shortcuts to control the playback for audio and video. To see them, click the Actions (···) menu in the top right of a note, tag, or insight, then Shortcuts.
Dovetail supports the following file formats:
Video formats:
mp4
mov
mpeg
avi
Audio formats:
mp3
m4a
wav
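If you prepare recordings in bulk, it can help to check file names against the formats above before uploading. The sketch below is a local pre-check only. The helper name is hypothetical, and it looks only at the file extension, not the actual encoding, so a file that passes this check may still fail to process.

```python
from pathlib import Path
from typing import Optional

# Extensions taken from the lists above.
VIDEO_FORMATS = {"mp4", "mov", "mpeg", "avi"}
AUDIO_FORMATS = {"mp3", "m4a", "wav"}


def media_kind(filename: str) -> Optional[str]:
    """Return "video", "audio", or None based only on the file extension."""
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext in VIDEO_FORMATS:
        return "video"
    if ext in AUDIO_FORMATS:
        return "audio"
    return None


for name in ["interview.MP4", "call.wav", "notes.docx"]:
    print(name, "->", media_kind(name))
```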
Transcription is supported in the following languages:
English
Spanish (Español)
German (Deutsch)
French (Français)
Portuguese (Português)
If you need human-level accuracy in your transcripts, or want to analyze conversations in a language that we don't yet support, you can bring your own transcript into Dovetail.
We support importing any WebVTT (Web Video Text Tracks) caption file. When you upload a .vtt caption file, we'll use the caption timestamps to sync the transcript with playback of your video or audio file.
For human-level accuracy, we recommend Rev.com, which can supply a compatible .vtt file for use in Dovetail once one of their skilled transcriptionists finishes your job. If you have a non-compatible .srt caption file, you can convert it to a compatible .vtt file using Rev's free caption converter.
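For reference, the two caption formats are closely related: a WebVTT file begins with a WEBVTT header line and uses a period instead of a comma before the milliseconds in each timestamp. The sketch below shows a minimal local conversion, as an alternative to an online converter. The function name is illustrative, and it assumes a simple, well-formed .srt input; it does not handle styling tags, and any positioning data after the timestamps is dropped.

```python
import re

# Matches the start of SRT timing lines such as "00:00:01,000 --> 00:00:04,250".
SRT_TIMING = re.compile(r"^(\d{2,}:\d{2}:\d{2}),(\d{3}) --> (\d{2,}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert simple SRT caption text to WebVTT text."""
    out = ["WEBVTT", ""]  # required header followed by a blank line
    for line in srt_text.lstrip("\ufeff").splitlines():
        m = SRT_TIMING.match(line.strip())
        if m:
            # Swap the comma before the milliseconds for a period.
            line = f"{m.group(1)}.{m.group(2)} --> {m.group(3)}.{m.group(4)}"
        out.append(line)
    return "\n".join(out) + "\n"


sample = "1\n00:00:01,000 --> 00:00:04,250\nHello, and thanks for joining.\n"
print(srt_to_vtt(sample))
```

Only timing lines are rewritten, so commas inside the caption text itself are left alone. The numeric cue counters from the .srt file are kept, since WebVTT allows them as cue identifiers.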
To import your own transcript:
Click the ••• button on a video or audio file.
Click Upload transcript.
Select a compatible .vtt file from your system.
Your transcript will be uploaded and processed by us.
Here are a few tips to improve the quality of your recordings and transcript:
Record in a quiet setting with minimal background noise.
Invest in quality recording equipment, such as a microphone or recorder.
Speak clearly, loudly, and slowly.
Avoid talking over other people.
To improve the accuracy of transcripts, you can submit a list of custom words or phrases that are not found in a dictionary (for example, company names or industry jargon) before starting a transcription.
To avoid typing the terms out every time, you can paste a comma-separated list of terms into the transcription dialog.
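If you keep your custom vocabulary in a script or spreadsheet export, a short helper can produce the comma-separated string to paste into the transcription dialog. This is a local convenience sketch; the function name and the sample terms are made up for illustration.

```python
from typing import Iterable


def to_comma_separated(terms: Iterable[str]) -> str:
    """Join terms into one comma-separated string, dropping blanks and duplicates."""
    seen = set()
    cleaned = []
    for term in terms:
        term = term.strip()
        key = term.lower()  # treat "Dovetail" and "dovetail" as the same term
        if term and key not in seen:
            seen.add(key)
            cleaned.append(term)
    return ", ".join(cleaned)


print(to_comma_separated(["Dovetail", "NPS ", "", "churn cohort", "dovetail"]))
# prints: Dovetail, NPS, churn cohort
```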
The AI speech engine we use is trained on 50,000+ hours of human-transcribed content across a diversity of topics, industries, and accents. This makes our transcripts some of the most accurate available.
Each pricing plan includes a monthly allowance of transcription hours.
At the moment, we are not enforcing this limit. We are monitoring excess usage and customer feedback to work out the right number of hours to include in each plan.
Once we've finalized the number of included hours in each plan, we will include a way to purchase additional hours when you reach the advertised limit. The cost will be $9 USD per additional hour.