BREAK CAPTION GROUPS AT LOGICAL PLACES
Deciding when to end one caption group and create a new one requires paying attention to
three things:
1. The rhythm of speech. When a speaker pauses, that's a good spot to break.
2. The grammatical structure of the speech. Punctuation, conjunctions, and prepositions are
good spots to break.
3. The length (character count) of a caption group. You cannot exceed 60 characters in a
caption group.
Always create a new caption group whenever the speaker changes or a sentence ends with
punctuation . ? !
Caption groups must not exceed 60 characters.
A caption group can have a maximum of 60 characters. Dash has built-in color coding to help guide
you. Aim for the caption group box to be green or yellow. If the caption group box turns red, it is too
long.
If a sentence is longer than 60 characters, break mid-sentence. Breaking shouldn't interrupt
comprehension or readability, so you should try to break:
After punctuation , : ;
Or before conjunctions such as: and, nor, but, or, yet, so.
Or before prepositions and other connecting words such as: that, who, because, in order to, not
only, as we, in which, where, with, what, how, for, through, until, to, as, of, by.
Or before complete proper nouns, never inside them (e.g. do not break within "United States of
America").
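The breaking rules above can be sketched in code. This is a minimal illustration, not how Dash works: the function names and the greedy "latest good break that still fits" strategy are assumptions, only the single-word entries from the lists above are used, and proper nouns are not detected.

```python
import re

MAX_CHARS = 60  # limit stated in this guide

# Single-word subset of the "break before" words listed above.
BREAK_BEFORE = {
    "and", "nor", "but", "or", "yet", "so", "by", "that", "who", "because",
    "where", "with", "what", "how", "for", "through", "until", "to", "as", "of",
}
BREAK_AFTER = ",:;"  # punctuation the guide says to break after


def split_sentences(text):
    """Start a new caption group after sentence-ending punctuation . ? !"""
    return [s for s in re.split(r"(?<=[.?!])\s+", text.strip()) if s]


def break_sentence(sentence, max_chars=MAX_CHARS):
    """Break one sentence into groups of at most max_chars characters."""
    words = sentence.split()
    groups = []
    while words:
        # Count how many leading words fit within the limit.
        fit, length = 0, 0
        for i, word in enumerate(words):
            length += len(word) + (1 if i else 0)
            if length > max_chars:
                break
            fit = i + 1
        fit = max(fit, 1)  # a single over-long word is emitted as-is
        if fit == len(words):
            groups.append(" ".join(words))
            break
        # Walk backwards to the latest break that follows , : ; or
        # precedes a listed word; otherwise break where the limit forces it.
        cut = fit
        for k in range(fit, 0, -1):
            if words[k - 1][-1] in BREAK_AFTER or words[k].lower() in BREAK_BEFORE:
                cut = k
                break
        groups.append(" ".join(words[:cut]))
        words = words[cut:]
    return groups


def break_captions(text):
    """Apply sentence breaks first, then mid-sentence breaks."""
    return [g for s in split_sentences(text) for g in break_sentence(s)]
```

A human captioner should still review the result, since the rhythm of speech (pauses) is not visible in the text alone.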
SPEAKER LABELING
Speaker ID Rules
1. Use a dash and a space ("- ") EVERY time a NEW speaker starts speaking or when the speaker
CHANGES.
o Atmospherics do not count as a change of speaker. If an atmospheric is used that
breaks up dialogue from the same speaker, do not include a dash after the atmospheric.
2. If the speaker CANNOT be visually identified, identify the speaker with a speaker ID.
E.g., "- [Mark]", "- [Narrator]"
o If a speaker is unknown, use an appropriate identifier. Some possible examples
could be "- [Interviewer]", "- [Guest]", "- [Ghostly Voice]"
o Exception: There is no need to identify a speaker if they are visible for any portion of
their dialogue AND they are not interrupted
E.g. Mark starts speaking off-screen but then walks into the frame while
talking
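The speaker ID rules above can be summarized as a small decision function. This is only a sketch: the function and parameter names are hypothetical, and placing the caption text after the speaker ID, separated by one space, is an assumption about layout.

```python
def label_line(text, speaker_changed, visible_uninterrupted, speaker_id="Speaker"):
    """Prefix a caption line according to the speaker ID rules.

    speaker_changed: True when a new speaker starts or the speaker changes
        (an atmospheric between lines of the same speaker does not count).
    visible_uninterrupted: True when the speaker is visible for any portion
        of their dialogue and is not interrupted.
    speaker_id: name or descriptive identifier used when an ID is needed.
    """
    if not speaker_changed:
        return text  # same speaker: no dash
    if visible_uninterrupted:
        return f"- {text}"  # dash and space only
    return f"- [{speaker_id}] {text}"  # speaker cannot be visually identified


print(label_line("Where are we going?", True, True))
print(label_line("Where are we going?", True, False, "Mark"))
print(label_line("I already told you.", False, True))
```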
ADDITIONAL NOTES
When you NEED to identify a speaker, here's how:
You should use the speaker's first name if it is known: "- [Mark]"
If a character name is not known, use a visible descriptive identifier: "- [Blonde Woman]"
o Never use race or other discriminatory identifiers. Instead, use a descriptor such as
occupation, clothing, height, etc.
ATMOSPHERICS
Include atmospherics for these main conditions:
1. A sound effect is heard which is integral to the story or message of a video. If a character
reacts to a sound, you should probably include it. E.g. "(gun fires)"
2. Background music is heard in a way that sets a specific mood as part of the storytelling.
Only include a background music atmospheric if there's a significant gap in speech and the
music seems important. E.g. "(dramatic orchestral music)"
Sound effects often help tell the story
Atmospherics are put in parentheses and are always in lowercase, e.g. "(loud snoring)".
o Atmospheric-only caption groups do not need a dash or speaker ID.
Only include significant sound effects that help tell the story. Use your best judgment.
If in doubt, include the atmospheric.
o E.g. If a character reacts to an off-screen gunshot, use "(gun fires)"
o E.g. If there is a group of children playing, you could use "(children laughing)"
o E.g. "(plane flying overhead)" or "(car honking)" should be included if characters
react to it.
Include sounds made by the speaker, e.g. "(laughs loudly)".
Atmospherics should always be in the present tense, e.g. "(laughs loudly)", never the past tense
"(laughed loudly)".
Always describe the sound with an action verb, e.g. "(frogs croaking)", never with an onomatopoeia
such as "(ribbit, ribbit, ribbit)".
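The two mechanical formatting rules for atmospherics (parentheses and lowercase) are easy to check automatically. The sketch below uses a hypothetical helper name and only covers those two rules; tense and onomatopoeia still need human judgment.

```python
def atmospheric_problems(text):
    """Return a list of formatting problems found in one atmospheric."""
    problems = []
    stripped = text.strip()
    if not (stripped.startswith("(") and stripped.endswith(")")):
        problems.append("not wrapped in parentheses")
    inner = stripped.strip("()").strip()
    if not inner:
        problems.append("empty description")
    if inner != inner.lower():
        problems.append("not lowercase")
    return problems


for candidate in ["(loud snoring)", "(Loud Snoring)", "loud snoring"]:
    print(candidate, atmospheric_problems(candidate))
```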
Mood Music
Music is often used in videos to help set a mood or underscore actions.
If there's at least a 2-second gap in speech AND it does not seem that the lyrics are intended to be
clearly heard AND the background music is setting a specific mood, then caption the atmospheric as
mood music.
Most of the time you won't know the artist or title, so you should use a description:
o E.g. "(gentle music)", "(bright pop music)", "(heavy metal music)", "(electronic dance
music)".
o A list of music adjectives can be found here.
If someone talks over the music, focus on the speaker and don't add the atmospheric.
Dialogue is always more important.
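The mood music conditions combine into a single yes/no decision. The sketch below encodes them with hypothetical parameter names; the inputs themselves (whether lyrics are meant to be heard, whether the music sets a mood) remain judgment calls made by the captioner.

```python
MIN_GAP_SECONDS = 2.0  # "at least a 2-second gap in speech"


def should_caption_mood_music(gap_seconds, lyrics_meant_to_be_heard,
                              sets_specific_mood, speech_over_music=False):
    """Return True when a mood music atmospheric should be added."""
    if speech_over_music:
        return False  # dialogue is always more important
    return (
        gap_seconds >= MIN_GAP_SECONDS
        and not lyrics_meant_to_be_heard
        and sets_specific_mood
    )


print(should_caption_mood_music(3.5, False, True))        # True
print(should_caption_mood_music(1.0, False, True))        # False
print(should_caption_mood_music(3.5, False, True, True))  # False
```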
PRE-EXISTING ON-SCREEN TEXT
There are 2 main scenarios in which you will see pre-existing on-screen text:
1. Pre-existing text shown as slides, graphics, whiteboards, or a software interface. DO NOT
use the up-arrow caret ^.
2. Pre-existing text shown that helps tell the story AND is in the lower 1/3 of the screen. DO use
the up-arrow caret ^ to indicate the overlap.
How to indicate overlap with on-screen storytelling text in lower 1/3 of the screen:
1. Up-arrow carets ^ should be used whenever there is pre-existing burned-in text on the lower
1/3 of the screen.
2. When you flag this occurrence, insert the up-arrow caret ^ in the caption group by typing
shift+6 on your keyboard. Dash will display a blue up-arrow caret at the beginning of the
caption group.
3. You must add an up-arrow caret ^ to EVERY caption group that occurs at the same time as
the burned-in text, even if it occurs for only a split second.
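Rule 3 is an interval-overlap test: a caption group needs a caret if its time span overlaps the burned-in text for any length of time. The sketch below assumes times in seconds and uses a hypothetical CaptionGroup type; it only flags which groups need a caret and does not model how Dash stores it.

```python
from dataclasses import dataclass


@dataclass
class CaptionGroup:
    start: float  # seconds
    end: float    # seconds
    text: str


def needs_caret(group, burned_in_spans):
    """True if the group overlaps any (start, end) span of lower-third text."""
    return any(
        group.start < span_end and group.end > span_start
        for span_start, span_end in burned_in_spans
    )


burned_in = [(10.0, 14.0)]  # lower-third text visible from 10s to 14s
groups = [
    CaptionGroup(8.0, 9.9, "- Welcome back to the show."),
    CaptionGroup(9.9, 10.1, "Today's guest"),       # overlaps for 0.1s
    CaptionGroup(10.1, 13.0, "needs no introduction."),
    CaptionGroup(14.0, 16.0, "(audience applauds)"),
]
for g in groups:
    print(needs_caret(g, burned_in), g.text)
```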
ACCURACY
Type what the speaker says.
Never correct (edit) the speaker's grammar.
Never paraphrase or substitute words.
Never insert words not spoken.
Never rearrange the order of speech.
Don't correct phonetics unless it distracts from readability.
Do remove speech disfluency that distracts from readability.
Pay extra attention to the spelling and capitalization of special words.
- Research spellings and terminology you may not be familiar with.
Take time to research the proper spelling and capitalization of important words and proper nouns.
Search online for the correct spelling of proper nouns (e.g., names, brands, and places) and
topic-specific vocabulary (e.g., Adobe Premiere Pro).
When you cannot confidently hear or understand a word, use an atmospheric.
Use an appropriate atmospheric such as "(mumbles)" for a single speaker or "(background
noise drowns out other sounds)".
E.g. "- It's a long drive to (mumbles)."
It is NEVER acceptable to use (inaudible), (unintelligible), or ??? in captioning.
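A quick automated check can catch the forbidden placeholders before a file is submitted. This is a sketch with a hypothetical function name; it matches the three placeholders named above regardless of letter case.

```python
import re

# Placeholders this guide says are never acceptable.
FORBIDDEN = re.compile(r"\((?:inaudible|unintelligible)\)|\?\?\?", re.IGNORECASE)


def find_forbidden(text):
    """Return every forbidden placeholder found in a caption, in order."""
    return [match.group(0) for match in FORBIDDEN.finditer(text)]


print(find_forbidden("- It's a long drive to (inaudible)."))  # ['(inaudible)']
print(find_forbidden("- It's a long drive to (mumbles)."))    # []
```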
SYNCING
Start Time of a Caption Group
The start time should align with the beginning of the sound. This applies to both
atmospherics and speech.
While the video plays, press the UP arrow key on your keyboard to align the start of a
caption group precisely when the first word or sound of a caption group is heard.
o Note: When you press the UP arrow key, Dash automatically bumps your timestamp
forward by 1/4 second to compensate for normal lag in human reaction time.