SRT / VTT / ASS Subtitle Formats

SRT (SubRip), VTT (WebVTT), and ASS (Advanced SubStation Alpha) are three widely used subtitle formats. This page describes the structure, fields, and settings of each format.

SRT Subtitle Format

SRT is a simple and widely used subtitle format with the suffix .srt. It is especially popular in video players and subtitle editors. Its basic structure consists of subtitle numbers, timestamps, and subtitle text. SRT has no standard way to define styling such as color or font, so appearance usually depends on the player's default settings. Some players honor a few HTML-like tags such as <b> and <i>, but this is not part of any formal specification.

SRT Format Structure

Each subtitle block in an SRT file is arranged in the following format:

  1. Subtitle Number (incrementing by one with each block)
  2. Timestamp (start and end times, accurate to milliseconds)
  3. Subtitle Content (can contain multiple lines of text)
  4. A blank line (used to separate subtitle blocks)

SRT Example

plaintext
1
00:00:01,000 --> 00:00:04,000
Hello my friend!

2
00:00:05,000 --> 00:00:08,000
The weather is nice today, what do you think.

Detailed Explanation

  • Subtitle Number: Each subtitle block has a unique number, incrementing sequentially. Numbers start from 1 and must be integers.

    • Example: 1
  • Timestamp: The format is HH:MM:SS,mmm, where HH is hours, MM is minutes, SS is seconds, and mmm is milliseconds. The timestamp consists of two times separated by --> with a space on each side of the symbol, indicating the start and end times of the subtitle.

    • Example: 00:00:01,000 --> 00:00:04,000
  • Subtitle Content: Subtitle text can contain one or more lines and is displayed on the video. SRT has no official syntax for color, font size, and similar formatting; these are normally controlled through player settings.

    • Example: Hello my friend!
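
The block structure described above can be read with a few lines of Python. This is a minimal sketch rather than a full SRT parser: the names SrtCue and parse_srt are hypothetical, and it assumes well-formed blocks separated by blank lines, skipping anything that does not match.

```python
import re
from dataclasses import dataclass

# HH:MM:SS,mmm --> HH:MM:SS,mmm
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2}),(\d{3})"
)


@dataclass
class SrtCue:
    index: int
    start: float  # seconds
    end: float    # seconds
    text: str


def _seconds(h: str, m: str, s: str, ms: str) -> float:
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(content: str) -> list[SrtCue]:
    """Parse SRT text into cues, skipping malformed blocks."""
    cues = []
    text = content.lstrip("\ufeff").replace("\r\n", "\n").strip()
    # Subtitle blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text):
        lines = block.split("\n")
        if len(lines) < 2 or not lines[0].strip().isdigit():
            continue
        match = TIMING.match(lines[1].strip())
        if match is None:
            continue
        g = match.groups()
        cues.append(
            SrtCue(
                index=int(lines[0]),
                start=_seconds(*g[:4]),
                end=_seconds(*g[4:]),
                text="\n".join(lines[2:]),
            )
        )
    return cues
```

Calling parse_srt on the SRT example above yields two cues, the first running from 1.0 to 4.0 seconds.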

SRT Format Limitations

  • No standard text formatting: Colors, fonts, and similar properties cannot be set reliably in the file itself; styling is left to the player or to other tools.

VTT Subtitle Format

WebVTT (Web Video Text Tracks) is a subtitle format designed for HTML5 video and online playback. It is more capable than SRT, supporting styles, comments, and positioning information. The file suffix is .vtt. For web playback it is usually delivered as a separate file referenced by a <track> element inside the HTML <video> element, rather than embedded in the video file.

VTT Format Structure

VTT files are similar to SRT but offer more features. A VTT file starts with a WEBVTT line followed by a blank line, and uses a period (.) instead of a comma (,) to separate seconds and milliseconds.

VTT Example

plaintext
WEBVTT

1
00:00:01.000 --> 00:00:04.000
Hello, <b>friends!</b>

2
00:00:05.000 --> 00:00:08.000
The rain today is <i>very very heavy</i>.

Detailed Explanation

  • WEBVTT Declaration: All VTT files must start with WEBVTT to declare the file format.

    • Example: WEBVTT
  • Subtitle Number: In VTT the line before the timestamp is an optional cue identifier. It can be a number, as in SRT, or any other label, and it can be omitted entirely.

  • Timestamp: The format is HH:MM:SS.mmm, where HH is hours, MM is minutes, SS is seconds, and mmm is milliseconds. A period (.) separates seconds and milliseconds instead of a comma (,). The hours part may be omitted (MM:SS.mmm). The timestamp consists of two times separated by -->, with a space on each side.

    • Example: 00:00:01.000 --> 00:00:04.000
  • Subtitle Content: Subtitle text can contain a small set of HTML-like tags for formatting, such as bold (<b>), italic (<i>), and underline (<u>).

    • Example:
      plaintext
      Hello, <b>friends!</b>
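
Because the two formats differ mainly in the header and the millisecond separator, a basic SRT to VTT conversion is short. The sketch below uses a hypothetical function name, srt_to_vtt, and assumes the input is well-formed SRT. It rewrites commas only on timing lines so that commas inside the subtitle text are left alone.

```python
import re

# An SRT timing line; only these lines have their commas rewritten.
SRT_TIMING = re.compile(
    r"^(\d+:\d{2}:\d{2}),(\d{3})(\s*-->\s*)(\d+:\d{2}:\d{2}),(\d{3})(.*)$"
)


def srt_to_vtt(srt_text: str) -> str:
    """Convert SRT text to WebVTT text (header plus period separators)."""
    out = ["WEBVTT", ""]  # required header followed by a blank line
    normalized = srt_text.lstrip("\ufeff").replace("\r\n", "\n")
    for line in normalized.split("\n"):
        out.append(SRT_TIMING.sub(r"\1.\2\3\4.\5\6", line))
    return "\n".join(out)
```

The numeric SRT indexes are kept, since VTT accepts them as optional cue identifiers.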

Other Features Supported by VTT

  1. Styles (CSS):

    • VTT supports adjusting text styles such as color and font size through CSS. The rules use the ::cue pseudo-element and can be defined in the HTML page through a <style> tag or in an external CSS file.
    • Example:
      plaintext
      <c.red>Hello friends!</c>
      Defining ::cue(.red) { color: red; } in the page's CSS displays Hello friends! in red.
  2. Location Information:

    • VTT supports setting the specific position of subtitles through attributes such as position and line.
    • Example:
      plaintext
      00:00:01.000 --> 00:00:04.000 position:90% line:10%
  3. Annotations:

    • VTT supports adding annotations to the file, starting with NOTE.
    • Example:
      plaintext
      NOTE This line is a comment and will not be displayed.
  4. Multi-Language Support:

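    • A single VTT file holds one language. Multiple languages are offered by providing one VTT file per language, each referenced by its own HTML5 <track> element with a matching srclang attribute.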
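
The positioning settings shown in item 2 follow the end time on the timing line as space-separated name:value pairs. A minimal sketch of splitting such a line is given below; the function name parse_cue_timing is hypothetical, and the values are returned as raw strings without validation.

```python
def parse_cue_timing(line: str) -> tuple[str, str, dict[str, str]]:
    """Split 'start --> end name:value ...' into (start, end, settings)."""
    start, arrow, rest = line.partition("-->")
    parts = rest.split()
    if not arrow or not parts:
        raise ValueError(f"not a cue timing line: {line!r}")
    settings = {}
    # Everything after the end time is a cue setting such as position:90%.
    for item in parts[1:]:
        name, sep, value = item.partition(":")
        if sep:
            settings[name] = value
    return start.strip(), parts[0], settings
```

For the example line above, the settings dictionary contains the keys position and line.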

Advantages of VTT Format

  • Text Formatting: Supports HTML-like tags for simple text formatting such as bold and italic.
  • Styling and Positioning: Subtitle styles can be set via CSS, and positions via cue settings such as position and line.
  • Annotations and Metadata: Supports adding annotation information without affecting subtitle display.
  • Web Compatibility: Designed specifically for HTML5 videos, suitable for web environments.

Comparison of SRT and VTT

| Feature | SRT | VTT |
| --- | --- | --- |
| File Header | None | WEBVTT followed by a blank line |
| Timestamp Format | HH:MM:SS,mmm (comma before milliseconds) | HH:MM:SS.mmm (period before milliseconds) |
| Text Formatting Support | No official support | Supports HTML-like tags, such as <b>, <i> |
| Subtitle Number | Required | Optional |
| Style and Location Support | Depends on the player | CSS styling via ::cue, cue settings for location |
| Comments | Not supported | Supports NOTE comments |
| Supported Advanced Features | Only basic subtitle features | Supports Karaoke-style timing, comments, styles, etc. |
| Use Cases | Local video files, simple subtitle display | HTML5 video, web subtitles, more complex subtitle display |
| Embedded in Video | Can be embedded in some video containers | For web playback, normally kept as a separate file referenced from the <video> element |

For web playback, a WebVTT (VTT) file is normally not embedded in the MP4 file. Instead, it is associated with the MP4 video through the HTML5 <track> element, and the browser displays the subtitles when the page plays the video.

Playing MP4 in a Browser Using VTT Subtitles

In HTML5, the <video> element can be used to load an MP4 video, and the <track> element can be used to associate a VTT subtitle with the video.

HTML Example:

html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title></title>
</head>
<body>
    <video controls width="600">
        <source src="video.mp4" type="video/mp4">
        <track src="subtitles.vtt" kind="subtitles" srclang="zh" label="Simplified Chinese">
        Your browser does not support the video tag.
    </video>
</body>
</html>

HTML Element Explanation

  • <video>: Used to embed video files. The controls attribute allows users to control video playback (play/pause, etc.).
  • <source>: Defines the path and type of the video file, using MP4 here.
  • <track>: Defines the subtitle file. The src attribute points to the VTT file, kind="subtitles" marks the track as subtitles, srclang specifies the subtitle language (zh indicates Chinese), and label gives the track a descriptive name shown in the player's subtitle menu.

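Store the HTML file and the related video and subtitle files in the same directory, then open the HTML file (such as index.html) in a browser. The video player appears, and the subtitles are shown once the user enables them from the player's subtitle menu. To show a track automatically, add the default attribute to its <track> element. Some browsers refuse to load subtitle files for pages opened directly from disk (file://); serving the directory over a local HTTP server avoids this.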

Most modern browsers and video players support subtitle switching. You can select different subtitles (if there are multiple subtitle tracks) through the subtitle button in the video control bar.
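
If the page is opened from disk and the subtitles do not load, one option is to serve the directory over HTTP instead. The following is a small sketch using only the Python standard library; SubtitleHandler and make_server are hypothetical names, and the port and the explicit text/vtt mapping are assumptions for illustration.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class SubtitleHandler(SimpleHTTPRequestHandler):
    # Make sure .vtt files are served with the WebVTT MIME type.
    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".vtt": "text/vtt",
    }


def make_server(directory: str = ".", port: int = 8000) -> ThreadingHTTPServer:
    """Create a local-only server for the folder holding the HTML, MP4, and VTT files."""
    handler = functools.partial(SubtitleHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


# Usage (blocks until interrupted):
#   make_server(".", 8000).serve_forever()
# then open http://127.0.0.1:8000/index.html in the browser.
```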

VTT Subtitle Notes

  • Browser Compatibility: Almost all modern browsers (such as Chrome, Firefox, Edge, etc.) support the <video> element and WebVTT subtitles. As long as the VTT file and MP4 file are correctly associated, subtitles should be displayed when playing the video in the browser.

  • Kept Outside the MP4 File: For browser playback, VTT subtitles are normally not embedded in the MP4 file. Keep them as external files and associate them with the video through the HTML5 <track> element.

  • Styling VTT Subtitles: In the browser, WebVTT subtitles can be styled to a certain extent through CSS using the ::cue pseudo-element. Only a limited set of properties (such as color, background, and font) applies to cues.


ASS Subtitle Format

ASS (Advanced SubStation Alpha) is a feature-rich subtitle format widely used in animation, Karaoke subtitles, and other scenarios that require complex subtitle effects. It supports rich style control, including font, color, position, shadow, and outline.

Below is an example of an ASS subtitle file.

[Script Info]
; Script generated by FFmpeg/Lavc60.27.100
ScriptType: v4.00+
PlayResX: 384
PlayResY: 288
ScaledBorderAndShadow: yes
YCbCr Matrix: None

[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Default,黑体,16,&hffffff,&HFFFFFF,&h000000,&H0,0,0,0,0,100,100,0,0,1,1,0,2,10,10,10,1

[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:01.95,0:00:04.93,Default,,0,0,0,,This is an ancient galaxy,
Dialogue: 0,0:00:05.42,0:00:08.92,Default,,0,0,0,,We have observed it for several years,
Dialogue: 0,0:00:09.38,0:00:13.32,Default,,0,0,0,,The Webb Telescope recently sent many previously undiscovered photos.

ASS Subtitle Structure

A standard ASS subtitle file contains multiple parts:

  1. [Script Info]: Basic information about the script, such as the title, the original subtitle author, and the script resolution.
  2. [V4+ Styles]: Subtitle style definitions. Each style has a name and can be referenced by any number of subtitle lines.
  3. [Events]: The actual subtitle events, defining the start time, end time, and content of each subtitle.
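
The three parts can be separated by looking for the bracketed section headers. This is a minimal sketch with a hypothetical function name, split_ass_sections; it assumes headers sit on their own line and treats lines starting with a semicolon as comments.

```python
def split_ass_sections(content: str) -> dict[str, list[str]]:
    """Group the lines of an ASS file by their [Section] header."""
    sections: dict[str, list[str]] = {}
    current = None
    for raw in content.lstrip("\ufeff").splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and comment lines
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections[current] = []
        elif current is not None:
            sections[current].append(raw)
    return sections
```

Applied to the example file above, the result has the keys Script Info, V4+ Styles, and Events, in that order.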

1. [Script Info] Section

This section contains metadata about the subtitle file, defining some basic information about the subtitles.

ini
[Script Info]
Title: Subtitle Title
Original Script: Subtitle Author
ScriptType: v4.00+
PlayDepth: 0
PlayResX: 1920
PlayResY: 1080
ScaledBorderAndShadow: yes
YCbCr Matrix: None

  • Title: The title of the subtitle file.
  • Original Script: The author of the original subtitle.
  • ScriptType: Defines the script version, usually v4.00+.
  • PlayResX and PlayResY: Define the script resolution, the coordinate space in which positions, margins, and font sizes are expressed. It usually matches the video resolution; renderers scale the subtitles when it does not.
  • PlayDepth: The color depth of the video, generally 0.
  • ScaledBorderAndShadow: Specifies whether the border (Outline) and shadow (Shadow) of the subtitles are scaled together with the script resolution. yes enables scaling, no disables it.
  • YCbCr Matrix: Specifies the YCbCr matrix used for color conversion. YCbCr is a color space commonly used in video encoding and decoding, and this setting may affect how subtitle colors are rendered on video in different color spaces.

2. [V4+ Styles] Section

This section defines the style of the subtitles. Each style can control the font, color, shadow, etc. of the subtitles through fields. The format is as follows:

ini
[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Default,Arial,20,&H00FFFFFF,&H0000FFFF,&H00000000,&H00000000,-1,0,0,0,100,100,0,0,1,1,0,2,10,10,20,1

Field Explanation:

  1. Name: The name of the style, used for reference.

    • Example: Default, indicating that this is the default style.
  2. Fontname: The font name.

    • Example: Arial, the subtitles will use the Arial font.
  3. Fontsize: The font size.

    • Example: 20, the font size is 20.
  4. PrimaryColour: The primary subtitle color, representing the main color of the subtitles (usually the displayed text color).

    • Example: &H00FFFFFF, opaque white. The color value format is &HAABBGGRR, where AA is the alpha (transparency) value: 00 is fully opaque and FF is fully transparent.
  5. SecondaryColour: The secondary subtitle color, usually used for the transition color of Karaoke subtitles.

    • Example: &H0000FFFF, yellow (blue 00, green FF, red FF).
  6. OutlineColour: The outline color.

    • Example: &H00000000, black outline.
  7. BackColour: The background color, usually used in the case of BorderStyle=3 (subtitles with a background box).

    • Example: &H00000000, black background.
  8. Bold: Bold setting.

    • Example: -1 indicates bold, 0 indicates non-bold.
  9. Italic: Italic setting.

    • Example: 0 indicates non-italic, -1 indicates italic.
  10. Underline: Underline setting.

    • Example: 0 indicates no underline.
  11. StrikeOut: Strikethrough setting.

    • Example: 0 indicates no strikethrough.
  12. ScaleX: Horizontal scaling ratio, 100 indicates normal ratio.

    • Example: 100, indicates no scaling.
  13. ScaleY: Vertical scaling ratio.

    • Example: 100, indicates no scaling.
  14. Spacing: Character spacing.

    • Example: 0, indicates no extra spacing.
  15. Angle: Subtitle rotation angle.

    • Example: 0, indicates no rotation.
  16. BorderStyle: Border style, defines whether the subtitles have an outline or background box.

    • Example: 1 indicates there is an outline but no background box, 3 indicates there is a background box.
  17. Outline: Outline thickness.

    • Example: 1, indicates that the outline thickness is 1.
  18. Shadow: Shadow depth.

    • Example: 0, indicates no shadow.
  19. Alignment: Subtitle alignment, using numbers 1-9 to define different alignment positions.

    • Example: 2, indicates that the subtitles are placed at the bottom center.

    Alignment explanation:

    • 1: Bottom left
    • 2: Bottom center
    • 3: Bottom right
    • 4: Middle left
    • 5: Center
    • 6: Middle right
    • 7: Top left
    • 8: Top center
    • 9: Top right
  20. MarginL, MarginR, MarginV: Left, right, and vertical margins, in pixels.

    • Example: 10, 10, 20, indicates that the left and right margins are 10 pixels, and the vertical margin is 20 pixels.
  21. Encoding: The font character set used to select the font. 0 indicates ANSI and 1 indicates the default character set.
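
Since the Format line names the fields and each Style line supplies the values in the same order, the two can be zipped into a dictionary. The sketch below uses a hypothetical function name, parse_style, and returns every value as a string.

```python
def parse_style(format_line: str, style_line: str) -> dict[str, str]:
    """Pair the field names of a Format line with the values of a Style line."""
    names = [n.strip() for n in format_line.split(":", 1)[1].split(",")]
    values = [v.strip() for v in style_line.split(":", 1)[1].split(",")]
    if len(names) != len(values):
        raise ValueError(
            f"expected {len(names)} style fields, found {len(values)}"
        )
    return dict(zip(names, values))
```

With the Format and Style lines shown at the start of this section, the result maps Fontname to Arial, Alignment to 2, and MarginV to 20.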


3. [Events] Section

This section defines the actual subtitle events, including timestamps, subtitle content, and the style used.

ini
[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:01.00,0:00:05.00,Default,,0,0,0,,This is the first subtitle
Dialogue: 0,0:00:06.00,0:00:10.00,Default,,0,0,0,,This is the second subtitle

Field Explanation:

  1. Layer: Controls the stacking order of overlapping subtitles; the larger the number, the higher the layer.

    • Example: 0, indicating the default layer.
  2. Start: Subtitle start time, in the format H:MM:SS.cc (hours:minutes:seconds.centiseconds, so two digits follow the period).

    • Example: 0:00:01.00, indicating that the subtitle starts at 1 second.
  3. End: Subtitle end time.

    • Example: 0:00:05.00, indicating that the subtitle ends at 5 seconds.
  4. Style: The name of the subtitle style used, referring to the style defined in [V4+ Styles].

    • Example: Default, using the style named Default.
  5. Name: Optional field, usually used for character name annotation.

  6. MarginL, MarginR, MarginV: The left, right, and vertical margins of this subtitle, overriding the values defined in the style. A value of 0 means the style's margin is used.

  7. Effect: Subtitle effects, usually used for Karaoke subtitles, etc.

  8. Text: The actual content of the subtitle. ASS override tags can be used for line breaks (\N), special styles, positioning, and so on.
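
One detail worth showing in code: Text is the last field and may itself contain commas, so a Dialogue line should be split at most nine times. The sketch below assumes the default field order listed above; EVENT_FIELDS, ass_time_to_seconds, and parse_dialogue are hypothetical names.

```python
EVENT_FIELDS = [
    "Layer", "Start", "End", "Style", "Name",
    "MarginL", "MarginR", "MarginV", "Effect", "Text",
]


def ass_time_to_seconds(stamp: str) -> float:
    """Convert an ASS time such as 0:00:01.95 to seconds."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def parse_dialogue(line: str) -> dict:
    """Parse one 'Dialogue: ...' line using the default field order."""
    kind, _, rest = line.partition(":")
    if kind.strip() != "Dialogue":
        raise ValueError(f"not a Dialogue line: {line!r}")
    # Limit the split so commas inside Text are preserved.
    values = rest.lstrip().split(",", len(EVENT_FIELDS) - 1)
    if len(values) != len(EVENT_FIELDS):
        raise ValueError(f"expected {len(EVENT_FIELDS)} fields: {line!r}")
    event = dict(zip(EVENT_FIELDS, values))
    event["Start"] = ass_time_to_seconds(event["Start"])
    event["End"] = ass_time_to_seconds(event["End"])
    return event
```

A file with a different Format line in [Events] would need the field list read from that line instead.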


Example Subtitle Event

ini
Dialogue: 0,0:00:01.00,0:00:05.00,Default,,0,0,0,,{\pos(960,540)}This is the first subtitle
  • {\pos(960,540)}: Positions the subtitle at a specific point on the screen (x = 960, y = 540). The coordinates are measured in the script resolution set by PlayResX and PlayResY, so with a 1920x1080 script this is the center of the frame.
  • This is the first subtitle: The actual subtitle text displayed.
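
When processing the Text field, it is often useful to read the \pos coordinates or to recover the plain text without override blocks. The following is a minimal sketch with hypothetical helper names, extract_pos and strip_override_tags; it assumes override blocks are enclosed in braces as in the example above and does not interpret any other tags.

```python
import re

# \pos(x,y) with integer or decimal coordinates.
POS_TAG = re.compile(
    r"\\pos\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)"
)
# Any {...} override block.
OVERRIDE_BLOCK = re.compile(r"\{[^}]*\}")


def extract_pos(text: str):
    """Return (x, y) from the first \\pos tag, or None if there is none."""
    match = POS_TAG.search(text)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def strip_override_tags(text: str) -> str:
    """Remove all {...} override blocks, leaving the displayed text."""
    return OVERRIDE_BLOCK.sub("", text)
```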

Color Settings in ASS

&HAABBGGRR is a hexadecimal format that holds both the transparency and the color value. It is used for the color attributes of a style: PrimaryColour, SecondaryColour, OutlineColour, and BackColour.

The meaning is as follows:

  • AA: Transparency (Alpha Channel). 00 is completely opaque and FF is completely transparent.
  • BB: Blue component.
  • GG: Green component.
  • RR: Red component.

The specific byte order is: Alpha (transparency) - Blue - Green - Red.

If you do not need transparency, you can omit the AA part and write &HBBGGRR; the color is then treated as fully opaque.

Transparency and Color Values

  • Completely Opaque: The color is fully visible. The representation is &H00BBGGRR, where the AA part is 00 (completely opaque).

    Example:

    plaintext
    &H00FFFFFF
    • Here, &H00FFFFFF represents completely opaque white. The transparency is 00 (completely opaque), and the color is FFFFFF (white).
  • Completely Transparent: The color is invisible. The representation is &HFFBBGGRR, where the AA part is FF (completely transparent).

    Example:

    plaintext
    &HFF000000
    • Here, &HFF000000 represents completely transparent black. The transparency is FF (completely transparent), and the color is 000000 (black).

Actual Color Examples

  1. Completely Opaque Blue:

    plaintext
    &H00FF0000
    • Transparency 00 (completely opaque), BB = FF, GG = 00, RR = 00 (blue). Because the byte order is blue, green, red, this value is blue rather than red.
  2. Completely Transparent Green:

    plaintext
    &HFF00FF00
    • Transparency FF (completely transparent), BB = 00, GG = FF, RR = 00 (green).

  • The AA part in &HAABBGGRR controls transparency, and the BB, GG, RR parts control color.
  • Completely Opaque: Transparency 00, for example, &H000000FF represents completely opaque red.
  • Completely Transparent: Transparency FF, for example, &HFF0000FF represents completely transparent red.
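
The byte order is easy to get wrong, so a small conversion helper can be handy. The sketch below uses hypothetical function names, parse_ass_color and rgb_to_ass. It assumes the convention used by common ASS renderers, where alpha 00 is fully opaque and FF is fully transparent, and it treats a value without the AA part as opaque.

```python
def parse_ass_color(value: str) -> tuple[int, int, int, float]:
    """Decode &HAABBGGRR or &HBBGGRR into (red, green, blue, opacity)."""
    digits = value.strip().strip("&").lstrip("Hh")
    number = int(digits, 16)
    alpha = (number >> 24) & 0xFF   # 00 = opaque, FF = transparent
    blue = (number >> 16) & 0xFF
    green = (number >> 8) & 0xFF
    red = number & 0xFF
    opacity = 1 - alpha / 255       # 1.0 = fully visible
    return red, green, blue, opacity


def rgb_to_ass(red: int, green: int, blue: int, alpha: int = 0) -> str:
    """Encode RGB plus ASS alpha (0 = opaque) as &HAABBGGRR."""
    return f"&H{alpha:02X}{blue:02X}{green:02X}{red:02X}"
```

For example, rgb_to_ass(255, 0, 0) produces &H000000FF, which is opaque red.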