SRT / VTT / ASS Subtitle Formats

SRT (SubRip Subtitle), VTT (WebVTT), and ASS (Advanced SubStation Alpha) are three very common subtitle formats. This article provides a detailed introduction to each subtitle format, including their attributes and settings.

SRT Subtitle Format

SRT is a simple and widely used subtitle format, with the extension .srt. It is especially popular in video players and subtitle editors. Its basic structure includes a subtitle number, timestamp, and subtitle text. The format itself defines no styling attributes such as color or font, so appearance usually depends on the player's default settings. Some players honor simple tags such as <i> or <b> inside SRT text, but that is a player extension rather than part of the format.

SRT Format Structure

Each subtitle block in an SRT file is arranged in the following format:

  1. Subtitle Number (incrementing line by line)
  2. Timestamp (display start and end times, accurate to milliseconds)
  3. Subtitle Content (can contain multiple lines of text)
  4. A blank line (to separate subtitle blocks)

SRT Example

plaintext
1
00:00:01,000 --> 00:00:04,000
Hello, my friend!

2
00:00:05,000 --> 00:00:08,000
The weather is nice today, don't you think?

Detailed Explanation

  • Subtitle Number: Each subtitle block has a unique number that increments sequentially. The number starts from 1 and must be an integer.

    • Example: 1
  • Timestamp: The format is HH:MM:SS,mmm, where HH is hours, MM is minutes, SS is seconds, and mmm is milliseconds. The timestamp consists of two times separated by -->, with a space on each side of the symbol, indicating the start and end times of the subtitle.

    • Example: 00:00:01,000 --> 00:00:04,000
  • Subtitle Content: The subtitle text can contain one or more lines and is displayed on the video. SRT does not support formatted text, such as colors or font sizes. These must be defined through player settings or external style files.

    • Example: Hello, my friend!
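
The block structure described above can be sketched as a small parser. This is a minimal illustration rather than a complete SRT reader: the names SrtCue, to_ms, and parse_srt are hypothetical, and it assumes well-formed input with \n line endings and no byte-order mark.

```python
import re
from dataclasses import dataclass

# HH:MM:SS,mmm --> HH:MM:SS,mmm, with one space on each side of the arrow
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


@dataclass
class SrtCue:
    number: int
    start_ms: int
    end_ms: int
    text: str


def to_ms(hours, minutes, seconds, millis):
    """Convert the four timestamp components to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_srt(content: str) -> list[SrtCue]:
    cues = []
    # Subtitle blocks are separated by one or more blank lines
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # number, timestamp, and at least one text line are required
        match = TIMING.fullmatch(lines[1].strip())
        if match is None:
            continue  # skip blocks whose timing line is malformed
        parts = match.groups()
        cues.append(
            SrtCue(
                number=int(lines[0].strip()),
                start_ms=to_ms(*parts[:4]),
                end_ms=to_ms(*parts[4:]),
                text="\n".join(lines[2:]),  # content may span several lines
            )
        )
    return cues


SAMPLE = """1
00:00:01,000 --> 00:00:04,000
Hello, my friend!

2
00:00:05,000 --> 00:00:08,000
The weather is nice today, don't you think?
"""

cues = parse_srt(SAMPLE)
```

Keeping times as integer milliseconds avoids floating-point rounding when cues are later shifted or converted to another format.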

SRT Format Limitations

  • No Text Formatting Support: Cannot directly set colors, fonts, etc. Styling requires player or other tools for adjustment.

VTT Subtitle Format

WebVTT (Web Video Text Tracks) is a subtitle format designed for the HTML5 <video> element, specifically for web videos. It is more capable than the SRT format, supporting styles, annotations, and position information. The file extension is .vtt. On the web, a VTT file is normally kept separate from the video and referenced through a <track> element inside the <video> tag in HTML.

VTT Format Structure

VTT files are similar to SRT files but with more features. A VTT file starts with WEBVTT followed by a blank line and uses a . period instead of a , comma to separate seconds and milliseconds.

VTT Example

plaintext
WEBVTT

1
00:00:01.000 --> 00:00:04.000
Hello, <b>friends!</b>

2
00:00:05.000 --> 00:00:08.000
Today's rain is <i>very, very heavy</i>.

Detailed Explanation

  • WEBVTT Declaration: All VTT files must start with WEBVTT to declare the file format.

    • Example: WEBVTT
  • Subtitle Number: The subtitle number is optional, unlike in the SRT format where it is required. It serves to distinguish the order of each subtitle segment, but it can be omitted in VTT.

  • Timestamp: The format is HH:MM:SS.mmm, where HH is hours, MM is minutes, SS is seconds, and mmm is milliseconds. A . period is used to separate seconds and milliseconds instead of a , comma. The timestamp consists of two times separated by -->, also with a space on each side.

    • Example: 00:00:01.000 --> 00:00:04.000
  • Subtitle Content: The subtitle text can contain HTML-like cue tags for formatting text, such as bold (<b>), italics (<i>), and underline (<u>).

    • Example:
      plaintext
      Hello, <b>friends!</b>

Other Features Supported by VTT

  1. Styles (CSS):

    • VTT supports adjusting text styles via CSS, such as color and font size. The rules target cues through the ::cue pseudo-element and can be defined in HTML via the <style> tag or in external CSS files.
    • Example:
      plaintext
      <c.red>Hello, friends!</c>
      Defining ::cue(.red) { color: red; } in the page's CSS will display Hello, friends! in red in browsers that support the ::cue() selector.
  2. Position Information:

    • VTT supports setting the specific position of subtitles using attributes such as position and line.
    • Example:
      plaintext
      00:00:01.000 --> 00:00:04.000 position:90% line:10%
  3. Annotations:

    • VTT supports adding annotations to the file, starting with NOTE.
    • Example:
      plaintext
      NOTE This line is a comment and will not be displayed.
  4. Multi-Language Support:

    • Multi-language subtitles are provided as one VTT file per language, each referenced by its own HTML5 <track> tag with a matching srclang attribute.
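
As an illustration of the timing line and the position settings described above, here is a minimal sketch that splits a cue timing line into start time, end time, and settings. The function names are hypothetical, the hours component is treated as optional, and the settings are kept as raw strings without validation.

```python
import re

# [HH:]MM:SS.mmm, with a period before the milliseconds
VTT_TIME = re.compile(r"(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})")


def vtt_time_to_ms(stamp: str) -> int:
    match = VTT_TIME.fullmatch(stamp)
    if match is None:
        raise ValueError(f"not a WebVTT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = match.groups()
    return ((int(hours or 0) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_cue_timing(line: str):
    """Return (start_ms, end_ms, settings) for a cue timing line."""
    start, _, rest = line.partition(" --> ")
    # The first token after the arrow is the end time; the rest are name:value settings
    end, *settings = rest.split()
    parsed = dict(item.split(":", 1) for item in settings)
    return vtt_time_to_ms(start.strip()), vtt_time_to_ms(end), parsed


start, end, settings = parse_cue_timing(
    "00:00:01.000 --> 00:00:04.000 position:90% line:10%"
)
```

Here settings holds the position and line values exactly as written in the file, so a caller can decide how to interpret percentages.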

Advantages of the VTT Format

  • Text Formatting: Supports HTML-like tags for simple text formatting, such as bold and italics.
  • Styling and Positioning: Styles can be set through CSS, and positions through cue settings such as position and line.
  • Annotations and Metadata: Supports adding annotation information without affecting subtitle display.
  • Web Compatibility: Specifically designed for HTML5 video, suitable for web environments.

SRT vs. VTT Comparison

Feature | SRT | VTT
File Header | None | WEBVTT followed by a blank line
Timestamp Format | HH:MM:SS,mmm (comma separates seconds and milliseconds) | HH:MM:SS.mmm (period separates seconds and milliseconds)
Text Formatting Support | No | Supports HTML-like tags, such as <b>, <i>
Subtitle Number | Required | Optional
Style and Position Support | Depends on the player | CSS styling via ::cue, cue settings for position
Annotations | No | Supports NOTE annotations
Advanced Features Supported | Basic subtitle features only | Supports karaoke-style timing, annotations, styles, etc.
Use Cases | Local video files, simple subtitle display | HTML5 video, web subtitles, complex subtitle display
Embedding in Video | Embeddable in video files | Normally kept as a separate file and referenced from the <video> element on a webpage
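
The two structural differences in the comparison (the WEBVTT header and the period before the milliseconds) are enough for a basic SRT to VTT conversion. The sketch below assumes well-formed SRT input; srt_to_vtt is a hypothetical helper name, and the subtitle numbers are simply carried over as optional cue identifiers.

```python
import re

# Match only timing lines, so commas inside subtitle text are left alone
SRT_TIMING = re.compile(
    r"^(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})",
    re.MULTILINE,
)


def srt_to_vtt(srt_text: str) -> str:
    # Normalize Windows line endings before matching line starts
    body = srt_text.replace("\r\n", "\n").strip()
    # Swap the comma for a period in both timestamps of each timing line
    body = SRT_TIMING.sub(r"\1.\2 --> \3.\4", body)
    # A VTT file starts with WEBVTT followed by a blank line
    return "WEBVTT\n\n" + body + "\n"


vtt = srt_to_vtt(
    "1\n00:00:01,000 --> 00:00:04,000\nHello, my friend!\n\n"
    "2\n00:00:05,000 --> 00:00:08,000\nThe weather is nice today, don't you think?\n"
)
```

Anchoring the pattern to timing lines matters because subtitle text such as "Hello, my friend!" contains commas that must not be rewritten.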

For web playback, VTT (WebVTT) subtitles are normally not muxed into the MP4 file. Instead, VTT files are associated with MP4 videos through the HTML5 <track> tag, and the browser displays them alongside the video.

Playing MP4 with VTT Subtitles in a Browser

In HTML5, an MP4 video can be loaded via the <video> element, and VTT subtitles can be associated with the video using the <track> element.

HTML Example:

html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title></title>
</head>
<body>
    <video controls width="600">
        <source src="video.mp4" type="video/mp4">
        <track src="subtitles.vtt" kind="subtitles" srclang="en" label="English">
        Your browser does not support the video tag.
    </video>
</body>
</html>

HTML Element Explanation

  • <video>: Used to embed a video file. The controls attribute allows users to control video playback (play/pause, etc.).
  • <source>: Defines the path and type of the video file, using MP4 here.
  • <track>: Defines the subtitle file, with the src attribute pointing to the path of the VTT file, kind="subtitles" indicating that it is a subtitle, srclang specifying the language of the subtitle (en for English), and label giving the subtitle track a descriptive label.

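Place the HTML file and the related video and subtitle files in the same directory. Then, open the HTML file (e.g., index.html) in a browser. You will see the video player. Subtitles appear during playback once a subtitle track is enabled, either by the user through the player controls or by adding the default attribute to the <track> element. Some browsers refuse to load <track> files from file:// URLs; serving the directory from a local web server avoids this.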

Most modern browsers and video players support subtitle switching. You can select different subtitles (if there are multiple subtitle tracks) via the subtitle button in the video control bar.
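
When a page offers several subtitle languages, the <track> elements can be generated rather than written by hand. The following is a minimal sketch: track_tags is a hypothetical helper, the file names are made up for illustration, and marking the first track with the default attribute is a design choice, not a requirement.

```python
from html import escape


def track_tags(tracks):
    """Build <track> elements from (src, srclang, label) tuples.

    The first track gets the `default` attribute so that it is enabled
    without user interaction.
    """
    lines = []
    for index, (src, srclang, label) in enumerate(tracks):
        default = " default" if index == 0 else ""
        lines.append(
            f'<track src="{escape(src)}" kind="subtitles" '
            f'srclang="{escape(srclang)}" label="{escape(label)}"{default}>'
        )
    return "\n".join(lines)


markup = track_tags([
    ("subtitles.en.vtt", "en", "English"),  # hypothetical file names
    ("subtitles.fr.vtt", "fr", "Français"),
])
```

The resulting lines can be placed inside the <video> element after the <source> tag; each track then appears as an entry in the player's subtitle menu.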

VTT Subtitle Considerations

  • Browser Compatibility: Almost all modern browsers (such as Chrome, Firefox, Edge, etc.) support the <video> element and WebVTT subtitles. As long as the VTT file and MP4 file are correctly associated, the subtitles should be displayed when playing the video in the browser.

  • Kept Separate from the MP4 File: For browser playback, VTT subtitles are normally not muxed into the MP4 file. Use an external subtitle file and associate it through the HTML5 <track> tag.

  • VTT Subtitle Styling: In a browser, WebVTT subtitles can be styled to some extent via CSS using the ::cue pseudo-element. If you need further control over the subtitle appearance, you can read the cues with JavaScript and render them yourself.


ASS Subtitle Format

ASS (Advanced SubStation Alpha) is a feature-rich subtitle format widely used in anime, karaoke subtitles, and other scenarios that require complex subtitle effects. It supports rich style controls, including fonts, colors, positions, shadows, and outlines.

Below is an example of an ASS subtitle:

[Script Info]
; Script generated by FFmpeg/Lavc60.27.100
ScriptType: v4.00+
PlayResX: 384
PlayResY: 288
ScaledBorderAndShadow: yes
YCbCr Matrix: None

[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Default,黑体,16,&hffffff,&HFFFFFF,&h000000,&H0,0,0,0,0,100,100,0,0,1,1,0,2,10,10,10,1
[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:01.95,0:00:04.93,Default,,0,0,0,,This is an ancient galaxy,
Dialogue: 0,0:00:05.42,0:00:08.92,Default,,0,0,0,,We have been observing it for several years,
Dialogue: 0,0:00:09.38,0:00:13.32,Default,,0,0,0,,The Webb Telescope recently sent back many previously undiscovered photos.

ASS Subtitle Structure

A standard ASS subtitle file consists of several parts:

  1. [Script Info]: Basic information about the script, such as the title and original subtitle author.
  2. [V4+ Styles]: Style definitions for subtitles, with each style being referenced by different subtitle lines.
  3. [Events]: Actual subtitle events, defining the appearance time, disappearance time, and specific content of the subtitles.
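
The three-part layout can be illustrated with a small sketch that groups the lines of an ASS file by section header. split_ass_sections is a hypothetical name, the sample uses an abbreviated style definition to stay short, and a real file may contain further sections such as [Fonts].

```python
def split_ass_sections(content: str) -> dict[str, list[str]]:
    """Group non-empty, non-comment lines under their [Section] header."""
    sections: dict[str, list[str]] = {}
    current = None
    for raw in content.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and ';' comment lines
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections


SAMPLE = """[Script Info]
; Script generated by FFmpeg/Lavc60.27.100
ScriptType: v4.00+
PlayResX: 384
PlayResY: 288

[V4+ Styles]
Format: Name, Fontname, Fontsize
Style: Default,Arial,16
[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:01.95,0:00:04.93,Default,,0,0,0,,This is an ancient galaxy,
"""

sections = split_ass_sections(SAMPLE)
```

Because a new section starts at any bracketed header line, the parser does not depend on a blank line separating sections.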

1. [Script Info] Section

This section contains the metadata of the subtitle file, defining some basic information about the subtitle.

ini
[Script Info]
Title: Subtitle Title
date: 2024-01-22 14:33:00
description: 
Original Script: Subtitle Author
ScriptType: v4.00+
PlayDepth: 0
PlayResX: 1920
PlayResY: 1080
ScaledBorderAndShadow: yes
YCbCr Matrix: None
  • Title: The title of the subtitle file.
  • Original Script: The author information of the original subtitle.
  • ScriptType: Defines the script version, usually v4.00+.
  • PlayResX and PlayResY: Define the script's reference resolution. Positions, margins, and font sizes in the file are interpreted relative to this resolution and scaled to the actual video size.
  • PlayDepth: The color depth of the video, generally 0.
  • ScaledBorderAndShadow: Specifies whether to scale the outline and shadow of the subtitles according to the screen resolution. yes for scaling, no for not scaling.
  • YCbCr Matrix: Specifies the YCbCr matrix the subtitle colors were authored against, so that a renderer can adjust subtitle colors to match the video's color space. YCbCr is a color space commonly used for video encoding and decoding. None requests no adjustment.

2. [V4+ Styles] Section

This section defines the styles of the subtitles. Each style is a single line whose fields control the font, color, shadow, and other properties. The format is as follows:

ini
[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Default,Arial,20,&H00FFFFFF,&H0000FFFF,&H00000000,&H00000000,-1,0,0,0,100,100,0,0,1,1,0,2,10,10,20,1

Field Explanation:

  1. Name: The name of the style, used for referencing.

    • Example: Default, indicating this is the default style.
  2. Fontname: The font name.

    • Example: Arial, the subtitles will use the Arial font.
  3. Fontsize: The font size.

    • Example: 20, the font size is 20.
  4. PrimaryColour: The primary subtitle color, representing the main color of the subtitles (usually the color of the displayed text).

    • Example: &H00FFFFFF, white font. The color value format is &HAABBGGRR, where AA is the transparency (00 is fully opaque, FF is fully transparent).
  5. SecondaryColour: The secondary subtitle color, usually used as the transition color in karaoke subtitles.

    • Example: &H0000FFFF, yellow (in BBGGRR order: blue 00, green FF, red FF).
  6. OutlineColour: The outline color.

    • Example: &H00000000, black outline.
  7. BackColour: The background color, usually used when BorderStyle=3 (subtitles with a background box).

    • Example: &H00000000, black background.
  8. Bold: Bold setting.

    • Example: -1 indicates bold, 0 indicates non-bold.
  9. Italic: Italic setting.

    • Example: 0 indicates non-italic, -1 indicates italic.
  10. Underline: Underline setting.

    • Example: 0 indicates no underline.
  11. StrikeOut: Strikethrough setting.

    • Example: 0 indicates no strikethrough.
  12. ScaleX: Horizontal scaling ratio, 100 indicates normal ratio.

    • Example: 100, indicating no scaling.
  13. ScaleY: Vertical scaling ratio.

    • Example: 100, indicating no scaling.
  14. Spacing: Character spacing.

    • Example: 0, indicating no extra spacing.
  15. Angle: Subtitle rotation angle.

    • Example: 0, indicating no rotation.
  16. BorderStyle: Border style, defining whether the subtitles have an outline or background box.

    • Example: 1 indicates an outline but no background box, 3 indicates a background box.
  17. Outline: Outline thickness.

    • Example: 1, indicating the outline thickness is 1.
  18. Shadow: Shadow depth.

    • Example: 0, indicating no shadow.
  19. Alignment: Subtitle alignment, using numbers 1-9 to define different alignment positions.

    • Example: 2, indicating the subtitles are center-aligned.

    Alignment explanation:

    • 1: Bottom left
    • 2: Bottom center
    • 3: Bottom right
    • 4: Middle left
    • 5: Center
    • 6: Middle right
    • 7: Top left
    • 8: Top center
    • 9: Top right
  20. MarginL, MarginR, MarginV: Left, right, and vertical margins, in pixels.

    • Example: 10, 10, 20, indicating left and right margins of 10 pixels and a vertical margin of 20 pixels.
  21. Encoding: The font character set. 0 indicates ANSI encoding, 1 indicates the default character set.
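
Since the Format line names the fields and each Style line supplies the values in the same order, the two can be paired up directly. The sketch below is a minimal illustration; parse_style and the ALIGNMENT lookup table are hypothetical helpers, and all values are kept as strings.

```python
FORMAT = (
    "Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, "
    "OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, "
    "ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, "
    "MarginL, MarginR, MarginV, Encoding"
)
STYLE = (
    "Style: Default,Arial,20,&H00FFFFFF,&H0000FFFF,&H00000000,&H00000000,"
    "-1,0,0,0,100,100,0,0,1,1,0,2,10,10,20,1"
)

# Numeric alignment codes as listed in the field explanation
ALIGNMENT = {
    1: "bottom left", 2: "bottom center", 3: "bottom right",
    4: "middle left", 5: "center", 6: "middle right",
    7: "top left", 8: "top center", 9: "top right",
}


def parse_style(format_line: str, style_line: str) -> dict[str, str]:
    """Pair each field name from the Format line with its Style value."""
    names = [name.strip() for name in format_line.split(":", 1)[1].split(",")]
    values = [value.strip() for value in style_line.split(":", 1)[1].split(",")]
    if len(names) != len(values):
        raise ValueError("Style line does not match the Format line")
    return dict(zip(names, values))


style = parse_style(FORMAT, STYLE)
alignment_name = ALIGNMENT[int(style["Alignment"])]
```

Reading the field order from the Format line, instead of hard-coding it, keeps the parser working for files that declare their fields in a different order.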


3. [Events] Section

This section defines the actual subtitle events, including timestamps, subtitle content, and the styles used.

ini
[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:01.00,0:00:05.00,Default,,0,0,0,,This is the first subtitle
Dialogue: 0,0:00:06.00,0:00:10.00,Default,,0,0,0,,This is the second subtitle

Field Explanation:

  1. Layer: Layer level, controlling the stacking order of subtitles, with larger numbers indicating higher layers.

    • Example: 0, indicating the default layer.
  2. Start: Subtitle start time, in the format H:MM:SS.cc (hours:minutes:seconds.centiseconds), i.e., with two decimal digits rather than milliseconds.

    • Example: 0:00:01.00, indicating the subtitle starts at 1 second.
  3. End: Subtitle end time.

    • Example: 0:00:05.00, indicating the subtitle ends at 5 seconds.
  4. Style: The name of the subtitle style used, referencing the style defined in [V4+ Styles].

    • Example: Default, using the style named Default.
  5. Name: Optional field, usually used for character name labeling.

  6. MarginL, MarginR, MarginV: Left, right, and vertical margins of the subtitle, overriding the values defined in the style. A value of 0 keeps the margin from the style.

  7. Effect: Subtitle effect, usually used for karaoke subtitles, etc.

  8. Text: The actual content of the subtitle, which can use ASS format control codes to achieve line breaks, special styles, and positioning, etc.
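
The event fields above can be read with a short sketch like the following. It assumes the field order of the Format line shown in this section; parse_dialogue and ass_time_to_seconds are hypothetical helper names. The key detail is that Text is the last field and may itself contain commas, so the line is split at most nine times.

```python
FIELDS = [
    "Layer", "Start", "End", "Style", "Name",
    "MarginL", "MarginR", "MarginV", "Effect", "Text",
]


def ass_time_to_seconds(stamp: str) -> float:
    """Convert H:MM:SS.cc to seconds."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def parse_dialogue(line: str) -> dict:
    payload = line.split(":", 1)[1].lstrip()
    # Split at most len(FIELDS) - 1 times so commas inside Text are preserved
    values = payload.split(",", len(FIELDS) - 1)
    event = dict(zip(FIELDS, values))
    event["Start"] = ass_time_to_seconds(event["Start"])
    event["End"] = ass_time_to_seconds(event["End"])
    return event


event = parse_dialogue(
    "Dialogue: 0,0:00:01.95,0:00:04.93,Default,,0,0,0,,This is an ancient galaxy,"
)
```

In this example the trailing comma of the subtitle text survives in event["Text"], which a plain split on every comma would have dropped into an extra empty field.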


Example Subtitle Event

ini
Dialogue: 0,0:00:01.00,0:00:05.00,Default,,0,0,0,,{\pos(960,540)}This is the first subtitle
  • {\pos(960,540)}: Places the subtitle at a specific position on the screen (x = 960, y = 540). The coordinates are expressed in the script resolution defined by PlayResX and PlayResY, so with 1920 and 1080 this is the center of the frame.
  • This is the first subtitle: The actual subtitle text displayed.
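
Because \pos coordinates refer to the script resolution (PlayResX and PlayResY) rather than to the pixels of the video being played, a renderer has to scale them. The sketch below shows that arithmetic under simple assumptions: pos_in_video_pixels is a hypothetical helper, only the first \pos tag is considered, and each axis is scaled independently.

```python
import re

# \pos(x,y) with integer or decimal coordinates
POS_TAG = re.compile(r"\\pos\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)")


def pos_in_video_pixels(text, play_res, video_size):
    """Scale the first \\pos tag from script resolution to video pixels.

    play_res and video_size are (width, height) tuples.
    Returns None when the text contains no \\pos tag.
    """
    match = POS_TAG.search(text)
    if match is None:
        return None
    x, y = float(match.group(1)), float(match.group(2))
    return (x * video_size[0] / play_res[0], y * video_size[1] / play_res[1])


point = pos_in_video_pixels(
    r"{\pos(960,540)}This is the first subtitle",
    play_res=(1920, 1080),
    video_size=(1280, 720),
)
```

For a 1920x1080 script played on a 1280x720 video, the center position (960, 540) maps to (640.0, 360.0), which is again the center of the frame.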

Color Settings in ASS

&HAABBGGRR is a hexadecimal format used to represent colors. It includes the transparency of the color and the color value itself. This format is used to define the color attributes of subtitles, such as PrimaryColour, OutlineColour, and BackColour.

The meaning is as follows:

  • AA: Alpha (transparency channel), indicating the transparency of the color.
  • BB: Blue component.
  • GG: Green component.
  • RR: Red component.

The specific byte order is: Alpha (transparency) - Blue - Green - Red.

If you don't want to use transparency, you can directly ignore the value at the AA position, for example, &HBBGGRR.

Transparency and Color Values

  • Fully Opaque: The color is completely opaque, i.e., displayed at full strength. The representation is &H00BBGGRR, where the AA part is 00 (fully opaque).

    Example:

    plaintext
    &H00FFFFFF
    • Here, &H00FFFFFF represents fully opaque white. The transparency is 00 (fully opaque), and the color is FFFFFF (white). This is the value used for PrimaryColour in the Default style above.
  • Fully Transparent: The color is completely transparent, i.e., invisible. The representation is &HFFBBGGRR, where the AA part is FF (fully transparent).

    Example:

    plaintext
    &HFF000000
    • Here, &HFF000000 represents fully transparent black. The transparency is FF (fully transparent), and the color is 000000 (black).

Actual Color Examples

  1. Fully Opaque Red:

    plaintext
    &H000000FF
    • Transparency 00 (fully opaque), color 0000FF in BBGGRR order (red).
  2. Fully Transparent Green:

    plaintext
    &HFF00FF00
    • Transparency FF (fully transparent), color 00FF00 (green).
  3. Fully Opaque Blue:

    plaintext
    &H00FF0000
    • Transparency 00 (fully opaque), color FF0000 in BBGGRR order (blue).

In summary:

  • The AA part in &HAABBGGRR controls the transparency, and the BB, GG, RR parts control the color.
  • Fully Opaque: Transparency 00, for example, &H000000FF represents fully opaque red.
  • Fully Transparent: Transparency FF, for example, &HFF0000FF represents fully transparent red.
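
The byte order is easy to get wrong, so here is a minimal sketch that converts between ASS color strings and RGB components. parse_ass_color and to_ass_color are hypothetical helper names. The sketch follows the ASS convention that the alpha byte measures transparency (00 is fully opaque, FF is fully transparent) and treats a missing alpha byte as 00.

```python
def parse_ass_color(value: str) -> tuple[int, int, int, int]:
    """Return (red, green, blue, alpha) from &HAABBGGRR or &HBBGGRR."""
    digits = value.strip().rstrip("&")
    if not digits.upper().startswith("&H"):
        raise ValueError(f"not an ASS color: {value!r}")
    number = int(digits[2:], 16)
    red = number & 0xFF             # lowest byte
    green = (number >> 8) & 0xFF
    blue = (number >> 16) & 0xFF
    alpha = (number >> 24) & 0xFF   # highest byte; 0 when omitted
    return red, green, blue, alpha


def to_ass_color(red: int, green: int, blue: int, alpha: int = 0) -> str:
    """Format RGB components (plus ASS alpha) as &HAABBGGRR."""
    return f"&H{alpha:02X}{blue:02X}{green:02X}{red:02X}"


yellow = parse_ass_color("&H0000FFFF")
red_text = to_ass_color(255, 0, 0)
```

Parsing the value as one integer and masking out each byte also handles the short forms that appear in generated files, such as &H0 or &hffffff.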