subtitles

Subtitles: .vtt (WebVTT) format

You can convert .vtt subtitles to .srt with ffmpeg:

ffmpeg.exe -i subtitles.en.vtt subtitles.en.srt
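
When there are several .vtt files, the same conversion can be put in a loop. This is only a sketch: the helper name srt_name is made up for illustration, and -n is passed so that ffmpeg refuses to overwrite an existing .srt file.

```shell
# Derive the output name: subtitles.en.vtt -> subtitles.en.srt
srt_name() {
    printf '%s\n' "${1%.vtt}.srt"
}

# Convert every .vtt file in the current directory.
for f in *.vtt; do
    [ -e "$f" ] || continue          # the glob matched nothing
    ffmpeg -n -i "$f" "$(srt_name "$f")"
done
```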

Extract subtitles from mp4 file

Identify which track contains the subtitles:

$ ffmpeg -i video.mp4
...
    Stream #0:1(und): Audio: [...]
    Stream #0:2(eng): Subtitle: mov_text (tx3g / 0x67337874), 0 kb/s (default)
    Metadata:
      handler_name    : SubtitleHandler
    Stream #0:3: Video:  [...]

Then extract that stream, replacing the 0:2 after -map with the stream identifier found in the output above:

ffmpeg -i video.mp4 -map 0:2 -codec:s srt video.srt
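
In a file with many streams, the subtitle lines are easy to miss. The filter below is a rough sketch that keeps only the subtitle streams from a listing like the one above; the function name subtitle_streams is hypothetical, and it assumes the "Stream #X:Y ... Subtitle: codec" layout shown in that listing.

```shell
# Read ffmpeg's stream listing on stdin and print "X:Y codec"
# for each subtitle stream.
subtitle_streams() {
    grep -E 'Stream #[0-9]+:[0-9]+.*Subtitle:' |
    sed -E 's/^ *Stream #([0-9]+:[0-9]+).*Subtitle: *([^ ,]+).*/\1 \2/'
}

# Usage (ffmpeg prints the listing on stderr, hence 2>&1):
#   ffmpeg -i video.mp4 2>&1 | subtitle_streams
```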

Chinese subtitles not shown correctly

On video web sites such as ted.com, Chinese characters may not be displayed properly. Instead, you see white rectangles (aka "Mahjong tiles" or "tofu").

E.g. bug report for the chromium browser:
The subtitiles of the Ted website video are filled with tofu.
https://bugs.chromium.org/p/chromium/issues/detail?id=542516

Mplayer and VLC problems with unicode Chinese subtitles
https://bugs.launchpad.net/ubuntu/+source/mplayer/+bug/589576

  • 1. Augustin TODO: upload ted-chinese-subtitles-ok.png, ted-chinese-subtitles-mahjong-tiles.png

SUB: Could not determine file format

Symptoms

Sometimes, mplayer won't recognize the format of a subtitle file.
The console then shows:

SUB: Could not determine file format
Cannot load subtitles subtitle_file.srt.

Causes and solutions

Wrong encoding

For example the .srt file is encoded in UTF-16 when the system expects UTF-8.

A simple way to check the file encoding is to run:

$ cat -t subtitle_file.srt

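If you get some garbage output like:

^@8^@9^@3^@

then the file is most likely encoded in UTF-16 (each ^@ is a NUL byte) and has to be converted to UTF-8.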
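
A conversion with iconv could look like the sketch below. It assumes the input really is UTF-16 with a byte order mark, so that iconv can work out the byte order; the function name to_utf8 and the output file name are placeholders.

```shell
# Convert a UTF-16 subtitle file ($1) to UTF-8, written to $2.
to_utf8() {
    iconv -f UTF-16 -t UTF-8 "$1" > "$2"
}

# Usage:
#   to_utf8 subtitle_file.srt subtitle_file.utf8.srt
```

Writing to a second file keeps the original intact in case the guess about the encoding was wrong.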

What is S_HDMV/PGS?

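Extracting subtitles from mkv file

I have tried extracting some subtitles, but the resulting file does not work. mkvinfo reports the track's codec ID as S_HDMV/PGS, which is the Blu-ray Presentation Graphic Stream format: the subtitles are stored as bitmap images rather than text, so the extracted track cannot be used as a plain-text .srt file.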

+ EBML head
|+ EBML version: 1
|+ EBML read version: 1
|+ EBML maximum ID length: 4
|+ EBML maximum size length: 8
|+ Doc type: matroska
|+ Doc type version: 4
|+ Doc type read version: 2

| + A track
| + Track number: 14 (track ID for mkvmerge & mkvextract: 13)
| + Track UID: 23
| + Track type: subtitles
| + Default flag: 0
| + Lacing flag: 0
| + Codec ID: S_HDMV/PGS
| + Language: chi
| + Content encodings
| + Content encoding
| + Content compression

Extract subtitles from mkv file

Install mkvtoolnix.
http://www.bunkus.org/videotools/mkvtoolnix

Use mkvinfo to find which track contains the subtitles you want to extract:

$ mkvinfo movie.mkv
...
| + A track
|  + Track number: 3
|  + Track UID: 1234
|  + Track type: subtitles
....

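Then pass the track to mkvextract. If your version of mkvinfo also prints a "track ID for mkvmerge & mkvextract" next to the track number, use that ID instead of the track number: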

mkvextract tracks movie.mkv 3:movie.srt
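
For files with many tracks, the IDs of the subtitle tracks can be filtered out of the mkvinfo output. This is a sketch that assumes the output format shown on this page, where the track number line carries "(track ID for mkvmerge & mkvextract: N)"; the function name subtitle_track_ids is made up, and it prints nothing if that ID is not shown.

```shell
# Read mkvinfo output on stdin and print the mkvextract track ID
# of every track whose type is "subtitles".
subtitle_track_ids() {
    awk '
        /Track number:/ {
            id = ""
            # "mkvextract: " is 12 characters long; keep only the digits after it
            if (match($0, /mkvextract: [0-9]+/))
                id = substr($0, RSTART + 12, RLENGTH - 12)
        }
        /Track type: subtitles/ && id != "" { print id }
    '
}

# Usage:
#   mkvinfo movie.mkv | subtitle_track_ids
```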

pysrt: no option to merge two .srt files

pysrt has an option to split an .srt file, but the opposite operation, merging two .srt files, is not possible.

The wiki page shows a method to do so, but the numbering of the entries is wrong: in the middle, where the second half starts, the numbering starts at 1 again.

1
00:00:02,002 --> 00:00:24,441
...

2
00:00:43,210 --> 00:00:47,214
...

1
01:04:04,667 --> 01:04:06,085
...

2
01:04:08,629 --> 01:04:11,966
...

How to merge two .srt files

Suppose we have two subtitle files: CD1.srt and CD2.srt. We want to merge them into one single file CD.srt.

We assume that CD1.srt is properly synchronized, so only the times in CD2.srt need to be shifted.
The duration to add to CD2.srt is the running time of the first part; there are two ways to find it.

Method 1:
ffmpeg -i CD1.avi
Check the Duration line in the output.

Method 2:
Simply watch the movie and fast-forward to the beginning of CD2. Note the time at which the first line of CD2.srt appears, then subtract the time that CD2.srt gives for that line. The difference is the duration to add.
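
For Method 1, the Duration line can be turned into milliseconds with a small filter. This is a sketch under the assumption that ffmpeg prints the line as "Duration: HH:MM:SS.ff, ..."; the function name duration_ms is hypothetical.

```shell
# Read ffmpeg output on stdin and print the duration in milliseconds.
duration_ms() {
    sed -n -E 's/^ *Duration: ([0-9]+):([0-9]+):([0-9]+)\.([0-9]+).*/\1 \2 \3 \4/p' |
    awk '{
        # pad the fractional part to three digits so that "02" means 20 ms
        printf "%d\n", (($1 * 60 + $2) * 60 + $3) * 1000 + substr($4 "000", 1, 3)
    }'
}

# Usage (ffmpeg prints the file information on stderr):
#   ffmpeg -i CD1.avi 2>&1 | duration_ms
```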

Install dev-python/pysrt, then shift the times in CD2.srt by that duration before joining it to CD1.srt.
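
For comparison, the whole merge can also be sketched without pysrt, in plain awk: every time in the second file is shifted, the entries are appended to those of the first file, and all entries are renumbered from 1. The function name merge_srt and its millisecond offset argument are assumptions made for this example, and the script expects well-formed files with LF line endings and blank lines between entries.

```shell
# usage: merge_srt OFFSET_MS first.srt second.srt > merged.srt
merge_srt() {
    awk -v offset="$1" '
        # "HH:MM:SS,mmm" -> milliseconds
        function to_ms(ts,   p) {
            split(ts, p, /[:,]/)
            return ((p[1] * 60 + p[2]) * 60 + p[3]) * 1000 + p[4]
        }
        # milliseconds -> "HH:MM:SS,mmm"
        function to_ts(ms,   s) {
            s = int(ms / 1000)
            return sprintf("%02d:%02d:%02d,%03d", int(s / 3600), int(s / 60) % 60, s % 60, ms % 1000)
        }
        BEGIN { RS = ""; FS = "\n" }     # one blank-line separated entry per record
        FNR == 1 { nfile++ }             # first entry of the next input file
        {
            split($2, times, / --> /)
            delta = (nfile == 2) ? offset : 0    # only the second file is shifted
            # the original index in $1 is dropped and replaced by a running counter
            printf "%d\n%s --> %s\n", ++n, to_ts(to_ms(times[1]) + delta), to_ts(to_ms(times[2]) + delta)
            for (i = 3; i <= NF; i++) print $i
            print ""
        }
    ' "$2" "$3"
}

# Usage, with an offset of 1 h 04 min (3840000 ms):
#   merge_srt 3840000 CD1.srt CD2.srt > CD.srt
```

Renumbering with a single counter avoids the problem described above, where the numbering restarts at 1 in the middle of the merged file.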

Subtitle editors

About subtitles

This page is about subtitle editors. For other subtitle-related issues, check the wiki page of your video player. For mplayer, check:
http://linux.overshoot.tv/wiki/multimedia/mplayer

subtitleeditor

http://linux.overshoot.tv/media-video/subtitleeditor
A gtk-based subtitle editor, currently available in repositories.

GStreamer plugins

You may get the following message:
GStreamer plugins missing.
The playback of this movie requires the following decoders which are not installed:
XVID MPEG-4 decoder
