Removing formatting from srt files
This page describes how to remove formatting, time codes, and closed captions from a .srt subtitle file.
subtitleeditor
subtitleeditor has an option to export as plain text. Simply open the .srt file with subtitleeditor then go to File >> Export >> Export Plain Text.
Pros:
* It's very easy!
Cons:
* It does not strip tags such as <i>.
* Dialogue written on multiple lines within the same time code is kept on separate lines (in some instances, this can be considered a pro).
* No blank line is left between the dialogue of consecutive time codes.
E.g.
15
00:01:37,460 --> 00:01:41,190
Keep going
till you <i>smell money</i>
or step in chocolate.

16
00:01:42,800 --> 00:01:45,230
Okay. Thank you.
will be output as:
Keep going
till you <i>smell money</i>
or step in chocolate.
Okay. Thank you.
Equivalent CLI command with sed
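Note that the following one-line sed command does almost exactly the same as the above; the only difference is that a blank line is left between dialogue from consecutive time codes. The address /^[0-9]+$/ matches each cue-number line, N appends the next line (the time code) to the pattern space, and d deletes both: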
sed -r '/^[0-9]+$/{N;d}' subtitles.srt > dialogue.txt
outputs:
Keep going
till you <i>smell money</i>
or step in chocolate.

Okay. Thank you.
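Like subtitleeditor, the command above leaves tags such as <i> in place. The following is a minimal sketch of one way to strip them as well, assuming GNU sed (needed for \r in the pattern). The function name srt_strip is hypothetical and only used for illustration. It adds two expressions: the first drops a trailing carriage return, because files with CRLF line endings would otherwise keep /^[0-9]+$/ from matching, and the last removes anything of the form <...>.

```shell
# Hypothetical helper (not part of sed or subtitleeditor).
# Assumes GNU sed; -E is the portable spelling of -r.
srt_strip() {
  sed -E \
    -e 's/\r$//' \
    -e '/^[0-9]+$/{N;d;}' \
    -e 's/<[^>]*>//g' \
    "$1"
}
```

It could be used as srt_strip subtitles.srt > dialogue.txt. As with the original one-liner, a dialogue line consisting only of digits would be mistaken for a cue number and removed together with the line after it.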
Custom script
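A minimal sketch of what such a script could look like, assuming a POSIX shell with tr and awk available. The function name srt_join is hypothetical. Unlike the sed command, it puts all dialogue from one time code on a single line and strips tags. It relies on awk's paragraph mode (RS set to the empty string), so each block separated by a blank line is read as one record whose first two lines are the cue number and the time code.

```shell
# Hypothetical helper: prints one line of plain dialogue per time code.
srt_join() {
  # Remove carriage returns first so CRLF files still have truly empty
  # separator lines, which paragraph mode depends on.
  tr -d '\r' < "$1" | awk 'BEGIN { RS = ""; FS = "\n" }
  {
    line = ""
    # Fields 1 and 2 are the cue number and the time code; the rest is dialogue.
    for (i = 3; i <= NF; i++) {
      line = line (i > 3 ? " " : "") $i
    }
    gsub(/<[^>]*>/, "", line)   # drop tags such as <i> and </i>
    if (line != "") print line
  }'
}
```

It could be used as srt_join subtitles.srt > dialogue.txt. Joining with a space is a design choice; changing the " " separator to "\n" would keep multi-line dialogue on separate lines instead.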
Issues related to this page:
Project | Summary | Status | Priority | Category | Last updated | Assigned to |
---|---|---|---|---|---|---|
Linux software | script to remove time codes from srt file | active | normal | feature request | 8 years 43 weeks | |
Programming | Regex: what does {N;d} mean? | active | normal | support request | 8 years 43 weeks | |