..

Changing timestamps in a VTT file with Python

Automation is one of the best reasons to learn to code, which is why Al Sweigart’s Automate the Boring Stuff with Python is such a great introduction to programming, especially for office workers without a CS degree. Completing about half of the course already armed me with enough knowledge to automate some seriously dull, repetitive tasks such as:

  • Converting file formats, e.g. M4A to MP3 files;
  • Finding hyperlinks in text files and opening them in the browser;
  • Naming files according to a convention, e.g. containing the date.

But in addition to automation, text manipulation is another great reason to learn even rudimentary coding. Say you have a Word doc open and you’re working on a research paper. It turns out some of the math you did was incorrect, say by a factor of 5. That’s not so bad if there’s two or three statistics to correct. But what if you’re working on something much more involved? What if you could perform math on all the integers in your file?

Here’s an example of a problem I recently solved using the kind of text manipulation Python excels at:

Good captions are a must for accessible video uploads. If you hire a captioning service for your videos, once your video is done, you’ll get what’s called a ‘WebVTT’ file along with your recording. It’s quite simple: A plain text file with time stamps, followed by a caption in the next line:

WEBVTT


00:00:00.951 --> 00:00:05.751
Here are some captions.

Adding this file to your YouTube upload is simple. Just upload the VTT when prompted by YouTube Studio, and be sure to check off the ‘with timings’ option.

Simple, right? But what if you need to edit your video before uploading? Say you need to chop off the first ten minutes. Easy enough to do with any video editing software. But if your video editing software doesn’t support importing and exporting VTT files, you now have a caption file that’s totally out of sync.

So here’s our problem: We’ve got a set of timestamps that require we perform some basic arithmetic. We need to replace the timestamp with the modified timestamp. And we need to loop through the file to take care of all the timestamps.

A caption file easily includes thousands of lines, so doing that manually is out of the question. You could potentially use a paid service, or even a free online tool. But a paid service would take time and money, and who knows what’s being done with your data when you use a free online service?

This is a really niche problem. But no matter how niche, nine times out of ten I’ve found there’s no problem too niche that someone hasn’t uploaded a script in Python online to deal with it. And indeed, it didn’t take long on DuckDuckGo to find an excellent resource: A WebVTT timestamp adjustment function written by Nat Dunn. See that script here.

How does it work? You’ll notice the VTT file timestamps contain an arrow, ‘–>’, so the lines with numbers we need to change are easily identifiable. Nat’s script loops through the file, does nothing to lines without the arrow, and converts timestamps in lines with the arrow to timedelta objects. You can do your math on that object - say, strip 300 seconds - and spit out the updated timestamp. Very cool trick.

This script worked for me in a pinch, but there was a slight problem: It spit out timestamps with three trailing zeros. YouTube didn’t like this. In a pinch I performed a global find/replace to strip out those three zeros in my text editor but that’s not a good solution. What if your captioner used the number ‘1,000,000’ during transcription? If you delete all groups of ‘000’ you’re now left with ‘1,,’.

The solution was simple - you can see my modified version of the script at my Github here. I tweaked the script to delete the last three characters of the time stamps:

vtt_output += f"{start_time[:-3]} --> {end_time[:-3]}\n"

And for the sake of convenience, instead of writing to a new file, I used the pyperclip module to copy the text of the new VTT file straight to my clipboard instead of writing to a new file. Now I could paste the output to a new file, and delete the caption text I don’t need.

This script could be much more sophisticated, e.g. it could be configured to output a new file that ignores the captions you don’t need. But I don’t work with these kinds of files often enough to really need anything more complicated than what I’ve got here.

So that’s how I solved my VTT file issue within about half an hour. I realize this is a super niche problem, but it’s just the sort of problem office workers tend to encounter constantly. It’s also the sort of problem that usually require a time-consuming solution: either waiting for someone else to fix the it or having to spend hours manually fixing the problem yourself.

Sometimes I wonder if learning to code was worth the time investment. Have I really saved time in the long run by automating repetitive tasks, given the hundreds of hours I’ve invested in learning Python so far? An objective comparison to the counterfactual situation where I never learned to code is impossible.

But the above example shows there are many great reasons to learn to code! At heart, programming puts you in control, and the more you invest in mastering a programming language, the more problems you can tackle. There’s really no feeling quite like it.