Subtle Subtitles

Arthur C. Clarke's tediously repeated claim that advanced technology seems like black magic is perhaps best appreciated by realizing that, although a thing must work somehow, we've really never bothered to look under the hood, or even wonder how it is possible. We just accept things as we see them.

Take subtitling on television and videos as an example. I vaguely remember seeing brief indicators about "Closed-captioned for the hearing impaired", perhaps 15 or 20 years ago. Then I saw in bars, where you couldn't hear a thing, that there were subtitles on the TV shows so you wouldn't miss a gem from a post-game interview. And then I saw a "caption" button on the remote control for my new TV. One day, I figured out how it worked.

Since then, I've wondered what's going on. Who does the subtitling? It's not done for every show or movie. And clearly, some people are more careful than others. In some movies, the subtitler inserts comments about noises, indicates that the speaker is offscreen, mentions the music playing and even gives the lyrics. In others, like a badly dubbed movie, the character will speak 20 words, but only seven will show up on the subtitle, occasionally merely paraphrased.

One of the most amazing things is how a live show can be subtitled. You can see the words blurt out spasmodically, trying to keep up, with typos and occasional spasms of gibberish. My best guess is that this is done by very, very good stenographers. You might think such people don't exist, but they do. By law, there must be a record of each legal proceeding. As far as I know, no one trusts videotape or audiotape. Instead, a person sits in the front of the court with a special machine and types really, really fast. Actually, they don't type the way you or I might. Instead, each time they press down, they are pressing down SEVERAL letter keys at once, and in general, these several keys are the code for an entire word. Vowels are ignored. Naturally, you would hope that the code for a word would resemble the word itself. We would hope that "dog" is spelled "DG", but then how do you spell "God"? You have to press the same two keys, and keys pressed together have no ordering.
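The "dog" versus "God" problem can be made concrete with a toy model. The sketch below assumes, as described above, that a stroke is simply the unordered set of a word's consonants; real stenotype machines use their own key layout and conventions, so the names chord and collisions are purely illustrative.

```python
VOWELS = set("aeiou")

def chord(word):
    """Toy model: the unordered set of consonant keys pressed for a word."""
    return frozenset(ch for ch in word.lower() if ch.isalpha() and ch not in VOWELS)

def collisions(words):
    """Group words by chord and keep only the chords shared by several words."""
    groups = {}
    for w in words:
        groups.setdefault(chord(w), []).append(w)
    return {keys: ws for keys, ws in groups.items() if len(ws) > 1}

# "dog" and "God" both reduce to the keys D and G pressed together.
print(collisions(["dog", "God", "cat"]))
```

Because a set has no order, "DG" and "GD" are the same stroke, which is exactly why some extra rule is needed to tell the two words apart.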

The rules for translating words into a code at high speed, and for retranslating them into English so that a readable transcript can be prepared, are known as your "theory". It's something you learn in training, but there is no universally accepted theory. Instead, what you learn from your trainers is a little different from what they taught last year, and from what people teach over in the next city, and as you get better yourself, you come up with different shortcuts and a personalized theory of your own. In other words, if the court stenographer drops dead before making the readable transcript, good luck! Better find someone who went to the same school as them!

Naturally, a whole computerized system has developed for attempting to make a first pass at converting the stenographic record back into English. For obvious reasons, it is highly imperfect, and requires the original stenographer to do careful editing. Still, I assume that this is the system employed by real-time TV captioners: they sit there with a version of a stenographic machine that they are comfortable with, press down keys like mad, and whatever they press goes not to paper tape but to a translator that spits out its best guess for the intended word.
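My guess at what such a translator does can be sketched in a few lines. This is only an illustration under assumptions: the dictionary PERSONAL_THEORY and the function translate are made up, and a real captioning system is surely more elaborate. Each stroke is looked up as an unordered set of keys, and anything the dictionary doesn't know is passed through in brackets, which would show up on screen as gibberish.

```python
# Hypothetical personal dictionary: unordered key sets mapped to one best guess.
PERSONAL_THEORY = {
    frozenset("th"): "the",
    frozenset("dg"): "dog",
    frozenset("brks"): "barks",
}

def translate(strokes, dictionary):
    """Turn a list of strokes into text, one best guess per stroke."""
    words = []
    for stroke in strokes:
        keys = frozenset(stroke.lower())
        # Unknown strokes are emitted raw so an editor can fix them later.
        words.append(dictionary.get(keys, "[" + stroke.upper() + "]"))
    return " ".join(words)

print(translate(["TH", "DG", "BRKS"], PERSONAL_THEORY))
print(translate(["TH", "GD", "ZZ"], PERSONAL_THEORY))
```

In the second call, the stenographer may have meant "God", but the translator can only offer its single stored guess for those two keys, and the unknown stroke comes out as raw letters. That is the kind of output that needs a human editor afterward.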

Surely the situation is more relaxed when, say, the captions for a movie are being prepared. (I have noticed, in particular, that in a movie like The Grapes of Wrath, the captions will be carefully abbreviated. Certain phrases will be dropped, or in some cases, actually replaced by an equivalent but shorter rewording.) Nonetheless, time is money, so presumably a similar setup is used, only a limited time is available, and while the stenographers do have a chance to edit their work, they are liable to miss some glaring mistakes.

So just when you start to think you're witnessing another case of Arthur C. Clarke's black magic, little things go wrong that give you a laugh, and help you make a guess as to how things work. Clearly, some subtitlers don't check their work, don't listen carefully, get fumbly fingers, or use some kind of automatic abbreviation system that can go terribly wrong.

Here are a few instances that I have seen:

Last modified on 08 July 2014.