Moviegoer: Subtitle Features — Data Cleaning

This is part of a series describing the development of Moviegoer, a multi-disciplinary data science project with the lofty goal of teaching machines how to “watch” movies and interpret emotion and antecedents (behavioral cause/effect).

Subtitles are not traditionally part of the filmmaking process. They do not appear when a film is screened in movie theaters; they are added only after production and localized into different languages for home video or streaming services. There is little creativity or decision-making in their creation: they’re simply a ground-truth…