Moviegoer: Data Serialization

Tim Lee
3 min read · Oct 25, 2020


This is part of a series describing the development of Moviegoer, a multi-disciplinary data science project with the lofty goal of teaching machines how to “watch” movies and interpret emotion and antecedents (behavioral cause/effect).

The majority of the action in Knives Out (2019) takes place at an estate. There are frequent establishing shots of the estate, so we can check whether these shots are reused throughout the film.

We’ve just begun combining the clues we learn from a film’s visuals, audio, and subtitles to understand it as a whole. In the last post, we demonstrated how we could look for characters’ self-introductions to build composite facial encodings, then identify those characters throughout the entire film. We did this somewhat manually, so now it’s time to be more efficient about how we look up and store data. This is one of the first steps in automating the analysis of film data, and a precursor to building a prototype.

Treating this as a data science exercise, we’re building a pipeline to automatically preprocess the data and store the results as serialized pickle files for later use. As an analogy: previously, we had the machine fast-forward and rewind through the film to look for clues. Going forward, we’ll have the machine watch the film a single time and commit what it sees to memory.

Currently, we have three subtitle dataframes and two visual dataframes (audio dataframes coming soon!). The subtitle dataframes (srt_df and subtitle_df among them) are easy enough to generate, since they only involve parsing subtitle text. The two visual dataframes, face_df and vision_df, are computationally expensive, because they require detecting faces and calculating image information from every extracted frame.
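
Since the subtitle side is just text parsing, a few lines of pandas cover it. Here’s a minimal sketch of how something like srt_df could be built — pysrt and the column names are illustrative assumptions, not necessarily the project’s actual code:

```python
import pandas as pd
import pysrt  # assumption: any .srt parser would do; pysrt is one option

def build_srt_df(srt_path):
    """Parse a subtitle file into a dataframe, one row per subtitle event."""
    subs = pysrt.open(srt_path)
    rows = [{'start_time': sub.start.ordinal / 1000,  # milliseconds -> seconds
             'end_time': sub.end.ordinal / 1000,
             'text': sub.text.replace('\n', ' ')}
            for sub in subs]
    return pd.DataFrame(rows)
```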

To be more efficient, we’ll go through all of the film’s frames a single time and calculate all the facial and visual information in one shot. Then we’ll serialize these dataframes as pickle files. We can then load them any time we want, without having to repeat the calculations. This will save us a lot of time (and computational power).
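
Here’s a sketch of what that single pass might look like. The cv2/face_recognition calls, the one-frame-per-second sampling, and the column choices are my assumptions for illustration, not the project’s exact pipeline:

```python
import cv2
import face_recognition  # assumption: a common choice for face encodings
import pandas as pd

def preprocess_film(video_path, film_id):
    """Single pass through the film: sample one frame per second,
    calculate facial and visual data, then serialize the dataframes."""
    cap = cv2.VideoCapture(video_path)
    fps = int(cap.get(cv2.CAP_PROP_FPS))
    face_rows, vision_rows = [], []
    frame_num = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if frame_num % fps == 0:               # one sampled frame per second
            second = frame_num // fps
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            # facial information -> face_df
            locations = face_recognition.face_locations(rgb)
            encodings = face_recognition.face_encodings(rgb, locations)
            for loc, enc in zip(locations, encodings):
                face_rows.append({'second': second, 'location': loc,
                                  'encoding': enc})
            # general image information -> vision_df (brightness as a stand-in)
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            vision_rows.append({'second': second, 'brightness': gray.mean()})
        frame_num += 1
    cap.release()
    # serialize once; reload instantly later with pd.read_pickle()
    pd.DataFrame(face_rows).to_pickle(f'{film_id}_face_df.pkl')
    pd.DataFrame(vision_rows).to_pickle(f'{film_id}_vision_df.pkl')
```

After that one pass, reloading is a one-liner (pd.read_pickle) rather than a re-run of face detection over the whole film.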

We can calculate shot clusters and the progression/changes of shots throughout the film.

As a result of this efficiency, we can now effectively cluster facial and shot information across the entire film. This means we can identify every moment a specific face is onscreen, anywhere in the film. We can also detect every shot change, like a cut from one character to another. These types of clustering are a huge help in breaking down the structure of the film, such as identifying specific scenes and which characters appear in them. This will be the focus of our next effort.
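
As a sketch of what that clustering could look like — DBSCAN, the thresholds, and the filenames here are my choices for illustration, not necessarily the project’s:

```python
import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN

# group face encodings into identities: every row in a cluster is the
# same face onscreen at a different moment (-1 marks unmatched noise)
face_df = pd.read_pickle('knives_out_face_df.pkl')      # hypothetical filename
encodings = np.stack(face_df['encoding'].tolist())
face_df['face_cluster'] = DBSCAN(eps=0.5, min_samples=3).fit_predict(encodings)

# flag shot changes: a large jump in image data between consecutive
# sampled frames suggests a cut (a crude brightness-based heuristic)
vision_df = pd.read_pickle('knives_out_vision_df.pkl')
vision_df['shot_change'] = vision_df['brightness'].diff().abs() > 20
```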

Wanna see more?
