Moviegoer: Key Dialogue

Tim Lee
3 min read · Dec 21, 2020

This is part of a series describing the development of Moviegoer, a multi-disciplinary data science project with the lofty goal of teaching machines how to “watch” movies and interpret emotion and antecedents (behavioral cause/effect).

In Rogue One (2016), Jyn declares, in the first person, her interpretation of events

We can identify the beginnings and ends of scenes, which means we can isolate each scene’s dialogue and analyze it as a single, self-contained conversation. This is most easily done by looking at the subtitles, which are ground-truth transcriptions of the dialogue, so no audio recognition is required. We’ve previously designed a system to load a film’s .srt subtitle file into pandas dataframes using the Python library pysrt. We’ll use the Python library spaCy to conduct NLP analysis.

Our goal is to find key dialogue: sentences that are potentially important in terms of character information. We have a list of sentences from a two-character dialogue scene, and we can go through each one to determine whether it contains key dialogue. The code is available here.

First-Person Declarations

First, we’ll want to identify first-person declarations, in which a character states something about themselves using “I”. Examples include:

  • “I can’t handle a big speech right now.”
  • “But I love you.”
  • “I pushed you away because I’m dumb.”

We look for sentences where the subject (the token whose dependency label is “nsubj”) is “I”.

These are important because a character might be declaring something of personal note. For example, the first-person declaration “I’m bringing an extra jacket” tells us more about the character than the general statement “It’s cold out”. We learn that the character personally feels cold, which is more specific than knowing it’s cold outside.

In Parasite (2019), a first-person declaration
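
The subject check described above can be sketched in a few lines. This is a minimal sketch rather than the project’s actual code: the function name `is_first_person_declaration` and the stand-in `Token` tuple are hypothetical, and the hand-written dependency labels only imitate what a spaCy parse would typically assign. The function reads just the `text` and `dep_` attributes, which real spaCy tokens also expose, so a parsed `Doc` could be passed in directly.

```python
from collections import namedtuple

# Hypothetical stand-in for a spaCy Token, carrying only the two
# attributes the check reads: the token text and its dependency label.
Token = namedtuple("Token", ["text", "dep_"])


def is_first_person_declaration(tokens):
    """Return True if any nominal subject ("nsubj") in the sentence is "I"."""
    return any(tok.dep_ == "nsubj" and tok.text.lower() == "i" for tok in tokens)


# Hand-labeled tokens for "But I love you." (labels are illustrative).
declaration = [
    Token("But", "cc"),
    Token("I", "nsubj"),
    Token("love", "ROOT"),
    Token("you", "dobj"),
    Token(".", "punct"),
]

# Hand-labeled tokens for "It's cold out." (labels are illustrative).
general_statement = [
    Token("It", "nsubj"),
    Token("'s", "ROOT"),
    Token("cold", "acomp"),
    Token("out", "advmod"),
    Token(".", "punct"),
]

print(is_first_person_declaration(declaration))        # True
print(is_first_person_declaration(general_statement))  # False

# With spaCy and an English model installed, the same call should work on
# a parsed Doc (the model name below is an assumption):
#   import spacy
#   nlp = spacy.load("en_core_web_sm")
#   is_first_person_declaration(nlp("I can't handle a big speech right now."))
```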

Second-Person Addresses

Next, we’ll want to look for second-person addresses, where one character speaks to another, addressing them directly as “you”. Examples include:

  • “You were right.”
  • “I think you should leave.”
  • “It’s because you’re lonely.”

We look for sentences where the subject (the token whose dependency label is “nsubj”) is “you”.

These are important because, in this two-character scene, the sentence spoken by one character directly pertains to the other. It may be either a subjective assessment or an objective fact, but it’s information about one of the scene’s characters, coming directly from the other character.

In Knives Out (2019), a second-person address
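
A second-person check can follow the same pattern. The sketch below is an illustration under stated assumptions, not the project’s code: `subject_pronouns`, `is_second_person_address`, and the stand-in `Token` tuple are hypothetical names, and the dependency labels are written by hand to resemble a typical parse. It collects every pronoun that appears as a subject, which shows that in “I think you should leave.” the word “you” is the subject of an embedded clause, so under the rule as described the sentence can match both the first-person and the second-person category.

```python
from collections import namedtuple

# Hypothetical stand-in for a spaCy Token (text plus dependency label).
Token = namedtuple("Token", ["text", "dep_"])


def subject_pronouns(tokens, pronouns=("i", "you")):
    """Return the set of listed pronouns that occur as a nominal subject."""
    return {
        tok.text.lower()
        for tok in tokens
        if tok.dep_ == "nsubj" and tok.text.lower() in pronouns
    }


def is_second_person_address(tokens):
    """Return True if "you" is a nominal subject anywhere in the sentence."""
    return "you" in subject_pronouns(tokens)


# Hand-labeled tokens for "You were right." (labels are illustrative).
you_were_right = [
    Token("You", "nsubj"),
    Token("were", "ROOT"),
    Token("right", "acomp"),
    Token(".", "punct"),
]

# Hand-labeled tokens for "I think you should leave."
# "I" is the subject of "think"; "you" is the subject of "leave".
think_you_should_leave = [
    Token("I", "nsubj"),
    Token("think", "ROOT"),
    Token("you", "nsubj"),
    Token("should", "aux"),
    Token("leave", "ccomp"),
    Token(".", "punct"),
]

print(is_second_person_address(you_were_right))          # True
print(sorted(subject_pronouns(think_you_should_leave)))  # ['i', 'you']
```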

Directed Questions

Finally, we’ll look for directed questions, where one character asks the other a question, addressing them as “you”. Examples include:

  • “How are you?”
  • “What are you doing here?”
  • “What did you study?”

We look for sentences that end in a question mark and where the subject (the token whose dependency label is “nsubj”) is “you”.

These are very informative because they elicit a personal response from the other character. We can also store the response to each question. In this example from Lost in Translation (2003), we learn more about the main characters as they converse for the first time. They learn about each other by asking personal, directed questions.

In Lost in Translation (2003), a directed question
And the response
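
Combining the two conditions for a directed question, and keeping track of the line that follows each one, might look like the sketch below. Several things here are assumptions rather than details from the project: the names `is_directed_question` and `question_response_pairs`, the stand-in `Token` tuple, the hand-written dependency labels, and above all the rule that the response is simply the next sentence in the scene. The one-word response in the sample scene is made up for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for a spaCy Token (text plus dependency label).
Token = namedtuple("Token", ["text", "dep_"])


def is_directed_question(tokens):
    """True if the sentence ends in "?" and has "you" as a nominal subject."""
    tokens = list(tokens)
    if not tokens or tokens[-1].text != "?":
        return False
    return any(t.dep_ == "nsubj" and t.text.lower() == "you" for t in tokens)


def question_response_pairs(scene):
    """Return (question_index, response_index) pairs for a scene.

    Assumption: the response is the sentence immediately after the question.
    A question on the last line of the scene has no response and is skipped.
    """
    return [
        (i, i + 1)
        for i in range(len(scene) - 1)
        if is_directed_question(scene[i])
    ]


# Each sentence is a list of hand-labeled tokens (labels are illustrative).
scene = [
    # "What did you study?"
    [Token("What", "dobj"), Token("did", "aux"), Token("you", "nsubj"),
     Token("study", "ROOT"), Token("?", "punct")],
    # "Philosophy." (invented response)
    [Token("Philosophy", "ROOT"), Token(".", "punct")],
    # "You were right." (second-person address, but not a question)
    [Token("You", "nsubj"), Token("were", "ROOT"), Token("right", "acomp"),
     Token(".", "punct")],
    # "How are you?" (last line of the scene, so nothing follows it)
    [Token("How", "advmod"), Token("are", "ROOT"), Token("you", "nsubj"),
     Token("?", "punct")],
]

print(question_response_pairs(scene))  # [(0, 1)]
```

The index pairs could then be used to look up the matching rows in the subtitle dataframe. Taking the next sentence as the response is a simplification, since the same character may keep talking after asking the question.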
