This research report examines the occurrence of listener visual cues during nonunderstanding episodes and investigates raters’ sensitivity to those cues. Nonunderstanding episodes (n = 21) and length-matched understanding episodes (n = 21) were drawn from a larger dataset of video-recorded conversations between second language (L2) English speakers and a bilingual French-English interlocutor (McDonough, Trofimovich, Dao, & Abashidze, 2018). Episode videos were analyzed for the occurrence of listener visual cues, such as head nods, blinks, facial expressions, and holds. Videos of the listener’s face were manipulated to create three rating conditions: clear voice/clear face, distorted voice/clear face, and clear voice/blurred face. Raters from the same speech community (N = 66) were each assigned to one of the three video conditions and asked to assess the listener’s comprehension. Results revealed differences in the occurrence of listener visual cues between the understanding and nonunderstanding episodes. In addition, raters gave lower ratings of listener comprehension when they had access to the listener’s visual cues.