Listening vs. Comprehension

or why Game of Thrones isn’t enough to make you an expert listener

Mendelsohn and Wilt found that the daily communication of an individual is typically broken down like this: roughly 50% is spent listening, 30% speaking, 11% reading, and 9% writing.  Granted, these studies were done before the dawn of texting, but you get the idea. Despite the amount of time we dedicate to listening, our understanding of it as a skill and process is in a kind of fog; theories contradict one another, study results are ambiguous, and testing methods are a bit questionable.  

Considering all of this uncertainty, how can I tell my students with any sort of confidence that they won’t become amazing listeners by just watching Game of Thrones in English and listening to the BBC?  The short version: to efficiently improve basic listening skills, students need comprehensible input without contextual visual support, self-checking with written materials, and repetition*.  Let me explain.


Whichever definition you look at (and there’s a bunch of them), listening is usually described as a multi-stage process, during which a listener receives sounds from a speaker (we’ll call it input), perceives the sound as comprehensible speech, and then constructs, interprets, or assigns meaning to it (we’ll call this intake**).  For our purposes, we can simplify this by calling the first part of the process the listening-phase and the latter stages the comprehension-phase.

Let’s break down comprehension just a little bit more to get a slightly better grasp on what’s going on. Both Anderson and Rivers have similar descriptions of the cognitive processes at work here. Firstly, we have the perception stage.  This is when we identify sound as actual language–something we can work with and process.  Next we have parsing.  This is essentially analysis; now that we’ve recognized sound as a language we know, we look at the elements and piece them together.  Lastly, we have utilization.  This is where we not only put all these pieces of information together, but we store them in our memory and begin assigning meaning to it (we have to do something with this input after all).  This is fairly universal to comprehending input (and turning it into intake).  As pointed out by Pearson and Fielding, it’s pretty similar to what goes on while reading.  Again, we have a number of processes that are being simultaneously orchestrated in order to make heads or tails of input. The obvious difference is the level of control you have with reading; listening has a number of wildcards in the mix that can strongly influence your comprehension (for better or for worse), which reading simply lacks.  

Processing Input

Brindley outlines seven variables that can affect how a listener comprehends input. They basically include the listener’s understanding of vocabulary, syntax, and the topic; how quickly the language is being spoken (rate of speech); contextual support, which could be something visual; background noise; and memory.  The first two (i.e. vocab and syntax) we can link to the listener’s overall understanding of the language being spoken (let’s call them linguistic skills), and background knowledge and context can both be attributed to the listener’s mastery of logic and drawing conclusions (what I’ll refer to as whodunit skills).  These are the main tools that are used to help us process information and make sense of it all.  

Traditionally, information is thought to be processed according to two models: bottom-up processing and top-down processing.  Bottom-up relies on the listener’s linguistic skills to comprehend input. This is clear-cut, context-free comprehension.  The better your vocabulary and understanding of how language is constructed, the better prepared you are to engage in conversation.  Top-down processing relies on those whodunit skills.  In this case, the listener relies more on their own experiences and knowledge to build a framework for what could be said in a given situation.  It’s kind of like predicting what information is about to be given.

So what’s the deal with these models?  Looking strictly at bottom-up processing, there has been a lot of criticism from linguists like Osada (and many more) who say that putting too much focus on words and constructions could actually distract the listener from getting the general message.  Remember how rate of speech was one of those factors that affect comprehension?  Here’s where it could become a major issue.  If a speaker is creating input faster than the listener can parse it, then we have a major breakdown in comprehension, as the speaker has moved on to the next topic while the listener is still deciphering the first message.  At the same time, Carrell and Eisterhold (1983) point out that if a listener’s knowledge of a topic doesn’t match the speaker’s or if new knowledge is being presented, then top-down processing may lead the listener to the wrong conclusion or a misunderstanding.

Looking at these issues, it’s natural that, as Mendelsohn points out, in real life, people listen in different ways depending on the situation they’re in.  If you’ve been involved in conversations where you were not as in-the-know about a topic as others, you may have relied more on context to fill in your linguistic gaps, or maybe you relied on vocabulary to compensate for your lack of background knowledge.  Enter the interactive-compensatory model, where listeners operate according to both models simultaneously to get the full picture.  This is important.  As Gilakjani and Ahmadi point out, it’s natural not to understand every single word that’s being said. Since we want to maximize our intake, we develop strategies and rely on context to maximize our comprehension.

Unfortunately, even with top-down support, we may meet some major difficulties if our linguistic skills are still fairly low.  Thinking back to that 50%, how much of that input doesn’t have strong visual context to support it?  How much do we listen to acquire new information and not reinforce old background knowledge?  Sadly, context is a bit like Gandalf: when he’s there, things run pretty smoothly, but when he’s gone, times can get desperate.  And the input that comes without context is not marginal.  These are dialogs.  These are interviews and instructions. These are the daily interactions that professional and personal relationships are built on.

We need a fair understanding of vocabulary and grammar to be able to properly understand input, whether it be listening or reading (Mecartty). Next, we’ve got to be able to separate noise from input and identify speech (Hulstijn).  After that, we’ve got to be able to listen fluently, i.e. we’ve got to be able to automatically identify words and speech when they’re heard (Segalowitz & Segalowitz).  These are all those bottom-up processing linguistic skills.  I’m not saying those whodunit skills aren’t equally important, but they are mainly comprehension skills, and not basic listening skills. But how can we separate the one from the other?

Listening vs. Comprehension

Most listening tests are fairly straightforward: you listen to a passage, a dialog, or a lecture and then you’re asked a series of related questions.  This may seem fairly open and shut, but Brown and Yule observed some key flaws with this method, namely that it’s hard to distinguish what’s relevant in these listening tests.  

In most cases, it’s not just the listening skill that’s under scrutiny; it’s the listener’s grasp of vocabulary and grammar, their inferencing skills, background knowledge, etc. You may hear an announcement that a train station is closed on Tuesday, and when asked “What does this mean?” your only options may be: A. Visit the city on Wednesday, B. Use an alternative Metro exit, or C. Stay home (if you really want to know, the answer is B).  Even though more than one competence is being tested, the results supposedly reflect a single skill.  Similarly, a good number of studies related to listening comprehension don’t actually look at listening, but general comprehension.

This is often the case for research on the effects of subtitles on comprehension. Most offer modifications of the following model: create a number of easily digestible video clips from a TV show/movie (2-15 minutes in length); have a control group with no subtitles, a group with subtitles in the language being studied, and maybe a group with subtitles in the listener’s native language; have them watch in class (sometimes with home prep); and finally, have them answer a series of listening comprehension questions (what’s being said, what’s meant by…, why did so and so, etc.).  In the majority of these cases, the subtitled groups outperform the non-subtitled group.  Wow.

As we’ve already established, listening and reading are processed similarly.  The main difference was that reading was easier to control, and in this sense, we could argue that it provided easier comprehension than listening (which had initial perception and parsing before assigning meaning).  Although in these studies, comprehension questions may be asked orally, the initial information is provided in duplicate: in speech and in writing.  Studies by Markham, Garza, and Huang and Eskey all more or less follow this model, and they have the same results as mentioned above.

Interestingly enough, when listening is rated outside of comprehension of specific material, subtitled groups do not perform on top. Mehdi Latifi, Ali Mobalegh, and Elham Mohammadi performed a study where the main goal wasn’t to check comprehension of materials, but improvement.  To measure this, students were presented with a pre- and post-study IELTS listening test.  Over the course of a summer, students watched 2-minute segments of an English-language movie (one group with subtitles, one without, and one with bi-modal subtitles).  Students were all instructed on phrases and constructions used, and each group viewed the same video segment.  

So at the end of the course, which group performed better on the IELTS test?  No subtitles.  Yeah, really.  Granted, I’ve already explained why listening tests may not be the best means of measuring an individual’s listening skill, but this still raises a few eyebrows and starts pointing us in a direction we should have already considered: although all exposure to the language you study is good, there are ways to highlight a skill and maximize efficiency/improvement.

Game of Thrones and BBC?

It’s true that subtitles help immediate comprehension, but to really improve our listening skills, we have to isolate listening from reading and highlight these bottom-up processes. Although ditching subtitles is a good start, we should remember that the listening exercises in the above studies are accompanied by some level of instruction. Unfamiliar words and phrases may have been explained ahead of time to ensure that students can comprehend the speech they were given. This is important because if we were to just sit down to watch a film, we very well may just let top-down processing kick in and rely on visuals to fill in our comprehension gaps. Even though we’ve established that this is useful and a tool used in everyday communication, it could easily lull us into a false sense of confidence, feeling like we understand more than we actually do (take a look at this video in fauxEnglish to see what I mean).

The natural alternative to video then would be isolated audio. So why not listen to just BBC news to improve English listening? Here we again see some of those factors that can adversely affect our listening comprehension (namely rate of speech and knowledge of language).  Without the ability to control the flow of speech, it’s entirely possible that we will not be able to comprehend the input we’re fed, even if the speech itself could be comprehended in written format. This is where we often see distinctions between input, a potentially comprehended product, and intake, the comprehended product necessary for acquisition. Think of it like this: if we were to watch a TV show in a language we do not remotely understand, there is a chance that through incidental learning we would pick up a few words or phrases (especially if we have subtitles and obvious visual context to assist us). However, if we were to rely solely on listening, then at a natural rate of speech we could very well listen for an indefinite period of time with little to no acquisition. In this case, there’s ample input, but there’s no intake. This especially applies if we’re listening to programs with little to no repetition.

To bridge this gap and maximize our effectiveness, Hulstijn outlines a basic process: listen, check for comprehension (if questions are not teacher provided, simply ask yourself questions…like, “Hey, did I understand why this happened?”), check the written text (scripts or transcripts), find what should have been understood but was not, study new material, replay the audio. The beauty of this model is that it can be done at home, provided you have audio modified for a foreign audience (depending on your level) and an accompanying text. Where does one find such materials?

PODCASTS! There is a plethora of podcasts for language learners (especially English), and a good number of them include a transcript. For more advanced learners, even some podcasts for natives include transcripts. By following the above method, we have an audio-only approach for improving bottom-up processes, which students can do on their own.  Additionally, if audio is short enough, it is not too much of an issue to find the time to relisten throughout the day/week.  Yay!
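For students who like to tinker, the "find what should have been understood but was not" step of the routine above can be roughly automated. What follows is only a minimal sketch in Python, not anything Hulstijn proposes: it assumes the student types out what they heard after listening, and the function names (`tokenize`, `missed_words`) and the sample sentences are made up for illustration. It simply lists the transcript words that never showed up in the student's notes, which gives a starting list of material to study before replaying the audio.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words (straight apostrophes kept)."""
    return re.findall(r"[a-z']+", text.lower())


def missed_words(transcript, learner_notes):
    """Return transcript words absent from the learner's notes, in first-seen order."""
    heard = set(tokenize(learner_notes))
    missed = []
    for word in tokenize(transcript):
        # Record each missed word once, in the order it occurs in the transcript.
        if word not in heard and word not in missed:
            missed.append(word)
    return missed


# Hypothetical example: the podcast transcript vs. what the student wrote down.
transcript = "The station is closed on Tuesday, so please use the alternative exit."
learner_notes = "the station is closed on tuesday please use the exit"

print(missed_words(transcript, learner_notes))  # ['so', 'alternative']
```

A word-by-word comparison like this only catches gaps in perception and vocabulary; it says nothing about whether the message as a whole was understood, so the self-questioning step still has to be done by the listener.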


Students need input.  The more, the better.  However, despite the importance of top-down comprehension, we cannot neglect bottom-up listening skills. Even though we may feel that watching television and movies made for native speakers is an enjoyable way to gain exposure to natural language, depending on the listener’s fundamental linguistic skills, the bulk of speech may never move from input to intake.

Still, every little bit does count.  If your goal is to relax and maybe pick up a few words here or there, then watching a show with or without subtitles seems like a great choice. However, if your goal is to effectively improve your listening and linguistic skills, then there are better means to optimize your time spent studying.

* I should add that these students are not living in an immersion environment (not in an English-speaking country) and have limited exposure to the language.

** Throughout the decades of study on listening acquisition, from Krashen on, there have been a number of different definitions of input and intake. I like to settle on the above, where input is the spoken utterance, and intake is understood speech.


  1. Ahmadi, M. & Gilakjani, A. “A study of factors affecting EFL learners’ English listening comprehension and the strategies for improvement.” Journal of Language Teaching and Research, 2 (5), 2011. 977-988.
  2. Anderson, Stephen C. (1985). “Animate and inanimate pronominal systems in Ngyemboon- Bamileke.” Journal of West African Languages 15(2): 61-74
  3. Brindley, G. (1997). Investigating second language listening ability: Listening skills and item difficulty. In G. Brindley & G. Wigglesworth, (Eds.), Access: Issues in language test design and delivery (pp. 65-85). Sydney, Australia: Macquarie University, National Center for English Language Teaching and Research.
  4. Brown, G. and G. Yule. 1983. Teaching the spoken language. Cambridge: Cambridge University Press.
  5. Carrell, P. L., & Eisterhold, J. C. (1983). Schema theory and ESL reading pedagogy. TESOL Quarterly, 17, 553-573.
  6. Garza, T. (1991). Evaluating the use of captioned video materials in advanced foreign language learning. Foreign Language Annals, 24 (3), 239‐258.
  7. Huang, H‐C. & D. E. Eskey (1999–2000). The effects of closed‐captioned television on the listening comprehension of intermediate English as a second language (ESL) students. Journal of Educational Technology Systems, 28 (1), 75–96.
  8. Hulstijn, J. H. (2001). Intentional and incidental second language vocabulary learning: A reappraisal of elaboration, rehearsal and automaticity.  In P. Robinson (Ed.), Cognition and second language instruction (pp.258-286). Cambridge: Cambridge University Press.
  9. Hulstijn, J. H. (2003, in press).  Connectionist models of language processing and the training of listening skills with the aid of multimedia software.  Computer Assisted Language Learning.
  10. Hyslop, N. & Tone, B. “Listening: Are we teaching it, and if so, how?” ERIC Digest. 3, 1988.
  11. Latifi, M., Mobalegh, A. & Mohammadi, E. “Movie subtitles and the improvement of listening comprehension ability: does it help?” The Journal of Language Teaching and Learning. 1 (2), 2011. 18-29.
  12. Markham, P. [L.] (1999). Captioned videotapes and second‐language listening word recognition. Foreign Language Annals, 32 (3), 321–328.
  13. Mecartty, F. (2000). Lexical and grammatical knowledge in reading and listening comprehension by foreign language learners of Spanish.  Applied Language Learning, 11, 323-348.
  14. Mendelsohn, D. J. (1994). Learning to listen: A strategy-based approach for the second language learner. San Diego: Dominie Press
  15. Osada, N. (2001). What strategy do less proficient learners employ in listening comprehension?: A reappraisal of bottom-up and top-down processing.  Journal of the Pan-Pacific Association of Applied Linguistics, 5, 73-90
  16. Pearson, P. David, and Fielding, Linda. “Instructional implications of listening comprehension research.” Urbana, Illinois: Center for the Study of Reading, 1983. 28 pp. [ED 227 464]
  17. Rivers, W. M. (1983B). Speaking in Many Tongues. 3rd edition. London: Cambridge University Press.
  18. Segalowitz, N., & Segalowitz, S. (1993). Skilled performance practice and the differentiation of speed-up of automatization effects: Evidence from second language word recognition.  Applied Psycholinguistics, 19, 53-67.
  19. Vandergrift, L. “Listening to learn or learning to listen?” Annual Review of Applied Linguistics, 24, 2004, pp. 3-25.
  20. Wilt, Miriam E. “A study of teacher awareness of listening as a factor in elementary education,” Journal of Educational Research, 43 (8), April, 1950, pp. 626-636.
